Personally, I just [More or less] finished working on an assignment due Wednesday, and I'm about to go through and do some touch ups and refactoring, and more importantly, documentation and commenting.

I suppose I don't really need to do it, but Python makes built-in documentation so easy, I try to do it with everything I do.

Which got me thinking: how many of you document your functions/classes/modules/methods? If so, when do you start the documentation process? While writing, or after, or before?
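As a minimal sketch of what Python's built-in documentation looks like (the function `mean` is a made-up example, not taken from the assignment):

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


# The docstring is stored on the function object itself, so it is
# available at runtime without any external tool.
print(mean.__doc__)

# help(mean) would show the same text in the interactive interpreter.
```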

~~~

While I'm at it, how is it that one can document in Java, so that another can see it? Is it possible to do so inside of the .java file itself?

I think Javadoc is the tool you are looking for. Look at http://developer.classpath.org/doc/java ... ource.html for some sample Javadocs. Comments of that type can be rendered to HTML with the javadoc command-line tool (there might be a GUI version too; I've never used one). This makes it easy for someone to use your API, or method signatures, without being overwhelmed by all the source code.

One can also comment pretty much anywhere with // comment or /* comment */. You might use this to explain a non-intuitive part of your code. However, this may also be a sign of code smell. If you have to explain it, you probably need to simplify it.

If at all possible, I try to write some documentation before writing every function or other chunk. In general, if you can describe nicely what the function does, it helps when actually writing it, and if you can't describe it in simple terms, the functional breakdown is probably less than stellar.

BOOL AddPlayer(Player** players, int* totalPlayers, int* maxPlayers, char* name, COLORREF colour, BOOL isPlaying, int moves, int boxes, int score){
    //Takes the total number of players and the maximum number of players as
    //pointers so it can update them if needed.

The function signature describes what the function does and what the variables are for fairly clearly (usually), and the comments explain why I'm doing things the way I am.

I do enjoy making my javadoc comments overly verbose though. This is from the maze generator I wrote:

/**
 * Initialises all member variables with the supplied values. No error
 * checking, but this is for all intents and purposes a data structure.
 * @param direction The direction this wall is in relative to the cell
 * @param x The x position of the cell the wall belongs to
 * @param y The y position of the aforementioned cell
 */
public Wall(int direction, int x, int y)

/**
 * Constructor creates a maze automatically.
 * @param width The width in cells of the maze
 * @param height The height in cells of the maze
 */
public Maze(int width, int height)

There are four kinds of documentation I like: The kind that pops up in an IDE whenever I need it (mainly when I'm coding Java), the kind that follows values around (python style), the kind that's nicely formatted and navigable html (Java again, and Haskell), and types.

I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

There are two ways to read code. One is to figure out how to use it, the other is to figure out how to fix it. In the first case, good API documentation should be enough, and in the second case you need to read the code anyway, at which point comments just get in the way.
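As a rough illustration of documentation that "follows values around", and of types doing some of the documenting, here is a small Python sketch. The `Direction` enum and `Wall` class are hypothetical, loosely modelled on the maze example earlier in the thread:

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    """Side of a cell that a wall sits on."""
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3


@dataclass(frozen=True)
class Wall:
    """A wall on one side of the cell at (x, y)."""
    direction: Direction
    x: int
    y: int


wall = Wall(Direction.NORTH, 2, 5)

# The docstring travels with the value: any code holding `wall`
# can ask what it is without finding the source file.
print(wall.__doc__)

# The annotations state what each field is meant to hold.
print(Wall.__annotations__)
```

Using an enum instead of a bare int for the direction is a design choice of this sketch: the type itself then says which values are legal.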

It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

Less is more is the name of the game when it comes to code comments. It makes sense to comment code if there are hidden caveats or side-effects that are not immediately obvious. Particularly intricate algorithms merit a rough outline of what they do as well (not line-by-line though). Hacks should also be documented, especially if they involve inline assembly or little known features of the language.

I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

Berengal wrote:If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

Broadly, I agree with this, but it is often (in my work) unavoidable. If you're (say) writing assembly to implement a complicated floating-point algorithm using integer arithmetic, you are going to need to explain what you're doing all over the place, because the actual instruction stream has (almost) nothing superficially in common with what you are actually doing. A single "multiply" might be split over 20 instructions, with other pieces of the algorithm interspersed where there are unavoidable pipeline bubbles. When I'm writing this sort of thing, I usually have a couple paragraphs of explanation for every 5-10 instructions in the code.

GENERATION -16 + 31i: The first time you see this, copy it into your sig on any forum. Square it, and then add i to the generation.

Berengal wrote:I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

I couldn't disagree more. Code only says what's happening. It doesn't say that the NIC hardware incorrectly counts broadcast packets in the multicast bin, which is why you have to subtract the number of broadcast packets received out of the multicast bin to get the correct value.

That is not explaining 'what' you're doing. It is explaining 'why' you are doing it[/nitpick]

It is a somewhat important distinction though, because it is very reasonable to comment something like that (or more commonly, where I work, how one of Microsoft's APIs gives us some strange values in certain conditions, which is not documented).

i.e. you do not explain that the variable is being incremented by 1, you explain -why- the variable needs to be incremented by one, -if- it is not obvious. Makes for a vastly different commenting style.
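The difference can be sketched in a few lines of Python. The scenario (0-based page indices shown to users as 1-based page numbers) is an assumption made up for illustration:

```python
def display_page_number(page_index):
    # A "what" comment only restates the code:
    #     Add 1 to page_index.
    #
    # A "why" comment records the reason the code cannot express:
    #     Pages are stored 0-based internally, but users expect the
    #     first page to be labelled "1".
    return page_index + 1
```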

If there's the possibility for any ambiguity in the interpretation of my code, I document it. I've learned the hard way that you don't often remember what you were thinking about at 2 AM a month ago. I comment on every variable when it's declared. On functions I comment on precondition/postcondition. I try to comment every few lines within functions so my logic is followed.
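A minimal Python sketch of that style, using a made-up function. The assertions are an addition for illustration and are not something the post prescribes:

```python
def integer_sqrt(n):
    """Return the largest integer r such that r * r <= n.

    Precondition:  n is a non-negative int.
    Postcondition: r * r <= n < (r + 1) * (r + 1).
    """
    assert isinstance(n, int) and n >= 0, "precondition violated"

    r = 0  # current candidate root; invariant: r * r <= n
    while (r + 1) * (r + 1) <= n:
        r += 1

    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r
```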

Berengal wrote:I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

While you do have a point, there are useful ways to use this type of comment. Improving your code can make it more obvious exactly what the code is trying to do, but only comments can explain why, in particular, it is trying to do it. This can include explaining any interesting pieces of maths it relies on, and (relatedly) explaining the general algorithm at a higher level. These things aren't overly useful in, say, API documentation (which is usually what goes into nicely formatted HTML, follows values around, and pops up in IDEs) - they are only really useful to other people looking at your code, not to people who just want to call it. Like others have said, good comments (other than the four you mention) tend to be characterised by explaining why the code is there, rather than what it achieves.

Berengal wrote:If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

Broadly, I agree with this, but it is often (in my work) unavoidable. If you're (say) writing assembly to implement a complicated floating-point algorithm using integer arithmetic, you are going to need to explain what you're doing all over the place, because the actual instruction stream has (almost) nothing superficially in common with what you are actually doing. A single "multiply" might be split over 20 instructions, with other pieces of the algorithm interspersed where there are unavoidable pipeline bubbles. When I'm writing this sort of thing, I usually have a couple paragraphs of explanation for every 5-10 instructions in the code.

We write very different code. I code to some abstract machine in my head, not actual hardware. I can see how short comments naming various blocks of code could be useful in assembly, but in higher-level languages you have a much higher name/instruction ratio, and a much greater ability to abstract. Anyway, if you're implementing a complicated algorithm, the documentation should exist in one place and the code in another (possibly with short comment-names referring to things named in the documentation). Interleaving the code and explanation only serves to confuse the reader, as he has to read two different pieces of text at the same time, greatly increasing the chance of a stack overflow of the human brain.

Rysto wrote:

Berengal wrote:I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

I couldn't disagree more. Code only says what's happening. It doesn't say that the NIC hardware incorrectly counts broadcast packets in the multicast bin, which is why you have to subtract the number of broadcast packets received out of the multicast bin to get the correct value.

This is what abstraction was made for. Make a new getMulticastCount function, put some doc-comments on it even if you don't export it. Putting comments on the internal API is also useful, if it pops up in the IDE when you need it.
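A minimal sketch of that suggestion in Python. The function and parameter names are hypothetical renderings of the idea, not code from any real driver:

```python
def get_multicast_count(raw_multicast_bin, broadcast_count):
    """Return the true number of multicast packets received.

    The NIC counts broadcast packets in its multicast bin, so the raw
    bin value overstates multicast traffic by the broadcast count.
    """
    return raw_multicast_bin - broadcast_count
```

Here the explanation of the hardware quirk lives in the docstring, so it shows up wherever the function is looked up rather than at each call site.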

lvc wrote:

Berengal wrote:I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

While you do have a point, there are useful ways to use this type of comment. Improving your code can make it more obvious exactly what the code is trying to do, but only comments can explain why, in particular, it is trying to do it. This can include explaining any interesting pieces of maths it relies on, and (relatedly) explaining the general algorithm at a higher level. These things aren't overly useful in, say, API documentation (which is usually what goes into nicely formatted HTML, follows values around, and pops up in IDEs) - they are only really useful to other people looking at your code, not to people who just want to call it. Like others have said, good comments (other than the four you mention) tend to be characterised by explaining why the code is there, rather than what it achieves.

I disagree. Explaining the math or this piece of code's place in the way of things is something that belongs at the class/module/package level. If you feel it doesn't, perhaps you should restructure your modules so it does. At least in my experience that has proved to be the better course of action, not just for the documentation, but in fact mainly to improve the general readability of the code and to ease further reasoning about it.

Additionally, I at least tend to ignore long blocks of comments explaining the finer points of the code whenever I'm trying to figure out why it's one off on negative numbers. To put it simply, pure unparsed comments are usually only glanced briefly upon; the code is much better at telling me what I need to know at that point.

I have to agree with Berengal, at least in principle. Well structured code is better than well commented code. That said, there are enough situations I can think of where you can't achieve that ideal...

The language or environment might be a limit, especially when performance is an issue, but also when you are working in some specific-purpose environment, like say Matlab, not to mention some horrible corporate hacked-together environments I have seen. Creating a nice environment is not something you can afford for every task. Fortran77 also still exists, and there is only so much information you can put in 6-character variable names.

And as already mentioned, sometimes you just have to use a nasty trick yourself to make things run. Ideally, you shouldn't, but the work has to be finished and removing the trick might involve redoing a large amount of work. In such cases, a little explanation next to it is IMO not worse than creating a DoTheNastyHack() function.

Rysto wrote:I couldn't disagree more. Code only says what's happening. It doesn't say that the NIC hardware incorrectly counts broadcast packets in the multicast bin, which is why you have to subtract the number of broadcast packets received out of the multicast bin to get the correct value.

This is what abstraction was made for. Make a new getMulticastCount function, put some doc-comments on it even if you don't export it. Putting comments on the internal API is also useful, if it pops up in the IDE when you need it.

I understood Rysto to be saying that the code in question is already along these lines:

The API documentation for getMulticastPackets() can't really explain why it is needed here. But that this happens is an implementation detail of getPackets(), which doesn't really belong in its API documentation - clients don't care about the adjustments it makes to the number the hardware gives, only that the end result is right. People looking through this code (to fix a bug, for example) do care about this: so you give them a comment along the lines of '# Card incorrectly includes multicast packets in this'.
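As a hypothetical Python sketch of the situation described here: `get_packets` and `get_multicast_packets` stand in for getPackets() and getMulticastPackets(), and `FakeCard` is an invented stand-in for the hardware counters.

```python
class FakeCard:
    """Hypothetical stand-in for the hardware counters."""

    def __init__(self, total, multicast):
        self.total = total
        self.multicast = multicast


def get_multicast_packets(card):
    """Return the number of multicast packets the card has received."""
    return card.multicast


def get_packets(card):
    """Return the number of ordinary packets the card has received."""
    count = card.total
    # Card incorrectly includes multicast packets in this
    count -= get_multicast_packets(card)
    return count
```

Neither docstring mentions the adjustment; only the inline comment tells a maintainer why the subtraction is there.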

You propose restructuring the module to avoid situations like this, but I fail to see how that can be done in this case. As another example, this is a function I wrote in python to determine the powerset of some set:

def powerset(s):
    "Calculate the power set - the set of all subsets - of s"

    s = tuple(s) # Need indexing
    pset = set()

    # Each unique element of the powerset is defined by the
    # presence or absence of each element of the original set.
    # So, if we use '1' for 'element is there' and '0' for 'not',
    # then we can generate each subset by counting from 0 to the
    # cardinality of the set (inclusive) in binary.
    for i in range(int("1" + "0"*len(s), 2)):
        key = basen(i, 2)

Why I add zeros to the front of key (the current comment here isn't perfect, being a little void of detail, but it is better than none at all)

Why I play around with binary numbers in the first place

I would argue that none of these things belong in the doc comment (since, again, clients don't care how the powerset is calculated), but these details are not obvious from the code, and are necessary to understand the function if you want to modify it.

While it would be ideal to structure code such that these explanations aren't needed, that isn't always something that is reasonable.
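For reference, a self-contained sketch of the binary-counting approach described above. This is not the original function: the `basen` helper is replaced here by bit shifts (so no zero-padding is needed), and subsets are stored as `frozenset` so they can live inside a set.

```python
def powerset(s):
    "Calculate the power set - the set of all subsets - of s"
    items = tuple(s)  # Need indexing
    pset = set()

    # Each subset corresponds to one n-bit number: bit j set means
    # items[j] is in the subset. Counting from 0 to 2**n - 1 therefore
    # visits every subset of n distinct items exactly once.
    for mask in range(2 ** len(items)):
        subset = frozenset(
            item for j, item in enumerate(items) if (mask >> j) & 1
        )
        pset.add(subset)
    return pset
```

Even in this shorter form, the comment about bits and subsets carries information that the loop alone does not make obvious.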

The API documentation for getMulticastPackets() can't really explain why it is needed here.

I would say that this is already close to clear code, it is obvious from reading that the multicast packages are being removed, which in turn implies that the original variable included them. If anything, you could change the original variable name:

But of course, it's just individual style, and I write a lot of code like yours. But I agree with Berengal that often these kinds of comments are a hint that the code itself should be made more clear, and the ideal of not needing large comment blocks is a good one to strive for.

We have a coding standard where I work that requires a block of legalese at the head of the file, then a short statement describing the overall purpose of the class or module (in high-level English), then a short block at the head of each method describing the method's purpose, inputs, outputs, and return value (again, all high-level descriptions). Inline comments are typically limited to referencing standards, identifying patches, or justifying a hack.

I try to document such that when I come back to the code in 5 years I can not only understand what I did, I can also identify why I did it that way. And sometimes this does include what a variable name is. Long descriptive names are all very well sometimes, but if they appear multiple times in a complex formula, it's often clearer to use short names so you can see its structure. There's a reason we use x and y in algebra, and not the-variable-along-the-x-axis and the-variable-along-the-y-axis.
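A small Python sketch of the short-names point, using the quadratic formula as an assumed example. The names are explained once in the docstring and then kept short so the shape of the formula stays visible:

```python
import math


def quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 as a tuple.

    a, b, c are the coefficients of the quadratic, linear and constant
    terms; a must be non-zero. Returns () when there are no real roots.
    """
    d = b * b - 4 * a * c  # discriminant
    if d < 0:
        return ()
    root = math.sqrt(d)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))
```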

Zamfir wrote:I would say that this is already close to clear code, it is obvious from reading that the multicast packages are being removed, which in turn implies that the original variable included them.

Which is fine, until some programmer goes and looks at the hardware documentation which says that broadcast packets and multicast packets are counted separately, says "WTF is that code doing?" and "fixes" it.

i've become a huge fan of documentation. i didn't see much point so long as the code was written clearly when i was in school, but after getting out into the 'real world', i've come around. case in point: i currently am in charge of updating and maintaining a several hundred page website, which is almost a decade old, and it was designed and updated by a half dozen people over that time period. any given page will have sql, javascript, vb, asp, .net, html, css, etc, and no guarantees on who did what over the years, so the code / variable / comment styles are all over the place. sure, websites are less hairy than debugging a huge java app, but it's nice to have some commentation in there to clarify what someone was thinking at 2AM in 1999.

Grumpy Code Monkey wrote:We have a coding standard where I work that requires a block of legalese at the head of the file, then a short statement describing the overall purpose of the class or module (in high-level English), then a short block at the head of each method describing the method's purpose, inputs, outputs, and return value (again, all high-level descriptions). Inline comments are typically limited to referencing standards, identifying patches, or justifying a hack.

I would love this level of commenting. Right now, I'm porting a library with Windows and Mac versions to Linux, and the commenting standard used by the original programmer involves things like "/* Installs the specified font. Obviously machine-specific */" on a function called InstallFont() in "winfuncs.c". To make things more exciting, these functions don't always do what the description says, usually have undocumented side effects, and are sometimes called for the side effect alone, with the main result being discarded.

The Linux port is getting comments that are essentially my research notes on what the Mac and Windows versions of functions do, with my conclusions on which bits are important functionality and which aren't. It's not uncommon for a 50-line function to be preceded by 200 lines of comments.

Berengal wrote:There are four kinds of documentation I like: The kind that pops up in an IDE whenever I need it (mainly when I'm coding Java), the kind that follows values around (python style), the kind that's nicely formatted and navigable html (Java again, and Haskell), and types.

I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

There are two ways to read code. One is to figure out how to use it, the other is to figure out how to fix it. In the first case, good API documentation should be enough, and in the second case you need to read the code anyway, at which point comments just get in the way.

While I agree with you, I do think comments like TODO, FIXME, HACK, UNDONE, and other such comments are useful to simply act as a trail of breadcrumbs while I'm working so I know what I need to work on later.

I use those myself, but it's not what I'd call documentation. They're just labels for work you've pushed on the stack (and are usually parsed by an IDE anyway, or if not, grepped for manually).
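Grepping for such labels can be sketched in a few lines of Python. The marker names are the ones listed in the thread; the regular expression is an assumption about how such comments are usually written:

```python
import re

# Matches e.g. "# TODO: handle overflow" or "# FIXME off by one".
MARKER = re.compile(r"#\s*(TODO|FIXME|HACK|UNDONE)\b[:\s]*(.*)")


def find_markers(source):
    """Return (line_number, marker, note) for each marker comment."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((number, match.group(1), match.group(2).strip()))
    return found


example = """\
total = 0  # TODO: handle overflow
# FIXME off by one on negative numbers
print(total)
"""
print(find_markers(example))
```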

Having maintained a dozen large programs simultaneously, I learned to write comments and documentation everywhere to get up to speed ASAP, without having to re-read code and remember DB schemas, &c. It's very helpful to have a human interpretation and the coder's intent chronicled to track down bugs and add/refactor code. Funny how much you lose after a 2-week vacation, and how much it helps - esp. noticeable when your coworkers don't have the same habit. "Legacy" (useful) code always has a patchwork of fixes and added functionality which wasn't in the design. Keeping a high-level document in sync with code and comments is crucial (and rare) for apps to run for decades without total rewrites (which is never in the budget).

Berengal wrote:I have no fondness for comments that aren't parsed by some tool at some point. If you need to explain what you're doing, you need to choose better names and structure your code better. Those kinds of comments only take care of the symptoms, not their cause.

Sure, that may be ideal; in practice, though, not realistic. On a project of any significant scale you have to work with a number of 3rd-party libraries, platforms and tools, each version of which will have its quirks, bugs, work-arounds, and interfaces which may change - the "cause" is outside of your control. Often you'll have to support several versions of a library, and completely restructuring your code to accommodate the newest one is not practical. Comments in code are important to communicate with your fellow coders (and your future self) the minutiae, hints, warnings, reasoning and options, which is hard to do or would just trash your design docs.

The problem with too much documentation is that documentation also has to be maintained. If you delete a variable, the IDE will tell you all the code it broke so you can fix it. It will not tell you all the comments that are no longer correct. In most cases, I would rather just read the code next time to figure out what it does, rather than doubling the development time trying to maintain plain text comments that the IDE can't assist me with.
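One partial answer, offered as a sketch rather than a cure, is to put examples where a tool can check them. Python's standard `doctest` module runs the examples embedded in docstrings and reports any whose output no longer matches. The function `clamp` below is hypothetical:

```python
import doctest


def clamp(value, low, high):
    """Limit value to the range [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Reports any documented example whose output no longer
    # matches what the code actually returns.
    print(doctest.testmod())
```

This only covers the examples, of course; free-form prose in the docstring can still go stale.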

Okay, so I've noticed many people agreeing that well written code speaks for itself. While this may be true for experienced programmers, when you're just starting out, like I am, plain English in other people's code is no end of help. And if you're working in a place where there's lots of programmers maintaining years of other people's code, coding for the new guy that's going to be employed to replace you when you resign is a good thing to do.

And what if you always make sure to use the most descriptive and useful variable and method names, but you're working with an API that does not? You might be so familiar with it that you don't notice the problems with it, but every programmer that needs to read your code might not be.

I already have a hate thread. Necromancy > redundancy here, so post there.

roc314 wrote:America is a police state that communicates in txt speak...

It may be necessary to use comments to explain what some code is doing, and perhaps to explain why it's doing it. They should not be required to explain how it's doing it, unless the code is using some obscure but efficient algorithm. In general, use of arcane clever code is discouraged, since it breaks Kernighan's rule: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

We have mandatory commenting for all public interfaces at work (via C#'s XML comments), particularly since anything public can be seen and used by customers in their own calls to our code. Outside of that, I don't comment too much. As has been said, if it's not easily discernible from the code what you're doing, you probably shouldn't be doing it. There are still a few exceptions here and there, or I might add comments to a larger function to break it down into smaller blocks, but brevity and code clarity are generally better.

I find the problem with documentation (or lack thereof) is less about code comments and more about non-code documentation, I think. Sometimes it'd be really handy to have a spec document for a feature, just to know what it can do, its basic configuration, etc., and at work we often either don't have it or it's way out of date (and the help files we have are only so helpful).

I like commenting my code if only for one reason: Understanding it. Sounds backwards, right? If I wrote the code I should know what it does. Right.

Until I stop working on it and come back to it a year later. Even well structured programs take time to understand, especially if you're working with large projects. Having those comments there makes it much faster for me to remember how my code is structured and how things work.

Small projects (< 10k loc) are nice because you can usually fit the whole thing within your head but once you start getting bigger, the comments *really* come in handy.

I develop in VS mostly and I use triple comments right before every function, property or class with (usually) one line explanation. It's nice when I start typing out new code and the auto complete includes the comments when I'm looking for what to use.

http://en.wikipedia.org/wiki/DSV_Alvin#Sinking wrote:Researchers found a cheese sandwich which exhibited no visible signs of decomposition, and was in fact eaten.

I always document the interfaces to my code but rarely document the code itself. To paraphrase Fred Brooks, "Show me your code and conceal your APIs, and I shall continue to be mystified. Show me your APIs, and I won't usually need your code; it'll be obvious."

Zamfir wrote:I would say that this is already close to clear code, it is obvious from reading that the multicast packages are being removed, which in turn implies that the original variable included them.

Which is fine, until some programmer goes and looks at the hardware documentation which says that broadcast packets and multicast packets are counted separately, says "WTF is that code doing?" and "fixes" it.

//have to use strncpy here because src isn't guaranteed to be null terminated. Don't use strlcpy; it might segfault
strncpy(dest, src, dest_len-1);
dest[dest_len-1] = '\0';

I can guarantee you that without the comment, somebody will come along and "fix" this code by replacing strncpy with strlcpy.

And "not guaranteed" is an important part.

I've found that when working on large code bases, you don't get clean code. You get code that has a whole bunch of corner cases in it.

You don't know that the person writing it was making a mistake, or was doing something smart at each line. Maybe they did a strncpy because they screwed up, or maybe it was because the incoming string may not be 0 terminated. Without documentation there, there is no way the next person can see it.

And I've seen functions that consist of dozens or more of such things -- each catching a corner case that a clean, simple algorithm wouldn't have to deal with. What is worse, sometimes they aren't documented why they are doing some strange operation. Which means when I want to refactor some code, and it appears that the old code was doing something that seems extremely strange, I need to track down where that code was written, examine the check-in notes, find the bug or issue that was associated with it, discover that the database that stored bugs 8 years ago isn't online anymore, try to track down the person who wrote it...

In a non-trivial code base, solving non-toy problems, you need documentation. Much documentation can be loaded into the names of functions, the names of parameters and the signatures of functions, the flow of code -- but not all documentation can. And while you can take every strange case and build a "clean API", there will be corner cases.

I even put pseudo code in a function, or describe why my solution to a problem is less than ideal, and if I had my druthers what I'd do to fix it.

One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

I like documentation whenever I can't reasonably quickly understand what code is doing, and why, from the code. Unfortunately this varies per programmer, and is probably an AI-hard problem.

To that end, I'm going to say the Haskell main libraries have the best kind of documentation. The documentation tells you what the code does and how to use it without having to actually see the code, and the functions themselves are usually tidy and concise enough that you can sit down and figure out exactly how and why one works just from the documentation of the functions it uses (i.e. you are not digging through code recursively). They also tend to avoid any sort of gotchas or corner cases in the main libraries, except for things like head and tail in Prelude.

1) Before coding - If it's not immediately obvious how I'm going to do something, or if I can't think of an identifier that's both brief and obviously descriptive. This also helps organizing my work and these comments can always be removed if they're inane.

2) After coding - If I've done anything abnormal (relative to either myself or peers); getters and setters are obvious to everyone, lookup enums aren't. Also if a block of code stretches on a while, it can help to have reminders of the forest in amongst the trees (though I generally consider it bad form to have a function/method stretch for more than one screen). Also consider blank lines and statement order (when flexible) to enhance readability. If your language is not strictly typed, comment on type as strongly as possible.

3) When reviewing - Often this happens with other people's code. If I have to think/ask about what/how/why code does something, it's a good idea to record the answer. Also in real world applications it's often much better to put in a TODO or a comment questioning something than to fix it right away (e.g. that strncpy example above).

If you're looking to improve your commenting I'd recommend asking a classmate to read your code and see if they understand how your program works.

The thing about recursion problems is that they tend to contain other recursion problems.