Friday, 23 November 2012

Javadoc coding standards

Javadoc is a key part of coding in Java, yet there is relatively little discussion of what makes good
Javadoc style - a coding standard.

Javadoc coding standard

These are the standards I tend to use when writing Javadoc.
Since personal tastes differ, I've tried to explain some of the rationale for some of my choices.
Bear in mind that this is more about the formatting of Javadoc than its content.

There is an Oracle guide
which is longer and more detailed than this one.
The two agree in most places; however, these guidelines are more explicit about HTML tags,
two spaces in @param and null-specification, and differ in line lengths and sentence layout.

Each of the guidelines below consists of a short description of the rule and an explanation, which may include an example:

Write Javadoc to be read as source code

When we think of "Javadoc" we often think of the
online Javadoc HTML pages.
However, this is not the only way that Javadoc is consumed.
A key way of absorbing Javadoc is reading source code, either of code you or your team wrote,
or third party libraries.
Making Javadoc readable as source code is critical, and these standards are guided by this principle.

Public and protected

All public and protected methods should be fully defined with Javadoc.
Package and private methods do not have to be, but may benefit from it.

If a method is overridden in a subclass, Javadoc should only be present
if it says something distinct to the original definition of the method.
The @Override annotation should be used to indicate to source code readers
that the Javadoc is inherited in addition to its normal meaning.
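
As a minimal sketch of this rule, assume a hypothetical Shape class with one subclass.
The names are invented for illustration only.
One overriding method adds nothing new, so it carries only @Override, while the other documents how it differs:

```java
/**
 * A shape that can report its area.
 */
abstract class Shape {

    /**
     * Gets the area of this shape.
     *
     * @return the area, not negative
     */
    abstract double area();

    /**
     * Gets a short description of this shape.
     *
     * @return the description, not null
     */
    String describe() {
        return "shape with area " + area();
    }
}

/**
 * A square with a fixed side length.
 */
final class Square extends Shape {

    /**
     * The side length, not negative.
     */
    private final double side;

    /**
     * Creates an instance.
     *
     * @param side  the side length, not negative
     */
    Square(double side) {
        this.side = side;
    }

    // No Javadoc here: there is nothing distinct to say, so the inherited text applies.
    @Override
    double area() {
        return side * side;
    }

    /**
     * Gets a short description of this square.
     * <p>
     * Unlike the inherited definition, the description includes the side length.
     *
     * @return the description, not null
     */
    @Override
    String describe() {
        return "square of side " + side;
    }
}
```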

Use the standard style for the Javadoc comment

Javadoc only requires a '/**' at the start and a '*/' at the end.
In addition to this, use a single star on each additional line:
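
A small sketch of that layout, using a hypothetical class invented for illustration:

```java
/**
 * A holder for a standard greeting.
 */
final class StandardStyle {

    /**
     * Gets the greeting.
     *
     * @return the greeting, not null
     */
    static String greeting() {
        return "hello";
    }
}
```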

Use simple HTML tags, not valid XHTML

Javadoc uses HTML tags to identify paragraphs and other elements.
Many developers get drawn to the thought that XHTML is necessarily best, ensuring that all
tags open and close correctly. This is a mistake.
XHTML adds many extra tags that make the Javadoc harder to read as source code.
The Javadoc parser will interpret the incomplete HTML tag soup just fine.

Use a single <p> tag between paragraphs

Longer Javadoc always needs multiple paragraphs.
This naturally results in a question of how and where to add the paragraph tags.
Place a single <p> tag on the blank line between paragraphs:

/**
 * First paragraph.
 * <p>
 * Second paragraph.
 * May be on multiple lines.
 * <p>
 * Third paragraph.
 */
public ...

Use a single <li> tag for items in a list

Lists are useful in Javadoc when explaining a set of options, choices or issues.
These standards place a single <li> tag at the start of the line and no closing tag.
In order to get correct paragraph formatting, extra paragraph tags are required:
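
One possible layout is sketched below.
It assumes the extra paragraph tags sit on the same lines as the opening and closing list tags.
The Rounding enum is hypothetical and exists only to carry the comment:

```java
/**
 * The rounding choices for a whole-number conversion.
 * <p><ul>
 * <li>UP rounds away from zero
 * <li>DOWN rounds towards zero
 * </ul><p>
 * Both choices return a whole number.
 */
enum Rounding {

    /**
     * Rounds away from zero.
     */
    UP,
    /**
     * Rounds towards zero.
     */
    DOWN;

    /**
     * Applies this rounding to the value.
     *
     * @param value  the value to round
     * @return the rounded value
     */
    long apply(double value) {
        if (this == DOWN) {
            // a cast from double to long truncates towards zero
            return (long) value;
        }
        return (long) (value >= 0 ? Math.ceil(value) : Math.floor(value));
    }
}
```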

Define a punchy first sentence

The first sentence, typically ended by a dot, is used in the next-level higher Javadoc.
As such, it has the responsibility of summing up the method or class to readers scanning
the class or package.
To achieve this, the first sentence should be clear and punchy, and generally short.

While not required, it is recommended that the first sentence is a paragraph to itself.
This helps retain the punchiness for readers of the source code.

It is recommended to use the third person form at the start.
For example, "Gets the foo", "Sets the bar" or "Consumes the baz".
Avoid the second person form, such as "Get the foo".

Use "this" to refer to an instance of the class

When referring to an instance of the class being documented, use "this" to reference it.
For example, "Returns a copy of this foo with the bar value updated".
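
To make the wording concrete, here is a sketch using a hypothetical immutable Foo class.
Each first sentence is short, stands alone as a paragraph, and starts in the third person.
The class is referred to as "this foo":

```java
/**
 * A hypothetical immutable holder for a bar value.
 */
final class Foo {

    /**
     * The bar value.
     */
    private final int bar;

    /**
     * Creates an instance.
     *
     * @param bar  the bar value
     */
    Foo(int bar) {
        this.bar = bar;
    }

    /**
     * Gets the bar value.
     *
     * @return the bar value
     */
    int getBar() {
        return bar;
    }

    /**
     * Returns a copy of this foo with the bar value updated.
     * <p>
     * This foo is immutable and is not changed by this method.
     *
     * @param bar  the new bar value
     * @return a foo based on this foo with the requested bar value, not null
     */
    Foo withBar(int bar) {
        return new Foo(bar);
    }
}
```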

Aim for short single line sentences

Wherever possible, make Javadoc sentences fit on a single line.
Allow flexibility in the line length, favouring between 80 and 120 characters to make this work.

In most cases, each new sentence should start on a new line.
This aids readability as source code, and simplifies refactoring re-writes of complex Javadoc.

/**
 * This is the first paragraph, on one line.
 * <p>
 * This is the first sentence of the second paragraph, on one line.
 * This is the second sentence of the second paragraph, on one line.
 * This is the third sentence of the second paragraph which is a bit longer so has been
 * split onto a second line, as that makes sense.
 * This is the fourth sentence, which starts a new line, even though there is space above.
 */
public ...

Use @link and @code wisely

Many Javadoc descriptions reference other methods and classes.
This can be achieved most effectively using the @link and @code features.

The @link feature creates a visible hyperlink in generated Javadoc to the target.
The @link target is one of the following forms:

/**
 * First paragraph.
 * <p>
 * Link to a class named 'Foo': {@link Foo}.
 * Link to a method 'bar' on a class named 'Foo': {@link Foo#bar}.
 * Link to a method 'baz' on this class: {@link #baz}.
 * Link specifying text of the hyperlink after a space: {@link Foo the Foo class}.
 * Link to a method handling method overload {@link Foo#bar(String,int)}.
 */
public ...

The @code feature provides a section of fixed-width font, ideal for references to
methods and class names.
While @link references are checked by the Javadoc compiler, @code references are not.

Only use @link on the first reference to a specific class or method.
Use @code for subsequent references.
This avoids excessive hyperlinks cluttering up the Javadoc.

Never use @link in the first sentence

The first sentence is used in the higher level Javadoc.
Adding a hyperlink in that first sentence makes the higher level documentation more confusing.
Always use @code in the first sentence if necessary.
@link can be used from the second sentence/paragraph onwards.
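
The sketch below shows how these two rules might combine on a hypothetical utility method.
The first sentence uses @code, the first later mention of ArrayList uses @link, and the repeat mention falls back to @code:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Utilities for lists of names.
 */
final class Names {

    /**
     * Copies the names into a new {@code List}.
     * <p>
     * The result is an {@link ArrayList}, so it can be modified by the caller.
     * Changes to the returned {@code ArrayList} do not affect the input.
     *
     * @param names  the names to copy, not null
     * @return the modifiable copy, not null
     */
    static List<String> copy(List<String> names) {
        return new ArrayList<>(names);
    }
}
```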

Do not use @code for null, true or false

The concepts of null, true and false are very common in Javadoc.
Adding @code for every occurrence is a burden to both the reader and writer of the
Javadoc and adds no real value.
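
For instance, a hypothetical null check might be documented with the three words written as plain text:

```java
/**
 * Utilities for text values.
 */
final class TextChecks {

    /**
     * Checks if the text is null or empty.
     *
     * @param text  the text to check, may be null
     * @return true if the text is null or has zero length, false otherwise
     */
    static boolean isNullOrEmpty(String text) {
        return text == null || text.isEmpty();
    }
}
```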

Use @param, @return and @throws

Almost all methods take in a parameter, return a result or both.
The @param and @return features specify those inputs and outputs.
The @throws feature specifies the thrown exceptions.

The @param entries should be specified in the same order as the parameters.
The @return should be after the @param entries, followed by @throws.

Use @param for generics

If a class or method has generic type parameters, then these should be documented.
The correct approach is an @param tag with the parameter name of <T> where T
is the type parameter name.
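
A short sketch with a hypothetical Holder class shows the tag at both class and method level:

```java
import java.util.List;

/**
 * A simple holder of a single value.
 *
 * @param <T>  the type of the held value
 */
final class Holder<T> {

    /**
     * The held value.
     */
    private final T value;

    /**
     * Creates an instance.
     *
     * @param value  the value to hold
     */
    Holder(T value) {
        this.value = value;
    }

    /**
     * Gets the held value.
     *
     * @return the held value
     */
    T get() {
        return value;
    }

    /**
     * Obtains a holder of the first element of a list.
     *
     * @param <E>  the element type of the list
     * @param list  the list to query, not empty, not null
     * @return the holder of the first element, not null
     * @throws IndexOutOfBoundsException if the list is empty
     */
    static <E> Holder<E> firstOf(List<E> list) {
        return new Holder<>(list.get(0));
    }
}
```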

Use one blank line before @param

There should be one blank line between the Javadoc text and the first @param
or @return. This aids readability in source code.

Treat @param and @return as a phrase

The @param and @return should be treated as phrases rather than complete sentences.
They should start with a lower case letter, typically using the word "the".
They should not end with a dot.
This aids readability in source code and when generated.

Treat @throws as an if clause

The @throws feature should normally be followed by "if" and the rest of the
phrase describing the condition.
For example, "@throws IllegalArgumentException if the file could not be found".
This aids readability in source code and when generated.
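
Taken together, the rules on ordering, the blank line, phrases and the "if" clause might look like this on a hypothetical division method:

```java
/**
 * Utilities for whole-number division.
 */
final class Divider {

    /**
     * Divides one whole number by another, discarding any remainder.
     *
     * @param dividend  the number to be divided
     * @param divisor  the number to divide by, not zero
     * @return the quotient, rounded towards zero
     * @throws IllegalArgumentException if the divisor is zero
     */
    static int divide(int dividend, int divisor) {
        if (divisor == 0) {
            throw new IllegalArgumentException("Divisor must not be zero");
        }
        return dividend / divisor;
    }
}
```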

@param should have two spaces after the parameter name

When reading the Javadoc as source code, a single space after the parameter name is
a lot harder to read than two spaces. Avoid aligning the parameters in a column,
as it is prone to difficulty in refactoring where parameter names are changed or added.
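
The difference can be seen in a sketch with two parameters of unequal length.
The method is hypothetical, and the discouraged column-aligned form is shown only as an ordinary comment:

```java
/**
 * Utilities for rectangles.
 */
final class Rectangles {

    // Discouraged: the column has to be realigned whenever a name changes.
    //   @param width          the width, not negative
    //   @param heightInUnits  the height, not negative

    /**
     * Calculates the area of a rectangle.
     *
     * @param width  the width, not negative
     * @param heightInUnits  the height, not negative
     * @return the area, not negative
     */
    static long area(int width, int heightInUnits) {
        return (long) width * heightInUnits;
    }
}
```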

Define null-handling for all parameters and return types

Whether a method accepts null on input, or can return null is critical information
for building large systems.
All methods with non-primitive parameters or return types should define their null-tolerance in the @param or @return.
Some standard forms expressing this should be used wherever possible:

"not null" means that null is not accepted and passing in null will
probably throw an exception, typically NullPointerException

"may be null" means that null may be passed in. In general the behaviour
of the passed in null should be defined

"null treated as xxx" means that a null input is equivalent to the specified value

While it may be tempting to define null-handling behaviour in a single central location,
such as the class or package Javadoc, this is far less useful for developers.
The Javadoc at the method level appears in IDEs during normal coding, whereas class
or package level Javadoc requires a separate "search and learn" step.

Other simple constraints may be added as well if applicable, for example "not empty, not null".
Primitive values might specify their bounds, for example "from 1 to 5", or "not negative".
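
As a sketch, the standard forms could be applied to a hypothetical label formatter like this.
The choice of exceptions is an assumption made for the example:

```java
import java.util.Objects;

/**
 * Utilities for formatting labels.
 */
final class Labels {

    /**
     * Formats a label from a name and an optional title.
     *
     * @param name  the name, not empty, not null
     * @param title  the title, null treated as no title
     * @return the label, not null
     * @throws NullPointerException if the name is null
     * @throws IllegalArgumentException if the name is empty
     */
    static String format(String name, String title) {
        Objects.requireNonNull(name, "name");
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Name must not be empty");
        }
        return title == null ? name : title + " " + name;
    }
}
```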

Specifications require implementation notes

If you are writing a more formal specification that will be implemented by third parties,
consider adding an "implementation notes" section.
This is an additional section, typically at the class level, that specifies any behaviours
required by implementations that are not otherwise specified, or not of general interest.
See this example.
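
As a rough illustration only, such a section might look like the sketch below.
The interface, the heading markup and the wording of the notes are all assumptions; adjust the heading level to whatever your Javadoc tooling accepts:

```java
/**
 * A source of the current time in milliseconds.
 * <p>
 * Applications use this interface to avoid depending directly on the system clock.
 *
 * <h3>Implementation notes</h3>
 * Implementations must be thread-safe.
 * Implementations must never return a negative value.
 */
interface MillisSource {

    /**
     * Gets the current time.
     *
     * @return the current time in milliseconds, not negative
     */
    long millis();
}

/**
 * A source that always returns the same time.
 */
final class FixedMillisSource implements MillisSource {

    /**
     * The fixed time in milliseconds, not negative.
     */
    private final long millis;

    /**
     * Creates an instance.
     *
     * @param millis  the fixed time in milliseconds, not negative
     * @throws IllegalArgumentException if the time is negative
     */
    FixedMillisSource(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("Time must not be negative");
        }
        this.millis = millis;
    }

    @Override
    public long millis() {
        return millis;
    }
}
```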

Avoid @author

The @author feature can be used to record the authors of the class.
This should be avoided, as it is usually out of date, and it can promote code ownership by an individual.
The source control system is in a much better position to record authors.

There's no reason to think that every get method simply returns some member of the class; they can very well be large methods, possibly even with side effects. Those using the method, perhaps from a jar without actually seeing the code, should be aware of the complexity, or simplicity, of the method.

I agree with the above comment. We should not be trying to do something efficiently which should not be done at all.

There are excellent opportunities to document well through Javadoc for public APIs. Here is a sample from the Lucene code that explains a fairly complex feature very well: http://lucene.apache.org/core/old_versioned_docs/versions/2_9_0/api/all/org/apache/lucene/search/Similarity.html

I just took this from your example. Sorry, but I can't appreciate the necessity of having this, and for private variables at that.

/**
 * The year.
 */
private final int year;
/**
 * The month-of-year, not null.
 */
private final short month;
/**
 * The day-of-month.
 */
private final short day;

I agree too with previous comments (that "all public and protected methods should be fully defined with Javadoc" is a bad idea). I would add that compressed comments (on a single line) should be greatly encouraged #myScreenSpaceIsValuable

I agree with most of the advice here. In particular, I like the idea of sentence-line documentation. I thought I was the only one who encouraged that.

I would add that, whatever your documentation guidelines, they should be described in the Javadoc of package-info.java at the highest appropriate level in the project, or in overview.html. I shouldn't have to explicitly identify every parameter as being non-null if that is the explicit understanding. When something deviates from the guidelines, I try to use consistent phrasing, both for readability and in order to create an expectation that facilitates understanding. For example, "This argument may be null, in which case a default value of xxx is assumed."

Personally, I always choose to document getters and setters. I understand the "doesn't add value" argument, but disagree on two counts. Firstly, it's inconsistent - why should those methods have a special status? Secondly, the Javadoc is useful to me. Not all methods starting with "get" or "set" are simple getters/setters. This can be discovered by doing ctrl+space in the IDE and seeing the Javadoc. If it is simply 4 or 5 words of a standard pattern, then it is reasonable to assume the getter is simple. If there is no Javadoc I can't tell anything.

And since I often generate my Javadoc (in a better way than IDE generation), there is no effort in writing.

I tend to document private fields, as I have a memory like a sieve. I believe most of us do. Any piece of information is helpful when looking at a piece of code 6 months later.

@Fraaargh, I'm not a huge fan of compressed comments. Buy yourself a bigger screen or use comment folding in your IDE. Sometimes they make sense, such as in inner classes.

@Nathan, as I indicated, I strongly believe that documenting null-handling centrally is useless to developers casually using your API. Most developers do not read the Javadoc as a whole, just the small parts after ctrl+space. Thus, that is where the null-handling needs to be defined. Adding ", not null" to the end of each parameter is not a burden.

Do you consider null handling to be of special interest to documentation because of its ubiquity with regard to reference types? I ask because I tend to centrally document a number of such constraints, not just whether a value can be null. For example, floats/doubles are not NaN unless specified, arrays may be zero-length unless specified, collections/maps may be empty unless specified, collections/maps do not contain null unless specified, etc. Documenting each variation of only the null constraint feels inconsistent, but I think documenting each variation of each constraint for each documented value is tedious and more prone to cause documentation errors. Worse, it has been my experience (anecdotal, granted) that documenting every non-exceptional case has the effect of washing out the exceptional cases, making the users of my APIs more likely to miss the exceptional cases.

Personally, I consider overview and package-level documentation to be essential, not intended to be ignored by even the casual developer. I can't help but feel there's a bit of a RTFM problem here, and I'm not convinced that the "locally document everything" approach minimizes the probability of misuse.

I find null to be the most common error, resulting in NPE. By defining the expected behaviour on the parameter, I am forced to think about that case and what the code should do. I also indicate to others that I have thought about it.

The impact on code maintenance is very positive. If an NPE occurs, it is clear who is at fault. If the NPE occurs in a method that declares it accepts null, then that method is at fault. If the method declares it does not accept null, the caller is at fault. This can be baked into teams as a very simple rule to follow.

But it relies on those writing code in the first place to actually think about null inputs/outputs for every single case. And adding documentation inline is the best proof that they did.

Thanks for your post, Stephen. I agree with much of what you said but have two thoughts.

1) What is your opinion with regards to using annotations to document "nullness"? Maybe we need to wait until JSR308 and JSR305 are done?

2) I noticed that the threeten code documents thread safety, should this be a part of your post as well? Also, do you think we should have a standard set of annotations to help define these levels? (Effective Java 2nd, Item #70 mentions the idea of thread safety levels)

I dislike the null annotations, as they are very verbose within the source code. (I'm aware methods can "inherit" from classes, but find that to be unhelpful).

It's also the case that a method with parameters marked @NotNull can still accept nulls. It's only if the annotation checker is actually plugged in that the extra safety arrives. That means that the tool provides false safety, something I strongly object to.

The "right" answer is something like Fantom (or other languages) where the type system can recognise and manage the difference between a nullable and non-nullable.

You are right that thread-safety should be documented. I think standard annotations only tend to work if they are in the JDK. This may happen to some degree with John Rose's value types.

These comments demonstrate exactly what is wrong with most developers today.

All public/protected methods *should* be documented. As should private variables. You're taking too much for granted when you say this is obvious. It's *not* obvious whether a method return value may be null or not. It's *not* obvious what a private variable means a few months after you wrote the code. All this stuff *must* be documented.

You're not the target audience of your own code. Other people are. Anyone who thinks this is obvious should be made to maintain someone else's code for a couple of years. Trust me, it's hell.

I make it a policy to dump employees who don't document their code properly. I give them many warnings but if they persist I show them the door. You can't build a team without a team-oriented attitude.

My comment on p/ul is based on the tag soup interpretation I have observed. I don't doubt that ul shouldn't be in a paragraph (although that's exactly the kind of official rule that makes XHTML so hard for humans to write).

I agree with Marcin but would extend his argument to include all general purpose libraries and frameworks, including those developed for internal use only. Other classes should be self-documenting by sensible use of variable, method and class names, and unit tests. My colleagues at work all disagree with me and advocate comprehensive Javadoc, and without exception they all write totally useless comments that add no value whatsoever, just code-bloat.

I prefer to not add comments for getters and setters, by default. I find this adds noise. If a getXYZ or setXYZ method has additional side-effect or does something other than just returning or setting a value, then yes, a comment in this case should be present to document the unconventional behavior. But an even better solution, if possible, would be to rename these methods or rewrite the code somehow so that the methods are indeed the expected simple getters/setters.

Another situation where documenting getters and setters could be useful is if the field isn't obvious. For example, getMonth/setMonth: is the month 0-based or 1-based?

I guess you can know if a getter/setter comment is useful by what it says. If the comment for getFoo only says "return the foo", and if there's nothing more you can really add to that, then I would prefer to leave that comment out.