Monday, November 17, 2008

More Effective Javadoc

Andrew Binstock's recent review of the book Clean Code: A Handbook of Agile Software Craftsmanship got me thinking about what I like and dislike about working with Javadoc. The specific comment that started me thinking about this was the observation that both authors (the primary author of the book and the author of the blog review) feel that "Javadoc should not contain HTML." I am somewhat torn on this and often compromise with minimalistic HTML (such as <p> tags to separate paragraphs in package and class descriptions rather than allowing the text to be all munged together in the generated HTML).

There is really a bigger issue at the heart of this: who is your Javadoc intended for? In this blog entry, I discuss some of the things I have found that make Javadoc more effective, both the Javadoc I write and the Javadoc written by others that I consume. The audience for the Javadoc is often the most significant consideration when deciding what should go in the Javadoc comments.


The Case for Javadoc

I'll start by saying that I do think there are times when Javadoc is essential, times when it is helpful, and times when it is just about useless. Even worse, it can be harmful if allowed to become stale and overcome by events as the code changes. Useless Javadoc is often written as a result of mandated conventions or processes, but the other extreme is writing no Javadoc at all. I find myself writing very simplistic Javadoc comments for routine things such as get/set methods and no-argument constructors, but I do tend to write some Javadoc for all public methods and other publicly available Java constructs.

Documenting all publicly exposed APIs of a class is one of the items that Joshua Bloch calls out in Effective Java (Item 44 in the Second Edition and Item 28 in the First Edition). In addition, he points out the usefulness of documenting exceptions (Item 62 in Second Edition/Item 44 in the First Edition) and documenting thread safety (Item 70 in Second Edition/Item 52 in First Edition).

The most common argument against writing Javadoc is that clean code should speak for itself and that comments are only needed to act like deodorant when the code is smelly. There is certainly truth to this for comments in general, but I think some comments are appropriate some of the time and I have found Javadoc comments to be very helpful in my work.

The value of Javadoc comments certainly depends on the use of the code being documented and its audience. I rarely look at the source code of the JDK. Instead, I use its Javadoc-generated API documentation. Why do I use the Javadoc instead of looking directly at the source code? There are numerous reasons. I have it bookmarked in my favorite browser, so it is easy to access. My favorite IDE references Javadoc for me automatically as I use the IDE's code completion and other functions. Finally, I rarely need more detail than what is provided in the Javadoc for the standard Java libraries. Similarly, I find myself using the Javadoc for other open source products (Spring Framework, JFreeChart, etc.) far more than I actually look into their code. I usually only look at the code of these products if I need to explain an unexpected behavior, need to know exactly how something is being done rather than what is being done, or am curious about the implementation. This is even true of the non-Java frameworks I use. For instance, I heavily use the Javadoc-like Flex 3 Language Reference.

Other perspectives on the value of Javadoc are available in The Importance of Humble Javadoc, To Javadoc or Not Javadoc That is the Question..., and The Value of Javadoc.


Tips for More Effective Javadoc

With the case for useful Javadoc comments made above, it is time to look at a few detailed tips that I have found useful in writing effective Javadoc documentation. I realize that there are many more tips and useful practices related to Javadoc that I am not covering here. Feel free to add any you have found useful to the comments section.


You Don't Necessarily Need to Javadoc Everything!

There are situations where the boss, the client, the process, the code convention, or someone or something else requires you to write Javadoc comments for everything in your code. These types of edicts are often what turn developers away from writing truly useful Javadoc. Often, the baby (good and useful Javadoc) is thrown out with the bath water (time-consuming and largely useless Javadoc).

I'm a strong believer that efficient Javadoc is often effective Javadoc. If one has documented data members and chosen to employ the -private option when generating Javadoc, then the get/set methods for those data members likely do not need to be documented. For external audiences, however, Java developers may not want to expose intentionally private, protected, or package-level data members in the public API documentation. In that case, documenting the get/set methods adds value because they are one of the easiest places to advertise to client code the valid values (range), units, and whether null is allowed or possible for that data member.
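As a rough illustration of documenting accessors for an external audience, the sketch below uses a hypothetical Thermostat class. The class name, its field, and the 5.0 to 35.0 range are all invented for this example; the point is only where the range and units end up.

```java
/**
 * Holds the target temperature for a single room.
 * (Hypothetical class used only to illustrate accessor documentation.)
 */
final class Thermostat {

    /** Target temperature in degrees Celsius; always between 5.0 and 35.0. */
    private double targetCelsius = 20.0;

    /**
     * Provides the target temperature.
     *
     * @return Target temperature in degrees Celsius; always in the range
     *     5.0 to 35.0 (inclusive).
     */
    public double getTargetCelsius() {
        return targetCelsius;
    }

    /**
     * Sets the target temperature.
     *
     * @param newTargetCelsius Desired temperature in degrees Celsius; must be
     *     in the range 5.0 to 35.0 (inclusive).
     * @throws IllegalArgumentException if the value is outside that range
     *     or is not a number (NaN).
     */
    public void setTargetCelsius(final double newTargetCelsius) {
        // The negated form also rejects NaN, because comparisons with NaN are false.
        if (!(newTargetCelsius >= 5.0 && newTargetCelsius <= 35.0)) {
            throw new IllegalArgumentException(
                "Target must be between 5.0 and 35.0 degrees Celsius: " + newTargetCelsius);
        }
        this.targetCelsius = newTargetCelsius;
    }
}
```

If the documentation were generated with -private for an internal audience, the one-line comment on the field would already carry the same information, and the accessor comments could shrink accordingly.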

If I'm developing code for internal use only by other developers within the organization, I might favor documenting the data members and turning on private documentation generation. However, if my audience is an external one, it might be preferable to document the accessors and mutators. Adding Javadoc commenting to a no-arguments constructor may be superfluous at times, but it can also be useful when explaining why that constructor is not public and what constructor or builder should be used instead.

Situations in which I find myself writing very little Javadoc include JUnit tests and the data members of nested builder classes as outlined in Item #2 of the Second Edition of Effective Java. In the latter case, the data members of the nested class essentially mimic the data members of the enclosing class, so rather than document those data members in both places, I prefer to reference the comments of the enclosing class's data members.
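The two ideas above (explaining a non-public constructor and pointing builder fields back at the enclosing class) might look something like the following sketch. Person and its two fields are hypothetical names chosen for illustration, and the builder is only a simplified take on the pattern.

```java
import java.util.Objects;

/**
 * Immutable description of a person, created with {@link Person.Builder}.
 * (Hypothetical class used only for illustration.)
 */
final class Person {

    /** Family name; never null. */
    private final String lastName;

    /** Age in whole years; zero or greater. */
    private final int ageInYears;

    /**
     * Not public on purpose: use {@link Person.Builder#build()} instead so
     * that all values are validated before an instance exists.
     *
     * @param builder Builder holding the validated values.
     */
    private Person(final Builder builder) {
        this.lastName = builder.lastName;
        this.ageInYears = builder.ageInYears;
    }

    /**
     * Provides the family name.
     *
     * @return Family name; never null.
     */
    public String getLastName() {
        return lastName;
    }

    /**
     * Provides the age.
     *
     * @return Age in whole years; zero or greater.
     */
    public int getAgeInYears() {
        return ageInYears;
    }

    /** Collects and validates the values for a {@link Person}. */
    public static final class Builder {

        /** Mirrors {@code Person.lastName}; see that field for details. */
        private String lastName = "";

        /** Mirrors {@code Person.ageInYears}; see that field for details. */
        private int ageInYears;

        /**
         * Sets the family name.
         *
         * @param newLastName Family name; must not be null.
         * @return This builder, to allow chained calls.
         * @throws NullPointerException if {@code newLastName} is null.
         */
        public Builder lastName(final String newLastName) {
            this.lastName = Objects.requireNonNull(newLastName, "newLastName");
            return this;
        }

        /**
         * Sets the age.
         *
         * @param newAgeInYears Age in whole years; must be zero or greater.
         * @return This builder, to allow chained calls.
         * @throws IllegalArgumentException if {@code newAgeInYears} is negative.
         */
        public Builder ageInYears(final int newAgeInYears) {
            if (newAgeInYears < 0) {
                throw new IllegalArgumentException("Age cannot be negative: " + newAgeInYears);
            }
            this.ageInYears = newAgeInYears;
            return this;
        }

        /**
         * Creates the person described by this builder.
         *
         * @return New, immutable {@link Person}.
         */
        public Person build() {
            return new Person(this);
        }
    }
}
```

The builder's field comments deliberately stay at one line each and defer to the enclosing class, so there is only one place to update when a field's meaning changes.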

Sometimes, the code can speak for itself. Other times, there is no good way to communicate the intricacies or decisions made that led to something in the code without comments (especially Javadoc). Finally, it is worth noting that Javadoc becomes more important in situations where non-developers (test personnel, for example) might be using your Javadoc to better understand the application. I try to think about who will be using my Javadoc rather than just my own perspective when deciding if Javadoc for a particular element is useful or not.


The First Sentence Matters

Not all text within a particular Javadoc is created equal. The first sentence of the Javadoc for packages, classes, and methods is especially vital because that is what appears in summary information about those respective Java elements. Therefore, it makes sense that effort should be placed in making that first sentence as concise and useful as possible. Often for simpler cases, the most efficient Javadoc can be written with just a single sentence, satisfying both the previous recommendation regarding minimalistic Javadoc and this recommendation to make the first sentence matter.
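A small sketch of this idea follows, using a hypothetical Statistics utility class invented for the example. In each comment, the first sentence is written to stand alone in a summary table, and anything else is pushed into a later paragraph.

```java
import java.util.List;

/**
 * Computes simple statistics over lists of numbers.
 *
 * <p>Only the sentence above is used as the summary for this class; this
 * paragraph is shown only in the class's detailed description.</p>
 */
final class Statistics {

    /** Not instantiable; all methods are static. */
    private Statistics() {
    }

    /**
     * Returns the arithmetic mean of the provided values.
     *
     * <p>Detail that does not belong in a summary goes after the first
     * sentence: the sum is accumulated in a {@code double}, so very large
     * lists may lose some precision.</p>
     *
     * @param values Values to average; must not be null, empty, or contain null.
     * @return Arithmetic mean of the values.
     * @throws IllegalArgumentException if {@code values} is empty.
     */
    static double mean(final List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("Cannot average an empty list.");
        }
        double sum = 0.0;
        for (final double value : values) {
            sum += value;
        }
        return sum / values.size();
    }
}
```

One thing to watch for, as I understand the tool's default behavior for English text: the summary ends at the first period followed by white space, so an abbreviation such as "e.g." early in the comment can cut the summary short.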


Use -linksource for Source Code Accessibility

This is another tip that depends on the audience. Because there is debate over whether to rely on Javadoc or on code that speaks for itself (I try to do both as much as possible, with emphasis on minimalistic but useful Javadoc), using the -linksource option can increase the value of the Javadoc-generated documentation for both sides of the debate.

The -linksource Javadoc option allows the class names in the Javadoc-generated documentation to link to a copy of the source code itself. By clicking on the class's name, the viewer is taken to that class's source code with line numbers. A good example of this is the JFreeChart API documentation. One can select any of the classes in that library (such as the highly significant JFreeChart class) and then click on the class name after "public class" to see the source code.

An important caveat here is that all source code is made available via the -linksource option regardless of whether -private is used or not. This caveat is a reminder about the importance of considering the audience of your Javadoc comments.
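To make the interaction between these options concrete, here is a minimal sketch that assembles a javadoc command line in Java. The -d, -sourcepath, -linksource, and -private options are standard javadoc options, but the JavadocCommand class, the directory names, and the package name are placeholders invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Assembles a javadoc command line.
 * (Hypothetical helper; the paths and package name below are placeholders.)
 */
final class JavadocCommand {

    /** Not instantiable; all methods are static. */
    private JavadocCommand() {
    }

    /**
     * Builds the argument list for a javadoc run.
     *
     * @param linkSource Whether to add {@code -linksource}, which publishes
     *     an HTML copy of every source file alongside the documentation.
     * @param includePrivate Whether to add {@code -private}, which documents
     *     members of all access levels.
     * @return Command and arguments, suitable for {@link ProcessBuilder}.
     */
    static List<String> build(final boolean linkSource, final boolean includePrivate) {
        final List<String> command = new ArrayList<>();
        command.add("javadoc");
        command.add("-d");
        command.add("build/docs");        // placeholder output directory
        command.add("-sourcepath");
        command.add("src");               // placeholder source root
        if (linkSource) {
            command.add("-linksource");
        }
        if (includePrivate) {
            command.add("-private");
        }
        command.add("com.example.app");   // placeholder package to document
        return command;
    }
}
```

Something like new ProcessBuilder(JavadocCommand.build(true, false)).inheritIO().start() would then run the tool. Note that build(true, false) omits private members from the API pages yet still publishes their source, which is exactly the caveat described above.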


Document All Thrown Exceptions

It is useful to advertise to a method's potential clients what exceptions the method might knowingly throw. This is more obvious for checked exceptions due to the throws clause on the method definition, but that clause only indicates what types of checked exceptions might be thrown. The @throws Javadoc tag can be used to explicitly specify the checked exceptions that might be thrown and why they might be thrown. The @throws tag can also be used to document unchecked (or runtime) exceptions in the same way even though these are not documented in the method definition.
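The following sketch shows @throws used for both kinds of exceptions. The LimitedReader class and its readUpTo method are hypothetical names made up for the example; only the java.io and java.util classes they use are real.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Objects;

/**
 * Reads bounded amounts of text.
 * (Hypothetical class used only to illustrate the @throws tag.)
 */
final class LimitedReader {

    /** Not instantiable; all methods are static. */
    private LimitedReader() {
    }

    /**
     * Reads at most the requested number of characters from the reader.
     *
     * @param source Reader to read from; must not be null.
     * @param maxChars Maximum number of characters to read; must be zero or greater.
     * @return Characters read, possibly fewer than {@code maxChars} if the
     *     reader is exhausted first; never null.
     * @throws IOException if the underlying reader fails while reading
     *     (checked, so it also appears in the throws clause).
     * @throws NullPointerException if {@code source} is null (unchecked,
     *     so only this comment advertises it).
     * @throws IllegalArgumentException if {@code maxChars} is negative
     *     (unchecked, so only this comment advertises it).
     */
    static String readUpTo(final Reader source, final int maxChars) throws IOException {
        Objects.requireNonNull(source, "source");
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars cannot be negative: " + maxChars);
        }
        final StringBuilder result = new StringBuilder();
        int next;
        // Stop at the limit or at end of stream, whichever comes first.
        while (result.length() < maxChars && (next = source.read()) != -1) {
            result.append((char) next);
        }
        return result.toString();
    }
}
```

The throws clause tells a caller only that an IOException is possible; the comment is the one place that says when each exception occurs, including the two the compiler never mentions.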


Document Parameter Details

The @param Javadoc tag allows one to specify what each parameter to a method represents. This is where significant information about the parameter can be specified such as allowed range, whether null is allowed or not, and any relevant units for that parameter (such as if the expected unit is seconds or minutes). Some of this can be specified by parameter name (such as secondsUntilOperationIsFinished), but it actually becomes less readable to include every detail about the parameter in the name (secondsUntilOperationIsFinishedCannotBeNullMustBeBetweenZeroAndSixtySeconds).

Much of this discussion also applies to the @return tag, where range, null or not null, and units considerations are often important.
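As a sketch of how range and units can live in @param and @return rather than in the parameter names, consider the hypothetical RetryPolicy class below. The class, the limits of 1 to 60 seconds, and the one-hour cap are all invented for illustration.

```java
/**
 * Calculates delays between retries.
 * (Hypothetical class used only to illustrate @param and @return details.)
 */
final class RetryPolicy {

    /** Longest delay ever returned, in seconds (one hour). */
    private static final long MAX_DELAY_SECONDS = 3600L;

    /** Not instantiable; all methods are static. */
    private RetryPolicy() {
    }

    /**
     * Calculates how long to wait before the next retry.
     *
     * @param attempt Number of attempts made so far; must be 1 or greater.
     * @param baseDelaySeconds Delay before the first retry, in seconds; must
     *     be between 1 and 60 (inclusive).
     * @return Delay before the next retry, in seconds; doubles with each
     *     attempt, is never less than {@code baseDelaySeconds}, and is never
     *     more than 3600.
     * @throws IllegalArgumentException if either argument is out of range.
     */
    static long delaySeconds(final int attempt, final int baseDelaySeconds) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be 1 or greater: " + attempt);
        }
        if (baseDelaySeconds < 1 || baseDelaySeconds > 60) {
            throw new IllegalArgumentException(
                "baseDelaySeconds must be between 1 and 60: " + baseDelaySeconds);
        }
        // Limit the shift so the doubling cannot overflow before the cap is applied.
        final int doublings = Math.min(attempt - 1, 12);
        return Math.min(MAX_DELAY_SECONDS, (long) baseDelaySeconds << doublings);
    }
}
```

The parameter name baseDelaySeconds carries the unit, which is cheap to express in a name, while the range and the behavior at the limits stay in the comment where they remain readable.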


Document Package and Class Usage

More recent Java packages and classes in the JDK seem to have better descriptions of how to apply those packages and classes. A good example of this at the class/interface level is the highly informative Javadoc documentation for the JAXB Marshaller and Unmarshaller interfaces. The documentation for these interfaces demonstrates how each can be used: Marshaller to write out XML from bound Java objects and Unmarshaller to read XML into bound Java objects. The javax.management package description similarly provides an example of a highly informative description of how a package and its significant classes are used.

Many of the SDK descriptions provide links to non-Javadoc references on the subject and the Spring Framework does the same thing with links to its reference documentation from its API documentation. This is often accomplished with {@link}, available since JDK 1.2 with closely related {@linkplain} available since JDK 1.4. Overview and package-level documentation can provide a valuable pointer to where to start with the library or framework being documented.

It is not surprising that the JAXB interfaces and JMX package referenced above have such highly informative Description sections because these are likely to be used by Java developers of many different skill levels. The audience here is Java developers of a wide variety of skill levels in terms of both breadth and depth and so it is helpful to provide relatively introductory information on how to use these interfaces, packages, and APIs. Such descriptive details may not be as important for a small team consuming their own code and its Javadoc documentation.
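A class-level description with a usage example might be sketched as follows. TemperatureConverter is a hypothetical class, and the URL in the @see tag is a placeholder that does not point at real documentation.

```java
/**
 * Converts temperatures between Celsius and Fahrenheit.
 *
 * <p>Typical usage:</p>
 * <pre>{@code
 * double boiling = TemperatureConverter.toFahrenheit(100.0);
 * double freezing = TemperatureConverter.toCelsius(32.0);
 * }</pre>
 *
 * <p>Start with {@link #toFahrenheit(double)}; {@link #toCelsius(double)}
 * is its inverse.</p>
 *
 * @see <a href="https://example.com/guide">User guide (placeholder URL)</a>
 */
final class TemperatureConverter {

    /** Not instantiable; all methods are static. */
    private TemperatureConverter() {
    }

    /**
     * Converts degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius Temperature in degrees Celsius.
     * @return Equivalent temperature in degrees Fahrenheit.
     */
    static double toFahrenheit(final double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    /**
     * Converts degrees Fahrenheit to degrees Celsius.
     *
     * @param fahrenheit Temperature in degrees Fahrenheit.
     * @return Equivalent temperature in degrees Celsius.
     */
    static double toCelsius(final double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}
```

In this sketch {@link} is used for pointers to other documented elements, while the pointer to material outside the Javadoc uses an ordinary HTML anchor in an @see tag. For a team that reads only its own code, the usage block could reasonably be dropped.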


Use HTML Carefully in Javadoc

Javadoc allows HTML tags to be embedded in the Javadoc comments. While this is handy for nicely formatted and styled HTML presentation of the comments, it can be distracting for the person trying to read the comments directly in the code. Many of the IDEs mitigate this problem to a certain degree by doing their own representation of the tags rather than listing the tags directly. However, too many HTML tags can still be a burden for the reader and maintainer of those comments, especially when they are using a text editor or IDE that doesn't process the Javadoc tags in any special way. On the other hand, the nature of HTML is such that even basic things like white space beyond a single space not being respected in the output can lead to ugly and hard-to-read HTML if no tags are used.

It definitely seems more like art than science to find the happy medium between too little HTML and too much HTML in Javadoc comments. If you know your audience is primarily developers reading the code directly, you might be best suited to writing little or no HTML in your Javadoc. If, however, you are delivering a framework or library or are in some other context where people without access to the source code or without a desire to read source code will be significant users of the HTML version of your Javadoc, the HTML tags might be more important. This is especially true if the goal is to reduce documentation to the code itself with little or no external design or implementation documentation. In that case, you almost certainly will have clients, managers, testers, or other stakeholders who prefer the HTML documentation over readable in-code Javadoc.

I have found that a few general principles have helped me to determine how much HTML to embed in Javadoc. However, I also need to point out that I still struggle with this now and then. Generally, in light of the value of efficient Javadoc, I tend not to use HTML in any Javadoc comments that can adequately describe the Java element in one paragraph or less. This is often the lion's share of my data members and methods and even applies to a large percentage of my classes. When I need more than one paragraph in a Javadoc comment, I don't want it all munged together on the web browser, so I use the <p> and </p> tags to separate paragraphs. These are minimalistic and, because they go on the very beginning and very end of each paragraph, have little impact on the readability of the text between them.

Some other preferences I have developed include using {@code} rather than <code> for specifying a code font. I also prefer {@code} or {@literal} for representing text that should not be interpreted as HTML when the Javadoc is generated. For example, I think it is far clearer to read the Javadoc comment {@code Map<String,Object>} than it is to read <code>Map&lt;String,Object&gt;</code>. Both {@code} and {@literal} have been available in Javadoc since J2SE 5.
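The sketch below puts these preferences side by side in one comment. The Registry class is a hypothetical example; the comment on snapshot() shows the {@code} form, the escaped HTML form, {@literal}, and <p> tags used only to separate paragraphs.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Keeps named objects.
 * (Hypothetical class used only to compare Javadoc markup styles.)
 */
final class Registry {

    /** Registered objects keyed by name; never null. */
    private final Map<String, Object> entries = new HashMap<>();

    /**
     * Registers an object under the provided name, replacing any earlier one.
     *
     * @param name Name to register under; must not be null.
     * @param value Object to register; may be null.
     */
    void register(final String name, final Object value) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        entries.put(name, value);
    }

    /**
     * Provides a copy of the registered entries as a {@code Map<String,Object>}.
     *
     * <p>The same type written with plain HTML needs escaped angle brackets
     * and is harder to read in the source:
     * <code>Map&lt;String,Object&gt;</code>.</p>
     *
     * <p>Outside of a code font, {@literal size() >= 0} shows how
     * {@code {@literal}} keeps characters such as the greater-than sign
     * from being treated as HTML.</p>
     *
     * @return Independent copy of the entries; never null.
     */
    Map<String, Object> snapshot() {
        return new HashMap<>(entries);
    }
}
```

Read in the source file, the {@code} version of the type looks almost exactly like the code it describes, which is the main reason I find it less distracting than the escaped form.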

One final observation I've made regarding HTML in Javadoc is that it bothers me less to have HTML embedded in Javadoc when I am immersed in a task that is heavily web-oriented. My only explanation for this is that in such cases I am often working in HTML or related web languages anyway and so the HTML in the Javadoc comments is less distracting. In other words, I get better at quickly scanning and almost subconsciously skipping the HTML portions of the comments.


Consider Javadoc Extensions

In some situations, the reluctance to use Javadoc might stem from feeling like the Javadoc documentation is repetitive of the code itself or of other external documentation. As described above, this can be partially resolved by writing concise Javadoc comments that reference external documentation for much greater detail. Javadoc can also be extended as described in Documenting Java Member Functions to use custom tags that allow you to specify things such as concurrency concerns that are not easily included naturally in the code itself. Finally, there are many creative ideas out there for extending Javadoc's usefulness with third party tools. These include Javadoc tools (including IDEs) that make creation and maintenance of Javadoc comments easier, but they also include things like the ability to embed UML diagrams (see also UMLGraph) in the Javadoc-generated documentation.
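One way to get a custom tag with the standard tool is the javadoc -tag option, which takes the form -tag name:placement:"heading". The sketch below assumes a made-up tag named example.threadsafety (a name containing a dot is unlikely to collide with a future standard tag) and a hypothetical IdGenerator class. Passing -tag example.threadsafety:t:"Thread Safety:" to javadoc should render the tag under a "Thread Safety:" heading on type pages.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hands out unique, increasing identifiers.
 * (Hypothetical class used only to illustrate a custom Javadoc tag.)
 *
 * @example.threadsafety Safe for concurrent use; the only state is an
 *     {@link AtomicLong}, so no external synchronization is needed.
 */
final class IdGenerator {

    /** Most recently issued identifier; zero before the first call. */
    private final AtomicLong lastIssued = new AtomicLong();

    /**
     * Provides the next identifier.
     *
     * @return Identifier that is one greater than the previous one; the
     *     first identifier is 1.
     */
    long next() {
        return lastIssued.incrementAndGet();
    }
}
```

Without the -tag option on the command line, the tool does not know the tag and will typically complain about it, so the option belongs in the build script alongside the comment convention.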


Maintain the Javadoc Comments

One of the biggest arguments against comments of any kind is that they can (and often do) become obsolete quickly as the code changes. To help deal with this, the idea of minimalistic Javadoc is again appealing. Carefully chosen smaller amounts of Javadoc are less likely to contain something that becomes obsolete and will be easier to update when necessary. In addition, any developer changing a Java element described by a Javadoc comment should always verify the Javadoc comment as part of the maintenance process rather than assuming it is still okay. That being said, I'll freely admit that this is easier said than done and is often ignored in the heat of meeting deadlines and tight schedules, or simply because dealing with Javadoc is less fun than writing executable code.


Use Standard Javadoc Conventions

Javadoc can be more easily maintained and read when it is consistent and based on standards. The document How to Write Doc Comments for the Javadoc Tool is a good place to start. The articles I Have to Document THAT? and Documenting Java Member Functions also provide insight into what should be documented with Javadoc and how to best accomplish that.


Conclusion

Many of the recommended practices based on my observations depend on the audience. Whether to use HTML and how much HTML to use depends on who will use the Javadoc comments the most and in what contexts. Similarly, how and what is documented depends on the audience. However, even for developer-centric Javadoc where the HTML output is of little importance, Javadoc is still useful for documenting things like thread safety, exception handling expectations, ranges and valid values for parameters and return types, and other information related to the contract that cannot be easily expressed by the code itself. While many comments can be eliminated via highly readable code, well-chosen and well-written Javadoc comments tend to be among those that I prefer to complement the code with.


Feedback is Welcomed!

I would like to hear any suggestions of things you have found useful in making more efficient or effective use of Javadoc. It would also be interesting to hear of any tools you like to use in conjunction with Javadoc or hear about any alternative mechanisms you prefer for documenting your code.

3 comments:

Anirudh Vyas said...

Your strong belief that javadoc should be written only where needed/or by belief does not fly on two accounts:

1.) If a public API expands to include some "private" APIs then in any event developers will have to javadoc.
2.) By javadocing everything we have a rule of thumb for our developers that do this blindly, furthermore makes them more efficient at javadocing itself.

@DustinMarx said...

Anirudh,

Thanks for the comment. I have been known to document all my methods with Javadoc out of habit and don't see anything wrong with that. However, I also have a difficult time faulting anyone for not documenting such obvious methods. I do think there are some advantages to #2 you mention, though documenting trivial get/set interfaces is unlikely to significantly help document more complex examples that really need it other than perhaps the habit.

I'm less concerned about the #1 item because anyone changing a private method to public should be very cautious about changing the public-facing interface and, though important, adding Javadoc documentation should be just one part of the consideration and effort invested in making that change.

As I said, I typically do document all the methods, if for no other reason than that I use the Javadoc comment sections as easy visual separators between methods. However, I also understand when other developers don't feel the need to document a truly clear, short get/set method.

Thanks again for the response.

Dustin

Tomek said...

Hi Dustin,

thanks for this comprehensive post on javadocs. Recently I got really frustrated with the javadocs I see frequently and gathered a few anti-patterns for javadocs writing. You can find them here http://kaczanowscy.pl/tomek/2012-02/pretty-useless-javadocs

I feel like we are attacking the same problem from two different sides - you show how to do it, and I make a laugh at the misuse of javadocs.

--
Cheers,
Tomek Kaczanowski