Tuesday, May 11, 2010

Viral Bad Code: Why and What to Do About It

In the Code Anthem blog post Bad Code is Viral, Amber Shah describes why "writing bad code is viral." This is a great post that reminds me of a pattern I have seen repeatedly: bad code does seem to have a viral nature. Even when the original bad code is rooted out, it is often removed too late and the infection has spread into other pieces of the code. In this post, I look at some of the observations I have made regarding bad viral code and how to contain it and eradicate it.


Why Bad Code is So Viral

There are many reasons why bad code is so infectious and viral. I discuss some of these reasons here. It is often true that code approaches, good or bad, are copied and imitated. When good approaches spread, it's a best practice; when bad approaches spread, it's a virus.


Bad Practices Can Be Learned

The Code Anthem blog post I cited at the beginning of this post mentions one of the reasons bad code can be so viral: "Your code base is the primary training manual for new developers on your team. If they’re learning from crappy code, that’s how they will write code too." We all learn from writing our own code and reading and maintaining existing code. Especially when we don't know better, it is easy to "pick up" new code approaches without regard to how good or bad they are. With experience, we are more likely to recognize bad ideas when we see them. However, even experienced developers can learn bad habits, especially when learning something new to them.


Bad Code Can Be Justified

Most developers don't intentionally write bad code because they enjoy it. They either don't know better or knowingly write bad code because they have justified it to themselves. Common justifications include time constraints, lack of resources, and the intention to replace the code later when there's more time.


Copy-and-paste "Reuse"

There are many problems associated with copy-and-paste reuse, not the least of which is the greater likelihood of spreading bad code. If there is a logic problem with the original code, or if the code needs to be changed for any reason, the same change likely needs to be made in every copy as well. Similarly, even if the original bad code is completely cleaned up, its effects may linger long afterward in the copies.
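As a rough illustration of the alternative to copy-and-paste, here is a minimal Java sketch in which two callers share one implementation of a rule instead of each pasting its arithmetic. The class and method names (DiscountRules, InvoiceService, QuoteService) and the discount rule itself are hypothetical and chosen only for this example.

```java
// Hypothetical example: every name and the discount rule are assumptions
// made for illustration, not taken from any real code base.
final class DiscountRules {
    private DiscountRules() {}

    /** The one place where the discount arithmetic and its validation live. */
    static long discountedCents(long priceCents, int percentOff) {
        if (priceCents < 0) {
            throw new IllegalArgumentException("priceCents must be non-negative");
        }
        if (percentOff < 0 || percentOff > 100) {
            throw new IllegalArgumentException("percentOff must be between 0 and 100");
        }
        return priceCents - (priceCents * percentOff) / 100;
    }
}

final class InvoiceService {
    private InvoiceService() {}

    /** Calls the shared rule rather than carrying a pasted copy of it. */
    static long totalCents(long unitPriceCents, int quantity, int percentOff) {
        return DiscountRules.discountedCents(unitPriceCents * quantity, percentOff);
    }
}

final class QuoteService {
    private QuoteService() {}

    /** A second caller; a fix in DiscountRules reaches it automatically. */
    static long quoteCents(long priceCents, int percentOff) {
        return DiscountRules.discountedCents(priceCents, percentOff);
    }
}
```

If the rule turns out to be wrong, there is a single method to correct, and no stray copies are left behind to keep the old behavior alive.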


Bad Code is Often Easiest Initial Code

One of the reasons it is easy to write bad code, and even easier to copy and use it, is that bad code often seems easiest at first. Certain approaches may seem like a good idea at the time, or at least not like a bad idea, but then prove disastrous in the long run. This is related to the time constraints discussed above: a developer chooses the easiest way for short-term benefit at the price of long-term detriment.


The First Impression is Not Always Correct

I've heard it said that first impressions are often correct. That's increasingly the case as one gains experience, but it's not always true and it's certainly often not true when one lacks experience in a given area. In our rush to deliver, we sometimes go with our first impression because it's "good enough" and we don't want to "undo" anything we've already designed and implemented. This can become very dysfunctional if we start shoehorning a solution into our original concept even when it doesn't really fit. Another developer can be negatively influenced when he or she sees a poorly implemented solution and it becomes his or her "first impression" as well, blocking out other ideas. There is an interesting twist here: even "good code" in one situation might become "bad code" in a different situation. This is one of the reasons I believe it is good to infuse new blood into a development team to bring in new ideas and break people free from first impressions and "that's the way we've always done it" thinking. Individuals can also gain these benefits from choosing to work with different developers.


Ulterior Motives

I have blogged before about how selfish developers can be a real drag on a project's success when they succumb to dysfunctional motivations. Bad code is just one symptom of this and can easily spread if multiple developers are infected by such motivations.


Afraid to Ask

One of the reasons that bad code is introduced into systems is that the person introducing it simply doesn't know better. If that developer is afraid to ask questions, or otherwise chooses to go it alone rather than getting help, he or she is more likely to introduce bad code or to spread it. Indeed, this is another case where even code that is considered good in one use is bad when misapplied. Asking the author of the original code why it was "good" might help the developer understand that a similar approach won't work as well in his or her different situation.



Remedies for Bad Code Viruses

Just as with human health, the best protection against the bad code virus is preventative. The following are some of the ideas the doctor ordered for dealing with viral bad code.


Emphasis on Quality and Craft

I doubt there will ever be a day when we developers don't need to be as efficient as possible, or when we think we have more time than we need to finish a given project. That being said, it is important to instill the concept of software craftsmanship in ourselves and our fellow developers. Once long-term costs are taken into account, bad code looks far more expensive.


Code Reviews

One of the best ways to stop bad code in its tracks before it can spread is to have experienced developers review each other's code as well as less experienced developers' code. It is also helpful to have less experienced developers review experienced developers' code, not only so they learn from what they're reviewing, but also because even experienced developers make mistakes. Indeed, experienced developers who work together for a long time might pick up and reinforce each other's bad habits, and a less experienced developer might bring a fresh perspective. Code reviews may not be my favorite part of the job, but they can be highly valuable.


Maintain Own and Others' Code

I've heard some very experienced developers state that all new developers should be required to maintain large software code bases for at least one year before being let loose to develop new code. Although this would be difficult for many of us to stomach, I do think there's some wisdom in it. I have probably learned as much or more about what makes good code from maintaining my own code (good and not so good) as I have from writing new code. Writing new code helps a developer learn how to be initially productive and solve an initial problem, but code maintenance helps the developer learn how to write quality code that stands the test of time. Many positive features of software, such as extensibility, flexibility, encapsulation, and layering, are far more obvious during maintenance than during initial development. For me, there's no question that the best developers are those who understand the pain of those who will someday be maintaining their code (and who may well be themselves).


Read and Apply Lessons Learned and Best Practices

Over the years, Josh Bloch's Effective Java has continually improved as a reference for how to write effective Java code. It's amazing that a book with only two editions can change so much, but the more I work with Java, the better that book gets. Books such as Effective Java (there are similar books for many programming languages) provide good ideas on how to write better code and how to recognize existing bad code.


Use The Code Before Committing It

Some code is most easily recognized as "bad" when one tries to use it. Unit tests can help with this as a built-in mechanism for using the code. If unit tests that exercise the code are difficult to write, there is likely a problem with the code that will frustrate other users of that code as well. I often wish developers would use their own APIs before releasing them on others. If a piece of code seems "okay" or "good enough," an appropriate question might be, "But would I want to use this or call this?"
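To make the "use it first" idea concrete, here is a minimal Java sketch in which the author of a small class writes the first client before anyone else has to. ReadingLog and ReadingLogFirstUse are hypothetical names invented for this illustration. The sketch uses plain checks rather than a test framework so that it stays self-contained; on a real project the same calls would normally live in a unit test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical class under development; the name and API are assumptions.
final class ReadingLog {
    private final List<Double> readings = new ArrayList<>();

    /** Returns this so that calls can be chained by the caller. */
    ReadingLog add(double value) {
        readings.add(value);
        return this;
    }

    /** Empty when nothing has been added, so callers never divide by zero. */
    OptionalDouble average() {
        return readings.stream().mapToDouble(Double::doubleValue).average();
    }
}

// The author acting as the first caller of the API.
final class ReadingLogFirstUse {
    public static void main(String[] args) {
        ReadingLog log = new ReadingLog().add(10.0).add(20.0);
        if (log.average().getAsDouble() != 15.0) {
            throw new AssertionError("average of 10.0 and 20.0 should be 15.0");
        }
        if (new ReadingLog().average().isPresent()) {
            throw new AssertionError("an empty log should have no average");
        }
    }
}
```

If writing even these few calls had felt clumsy (awkward setup, unclear return values, surprising behavior on an empty log), that would be a hint that other callers would struggle too.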


Be Wary of Comments Hiding Stinky Code

As with most software development controversies, I find that the two extremes (don't write any comments and comment everything profusely) are too extreme for practical large-scale projects. Instead, the best answer lies somewhere between the extremes, and where it falls depends on the problem at hand, the language being used, and so forth. Although I think some comments are appropriate, too many comments are often indicative of bad code. It has been said that "comments are the deodorant for stinky code." If it takes many lines of comments to explain relatively few lines of code, there's a good chance that the code is very stinky. In such cases, the code should be refactored or improved so that it speaks better for itself and so that the bad code (with or without the comments) is not applied elsewhere in the system.
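As one possible illustration, the following Java sketch shows the same rule written twice: once with terse names propped up by a comment, and once with names that carry the explanation themselves. ShippingPolicy, its methods, and the shipping rule are all hypothetical and exist only for this example.

```java
// Hypothetical example: the class, methods, and thresholds are assumptions.
final class ShippingPolicy {
    private static final double MEMBER_FREE_SHIPPING_MINIMUM = 25.0;
    private static final int BULK_ORDER_ITEM_COUNT = 5;

    private ShippingPolicy() {}

    // Before: the comment has to say what the code does not.
    static boolean check(int a, double t, boolean m) {
        // free shipping if the customer is a member and the total is at
        // least 25, or if there are 5 or more items regardless of membership
        return (m && t >= 25.0) || a >= 5;
    }

    // After: same behavior, with the explanation moved into the names.
    static boolean qualifiesForFreeShipping(int itemCount, double orderTotal, boolean isMember) {
        boolean memberOverMinimum = isMember && orderTotal >= MEMBER_FREE_SHIPPING_MINIMUM;
        boolean bulkOrder = itemCount >= BULK_ORDER_ITEM_COUNT;
        return memberOverMinimum || bulkOrder;
    }
}
```

The second version needs no comment to explain what it does, and a developer who copies its style elsewhere copies the clarity along with it.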


Mix It Up

All of the above recommendations can be effective in terms of reducing introduction of bad code and reducing and hindering its spread. However, these practices are most effective when used together. For example, reading quality references such as Effective Java is a useful activity, but reinforcing those concepts with practice in terms of development and maintenance makes it more likely that the principles will be understood and remembered.


Conclusion

Good and bad coding techniques both tend to get copied and to spread. We welcome the spread of best practices and fret over the spread of dysfunctional practices. The problem is that we don't always know how to differentiate between the two. This post has attempted to cover some of the most common reasons for the rapid spread of bad code and has also outlined some effective practices for reducing the entry of bad code into our systems and hindering its spread.
