New software projects need to get off the ground. If 1.0 isn’t a success forget about 2.0. You move fast, change direction often, and don’t worry too much about the details.
If you’re lucky version 1 leads to 2 and 3 and 4. A few years pass and you’ve built up a customer base. People like the product, your job is secure, life is good. And you have a big problem.
Everything you did to push 1.0 out the door is coming back to haunt you. The code is full of spaghetti nobody wants to touch and development becomes a series of frustrated meetings where everyone asks, “how can that simple feature possibly take so long?” You have so much legacy code you can’t make any big changes. Your nimble product has turned into an aircraft carrier.
At first it’s hard to tell you’re in trouble. Sure there are a few rough patches, but all software has warts. You made it this far, you must be doing something right.
Then some of your old engineers leave the project. New ones join. If you’re lucky the new ones tell you you’re in a hole. You can dig out, but it takes time, effort, and a lot of money.
I’ve seen this process enough times to know it’s avoidable. It just takes planning for the future at a time when everyone around you wants to focus on today. That’s why I use Sonar and Checkstyle on all my new Java projects.
Sonar
settings
Sonar and Checkstyle are static analysis tools that look through your code for bugs and formatting issues. They’ll tell you if you assigned a variable to itself or put your curly-braces on the wrong line. Code checking tools make your development time shorter and get rid of annoying little bugs, but it’s always hard to sell them to your team.
These tools feel like they make you go slower. Maybe you know it’s good for you, but nobody likes being told to eat their vegetables.
Style tools are also the source of religious wars between programmers. Where should your curly-braces go? What is the right way to name your methods? How many lines before a class is too long? There’s no right answer to these questions, but there are consistent answers.
Checkstyle
checks
I want to share with you the way we use these tools, the rules we think are important, and how we convince new teams to use them.
Curly-braces
The curly-brace debates have been going on since we started writing ALGOL in 1968. Probably before that. So what looks better to you?
if (foo)
{
doStuff();
}
if (foo) {
doStuff();
}
One of them is more compact, the other one easier to line up. Which is the right way? There isn’t one. We’ve debated this issue for over 40 years because there’s no compelling argument either way.
We force curly-braces on the same line, but I can’t tell you why. I can say that keeping them consistent makes our code easier to read. Try switching back and forth too often and you’ll get lost.
No tabs
Using tab characters is another religious debate. Some people love tabs, others hate them. We ban them. We add hooks so you can’t check in files with tabs.
Tabs mostly cause us trouble because of our very mixed development environment. Our developers work on Windows, Linux, and Mac because we ship products on all of those platforms. Moving from one to the other means most developers use different editors.
Some editors make tabs eight spaces and some make them four. Most editors can show you correct indentation if you have one or the other, but mix them up and even the best editors get confused.
We indent four spaces and ban tabs. This is another area where the code consistency pays off.
Member names
We force all member names to start with m_
. Let me show you why with an example. Can you spot the bug here:
public class User { private String firstName; private String lastName; private String email; private String telephone; public User(String strfName, String strlName, String strEmail, String strTelephone) { this.firstName = strfName; this.lastName = strlName; this.email = email; this.telephone = strTelephone; } }
This simple class has a very common bug. Did you find it yet? Take another minute.
The variable email
will never be assigned. Someone might depend on it being there and get a NullPointerException
. Worst of all it will become a run-time exception.
Run-time exceptions are scary. They happen unpredictably and it’s easy to miss them in testing. We always try to turn run-time exceptions into compile-time exceptions. Those are easy to find and rarely cause problems for customers. As long as you have an automated build the chances of shipping a with a compile error are minuscule.
File length
Almost everyone can agree that a 500 line source file is fine and 10,000 lines is too long, but there are a lot of opinions in between. We limit our file lengths to 2,000 lines. There isn’t a really good reason for that number. It could have been 1,500 or 2,500. The key is choosing a length and sticking to it.
You may have a really good reason for making your file giant, but most of the time it turns into a 5,000 line monster. Too big to easily document and too complex to maintain.
JavaDoc
When I started programming I was against JavaDoc. It seemed like a crutch. I thought code should be readable without documentation. I still believe that, but there are some things your code can never say.
The code will never tell another programmer what your intention was. Is that missing block a bug, a feature you haven’t implemented, or there for a reason? The programmer’s intention is very important. The code is the implementation, but not the plan.
The biggest advantage of good comments is that they enable refactoring. The comments make it much easier for another developer to change something if they know how it was meant to work in the first place.
We require that all public
, protected
, or package private methods have JavaDoc which includes every argument and every declared exception. If you skip JavaDoc then you get a Checkstyle warning.
No @author tags
Speaking of JavaDoc, we ban @author
tags in our code. I never liked these tags, but it was Karl Fogel who first explained to me what I didn’t like. Karl is a founding member of the Subversion project where they also ban author
tags. For him it comes down to the bus factor.
The bus factor is the answer to a simple question, “how many people would need to get hit by a bus before nobody understands this code?” You want the number high and @author
tags drive it lower.
The @author
tag is a signpost telling everyone else to stay away. It creates an ownership mentality of a single class. Normally ownership mentality is great. When every developer feels like they own the product it drives quality up and fuels the passion that leads to inspiration.
The problem comes when you feel like the exclusive owner. Then your bus factor stays at one. If you leave the project it goes to zero. We encourage everyone to look at every part of the code base and banning @author
tags helps us do it.
Spacing
This is one more issue with no right answer. What feels better to you?
String s[]=getMyStrings(); for(int i=0;i<s.length;i++) { System.out.println(s[i]); }
String s[] = getMyStrings(); for (int i = 0; i < s.length; i++) { System.out.println(s[i]); }
You might ask why put the space before the parenthesis and not after or why spaces around the plus sign but not the semi-colon. There isn’t a right answer. We chose this because it was the Sun Java coding standard, but the key is consistency.
Cyclomatic complexity
Spaghetti code is about more than just the code formatting. Methods that are too complex to understand are too complex to debug.
Complexity is easy to add. One developer adds a slightly complex method and then another adds one more condition. Then you need a special case for one more condition. And one more. Go through a couple of iterations and your code path is a maze of back alleys and dead ends.
Labyrinthine code is a sanctuary for bugs, but it’s something worse. Complex code is immovable code. Bringing new developers up to speed takes months and even the experienced ones can’t move the project very far.
The most dangerous part is that complex code is easy to ignore. You’ll fix it someday, but someday never comes. This technical debt drags your project down.
Sonar analyzes cyclomatic complexity and moves the complex code argument from opinion to fact. You choose how much complexity you can live with and warn programmers when their code complexity gets too high.
We set our cyclomatic complexity maximum at seven. That’s enough code paths for a few if
statements with a couple of while
loops thrown in. Anything more and it should be a separate method.
Bugs
So far we’ve talked mostly about formatting issues. They often lead to bugs, but they might not be bugs yet. Now let’s look at a few of the real bugs Sonar finds in most projects.
Empty if or else statements
boolean b = getMyBoolean(); if (b) { return; } else { }
What did the programmer who wrote this mean? Is it a copy and paste error or a real bug? Did they just forget to add code to that else statement? This leads to confusion, bad code, and technical debt. We don’t allow it.
Equals without hash code
public class User { private String m_name; public User(String name) { m_name = name; } @Override public boolean equals(Object o) { return m_name.equals(o); } }
This one comes from the excellent Effective Java by Joshua Block. The Java uses the hashcode
and equals
methods together in many of the collection classes. Overriding one without the other is always a bad idea.
Catching NullPointerException
try { myRiskyMethod(); } catch (NullPointerException npe) { // No need to do anything here }
Ever seen code like this? I know programmers that talk about code smell and this code smells like a wet dog fresh from the compost pile.
This code gives more questions than answers. Why is this method risky? What does it mean if it throws a NullPointerException
? Why couldn’t you handle it properly in the first place?
Leave this in your code and you’re a lazy programmer.
Unused locale variables, private methods, and formal parameters
If your method takes three arguments you better use them. If you don’t then why did you write it that way. What did you mean? Did you just forget or is this a bug?
Sonar plus Checkstyle gives us hundreds of checks and we try to fix all of them. I make Checkstyle warnings build errors to force programmers to fix all of them. I do it because fixing the warnings is faster than not fixing them.
Digging your way out
Using Checkstyle and Sonar for a new project is relatively easy. Choose the rules you want to enforce and make the build break if they aren’t followed. Your project will stay at zero warnings forever.
The hard part is existing projects. The first time you run the tools they’ll give you thousands, probably tens of thousands, of warnings. The sheer magnitude of them is overwhelming. I’ve seen many projects run their first report and never look at Checkstyle or Sonar again. These errors are worth fixing and don’t take as long as you’d think. The way forward is a simple process of tools, segregation, and persistence.
Use the tools
There are many products and open source projects that will fix up formatting errors in Java code. They can analyze your code and apply your style guidelines. The best one I’ve found is Eclipse.
Eclipse Code Style
Eclipse will format an entire project with a single click. It takes a little while on large projects, but you can run it overnight. The Eclipse code formatter will fix thousands of errors at once. The problem becomes much more tractable with all the simple formatting issues out of the way.
Segregate the code
It’s much easier to stay at zero warnings than 100. 100 and 101 warnings look identical, but the different between zero and one is staggering. Leave 100 warnings and pretty soon the number will creep up.
Segregating your code is the easiest way to get to zero. If you’re starting a new module check it separately from the rest. Keep that part at zero and then work on some of the others. Segregation makes modules easier to test and gives you more manageable goals. Maybe you can fix one module per month.
Persist
You don’t have to fix every error in a day. Slow and steady progress will improve your code quality over time. Just keep at it.
We often assign a special portion of each development effort to fixing these errors. Every developer devotes a little time to it and slowly works the number of warnings down.
The true cost of technical debt
You’ll never lose a sale because your curly-braces are on the wrong line or your files have tabs. You will lose customers because of missing features, unstable software, and a bad reputation. Spending your limited time adding features is a good thing. That makes these warnings seem trivial, but fixing these warnings will save you time.
Every Sonar or Checkstyle warning is a small bit of technical debt. They cause bugs, make your code more difficult to maintain, and turn your product into spaghetti code that nobody wants to work with. One or two don’t matter, but pile up hundreds and you’re in trouble.
These warnings are the issues you shouldn’t waste time on. They’re the NullPointerException
you spend a day tracking down and the bad formatting that costs you an extra hour understanding the code. Let the tools find these problems so you can focus on the important features of your product.
Good software comes from distance. It’s only when you can step back and see the forest from the trees that you make the hard decisions and design the features your customers really need. If you spend all of your time putting out fires you’ll never get to the fun and profit.