Follow Blog via Email

Category: Practices

I’m currently working on a legacy code base that’s not a shining example of clean code and best practices.

I try to improve the code as I work, a careful balancing act where I do my best not to impact productivity by spending too much time on refactoring and reformatting.

To help me with this task I decided to look into linters. Note that I’m not using ReSharper which would help with code quality, but that’s another story.

I tried StyleCop, but with the default settings I just wasn’t impressed. Many rules are in my opinion needlessly restrictive. It also seemed to simply dump more than a thousand warnings in the error window.

I tried SonarLint next and I must say I was thoroughly impressed. Though the errors are also sent to the error window they are also shown inline in the file. The parsing by SonarLint was fast and running the Visual Studio extension only had a very small impact on performance even while working with a solution containing thousands of files and hundreds of thousands of lines of codes.

While I haven’t run ReSharper in years, performance was one of the two reasons I stopped using it. On large solutions like the one I’m currently working with, it would slow everything down to a crawl, even when disabling solution-wide analysis.

SonarLint isn’t exactly what I was expecting it to be. I thought it would be more for style guidelines but it has instead given me deeper insights. I’m specifically referring to my last two examples.

Or overlapping method signatures with optional parameters that can’t be used:

// This method's signature overlaps the second one
// and the default parameter value can't be used
public File GetFile(bool firstParam,
bool secondParam = false)
{
return new File();
}
public TestFile GetFile(bool firstParam)
{
return new TestFile();
}

This last one having been found three times in a single file.

SonarLint is available on the Visual Studio marketplace. It’s free and open source with a repository on GitHub. It’s created by a company that also sells a similar product (though geared towards CI / Build servers) but I haven’t found any downsides to using the free Visual Studio Extension.

By the way, I haven’t been asked to write this by the company that makes SonarLint.

The best use for TODO comments is when you are actively working on a change set, adding TODOs that will be replaced with actual code as you work. These TODOs will be removed before you commit to source control.

In this case you add a TODO, signifying an intention to return later on to complete the code.

Before committing code to source control, it’s always a good practice to review all your changes with a diff tool. This is the perfect time to scan for leftover TODOs in the code and implement them.

Another case for TODOs is for something that will truly be implemented in a later commit. Before leaving those kinds of TODOs in the code ask yourself if you shouldn’t rather create an issue in your issue tracker. Otherwise this second kind of TODOs tend to accumulate and pollute the source code. They experience the same kind of code rot as old commented out code does.

TODOs that you leave after a commit should be for changes that are small, implementable in the foreseeable future and relating to source code rather than business processes.

Before leaving those kinds of TODOs in the code ask yourself if you shouldn’t rather create an issue in your issue tracker.

On larger projects you tend to see TODO comments which are several years old. Just today I found one that source control indicated was 3 years old. The person responsible for the comment may not even be working here anymore. While reading this specific comment could give me a general idea of what should be accomplished it does not do so clearly enough that I can go do it. Or why for that matters it needs to be done. You tend to see these often enough.

This is similar to commented out code, where nobody knows why the code was commented out in the first place and thus leave it as it is thinking it may be important. As time goes by there is less and less chance of someone actually knowing why the code was commented out. So it stays there until someone is brave enough to just delete the useless comment (how can it be useful if no one knows why it’s commented out). Keeping track of old code is the source control’s job anyway.

As time goes on, I’m distancing myself more and more from rules that you must always follow or guidelines that make you lean towards absolutes.

Commenting every method and variable.

Documenting all classes.

Coding guidelines that need to be applied everywhere exactly the same way.

Over 90% test coverage.

The problem with these rules is that they don’t have any reasoning or intelligence behind them. You must comment everything even if it doesn’t make sense because some tool is checking for it. You must strive for 100% code coverage even if it means writing tests with little business value along the way.

Our strength as developers is our reasoning and our judgement.

I like where these guys are going : Why Inch doesn’t use percentages. They have the right idea. Inch is a documentation coverage tool which gives a you a letter grade rather than a percentage score. Their design takes into account the fact that not everything needs to be documented. Having a grade is less precise that a numerical score but that’s what makes it better. Having a percentage score, particularly with an enforced lower bound will often lead to people gaming the system or fussing too much over the details. Code metrics should be about guiding you to write better code.

Another interesting link on the subject is The code documentation fallacy. The author makes the point that given the choice between 10% or 90% comment coverage, it is better to have 10% because it means the comments that are there have a reason for being there. On the other hand if comments are enforced, most of them will probably be noise.

I can relate to this. I have worked on a large project which required 100% comment coverage of all public classes, methods, method parameters and return values. The team was using a refactoring tool that amongst other things allowed to generate these comments automatically.

I would say about 9 in 10 comments were auto-generated and looked something like this:

It’s surprising how fast the human brain learns to filter out these useless comments until it doesn’t even notice them. When these comments actually contained useful information they were being gleaned over.

The reverse, absolutely no comments at all, is just as bad. The problem here isn’t the comments or the lack thereof but the inflexible rules.

Google understands this concept as well. If you look at their coding style guidelines documents, (C++, Python, JavaScript) you will see the following:

Use common sense and BE CONSISTENT.

If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.

The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important.

Notice the mention about local style and the focus on the reasons for having guidelines.

If the person working on the web service layer, knowing the guidelines, truly believes he should use a different naming scheme for his methods I’d be inclined to trust him. He knows his specific context.

Context and judgement trumps blind adherence to rules. Aren’t they called guidelines anyway?

If, like me, you are working in an environment where English is not the main language you have encountered the problem of which natural language to use when programming. This is often an issue amongst bilingual teams.

There are three possibilities:

You can have all the code in English

All in the “native” language (French, Spanish, German, …)

Or in a mix of both

I have had work experience with all three situations and debated the issue with various people. After these experiences I have come to the conclusion that for traditional line of business software development the best approach is a mix of both languages.

If you are writing line of business software and the intended users work in another language than English you should use the user’s language for all business domain terms.

Otherwise you will have to translate these terms. Sometimes there will be no existing translation or it will be unknown to most of the users themselves. The developers will use these translated terms on a daily basis while working with the code.

You’ll have a situation where the developers and the users do not use the same vocabulary. Things will eventually get lost in translation, people will need to do constant mental mapping when talking amongst themselves and users will be puzzled when developers use a translation they have never heard of.

One of the common arguments against mixing two languages is the inconsistency it often breeds. Sometimes a word is written in English, later the same word is written in the other language. This creates confusion when people try to search through the solution to find something. This is a frequent problem and it needs to be addressed.

The best way to address it is to have clear guidelines about which words should be in English and which words in the other language.

I have found that what works best is having everything but the words of the problem domain (domain model, business domain, or whatever you call it) in English. This works well because it avoids problem areas like pronouns and prepositions. Two types of words that are susceptible from going either way.

It also keeps all of the software terms in English. Since most literature (online or offline) is in English, developers will appreciate using familiar and searchable terms.

Of course if you are working with other people and your only common language is English or you intend to put your code on the internet (ie: GitHub). You should write all of the code in English.

The same situation applies if your users are English speakers.

Likewise if you are writing code yourself for a personal project you can do as you please and in these case I always write everything in English.

Pros and cons of various choices

All in englishPros: consistency, more people can understand the code, allows for the use of the more common English terms to represent software concepts
Cons: if your clients or users use business domain terms in another language you will be faced with translation problems

All in the other languagePros: consistency, business domain terms are in the language of the users, may be easier for developers who do not have a firm grasp of English

Cons: software terms (ie: Load, Save, Factory, Adapter, Proxy) need to be translated, programming language keywords are in English which makes it impossible to use a single language throughout the source code

A mix of both languagesPros: the best of both worlds, software terms in English, business terms in the other language, reading the source code allows to easily distinguish between business and software concerns

Cons: often leads to inconsistency when not everyone applies the same rules all of the time