Static Code Analysis

These days I’ve been updating the Code Analysis (CA) settings on some old projects, and I wanted to change the configuration of certain rules that turned out to have been removed when Visual Studio 2008 shipped. The FxCop blog has a post with detailed information about which rules ship with each version of CA.

As you can see in the post, some of the rules were removed because they no longer apply or because they were too noisy compared with the benefit they brought. The removal that upset me most was “CA1818 – Do not concatenate strings inside loops”. This rule raised an error (I will stay in my imaginary world where everybody treats warnings as errors in production code) on any loop that appends i.ToString() to a string on every iteration.
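A minimal sketch of the pattern the rule targeted, reconstructed from the description in this post (a loop that concatenates i.ToString() onto a string). The class, method and variable names are hypothetical and chosen only for illustration.

```csharp
using System;

public static class ConcatInLoop
{
    // Appends the decimal form of 0..count-1 to a string,
    // performing one concatenation per iteration.
    public static string Build(int count)
    {
        string result = string.Empty;
        for (int i = 0; i < count; i++)
        {
            // The pattern CA1818 complained about: += on a string inside a loop.
            result += i.ToString();
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Build(5)); // prints 01234
    }
}
```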

One of the first things you learn in .NET is that strings are immutable, and there is a lot of literature on how to handle strings properly. Even Improving .NET Application Performance and Scalability, one of the best papers I’ve seen on .NET performance, explicitly advises against concatenating strings when the number of concatenations is unknown.

So today, working with the rule again, I was curious (hopeful) to check whether the rule had been removed not because it was too noisy but because the CLR had been improved to avoid the issue. I know this is living in my imaginary world again, but I’m a bit stubborn… What I did was compile a console project containing such a loop against 2.0 (VS 2005) and 3.5 (VS 2008): the first fires the error, the second doesn’t. The first step was to look for differences in the generated IL, and there are none: both cases produce the same IL.
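As a source-level picture of what that IL does: as far as I know, the C# compiler turns += on strings into a call to String.Concat, so the two methods in this sketch should behave the same way. The class and method names are hypothetical.

```csharp
using System;

public static class ConcatEquivalent
{
    // Concatenation written with the += operator.
    public static string WithOperator(int count)
    {
        string result = string.Empty;
        for (int i = 0; i < count; i++)
        {
            result += i.ToString();
        }
        return result;
    }

    // The same loop with the String.Concat call written out explicitly.
    public static string WithConcat(int count)
    {
        string result = string.Empty;
        for (int i = 0; i < count; i++)
        {
            result = string.Concat(result, i.ToString());
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(WithOperator(4) == WithConcat(4)); // prints True
    }
}
```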

No improvements on the compiler side. The next step was to check the generated native code, using the !u command in WinDbg (the post Inline Methods shows how to obtain the native code once the method has been jitted). The native code for both projects is the same, apart from some memory addresses.

So there are no improvements on the jitter side either. The next thing to verify is whether each iteration discards the previous string and creates a new one. Some of the good profilers on the market can show this, but in the end I decided to show how I did it with WinDbg, because anybody can download it for free.

With !DumpHeap -type System.String we can see all the string instances in the application. It returns a list with the memory addresses of the string instances; to view the contents of an object we just run !do (dump object) on the address we want to check. Taking a few samples from the instances with the highest memory addresses is enough to see the pattern.

Those samples show that each iteration leaves two strings behind: the result of i.ToString() and the result of the concatenation.

In conclusion, we are still creating a new string and leaving the previous version for the GC on every manipulation of the string. Therefore, I suppose the rule was removed simply because it was too noisy.
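The same behaviour can be observed without a debugger. The sketch below is not what I ran in WinDbg, just a rough illustration: it counts the iterations after which the variable refers to a different object than before. All names in it are hypothetical.

```csharp
using System;

public static class ReplacedInstances
{
    // Returns how many iterations leave the variable pointing
    // at a different string object than before the concatenation.
    public static int Count(int iterations)
    {
        string text = string.Empty;
        int replaced = 0;
        for (int i = 0; i < iterations; i++)
        {
            string before = text;
            text += i.ToString();
            // Strings are immutable, so the old object is never modified in place.
            if (!object.ReferenceEquals(before, text))
            {
                replaced++;
            }
        }
        return replaced;
    }

    public static void Main()
    {
        Console.WriteLine(Count(100)); // prints 100
    }
}
```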

Once I stopped playing with WinDbg, I came back down to earth to say what I wanted to say from the beginning. It’s a pity that a performance issue that has been pointed out so many times is now simply ignored because the rule is too noisy. I know a lot of people could argue that with StringBuilder we also discard old versions of the string when it grows beyond the size of the buffer, but that is still better than discarding the result of every single modification.
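For comparison, a minimal sketch of the same loop written with StringBuilder. The initial capacity is only a guess for illustration, not a tuned value, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class BuilderInLoop
{
    // Builds the same text as the concatenation loop, but appends into
    // a single growable buffer instead of creating a string per iteration.
    public static string Build(int count)
    {
        // Assumption: roughly two characters per number is a reasonable starting size.
        StringBuilder builder = new StringBuilder(count * 2);
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        // Only one final string is materialized from the buffer.
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build(5)); // prints 01234
    }
}
```

The buffer can still be reallocated when the text outgrows the capacity, but that happens far less often than once per iteration.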

I don’t understand why good string handling was so important before and is now just ignored by the main CA tool used by .NET developers.