C++/C# bad practices: learn how to make a good code by bad example

Checking the Source Code of MSBuild

As we continue developing PVS-Studio static code analyzer, we often have to check large open-source projects by renowned developers. The fact that even such projects contain a certain amount of bugs adds even more sense and weight to our work. Unfortunately, everybody makes mistakes. No matter how carefully you control your code’s quality, there is just no way to avoid “human error”. As long as software is developed by humans, analysis tools like PVS-Studio will remain relevant and needed. Today, we are going to discuss errors found in the source code of MSBuild, which is, unfortunately, not perfect either.

Introduction

Microsoft Build Engine (MSBuild) is a platform by Microsoft for building applications. It is usually used together with Microsoft Visual Studio but does not depend on it. MSBuild provides an XML schema for project files (*.csproj, *.vbproj, *.vcxproj) that controls how the build platform processes and builds software. The tool is shipped as part of the .NET platform and is written in C#.

Microsoft provides MSBuild source files for free upload to GitHub. Taking into account the high development standards applied by Microsoft, finding bugs in MSBuild might be a hard task even for top-quality static analyzers. But success comes with tenacity. With the help of PVS-Studio, version 6.07, we have checked the source code of MSBuild project, and here’s what we have found.

Analysis data and statistics

MSBuild consists of 14 projects, which include a total of 1256 source files in C#. That makes approximately 440,000 LOC.

PVS-Studio issued 262 warnings for this project. The general analysis statistics with differentiation of warnings across severity levels are shown in the following chart:

As you can see from the chart, the tool issued 73 high-level warnings, 107 medium-level warnings, and 82 low-level warnings. We will focus on the High and Medium levels, as they contain potentially dangerous constructs and genuine bugs, while Low-level warnings, though dealing with errors as well, often turn out to be false positives, so we don’t usually discuss them in our articles.

The analysis of MSBuild has revealed that about 45% of the High- and Medium-level warnings point to incorrect code (81 errors), while the rest warnings simply refer to constructs that PVS-Studio finds suspicious, and false positives rather than real errors. Some of the warnings were triggered by unit tests or code with comments about potentially dangerous constructs used to check for exceptions. In any case, the remaining warnings need to be examined by the developers as they are the only people who have the full knowledge of the code and can reliably estimate if the warnings are correct or not.

Based on these data, the PVS-Studio ratio of High- and Medium-level errors to 1 KLOC (i.e. error density) is only 0.184 (errors per 1 KLOC). This figure is not something to be surprised at in case of Microsoft projects and is a sign of the high quality of MSBuild’s code.

Now let’s discuss the analysis results in detail. Note also that the job of examining all the warnings issued for this project is beyond the scope of our article, so we’ll only talk about the most interesting defects, classifying them into groups.

This is probably a classical error: we see it in nearly every project we check. It has to do with casting a variable to a different type using the as operator and testing the same variable, instead of the resulting one, for null:

PVS-Studio diagnostic message: V3095 The ‘diskRoots’ object was used before it was verified against null. Check lines: 2656, 2659. ToolLocationHelper.cs 2656

Note the diskRoots parameter. It is an instance of the List class and can have a value of null. However, the corresponding check is done only in the second if block, after the diskRoots variable has already been used for inserting values into a list:

There are 8 more potentially dangerous constructs like that in MSBuild:

V3095 The ‘propertyValue’ object was used before it was verified against null. Check lines: 2760, 2799. Expander.cs 2760

V3095 The ‘publicKeyToken’ object was used before it was verified against null. Check lines: 232, 236. GenerateBindingRedirects.cs 232

V3095 The ‘searchLocation’ object was used before it was verified against null. Check lines: 170, 178. Resolver.cs 170

V3095 The ‘assemblyName’ object was used before it was verified against null. Check lines: 176, 194. Resolver.cs 176

V3095 The ‘searchLocation’ object was used before it was verified against null. Check lines: 249, 264. Resolver.cs 249

V3095 The ‘ReferenceInfo’ object was used before it was verified against null. Check lines: 87, 97. AxReference.cs 87

V3095 The ‘packageFileName’ object was used before it was verified against null. Check lines: 1448, 1457. BootstrapperBuilder.cs 1448

V3095 The ‘metadataNames’ object was used before it was verified against null. Check lines: 243, 253. CommandLineBuilderExtension.cs 243

Wrong assumption about string length

PVS-Studio diagnostic message: V3057 The ‘Substring’ function could receive the ‘-1’ value while non-negative value is expected. Inspect the second argument. Utilities.cs 328

For the if block to execute, there must be a string consisting of one or more characters, while inside that block, the programmer attempts to get a substring from the original string. Obviously, the second parameter of the Substring method will be negative for a one-character string, so the method will throw an ArgumentOutOfRangeException:

The variables now and s_lastLoggedTicks are of long type. They take part in some computations that should yield a value of float type. However, since the division operation is done over values of type long and only then is the resulting value cast to type float, it will result in the loss of precision:

At the same time, the value of type bool returned by the GetTypeLibNameForITypeLib method is checked in the caller. Effects of such behavior are unpredictable. This code needs to be refactored and fixed.

For the if block to execute, the logger variable must have the null value. However, this variable is again tested for null inside that block, in the VerifyThrow method. So, that second check will always be false:

The error is lurking in the second line. The programmer must have written it by copying the first line (the infamous copy-paste) and changing the first parameter in the copied code, but they forgot to remove the second parameter, _schedulingData.EventTime.Ticks, which was no more necessary:

The analyzer detected a construct with an extra assignment to field _nextProjectId. The result of the MaxCPUCount + 2 expression is added to the value of _nextProjectId, and then the resulting value is assigned to the same field using the += operator. After that, this value is again assigned to the _nextProjectId field:

Conclusion

As a conclusion, I’d like to say that even such high-quality projects as MSBuild could benefit quite a lot from regular checks of their source code for potential and actual errors by static analyzers like PVS-Studio.

Feel free to use the demo version of PVS-Studio analyzer to check this project and take a look at the warnings we have discussed, as well as to check your own projects.