C++/C# bad practices: learn how to make a good code by bad example

We continue checking Microsoft projects: analysis of PowerShell

It has become a “good tradition” for Microsoft to make their products open-source: CoreFX, .Net Compiler Platform (Roslyn), Code Contracts, MSBuild, and other projects. For us, the developers of PVS-Studio analyzer, it’s an opportunity to check well-known projects, tell people (including the project authors themselves) about the bugs we find, and additionally test our analyzer. Today we are going to talk about the errors found in another project by Microsoft, PowerShell.

PowerShell

PowerShell is a cross-platform project by Microsoft consisting of a command-line shell and associated scripting language built on the Microsoft .NET Framework and integrated with it. PowerShell also provides convenient access to COM, WMI, and ADSI, and enables administrators to perform various tasks in a single environment both on local and remote Windows systems by running regular command-line commands.

PVS-Studio

According to the statistics of the project repository, 93% of the code is written in C#.

The project was analyzed with PVS-Studio static code analyzer. The version we were using is currently in the development process, so it is newer than PVS-Studio 6.08 but it’s not PVS-Studio 6.09 either. This approach enables us to put the new version through more extensive testing and fix possible defects. It does not replace the multilevel system of tests, of course (read about the seven testing techniques in the article discussing the development of the Linux-version), but rather is another way of testing the tool.

Preparing for analysis

I updated the analyzer and downloaded the project’s code, so everything was ready to go. Well, sometimes things get complicated as early as at the stage of preparing a project for analysis, i.e. at the building stage. It is recommended that you build projects before analyzing them. Why does it matter? The analyzer will have access to more information that way, so it will be able to provide a deeper analysis.

The most common (and convenient) way of using PVS-Studio is to run it from the Visual Studio IDE. It’s quick, easy, and convenient. For PowerShell, however, it’s a problem.

It turned out that the authors themselves didn’t recommend using Visual Studio to build the project. They say it straightforward on GitHub: “We do not recommend building the PowerShell solution from Visual Studio.”

Well, I couldn’t resist the temptation to build and check it in Visual Studio, so I gave it a try anyway. This is what I got:

Well, that’s sad. What did it mean in my situation? That I wouldn’t be able to test all the features of the analyzer on this project. Then you have two scenarios.

Scenario 1. Check the project without building it.

A project would’t build? OK, let’s check it as it is.

What are the pros of this approach? You don’t have to waste time figuring out the problem and trying various tricks to get the project built. It does help you save time; moreover, it is not guaranteed that your tricks will work out after all.

The cons of this approach are also clear. First, the analysis will be incomplete; some bugs will slip from the analyzer. You may also get a certain number of false positives. Second, it makes the estimate of the false/genuine warnings ratio pointless, as it may vary greatly for the built version.

However, even this scenario allows you to find a decent number of errors and write an article.

Scenario 2. Figure it all out and get the project built.

The pros and cons of this approach are opposite to those of the previous one. Yes, you’ll have to spend more time on building, but it’s not guaranteed that it will work out. If you do succeed, however, you’ll be able to analyze the code more thoroughly and maybe find some interesting bugs.

There’s no definite suggestion about what way to choose; everyone decides for themselves.

I struggled with the project for a while, trying to build it, and finally decided to go “as is”. This approach was good enough for my goal to write an article.

Note. While it can’t be built from Visual Studio, the project can be easily built by using the script (build.sh) located in the root directory.

Note 2. One of the developers (many thanks to him) told me that the *.sln-file was intended to make it more comfortable to work with the project, but it wasn’t meant to be used for building, which is just another argument for choosing the first approach.

Analysis results

Duplicate subexpressions

Projects that trigger no V3001 warnings do deserve a medal. PowerShell, unfortunately, wouldn’t get it, and here’s why:

The BaseMaximumVersion reference is tested for null twice, but it’s obviously the BaseMinimumVersion reference that should be checked in the second case. If you are lucky, the program may run for a long time without this error ever showing up, but when it does occur, the information about BaseMinimumVersion will never be included in the error message formed when the exception is thrown, as the BaseMinimumVersion reference will be null. As a result, some portion of useful information will be lost.

Note that I fixed the code formatting in this example to make the error easier to notice. In the original code, however, the whole condition is written in one line, which is another example of why good code formatting is so important: not only does it make the code easier to read and understand, but it also makes errors easier to see.

Again, there is a typo that causes one check to execute twice. What should be checked in the second case is most likely the constant field MaxRunspaces of the static class RemoteDataNameStrings.

Unused return value

There are errors that have to do with unused method return values. The reasons, as well as implications, vary a lot. Sometimes programmers forget that objects of type String are immutable and that string- modifying methods return a new string rather than changing the existing one. In the same way, using LINQ yields a new collection. Errors of this type were also found in PowerShell.

Note that the errorAsts parameter is used with the ref keyword, which implies that the reference gets changed in the method body. The logic of this code is simple: if the errorAsts reference is null, then it is assigned with a reference to another collection; otherwise, the elements of the exceptionTypes collection are added to the existing one. However, the second part doesn’t work properly. The Concat method returns a new collection without modifying the existing one, so the errorAsts collection will remain unchanged, while the new one (containing the elements errorAsts and exceptionTypes) will be ignored.

There are two ways to fix this defect:

Use the AddRange method of the List class to add the new elements to the existing list;

Use the return value of the Concat method and make sure that you cast it to the required type by calling to the ToList method.

Checking a wrong reference after using the ‘as’ operator

The gold medal goes to the V3019 diagnostic rule! I’m not sure about all projects, but almost every C#-project that I checked and discussed in my articles had this bug. Our long-time readers must have learned this rule by heart: when casting a reference to another type by using the as operator, always make sure that you test the resulting reference, not the original one, for null.

The result of casting j to the PSRemotingChildJob type is written to the child reference, which means that this reference may be assigned with the null value (if the original reference is null or if the cast failed). The programmer, however, checks the original reference, j, and then attempts to access the Runspace property of the child object. So, if j != null and child == null, the j == null check won’t help and you’ll get a NullReferenceException when accessing the instance members of the resulting reference.

PVS-Studio warning: V3027 The variable ‘remoteFileStreams’ was utilized in the logical expression before it was verified against null in the same logical expression. System.Management.Automation FileSystemProvider.cs 4126

If you are lucky, the code will execute successfully; if not, you’ll get a NullReferenceException when attempting to dereference a null reference. The remoteFileStreams != null subexpression doesn’t actually do anything, nor does it protect the code from the exception. Obviously, you need to swap the subexpressions to make the code work properly.

Well, we are all humans, and we all make mistakes, and static analyzers are the tools whose purpose is to catch our mistakes.

There is a risk of getting a NullReferenceException when executing this code. The ItemSelectionCondition.SafeForExport() subexpression will be evaluated only if the first subexpression evaluates to false. Therefore, if DisplayEntry.SafeForExport() returns false and ItemSelectionCondition == null, the second subexpression, ItemSelectionCondition.SafeForExport(), will be evaluated, and that’s where the null dereference will occur (and raise the exception).

I found another similar code fragment in this project. The corresponding message: V3080 Possible null dereference. Consider inspecting ‘EntrySelectedBy’. System.Management.Automation displayDescriptionData_Wide.cs 247

Every now and then, you stumble upon code like that. The programmer intended an exception to be of one type, but it ended up being of another type. Why does it happen? In our example, the programmer tests the providerName reference for null, but later, when forming an exception object, they call to the instance method ToString of the same reference. It will result in forming a NullReferenceException instead of the intended ProviderNotFoundException.

The inputParameters != null check implies that the reference being checked may be null. The programmer wanted to play safe to make sure they wouldn’t get a NullReferenceException when accessing the mshParameterList field. This is a right decision, except that they already accessed another instance field of the same object, shapeParameters, earlier. Since inputParameters doesn’t change between these two operations, the null check won’t help if the reference has been null from the beginning.

Method PopulateProperties is called in the ErrorRecord constructor to initialize the fields and perform some other operations. The analyzer warns us that one of the constructor’s parameters, errorCategory_Message, is not used. Indeed, the errorDetails_Message argument is passed twice when calling to the PopulateProperties method, while errorCategory_Message is not passed at all. Checking out the parameter list of PopulateProperties confirms that we are dealing with an error.

An always false condition

One of PVS-Studio’s features that help us implement complex diagnostic rules and find complicated bugs is the so called virtual values, which allow the analyzer to track the possible ranges of values that a variable can take at a particular time of execution. For more information on that feature, see the article Searching for errors by means of virtual values evaluation. This mechanism underlies such diagnostics as V3022 and V3063, which often help us discover interesting errors. One such error was found in this project too:

The analyzer insists that the stateInfo.State == RunspacePoolState.Disconnected expression is always false. Is it really so? Sure! I wouldn’t cite this example if it were otherwise.

The programmer made a mistake in the preceding condition: if stateInfo.State == RunspacePoolState.Disconnected, then the previous if statement will execute all the time. To fix the error, you just need to swap the last two if (else if) statements.

More bugs?

Yes, there are lots of other suspicious fragments. Our regular readers know that we don’t usually discuss all of the errors found. As for this project, there are probably not so many bugs left to make this article as big as the one about the check of Mono, but there is still some material that could be included. It’s the project authors who ought to be most interested in a complete list of warnings; to all the rest, I just show the most interesting errors and defects.

“Have you told the developers about these bugs?”

Oddly enough, people still ask us this question from time to time. We always inform the developers about the bugs we find, but this time I decided to go a little further.

I talked to one of the developers (Sergey, hi!) personally via Gitter. The advantages of such a solution are obvious – we may discuss the bugs found, get feedback on the analyzer, there might be something to correct in the article. It’s great when people understand the usefulness of static analysis. The developers told us that the detected code fragments are bugs indeed, thanked a lot and said that they would fix the bugs over the time. In turn, I decided to help them by giving links to these code fragments in the repository. We also had a talk about the use of the analyzer. It’s great, when people understand that static analysis should be used regularly. I hope it will be so, and the analyzer will embedded into the development process.

It was a nice mutually beneficial cooperation.

(animals are always cute)

Conclusion

As I’d expected, the analyzer managed to find quite a lot of suspicious fragments in PowerShell. The point of this article, however, is not about people writing incorrect code or lacking skill (it does happen at times, of course, but obviously not in this case); it’s just that it is the human error which is to blame. It’s the essence of human – everyone makes mistakes. Static analysis tools are designed to make up for this flaw of ours by catching errors in program code. That’s why regular use of such tools is the path to better code. A picture is worth a thousand words, so welcome to try PVS-Studio with your own code.