C, C++, C#, Java bad practices: learn how to make a good code by bad example

Checking 7-Zip with PVS-Studio analyzer

7-Zip is a popular file archiver that I often use myself. Our readers have long asked us to check its code, so it is time to look at the source and see what PVS-Studio is able to detect in it.

Introduction

A couple of words about the project. 7-Zip is a free file archiver with a high compression ratio, written in C and C++. The project contains 235,000 lines of code. It supports several compression algorithms and a variety of data formats, including its own 7z format with the highly effective LZMA compression algorithm. It has been in development since 1999 and is free and open source. 7-Zip won the SourceForge.net Community Choice Awards in 2007 in the categories “Best project” and “Best technical design”. We checked version 16.00, whose source code can be downloaded at http://www.7-zip.org/download.html

Analysis results

To analyze 7-Zip we used the static code analyzer PVS-Studio v6.04. This article covers the most interesting analyzer warnings. Let’s have a look at them.

Typos in conditional statements

We see typos in conditional expressions quite often. They can cause a lot of pain when a condition contains a large number of checks. This is where a static analyzer comes to our aid.

Here are some examples of this error.

V501 There are identical sub-expressions ‘Id == k_PPC’ to the left and to the right of the ‘||’ operator. 7zupdate.cpp 41

The analyzer detected identical sub-expressions in a condition. At best, one of the Id == k_PPC comparisons is redundant and does not affect the logic of the program. At worst, a different identifier was meant in one of them. If the comparison is simply redundant, the fix is to remove the duplicate.
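A minimal sketch of this kind of typo and its fix might look like the code below. Only the names Id and k_PPC come from the warning; the enumeration, the other identifiers, and the function names are hypothetical and are not taken from the 7-Zip sources.

```cpp
#include <cassert>

// Hypothetical method identifiers; only k_PPC is named in the warning.
enum MethodId { k_Copy, k_FilterA, k_FilterB, k_PPC };

// Buggy shape: the last two sub-expressions are identical, so the second
// comparison with k_PPC contributes nothing to the result.
bool IsFilterBuggy(MethodId Id)
{
  return Id == k_FilterA || Id == k_FilterB || Id == k_PPC || Id == k_PPC;
}

// Fixed shape: the duplicated sub-expression is removed.
bool IsFilterFixed(MethodId Id)
{
  return Id == k_FilterA || Id == k_FilterB || Id == k_PPC;
}

// Usage: here the duplicate is redundant, so both versions agree.
void DemoDuplicateComparison()
{
  assert(IsFilterBuggy(k_PPC) == IsFilterFixed(k_PPC));
  assert(!IsFilterFixed(k_Copy));
}
```

In this sketch the duplicate is harmless, but the same shape would hide a real bug if the second comparison had been meant for another identifier.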

The typos most likely appeared because Copy-Paste was used to duplicate the code. It would make no sense to recommend against copy-paste: it is too convenient and useful to give up. We should just check the result more thoroughly.

Identical comparisons

The analyzer detected a potential error in a construction that consists of two conditional statements. Here is an example.

V517 The use of ‘if (A) {…} else if (A) {…}’ pattern was detected. There is a probability of logical error presence. Check lines: 388, 390. archivecommandline.cpp 388

Both branches test the same condition, so the second one will never be executed. Let’s try to sort out this problem in detail. According to the description of the command-line parameters, the -r parameter enables recursion into subdirectories, while the -r0 parameter enables recursion only for wildcard names. Comparing this with the definition of NRecursedType, we can conclude that the second branch should compare against NRecursedType::kWildcardOnlyRecursed.
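The V517 pattern and the suggested fix can be sketched roughly as follows. The names NRecursedType and kWildcardOnlyRecursed come from the text above; the other enumerators, the function names, and the isWildcard parameter are assumptions made for illustration only.

```cpp
#include <cassert>

// Assumed shape of the enumeration; only kWildcardOnlyRecursed is named in the text.
namespace NRecursedType
{
  enum EEnum { kRecursed, kWildcardOnlyRecursed, kNonRecursed };
}

// Buggy shape: the else-if repeats the first condition and is unreachable.
bool UseRecursionBuggy(NRecursedType::EEnum type, bool isWildcard)
{
  if (type == NRecursedType::kRecursed)
    return true;
  else if (type == NRecursedType::kRecursed) // never reached
    return isWildcard;
  return false;
}

// Fixed shape: the second branch handles the wildcard-only mode.
bool UseRecursionFixed(NRecursedType::EEnum type, bool isWildcard)
{
  if (type == NRecursedType::kRecursed)
    return true;
  else if (type == NRecursedType::kWildcardOnlyRecursed)
    return isWildcard;
  return false;
}

// Usage: only the fixed version honours the wildcard-only mode.
void DemoRecursionMode()
{
  assert(!UseRecursionBuggy(NRecursedType::kWildcardOnlyRecursed, true));
  assert(UseRecursionFixed(NRecursedType::kWildcardOnlyRecursed, true));
}
```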

Another kind of error involves conditions that are always true or always false. In the SetSize function, newSize has an unsigned type, so a condition that checks it for a negative value will never be true. If a negative value is passed to SetSize, the error will be ignored and the function will start using an incorrect size. There were two more conditions in 7-Zip that are always either true or false because of confusion between signed and unsigned types.
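As a hedged illustration of why such a check never fires, here is a small sketch. The function names, the exact parameter types, and the size output parameter are hypothetical; only the names SetSize and newSize appear in the text.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape: newSize is unsigned, so "newSize < 0" is always false and
// a negative argument arrives here as a huge positive value.
bool SetSizeBuggy(std::uint64_t newSize, std::uint64_t &size)
{
  if (newSize < 0) // always false for an unsigned type
    return false;
  size = newSize;
  return true;
}

// One possible fix (an assumption, not the 7-Zip code): accept a signed
// value so that a negative size can actually be detected and rejected.
bool SetSizeFixed(std::int64_t newSize, std::uint64_t &size)
{
  if (newSize < 0)
    return false;
  size = static_cast<std::uint64_t>(newSize);
  return true;
}

// Usage: -1 slips through the unsigned version but is rejected by the signed one.
void DemoSetSize()
{
  std::uint64_t size = 0;
  assert(SetSizeBuggy(static_cast<std::uint64_t>(-1), size));
  assert(size == UINT64_MAX);
  assert(!SetSizeFixed(-1, size));
}
```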

Suspicious pointer handling

The 7-Zip code also contains fragments where a pointer is first dereferenced and only then checked against null.

V595 The ‘outStreamSpec’ pointer was utilized before it was verified against nullptr. Check lines: 753, 755. lzmaalone.cpp 753

This is a very common error in all kinds of programs. It usually appears through negligence during refactoring. Dereferencing a null pointer results in undefined behavior.

In the flagged fragment, the pointer outStreamSpec is dereferenced in the expression outStreamSpec->ProcessedSize and only afterwards verified against null. Either the later check is meaningless, or the pointer should be verified against null before it is first used. Here is a list of other potentially buggy fragments in the program code:

V595 The ‘_file’ pointer was utilized before it was verified against nullptr. Check lines: 2099, 2112. bench.cpp 2099

V595 The ‘ai’ pointer was utilized before it was verified against nullptr. Check lines: 204, 214. updatepair.cpp 204

V595 The ‘options’ pointer was utilized before it was verified against nullptr. Check lines: 631, 636. zipupdate.cpp 631

V595 The ‘volStreamSpec’ pointer was utilized before it was verified against nullptr. Check lines: 856, 863. update.cpp 856
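The dereference-before-check pattern behind these V595 warnings can be sketched as follows. The names outStreamSpec and ProcessedSize come from the text; the OutStream structure and the function names are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stream type; only the member name ProcessedSize appears in the text.
struct OutStream
{
  std::uint64_t ProcessedSize = 0;
};

// Buggy shape: the pointer is dereferenced first and checked afterwards.
// Calling this with nullptr is undefined behavior, so the check cannot help.
std::uint64_t GetProcessedSizeBuggy(const OutStream *outStreamSpec)
{
  std::uint64_t size = outStreamSpec->ProcessedSize; // dereferenced here
  if (outStreamSpec == nullptr)                      // checked too late
    return 0;
  return size;
}

// Fixed shape: the check comes before the first use.
std::uint64_t GetProcessedSizeFixed(const OutStream *outStreamSpec)
{
  if (outStreamSpec == nullptr)
    return 0;
  return outStreamSpec->ProcessedSize;
}

// Usage: only the fixed version may safely be called with nullptr.
void DemoNullCheckOrder()
{
  OutStream stream;
  stream.ProcessedSize = 42;
  assert(GetProcessedSizeFixed(&stream) == 42);
  assert(GetProcessedSizeFixed(nullptr) == 0);
}
```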

An exception inside a destructor

When an exception is thrown in a program, the stack begins to unwind, and objects are destroyed by calling their destructors. If the destructor of an object being destroyed during stack unwinding throws another exception that leaves the destructor, the C++ runtime immediately terminates the program by calling the terminate() function. Therefore, destructors should never let exceptions escape: an exception thrown inside a destructor must be handled inside the same destructor.

The analyzer issued the following message:

V509 The ‘throw’ operator inside the destructor should be placed within the try..catch block. Raising exception inside the destructor is illegal. consoleclose.cpp 62

The V509 message warns that if the CCtrlHandlerSetter object is destroyed while another exception is being handled, the new exception will cause the program to terminate immediately. The code should be written so that the destructor reports an error without using the exception mechanism. If the error is not critical, it can be ignored.
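One way to keep an exception from leaving a destructor is sketched below. The class HandlerGuard, the function ReleaseHandler, and the error flag are hypothetical; they only illustrate the idea and do not reproduce CCtrlHandlerSetter.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical cleanup routine that may fail by throwing.
void ReleaseHandler(bool shouldFail)
{
  if (shouldFail)
    throw std::runtime_error("release failed");
}

// Hypothetical guard whose destructor performs the cleanup.
class HandlerGuard
{
  bool _shouldFail;
  bool *_errorFlag;
public:
  HandlerGuard(bool shouldFail, bool *errorFlag)
    : _shouldFail(shouldFail), _errorFlag(errorFlag) {}

  ~HandlerGuard()
  {
    try
    {
      ReleaseHandler(_shouldFail);
    }
    catch (...)
    {
      // Handle the failure here: record it instead of rethrowing.
      if (_errorFlag)
        *_errorFlag = true;
    }
  }
};

// Usage: the failure is recorded and no exception leaves the destructor.
void DemoGuard()
{
  bool failed = false;
  {
    HandlerGuard guard(true, &failed);
  }
  assert(failed);
}
```

Because the exception is caught inside the destructor, the guard is also safe to destroy while the stack is unwinding from another exception.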

Increment of a bool type variable

Historically, the increment operation has been allowed for variables of bool type; it sets the value of the variable to true. This feature dates back to the time when integer values were used to represent boolean variables, and it was later kept for backward compatibility. It has been deprecated since the C++98 standard and is not recommended for use. The C++17 standard removes the increment of a bool value altogether.

We found a couple of fragments where this obsolete feature is still used.

V552 A bool type variable is being incremented: numMethods ++. Perhaps another variable should be incremented instead. wimhandler.cpp 308

V552 A bool type variable is being incremented: numMethods ++. Perhaps another variable should be incremented instead. wimhandler.cpp 318

There are two possibilities here. Either numMethods is a flag, in which case it is better to assign a boolean value explicitly: numMethods = true. Or, judging by the name of the variable, it is a counter and should be an integer.
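The two variants can be sketched side by side. The structure and member names below are hypothetical, apart from numMethods, which comes from the warning.

```cpp
#include <cassert>

// Variant 1: the variable is a flag, so it is assigned true explicitly.
struct MethodFlag
{
  bool hasMethods = false;
  void OnMethod() { hasMethods = true; }
};

// Variant 2: the variable is a counter, so it gets an integer type.
struct MethodCounter
{
  unsigned numMethods = 0;
  void OnMethod() { ++numMethods; }
};

// Usage: two methods are seen; the flag saturates, the counter counts.
void DemoFlagVersusCounter()
{
  MethodFlag flag;
  MethodCounter counter;
  for (int i = 0; i < 2; i++)
  {
    flag.OnMethod();
    counter.OnMethod();
  }
  assert(flag.hasMethods);
  assert(counter.numMethods == 2);
}
```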

Incorrect check of memory allocation

The analyzer detected a situation where the pointer returned by the new operator is compared with zero. This usually means that the program will not behave as the programmer expects if the memory cannot be allocated.

V668 There is no sense in testing the ‘plugin’ pointer against null, as the memory was allocated using the ‘new’ operator. The exception will be generated in the case of memory allocation error. far.cpp 399

If the new operator fails to allocate memory, then according to the C++ standard it throws a std::bad_alloc exception. The check against null is therefore pointless: the plugin pointer will never be null, and the function will never return the constant INVALID_HANDLE_VALUE. If the memory cannot be allocated, the exception should be handled at a higher level, and the null check can be deleted. If exceptions are undesirable in the application, we can use the non-throwing form new (std::nothrow), whose return value can be checked against null. There were three more similar checks:

V668 There is no sense in testing the ‘m_Formats’ pointer against null, as the memory was allocated using the ‘new’ operator. The exception will be generated in the case of memory allocation error. enumformatetc.cpp 46

V668 There is no sense in testing the ‘m_States’ pointer against null, as the memory was allocated using the ‘new’ operator. The exception will be generated in the case of memory allocation error. bzip2decoder.cpp 445

V668 There is no sense in testing the ‘ThreadsInfo’ pointer against null, as the memory was allocated using the ‘new’ operator. The exception will be generated in the case of memory allocation error. bzip2encoder.cpp 170
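The two options described above can be sketched as follows. The Plugin structure and the function names are hypothetical; std::nothrow and std::bad_alloc are the standard facilities declared in the header new.

```cpp
#include <cassert>
#include <new>

// Hypothetical object type standing in for the allocated plugin.
struct Plugin
{
  int id = 0;
};

// Option 1: plain new. On failure std::bad_alloc is thrown, so the result
// is never null and a null check after it would be dead code.
Plugin *CreatePluginThrowing()
{
  return new Plugin();
}

// Option 2: non-throwing new. On failure nullptr is returned, so the
// null check is meaningful and the caller can map it to an error value.
Plugin *CreatePluginNoThrow()
{
  Plugin *plugin = new (std::nothrow) Plugin();
  if (plugin == nullptr)
    return nullptr;
  return plugin;
}

// Usage: with the non-throwing form the caller checks the pointer itself.
void DemoAllocation()
{
  Plugin *plugin = CreatePluginNoThrow();
  if (plugin != nullptr)
  {
    assert(plugin->id == 0);
    delete plugin;
  }
}
```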

Constructions requiring optimization

Now let’s talk about some spots that can potentially be optimized. An object is passed to a function by value, but the const keyword guarantees that it is not modified. It would be sensible to pass it by constant reference in C++, or through a pointer in C.

Here is an example for the vector:

V801 Decreased performance. It is better to redefine the first function argument as a reference. Consider replacing ‘const .. pathParts’ with ‘const .. &pathParts’. wildcard.cpp 487

Each call of this function invokes the copy constructor of the UStringVector class. This can significantly reduce the performance of an application if such objects are copied often. The code can be easily optimized by passing the argument by constant reference.
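To make the cost visible, here is a small sketch with a copy-counting stand-in for the vector class. CountingVector and both functions are hypothetical; the parameter name pathParts comes from the warning.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for UStringVector that counts how often it is copied.
struct CountingVector
{
  std::vector<std::string> items;
  static int copies;

  CountingVector() {}
  CountingVector(const CountingVector &other) : items(other.items) { ++copies; }
};

int CountingVector::copies = 0;

// Pass by value: the copy constructor runs on every call.
std::size_t CountPartsByValue(const CountingVector pathParts)
{
  return pathParts.items.size();
}

// Pass by constant reference: no copy is made.
std::size_t CountPartsByRef(const CountingVector &pathParts)
{
  return pathParts.items.size();
}

// Usage: one call of each kind; only the by-value call copies the vector.
void DemoPassByReference()
{
  CountingVector parts;
  parts.items.push_back("dir");
  CountingVector::copies = 0;
  CountPartsByValue(parts);
  assert(CountingVector::copies == 1);
  CountPartsByRef(parts);
  assert(CountingVector::copies == 1);
}
```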

Conclusion

7-Zip is a small project that has been in development for quite a while, so there was little chance of finding a large number of serious bugs. Still, there are some fragments that are worth reviewing, and the static code analyzer PVS-Studio can be of great help here. If you develop a project in C, C++ or C#, I suggest downloading PVS-Studio and checking your project.