The Evil within the Comparison Functions

This is a computer translation of the original content. It is provided for general information only and should not be relied upon as complete or accurate.

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want to tell you about a new interesting observation. It turns out that programmers tend to make mistakes in functions comparing two objects. This statement looks implausible; however, I'll show you a great number of examples of errors that may be shocking to a reader. So, here is a new research, it will be quite amusing and scary.

Problematics

Here is my statement: programmers quite often make mistakes in rather simple functions that are meant to compare two objects. This claim is based on the experience of our team in checking a large number of open source projects in C, C++ and C#.

The functions we are going to consider here are IsEqual, Equals, Compare, AreEqual and so on or overloaded operators as ==, !=.

I noticed that when writing articles, very often I come across errors related to the comparison functions. I decided to explore this question in detail and examined the base of errors we found. I did a search of functions throughout the base containing words Cmp, Equal, Compare and such. The result was very impressive and shocking.

In fact this story is similar to the one we had when writing the article "Last line effect". Similarly, I noticed an anomaly and decided to explore it more carefully. Unfortunately, unlike the aforementioned article, I don't know how to bring statistics here and which figures to provide. Perhaps, later I'll come up with a solution with the statistics. At this point I am guided by intuition and can only share my feelings. They see that there are a lot of errors in the comparison functions and I am sure, you will get the same feeling when you see that huge amount of truly impressive examples.

In general, we can conclude that the cause of the errors in the last lined is related to the fact that the developer has already mentally moved to the new lines/tasks instead of focusing on the completion of the current fragment. As a result - when writing similar blocks of text, there is a higher probability that a programmer will make an error in the last one.

I believe that in the case of writing a comparison function, a developer in general often don't focus on it, considering it to be too trivial. In other words, he writes the code automatically, without thinking over it. Otherwise, it is not clear how one can make an error like this:

PVS-Studio analyzer detected this error in the code of RunAsAdmin Explorer Shim (C++) project: V501 There are identical sub-expressions to the left and to the right of the '==' operator: luid2.HighPart == luid2.HighPart RAACommon raacommonfuncs.cpp 1511

A typo. In the second line it should be: luid1.HighPart == luid2.HighPart.

The code is very simple. Apparently, the simplicity of code spoils everything. A programmer immediately thinks of the task to write such a function as standard and uninteresting. He instantly thinks of the way to write the function and he has just to implement the code. This is a routine, but unfortunately an inevitable process to start writing more important, complex and interesting code. He is already thinking about the new task... and as a result - makes an error.

In addition, programmers rarely write unit tests for such functions. Again the simplicity of these functions prevents from it. It seems that it would be too much to test them, as these functions are simple and repetitive. A person has written hundreds of such functions in his life, can he make an error in another function? Yes, he can and he does.

I would also like to note that we aren't talking about code of students who are just learning to program. We are talking about bugs in the code of such projects as GCC, Qt, GDB, LibreOffice, Unreal Engine, CryEngine 4 V Chromium, MongoDB, Oracle VM Virtual Box, FreeBSD, WinMerge, the CoreCLR, MySQL, Mono, CoreFX, Roslyn, MSBuild, etc. It's all very serious.

We are going to have a look at so many diverse examples that it would be scary to sleep at night.

Erroneous Patterns in Comparison Functions

All errors in comparison functions will be divided into several patterns. In the article we'll be talking about errors in projects in C, C++ and C#, but it makes no sense to separate these languages, as most of the patterns are similar for different languages.

Pattern: A < B, B > A

Very often in the comparison functions there is a need to make such checks:

A < B

A > B

Sometimes programmers think that is more elegant to use the same operator <, but to switch the variables.

A < B

B < A

However, due to the inattentiveness, we get such checks:

A < B

B > A

In fact, one and the same comparison is done twice here. Perhaps, it's not clear what it is about here, but we'll get to the practical examples and it'll all become clearer.

PVS-Studio analyzer detected this error in the code of MongoDB (C++): V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 44, 46. parallel.h 46

This condition:

if ( other._server > _server )

Will always be false, as the same check was done two lines before. Correct code variant:

PVS-Studio warning (C#): V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SourceLocation.cs 156

I think there is no need in explanation.

Note. For C# there was just one example of an error, but for C++ - two. In general, there will be less bugs in the C# code, than for C/C++. But I do not recommend rushing to the conclusion that C# is much safer. The thing is that PVS-Studio analyzer has only recently learned to check C# code relatively recently, and we have just checked less projects written in C#, than in C and C++.

Pattern: a Member of the Class is Compared with itself

The comparison functions usually consist of successive comparisons of structure/class members. This code tends to be more erronreous, when the member of the class starts being compared with itself. I can specify two subtypes of errors.

In the first case, a programmer forgets to specify the name of the object and writes in the following way:

Let's take a closer look at practical examples of this pattern. Pay attention that incorrect comparison often occurs in the last block of similar code blocks, which reminds us of the "last line effect" again.

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '-' operator: q2.v.z - q2.v.z entitynode.cpp 93

Pattern: Evaluating the Size of a Pointer Instead of the Size of the Structure/Class

This type of error occurs in programs written in C and C++ and is caused by incorrect use of the sizeof operator. The error in evaluating not the size of the object, but the size of the pointer. Example:

T *a = foo1();
T *b = foo2();
x = memcmp(a, b, sizeof(a));

Instead of the size of the T structure, a size of the pointer gets evaluated. The size of the pointer depends on the used data model, but usually it is 4 or 8. As a result, more or less bites in the memory get compared than take the structure.

Correct variant of the code:

x = memcmp(a, b, sizeof(T));

or

x = memcmp(a, b, sizeof(*a));

Now let's move on to the practical part. Here is how such a bug looks in the code of CryEngine V (C++) code:

PVS-Studio warning: V3038 The 'enum_consts[i]' argument was passed to 'Compare' method several times. It is possible that other argument should be passed instead. CodeCompletion SymTable.cs 2206

I'll give some explanation here. The error in the factual arguments of the Compare function:

string.Compare(enum_consts[i], this.enum_consts[i], true)

The thing is that enum_consts[i] and this.enum_consts[i are the same things. As I understand, a correct call should be like this:

string.Compare(es.enum_consts[i], this.enum_consts[i], true)

or

string.Compare(enum_consts[i], es.enum_consts[i], true)

Pattern: Repetitive Checks A==B && A==B

Quite a common error in programming is when the same check is done twice. Example:

return A == B &&
C == D && // <=
C == D && // <=
E == F;

Two variants are possible in this case. The first is quite harmless: one comparison is redundant and can be simply removed. The second is worse: some other variables were to be compared, but a programmer made a typo.

In any case, such code deserves close attention. Let me scare you a little more, and show that this error can be found even in the code of GCC compiler (C):

PVS-Studio warning: V501 There are identical sub-expressions 'pState->fIgnoreTrailingWhite' to the left and to the right of the '||' operator. scmdiff.cpp 238

Pattern: Incorrect Use of the Value, Returned by memcmp Function

The memcmp function returns the following values of int type:

< 0 - buf1 less than buf2;

0 - buf1 identical to buf2;

> 0 - buf1 greater than buf2;

Please note that '>0' can be any number, not only 1. These numbers can be: 2, 3, 100, 256, 1024, 5555, 65536 and so on. This means that this result cannot be placed to a variable of the char and short type. The high bits can be lost, which might violate the logic of program execution.

Also this means that the result cannot be compared with constants 1 or -1. In other words, it is wrong to write this:

if (memcmp(a, b, sizeof(T)) == 1)
if (memcmp(x, y, sizeof(T)) == -1)

Correct comparisons:

if (memcmp(a, b, sizeof(T)) > 0)
if (memcmp(a, b, sizeof(T)) < 0)

The danger of this code is that it may successfully work for a long time. The errors may start showing up when moving to a new platform or with the change of the compiler version.

PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'unsigned short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. recyclebin.cpp 542

Pattern: Incorrect Check of Null References

This error pattern is typical for C# programs. Sometimes in the comparison functions programmers write the type casting with the help of the as operator. The error is that inadvertently a programmer verifies against null not the new reference, but the original one. Let's take a look at a synthetic example:

The check if (obj == null) protects from the situation, if the obj variable contains a null reference. However, there is no protection from the case if it turns out that the as operator returns a null reference. The correct code should be like this:

Pattern: Incorrect Loops

In some functions, collections of items are compared. Of course, different variant of the loops are used for its comparison. If a programmer writes the code inattentively, it's easy to mix something up, as it is with the comparison functions. Let's look at a few of these situations.

PVS-Studio warning: V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. tpplib peptide.cpp 191

Note that the comma operator is used in the condition. The code is clearly incorrect, because the condition, written to the left of the coma is ignored. That is, the condition on the left is evaluated, but its result is not used in any way.

Pattern: Sloppy Copying of the Code

A large amount of errors, cited previously can be called the consequences of sloppy Copy-Paste. They fell under some categories of the erroneous pattern and I decided that it would be logical to describe them in corresponding sections. However, I have several errors that have clearly appeared because of sloppy code copying, but I have no idea how to classify them. That's why I collected these errors here.

PVS-Studio warning: V778 Two similar code fragments were found. Perhaps, this is a typo and 'weight2' variable should be used instead of 'weight1'. clrjit lclvars.cpp 2702

The function was long that's why it is shortened for the article. If we examine the code of the function, we'll see that a part of the code was copied, but in one fragment a programmer forgot to replace the variable weight1 with weight2.

PVS-Studio warning: V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 418, 422. txtserializerwriter.cs 418

The code of PascalABC.NET project (C#):

public void CompareInternal(....)
{
....
else if (left is int64_const)
CompareInternal(left as int64_const, right as int64_const);
....
else if (left is int64_const)
CompareInternal(left as int64_const, right as int64_const);
....
}

PVS-Studio warning: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless NamespaceTreeNode.cs 87

PVS-Studio warning: V649 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 1205, 1206. sbprofilingdata.cpp 1206

Pattern: Equals Method Incorrectly Processes a Null Reference

In C# the accepted practice is to implement the Equals methods in such a way, so that they correctly process a situation, if a null reference is passed as an argument. Unfortunately, not all the methods are implemented according to this rule.

One closing bracket is put incorrectly. As a result, the amount of bites compared is evaluated by the statement sizeof(Matrix4) == 0. The size of any class is more than 0, which means that the result of the expression is 0. Thus, 0 bites get compared.

It's not clear what is the point of a special check against NaN here. If the condition (x == y) is true, it means that both x and y and different from NaN, because NaN isn't equal to any other value, including itself. It seems that the check against NaN is just not necessary, and the code can be shortened to:

PVS-Studio warning: V3112 An abnormality within similar comparisons. It is possible that a typo is present inside the expression 'baseValue == other.counterFrequency'. System-net_4_x CounterSample.cs 139

How Do these Programs Work at all?

Looking through all the errors, it seems miraculous that all these programs generally work. Indeed, the comparison functions do a very important and responsible task in program.

There are several explanations of why these programs work despite these errors:

In a lot of functions, only a part of the object is compared incorrectly. The partial comparison is enough for most of the tasks in this program.

There are no situations (yet) when the function works incorrectly. For example, this applies to the functions that aren't protected from null pointers or those, where the result of the memcmp function call is placed into the variable of char type. The program is simply lucky.

The reviewed comparison function is used very rarely or not used at all.

Who said that the program is working? A lot of programs really do something wrong!

Recommendations

I demonstrated how many errors can be found in the comparison functions. It follows that the efficiency of these functions should be checked with unit-tests by all means.

It is really necessary to write unit-tests for the comparison operators, for Equals functions and so on.

I am quite sure that there was such an understanding among programmers before reading this article, that unit tests for such functions is extra work and they won't detect any errors anyway: the comparison functions are just so simple at the first glance... Well, now I showed the horror that can hide in them.

Code reviews and using static analysis tools would also be a great help.

Conclusion

In this article we mentioned a large amount of big-name projects that are developed by highly qualified experts. These projects are thoroughly tested using different methodologies. Still, it didn't stop PVS-Studio from finding errors in them. This shows that PVS-Studio can become a nice complement to other methodologies used to improve the quality and reliability of the code.