Checking the Code of Reiser4 with the PVS-Studio Static Analyzer

Hello there! In this article, we'll look at the free version (available to the developers of free and open-source software) of the PVS-Studio static analyzer in action. What we are going to check today is the source code of the Reiser4 file system and its utilities.

This article was originally posted on the Habrahabr website and reposted here by permission of the author.

I hope all of you who are about to read this article have heard, if only in passing, about the static code analyzer PVS-Studio. If you haven't, follow this link to read a brief product description.

The developer company also runs an official blog on Habrahabr where they frequently post reports with the analysis results of various open-source projects.

Let's write a small bash script so that we don't have to repeat that process by hand for each file. I'll use the sed stream editor to write the script (the following instruction is written in one line):

Checking reiser4progs-1.2.1

Now, reiser4progs has much more interesting, and sometimes sadder, things to show. By the way, here's what Edward Shishkin said about these tools: "The author left right after these progs were written, and no one has since looked into that code (except for a couple of times when I was asked to fix fsck). So I'm not surprised by that pile of bugs." Indeed, it's no surprise that such specific bugs are still there after so many years.

The first serious error is found in the plugin/key/key_short/key_short_repair.c file:

V616 The 'KEY_SHORT_BAND_MASK' named constant with the value of 0 is used in the bitwise operation.

KEY_SHORT_BAND_MASK is the constant 0xf000000000000000ull, which means the Boolean NOT operation produces false here (in C, all values other than 0 are considered true), i.e., in fact, 0. However, the programmer obviously meant the bitwise NOT (~) operation rather than the Boolean NOT. This warning was triggered several times by different files.

Wait... It's not really an error - it's some sort of black magic or a dirty trick (if you don't believe in magic). Why? Well, would you call the code below clear and straightforward without a deep understanding of the processor's and operating system's inner workings and the programmer's idea?

What'd you say? This is not an error, but we'd better leave this code alone unless we know what goes on here. Let's try to figure it out.

The line *(int *)0 = 0; would trigger a SIGSEGV in a regular program. If you search for information about the kernel, you'll find that this statement is used to make the kernel generate an oops. This subject was discussed in the kernel developers' newsgroup (here), and Torvalds himself mentioned it too. So, if an assignment like that happens, in some mysterious ways, to execute inside the kernel code, you'll get an oops. Why check for the "impossible" condition is something that only the author himself knows, but, as I said, we'd better let the thing be unless we know how it works.The only thing we can safely investigate is why the V547 warning was triggered. The len >= 16 expression is always false. The while loop is executed as long as the value of len is greater than or equal to 16, while the value 16 is subtracted at the end of the loop body at every iteration. This means the variable can be represented as len = 16*n+m, where n and m are integers and m<16. It's obvious that once the loop is over, all the 16*n's will have been subtracted, leaving only m.The other warnings here follow the same pattern.

The following error is found in the plugin/sdext/sdext_plug/sdext_plug.c file:

V595 The 'stat' pointer was utilized before it was verified against nullptr. Check lines: 18, 21.

Either it's a banal typo or the author intended to write something else. The !stat check looks as if it were a nullptr check, but it doesn't make sense for two reasons. Firstly, the stat pointer has been already dereferenced. Secondly, this expression is evaluated from left to right, in accordance with the standard, so if it's really a nullptr check, it should be moved to the beginning of the condition since the pointer is originally dereferenced earlier in that same condition.

The plugin/item/cde40/cde40_repair.c file triggered a number of warnings such as this:

This issue is similar to the previous one. The res variable is assigned the result of a Boolean expression. The programmer is obviously using a C "trick" here, so the expression should be rewritten as (A = B) < C.

Another typo or mistake made from inattention. File libreiser4/flow.c:

There are two integer variables here. Their difference is ALWAYS greater than or equal to zero because, from the viewpoint of how integers are represented in computer memory, subtraction and addition are, in effect, the same operation to the processor (Two's complement). The condition was more likely meant to check if end > off.

The code is contained in a loop, and the loop body is executed only when insert > 0, so the condition is always true. It's either a mistake, and therefore something else is missing, or a pointless check.

Comparing a pointer with a terminal null character is a brilliant idea, no doubt, but the programmer more likely meant the check *(c + 1) == '\0', as the *c + 1 == '\0' version doesn't make much sense.

Now let's discuss a couple of warnings dealing with the use of fprintf(). The messages themselves are straightforward, but we'll need to look in several files at once to understand what's going on.First we'll peek into the libmisc/ui.c file.

V618 It's dangerous to call the 'fprintf' function in such a manner, as the line being passed could contain format specification. The example of the safe code: printf("%s", str);

We'll ignore the warnings dealing with redefinition of the standard types in the headers of the kernel itself since the standard headers are not used in the build; and we are not interested in the kernel code anyway.

The first file to examine is fs/reiser4/carry.c.

V522 Dereferencing of the null pointer 'reference' might take place. The null pointer is passed into 'add_op' function. Inspect the third argument. Check lines: 564, 703.

This is straightforward: the value of result doesn't change, so it's okay to omit the check.

V1004 The 'ref' pointer was used unsafely after it was verified against nullptr. Check lines: 1191, 1210.

carry_node *add_new_znode(znode * brother /* existing left neighbor
* of new node */ ,
carry_node * ref /* carry node after which new
* carry node is to be inserted
* into queue. This affects
* locking. */ ,
carry_level * doing /* carry queue where new node is
* to be added */ ,
carry_level * todo /* carry queue where COP_INSERT
* operation to add pointer to
* new node will ne added */ )
{
....
/* There is a lot of possible variations here: to what parent
new node will be attached and where. For simplicity, always
do the following:
(1) new node and @brother will have the same parent.
(2) new node is added on the right of @brother
*/
fresh = reiser4_add_carry_skip(doing,
ref ? POOLO_AFTER : POOLO_LAST, ref);
....
while (ZF_ISSET(reiser4_carry_real(ref), JNODE_ORPHAN))
{
....
}
....
}

What happens in this check is that ref is checked for nullptr by the ternary operator and then passed to the reiser4_carry_real() function, where null-pointer dereferencing may take place with no prior nullptr check. However, that never happens. Let's look into the reiser4_carry_real() function:

In the comments, Andrey Karpov explained what this error is about. The expression is cast to type int in the if statement, so no overflow will occur even if item_pos is assigned the maximum value since casting the expression to int results in the value 0xFFFF + 1 = 0x010000 rather than 0.

All the other bugs either follow one of the patterns discussed above or are false positives, which we have also talked about.

Conclusions

They're simple.

Firstly, PVS-Studio is cool. A good tool helps you do your job better and faster if you know how to handle it. As a static analyzer, PVS-Studio has more than once proved to be a top-level tool. It provides you with the means to detect and solve hidden problems, typos, and mistakes.

Secondly, be careful writing code. Don't use C "tricks" unless it's the only legal way to implement some feature. Always use additional parentheses in conditions to explicitly state the desired order of calculations because even if you are a super-duper hacker and C ace, you may simply confuse the operators' precedence and make a bunch of mistakes, especially when writing large portions of code at a time.

Acknowledgements: thanking the PVS-Studio developers

I'd like to thank the developers for such a wonderful tool! They have done a really great job adapting PVS-Studio to GNU/Linux systems and carefully designing the analyzer's implementation (see the details here). It elegantly integrates into build systems and generates logs. If you don't need integration, you can simply "intercept" compiler launches by running make.

And above all, thank you so much for giving students, open-source projects, and single developers the opportunity to use your tool for free! That's amazing!

Use PVS-Studio to search for bugs in C, C++, and C# code

We offer you to check your project code with PVS-Studio. Just one bug found in the project will show you the benefits of the static code analysis methodology better than a dozen of the articles.

We use cookies for the analysis of events to improve our content and make user interaction more convenient. By continuing the view of our web-pages you accept the terms of using these files. You can find out more about cookie-files and privacy policy or close the notification, by clicking on the button.
Learn More →