Chromium: Use of Untrusted Data

We'd like to present the series of articles dealing with the recommendations on writing code of high quality using the examples of errors found in the Chromium project. This is the fifth part, which deals with the use of unchecked or incorrectly checked data. A very large number of vulnerabilities exist thanks just to the use of unchecked data that makes the this topic exciting and actual.

In fact, almost any type of error can become a vulnerability, even an ordinary typo. Actually, if a found error is classified according to the Common Weakness Enumeration, this means that it is a potential vulnerability.

PVS-Studio analyzer, starting with version 6.21, learned to classify bugs according to Common Weakness Enumeration and assign them the appropriate CWE ID.

Readers may have already noticed that in previous articles, in addition to the warning number Vxxx I cited CWE ID as well. This means that the errors considered earlier, in theory may cause vulnerabilities. The probability is low, but it takes place. What is interesting, we were able to match a CWE ID almost with every warning issued by PVS-Studio. This means that even though we haven't planned, we created the analyzer that is able to detect a large number of weaknesses :).

In this article I collected the bugs, which can potentially lead to security problems. I'd like to notify that the choice of errors is quite relative and subjective. It may be that a vulnerability is disguised as an error, which I called a trivial typo in one of the previous articles.

So, let's see what security defects I noticed during analyzing the report issued by PVS-Studio for the Chromium project. As I wrote in the introductory article, I skimmed through the report quite fluently, so there may be other, unnoticed errors. The main objective of the article is to outline the way some errors make the program handle incorrect or unchecked data. I haven't decided how to define such data yet, and for now I will use the term "untrusted data".

The function returns an incorrect status. As a result, other parts of the program will take that the function has successfully removed some value. The bug is that the status DELETE_FAILED is always replaced with a status DELETED.

Perhaps, the described error doesn't reflect well the essence of the untrusted data. In this function, creation of false data occurs, but not its checking or using. So let's look at another, more appropriate error.

V501 CWE-570 There are identical sub-expressions 'that.BeginPos > EndPos' to the left and to the right of the '||' operator. cpvt_wordrange.h 46

V501 CWE-570 There are identical sub-expressions 'that.EndPos < BeginPos' to the left and to the right of the '||' operator. cpvt_wordrange.h 46

The condition is spelled wrong.

Let's reduce the condition so that it was easier to notice an error:

if (E2 < B1 || B2 > E1 || E1 < B2 || B1 > E2)

Note, that (E2 < B1) and (B1 > E2) mean the same thing. Similarly, (B2 > E1) is the same thing as (E1 < B2).

It turns out that not all the necessary checks are carried out. So, further an incorrect range may be generated, which, in turns, will affect the functioning of the program.

Now let's look at the large and complex fragment of code from a library of regular expressions RE2 (used in Chromium). Honestly, I don't even understand what's going on here, but the code definitely contains the anomalous check.

First, it has to be shown how some types are declared. If you don't do that, then the code is not very clear.

Let's try to understand why the condition is always false. First, look carefully at these lines:

if(c < Runesync)
return strchr((char*)s, c);

If the variable c < 0x80, the function ends its work. If the function doesn't end its work, and will continue it, you can say for sure that the variable c >= 0x80.

Now look at the condition:

if(c1 < Runeself)

A condition (c1 == c) marked by the comment "// <=", is executed only if c1 < 0x80.

So here's what we know about the values of the variables:

c >= 0x80

c1 < 0x80

It follows that the condition c1 == c is always false. It is very suspicious. It turns out that the function utfrune in the library of regular expressions is not working as planned. The consequences of such an error are unpredictable.

In this case, the protection is made in the way that when accessing to this array the index must not be greater than a certain value. Due to typo errors, or for any other reason, an incorrect constant is used. The constant equal to 454 should have been used, but instead of this the value of the index is compared with 993.

As a result, the array overrun and reading of random untrusted data is possible.

Note. Below there is a correct assert, but it will not help in the Release-version.

PVS-Studio warning: V774 CWE-416 The 'mzMappings' pointer was used after the memory was released. zonemeta.cpp 713

Code is complicated and I find it difficult to say exactly, if there is a bug or not. However, as far as I understood, it is possible that this function will return a pointer to the memory block being freed. A correct handler of incorrect status must reset the pointer:

Because of this error, the function can return a negative number, although it must not. It is a negative number, which can "leak" through the check and there is untrusted data.

Other Errors

Honestly, I don't really like the examples I gave in the previous section of this article. There are few of them and they don't reflect very well the essence of the bugs related to the use of untrusted data. I think eventually I'll write a separate article where I'll show more vivid examples of errors, having collected them from various open source projects.

By the way, the article might include more examples of errors, but I have "wasted" them when writing the preceding articles, and I don't want to repeat myself. For example, in the article "Chromium: Typos" there was such a fragment:

Because of this typo, the object referenced by the pointer negZ is not checked. As a result, the program will work with untrusted data.

Also in this article, I did not consider the situations where the untrusted (tainted) data appear due to the lack of the check of the pointer, which a malloc function returns. If the malloc function returned NULL, this does not mean that the only error of null pointer dereference is possible. There are more insidious situations. Schematically, they look like this:

int *ptr = (int *)malloc(100 * sizeof(int));
ptr[1234567] = 42;

There will be no null pointer dereference. Here data recording and destruction of some data will occur.

It's an interesting story and I will dedicate it the following separate article.

Recommendations

Various errors lead to the use of untrusted (unchecked, tainted) data. Some kind of universal piece of advice cannot be given here. Of course you may write: don't make bugs in your code! But there is no use in such a recommendation :).

So why did I write this article and highlight this type of errors?

For you to know about them. Awareness that a problem exists - this is the factor that helps to prevent it. If one doesn't know that the problem exists it doesn't mean that there is no problem. Nice illustration:

What can we still advise:

Update libraries used in your project. Various errors can be corrected in new versions, which might be vulnerabilities. However, it must be recognized that a vulnerability can appear right in the new version, and be absent the old one. But anyway, a better solution would be to update the libraries. Much more people know about the old vulnerabilities rather than about the new ones.

Thoroughly check all input data, especially coming from somewhere outside. For example, all the data coming from somewhere by the network should be checked very carefully.

Use a variety of tools to check the code. For example, the Chromium project clearly lack the PVS-Studio static analyzer using :).

Note about PVS-Studio

As I have already said, PVS-Studio analyzer is already helping prevent vulnerabilities by detecting errors even at the stage of writing code. But we want more and soon we'll significantly improve PVS-Studio by introducing the concept "use of unchecked data" in Data Flow analysis.

We even have already reserved a special number for this important diagnostic: V1010. Diagnostic will detect errors when the data was obtained from unreliable source (for example, sent by the network), and is used without proper verification. The absence of all necessary checks of input data often cause detection of vulnerabilities in applications. Recently we've written about this in the article "PVS-Studio 2018: CWE, Java, RPG, macOS, Keil, IAR, MISRA" (see the section "Potential vulnerabilities, CWE").

New diagnostic will significantly strengthen the analyzer in the identification of potential vulnerabilities. Most likely V1010 diagnostic will match the CWE-20 identifier (Improper Input Validation).

Conclusion

I suggest you and your colleagues read our article "42 recommendations" on our website. A developer will not become a security expert but will find out a lot of interesting and useful material. These articles will be especially useful for developers, who have just started writing in C or C++ languages and who has no idea how deep is the rabbit-hole into which they fell.

I'm planning to update the "42 recommendations" and update them into "50 recommendations". So I invite you to subscribe to my Twitter @Code_Analysis and our RSS channel not to miss this and other interesting articles in our blog.

Use PVS-Studio to search for bugs in C, C++, and C# code

We offer you to check your project code with PVS-Studio. Just one bug found in the project will show you the benefits of the static code analysis methodology better than a dozen of the articles.

We use cookies for the analysis of events to improve our content and make user interaction more convenient. By continuing the view of our web-pages you accept the terms of using these files. You can find out more about cookie-files and privacy policy or close the notification, by clicking on the button.
Learn More →