
Heartbleed and static analysis

In the wake of the security disaster that is the Heartbleed vulnerability, a number of people have asked me if Coverity’s static analyzer detects defects like this. It does not yet, but you’d better believe our security team is hard at work figuring out ways to detect and thereby prevent similar defects. (UPDATE: We have shipped a hotfix release with a checker that finds defects like Heartbleed. The security team works fast!)

I’ll post some links to some articles below, but they’re a bit jargon-heavy, so I thought that a brief explanation of the jargon might be appropriate. The basic idea is as follows:

Data which flows into a system being analyzed is said to come from a source.

Certain data manipulations are identified as sinks. Think of them as special areas that data is potentially flowing towards.

A source can taint data. Typically the taint means something like “this data came from an untrusted and potentially hostile agent”. Think of a taint as painting a piece of data red, so that every other piece of data it touches also becomes red. The taint thus spreads through the system.

A sink can require that data flowing into it be free of taint.

If there is a reasonable code path on which tainted data flows into a sink, that’s a potential defect.

So for example, a source might be user input to a web form. A taint might be “this data came from a client that we have no reason to trust”. A sink might be code which builds a SQL string that will eventually be sent to a database. If there is a code path on which tainted data reaches the sink, then that’s a potential SQL injection defect. Or a source might be a number sent over the internet from a client, and a sink might be code that indexes into an array. If a number from an untrustworthy client can become an index into an array, then the array might be indexed out of bounds. And so on; we have great flexibility in determining what sources and sinks are.
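The array-indexing case is easy to sketch in code. This is an invented illustration (the table and function names are not from any real codebase), showing the tainted path and the check that removes the taint:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical lookup table; purely illustrative.
static const char *kMessages[4] = {"ok", "retry", "busy", "closed"};

// Source: a 16-bit value parsed out of an untrusted packet. A taint
// analysis paints the return value red because it came off the wire.
uint16_t parse_index(const unsigned char *packet) {
    uint16_t n;
    std::memcpy(&n, packet, sizeof n);
    return n;
}

// Sink: indexing a fixed-size array. If tainted data reaches this
// line, that is the potential defect the analyzer reports.
const char *lookup_unchecked(uint16_t i) {
    return kMessages[i];  // possible out-of-bounds read
}

// The repaired path: validating the index removes the taint before the sink.
const char *lookup_checked(uint16_t i) {
    if (i >= 4) return "invalid";
    return kMessages[i];
}
```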

Now that you understand what we mean by sources, sinks and taints, you can make sense of Andy’s explanation of the new heuristic.

For the TLDR crowd, basically what Andy is saying here is: identifying sinks is not too hard; in the case of Heartbleed, any call to memcpy could be the sink. But it can be tricky to determine when a source ought to be tainted. To get reasonable performance and a low false positive rate we need a heuristic that is both fast and accurate. The proposed heuristic is: if it looks like you’re swapping bytes to change network endianness into local machine endianness then it is highly likely that the data comes from an untrusted network client. That of course is far from the whole story; once the taint is applied, we still need to have an analyzer that correctly deduces whether tainted data makes it to a sink that requires untainted data.
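A sketch of the code shape that heuristic targets, loosely modeled on the Heartbleed pattern; the record layout and names here are simplified inventions, not the actual OpenSSL code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Network-to-host byte swap: under the proposed heuristic, the result is
// tainted, because the swap suggests the data arrived from the network.
unsigned read_be16(const unsigned char *p) {
    return (unsigned(p[0]) << 8) | p[1];
}

// record points at actual_len bytes of attacker-supplied data laid out as
// [2-byte claimed length][payload...].
size_t echo_payload(const unsigned char *record, size_t actual_len,
                    unsigned char *out) {
    (void)actual_len;                       // the bug: actual length ignored
    unsigned claimed = read_be16(record);   // tainted source
    std::memcpy(out, record + 2, claimed);  // sink: tainted size reaches memcpy
    return claimed;
}

// The fix: bounds-check the claimed length against reality, removing the taint.
size_t echo_payload_fixed(const unsigned char *record, size_t actual_len,
                          unsigned char *out) {
    unsigned claimed = read_be16(record);
    if (actual_len < 2 || claimed > actual_len - 2) return 0;
    std::memcpy(out, record + 2, claimed);
    return claimed;
}
```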

Taking a step further back, I’ve got to say that this whole disaster should be a wakeup call: why is anyone still writing security-critical infrastructure in languages that lack memory safety at runtime? I’m fine with this infrastructure being written in C or C++, so long as at runtime the consequence of undefined behaviour is termination of the program rather than leaking passwords and private keys. A compiler and standard library are free to make undefined behaviour have whatever behaviour they like, so for security-critical infrastructure, let’s have a C/C++ compiler and library that makes undefined behaviour into predictably crashing the process. Somehow C# and Java manage to do just that without an outrageous runtime performance cost, so a C/C++ compiler could do the same. With such a runtime in place, the Heartbleed defect would have been a denial of service attack that calls attention to itself, rather than silently leaking the most valuable private data to whoever asks for it, without so much as a log file to audit.

To argue that we cannot afford the cost of building such a compiler and using it consistently on security-critical infrastructure is to argue that it would be cheaper to just deal with arbitrarily many more Heartbleeds.

UPDATE: A number of people have pointed out to me that safe memory access comes at a real performance cost, and that the authors of OpenSSL had written a custom allocator. Both arguments miss my point. First, for security-critical infrastructure let’s default to safety over performance; if performance turns out to be unacceptable then let’s identify the most performance-crucial hot spots and concentrate lots of attention on how to improve performance without risking safety in those areas. Second, it has been alleged to me that the authors of that custom allocator wrote it for speed, not for safety. Again, this is completely the wrong attitude: write for safety first when you are building security infrastructure. And third, let’s look at the premise of the whole effort: users of SSL by definition are people who are willing to take a hit in the performance of their networking layer in exchange for improved safety; the idea that raw speed is the most important thing is simply false.


53 thoughts on “Heartbleed and static analysis”

I don’t think it would be possible to design a subset of C which could run most existing code at even 50% of “normal” speed, but which would reliably trap invalid memory accesses. A fundamental difficulty is that the C standard mandates that it must be possible to decompose a pointer into some number of `char` values, each of which must behave as a number, and later take any sequence of `char` values representing those same numbers and convert it back into a pointer. While it would be possible for a compiler to define pointers as being 128 bits each, have each pointer combine a sequence number, an allocation handle, and an offset, and have a statement like *p = q; translate as:
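A plausible sketch of that checked expansion (the structures and names are invented; a real runtime would trap on a violation, where this sketch returns null so the check is visible):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FatPtr { uint32_t seq; uint32_t handle; uint64_t offset; };
struct Allocation { uint32_t seq; size_t size; char *base; };

// Tiny stand-in for the runtime's allocation table.
static char heap_block[16];
static Allocation alloc_table[1] = {{7, sizeof heap_block, heap_block}};

// Every access through a fat pointer is resolved via the table: the
// sequence number catches stale pointers to freed-and-recycled
// allocations, and the offset is checked against the allocation size.
char *check_and_resolve(FatPtr p, size_t access_size) {
    const Allocation &a = alloc_table[p.handle];
    if (a.seq != p.seq) return nullptr;      // freed and recycled
    if (p.offset > a.size || access_size > a.size - p.offset)
        return nullptr;                      // out of bounds
    return a.base + p.offset;
}
// So "*p = q;" becomes something like:
//   *(T *)check_and_resolve(p, sizeof(T)) = q;
```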

the performance burden would be pretty severe. Further, unless pointers’ byte representations were encrypted, there would still be no way of distinguishing a valid pointer from a bunch of arbitrary bits. An absolute rule dictating that references and values are fundamentally separate, together with a clear distinction between pointers and references, could go a long way toward reducing the cost of robustness.

Sure, the performance burden would be high; the question is then “what is the cost of the alternative?” Apparently that cost is that attackers get your private keys *really fast*.

There are simple things we could do that would each address part of the problem, and whose runtime cost could be evaluated. For example: why are local variables allocated on the stack? The C specification does not require that there be a data structure called “the stack”. So our “security-critical infrastructure” compiler could start with: all data allocated by user-written code is allocated off the heap and freed when its lifetime ends.

With this simple change all stack-smashing attacks go away — sure, they become heap smashing attacks, but return addresses aren’t on the heap. All improvements that make the heap better make *all* allocations better. All local variables become more expensive, but I think we can afford that in our security-critical infrastructure.
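As a sketch of what the transformation amounts to (illustrative only; a real compiler would do this during code generation, not in source):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// What the programmer writes: a local buffer in the activation record.
size_t copy_original(const char *src) {
    char buf[64];
    std::strncpy(buf, src, sizeof buf - 1);
    buf[63] = '\0';
    return std::strlen(buf);
}

// What the "security-critical infrastructure" compiler could emit: the
// buffer lives on the heap, so overflowing it cannot reach a return address.
size_t copy_transformed(const char *src) {
    char *buf = static_cast<char *>(std::malloc(64));  // local moved to heap
    std::strncpy(buf, src, 63);
    buf[63] = '\0';
    size_t n = std::strlen(buf);
    std::free(buf);  // freed when the variable's lifetime ends
    return n;
}
```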

Once we have the bugs shaken out of that, start to strengthen the heap. And so on.

I believe that most of what an SSL implementation does is not performance critical. Heartbeats certainly aren’t. The crypto algorithms can still use completely unsafe and unchecked C. All the surrounding logic, parsing and resource handling could be safe, managed code.

Frameworks such as Java and .NET achieve a level of performance which is for many purposes just as good as could be achieved in a standards-compliant C dialect which tried to make all pointer accesses “safe”, and thus “what is the cost of the alternative” can already be answered with something cheaper than trying to make C pointers “safe” while continuing to use those aspects of the language which preclude static verification.

To be sure, extended-precision math is a kind of code where .NET and Java come up short because they lack instructions to do things like 32×32→64 and 64×64→128 multiplication, array-subrange add-with-carry, etc., but such deficiencies could be corrected fairly easily.

Alternatively, one could introduce a native language which mostly behaved like C, but distinguished between “pointer returned by malloc-ish call” from “pointer to general object”, with the latter including a base pointer and offset, and further imposed strict compile-time checking of pointer operations. Something, somewhere, needs to be written in a language which can dole out chunks of bytes and allow code to interpret them as structures, but such things could be relegated to “unsafe” sections which programmers would be expected to go over with a fine-tooth comb.

What you could do is to have a C++-ish compiler which had a clear distinction between “pointers” and “references”.

So you could have references (which would be an instance of an object or a fixed-length bounds-checked-on-access array) and it wouldn’t be possible to convert a reference into any data type other than what it was allocated as (i.e. being able to do type casting to valid parent or child classes using RTTI but not being able to convert it into something arbitrary).

Then you could have pointers which would be arbitrary blobs of memory with no bounds checking and fast access.

With this, you could be sure that if you had a reference to a MyClass, the memory behind it really DID contain a MyClass instance and that no code touching that reference (or converting it to something else or passing it through) was going to go outside the boundaries of the memory allocated for that MyClass instance.

If you have a MySubClass and a function that takes a MyClass and need to convert the MyClass into a MySubClass, you would use something that checked RTTI to make sure it really IS a MySubClass and that wouldn’t allow you to access it as a MySubClass unless it was one.
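For polymorphic types, C++ already has exactly this RTTI-checked conversion in `dynamic_cast`; a minimal illustration:

```cpp
#include <cassert>

struct MyClass { virtual ~MyClass() = default; };
struct MySubClass : MyClass { int extra = 42; };

// The cast yields the derived view only if the object really is a
// MySubClass; otherwise it yields nullptr rather than reinterpreting memory.
int read_extra(MyClass &obj) {
    if (auto *sub = dynamic_cast<MySubClass *>(&obj))
        return sub->extra;
    return -1;  // not actually a MySubClass
}
```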

That’s the kind of thing I’d like to see. There are some cases where pointers into objects are necessary, but outside of scope-limited byrefs, such things are rare enough that splitting them into objectRef+offset [stack frames would have a special objectRef associated with them] shouldn’t be too expensive.

Actually, what I’d really like to see would be a revisiting of segment+offset addressing, but with a design focused on having each object be a “segment”. Hardware which “understood” object references could not only perform bounds checks concurrently with other operations (minimizing overhead), but also facilitate concurrent garbage collection without any stop-the-world pauses (the only time it would be necessary to pause a thread would be if it tried to write to an object while the GC was moving it; the thread could be resumed once the move was complete). For such a project to really work, the hardware would have to be designed together with a VM that could work decently on normal hardware, but superbly on the new hardware.

Because addressing relative to stack pointer has received special attention in CPU hardware development over the years while a solution using a heap would probably have much worse performance?

Because C and C++ have ABI to adhere to?

Because there are operating systems out there whose APIs work with stack parameter passing, and removing local variables would not fully eliminate the risk of stack corruption?

Because the stack is part of the underlying hardware and should not be abstracted away for the sake of security through obscurity?

>> but return addresses aren’t on the heap

The heap is arbitrary memory. The stack is also arbitrary memory, pointed to by the stack pointer. Both will at some point contain untrusted input. How about validating it instead of shifting the problem from one place in memory to another?

The Heartbleed bug happened because the developer trusted a field holding the amount of data in the received packet not to be malicious, even though the amount of received data was known and the field could have been checked against it to detect malformed packets.

Many things had to happen for the Heartbleed bug to become a vulnerability. The developer had to make a bad trust decision, yes. But many other things had to happen. The developer had to be able to check in the code. The code had to survive whatever code review process was in place. The project had to be written in a language that doesn’t check buffer sizes automatically. The static analysis tools used had to not find the defect. The code had to be used by a user who relied for their professional-grade security on critical code written by security amateurs. And so on; all of these and many more are required conditions for the bug to become a vulnerability.

” A fundamental difficulty is that the C standard mandates that it must be possible to decompose a pointer into some number of `char` values, each of which must behave as a number, and later take any sequence of `char` values representing those same numbers and convert it back into a pointer.”

This doesn’t strike me as a compelling case against. Yes, if your program does this (access object values through a char*) you will make it particularly hard on a compiler to maintain safety, with a corresponding performance loss. But programs that actually do this are far more likely to hit undefined behavior than those who walk the straight and narrow anyway — I expect actual cases of pointer value aliasing to be rare indeed. Other kinds of aliasing are far more common, but those common cases are undefined behavior.

Ironically (or otherwise) they’re undefined behavior precisely to allow the compiler to introduce assumptions about the program’s sanity and optimize it that way! In this sense a C compiler is certifiably “insane” in that it’s allowed to treat programs as fully correct, not even until proven otherwise. Whatever consequences arise of the program not actually being correct are entirely the problem of the end users.

C is basically a lost cause, worse, a lost soul. It will never wander out of the limbo of undefined behavior it created for itself, and the head start this gave it in early compilers (and to a far lesser extent still gives it in modern compilers) means it’s so entrenched it’ll continue to be used for purposes it’s manifestly unsuited for.

Don’t get me wrong, C was one of my early loves and I still have a fondness for the language, but if you need trusted code, writing it in C is like riding a motorcycle without a helmet. You may look cool, you may never have an accident, but boy are you going to be sorry when you do — and pity the guys who have to clean up the mess.

I agree. The understanding everyone has about C, and that every compiler supports, is that a pointer is an integer address in memory; you can do arithmetic on them just like any other integer, and you can work out how everything is laid out in memory.

The combination of that fact with the goal of C being available on a wide variety of platforms with every memory architecture under the sun leads to a standard that necessarily leaves important parts of the language being undefined. But even though they’re undefined, everyone knows what they are supposed to do in the context of the machine architecture they’re working with.

I think a compiler and runtime that was strictly standard-compliant but didn’t expose the raw memory architecture would quite rightly be perceived as not real C.

The only way to have a C-like language with memory safety that would be accepted by developers is to write a new C derivative. And here we are – with Objective-C, Java, C#, etc etc.

The issue isn’t that pointers can be cast to integers or vice versa, but rather that (1) all data types, including pointers, are required to be representable as a sequence of unsigned char values which do not contain any “hidden” bits, and (2) the only form of non-ephemeral memory allocation one can request at runtime is a sequence of unsigned char; the only way to get anything else is to take a pointer to such a sequence and cast it to some other type. These requirements together not only mean that pointers can be stored in ways compilers can know nothing about (e.g. encrypted, sent out a TCP port, displayed on screen, etc.), but worse–they are *routinely* stored in memory for which an `unsigned char*` alias is known to have existed.

Yes, but my point is that the sequence you describe (“take a pointer to malloc’d memory and cast it”) is exactly the common case of aliasing that *isn’t* hard to statically analyze because it’s structured (in fact, it’s so common I didn’t even realize it *is* aliasing until you mentioned it). Once you move off the beaten path the analyzer has to do more work. The problem is hard *in general*, but you can optimize for the common cases and still get somewhere. In the case of hardened code, you could even demand that people stay within the bounds of what the analyzer can still analyze, insofar as this is reasonable (“I can’t prove you’re right, therefore you’re wrong”).

I’m not saying static analysis is by any means easy even with shortcuts like these, mind you, but the fact that the general problem is hard should not be taken as a fundamental barrier. A great many interesting problems in computing are unsolvable in the general case, or have worst-case exponential running time, but that doesn’t stop us from coming up with practical solutions.

That’s wrong. See e.g. ATS or felix.
They compile to C or C++, using extensive type annotations to help the compiler find bugs (or type contract violations), and actually run close to C speed or even faster.

And compare that to the secure PolarSSL:
Frama-C or KLEE annotations do not harm the resulting C at all, and would also have caught such bugs at compile time.

And as an exciting side result of symbolic annotations: you can generate unit-test cases automatically, including repro test cases that lead to the bugs found.

As soon as you use a pointer, reference or iterator in C++ (also in Modern C++) it’s possible to mess up. Think invalidated iterator, dangling reference or use-after-free for pointer. Or any combination of these.

C/C++ can be memory safe in practice, but never in theory like C# and Java. And in complicated enough programs, much of the time it isn’t memory safe in practice either.

Alternatively, we can extend C type system so that more of these things can be caught at compile time. That’s hard(er) to do for integer overflow. For memory reference errors like heartbleed, it may be enough to attach a size attribute to every pointer. It means “the pointer points to a location with at least these many valid bytes.” The type of memcpy will require both pointers to have at least n bytes. It’s a compile time error if the compiler can’t prove size is large enough. Of course, some code will have to be manually annotated, but compilers can infer sizes for a lot of pointers. For security critical code this may be acceptable.
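The size attribute doesn’t exist in today’s C, but its runtime analogue is straightforward to sketch as a “sized pointer” carrying its bound; the names are invented, and the check shown here at runtime is the one the extended type system would discharge at compile time:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "Pointer to at least `size` valid bytes."
struct SizedPtr { unsigned char *p; size_t size; };

// memcpy's contract, made checkable: both bounds must cover n bytes.
bool checked_memcpy(SizedPtr dst, SizedPtr src, size_t n) {
    if (n > dst.size || n > src.size) return false;
    std::memcpy(dst.p, src.p, n);
    return true;
}
```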

Extending the language is good, and C++ is making progress in this area. But that doesn’t help for the vast majority of existing code. And annotations are an expense, and you can get them wrong. We can strengthen the runtime without changing the source code.

If, as I suggested above, all memory is allocated on the heap, then pointers don’t have to know their own size because you can ask the heap “did you make this pointer, and how many more bytes does it point to?” The compiler then does not have to statically solve the problem, which is hard, or make pointers bigger.
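A sketch of that “ask the heap” interface, using an invented metadata table (real hardened allocators keep similar bookkeeping):

```cpp
#include <cassert>
#include <cstdlib>
#include <map>

// The allocator remembers every live block: base address -> size.
static std::map<void *, std::size_t> g_allocs;

void *checked_alloc(std::size_t n) {
    void *p = std::malloc(n);
    if (p) g_allocs[p] = n;
    return p;
}

// "Did you make this pointer, and how many valid bytes remain past it?"
// Returns -1 if the pointer is not inside any live allocation.
long bytes_remaining(void *p) {
    auto it = g_allocs.upper_bound(p);   // first block strictly after p
    if (it == g_allocs.begin()) return -1;
    --it;                                // candidate block containing p
    char *base = static_cast<char *>(it->first);
    char *q = static_cast<char *>(p);
    if (q >= base + it->second) return -1;
    return static_cast<long>((base + it->second) - q);
}
```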

Sure, making C completely statically memory safe is hard. But runtime safety is not at all easy, either. Firstly, many pointers point to the data segment or stack, not to the heap. This can be addressed by keeping metadata for those areas too. The big problem is that enforcing safety needs to check each load and store — this is very expensive. Valgrind’s memcheck has overhead around 50x. AddressSanitizer, which was designed as a fast alternative, still has about 2x overhead. Neither catches all the bugs either — if you use pointer arithmetic to jump from one allocation to another without touching a red zone, no error will be detected. I worked on a project that stored metadata directly in pointers (making them bigger and requiring changes to programs) and had access enforcement done by hardware. That still had a very sizable runtime overhead.

The fundamental problem is that C was *designed* to be unsafe in many ways, so that it can be fast on hardware from 40 years ago. Now that our computers are much much faster, and we have a lot of critical C code that is actually being attacked, we don’t like this design choice very much. Undoing it is, unfortunately, not easy. IMHO a practical solution is likely to be not very complete or pretty.

I think C/C++ were designed not so much to be unsafe but to produce the fastest and most efficient binaries. Whether these binaries were to run on slow processors or as parts of real-time high-performance systems where a nanosecond could make a difference does not really matter (I don’t have an example off the top of my head; a cruise missile dodging interception attempts?). There is always a compromise between having the maximum freedom to achieve the ultimate efficiency and protecting the programmer from him/herself. So C/C++, and before that assembler, are all about the former: it’s your responsibility to know what you’re doing. The more modern tools like CLR/.NET/C# or Java spend more CPU time on the latter. The obvious solution would be to implement the common protocols using the modern tools, but then we’d have to convince people to spend money and upgrade their old hardware, especially the servers. So I agree: there’s not going to be any complete and elegant solution in sight.

A pointer to a particular data item should consist of two parts: something that identifies the allocation in which the object resides, and an offset into that allocation. If the allocations are relatively fine-grained, this can provide considerable security. Further, in many cases, the offset part of a pointer could be omitted for pointers that will be used only to access standalone allocations. This could allow applications to address more than 4 gigs of memory while storing “object” references in 32-bits (e.g. using 8x address scaling mode on x86 processors). Unfortunately, C as it presently exists wouldn’t work well with such a concept [incidentally, the approach was used on the 8086, and would have been very nice but for a lack of segment registers in the hardware, and dubious language support].

I don’t think much safety can be achieved without making pointers bigger, though, while the only way to dynamically acquire memory is to request a pointer to a bunch of bytes and cast it to a pointer to one’s data type. Safety requires allocations store at least some information about the types that should be stored in them, and there’s no way one can add such a feature to C without either requiring that code using malloc/calloc be rewritten to some other form, or else adding special-case language rules which recognize and translate certain particular exact constructs (so “MyStruct *foo = malloc(sizeof(MyStruct));” would get translated to “MyStruct *foo = alloc();” [or whatever syntax].)

Getting rid of variables on the stack would not have fixed the Heartbleed problem: in this case the variable was allocated on the heap. Funny thing: there is a debug assert that checks the length (look at the dtls1_write_bytes implementation). The programmer decided to skip the runtime check Eric wants to have in production code and in every component; he or she decided to have that check only in debug mode. That was his or her call and we have to live with that. Either that, or write our own SSL implementation.

I haven’t actually seen that many people argue that we ought to write off OpenSSL — even though, IMO, that’s *exactly* what we should do. Any security-critical project that decides to replace the platform memory allocation with something that defies safe analysis, in the name of performance, has thereby disqualified itself from existing.

Disclaimer: I’m shouting from the sidelines since I have no idea how the alternatives are doing or if the OpenSSL developers are making serious moves to get out of the hole they’ve dug.

Sure, but that’s not my point. There are a great many vulnerabilities due to stack semantics, so let’s get rid of them. Once everything is on the heap, then we have a single subsystem that we can fortify.

Actually C-style malloc is very slow (much slower than C#’s allocation) and can’t be much improved because C isn’t garbage collected, so simply using the heap in place of the stack would be a nice 20x-ish performance loss.

What could be done is a RISC-style split of objects and values – i.e. having 2 stacks, say one rooted at %rsp and one at %rbp, and allocating objects whose address is taken only from %rbp (saving temps to %rsp). This would prevent classical buffer overflows.

Again, making the performance argument is arguing that what we should optimize for is getting our private keys shipped off to attackers *really fast*. Arguing that one technique is slower than another is irrelevant; the question is what level of performance and security is acceptable to users. I am plenty willing to accept way lower performance for more security.

The problem of references to stack objects outliving a stack frame has far less to do with whether temporary data is stored on the stack or a garbage-collected heap, than with a more general problem: an inability to ensure that references that are supposed to be ephemeral actually are. What’s needed, I would suggest, is a type system that, instead of equating the type of a reference with the type of the objects it can identify, would recognize that there are different *kinds of references*. I don’t know how well such concepts could be integrated into an existing framework, but storage locations and parameters should indicate things such as [not an exhaustive list]:

1. A reference stored in an instance field, or passed as a parameter to an instance method, will not be exposed to code which does not hold a reference to that instance.

2. All copies of a reference passed as a parameter will disappear before the method returns.

3. All copies of a reference passed as a parameter will disappear before the method returns *except* that the reference, or an object encapsulating it, may be returned. [The caller’s use of the return value would be restricted in the same way as the most restrictive of its arguments].

Once a promiscuous object reference has been exposed to the outside world, static analysis will generally no longer be sufficient to track what might happen to it. I would suggest that having more restricted reference types available would greatly increase the value of static analysis, and would allow code to safely make many optimizations which can at present only be made “unsafely” [e.g. elision of defensive copies when storing or retrieving data].

Someone else suggested Rust to me, but couldn’t point me to a good “getting started” description. From the general description they gave, it distinguished thread-local objects from thread-sharable objects (could be very useful on NUMA architectures) but I wasn’t clear on what things it could and couldn’t do. Can you point me to a good introductory reference?

The tutorial looks interesting. Some ideas there I like, such as distinguishing a declaration which associates an identifier with a value from one which associates it with a storage location that is initialized to a value. I don’t particularly care for the decision to make foo.bar equivalent to (*foo).bar [many of the complaints people have about mutable structures in C# stem, I think, from the failure to distinguish foo.bar from foo->bar], but perhaps it was necessary to make generics work?

I don’t do IRC, but discussing Rust over a period of a few days on SO chat might be helpful. Parts of it certainly do seem intriguing.

Eric – the programmer followed both of your suggestions: 1. the variable was allocated on the heap; 2. the boundaries were checked, but only in debug mode. The programmer made a conscious trade-off not to check boundaries in production code. Bugs happen, and at the end of the day I think the programmer did a competent job, which makes a lot of comments in this thread look like posturing in the speed-vs-reliability debate. By the way, yours truly introduced a buffer overflow bug in a component written in C# (the managed component wrote data into a buffer allocated by an unmanaged component). The FxCop tests mandatory for check-ins accepted the bug. So your suggestions do not fix all situations. How was the buffer overflow caught? We used an instrumented copy of the unmanaged component. Why did we not ship the instrumented copy? Because it was too slow…

Some time ago, a vulnerability was revealed in OpenSSL, and I guess there’s no programmer who hasn’t been talking about it since then. I knew that PVS-Studio could not catch the bug leading to this particular vulnerability, so I saw no reason for writing about OpenSSL. Besides, quite a lot of articles have been published on the subject recently. However, I received a pile of e-mails, people wanting to know if PVS-Studio could detect that bug. So I had to give in and write this article: http://www.viva64.com/en/b/0250/

Unfortunately, some people are more concerned with speed than correctness or security. I started on an in-progress project for an unreleased web app, and after looking at the code, asked if anyone had ever heard of SQL injection attacks. They didn’t want to bother, “keeping honest people honest.” It took me typing “‘); DROP TABLE Users; –” into the login field to convince my boss that sanitizing input was necessary. After we fixed all the code, my boss complained that the website was slower (it still loaded in less than half a second). I think if he realized that we could have reverted the code with the repository I set up, he would have.

I find it odd how much energy is being spent on “making C safe” when, as a previous commenter put it, “The fundamental problem is that C was *designed* to be unsafe”.

Perhaps it’s time to look into dropping C, or at least libraries written in C in favor for those that are amenable to safety-guarantees — Ada is favored for a lot of safety-critical work precisely because of the level of static analysis that can be done w/ it… moreover, it is one of only two languages I know of where ‘maintainability’ is a design-goal (the other being Eiffel).

It just seems to me that doubling down on C is just “throwing good money after bad.”

There is at least one operating system written in a special version of C#. Heck, there are operating systems written in Haskell! Both have good performance. Just because traditionally operating systems are written in unsafe languages does not make that a requirement.

Well, no. Because 64 KB is a lot of memory, and it may span many malloced-but-not-freed objects with interesting data. The problem of freed objects retaining their old values is only an additional insult.

And you are aware that a compiler is allowed to optimize “memset(ptr, 0, sizeof(*ptr)); free(ptr);” to just “free(ptr);”, aren’t you?
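The dead-store elimination is legal because a conforming program can’t read the memory after the free. C11’s optional `memset_s` may not be elided that way, and where it’s unavailable the usual portable fallback is a volatile-qualified loop, sketched here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Volatile writes are observable behaviour, so the compiler may not
// discard them even though the buffer is about to be freed.
void secure_clear(void *p, std::size_t n) {
    volatile unsigned char *vp = static_cast<volatile unsigned char *>(p);
    while (n--) *vp++ = 0;
}

void wipe_and_free(void *p, std::size_t n) {
    secure_clear(p, n);  // survives dead-store elimination
    std::free(p);
}
```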

What do the experts say on how practical this is? “Allocate everything on the heap” sounds like an interesting direction for a start, but has anyone who has actually touched a CRT’s source code sat down and genuinely thought through what it would take to make out-of-bounds accesses reliably trigger a crash without breaking programs written C-style?