Wednesday, 30 January 2008

Three Categories of Buffer Overflow in the JRE

Some people think that writing code in Java is a silver bullet against implementation flaws such as buffer overflows. The truth is a little murky. Certainly, there is no provision for overflows in pure Java code; reading or writing past the end of an array generates an exception, as the following toy code demonstrates:

But real code, though it might be written in 100% Java, depends heavily on the Runtime Environment (JRE) and the JRE contains methods that are written in straight C. We all know what happens when C hangs out with its buddies: fixed size buffer, strcpy and user input.

So how do you even start to assess the attack surface of the JRE? Perhaps I'll go into this in more detail in a future post if anyone is interested, but briefly for now, if we discard logical flaws in the JRE that let you escape the sandbox (as attempting to measure exposure to these is really hard) and concentrate solely on the native code parts, we can:

Or enumerate all methods of all classes in the runtime and count up those marked as native. This can be done in a few lines of code using the Byte Code Engineering Library (BCEL), a great project for low level manipulation and construction of classes.

With a list of the native methods, perform some static analysis to score each method - how much code does it contain (including all the code within all function calls), does it process data that might be untrusted and so on.

But I digress. The main point of this post is to highlight three categories of buffer overflow that exist within the JRE, so here they are:

Buffer overflows in file format parsers

Much of the JRE file format parsing code is implemented in native code, typically either for speed or because the code originated elsewhere. This includes BMP, GIF, JPEG, ICC, TTF and Soundbank parsing, and a few I've probably forgotten.

Incidentally I was first alerted to this when a Java application I was running (it happened to be Burp Proxy, seriously!) started crashing, leaving the familiar hs_err.log file behind. The log showed that I was triggering an access violation in fontmanager.dll, which I tracked down to a corrupted TrueType font I had in my fonts folder (TrueType fonts are hard things to parse - there's a mixture of 16 and 32 bit fields, lengths, offsets and to cap it all, provision for a virtual machine, as you'll already know if you read my last post!).

Chris Evans did some great write ups on the bugs he found in the JRE image parsers here and here.

Buffer overflows in the platform API wrapper code

In addition to file format parsers, methods that interact with the OS are also ultimately implemented in native code as they need to call the appropriate platform API. These methods typically need to convert Java datatypes such as a String into a C datatype, such as a wide character array. Sound like a potentially hazardous operation? Well my colleagues at NGS, Wade Alcorn ("The King of BeEf") and Marcus Pinto (of Web Application Hacker's Handbook fame) found such a bug in BEA's JRockit JVM. The NGS advisory is here. This issue could be triggered remotely as an unauthenticated user against WebLogic Server by requesting a long URL (!) which triggered an overflow as the path was canonicalised.

Buffer overflows in the underlying platform APIs

The previous category comes about from insecurely preprocessing data before handing it off to a platform API. Let's consider the opposite - doing no processing and exposing a bug in a platform API. A notable example of this category is a critical vulnerability discovered by Peter Winter-Smith, another colleague of mine at NGS. He found an overflow that could be triggered by passing a string of 65536 bytes to gethostbyname, exported by ws2_32.dll. This issue was fixed in MS06-041 (NGS advisory here). It was trivial to generate Java code to hit this bug.

Now you may be thinking that it isn't really fair to call this a Java problem as it is clearly an OS/third party library bug. Perhaps it isn't fair :) It is interesting though that in some areas of the JRE, the layer on top of the platform APIs is so thin that these types of bug are exposed (I think Peter actually found the gethostbyname bug while testing a Java application!) Also note that this further complicates attack surface analysis :(

A final note on how these affect different types of Java application. The example I gave in (2), is a clear example of buffer overflow in the Java runtime that can be used to compromise a server. The examples in (1) and (3) less so. Its feasible that a Java Enterprise application may parse a file uploaded by a user but it obviously depends on the purpose of servlet. On the other hand, a malicious applet that attempts to exploit the browser through a file format bug in the JRE is certainly conceivable.

And as for mobile Java, their runtime implementations do not typically share the native code components with the desktop JRE so the chances of there being an all conquering cross-device cross-architecture cross-Java implementation vulnerability are pretty slim (despite news to the contrary), though I'll stop short of saying impossible :)

2 comments:

Chris
said...

Excellent post...

Random point - one of the things that I think makes Java/c# interactions with c hazardous is the difference in the way strings are handled. In c, you'd typically be very careful about string lengths, whereas in Java everything is handled for you. So in a typical c server app, you'd have hard length limits on various parts of a text-based protocol, whereas in Java you don't need to worry about it. That's fine as far as Java is concerned, but once the long string gets passed through to some lower-level c component there's a problem.

My point is, it's more likely that long strings will be passed around by a Java app. So while you might struggle to find a c server that you could use to exploit MS06-041 (requiring a 65536 byte hostname), quite a few Java servers are "vulnerable".

Yes, these native bugs are like wasserleichen (drowned corpses).They were dropped in the java river when Sun coded some parts of the JRE with native libraries (awt,nio,sockets,bidi) in the mid-90s, now with advanced fuzzing and better coverage awareness they are consequently popping up.