Wednesday, March 29, 2017

Black box discovery of memory corruption RCE on box.com

Overview

Robust evidence existed for the presence of a memory corruption based RCE (remote code execution) on box.com servers. The most likely explanation for the evidence presented is the usage of an old ImageMagick which has known vulnerabilities, combined with lack of configuration lockdown. It's hard to be sure, though: see the section on the Box response below.

This blog post explores a different angle to vulnerability research from my normal work. We'll look at how to try and reliably determine the presence or absence of a known server-side memory corruption vulnerability using strictly black box techniques. Black box testing can be fun, because it's very scientific: come up with a hypothesis, devise an experiment, see if the results make sense.

Side notes: box.com security and response

I had a sufficiently unusual experience talking to Box that it's worth a few dedicated notes. I found the experience good in place and awful in others.

The good: the three issues I reported all appeared to get fixed reasonably quickly, including one that was not a simple fix.

The awful: communications were painful, as if they were filtered through a gaggle of PR representatives and an encumbrance of lawyers. The current status is that I believe the issues are fixed -- again via black box testing, but no-one has really confirmed the issues existed in the first place. All I have is some language that says everything is fine and that the security posture was improved, without saying if my reports were accurate or rejected. Being slippery in researcher communications is not the way to build trust in your security program. I also note that Box is behind vs. its competitors due to the lack of a bug bounty program. To avoid a train wreck, open and honest researcher communications need to be addressed prior to the launch of such a program.

It's worth noting that I had an interaction with DropBox for similar reasons, at about the same time (separate blog post pending). The experience could not have been more different! DropBox were friendly, competent and forthcoming. Maybe that's where you want to store your files instead of Box, if security is a priority.

An inaccurate fingerprint

The target of this investigation is image thumbnailing. Online file storage services like Box, DropBox, Google Drive etc. typically support displaying of image thumbnails in the file list and also a preview window. In order to fingerprint the software used, we need to upload a diverse set of files and see if we find any behavioral clues.

In order to "trick" the thumbnailing process into revealing its full capabilities, we use a simple trick of taking unusual file types and renaming them to .png. Often, the thumbnailing process only cares that it thinks it sees a known file extension and then it will stuff the bytes into some process that ignores the file extension and uses header sniffing to work out what to really do.

Very quickly, I found that Box will thumbnail tons of weird and wacky formats. Including the following, and bonus points if you've heard of any: CIN, RLE, MAT, PICT. The list would have probably gone on but when you see a list like that it usually means one of two things.... ImageMagick or GraphicsMagick.

I noticed that a particular CIN file from the GraphicsMagick test set (input_rgb.cin) rendered very differently in my local installs of GraphicsMagick and ImageMagick. In this image below are renderings of the same CIN input file by GraphicsMagick-1.3.23 and ImageMagick-6.8.9. As you can see, GraphicsMagick renders with more extreme contrast.

Unfortunately, I inaccurately decided Box was using GraphicsMagick because the Box thumbnail looked more like the image on the left. This was a very lax determination because there's a lot of color variation depending on the software versions and also the tool and options used to produce or view the thumbnail. We'll correct the mistake later on in this post.

The vulnerability

Under the false belief that GraphicsMagick was in use, I had a look at recently fixed vulnerabilities in GraphicsMagick. The v1.3.24 release notes looked promising (May 2016), because they reference a lot of memory corruption fixes. In addition, my Ubuntu 16.04 LTS has v1.3.23, making testing easy. After a bit of consideration, this GraphicsMagick patch represented a good candidate for exploration. It's a fairly straightforward buffer overflow in the RLE decoder, with good control.

The overflow occurs because of a missed validation on the number of planes (where RGB would be 3 planes, for example) in the canvas allocation vs. the plane number requested to be written in the RLE decode protocol:

Side note: this is a really unusual image format, the Utah Raster Toolkit RLE (run length encoded) format. Here is a link to the home page, complete with 1990's look. And yes, I did download the original toolkit, apply both patches, and quickly read it for the presence of the same vulnerabilities. It does appear to have them. This software was written in the _1980_'s, so not really reasonable to expect otherwise. Since I'm not sure I ever found a security bug still live from the 80's, might as well assign: CESA-2017-0001.

The oracle

In security, an oracle is simply a measurable result that gives the attacker some useful information. For our oracle, we're simply going to use this: "did the uploaded image thumbnail successfully or not".

There are of course multiple reasons why an image thumbnail might fail. The image might simply not pass a validity check, or the image might cause a SEGV in some backend. Ideally, a backend SEGV would be indicated back to us via a different oracle (perhaps a 50x error code from the HTTP request for the thumbnail). Unfortunately, testing eventually showed that a failed thumbnail, for any reason, manifests to us as:

Multiple requests to https://app.box.com/i/f_153109389238/thumbnailURLs over a period of time, leading to a long pause in the UX (throbbing icon) even though this is a "fail fast" case.

HTTP response codes of 200 OK.

Starting to construct a proof of vulnerability

To show we're on the right track, we're going to try and build an input file that will thumbnail successfully with the vulnerable code, and fail if the fix is in place. Ideally, the failure will be because of a deterministic check and not a memory corruption failure (which is not particularly deterministic from our vantage point). Let's look at the fixed code:

As it turns out, there is a subtle difference here that we can use. The vulnerable code lets the p pointer go out of bounds, but stops writing if x or y go out of bounds. The fixed code bails immediately if p goes out of bounds. So we can simply make an image that has a small number of pixels (say, 16x1), and then request a large RLE pixel run (say, 0xff). The vulnerable code will accept this, and clamp the out of bounds values. This is not a case where the vulnerable code will corrupt memory, but it is a case where the vulnerable code will exhibit different behavior. The fixed code will reject this case.

File: gm_intra_oflow_1_3_23.rleNotes: as a bonus, this file also uses an out-of-bounds plane value. The value is chosen such that any validation would reject it, and also that the out-of-bounds behavior it causes will still remain within the bounds of the pixel buffer.Result: thumbnailed successfully on Box. Thumbnailed successfully on Ubuntu 16.04's v1.3.23. Fails on v1.3.25. Fails on Ubuntu's ImageMagick v6.8.9.

This is fairly compelling evidence already. For our next test, we could proceed to try and cause a crash via any old heap overflow... but this is not going to be reliable. A crash vs. non crash is going to depend on heap state, which in turn will depend on versions of GraphicsMagick, versions of libc, versions of everything.

A deterministic heap overflow crash

To proceed, we're going to assume that Linux x64 and glibc malloc() might be in use on the backend, and use a quirk of this combination: allocations larger than a certain size (often 128kB or so) are allocated using mmap(). This will provide 4096 byte alignment and size. We can then calculate the size of any mapping created by our input file, and play tricks like writing a single byte past the end. This will either hit the adjacent mapping (may or may not crash depending on writability) or no mapping (crash). Our first attempt is: gm_oflow_mmap_chunk.rle, which tries to write off the end of a large allocation. It's only 24 bytes so let's look at it in its entirety:

We were expecting a crash, so let's look at what happened. valgrind certainly sees the problem, it reports "Invalid write of size 1... 2 bytes after a block of size 262,080". In the debugger, we can confirm the out of bounds write, which happens to cross over into the adjacent mmap chunk which of course happens to be writable (otherwise we would have received a crash). Here are the mappings in question:

The mapping highlighted in red is actually a concatenation of the mapping we smashed off the end of, and some other mapping. What is that other mapping?

(gdb) x/8xa 0x7ffff7fbf000

0x7ffff7fbf000:0x4100002b4187d70x417ffff141cb21

0x7ffff7fbf010:0x417ffff14162a80x417ffff7415000

0x7ffff7fbf020:0x4100004a41a9880x417ffff141bf3b

0x7ffff7fbf030:0x417ffff141de780x417ffff7415000

Well, it's chock full of pointers and you can see we've partially corrupted them by spraying some 0x41 values around. It appears that no used code path dereferences these particular pointers, otherwise we'd have seen a crash. Seems like a great avenue for exploitation, as we can expect this mapping layout to be reasonably repeatable. However, exploitation is not our current goal. Our current goal is to deterministically crash when we go out of bounds. Linux typically uses a top down and first fit algorithm for placing mmap chunks, so if we simply make the allocation bigger, it won't fit in its current place and will go elsewhere.

Our solution is to tweak our image size to be 255 x 16448, still with 4 planes. This causes an allocation of 16776960 bytes. Taking into account the glibc 16 byte header, and rounding the mapping size will be 0x1000000, with the result looking a bit like this:

Result: Fails to thumbnail on Box and crashes with SEGV in v1.3.23 locally.

To go into detail on the calculations resulting in an off-by-one condition, let's look at the RLE protocol bytes:

02 f4: set plane to 0xf4

03 fe: set X to 254 (default Y is 0, which means bottom)

06 00 41 00: write 1 byte (0x00 + 1) of value 0x41

The calculation of the write offset is (16447 * 255 * 4) + (254 * 4) + 0xf4, which is 0xfffff0. Add in the 16-byte glibc header and the write offset, relative to the start of the mapping, is exactly 0x1000000, which is exactly off-by-one. Here's the crash locally:

=> 0x00007ffff7a7e0af :mov %r13b,(%rcx)

r13 0x4165rcx 0x7ffff1a60000140737247576064

There are other reasons this file might not thumbnail on Box. A failed thumbnail could be because we hit a backend server that was taking a shower, so we run a few tests across a few days to confirm it consistently fails to thumbnail. Another reason could be the large allocation, if the server has some limits on allocation sizes. So as a final very interesting test, we can change the 0xf4 value in the test file to 0xf3, leading to:

File: gm_oflow_mmap_chunk_oob0.rleNotes: An off-by-zero file, i.e. writes perfectly at the very end of the mmap() allocation. This write is out-of-bounds regarding the actual size passed to malloc(), but in-bounds regarding the mmap() mapping.Result: Thumbnails ok on Box and also locally with v1.3.23.

We now have some pretty compelling evidence. By changing a single byte in an input file, which has no effect other than to change an out-of-bounds write offset, we go from consistent success to consistent failure using our thumbnail oracle.

A more accurate fingerprint

In communications with Box, they were quick to deny using GraphicsMagick but very cagey regarding the actual software used. I don't think this was a particularly clever move: image thumbnailing software is fairly easy to fingerprint because each image decoder has its own set of observable decode capability, quirks, bugs and also vulnerabilities. And the inability to have an open conversation likely lowered the amount of benefit obtained.

That said, my hasty and incorrect earlier fingerprint attempt was a minor personal embarrassment, so I set out to more accurately fingerprint the software in use. Given the set of image formats supported, and the denial of usage of GraphicsMagick, the only other reasonable suggestion is ImageMagick. In response to my report, Box disabled all of the crazy / fringe decoders. Good move. One remaining decoder supported is PSD (Adobe Photoshop). In an attempt to fingerprint positively for ImageMagick, here's a PSD file with the following properties:

A v2 PSD file: supported by ImageMagick but not GraphicsMagick, gimp, etc.

Contains a ZIP (deflate, really) compressed channel. Supported by ImageMagick but not GraphicsMagick.

Declares greyscale (1 channel) images but tries to render a magenta pixel by referring to the presumably "out of bounds" green channel. ImageMagick gives an RGB image with magenta pixel, not greyscale!

Result: does not render in GraphicsMagick. Renders a single magenta pixel at offset 2x2 in ImageMagick.

Box: walks like an ImageMagick, quacks like an ImageMagick...

Conclusions

By careful construction of an input file, we've produced strong black box evidence of a memory corruption vulnerability in Box's image thumbnailing process. But we've also determined that Box are not using GraphicsMagick, but likely using ImageMagick instead. Are these two facts compatible? Yes. GraphicsMagick forked from ImageMagick back in 2002. At that time, the codebase was extremely buggy and therefore since 2002, both GraphicsMagick and ImageMagick have suffered from a stream of the same vulnerabilities, with little co-ordination, and patches arriving at very different times.

ImageMagick had the same vulnerability we've been discussing, but it was fixed a couple of years ago. This suggests that Box were running a 2+ years old version of ImageMagick, loaded with vulnerabilities, without any particular attack surface reduction efforts, and without binary ASLR enabled (to be discussed in a future post).

We can also conclude that the split between GraphicsMagick and ImageMagick has led to a bit of a mess. If you look at all the vulnerabilities fixed in GraphicsMagick, not all of them are fixed in ImageMagick and visa versa. This is a potentially great source of bugs; I'm not even sure if you'd call them 0days or 1days.