Analysis of CVE-2009-0658 (Adobe Reader 0day)

Bow here again. It has been a while since we posted a binary analysis on our blog, so I figured we would post one for a vuln that has been getting a lot of hoopla the past few weeks :)

Whenever there is a critical vulnerability in a widely used product, we perform an internal binary analysis in order to get a complete picture of the vulnerability and write reliable countermeasures for it. In some cases we post the analysis to our threat intelligence portal, and in others it simply stays internal to SecureWorks. Enough of that, on to the fun part:

The vuln we will be talking about is CVE-2009-0658, which is a code execution vulnerability in Adobe Acrobat. There are a few PoCs hosted over at milw0rm for this vulnerability:

The interesting thing about these two PoCs is that they are both supposedly for the same vulnerability, yet they crash in two different locations (which will be explained later). I started off debugging the vulnerability using the second exploit (8090). If you attach a debugger to Adobe Reader, you'll see that the crash occurs here:

These are 4 separate JBIG2 streams within the PDF, and any of them could trigger the vulnerability; however, debugging reveals that the first stream triggers the vuln. The first 64 bits or so of the initial stream are shown here:

So now we need to determine what in this stream causes the crash. I started by tracing back the values used in the pointer calculation that ends up being 0x41414141 in the crash. Before continuing, it is worth noting that the calculation at 0x009ADAF1 results in a value of 0x41414141, but both registers contain the same value, 0x0D0D0D0D (which is 0x41414141/5), which is again obviously user controlled. Both of the registers used in the calculation, EBP and EDX, obtain their values shortly before the crash, so set a breakpoint just before the crash at 0x009ADAD7. The next few instructions are a fairly annoying sequence of arithmetic operations, so we'll walk through them step by step. Here is another annotated version of the disassembly:

It is important to note before moving on that both values in EBP and EDX during the crash are obtained by accessing a pointer to some object that is pointed to by EAX (which is calculated at [4]). We will need to focus first on what the value inside of EAX is and how we can control it.
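As a quick sanity check on the numbers above, the following minimal sketch recomputes the faulting address from the two register values. It assumes the address is formed as base plus index times four (the usual x86 base + index*scale form), which is consistent with 0x0D0D0D0D being 0x41414141/5 but is not taken from the disassembly itself. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # x86 (32-bit) addresses wrap at 2**32


def effective_address(base: int, index: int, scale: int = 4) -> int:
    """Model a 32-bit base + index*scale address computation.

    Assumption: the crash-site calculation has this shape; the scale of 4
    is inferred from the 1:5 ratio mentioned in the text, not from the
    actual instruction.
    """
    return (base + index * scale) & MASK32


# Both registers hold the sprayed filler value at the time of the crash.
EBP = 0x0D0D0D0D
EDX = 0x0D0D0D0D

crash_address = effective_address(EBP, EDX)
print(hex(crash_address))
```

With both inputs set to the spray value, the result matches the 0x41414141 seen in the debugger, which is why the filler byte 0x0D shows up as 0x41 in the faulting address.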

The first calculation ([1]) is part of an array processing loop where EDI is the counter and EBX is the base pointer to the array. The array holds a series of objects; the loop is fairly small and just seems to iterate through every available object, moving some pointers around. The second instruction ([2]) moves an integer into EAX, the value of which is 0x00333333 during exploitation, which should look familiar. If you look at the bytes from the file, 0x00333333 is very clearly at the start of the stream; we can verify this statically or by modifying that value in the file and debugging again. For the sake of time, I will just state that it is indeed from the file and is completely user controlled. If we move down, we see that value is used yet again in a calculation, which results in a value of 0x00FFFFFF ([3]) being placed into EAX. Shortly after this calculation, an array access is performed using our controlled value and the result is put into EAX, which is used later on as a pointer to some object ([4]). This is where the vulnerability lies: we are able to read a value outside the array, which is then used as a pointer for a memory write. It is worth noting that while debugging the exploit, the memory locations surrounding the array access are filled with 0x0D0D0D0D, and it seems to be sprayed across a large portion of memory.
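The out-of-bounds read described above can be modelled in a few lines. This is a toy sketch, not a reconstruction of Adobe's code: the multiply-by-five scaling is an assumption chosen only because it maps 0x00333333 to 0x00FFFFFF, and the array contents, its size, and the helper name are all hypothetical.

```python
SPRAY_DWORD = 0x0D0D0D0D  # filler the write-up observes around the array


def unchecked_read(array, index, neighbouring=SPRAY_DWORD):
    """Model an array read with no bounds check.

    In-range indices return real array entries. Anything else returns
    whatever happens to sit in adjacent memory, modelled here as the
    heap-spray filler value.
    """
    if 0 <= index < len(array):
        return array[index]
    return neighbouring


# Value taken from the segment page association field in the file.
page_association = 0x00333333

# Assumption: a x5 scaling step, consistent with the 0x00FFFFFF seen at [3].
scaled_index = (page_association * 5) & 0xFFFFFFFF

# Hypothetical table of eight legitimate object pointers.
objects = [0x01000000 + 0x20 * i for i in range(8)]

object_pointer = unchecked_read(objects, scaled_index)
print(hex(scaled_index), hex(object_pointer))
```

The point of the model is that the attacker never needs to control the array itself. Controlling the index is enough, provided the memory it lands in has been filled with a useful value.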

So to summarize all of this, we have an array of objects which are accessed via a user-controlled value and whose member variables are used to calculate an address to write to. This results in us being able to write anywhere in Adobe Reader's address space by using some trickery. From here we need to see what the values used in the calculation represent in the file. We'll start by looking at the stream documentation.

We know that we are dealing with the first few bytes in the JBIG2 stream, so it is likely to be a stream header of some sort. The JBIG2 documentation says that encoded JBIG2 streams are broken down into segments, each made up of two parts: the segment header and the segment data (see page 71). The format of the segment data differs depending on the segment type, but the format of the segment header is the same across all segment types. Knowing this, we can start looking at the segment header format. Here are the first few bytes of the stream again:

According to the documentation, the first DWORD in the stream is always the segment number (pg 72, 7.2.2). The next byte is a flags field ([FL]) which indicates the segment type (the lower 6 bits) and two possible characteristics of the stream, page association field size (bit 7) and deferred non-retain (bit 8). In the proof of concept, the 7th bit (page association field size) is the only bit in the flags that is set. The byte that follows the flag field is a varying length object called the referred-to segment count [RT], in this case, it is only a byte long and the value is zero, indicating that this segment does not refer to any other segments. If this value were non-zero, then the object could be larger and would be followed by another variable length object which contains the numbers of the segments that the current segment refers to. Since the referred-to segment count is zero in the PoC, we don't need to worry about this too much. Finally, the next DWORD is the segment page association ([SPAssociat]), which will be a byte if the page association field bit is not set or a DWORD if it is. In this case, the bit is set and so this field is a DWORD. The segment page association is the value that is responsible for the access violation.
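To make the header layout concrete, here is a minimal parsing sketch for the fields just described. It only handles the simple case seen in the PoC (a one-byte referred-to segment count of zero) and rejects anything else. The class and function names are hypothetical, and the segment number in the sample bytes is a placeholder, since only the flags, the referred-to count, and the page association value are known from the text. JBIG2 stores multi-byte integers big-endian.

```python
import struct
from dataclasses import dataclass


@dataclass
class SegmentHeader:
    number: int                # 4-byte segment number
    seg_type: int              # lower 6 bits of the flags byte
    page_assoc_is_dword: bool  # the "7th bit" (mask 0x40)
    deferred_non_retain: bool  # the "8th bit" (mask 0x80)
    page_association: int      # 1 or 4 bytes depending on the flag


def parse_segment_header(data: bytes) -> SegmentHeader:
    """Decode the start of a JBIG2 segment header (simple case only)."""
    number = struct.unpack_from(">I", data, 0)[0]
    flags = data[4]
    referred_to = data[5]
    if referred_to != 0:
        # Non-zero counts add variable-length fields this sketch skips.
        raise ValueError("referred-to segments are not handled here")
    offset = 6
    wide = bool(flags & 0x40)
    if wide:
        page_association = struct.unpack_from(">I", data, offset)[0]
    else:
        page_association = data[offset]
    return SegmentHeader(
        number=number,
        seg_type=flags & 0x3F,
        page_assoc_is_dword=wide,
        deferred_non_retain=bool(flags & 0x80),
        page_association=page_association,
    )


# Placeholder segment number (1), flags 0x40, RT 0, page association 0x00333333.
sample = bytes.fromhex("00000001" "40" "00" "00333333")
header = parse_segment_header(sample)
print(header)
```

Clearing the 0x40 bit in the sample shrinks the page association to a single byte, which is why that bit matters for reaching large index values.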

I like to always verify that whatever I'm reversing actually lives up to the specification, so we'll take a look at some of the code that processes this header structure. Unfortunately, the Adobe code that processes JBIG2 streams is significantly more complicated than other code I've seen dealing with JBIG2, so the only thing I am going to deal with here is whether or not the page association bit needs to be set for exploitation. We know from the documentation this is supposed to be true and we could just fiddle with it in a hex editor, but where is the fun in that ;) So first things first, we need to find where our file is in memory and look for references to it (specifically the 5th byte).

The first routine we will be looking at is sub_9A8D50 (boo no symbols), where var_2C is a pointer to our stream buffer that contains the stream itself. The pointer in var_2C is obtained from the return value of another function, sub_30D220, and is the result of a call to a typical malloc type function (although there is some other code wrapped around it). This occurs during this virtual function call:

There are two important things to note here: first, that the stream length is used for the allocation, and second, that the stream data is not copied over in sub_30D220 but in another routine we'll look at a little later. Once sub_30D220 returns, we see that the return value is stored into var_28, var_2C, and var_30, so we have three variables holding onto the stream buf pointer. We then run into another virtual function call (yay):

This virtual function call ends up resolving to sub_317B30, which is a routine with a loop and a number of function calls inside of it. During the second iteration of the loop, the following is executed:

You should see a couple of things here. First, the next hardware breakpoint will trigger where indicated above. You then see that the segment flags are copied to another location. Moving down a little ways, the lower byte of the EAX register is TESTed against 0x40. The setnbe instruction will set the destination operand (lower byte of EDX in this case) to 1 if both CF and ZF are equal to zero, or set it to 0 otherwise. Since test performs an AND and the 7th bit of the flags is set, the AND will result in ZF not being set, so DL will be set to 1. Moving down we see that the result of the previous instruction is tossed around and eventually stored back into an object in memory.

This tests the result of our previous test against 0x40 and makes a function call if the bit was set. A quick analysis of sub_9BA6B0 indicates that it loops through the segment page association field, reading it byte by byte and returning the full DWORD. The return value is then stored into memory at ESI+1Ch. This is the location from which the vulnerable routine later reads the page association value that results in the access violation. Now, moving back into sub_9AD380, we see that the value read and stored at ESI+1Ch is accessed and tested to see if it has a value:

It is of interest to note that this is just above where the crash occurs with one of the other PoCs. If you are debugging, it is important to understand why one PoC (8099) crashes here while the other (8090) does not. In 8099, the add instruction that triggers the crash cannot resolve to a valid memory address, resulting in the access violation. In 8090, however, the large amount of allocated memory filled with 0D's (the result of a JavaScript heap spray) increases the likelihood that the address calculation will resolve to a valid memory address. Moving down, we see that value is accessed once again prior to the final crash:

So, looking back now, we can see where the page association field size is checked and where memory is written accordingly. In addition, we can also see where the value used in the array access is written into memory.
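The flow recapped above (check the size bit, read the field byte by byte, store it at offset 1Ch of the segment object) can be sketched as follows. This is an illustration of the described behaviour only. The function names are hypothetical, the segment object is modelled as a plain dict keyed by field offset, the big-endian byte order is assumed from the JBIG2 format, and the sample segment number is a placeholder.

```python
PAGE_ASSOC_SIZE_BIT = 0x40     # flag bit tested in the parser
PAGE_ASSOC_FIELD_OFFSET = 0x1C  # where the value ends up (ESI+1Ch)


def read_dword_bytewise(buf: bytes, offset: int) -> int:
    """Assemble a 32-bit value one byte at a time, most significant first."""
    value = 0
    for i in range(4):
        value = (value << 8) | buf[offset + i]
    return value


def store_page_association(stream: bytes, segment_object: dict) -> int:
    """Read the page association field and store it in the object.

    Assumes a one-byte referred-to segment count of zero, as in the PoC,
    so the field starts at offset 6 of the segment header.
    """
    flags = stream[4]
    offset = 6
    if flags & PAGE_ASSOC_SIZE_BIT:
        value = read_dword_bytewise(stream, offset)
    else:
        value = stream[offset]
    segment_object[PAGE_ASSOC_FIELD_OFFSET] = value
    return value


segment = {}
store_page_association(bytes.fromhex("00000001400000333333"), segment)
print(hex(segment[PAGE_ASSOC_FIELD_OFFSET]))
```

Nothing in this path validates the value against the number of pages, so whatever the file supplies is what the later array access uses.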

Several sources have indicated that JavaScript is not necessary for this vulnerability to be exploited. JavaScript is indeed not needed to trigger the vulnerability; however, I've yet to see any exploit that is reasonably reliable, with or without JavaScript. The sample that we debugged from "the wild" used JavaScript for the heap spray and had a fairly small rate of success. There are potential ways of increasing the reliability of the exploit without using JavaScript, but we have not yet captured anything of that nature in the wild.

This issue was patched by Adobe on March 10th, 2009. A link to the Adobe advisory is available here: