Part 16: Kernel Exploitation -> Pool Overflow

Hola, and welcome back to part 16 of the Windows exploit development tutorial series. Today we will be exploiting a pool overflow using @HackSysTeam's Extreme Vulnerable Driver. Again, I strongly recommend readers get a leg up and review the resources listed below before getting into this post. For more background on pool allocations see part 15. Details on setting up the debugging environment can be found in part 10.

Obvious bug is obvious! The driver allocates a pool chunk of size X and copies user-supplied data into it. However, it does not check whether the user-supplied data is larger than the memory it has allocated. As a result, any extra data will overflow into the adjacent chunk on the non-paged pool! I suggest you explore the function further in IDA. For completeness, the function prologue can be seen below showing the pool tag and allocated chunk size.

We can use the following PowerShell POC to call the function. Notice that we are using the maximum available size; any further data will spill over into the next chunk!

As we can see, the allocated chunk has a size of 0x200 and our buffer stops right next to the adjacent pool header. Let's try that again and increase the size of our buffer by 8 bytes, overwriting the subsequent chunk header.
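To make those extra 8 bytes concrete, here is a toy Python model of two adjacent chunks. It is only a sketch: it assumes an 8-byte pool header (as on 32-bit Windows 7), derives the usable size as 0x200 minus that header, and all names (`make_pool`, `unchecked_copy`, `next_header`) are made up for illustration. Nothing here touches the driver.

```python
POOL_HEADER_SIZE = 0x8   # assumption: 8-byte pool header on 32-bit Windows 7
CHUNK_SIZE = 0x200       # chunk size reported in the text
DATA_SIZE = CHUNK_SIZE - POOL_HEADER_SIZE  # derived: usable bytes in the chunk


def make_pool():
    """Two adjacent chunks in one flat buffer, headers marked with b'H'."""
    pool = bytearray(2 * CHUNK_SIZE)
    for base in (0, CHUNK_SIZE):
        pool[base:base + POOL_HEADER_SIZE] = b"H" * POOL_HEADER_SIZE
    return pool


def unchecked_copy(pool, user_data):
    """Copy user data into the first chunk's data area with no length check."""
    start = POOL_HEADER_SIZE
    pool[start:start + len(user_data)] = user_data
    return pool


def next_header(pool):
    """Return the header bytes of the second (adjacent) chunk."""
    return bytes(pool[CHUNK_SIZE:CHUNK_SIZE + POOL_HEADER_SIZE])


if __name__ == "__main__":
    exact = unchecked_copy(make_pool(), b"A" * DATA_SIZE)
    spill = unchecked_copy(make_pool(), b"A" * (DATA_SIZE + 8))
    print(next_header(exact))  # adjacent header untouched
    print(next_header(spill))  # adjacent header replaced by our data
```

A buffer that exactly fills the data area leaves the neighbour's header alone, while 8 more bytes replace it completely, which is the corruption that takes the box down.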

There are a number of bugs we could trigger here, depending on the state of the pool and on which chunk we happen to overwrite (in this case we hit a double free). Either way we BSOD the box and we have our exploit primitive!

Pwn all the things!

Game Plan

I think it's worthwhile to briefly lay out a game plan. We will (1) get the non-paged pool in a predictable state, (2) trigger a controlled pool overflow, (3) take advantage of pool internals to set a shellcode callback and (4) free the corrupted pool chunk to get code execution!

I strongly recommend you read Tarjei's paper and review part 15 of this series. This will help explain in greater detail how our chunk allocation feng shui works :p!

Derandomize the Non-Paged Pool

In the previous post we sprayed the non-paged pool with IoCompletionReserve objects with a size of 0x60. Here, however, our target object has a size of 0x200, so we need to spray something of that size or an object whose size divides evenly into it. Fortunately, event objects have a size of 0x40, which multiplied by 8 nicely comes out at 0x200.
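The arithmetic behind that choice is easy to check. The snippet below is just a sanity check using the sizes quoted above; `objects_per_hole` is a hypothetical helper name.

```python
TARGET_CHUNK = 0x200                # size of the driver's chunk
EVENT_CHUNK = 0x40                  # size of one event object chunk
IO_COMPLETION_RESERVE_CHUNK = 0x60  # size used in the previous post


def objects_per_hole(target_size, object_size):
    """Return how many objects tile the target exactly, or None if they don't."""
    count, remainder = divmod(target_size, object_size)
    return count if remainder == 0 else None


if __name__ == "__main__":
    print(objects_per_hole(TARGET_CHUNK, EVENT_CHUNK))
    print(objects_per_hole(TARGET_CHUNK, IO_COMPLETION_RESERVE_CHUNK))
```

Event objects tile the 0x200 hole exactly, whereas 0x60 leaves a remainder, which is why the IoCompletionReserve spray from the previous post is not a good fit here.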

The following POC first allocates 10000 event objects to defragment the non-paged pool and then a further 5000 to get predictable allocations. Notice that we are dumping the last 10 object handles to stdout and then manually triggering a breakpoint in WinDBG.

Looking at one of the handles we dumped to stdout we can see nice sequential 0x40 byte allocations.

To get the pool in a desirable state, the only thing we need to do is free segments of 0x200 bytes (eight adjacent event objects each) from our second allocation. This will create holes for the driver's allocation to use. The POC below illustrates this.
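As a mental model of the hole punching, here is a small Python simulation. It is a sketch under assumptions: addresses are simulated offsets rather than real kernel addresses, the spray is assumed to be perfectly sequential, and the pattern of freeing 8 out of every 16 chunks is my illustrative choice, not something this section states.

```python
EVENT_CHUNK = 0x40
HOLE_SIZE = 0x200
EVENTS_PER_HOLE = HOLE_SIZE // EVENT_CHUNK  # 8 event chunks per hole


def spray(count, base=0):
    """Model a perfectly sequential spray: one simulated address per chunk."""
    return [base + i * EVENT_CHUNK for i in range(count)]


def punch_holes(addresses, stride=2 * EVENTS_PER_HOLE):
    """Free the first 8 chunks of every full group of 16.

    Returns (holes, kept): holes are (address, size) pairs, kept are the
    addresses of chunks that stay allocated.
    """
    holes, kept = [], []
    full = len(addresses) - len(addresses) % stride
    for i in range(0, full, stride):
        holes.append((addresses[i], HOLE_SIZE))
        kept.extend(addresses[i + EVENTS_PER_HOLE:i + stride])
    kept.extend(addresses[full:])  # incomplete tail group is left alone
    return holes, kept


if __name__ == "__main__":
    holes, kept = punch_holes(spray(5000))
    print(len(holes), len(kept))
    print(hex(holes[1][0]), hex(holes[1][0] + HOLE_SIZE) in map(hex, kept))
```

The property that matters is that every 0x200 hole is immediately followed by an event chunk that is still allocated, so whatever lands in the hole has a known neighbour.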

As mentioned before, we will be taking advantage of "pool internals" to get code execution. We already saw that messing up these structures invariably results in a BSOD so we would do well to get a better understanding of the layout of pool chunks.

Below we can see the full composition of one single event object and the various structures it is made up of!

First off, there is a WinDBG bug here. It does not really matter as far as illustrating the chunk structure goes, but it is annoying as hell! Can anyone see the issue here? Free cake if someone can tell me why (cake is a lie)! Anyway, we have three headers which we need to keep consistent (to a degree) when we later perform our overflow.
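For orientation, the chunk can be laid out as a list of (name, size) pairs. Be warned that the individual sizes below are my assumptions for a 32-bit Windows 7 event object and are not taken from this post; only the 0x40 total is. The position of the TypeIndex byte, which we turn to next, is likewise an assumed field offset.

```python
# Assumed sizes for a 32-bit Windows 7 event object chunk (not from the text).
LAYOUT = [
    ("POOL_HEADER", 0x08),
    ("OBJECT_HEADER_QUOTA_INFO", 0x10),
    ("OBJECT_HEADER", 0x18),
    ("KEVENT body", 0x10),
]
TYPE_INDEX_IN_OBJECT_HEADER = 0x0C  # assumed offset of TypeIndex in OBJECT_HEADER


def offsets(layout):
    """Return ({name: start offset}, total size) for a list of (name, size)."""
    starts, cursor = {}, 0
    for name, size in layout:
        starts[name] = cursor
        cursor += size
    return starts, cursor


if __name__ == "__main__":
    starts, total = offsets(LAYOUT)
    for name, start in starts.items():
        print(f"+0x{start:02x} {name}")
    print(f"total 0x{total:x}")
    print(f"TypeIndex at +0x{starts['OBJECT_HEADER'] + TYPE_INDEX_IN_OBJECT_HEADER:x}")
```

Under these assumptions the three headers occupy the first 0x30 bytes of the chunk, and everything an overflow writes across them has to stay plausible enough for the kernel not to fall over before the free.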

Notice the TypeIndex with a value of 0xC in the OBJECT_HEADER. This value is an index into an array of pointers (nt!ObTypeIndexTable) which describe the object type of the chunk. We can verify this as follows.

We can further enumerate the OBJECT_TYPE associated with our event object pointer. Also, notice that the first pointer in the array is null (0x00000000).

The important part here is the offset to the "OkayToCloseProcedure". When the handle to the object is released and the chunk is freed, the kernel checks this value. If it is not null, the kernel will jump to that address and execute whatever it finds there. As an aside, it is also possible to use other elements in this structure, such as the "DeleteProcedure".

The question is, how can we use this to our advantage? Remember that the pool chunk itself contains the TypeIndex value (0xC). If we overflow the chunk and change that value to 0x0, the kernel will pick up the first entry in the array, which is null, and will attempt to look for the OBJECT_TYPE structure on the process null page. As this is Windows 7, we can allocate the null page and create a fake "OkayToCloseProcedure" pointer to our shellcode. After freeing the corrupted chunk the kernel should execute our code!
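The redirection is easier to see in a toy model. The Python sketch below only imitates the lookup logic described above: the table contents, the "memory" dictionary, the addresses and the `okay_to_close_offset` value are all placeholders I made up, and the real structure offsets are deliberately not reproduced here.

```python
NULL = 0x00000000


def resolve_object_type(type_index_table, type_index):
    """Model the table lookup: TypeIndex selects an OBJECT_TYPE pointer."""
    return type_index_table[type_index]


def callback_on_close(memory, type_index_table, type_index, okay_to_close_offset):
    """Return the callback address the model would jump to, or None if null."""
    object_type_addr = resolve_object_type(type_index_table, type_index)
    pointer = memory.get(object_type_addr + okay_to_close_offset, NULL)
    return pointer or None


if __name__ == "__main__":
    OFFSET = 0x10            # placeholder, NOT the real structure offset
    FAKE_CALLBACK = 0x41414141  # placeholder for a user-controlled address

    # Entry 0 is null, the rest are made-up kernel addresses.
    table = [NULL] + [0x85000000 + i * 0x100 for i in range(1, 0x10)]
    # The only populated callback slot lives on the (modelled) null page.
    memory = {NULL + OFFSET: FAKE_CALLBACK}

    print(callback_on_close(memory, table, 0xC, OFFSET))  # intact header
    print(callback_on_close(memory, table, 0x0, OFFSET))  # corrupted header
```

With the original TypeIndex the lookup lands on a genuine type with no callback set, so nothing happens. With TypeIndex forced to 0 the lookup resolves to address 0, and whatever we placed at that offset on the null page is treated as the callback.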

Controlling EIP

Ok, we're almost home free! We have controlled pool allocation and we know that after our 0x200 byte object we will have a 0x40 byte event object. We can use the following buffer to precisely overwrite the three chunk headers we saw earlier.

Sw33t, pretty much game over at this point! Again, the observant reader will notice the same annoying WinDBG bug as earlier.

Shellcode

As in the previous posts, we can reuse our shellcode. However, there are two small tricks I leave to the diligent reader to figure out: one concerns the shellcode epilogue and the other the null page buffer layout.

Game Over

That's the whole run-down. For further details please refer to the full exploit below.