Reverse Engineering Firmware: Linksys WAG120N

The ability to analyze a firmware image and extract data from it is extremely useful. It can allow you to analyze an embedded device for bugs, vulnerabilities, or GPL violations without ever having access to the device.

In this tutorial, we’ll be examining the firmware update file for the Linksys WAG120N with the intent of finding and extracting the kernel and file system from the firmware image. The firmware image used is for the WAG120N hardware version 1.0, firmware version 1.00.16 (ETSI) Annex B, released on 08/16/2010 and is currently available for download from the Linksys Web site.

The first thing to do with a firmware image is to run the Linux file utility against it to make sure it isn’t a standard archive or compressed file. You don’t want to sit down and start analyzing a firmware image only to realize later that it’s just a ZIP file:

OK, it’s nothing known to the file utility. Next, let’s do a hex dump and run strings on it:

Taking a look at the strings output, we see references to the U-Boot boot loader and the Linux kernel. This is encouraging, as it suggests that this device does in fact run Linux, and U-Boot is a very common and well documented boot loader:
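If the strings utility is not at hand, roughly the same check can be sketched in Python. This is a minimal approximation, not a reimplementation of GNU strings: it only looks for runs of printable ASCII, and the function names and default keywords are illustrative choices rather than anything from the firmware or from binwalk.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def find_keywords(data: bytes, keywords=("U-Boot", "Linux")):
    """Return the extracted strings that mention any keyword (case-insensitive)."""
    hits = []
    for offset, text in extract_strings(data):
        lowered = text.lower()
        if any(keyword.lower() in lowered for keyword in keywords):
            hits.append((offset, text))
    return hits
```

Reading the firmware file in binary mode and passing its contents to find_keywords() gives the offset of each hit, which is handy later when deciding where to start carving.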

However, taking a quick look at the hexdump doesn’t immediately reveal anything interesting:

So let’s run binwalk against the firmware image to see what it can identify for us. There are a lot of false positive matches (these will be addressed in the upcoming 0.3.0 release!), but a few results stand out:

Binwalk has found two uImage headers (which is the header format used by U-Boot), each of which is immediately followed by an LZMA compressed file.

Binwalk breaks out most of the information contained in these uImage headers, including their descriptions: ‘u-boot image’ and ‘MIPS Linux-2.4.31’. It also shows the reported compression type of ‘lzma’. Since each uImage header is followed by LZMA compressed data, this information appears to be legitimate.
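To see what binwalk is reading here, the legacy uImage header can be decoded by hand. The sketch below assumes the usual 64-byte, big-endian U-Boot legacy header layout (magic 0x27051956, with the header CRC computed over the header with its own CRC field zeroed). The function name and the returned dictionary keys are made up for illustration.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
HEADER_FORMAT = ">7I4B32s"  # seven 32-bit words, four bytes, 32-byte name
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 64 bytes
COMPRESSION = {0: "none", 1: "gzip", 2: "bzip2", 3: "lzma"}


def parse_uimage_header(data: bytes, offset: int = 0):
    """Decode a legacy uImage header at offset, or return None if the magic is absent."""
    raw = data[offset:offset + HEADER_SIZE]
    if len(raw) < HEADER_SIZE:
        return None
    (magic, header_crc, timestamp, size, load, entry, data_crc,
     os_id, arch, image_type, comp, name) = struct.unpack(HEADER_FORMAT, raw)
    if magic != UIMAGE_MAGIC:
        return None
    # The stored header CRC is calculated with the CRC field itself set to zero.
    zeroed = raw[:4] + b"\x00\x00\x00\x00" + raw[8:]
    return {
        "name": name.split(b"\x00", 1)[0].decode("ascii", "replace"),
        "data_size": size,
        "load_address": load,
        "entry_point": entry,
        "compression": COMPRESSION.get(comp, "unknown (%d)" % comp),
        "header_crc_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == header_crc,
        "data_offset": offset + HEADER_SIZE,
    }
```

A header whose CRC checks out is very unlikely to be a false positive, and the load address and entry point it reports are the values a disassembler will ask for when loading the decompressed image.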

The LZMA files can be extracted with dd and then decompressed with the lzma utility. Don’t worry about specifying a size limit when running dd; any trailing garbage will be ignored by lzma during decompression:
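The same carve-and-decompress step can be sketched with Python's standard lzma module. This assumes the data is in the legacy .lzma ("LZMA alone") container, which is what the lzma utility handles. The helper names are hypothetical, and the offset you pass in comes from your own binwalk output.

```python
import lzma


def carve(image: bytes, offset: int, length=None) -> bytes:
    """Return image[offset:], or only `length` bytes of it, like dd with bs=1 skip=offset."""
    if length is None:
        return image[offset:]
    return image[offset:offset + length]


def decompress_lzma_alone(blob: bytes) -> bytes:
    """Decompress one legacy .lzma stream, ignoring any bytes that follow it."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    # Bytes after the end of the stream are left in decompressor.unused_data
    # rather than raising an error, so no size limit is needed when carving.
    return decompressor.decompress(blob)
```

For example, after reading the firmware file into a bytes object named image, decompress_lzma_alone(carve(image, 196672)) would attempt to unpack the LZMA data that starts 196672 bytes into the file. If the stream is rejected as corrupt, a different LZMA tool such as p7zip may be more forgiving.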

We are now left with the decompressed files ‘uboot’ and ‘kernel’. Running strings against them confirms that they are in fact the U-Boot and Linux kernel images:

We’ve got the kernel and the boot loader images; now all that’s left is finding and extracting the file system. Since binwalk didn’t find any file systems that looked legitimate, we’re going to have to do some digging of our own.

Let’s run strings against the extracted Linux kernel and grep the output for any file system references; this might give us a hint as to what file system(s) we should be looking for:

Ah! SquashFS is a very common embedded file system. Although binwalk has several SquashFS signatures, it is not uncommon to find variations of the ‘sqsh’ magic string (which indicates the beginning of a SquashFS image), so what we may be looking for here is a non-standard SquashFS signature inside the firmware file.

So how do we find an unknown signature inside a 4MB binary file?

Different sections inside of firmware images are often aligned to a certain size. This often means that there will have to be some padding between sections, as the size of each section will almost certainly not fall exactly on this alignment boundary.

An easy way to find these padded sections is to search for lines in our hexdump output that start with an asterisk (‘*’). When hexdump sees the same bytes repeated many times, it simply replaces those bytes with an asterisk to indicate that the last line was repeated many times. A good place to start looking for a file system inside a firmware image is immediately after these padded sections of data, as the start of the file system will likely need to fall on one of these aligned boundaries.
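The asterisk trick can also be automated. The sketch below is one possible approach, not how hexdump works internally: it walks the image in 16-byte lines, finds runs where the same line repeats, and reports where each run ends, since that is the offset worth inspecting. The function name and the min_lines threshold are arbitrary choices.

```python
def find_repeated_regions(data: bytes, line_size: int = 16, min_lines: int = 4):
    """Return (start, end) offsets of runs where one line_size-byte line repeats.

    A run must contain at least min_lines identical lines. `end` is the offset
    of the first line after the run, i.e. where new data begins. A trailing
    partial line at the end of the data is ignored.
    """
    regions = []
    total_lines = len(data) // line_size
    run_start = 0
    run_len = 1
    for i in range(1, total_lines + 1):
        same = (
            i < total_lines
            and data[i * line_size:(i + 1) * line_size]
            == data[(i - 1) * line_size:i * line_size]
        )
        if same:
            run_len += 1
            continue
        if run_len >= min_lines:
            regions.append((run_start * line_size, i * line_size))
        run_start = i
        run_len = 1
    return regions
```

Printing the first few bytes at each reported end offset gives a short list of candidate section starts to check by eye, and an end offset is exactly the value to pass to dd as skip when using a block size of 1.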

There are a couple of interesting sections that contain the string ‘sErCoMm’. This could be something, but given the small size of some of these sections and the fact that they don’t appear to have anything to do with SquashFS, it is unlikely:

There are some other sections as well, but again, these are very small, much too small to be a file system:

Then we come across this section, which has the string ‘sqlz’ :

The standard SquashFS image starts with ‘sqsh’, but we’ve already seen that the firmware developers have used LZMA compression elsewhere in this image. Also, most firmware that uses SquashFS tends to use LZMA compression instead of the standard zlib compression. So this signature could be a modified SquashFS signature that is a concatenation of ‘sq’ (SQuashfs) and ‘lz’ (LZma). Let’s extract it with dd and take a look:

Of course, ‘sqlz’ is not a standard signature, so the file utility still doesn’t recognize our extracted data. Let’s try editing the ‘sqlz’ string to read ‘sqsh’:
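Any hex editor will do for this edit. As a scripted alternative, here is a small sketch that swaps the first four bytes and refuses to touch anything that does not start with the expected magic. The function name is made up for illustration.

```python
def patch_magic(image: bytes, old: bytes = b"sqlz", new: bytes = b"sqsh") -> bytes:
    """Return a copy of image with its leading magic bytes replaced."""
    if len(old) != len(new):
        raise ValueError("replacement magic must be the same length as the original")
    if not image.startswith(old):
        raise ValueError("image does not start with %r" % old)
    return new + image[len(old):]
```

Writing the result to a new file rather than overwriting the extracted one keeps the original bytes around for comparison.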

This definitely looks like a valid SquashFS image! But due to the LZMA compression and the older SquashFS version (2.1), you won’t be able to extract any files from it using the standard SquashFS tools. However, using the unsquashfs-2.1 utility included in Jeremy Collake’s firmware mod kit works perfectly:

Now that we know this works, we should go ahead and add this new signature to binwalk so that it will identify the ‘sqlz’ magic string in the future. Adding this new signature is as easy as opening binwalk’s magic file (/etc/binwalk/magic), copy/pasting the ‘sqsh’ signature and changing the ‘sqsh’ to ‘sqlz’:

Re-running binwalk against the original firmware image, we see that it now correctly identifies the SquashFS entry:

And there you have it. We successfully identified and extracted the boot loader, kernel and file system from this firmware image, plus we have a new SquashFS signature to boot!

I’ve gotten up to the part where unsquashfs-lzma is used. I’ve downloaded the Firmware Modification Kit and compiled it with make. Then I took unsquashfs-lzma and put it in the same directory as my firmware file. Upon running it:
>./unsquashfs-lzma wag120.squashfs
>Reading a different endian SQUASHFS filesystem on wag120.squashfs
>zlib::uncompress failed, unknown error -3
>Bus error

Great info, congrats from me also!!
I tried to follow your tutorial on a Thomson 585v7 firmware. Unfortunately, binwalk can’t recognise anything in the .bin file. Is it because there is no signature for it in the magic file, or does it have something to do with the .bin file itself?
Keep up the good work, and please do post a tutorial on editing the firmware.

I truly appreciate the article, it will be bookmarked for future reference. Unlike Ralph Corderoy, I especially liked the screen shots, as they allow one to see exactly how the different commands are invoked and what their outputs are. Once again, thanks, kudos!

If there’s a checksum in the firmware you’ll have to determine where the checksum is and what checksum algorithm is used. Haven’t looked at the ZIP file you linked to yet, but if there’s GPL code available for the device then that’s usually the easiest way to figure it out. Also look to see if open source projects like OpenWRT or DD-WRT (since you didn’t mention what this device is I’m assuming it’s a router!) have done any work on it. I plan on covering firmware mods in more detail in a later tutorial.

@Joseph:

The unsquashfs error you’re getting looks like you are using the standard version of unsquashfs instead of the lzma version. Are you sure you’re running the correct unsquashfs binary from the firmware mod kit source?

@ Jim:

I haven’t looked at the Thomson routers before. If you aren’t getting any results from binwalk then it didn’t find any known signatures in the file. What kind of output do you get from running strings? If there are few readable strings then I would suspect the firmware (or part of it) is compressed. Try running binwalk with the -a option; this will use all signatures and will result in a lot of false positive matches but may help you find some gzip or other compressed data in the firmware.

@ Ralph/Axelle:

I started writing the tutorial by just copy/pasting the text, but felt that the screenshots made the commands and their output much easier to follow. I agree though that having to click back and forth is a pain – I’ll be sure to have future pictures viewable inline with the text!

@Rogan Dawes:

I did Google Sercomm and found that link too, but since it wasn’t related to the filesystem I was looking for and wasn’t the object of the tutorial I skipped over it. Good point though that I should have mentioned in the tutorial – always Google odd strings in the firmware!

I also tried your instructions on a Thomson 585v7 7.4.4.7 UK firmware (download link for the bin file: http://goo.gl/fvTcr). Strings can’t get anything useful (link: pastebin.com/ibXN4KQy) and binwalk with the -a switch seems to produce too much info (I’m guessing false positives, as you mentioned; link: pastebin.com/nCbvDuq2).
If you have any time to provide ideas or steps to get past this, I would gladly try to provide signatures for Thomson firmwares for binwalk.
Once again, thank you for your tutorial and time!

Gi0: I don’t see anything obvious in that firmware image, but given the size and the layout of the firmware update file, I’d be surprised if it’s Linux based.

Given the lack of nearly any strings, I’d say that aside from the header (the first few bytes look like they are “magic” bytes for the firmware header) the firmware image is probably compressed, but nothing that I recognize.

I’m actually having an odd issue. I tried to strip the latest firmware from linksys’ site like you did with dd with the exact same command. However when I try to decompress with lzma I get the error : lzma: uboot.lzma: Compressed data is corrupt. Currently, I’m using this version of lzma: xz (XZ Utils) 4.999.9beta
liblzma 4.999.9beta. What version are you using?

It’s the lzma utility in Ubuntu, so maybe it’s a different version from what you are using? I would expect any lzma decompression tool to handle the file though, it should just be standard lzma compression.

Are you sure you did the dd command correctly and got the ‘bs’ and ‘skip’ parameters correct? What does the file utility report the uboot.lzma file to be?

From http://tukaani.org/lzma/
‘LZMA Utils
LZMA Utils are legacy data compression software with high compression ratio. LZMA Utils are no longer developed, although critical bugs may be fixed as long as fixing them doesn’t require huge changes to the code.

Users of LZMA Utils should move to XZ Utils. XZ Utils support the legacy .lzma format used by LZMA Utils, and can also emulate the command line tools of LZMA Utils. This should make transition from LZMA Utils to XZ Utils relatively easy.’

I downloaded the old LZMA Utils source, compiled and ran that version of lzma and everything was happy.

There are quite a few Chinese wireless routers with 2MB ROM, 16MB RAM and Atheros chips. Their interfaces are all in Chinese. I was trying to figure out if I could install OpenWrt; when I found that was not possible, I wondered if I could translate them into English. Using binwalk and the other techniques mentioned here, I was unable to even find out what OS and file system it uses. Is it possible to extract the HTML files from this ROM image, modify them and repack them? The bin file is in this archive.

I am a bit late on the topic, but Linksys does provide source code for the previous 1.00.12 firmware on their website. Using that and a Linux machine to compile the source, I believe it may be easier to play around with this router. Plus, easier to enable all other incomplete/missing features, like IPv6, VPN, and the blessed telnet access to the device.

Good point, GPL source code is always helpful when reversing/playing with firmware. Unfortunately, source code isn’t always available or complete, so knowing how to take a firmware update apart is useful in those situations.

I really just picked out the WAG120N as an example after seeing someone else on the Web who had been trying unsuccessfully to take apart the firmware. 🙂

Just wanted to leave a note to say thank you! I’ve been having some problems with a YeaLink VoIP phone which runs Linux. Was able to use binwalk and gzip to extract the files required and was able to debug through the problem.

If the device runs Linux then it’s pretty easy: the /proc/mtd file will list all the mtd partitions and their names. One will likely be named ‘kernel’ or something similar. You can then read the kernel directly from /dev/mtdblockX (where X is the mtd number listed in /proc/mtd).

If it is running some other proprietary system, then who knows. But most of them provide at least basic mechanisms to view configuration data and read/write to memory/flash, so if you can figure out where the kernel is on flash or in RAM then you might be able to dump it through the serial port.
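For the Linux case, the /proc/mtd listing captured from the device's console can be turned into a partition table on your workstation. This is a rough sketch that assumes the usual /proc/mtd layout (a header line followed by lines of the form mtdN: size erasesize "name", with sizes in hex). The function name and dictionary keys are hypothetical.

```python
import re

MTD_LINE = re.compile(r'^mtd(\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"\s*$')


def parse_proc_mtd(text: str):
    """Parse the text of /proc/mtd into a list of partition descriptions."""
    partitions = []
    for line in text.splitlines():
        match = MTD_LINE.match(line.strip())
        if not match:
            continue  # skips the "dev: size erasesize name" header and blank lines
        index, size, erasesize, name = match.groups()
        partitions.append({
            "block_device": "/dev/mtdblock" + index,
            "size": int(size, 16),
            "erasesize": int(erasesize, 16),
            "name": name,
        })
    return partitions
```

Filtering the result for a name containing ‘kernel’ gives the block device to read and the number of bytes to expect.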

Hey, I’m attempting this with the Annex A version of the firmware. I get up to the step where you find the FS but searching the kernel for filesystem strings doesn’t return anything that I could identify as a FS (though my knowledge is fairly limited).

How can I extract the firmware of the Linksys WRT120N?
Link: http://homedownloads.cisco.com/downloads/firmware/FW_WRT120N_1.0.01.001.bin
The only thing I can do is extract the LZMA compressed PFS.IMG file (a file system similar to 3COM/SMC, not Linux based) located at:
DECIMAL    HEX        DESCRIPTION
-------------------------------------------------------
945152     0xE6C00    LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 1348080 bytes
The binwalk output gives a lot of false positives.

Hello,
First of all, great tutorial!
I have an assignment on router reverse engineering for my master’s. Is there a way to emulate a router like the Linksys WAG120N? I don’t have the router and I would like to test some of these great tips. Searching Google I found QEMU and VirtualBox, but I can’t find a tutorial on how to virtualize a router.
Thank you in advance!

I was just wondering if there is an easy way to convert the ASCII output from a firmware dump into a regular data/bin file? I’m trying to dump the firmware from an embedded Linux device using the “dump” command in the RedBoot bootloader via a serial connection.

Thank you for this post, really interesting stuff. I have a silly question, could you please explain how you calculated those arguments that you gave to skip(for dd)? What skip does, by the way, is to start reading from X sector on the file, thus, you use this value to jump to a specific byte on the firmware binary, correct?

So to sum up: How do you calculate from which block to which block you need to extract something out of the firmware binary?

Don’t be thrown off by the terminology “block”. This has nothing to do with hard drive blocks or sectors or anything like that; this is just setting the “block” size used internally by the dd software.

For example, when dd’ing the LZMA compressed kernel out of the firmware image, I specified a block size of 1 byte and told dd to skip 196672 blocks, because that LZMA file was located 196672 bytes into the firmware image. I could just as well have specified a block size of 196672 and told dd to skip 1 block (which is actually more efficient and faster, but perhaps a bit less intuitive).
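The arithmetic is simply that dd starts reading at byte bs × skip. Here is a tiny sketch that mimics this on an in-memory image. The function is illustrative only and is not part of dd or binwalk.

```python
def dd_read(data: bytes, bs: int, skip: int, count=None) -> bytes:
    """Mimic `dd bs=<bs> skip=<skip> [count=<count>]` on an in-memory image."""
    start = bs * skip  # dd skips `skip` blocks of `bs` bytes each
    if count is None:
        return data[start:]
    return data[start:start + bs * count]
```

With this, dd_read(image, 1, 196672) and dd_read(image, 196672, 1) return the same bytes, because both start at offset 196672. The offset itself is whatever binwalk or the hexdump reported for the start of the data you want.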

Fantastic read. That’s one for the favorites. I don’t want to miss a tutorial by this genius.
I hope with all my heart that you write more about discovering exploits for embedded systems.
I have a question: how do I reverse engineer the firmware that runs inside a router such as a TP-Link router?
I have repeatedly tried to do that with IDA Pro (6.1 and 6.4), but the program did not accept the file and did not recognize it at all.
Is there another way to do it? Is there a loader for these special formats, or something else I should be using?

Excellent guide! I was able to extract the CFE bootloader, Linux kernel and SquashFS filesystem from a D-Link 2750u Russian router firmware image.
Both lzma and unsquashfs gave me errors, but 7z was able to extract both.
7z also has a plugin system, so if the corresponding libraries are present on the system then 7z can extract a very wide variety of images. Anyone can try 7z; it makes the detection steps easy.

Now I’m trying to figure out how to discard the bootloader and create a firmware flash image using only the kernel and filesystem. I don’t want to replace the bootloader in the device 🙂

Hi!
Do you know where I can find information about “capturing” a firmware update between the computer and the device?
I have a watch that I have to update using the company software installed on my computer. This software gets the update from the internet and sends it to the watch over a USB cable.

I have tried to use Wireshark to see the communications between the company software and the web, but it looks like the file is transferred over an SSL connection, so I don’t think I can do anything there.

I was wondering if it is possible to capture the update from the USB connection while the update takes place, or if there is software or a terminal command to check where the company software places the update on my computer …

Try using the p7zip utility to extract the LZMA file. I’ve found the ‘lzma’ Linux utility is very picky about which LZMA files it works with. Also, the integrated LZMA extractor built into binwalk should work fine too (just run binwalk with the -e option).

Hey Craig, thanks a lot 🙂 … It worked, but now I encounter another error: when I try to load the .bin file in IDA Pro it prompts the error message “ida failed to load the start address automatically”. Any solution? Please help me.

I’ve been trying to make sense of the extracted images, i.e. the boot loader and the kernel (essentially disassembling both and analysing their functions and data structures). As radare2 doesn’t recognize either of these files (i.e. they are not ELF or any other known well-formed binary), it’s spitting out random assembly when disassembling. I’d presume you’d need the START/LOAD address or an entry point to begin. Do you have any ideas on performing static analysis on “uboot + kernel”?

When you say “So this signature could be a modified SquashFS signature that is a concatenation of ‘sq’ (SQuashfs) and ‘lz’ (LZma). Let’s extract it with dd and take a look:” and you use skip=851968 in the dd command, where did you come up with the value for the skip? I’m working on another bin file and am curious how to identify this value.

Awesome tutorial! Everything went well up until I went to use FMK’s squashfs-2.1-r2/unsquashfs, where I get a segmentation fault. However, I noticed that with your FMK it’s unsquashfs2.1-r2-lzma wag120.squashfs. I am using FMK version 0.99.