2008/01/19

In preparation for my course on Data Communications Forensics and Security this term, I've decided to do a quick survey of disassemblers. The problem is that I've been writing my own disassemblers for special purposes, but I need something more general-purpose for the students. Also, the code I wrote stayed with IBM when I left. Here is a quick survey of what I've found so far, in no particular order.

Let's start with the reigning king of disassembly, IDA Pro. It is more a disassembly framework than just a disassembler. As of January 2008 it has moved from the old DataRescue website (and presumably distributor) to the new Hex-Rays site. It's up to version 5.2, and there are quite a few plug-ins for it, which is clearly the strength of IDA Pro. Unfortunately, they want serious money for it and the University isn't interested in paying. I'm also a bit concerned about the move to Hex-Rays. What does it mean? Will it survive? I'd also like something that comes with source code.

Sourcer doesn't seem to exist any more. V-Communication's website doesn't seem to list it. Sourcer used to be my favorite disassembler before I got into writing my own.

Apparently ASMGen is still around, but I think it is stuck in the 16-bit MS-DOS world. It was basic back then and must be an antique now. I'll give it a spin and see.

Jean-Louis SEIGNE's disasm32 is apparently a VxD disassembler, according to his own website, and it is available via WinSite. (Another website seems to indicate it is a visual disassembler; I'll find out when I get the chance to run it.) It seems to be at least 12 years old, so I don't think it will be that interesting.

I can't find WDASM, so it is probably dead.

Obj2asm is an MS-DOS object file disassembler and is available on Simtel.

The New Jersey Machine-Code Toolkit also seems to have been discontinued back in 1998. It's written in SML and utilizes a machine model for disassembly (amongst other things) which should give it a lot of flexibility. However, it doesn't help if the project has been abandoned.

GNU offers a lineup of surprisingly useful tools in its binutils package. Quoting: "nm - Lists symbols from object files. objdump - Displays information from object files. readelf - Displays information from any ELF format object file. strings - Lists printable strings from files." They are meant for UNIX and so are not that useful for Windows. In theory they should be able to handle PE files, but in that role they are not robust or endian-agnostic.
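To give the students a feel for what the strings tool is doing, here is a minimal sketch in Python. It is only an approximation of the idea (runs of printable ASCII of at least four characters), not the binutils implementation, and the function name printable_strings is my own invention for illustration.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long.

    Assumption: "printable" means bytes 0x20 through 0x7e. The real
    strings tool has more options (encodings, offsets, section
    selection) that this sketch ignores.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    sample = b"\x00\x01hello\x00ab\x00world!\xff"
    for s in printable_strings(sample):
        print(s)
```

Runs shorter than the minimum length (such as "ab" in the sample) are dropped, which is what keeps the output from drowning in noise.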

OllyDbg is not a disassembler, but a debugger. However, quite a few people use it for program analysis, either to aid the disassembly or to produce the disassembly. It's free and one of the best.

Although not actually a disassembler, REC attempts to decompile from binary to source. According to the documentation, it uses the Netwide Disassembler for preprocessing.

The Netwide Disassembler is a part of the Netwide Assembler project. It doesn't actually understand the various binary file formats itself, so you have to give it the naked binary code. I've used this and objdump in my projects. It is far more useful than it sounds. Consider that a certain amount of malware can only be snagged from memory.
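Since a raw disassembler wants naked code, something has to peel the file format off first. Below is a rough sketch in Python of doing that for a PE file: it follows the pointer at offset 0x3C to the PE signature, walks the section table, and returns the raw bytes of one section. The function name extract_section and the default section name ".text" are my own assumptions for illustration; real-world (and especially malicious) files can have odd or missing section names, so treat this as a starting point only.

```python
import struct


def extract_section(image: bytes, wanted: bytes = b".text") -> bytes:
    """Return the raw bytes of the named section of a PE image.

    Hypothetical helper for illustration; it does no validation
    beyond the two signatures and trusts the header fields.
    """
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte little-endian field at 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    # COFF header (20 bytes): Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics.
    _machine, n_sections, _ts, _symptr, _nsyms, opt_size, _chars = (
        struct.unpack_from("<HHIIIHH", image, coff)
    )
    # The section table follows the optional header; entries are 40 bytes.
    table = coff + 20 + opt_size
    for i in range(n_sections):
        entry = table + 40 * i
        name = image[entry:entry + 8].rstrip(b"\0")
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20.
        raw_size, raw_ptr = struct.unpack_from("<II", image, entry + 16)
        if name == wanted:
            return image[raw_ptr:raw_ptr + raw_size]
    raise KeyError(wanted)
```

The returned bytes can be written to a file and handed to the raw disassembler (for 32-bit code, something like ndisasm -b 32 on that file). For code snagged from memory there is often no header left at all, in which case you skip this step and feed the dump in directly.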

Boomerang is another decompiler (versus a disassembler). It was active until 2006, so I'll have to see where it stands.

The diStorm project looks very interesting to me in that they want to create a really good library for disassembly, not just a disassembler. This will not be for the casual disassembler. The core library is written in C (source is available, I think) and it interfaces with Python, which wouldn't have been my choice. They have also separated the opcode libraries from the code (again, according to the documentation), which makes it easier to repurpose the code, though I always wonder how much real mileage you get from that.

The open directory project lists a few more here.

Another mention is Wotsit. This site has been very useful over the years in figuring out various file formats (I'm a file format hacker at heart, but long inactive). You need this site to figure out the various binary file formats.

So, the next step in this exercise is to evaluate the best candidates and see how well they will do in practice. That will be in some later post.