Saturday, November 15. 2008

Here's something amusing. I spent the first half of the day writing a short Haskell program which generates x86 instructions in MASM syntax. The program generates all variants of the non-privileged instructions from the opcodes.chm file of the MASM32 package. This means that the instruction generator is not complete at all. FPU, MMX, SSE and other newer-than-x486 instructions are not covered. Nevertheless the generator already generates nearly 150,000 different x86 instructions.

When assembled with MASM32 the resulting file is more than 600 KB big. Trying to disassemble this thing with a few standard disassemblers turns out to be a problem. IDA fails to disassemble an instruction after maybe 5% of the executable and never manages to recover afterwards. Lots of manual help is necessary to convince IDA to go on. OllyDBG manages to disassemble that instruction but has huge gaps at many, many other points of the disassembly. The created file is an interesting test file for x86 disassemblers I'd say.

The Haskell program is just about 300 lines long. 280 of those lines are the definitions of the instructions and what operands they can take. The generation of the instructions from the instruction definitions is just 20 lines and all but 8 lines are not even strictly necessary. I love Haskell's expressiveness.

Anyway, click here to see the Haskell source or click here to download the whole package including the Haskell program (source + EXE), the generated output of the Haskell program, a MASM32 source file that can be used to assemble the test file, and the test file EXE itself.