Coolest hack/trick ever!

February 17, 2011

Some time ago, I wrote about lguest, a minimal x86 hypervisor for the Linux Kernel, which is mainly used for experimentation, and learning stuff about hypervisors, operating systems, even computer architecture/ISA(x86 in particular).

Today I cloned the git repo for the lguest64 port, and I started browsing through the documentation and the code. In the launcher code(the program that initializes/sets up and launches a new guest kernel), I saw the coolest programming hack/trick I’ve seen in a long time. :P

/*L:170 Prepare to be SHOCKED and AMAZED. And possibly a trifle nauseated.
*
* We know that CONFIG_PAGE_OFFSET sets what virtual address the kernel expects
* to be. We don't know what that option was, but we can figure it out
* approximately by looking at the addresses in the code. I chose the common
* case of reading a memory location into the %eax register:
*
* movl <some-address>, %eax
*
* This gets encoded as five bytes: "0xA1 <4-byte-address>". For example,
* "0xA1 0x18 0x60 0x47 0xC0" reads the address 0xC0476018 into %eax.
*
* In this example can guess that the kernel was compiled with
* CONFIG_PAGE_OFFSET set to 0xC0000000 (it's always a round number). If the
* kernel were larger than 16MB, we might see 0xC1 addresses show up, but our
* kernel isn't that bloated yet.
*
* Unfortunately, x86 has variable-length instructions, so finding this
* particular instruction properly involves writing a disassembler. Instead,
* we rely on statistics. We look for "0xA1" and tally the different bytes
* which occur 4 bytes later (the "0xC0" in our example above). When one of
* those bytes appears three times, we can be reasonably confident that it
* forms the start of CONFIG_PAGE_OFFSET.
*
* This is amazingly reliable. */
static unsigned long intuit_page_offset(unsigned char *img, unsigned long len)
{
unsigned int i, possibilities[256] = { 0 };
for (i = 0; i + 4 < len; i++) {
/* mov 0xXXXXXXXX,%eax */
if (img[i] == 0xA1 && ++possibilities[img[i+4]] > 3)
return (unsigned long)img[i+4] << 24;
}
errx(1, "could not determine page offset");
}

It’s very well commented, and so I don’t think there’s something I could explain any better.
Very very nice trick!
‘Prepare to be shocked and amazed!’ ;)