rm@blog ~ $ rm -rf / ; Robert Mustacchi's Musings on Technology

Figuring out where you’re longjmp(3c)ing with DTrace

Last Monday was the illumos hack-a-thon. There, I worked with Matt Amdur on adding tab completion support to mdb — the illumos modular debugger. The hack-a-thon was wildly successful and a lot of fun, I hope to put together an entry on the hack-a-thon and give an overview of the projects that were worked on over the course of the next few days. During the hack-a-thon, Matt and I created a working prototype that would complete the types and members using ::print, but there was still some good work for us to do. One of the challenges that we were facing was some unexpected behavior whenever the mdb pager was invoked. We were seeing different actions depending on which actions you took from the pager: continue, quit, or abort.

If you take a look at the source code, you’ll see that sometimes we’ll leave this function by calling longjmp(3c). There’s a lot of different places that we call setjmp(3c) and sigsetjmp(3c) in mdb, so tracking down where we were going just based on looking at the source code would be tricky. So, we want to answer the question, where are we jumping to? There are a few different ways we can do this (and more that aren’t listed):

Inspect the source code

Use mdb to debug mdb

Use the DTrace pid provider and trace a certain number of instructions before we assume we’ve gotten there

Use the DTrace pid provider and look at the jmp_buf to get the address of where we were jumping

Ultimately, I decided to go with option four, knowing that I would have to solve this problem again at some point in the future. The first step is to look at the definition of the jmp_buf definition. For the full definition take a look at setjmp_iso.h. Here’s the snippet that actually defines the type:

Basically, the jmp_buf is just an array where we store some of registers. Unfortunately this isn’t sufficient to figure out where to go. So instead, we need to take a look at the implementation. setjmp is implemented in assembly for the particular architecture. Here it is for x86 and amd64. Now that we have the implementation, let’s figure out what to do. As a heads up, if you’re looking at any of these .s files, the numbers are actually in base 10, which is different from what you get when you look at the mdb output which has them in hex. Let’s take a quick look at the longjmp source for a 32-bit system and dig into what’s going on and how we know what to do:

The function is pretty well commented, so we can follow along pretty easily. Basically we load the jmp_buf that was passed in into %edx, add 0×14 to that value and then jump to that piece of code. So now we know exactly what the address we’re returning to is. With this in hand, we only have two tasks left: transforming this address into a function and offset, and doing this all with a simple DTrace script. Solving the first problem is actually pretty easy. We can just use the DTrace uaddr function which will translate it into an address and offset for us. The script itself is now an exercise in copyin and arithmetic. Here’s the main part of the script:

/*
* Given a sigbuf translate that into where the longjmp is taking us.
*
* On i386 the address is 0x14 into the jmp_buf.
* On amd64 the address is 0x38 into the jmp_buf.
*/
pid$1::longjmp:entry
{
uaddr(curpsinfo->pr_dmodel == PR_MODEL_ILP32 ?
*(uint32_t *)copyin(arg0 + 0x14, sizeof (uint32_t)) :
*(uint64_t *)copyin(arg0 + 0x38, sizeof (uint64_t)));
}

Now we know exactly where we ended up after the longjmp(), and this will method will work on both 32-bit and 64-bit x86 systems. If you’d like to download the script, you can just download it from here.