I am working on fixing the savevm/loadvm functionality in the Android
emulator, and the two issues I've encountered so far both appear to
stem from the asynchronous I/O (AIO) code. In both cases, the emulator
busy-waits indefinitely for an operation that never signals completion.

Unfortunately I am not really familiar with AIO, so I was hoping one
of the emulator devs could point me to some resources (design docs,
general introduction, etc.). I've done some searching myself and found
some docs for the Linux kernel AIO implementation
(http://lse.sourceforge.net/io/aio.html), but I'm not sure to what
extent it applies to the QEMU code.

Tips for debugging AIO would also be greatly appreciated. I can trace
the execution until I am within the (emulated) device driver (i.e.
block/qcow2.c:qcow_aio_writev()), but haven't been able to pinpoint
the exact location where the actual async call is made. This makes it
difficult to identify the code that should signal completion back to
the main process (and apparently fails to do so). I know this code is
called though, because some asynchronous calls *do* signal completion.

TCG translates guest code into small sequences of host code (basic
blocks). These basic blocks can be chained together such that one block
directly jumps to the next block. The effect is that a guest can run a
tight loop in which guest code runs continuously without giving QEMU a
chance to do any work.
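
To make the chaining point concrete, here is a toy model in C. It is
not QEMU code: ToyBlock, chained_next, lookup_next and
toy_dispatch_count are made-up names, and "executing" a block is just
a counter increment. The sketch only counts how often control comes
back to the dispatcher, which is the only place where other work
could be done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a translated basic block. */
typedef struct ToyBlock {
    struct ToyBlock *chained_next; /* direct jump target, NULL if unchained */
    struct ToyBlock *lookup_next;  /* successor the dispatcher would look up */
} ToyBlock;

/*
 * "Executes" at most max_blocks blocks starting at start and returns how
 * many times control came back to the dispatcher loop.
 */
int toy_dispatch_count(ToyBlock *start, int max_blocks)
{
    int returns_to_dispatcher = 0;
    int executed = 0;
    ToyBlock *tb = start;

    assert(max_blocks > 0);

    while (tb != NULL && executed < max_blocks) {
        executed++; /* run the block the dispatcher selected */

        /* Chained blocks jump straight to each other, bypassing the dispatcher. */
        while (tb->chained_next != NULL && executed < max_blocks) {
            tb = tb->chained_next;
            executed++;
        }

        /* Only here would pending timers or I/O completions get handled. */
        returns_to_dispatcher++;
        tb = tb->lookup_next;
    }
    return returns_to_dispatcher;
}
```

With two blocks that form a loop, the unchained version comes back to
the dispatcher after every block, while the chained version stays
inside the inner loop until the block budget runs out. A real guest
loop has no such budget.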

To allow QEMU to make forward progress in such a scenario, we program
signals to fire. Currently, the signals fire in a number of
circumstances including when AIO operations complete, or when a periodic
timer needs to fire.

When dealing with multiple threads, it's very easy to screw things up by
not masking signals properly. Often, this is hidden because the
periodic timer runs often enough that it doesn't matter if you miss a
signal. An exception, however, would be emulation of synchronous code.
This tends to happen in qcow2 metadata operations since they are still
synchronous. To complete this emulation, we have to block the current
thread until the I/O operation completes. But since QEMU isn't
re-entrant, we can't run the full main loop as that could trigger
re-entrancy in qcow2. To work around this, we implement "idle bottom
halves", which are special bottom halves that are run by the normal I/O
loop but also by a special I/O loop used exclusively for emulating
synchronous writes.

To further complicate matters, non-x86 platforms (like ARM) are less
likely to use a periodic timer, which makes these bugs much more obvious.

I realize that the Android emulator is a rather heavy fork of QEMU, so
giving specific advice will probably be difficult. However, the
overall approach is still the same, so I hope you can help me get a
better understanding of that.

This is the problem with forking. This is very hairy code that requires
careful attention to detail. If you're introducing any type of
threading, disk emulation, or changes to the block subsystem, chances
are you've done it wrong.