In short, it is illegal to call __pa() on an address holding a percpu variable. The times when this actually matters are pretty obscure (certain 32-bit NUMA systems), but it _does_ happen. It is important to keep KVM guests working on these systems because the real hardware is getting harder and harder to find.

This bug first manifested as a plain hang at boot after this message:

CPU 0 irqstacks, hard=f3018000 soft=f301a000

or, sometimes, it would actually make it out to the console:

[ 0.000000] BUG: unable to handle kernel paging request at ffffffff

I eventually traced it down to the KVM async pagefault code. This can be worked around by disabling that code either at compile-time, or on the kernel command-line.

The KVM async pagefault code was injecting page faults into the guest, which the guest misinterpreted because the "reason" for each fault was not being properly sent from the host.

The guest passes the physical address of a per-cpu async pagefault structure to the host via an MSR. Since __pa() is broken on percpu data, the physical address it sent was basically bogus and the host went scribbling on random data. The guest never saw the real reason for the page fault (it was injected by the host), assumed that the kernel had taken a _real_ page fault, and panic()'d.
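
To illustrate why __pa() goes wrong here, below is a minimal user-space sketch. It is a toy model, not kernel code: every name and constant in it (toy_pa(), toy_walk(), TOY_PAGE_OFFSET, the addresses, the one-entry table) is made up for illustration. It assumes that a __pa()-style macro is plain arithmetic against the start of the linear map, and that the percpu area in question is mapped somewhere outside the linear map, so that only a page-table lookup yields the right physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all names and constants below are illustrative. */
#define TOY_PAGE_OFFSET 0xC0000000u   /* assumed start of the linear map */
#define TOY_PAGE_SHIFT  12
#define TOY_PAGE_MASK   ((1u << TOY_PAGE_SHIFT) - 1)

/* One entry of a toy "page table" for mappings outside the linear map. */
struct toy_mapping {
    uint32_t vpage;   /* virtual page number */
    uint32_t ppage;   /* physical page number backing it */
};

/* Pretend the percpu area lives at a non-linear virtual address. */
static const struct toy_mapping toy_table[] = {
    { 0xF8001000u >> TOY_PAGE_SHIFT, 0x12345000u >> TOY_PAGE_SHIFT },
};

/* What a __pa()-style macro does: pure arithmetic, no lookup. */
uint32_t toy_pa(uint32_t vaddr)
{
    return vaddr - TOY_PAGE_OFFSET;
}

/*
 * What a page-table walk does: find the backing page and keep the
 * offset within the page. Returns 0 on success, -1 if unmapped.
 */
int toy_walk(uint32_t vaddr, uint32_t *paddr)
{
    size_t i;

    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
        if (toy_table[i].vpage == (vaddr >> TOY_PAGE_SHIFT)) {
            *paddr = (toy_table[i].ppage << TOY_PAGE_SHIFT) |
                     (vaddr & TOY_PAGE_MASK);
            return 0;
        }
    }
    return -1;
}

/* Usage example: the two translations disagree for the "percpu" address. */
void toy_selfcheck(void)
{
    uint32_t real = 0;

    assert(toy_walk(0xF8001040u, &real) == 0);
    assert(real == 0x12345040u);           /* page that really backs it */
    assert(toy_pa(0xF8001040u) != real);   /* arithmetic result is bogus */
}
```

In this model the arithmetic translation points at memory that has nothing to do with the structure, which is the same kind of bogus address that the guest handed to the host.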