Each core has a block that it flushes to the global queue on command.
However, that block is per-cpu state that is also accessed from perfmon
IRQs. Thus any unsafe access, such as writing the block to the global
queue, must be protected from concurrent access. Since the other
accessor is an IRQ handler on the same core, that means disabling IRQs.

The bug showed up as a queue with an incorrect length: the sum of the
BLENs of its blocks did not add up to qlen (q->dlen). The last block on
the list was appended to *while* it was on the queue. This happened when
an IRQ fired right after we wrote the block to the queue, but before we
cleared cpu_buf->block.
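
The window can be reproduced in a small userspace model. This is only a
sketch under stated assumptions, not the kernel code: the structs, the
block pool, the irqs_off/irq_pending flags, and the function names are
all hypothetical stand-ins, and the "IRQ" is an ordinary function call
placed exactly in the window described above. The real fix would use the
kernel's own IRQ save/restore primitives around the flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 4

/* Hypothetical stand-ins for the real structures. */
struct block {
	size_t len;			/* plays the role of BLEN(b) */
};

struct queue {
	struct block *blocks[MAX_BLOCKS];
	size_t nr_blocks;
	size_t dlen;			/* should equal the sum of the lens */
};

struct cpu_buf {
	struct block *block;		/* current per-cpu block, or NULL */
};

static struct block pool[MAX_BLOCKS];
static size_t pool_used;
static bool irqs_off;			/* models "IRQs disabled" */
static bool irq_pending;		/* IRQ latched while disabled */

/* Simulated perfmon IRQ handler: appends one byte to the per-cpu block,
 * grabbing a fresh block first if there is none. */
void perfmon_irq(struct cpu_buf *cb)
{
	if (!cb->block && pool_used < MAX_BLOCKS)
		cb->block = &pool[pool_used++];
	if (cb->block)
		cb->block->len++;
}

/* Deliver the IRQ now, or latch it if IRQs are "disabled". */
void raise_irq(struct cpu_buf *cb)
{
	if (irqs_off)
		irq_pending = true;
	else
		perfmon_irq(cb);
}

/* Flush the per-cpu block to the queue.  An IRQ is raised in the window
 * between enqueueing the block and clearing cb->block. */
void flush(struct queue *q, struct cpu_buf *cb, bool protect)
{
	if (!cb->block)
		return;
	if (protect)
		irqs_off = true;
	assert(q->nr_blocks < MAX_BLOCKS);
	q->blocks[q->nr_blocks++] = cb->block;
	q->dlen += cb->block->len;
	raise_irq(cb);			/* the racy window */
	cb->block = NULL;
	if (protect) {
		irqs_off = false;
		if (irq_pending) {	/* latched IRQ runs after restore */
			irq_pending = false;
			perfmon_irq(cb);
		}
	}
}

/* Sum of the lens of all blocks on the queue. */
size_t sum_blens(const struct queue *q)
{
	size_t sum = 0;

	for (size_t i = 0; i < q->nr_blocks; i++)
		sum += q->blocks[i]->len;
	return sum;
}
```

With protect == false, the IRQ appends to a block that is already on the
queue, so sum_blens() exceeds q->dlen. With protect == true, the IRQ is
deferred until after cpu_buf->block is cleared, lands in a fresh block,
and the queue stays consistent.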