I always thought next_cpu was used here to cover an explicit race condition.

If you're using TCG, and you get a signal after running the loop, but
before assigning cpu_single_env, then you'll set the interrupt exit
request on the old CPU state. You'll eventually exit, I guess, but you
potentially have to run through multiple VCPUs first.
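
To make the window concrete, here is a toy model of it. This is not QEMU
code: the stripped-down CPUState, the cpus[] array, deliver_signal() and
slices_until_exit() are all made up for illustration, and the "signal" is
just a function call placed on either side of the cpu_single_env
assignment. It only counts how many VCPU slices start before the exit
request is noticed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8

/* Hypothetical, stripped-down stand-in for a VCPU's state. */
typedef struct CPUState {
    int exit_request;
} CPUState;

static CPUState cpus[MAX_CPUS];
static CPUState *cpu_single_env;    /* the CPU a signal handler would flag */

/* What the signal handler is assumed to do: flag the "current" CPU. */
static void deliver_signal(void)
{
    if (cpu_single_env != NULL) {
        cpu_single_env->exit_request = 1;
    }
}

/*
 * Model: CPU 0 has just finished its slice and the next CPU in round-robin
 * order is about to run. Returns the number of VCPU slices that start
 * before some CPU sees its exit request.
 */
int slices_until_exit(int n_cpus, int signal_before_assign)
{
    int i, next, slices = 0;

    assert(n_cpus >= 1 && n_cpus <= MAX_CPUS);
    for (i = 0; i < n_cpus; i++) {
        cpus[i].exit_request = 0;
    }
    next = 1 % n_cpus;
    cpu_single_env = &cpus[0];      /* still points at the old CPU */

    if (signal_before_assign) {
        deliver_signal();           /* race window: flags the old CPU */
        cpu_single_env = &cpus[next];
    } else {
        cpu_single_env = &cpus[next];
        deliver_signal();           /* flags the CPU that is about to run */
    }

    for (;;) {
        CPUState *cpu = &cpus[next];

        cpu_single_env = cpu;
        slices++;
        if (cpu->exit_request) {    /* request finally noticed */
            cpu->exit_request = 0;
            return slices;
        }
        next = (next + 1) % n_cpus;
    }
}
```

With three VCPUs in this model, slices_until_exit(3, 1) walks through the
other two VCPUs before the flagged one comes around again, while
slices_until_exit(3, 0) stops on the first slice.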

I'd feel more comfortable if we preserved the behavior here that we had
before.

Right. Patch 7 reverts to the old behaviour.

We need to do this series in a way such that we aren't breaking things
as we go along.