[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

From: Stefan Hajnoczi
Subject: [Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify
Date: Mon, 25 Oct 2010 14:25:05 +0100
User-agent: Mutt/1.5.20 (2009-06-14)

On Tue, Oct 19, 2010 at 03:33:41PM +0200, Michael S. Tsirkin wrote:
> My main concern is with the fact that we add more state
> in notifiers that can easily get out of sync with users.
> If we absolutely need this state, let's try to at least
> document the state machine, and make the API
> for state transitions more transparent.
I'll try to describe how it works. If you're happy with the design in
principle then I can rework the code. Otherwise we can think about a
different design.
The goal is to use ioeventfd instead of the synchronous pio emulation
path that userspace virtqueues use today. Both virtio-blk and
virtio-net perform better with this approach because the vcpu is not
blocked from executing guest code while the I/O operation is
initiated.
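To make the decoupling concrete, here is a minimal sketch built on plain
Linux eventfd(2), which is the primitive that ioeventfd signals. The
function names (notifier_create, notifier_kick, notifier_drain) are made
up for illustration and are not QEMU's API. In the real design the
kernel signals the eventfd when the guest writes the notify register;
the sketch signals it by hand instead.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd standing in for a virtqueue's host
 * notifier. Returns the fd, or -1 on error. (Hypothetical helper.) */
int notifier_create(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* "Guest kick": add 1 to the eventfd counter and return immediately.
 * Whoever calls this does not wait for the I/O to be started.
 * Returns 0 on success, -1 on error. */
int notifier_kick(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* I/O side: fetch and clear the counter. Returns the number of kicks
 * seen since the last drain, 0 if there were none, -1 on error. */
int64_t notifier_drain(int fd)
{
    uint64_t count;
    ssize_t n = read(fd, &count, sizeof(count));

    if (n == (ssize_t)sizeof(count)) {
        return (int64_t)count;
    }
    if (n < 0 && errno == EAGAIN) {
        return 0; /* counter was zero, nothing pending */
    }
    return -1;
}
```

Note that several kicks issued before the handler runs are coalesced
into a single wakeup, since the eventfd is just a counter.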
We want to automatically create an event notifier and set up ioeventfd
for each initialized virtqueue.
Vhost already uses ioeventfd so it is important not to interfere with
devices that have enabled vhost. If vhost is enabled, then the device's
virtqueues are off-limits and should not be tampered with.
Furthermore, older kernels allow only 6 ioeventfds per guest. On such
systems it is risky to automatically use ioeventfd for userspace
virtqueues, since that could take a precious ioeventfd away from another
virtio device using vhost. Existing guest configurations would break, so
it is simplest to avoid using ioeventfd for userspace virtqueues on such
hosts.
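The three constraints above (only initialized virtqueues, hands off
vhost, nothing at all on limited hosts) boil down to a small selection
rule. A rough sketch follows; struct vq_info and
select_ioeventfd_queues() are hypothetical names for this email only and
do not exist in hw/virtio.c, and how the limit is detected is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of a virtqueue for this sketch. */
struct vq_info {
    unsigned int num;   /* ring size; 0 means the queue is not initialized */
    bool vhost_owned;   /* vhost has enabled the host notifier */
    bool use_ioeventfd; /* output: set up a userspace ioeventfd */
};

/* Mark every queue that should get a userspace ioeventfd and return
 * how many were marked. ioeventfd_limited is true on hosts with the
 * 6 ioeventfd limitation, in which case no queue is marked. */
size_t select_ioeventfd_queues(struct vq_info *vqs, size_t nvqs,
                               bool ioeventfd_limited)
{
    size_t chosen = 0;

    for (size_t i = 0; i < nvqs; i++) {
        vqs[i].use_ioeventfd = !ioeventfd_limited &&
                               vqs[i].num != 0 &&
                               !vqs[i].vhost_owned;
        if (vqs[i].use_ioeventfd) {
            chosen++;
        }
    }
    return chosen;
}
```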
The design adds logic into hw/virtio.c to automatically use ioeventfd
for userspace virtqueues. Specific virtio devices like blk and net
require no modification. The logic sits below the set_host_notifier()
function that vhost uses.
This design stays in sync because it hooks into two interfaces that
allow it to accurately track whether or not to use ioeventfd:
1. virtio_set_host_notifier() is used by vhost. When vhost enables the
host notifier we stay out of the way.
2. virtio_reset()/virtio_set_status()/virtio_load() define the device
life-cycle and transition the state machine appropriately. Migration
is supported.
Here is the state machine that tracks a virtqueue:
                  assigned
               ^ /        \ ^
          e.  / / c.   g.  \ \ b.
             / /            \ \
            / v      f.      v \        a.
      offlimits -----------> deassigned <-- start
                <-----------
                     d.
a. The virtqueue starts deassigned with no ioeventfd.
b. When the device status becomes VIRTIO_CONFIG_S_DRIVER_OK we try to
assign an ioeventfd to each virtqueue, except if the 6 ioeventfd
limitation is present.
c, d. The virtqueue becomes offlimits if vhost enables the host notifier.
e. The ioeventfd becomes assigned again when the host notifier is disabled by
vhost.
f. If the 6 ioeventfd limitation is present, the virtqueue instead becomes
deassigned at that point, because we want to avoid using ioeventfd.
g. When the device is reset its virtqueues become deassigned again.
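For reference, transitions b. through g. can be written as one pure
function. This is only a sketch of the description above, not the patch
itself: the enum names and vq_next_state() are invented here, and
"limited" stands for the 6 ioeventfd limitation. Anything the list does
not cover (for example a reset while offlimits) is assumed to leave the
state unchanged.

```c
#include <stdbool.h>

/* Per-virtqueue state, named after the diagram. */
enum vq_state {
    VQ_DEASSIGNED, /* no ioeventfd; initial state (a.) */
    VQ_ASSIGNED,   /* ioeventfd in use for a userspace virtqueue */
    VQ_OFFLIMITS,  /* vhost owns the host notifier */
};

/* Events driving the state machine (hypothetical names). */
enum vq_event {
    EV_DRIVER_OK,     /* status became VIRTIO_CONFIG_S_DRIVER_OK */
    EV_VHOST_ENABLE,  /* vhost enabled the host notifier */
    EV_VHOST_DISABLE, /* vhost disabled the host notifier */
    EV_RESET,         /* device reset */
};

/* Return the state that follows s when ev occurs. */
enum vq_state vq_next_state(enum vq_state s, enum vq_event ev, bool limited)
{
    switch (ev) {
    case EV_DRIVER_OK:
        /* b. deassigned -> assigned, unless the limit is present */
        if (s == VQ_DEASSIGNED && !limited) {
            return VQ_ASSIGNED;
        }
        return s;
    case EV_VHOST_ENABLE:
        /* c, d. assigned or deassigned -> offlimits */
        return VQ_OFFLIMITS;
    case EV_VHOST_DISABLE:
        if (s != VQ_OFFLIMITS) {
            return s;
        }
        /* e. back to assigned, or f. deassigned when limited */
        return limited ? VQ_DEASSIGNED : VQ_ASSIGNED;
    case EV_RESET:
        /* g. assigned -> deassigned */
        if (s == VQ_ASSIGNED) {
            return VQ_DEASSIGNED;
        }
        return s; /* assumption: other states are left alone */
    }
    return s;
}
```

Keeping the transitions in a single function like this would also give
us one place to document the state machine, which is what was asked for.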
Does this make sense?
Stefan