Re: [PATCH] virtio-ring: Use threshold for switching to indirect descriptors

From: Sasha Levin <>
Date: Tue, 29 Nov 2011 16:21:04 +0200

On Tue, 2011-11-29 at 15:54 +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 29, 2011 at 03:34:48PM +0200, Sasha Levin wrote:
> > On Tue, 2011-11-29 at 14:56 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Nov 29, 2011 at 11:33:16AM +0200, Sasha Levin wrote:
> > > > Currently if VIRTIO_RING_F_INDIRECT_DESC is enabled we will use indirect
> > > > descriptors even if we have plenty of space in the ring. This means that
> > > > we take a performance hit at all times due to the overhead of creating
> > > > indirect descriptors.
> > >
> > > Is it the overhead of creating them or just allocating the pages?
> >
> > My guess here is that it's the allocation since creating them is very
> > similar to creating regular descriptors.
>
> Well, there is some formatting overhead ...

Very little. The formatting code is very similar to regular buffers.
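
To put it in code, this is roughly the shape of the indirect path (an
untested, simplified sketch loosely modeled on vring_add_indirect() in
drivers/virtio/virtio_ring.c; the helper name is made up). The fill loop is
essentially the same work we already do for direct descriptors, the extra
cost is the kmalloc():

#include <linux/slab.h>
#include <linux/scatterlist.h>
#include <linux/virtio_ring.h>

/* Sketch only: build an indirect descriptor table for out + in sg entries. */
static struct vring_desc *build_indirect_table(struct scatterlist sg[],
                                               unsigned int out,
                                               unsigned int in,
                                               gfp_t gfp)
{
        struct vring_desc *desc;
        unsigned int i;

        /* The allocation this thread suspects is the real overhead. */
        desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp);
        if (!desc)
                return NULL;

        /* The "formatting": the same loop we would run for direct
         * descriptors, just writing into the kmalloc'ed table instead of
         * directly into the ring. */
        for (i = 0; i < out + in; i++, sg++) {
                desc[i].addr = sg_phys(sg);
                desc[i].len = sg->length;
                desc[i].flags = VRING_DESC_F_NEXT;
                if (i >= out)
                        desc[i].flags |= VRING_DESC_F_WRITE;
                desc[i].next = i + 1;
        }

        /* Terminate the chain; the table then occupies a single slot in
         * the ring (elided here). */
        desc[i - 1].flags &= ~VRING_DESC_F_NEXT;
        desc[i - 1].next = 0;

        return desc;
}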

>
> > > The logic you propose is basically add direct as long as
> > > the ring is mostly empty. So if the problem is in allocations,
> > > one simple optimization for this one workload is add a small
> > > cache of memory to use for indirect bufs. Of course building
> > > a good API for this is where we got blocked in the past...
> >
> > I thought the issue of using a single pool was that the sizes of
> > indirect descriptors are dynamic, so you can't use a single kmemcache
> > for all of them unless you're ok with having a bunch of wasted bytes.
>
> If the pool size is limited, the waste is limited too, so maybe
> we are OK with that...

What would you say are the best numbers for the indirect descriptor size,
and for how many of them to keep in a kmemcache?
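
To make that question concrete, I was thinking of something along these
lines (a pure sketch -- the cache name, the helpers and the 16-descriptor
cutoff are all made up for illustration): small requests come out of a
fixed-size kmem_cache, anything larger falls back to kmalloc(), so the
wasted bytes are bounded by the object size:

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/virtio_ring.h>

#define MAX_CACHED_DESCS 16     /* objects of 16 * 16 = 256 bytes */

static struct kmem_cache *indirect_cache;

static int __init indirect_cache_init(void)
{
        indirect_cache = kmem_cache_create("vring_indirect",
                        MAX_CACHED_DESCS * sizeof(struct vring_desc),
                        0, 0, NULL);
        return indirect_cache ? 0 : -ENOMEM;
}

static struct vring_desc *alloc_indirect(unsigned int total_descs, gfp_t gfp)
{
        /* Small tables come from the fixed-size cache, larger ones fall
         * back to a plain kmalloc(). */
        if (total_descs <= MAX_CACHED_DESCS)
                return kmem_cache_alloc(indirect_cache, gfp);
        return kmalloc(total_descs * sizeof(struct vring_desc), gfp);
}

static void free_indirect(struct vring_desc *desc, unsigned int total_descs)
{
        if (total_descs <= MAX_CACHED_DESCS)
                kmem_cache_free(indirect_cache, desc);
        else
                kfree(desc);
}

With 16 entries of 16 bytes each that's a 256-byte object, so the worst-case
waste per cached allocation is 240 bytes; whether that's acceptable is
exactly the question above.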

>
> >
> > > > With this patch, we will use indirect descriptors only if we have less than
> > > > either 16, or 12% of the total amount of descriptors available.
> > >
> > > One notes that this to some level conflicts with patches that change
> > > virtio net not to drain the vq before add buf, in that we are
> > > required here to drain the vq to avoid indirect.
> >
> > You don't have to avoid indirects by all means, if the vq is so full it
> > has to resort to indirect buffers we better let him do that.
>
> With the limited polling patches, the vq stays full all of
> the time, we only poll enough to create space for the new
> descriptor.
> It's not a must to make them work as they are not upstream,
> but worth considering.
>
> > >
> > > Not necessarily a serious problem, but something to keep in mind:
> > > a memory pool would not have this issue.
> > >
> > > >
> > > > I did basic performance benchmark on virtio-net with vhost enabled.
> > > > Before:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > >
> > > > 87380  16384   16384    10.00    4563.92
> > > >
> > > > After:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > >
> > > > 87380  16384   16384    10.00    5353.28
> > >
> > > Is this with the kvm tool? what kind of benchmark is this?
> >
> > It's using the kvm tool and netperf. It's a simple TCP_STREAM test with
> > vhost enabled and using a regular TAP device to connect between guest
> > and host.
>
> guest to host?

The guest is running as the server.
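
For completeness, the threshold described above ("less than either 16, or
12% of the total amount of descriptors available") boils down to roughly
the check below. This is an illustrative sketch, not the patch text itself;
the helper name and the num / 8 (~12.5%) approximation are mine:

#include <linux/types.h>

static bool ring_nearly_full(unsigned int num, unsigned int num_free)
{
        unsigned int thresh = num / 8;  /* roughly 12% of the ring */

        if (thresh < 16)
                thresh = 16;

        /* Switch to indirect descriptors only once free slots drop below
         * the threshold; with plenty of room, stay with direct ones. */
        return num_free < thresh;
}

Under that approximation, a 256-entry ring would start using indirect
descriptors once fewer than 32 slots are free.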

>
> > >
> > > Need to verify the effect on block too, and do some more
> > > benchmarks. In particular we are making the ring
> > > in effect smaller, how will this affect small packet perf
> > > with multiple streams?
> >
> > I couldn't get good block benchmarks on my hardware. They were all over
> > the place even when I was trying to get the baseline. I'm guessing my
> > disk is about to kick the bucket.
>
> Try using memory as a backing store.