<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"><HTML><HEAD><META http-equiv=Content-Type content="text/html; charset=iso-8859-1"><META content="MSHTML 5.50.4134.600" name=GENERATOR><STYLE></STYLE></HEAD><BODY bgColor=#ffffff><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; Doug Gilbert and I ran across some weirdness in the way the block device queues are plugged/unplugged.&nbsp; It turned up with some benchmarks of the SCSI generics driver - with the new queueing code, the generics driver is inserting requests into the same queue that block device requests are inserted.</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; The oddness is this.&nbsp; We were observing stalls in the processing of commands that was traced to the fact that the queue had remained plugged for an excessive amount of time.&nbsp; The stalls last for about 5 seconds or so.</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; Some investigation revealed that part of the answer is that the bdflush daemon essentially forces a bunch of dirty pages to be written to disk, but never bothers to unplug the queue when it is done.&nbsp; The result is that the queue remains plugged until someone else comes along and unplugs it.&nbsp; As it turns out, kupdate() does unplug the queue, and kupdate runs every 5 seconds or so.</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; Patching bdflush to run tq_disk after flushing buffers (i.e. before the schedule()) fixed *most* of the problem, but evidently not all of it (Doug was still observing stalls, but a lot less frequently).&nbsp; In other words, there is someone else out there queueing requests in such a way that the queue can remain plugged for some amount of time.</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; My gut tells me that it is wrong for bdflush to not unplug the queue when it is done queueing I/O requests.&nbsp; My gut also tells me that the generics driver probably wants to be unplugging the one specific queue that it is using to ensure that I/O gets queued right away (it doesn't make sense to unplug all queues in this instance).</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp; Comments?</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>-Eric</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV></BODY></HTML>