Comments

The block layer does not know about pending requests. This information
is necessary for copy-on-read since overlapping requests must be
serialized to prevent races that corrupt the image.
The BlockDriverState gets a new tracked_request list field which
contains all pending requests. Each request is a BdrvTrackedRequest
record with sector_num, nb_sectors, and is_write fields.
Note that request tracking is always enabled but hopefully this extra
work is so small that it doesn't justify adding an enable/disable flag.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
block.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
block_int.h | 4 ++++
2 files changed, 51 insertions(+), 1 deletions(-)

On 12/05/2011 01:17 PM, Marcelo Tosatti wrote:
> There is no need to worry about synchronous read/write requests> bypassing this interface, correct?>
Synchronous read/write requests do not exist anymore (in the sense that
they also go through coroutines).
Paolo

On Mon, Dec 5, 2011 at 4:09 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Wed, Nov 23, 2011 at 11:47:53AM +0000, Stefan Hajnoczi wrote:> On earlier discussion, you replied to me:>> ">>> req = tracked_request_add(bs, sector_num, nb_sectors, false);>>>> The tracked request should include cluster round info?>> When checking A and B for overlap, only one of them needs to be> rounded in order for overlap detection to be correct. Therefore we> can store the original request [start, length) in tracked_requests and> only round the new request.> ">> The problem AFAICS is this:>> - Store a non-cluster-aligned request in the tracked request list.> - Wait on that non-cluster-aligned request> (wait_for_overlapping_requests).> - Submit cluster-aligned request for COR request.>> So, the tracked request list does not properly reflect the in-flight> COR requests. Which can result in:>> 1) guest reads sector 10.> 2) <sector_num=10,nb_sectors=2> added to tracked request list.> 3) COR code submits read for <sector_num=10,nb_sectors=2+cluster_align>
It will also round down to sector_num=0 when cluster size is 128
sectors (e.g. qcow2 and qed).
> 4) unrelated guest operation writes to sector 13, nb_sectors=1. That is> allowed to proceed without waiting because tracked request list does not> reflect what COR in-flight requests.
The tracked request list has <sector_num=10, nb_sectors=2> and the
candidate write request is <sector_num=13, nb_sectors=1>.
In wait_for_overlapping_requests() we round the candidate request to
<sector_num=0, nb_sectors=cluster_size>. This rounded request does
overlap <sector_num=10, sectors=2> so it will need to wait.
Therefore CoR and write will not execute at the same time.
Does this make more sense?
Stefan