Re: i386 version of cpu_sfence()

From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date: Sat, 29 Jan 2011 09:40:10 -0800 (PST)

:I think it suggests that:
:processor 0                 processor 1
:store A <--- 1
:     :
:     : later
:     :....................> load r1 A
:
:r1 could still be 0, since A is still in the store buffer, while:
:processor 0                 processor 1
:store A <--- 1
:sfence
:     :
:     : later
:     :....................> load r1 A
:
:r1 must be 1
:
:Well, I could be wrong on this.
:
:Best Regards,
:sephe

Hmm. Well, for it not to be globally ordered, processor 1 would
have to be able to see a second write before it sees the first
write. Since both writes go through processor 0's write buffer,
which drains in order, processor 1 should never see them out of
order.

With the caveat, however, that processor 1 CAN see them out of
order if it reorders its reads or does speculative reads (which
all CPUs do by default); hence processor 1 would need an LFENCE
between the two reads.

But processor 0 (for x86) should not need an SFENCE.
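
A minimal sketch of the two-write case above, written with portable
C11 atomics rather than explicit fence instructions. The names
payload, ready, publish() and try_consume() are made up for
illustration and are not from the DragonFly tree. The release store
keeps the two writes ordered on the writer side, and the acquire load
stands in for the reader-side barrier. Which instructions (if any)
those turn into is up to the compiler and the target.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state, for illustration only. */
static int payload;        /* first write: ordinary data              */
static atomic_bool ready;  /* second write: flag, zero-initialized    */

/* Intended to run on processor 0: store the data, then the flag. */
void publish(int value)
{
    payload = value;
    /* Release: the payload store cannot be reordered after this store. */
    atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Intended to run on processor 1: returns true and fills *out once the
 * flag is visible, false otherwise.
 */
bool try_consume(int *out)
{
    assert(out != NULL);
    /* Acquire: the payload load below cannot be hoisted above this load. */
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

If try_consume() returns true, the reader is guaranteed to see the
payload written before the flag, which is the "never see them out of
order" property discussed above.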

There might be another exception for write combining in the store
buffer, though. In that case write A, write B, write C, where A and
C can be write-combined, could wind up causing another CPU to see
the write of C before the write of B. I'm not sure how wide the
store buffer is (32 bits?), so we may not have to worry about it.

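
For comparison, one case where SFENCE is documented as mattering on
x86 is after non-temporal stores, which go through write-combining
buffers. That is not the same mechanism as the ordinary store buffer
discussed above, but it shows a similar A/B/C hazard. A rough sketch,
assuming a compiler target with SSE2 (it falls back to a C11 release
fence otherwise). The names wc_buf, wc_published, write_abc() and
read_abc() are hypothetical.

```c
#include <stdatomic.h>

#if defined(__SSE2__)
#include <xmmintrin.h>  /* _mm_sfence */
#include <emmintrin.h>  /* _mm_stream_si32 */
#endif

/* Hypothetical buffer and flag, for illustration only. */
static int wc_buf[3];            /* A, B, C                         */
static atomic_int wc_published;  /* set once wc_buf[] may be read   */

/* Writer side: store A, B, C, fence, then publish the flag. */
void write_abc(int a, int b, int c)
{
#if defined(__SSE2__)
    /* Non-temporal stores are weakly ordered relative to other stores. */
    _mm_stream_si32(&wc_buf[0], a);
    _mm_stream_si32(&wc_buf[1], b);
    _mm_stream_si32(&wc_buf[2], c);
    /* SFENCE: earlier stores become globally visible before later ones. */
    _mm_sfence();
#else
    /* Assumed fallback when SSE2 is not available at compile time. */
    wc_buf[0] = a;
    wc_buf[1] = b;
    wc_buf[2] = c;
    atomic_thread_fence(memory_order_release);
#endif
    atomic_store_explicit(&wc_published, 1, memory_order_release);
}

/* Reader side: returns 1 and copies A, B, C once published, else 0. */
int read_abc(int out[3])
{
    if (!atomic_load_explicit(&wc_published, memory_order_acquire))
        return 0;
    out[0] = wc_buf[0];
    out[1] = wc_buf[1];
    out[2] = wc_buf[2];
    return 1;
}
```

Without the fence in the SSE2 path, nothing would order the three
streamed stores against the flag store, so a reader could see the
flag before all of A, B and C.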
-Matt