> Date: Tue, 29 Aug 2000 04:26:18 +0200 (CEST)
> From: Andrea Arcangeli <andrea@suse.de>
>
> Alpha definitely needs mb(), ld_l/st_c doesn't imply any memory
> barrier (yes, alpha is very aggressive in SMP 8).
>
> I wonder if it then makes sense to just provide a set of bitops
> interfaces then which enforces the barrier. Or perhaps the other way

And where do you want to enforce the barrier? Before or after clear_bit, or both? Do you think we should add mb_clear_bit() and clear_bit_mb() and mb_clear_bit_mb()? I'm not sure that's cleaner than using mb(); clear_bit(); mb(), however I see that only alpha, due to its SMP capability, would get the explicit mb() at zero additional cost. Oh well...

Anyway, what we now call clear_bit() should not have any implicit memory barrier. clear_bit is just an atomic function, it doesn't have anything to do with memory barriers. (I dislike the __clear_bit approach.)

I'd prefer to have things like UnlockPage implemented as the architecture prefers (with a safe SMP common code default) instead of exporting zillions of mb_clear_bit_mb and friends variants (same for set_bit and other friends). Maybe the architecture can make even deeper assumptions than what we can export via a bloated mb_clear_bit_mb() API.

Note that right now we enforce an implicit memory barrier _after_ test_and_set/clear/change_bit (to be able to use them like spinlocks to build a critical section). For example mm/filemap.c:lock_page() relies on test_and_set_bit (in TryLockPage) to do a mb() _after_ setting the PG_locked bitflag, and of course the alpha port is doing it, see:
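(a simplified sketch from memory of the 2.4 alpha bitops, not the exact code; the real version branches to an out-of-line section to retry a failed stl_c instead of looping inline:)

	static __inline__ int test_and_set_bit(unsigned long nr, volatile void *addr)
	{
		unsigned long oldbit, temp;
		unsigned int *m = ((unsigned int *) addr) + (nr >> 5);

		__asm__ __volatile__(
		"1:	ldl_l %0,%4\n"		/* load-locked the word */
		"	and %0,%3,%2\n"		/* extract the old bit */
		"	bne %2,2f\n"		/* bit already set: no store */
		"	xor %0,%3,%0\n"		/* set the bit */
		"	stl_c %0,%1\n"		/* store-conditional */
		"	beq %0,1b\n"		/* lost the reservation: retry */
		"2:	mb\n"			/* the implicit barrier _after_ the op */
		:"=&r" (temp), "=m" (*m), "=&r" (oldbit)
		:"Ir" (1UL << (nr & 31)), "m" (*m));

		return oldbit != 0;
	}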

And the above will allow the `lock = 0' (in our case clear_bit) to be reordered inside the critical section causing a plain SMP race. To fix it we do need an explicit mb() _before_ clear_bit() (and after the critical section of course). And for using the wake_up optimization (waitqueue_active()) we need mb() _after_ clear_bit(). Fun ;). So we need both before and after for two very different reasons (first mb is for closing the critical section before releasing the lock, second mb is for ordering rules of the wait event interface :)
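To see why the mb() _after_ clear_bit() is needed, compare with the sleeping side it pairs with (a rough sketch from memory of the 2.4 lock_page slow path, names approximate):

	/* sleeper, in the lock_page() slow path */
	add_wait_queue(&page->wait, &wait);	/* store: publish the waiter */
	for (;;) {
		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
		if (!test_and_set_bit(PG_locked, &page->flags))
			break;			/* got the lock */
		schedule();			/* wait for UnlockPage() */
	}
	remove_wait_queue(&page->wait, &wait);

If the unlocker's waitqueue_active() load is allowed to complete before its clear of PG_locked becomes visible, the unlocker can see an empty queue while the sleeper still sees the bit set: the wakeup is lost and the sleeper hangs.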

Furthermore, what we need isn't really mb() but smp_mb(), which is a noop in a UP compilation.
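Something along these lines (a sketch of the obvious conditional definition):

	#ifdef CONFIG_SMP
	#define smp_mb()	mb()
	#define smp_rmb()	rmb()
	#define smp_wmb()	wmb()
	#else
	#define smp_mb()	barrier()	/* compiler barrier only on UP */
	#define smp_rmb()	barrier()
	#define smp_wmb()	barrier()
	#endif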

UnlockPage()
{
	smp_mb();
	clear_bit();
	smp_mb();
	if (waitqueue_active())
		wakeup();
}

The above would fix the SMP race and it would be the most finegrained implementation that also avoids spurious mb() in the slow path of lock_page(). It would also be the most efficient implementation for the alpha :)) IMHO it also tends to be less misleading since there will be a smp_mb() at the start of the critical section too, so we won't forget the second _necessary_ one. And we won't do a spurious mb while doing the test_and_set_bit(PG_referenced) that doesn't need the mb.

However, with the above common code, to optimize lock_page we should re-implement it per-arch (a la string.h with overriding #define) for IA32 and sparc64 to avoid wasting cycles in spurious mb() that happen to be implicit in the bitops on at least those two architectures.
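A sketch of the override idea (the __HAVE_ARCH_UNLOCK_PAGE guard name here is hypothetical, the convention is the string.h one):

	/* common default, safe on every SMP arch */
	#ifndef __HAVE_ARCH_UNLOCK_PAGE		/* hypothetical guard */
	#define UnlockPage(page)					\
		do {							\
			smp_mb();	/* close the critical section */ \
			clear_bit(PG_locked, &(page)->flags);		\
			smp_mb();	/* order clear vs. queue check */ \
			if (waitqueue_active(&(page)->wait))		\
				wake_up(&(page)->wait);			\
		} while (0)
	#endif

IA32 and sparc64 would then define the guard in their asm headers and provide a variant without the smp_mb()s, since their bitops already imply the needed ordering.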

I'm checking if it's not a pain to audit all the test_and_set/clear/change_bit uses to avoid adding the implicit barrier in them, and to move the stuff that we want optimized into the arch section. Meanwhile hints and comments are welcome.

If you prefer a zillion of mb_clear_bit_mb and all the other possible combinations we can of course do that too (optimizing the slow path of lock_page shouldn't be a very worthwhile optimization, I see).

BTW (changing topic a bit), after your lesson about the meaning of the sparc64 membar I've seen you never use rmb/wmb inside the sparc64 code, so I think you can just optimize sparc64 to be more finegrained (and yes, wmb should have the alpha asm("wmb") meaning :) and rmb should be the exact counterpart, to avoid obvious mistakes where nobody knows anymore what those two macros mean :).
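For reference, on alpha those are roughly (note alpha only has the mb and wmb instructions, so rmb has to fall back to a full mb):

	#define mb()	__asm__ __volatile__("mb" : : : "memory")
	#define rmb()	__asm__ __volatile__("mb" : : : "memory")	/* no read-only barrier insn */
	#define wmb()	__asm__ __volatile__("wmb" : : : "memory")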

And this changes brlock.h accordingly so that it's implemented with the alpha rmb/wmb meanings:

--- 2.4.0-test7-pre5/include/linux/brlock.h	Tue Aug 22 01:30:04 2000
+++ /tmp/brlock.h	Tue Aug 22 01:56:53 2000
@@ -114,10 +114,15 @@
 	lock = &__br_write_locks[idx].lock;
 again:
 	(*ctr)++;
-	rmb();
+	mb();
 	if (spin_is_locked(lock)) {
 		(*ctr)--;
-		rmb();
+		wmb();	/*
+			 * The release of the ctr must become visible
+			 * to the other cpus eventually thus wmb(),
+			 * we don't care if spin_is_locked is reordered
+			 * before the releasing of the ctr.
+			 */
 		while (spin_is_locked(lock))
 			barrier();
 		goto again;

(btw, you and Alan previously told me that the wmb() is strictly necessary (Alan said at least in theory :) but now I remember why I said wmb isn't necessary: that's because spinlocks would otherwise deadlock, at least on alpha, because in spin_unlock() we do:
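(a sketch from memory of the alpha version:)

	static inline void spin_unlock(spinlock_t *lock)
	{
		mb();		/* close the critical section... */
		lock->lock = 0;	/* ...then a single plain store releases it */
	}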

so never causing the write buffer to be full (the write buffer has only the `lock' memory set to zero in it).

and the other task in the slow path is doing:

	while (test_and_set_bit(&lock))
		while (lock); /* other task is looping here */

(if the lock set to zero by the other CPU running spin_unlock would never become visible, this CPU would deadlock in the slow path of spin_lock)

I guess/hope we don't rely on the random stuff after spin_unlock to cause the write buffer to be flushed because of lots of writes in memory :)

I always read that wmb/rmb are there to avoid reordering; that looked like a matter of ordering of visibility only, not of happening or not happening.

Comments on this?

Andrea
