Yes, it looks like we can screw things up in the uncontended case (where nobody blocks on the mutex). We could add an smp_mb after the lock operation and another one before the unlock, but I'm tempted just to use asm-generic/mutex-dec.h instead. The latter approach will subtly change the current behaviour, so I'll post a patch when I'm happy with it.
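
For illustration only, the first option (a barrier after the lock and another before the unlock) might look roughly like the sketch below. This is a userspace approximation and not the kernel code: C11 atomics stand in for the kernel primitives, atomic_thread_fence(memory_order_seq_cst) is assumed as a rough analogue of smp_mb, and the names sketch_mutex, sketch_lock_fast and sketch_unlock_fast are hypothetical. The contended slow path is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a decrement-based mutex (not the kernel type).
 * count: 1 = unlocked, 0 = locked, negative = locked with waiters. */
struct sketch_mutex {
    atomic_int count;
};

/* Fast-path lock. Returns true if the mutex was taken without contention;
 * a false return means the caller would have to fall back to a slow path,
 * which this sketch does not implement. */
static inline bool sketch_lock_fast(struct sketch_mutex *m)
{
    /* Relaxed decrement: by itself this gives no ordering guarantees. */
    int old = atomic_fetch_sub_explicit(&m->count, 1, memory_order_relaxed);

    /* Full fence after the lock operation (assumed analogue of smp_mb),
     * so critical-section accesses cannot be reordered before it. */
    atomic_thread_fence(memory_order_seq_cst);

    return old == 1;
}

/* Fast-path unlock. Returns true if nobody was waiting; a false return
 * means the omitted slow path would need to wake a waiter. */
static inline bool sketch_unlock_fast(struct sketch_mutex *m)
{
    /* Full fence before the unlock (assumed analogue of smp_mb), so
     * critical-section accesses cannot be reordered after the release. */
    atomic_thread_fence(memory_order_seq_cst);

    int old = atomic_fetch_add_explicit(&m->count, 1, memory_order_relaxed);

    return old == 0;
}
```

The point of the two fences is that the uncontended path never reaches the slow path, so any ordering has to come from the fast path itself.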

Curious: did you find this by inspection or did you observe it going wrong?