On Thu, 8 Jan 2009, Linus Torvalds wrote:
>
> And I don't even believe that is the bug. I suspect the bug is simpler.
>
> I think the "need_resched()" needs to go in the outer loop, or at least
> happen in the "!owner" case. Because at least with preemption, what can
> happen otherwise is
>
>  - process A gets the lock, but gets preempted before it sets lock->owner.
>
>    End result: count = 0, owner = NULL.
>
>  - processes B/C goes into the spin loop, filling up all CPU's (assuming
>    dual-core here), and will now both loop forever if they hold the kernel
>    lock (or have some other preemption disabling thing over their down()).
>
> And all the while, process A would _happily_ set ->owner, and eventually
> release the mutex, but it never gets to run to do either of them so.
>
> In fact, you might not even need a process C: all you need is for B to be
> on the same runqueue as A, and having enough load on the other CPU's that
> A never gets migrated away. So "C" might be in user space.
>
> I dunno. There are probably variations on the above.
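
To make the scenario above concrete, here is a rough userspace model of the
spin loop. It is only a sketch, not the actual mutex code: struct sim_mutex,
struct sim_task, spin_for_mutex() and the max_iters bound are all made up for
illustration, and need_resched is passed in as a plain flag instead of being
read from the scheduler. The bound exists only so that the "loops forever"
case shows up as a return value instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct sim_task {
	bool on_cpu;			/* is this task currently running? */
};

struct sim_mutex {
	int count;			/* 1 = unlocked, 0 = locked */
	struct sim_task *owner;		/* NULL until the holder records itself */
};

enum spin_result { SPIN_ACQUIRED, SPIN_GIVE_UP, SPIN_STUCK };

/*
 * Model of a spinner waiting for the mutex. The lock state is frozen for
 * the duration of the call, which mimics a holder that never gets to run.
 *
 * check_in_outer_loop == false: need_resched is only looked at while
 *                               there is a visible owner.
 * check_in_outer_loop == true:  need_resched is also looked at in the
 *                               "!owner" case.
 *
 * SPIN_STUCK means the spinner used up max_iters without making progress.
 */
enum spin_result spin_for_mutex(struct sim_mutex *lock, bool need_resched,
				bool check_in_outer_loop, int max_iters)
{
	assert(max_iters > 0);

	for (int i = 0; i < max_iters; i++) {
		struct sim_task *owner = lock->owner;

		if (owner != NULL) {
			/* Spin only while the owner runs and we may keep the CPU. */
			if (!owner->on_cpu || need_resched)
				return SPIN_GIVE_UP;
		} else if (lock->count == 1) {
			/* Lock is free: take it. */
			lock->count = 0;
			return SPIN_ACQUIRED;
		} else if (check_in_outer_loop && need_resched) {
			/* count == 0, owner == NULL: holder was preempted early. */
			return SPIN_GIVE_UP;
		}
	}
	return SPIN_STUCK;
}
```

With count = 0, owner = NULL and need_resched set, the first variant burns
through every iteration (SPIN_STUCK), while the second one backs off on the
first pass (SPIN_GIVE_UP) so that A gets a chance to run.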