On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote:
> Nicolas, thanks for further report, it contradicts my theory that
> problem occured somewhere during 2.6.32.16. Now I think I know why
> several of my other machines running 2.6.32.x for long time didn't
> crashed:
>
> I checked bugzilla entry for (I believe the same) problem here:
> https://bugzilla.kernel.org/show_bug.cgi?id=16991

I don't think that bug is related; I for one haven't seen any
backtrace that is similar to the above or relevant to a divide by zero.

> and Peter Zijlstra asked there, whether reporters systems were running
> some RT tasks. Then I realised that all of my four crashed boxes were
> pacemaker/corosync clusters and pacemaker uses lots of RT priority
> tasks. So I believe this is important, and might be reason why other
> machines seem to be running rock solid - they are not running any RT
> tasks. It also might help with hunting this bug. Is somebody of You
> also running some RT priority tasks on inflicted systems, or problem
> also occured without it?

No, no RT tasks here. The boxes in my case were just running a lot of
kvm processes.