> From: Ingo Molnar <mingo@elte.hu>
> Date: Tue, 17 Jun 2008 10:32:20 +0200
>
> > those up to 1000 msec delays can be 'felt' via ssh too, if this
> > problem triggers then the system is almost unusable via the network.
> > Local latencies are perfect so it's an e1000 problem.
>
> Or some kind of weird interrupt problem.
>
> Such an interrupt level bug would also account for the TX timeout's
> you're seeing btw.

when i originally reported it i debugged it back to missing e1000 TX completion IRQs. I tried various versions of the driver to figure out whether new workarounds for e1000 cover it but it was fruitless. There is a 1000 msec internal watchdog timer IRQ within e1000 that gets things going if it's stuck.

But the sch_generic.c:222 problem is new. It could be an escalation of this same problem - with not even the hw-internal watchdog timeout fixing things up? So basically two levels of completion failed, and the third fallback level (a hard reset of the interface) helped things get going. High score from me for networking layer robustness :-)