
OK, good news here: the performance drop on the myri was caused by a
problem between the keyboard and the chair. After the reboot series,
I forgot to reload the firmware, so the driver used the less efficient
firmware stored on the NIC (it performs just as if LRO were disabled).

That makes me think that I should try 3.8-rc2 since LRO was removed
there :-/

Just for the record, I tested 3.8-rc2, and the myri works as fast with
GRO there as it used to with LRO in previous kernels. The softirq
work has increased from 26% to 48%, but there is no performance drop when
using GRO anymore. Andrew has done a good job!
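For anyone who wants to check the same thing on their own hardware, the offload state can be inspected and toggled with ethtool. This is a minimal sketch, assuming an interface named eth0 (adjust for your setup):

```shell
# Show the current offload settings, including generic-receive-offload
# (and large-receive-offload, on drivers that still expose it).
ethtool -k eth0 | grep -E 'generic-receive-offload|large-receive-offload'

# Enable GRO; kernels without the old in-driver LRO path rely on this
# instead, which is what the 3.8-rc2 test above exercises.
ethtool -K eth0 gro on
```

Toggling GRO off and rerunning the same benchmark is a quick way to confirm whether a throughput drop comes from the receive-offload path.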

Comment


It certainly is a performance regression, but I seriously doubt that it affects many Phoronix readers. I don't know how many of us use 10 GigE, but you have to start there, and then question how many of those readers have the Myri cards.

Yep, it's a driver problem. For that small subset, it's pretty darned serious, but then again, it'll be fixed when 3.8 hits the streets.

Comment

I think the problem is much more than a single driver or a simple performance regression. I updated a machine with a gigabit Broadcom network card to 3.7.1. I started seeing processes hang when performing network operations (JDBC, memcached, ActiveMQ). The hangs were intermittent but would happen during heavy batch processing every night. Over the next several days, I tried messing with kernel settings, MTU settings, hugepage support settings, JDBC driver updates, JDBC driver reverts, disabling IPv6, enabling/disabling TCP keepalive, etc.

I eventually reverted the setting changes and rolled the kernel back to a 3.6 release, and I haven't seen one hang since. I came across some LKML mailings talking about epoll hangs and figured there must be something going on, since the stack traces I was seeing showed strange kernel hangs, like the client was waiting on the server and the server was waiting on the client.
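If anyone else hits hangs like these, the kernel-side stack of a stuck process can be read straight from procfs, no special tooling needed. A minimal sketch, assuming the hung process has PID 1234 (substitute your own; /proc/PID/stack needs root):

```shell
# Kernel stack of the stuck task -- shows where in the kernel
# (e.g. an epoll wait or a socket receive) it is sleeping.
cat /proc/1234/stack

# TCP socket state for that process, to see whether data is
# sitting unread in the send/receive queues on either end.
ss -tnp | grep 'pid=1234'
</imports>
```

Capturing both sides (client and server) at the moment of a hang is what reveals the deadlock pattern described above, where each end appears to be waiting on the other.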

I'll be following the progress on this network issue now to see what's uncovered.
