Klaus Frahm wrote:
> I would like to give my feedback on a recent modification of the tulip
> driver in 2.6.10-rc3 by:
>
> John W. Linville:
> o tulip: make tulip_stop_rxtx() wait for DMA to fully stop

Klaus,
thanks for the feedback. I'm one of the original authors of the patch.

That patch was added to prevent the tulip driver from deallocating DMA
buffers while the tulip chip was still doing DMA to consistent memory or
RX buffers. This showed up as an MCA (crash) on faster ia64 (1.6GHz) HP ZX1
platforms.

> I have a sitecom network card which happens to work with the tulip
> driver. I have observed in the kernel version 2.6.10-rc3 when I
> deconfigure the device with "ifconfig eth1 down" (or also with
> "dhcpcd -k eth1"), I get the following message in dmesg and
> /var/log/messages:
> 0000:00:0e.0: tulip_stop_rxtx() failed
> The message did not appear until 2.6.10-rc2 and I assume it is due to
> the modification of "John W. Linville" mentioned above.

Correct - it is.

> This message does not seem to create any problem for me and I do not
> require any assistance. I can configure and deconfigure the device as
> usual and the network card appears to work properly. However, I thought
> this information might be useful for debugging purposes for the
> developer.

I need just one or two more bits of info (see below).
Apologies for not including that in the original patch.

> Since this is related to DMA this might also be a hardware bug since I
> use a 5 year old motherboard and since 2.4.21 I can no longer use DMA
> for my old cdrom. DMA for the harddisk works properly.

The message does not indicate a new problem on your platform.
And it's unlikely you will run into the same problem the patch
was intended to fix.

I expect one of three things to fix this:
o The comet card needs more time than we've allocated.
Could you also try larger values for "i" in the loop?
e.g. 2000/10 or 4000/10

o The loop is too "tight" and poking the card every 10us is interfering
with DMA. The solution is to change udelay(10) to udelay(50) or
udelay(100) and adjust the initial value of "i" to match.

o Chip defect. When DMA is stopped, CSR5 Transmit State and Receive
State machines are expected to be zero. It's possible this chip
just never sets those states. I suppose we could check CSR6 bits
to confirm the ST and SR bits are clear before printing the message.
The CSR6 value above will tell me if that's feasible.
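
To make the three options above concrete, here is a rough user-space sketch
of a stop-and-wait loop with the timeout and poll interval as parameters,
plus the CSR6 check from the third item. It is not the driver code: the card
is simulated, and fake_nic, stop_rxtx_sketch() and run_case() are made-up
names for illustration. The CSR5/CSR6 masks are assumed values for
21x4x-style chips and should be checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed masks for 21x4x-style chips; verify against the datasheet. */
#define CSR5_TS 0x00700000u /* CSR5 Transmit State field */
#define CSR5_RS 0x000e0000u /* CSR5 Receive State field  */
#define CSR6_ST 0x00002000u /* CSR6 Start Transmit bit   */
#define CSR6_SR 0x00000002u /* CSR6 Start Receive bit    */

enum stop_result {
    STOP_OK,                 /* CSR5 state fields went to zero in time */
    STOP_TIMEOUT_CSR6_CLEAR, /* timed out, but ST/SR are clear: stay quiet */
    STOP_FAILED              /* timed out and ST/SR still set: report it */
};

/* Hypothetical simulated card, not a driver structure. */
struct fake_nic {
    uint32_t csr5, csr6;
    unsigned elapsed_us;    /* simulated time spent delaying */
    unsigned idle_after_us; /* DMA goes idle after this much time */
    bool never_idle;        /* models a chip that never clears TS/RS */
    bool csr6_write_lost;   /* models a stop request that does not take */
};

/* Simulated CSR5 read: the state fields clear once enough time has passed. */
static uint32_t fake_read_csr5(struct fake_nic *nic)
{
    if (!nic->never_idle && nic->elapsed_us >= nic->idle_after_us)
        nic->csr5 &= ~(CSR5_TS | CSR5_RS);
    return nic->csr5;
}

/* step_us must be nonzero. */
static enum stop_result stop_rxtx_sketch(struct fake_nic *nic,
                                         unsigned timeout_us, unsigned step_us)
{
    unsigned i;

    /* Request the stop by clearing ST and SR in CSR6. */
    if (!nic->csr6_write_lost)
        nic->csr6 &= ~(CSR6_ST | CSR6_SR);

    /* Poll CSR5 at most timeout_us / step_us times, e.g. 2000 / 10. */
    for (i = timeout_us / step_us; i > 0; i--) {
        if (!(fake_read_csr5(nic) & (CSR5_TS | CSR5_RS)))
            return STOP_OK;
        nic->elapsed_us += step_us; /* stands in for udelay(step_us) */
    }

    /* Timed out: only report failure if ST/SR are still set. */
    if (!(nic->csr6 & (CSR6_ST | CSR6_SR)))
        return STOP_TIMEOUT_CSR6_CLEAR;
    return STOP_FAILED;
}

/* Build a busy simulated card and run one stop attempt against it. */
static enum stop_result run_case(unsigned idle_after_us, bool never_idle,
                                 bool csr6_write_lost,
                                 unsigned timeout_us, unsigned step_us)
{
    struct fake_nic nic = {
        .csr5 = CSR5_TS | CSR5_RS,
        .csr6 = CSR6_ST | CSR6_SR,
        .idle_after_us = idle_after_us,
        .never_idle = never_idle,
        .csr6_write_lost = csr6_write_lost,
    };
    return stop_rxtx_sketch(&nic, timeout_us, step_us);
}
```

With a simulated card that needs 1500us to go idle,
run_case(1500, false, false, 2000, 10) and
run_case(1500, false, false, 4000, 100) both return STOP_OK, while a
1000us budget times out. A card that never clears the CSR5 state fields
but does clear ST/SR lands in STOP_TIMEOUT_CSR6_CLEAR, which is the case
where the message could be suppressed.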