I recently built a dual Athlon box (Tyan Tiger MPX) and my GF4 Ti4200 just hung the system when I tried an OpenGL visualization plugin through XMMS. Damn. Cursor still worked, but everything else was frozen. Anyone else have this problem? I had the same thing years ago on my dual-Celeron box with a PCI Riva TNT card. I figured by now this would be fixed.

System is Red Hat 7.3 with the 2.4.18-5 SMP kernel and GLX drivers.

Any tips would be appreciated. Like is this a kernel problem, or a driver problem? Are there some BIOS settings that might help? I'm running with the 'mem=nopentium' option, by the way. Given some pointers I'll even start trying to fix this myself, I'm that desperate.
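
For anyone who wants to confirm that a boot option such as 'mem=nopentium' really reached the running kernel, here is a minimal sketch in Python. It assumes a Linux system where /proc/cmdline holds the boot parameters. The function name has_boot_option is made up for illustration and is not part of any tool mentioned in this thread.

```python
def has_boot_option(cmdline: str, option: str = "mem=nopentium") -> bool:
    """Return True if `option` appears as a whole parameter in a kernel command line.

    `has_boot_option` is a hypothetical helper name used only for this sketch.
    """
    # Boot parameters are separated by whitespace. Comparing whole tokens
    # means a longer parameter that merely starts with `option` is not a match.
    return option in cmdline.split()
```

Passing it the contents of /proc/cmdline, for example has_boot_option(open("/proc/cmdline").read()), checks the kernel that is currently booted.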

Chaz

Thunderbird

08-20-02 03:05 AM

It might be that dual Athlon stuff isn't that stable yet. It took a long time before normal Intel CPUs were stable in SMP mode. Send a bug report to: linux-bugs@nvidia.com

chazmati

08-20-02 09:26 PM

Stable in SMP mode?

Maybe, although I haven't seen any other stability issues with this system. It's not overclocked, I didn't 'cheat' by using a pair of Athlon XP processors... so why doesn't this work?

Great linkage, thanks much. Required reading for anyone with a dual Athlon system + nVidia card.

Chaz

chazmati

08-22-02 12:21 AM

Just wanted to follow up on the SMP stability issue.

Disabling AGP sounded like a performance hit, so I read through the AMD issue report--which was awesome--and applied the patch to the 2.4.18-10 kernel, which Red Hat released today.

So basically what I did was to grab "kernel-source-2.4.18-10" via up2date and apply the patch (see gbrauer's post for the link) to /usr/src/linux-2.4/arch/i386/kernel/setup.c before building a custom kernel. Red Hat's instructions were very clear although I encountered one error compiling the "transparent decompression extension" to the iso9660 filesystem. I disabled this option and was able to make bzImage/modules etc.

I rebooted into that kernel 30 minutes ago and haven't had a crash yet. During this time XMMS has been running that OpenGL plugin that quickly crashed my system under the stock 2.4.18-5 kernel.

Thanks again for the valuable information, hope this helps other people in the same situation.

I *believe* this is a backport of the AGP speculative caching fix from the 2.5 kernel. The patch I linked to above is a workaround that works by disabling speculative caching. This patch may actually fix the architectural problems in the kernel. You may have "double fixed" your problem by patching RH 2.4.18-10.

AMD claims the workaround patch only adds a 2-3% performance hit, but you may want to try running the stock 2.4.18-10 kernel and see if you're still stable.

Then follow the instructions to recompile a custom kernel (edit /usr/src/linux-2.4/Makefile and set "EXTRAVERSION = <something>" so you don't overwrite your vendor kernel).
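
The Makefile edit is a one-line change: the variable near the top of the kernel Makefile is spelled EXTRAVERSION. As a rough sketch of that edit in Python, operating on the Makefile text only (the function name set_extraversion and the example suffix "-10agpfix" are made up for illustration, so pick your own suffix):

```python
import re


def set_extraversion(makefile_text: str, suffix: str) -> str:
    """Return the Makefile text with the first EXTRAVERSION assignment set to `suffix`.

    Hypothetical helper for illustration; it only rewrites the one line.
    """
    # Match the whole "EXTRAVERSION = ..." line at the start of a line.
    pattern = re.compile(r"^EXTRAVERSION[ \t]*=.*$", re.MULTILINE)
    # A function replacement avoids any backslash handling in `suffix`.
    new_text, count = pattern.subn(
        lambda m: "EXTRAVERSION = " + suffix, makefile_text, count=1
    )
    if count == 0:
        raise ValueError("no EXTRAVERSION line found")
    return new_text


# Example input modelled on the first lines of a 2.4 kernel Makefile.
sample = "VERSION = 2\nPATCHLEVEL = 4\nSUBLEVEL = 18\nEXTRAVERSION = -10\n"
patched = set_extraversion(sample, "-10agpfix")
```

A distinct suffix gives the custom build its own version string, so its modules install alongside the vendor kernel's instead of replacing them.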

gbrauer mentioned the Red Hat 2.4.18-10 kernel seems to have a (better?) patch for this, so you may not need this, but you might want to try it just to cover all the bases.

tedkaz

08-26-02 10:28 AM

What a saga :-)

I went through hell with this. With RHL 7.2 and the 2880 it was OK with my dual Athlon on a Tyan S2462. Upgrade to RHL 7.3 and 2.4.18-3,4,5 and it was hosed. But I am now on the 2.4.18-10 kernel and I seem to be OK. I never could make it through one round of Tuxracer before. Time permitting I will dig into this some more; there definitely seems to be a big performance hit with any of the OpenGL stuff going. Anyway, great thread with a great link.