Howto: Troubleshoot Nvidia drivers with the Ubuntu 7.04 desktop CD

Edit: Unfortunately, the images originally included in this post are gone, because of hosting problems in late 2009. My apologies.

Part of the difficulty in troubleshooting a video driver issue is that after two or three unsuccessful tries, so many settings have changed that it’s hard to tell whether new problems have cropped up as a result. With too many tweaks and changes, added and removed packages, and installed and uninstalled drivers, it’s a challenge to make sure the old changes you’ve made aren’t keeping the new ones from taking effect.

At the same time, part of the beauty of an Ubuntu installation — and Linux on the whole — is that there isn’t much need to restart your machine unless you’ve made crucial changes to some core elements, like a kernel upgrade. So it’s possible in some cases to troubleshoot or tweak an installation, even device drivers, without a restart. And that means a live environment is a perfect way to test things.

Use this if you think you’ve overdone your full installation of Ubuntu (or other shades of Linux), and you don’t want to go through the hassle of a reinstall if you’re not sure it will be of any help.

For this example, I’m going to install the nvidia-glx package on my quarrelsome (but tamed) Geforce4 440 Go, then patch, rebuild and install the 8776 driver — all in the same “installation,” all in the live environment and all without a restart. And I have screenshots to prove it. 😀

Here’s what you’ll need:

An Ubuntu 7.04 desktop CD. I strongly recommend the Xubuntu CD since it’s much easier on the system, but if you’re running at 3.06GHz on a Pentium 4 HT with 2GB of memory … Ubuntu is fine.

A working Internet connection. This is necessary if you want to try nvidia-glx or another driver out of the repositories; if you have other plans, it’s only a suggestion.

A speedy optical drive. Those CD access lags are killers.

A considerable amount of memory, the more the better.

A considerable comfort level with using the console. Almost all of this will be done from a tty window (not a terminal emulator), and you’ll need to be comfortable with that.

I should mention that performing acrobatics of this nature is rather hardware intensive. I can accomplish most of these feats within 20 minutes on a 1GHz machine with 512MB and a 64MB Nvidia Geforce4 440 Go. Slower machines are going to be rather burdened by the live environment. Having said that, this method will work with any Ubuntu-based live CD (and probably some other flavors of Linux as well), so you could use something like PUD GNU/Linux on machines that are too light for a Xubuntu desktop CD.

An added bonus would be a second machine for Internet access, so you can read these instructions or look elsewhere for help with a graphical environment. I usually use w3m to surf while troubleshooting things, but it would be better to have a full graphical desktop available to you, rather than fighting with a text-mode browser. And it’s just easier to have a working second machine whenever you’re fixing another one. That was the voice of experience speaking there.

Before you get started, do yourself a favor and download any files or patches you might need, and put them on a USB drive or a spare hard drive. You’ll be able to mount those drives and access those files, so get them now rather than strain the live desktop by surfing for drivers or a random deb package that you saw somewhere on the forums. After about 10 minutes, that live CD lag gets annoying. 😀

First, boot to the live environment. After the initial load delay, you should be met with a sleek desktop not unlike this one. As you can see I’ve already downloaded the 8776 driver and the patch.

Smashing! Lovely! Gorgeous! Now let’s kill it. Jump to the first console window with CTRL+ALT+F1. Stop GDM politely with this command.

sudo /etc/init.d/gdm stop

That should be enough to stop the entire live desktop; just to be sure, you can try killing it rudely with

sudo killall gdm

Check and make sure it’s no longer running by hitting CTRL+ALT+F7. If the desktop is still there, go back and kill GDM again.

Now again from the command line, let’s update our package list and install nvidia-glx. The restricted repositories are enabled by default in Feisty, so that’s one small step we won’t have to take.

sudo aptitude update && sudo aptitude install nvidia-glx

Acknowledge the package list and install. Remember that you might want to use a different driver if you’re on an older or a newer card. In my case, nvidia-glx is what I need.

Now I can run the bundled X configuration program to set the xorg.conf file to use the new driver.

sudo nvidia-xconfig
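To confirm that nvidia-xconfig really switched the driver, you can list what the “Device” sections of xorg.conf now name. This is only a sketch: xorg_drivers is a helper name made up for this example, and it assumes the conventional Section/EndSection layout.

```shell
# xorg_drivers [FILE]: print the Driver named in each "Device" section.
# Hypothetical helper, not part of the NVIDIA or Ubuntu tooling.
xorg_drivers() {
    awk '
        /^[ \t]*Section[ \t]+"Device"/ { in_device = 1; next }
        /^[ \t]*EndSection/            { in_device = 0 }
        in_device && $1 == "Driver"    { gsub(/"/, "", $2); print $2 }
    ' "${1:-/etc/X11/xorg.conf}"
}
```

Run with no argument, it reads /etc/X11/xorg.conf; after nvidia-xconfig it should print nvidia rather than nv or vesa.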

Since nvidia-glx uses the 9631 driver, and that will turn off my LCD display if I don’t tell it otherwise, I edit xorg.conf to add this line under the “Device” section. Note that this is something unique to 420/440/460 Go users, along with a few others. Don’t copy and paste this next part unless you’re sure you need it.

sudo nano -w /etc/X11/xorg.conf

Now add this beneath the driver details under “Device.”

Option "UseDisplayDevice" "DFP"
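If you would rather not hand-edit the file, the same change can be scripted. This is a minimal sketch, assuming the “Device” section already contains a Driver "nvidia" line (which nvidia-xconfig should have written). The name add_display_option is made up for this example, and the function only prints the result so you can inspect it before copying it into place.

```shell
# add_display_option FILE: print FILE with the UseDisplayDevice option
# inserted directly after the Driver "nvidia" line. Hypothetical helper;
# it writes to stdout and never touches FILE itself.
add_display_option() {
    awk '
        { print }
        /^[ \t]*Driver[ \t]+"nvidia"/ {
            print "    Option \"UseDisplayDevice\" \"DFP\""
        }
    ' "$1"
}
```

For example, run add_display_option /etc/X11/xorg.conf > ~/xorg.conf.new, look the new file over, and only then copy it over the original with sudo cp.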

Write out the file and close nano. Now comes the fun part. Since the module has been inserted and is ready for use, we can just restart GDM, which will in turn restart X, and that will lift us straight back to the graphical environment.

sudo /etc/init.d/gdm restart

And after a brief pause. …

Ta-da! Just like magic. No restart needed. Isn’t Linux just dreamy?

Test the driver by opening a terminal window and triggering the glxgears program, which will show you if acceleration is working properly.

glxgears -info

The output should be the spinning gears you see there. If you don’t see it, then something’s wrong with the driver, your card, the kernel module or another part of the environment. This is just a troubleshooting step though, so an issue or problem will help you get an idea of how to fix something.
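Spinning gears alone don’t prove hardware acceleration, since glxgears will also run with software rendering. A rough cross-check, sketched here on the assumption that glxinfo is available (it ships alongside glxgears in mesa-utils), is to look for the “direct rendering” line in its output. The function name is made up for this example.

```shell
# Reads glxinfo output on stdin; succeeds only if direct rendering is on.
direct_rendering_enabled() {
    grep -q '^direct rendering: Yes'
}

# Typical use from a terminal inside the running X session:
#   glxinfo | direct_rendering_enabled && echo "acceleration looks OK"
```

Treat a “Yes” as a good sign rather than proof; the frame rate glxgears reports is still worth a glance.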

Moving on. … Now let’s tear out that nvidia-glx package, strip out anything even remotely akin to the bundled Ubuntu driver and rebuild 8776 from scratch. This is the fun part. 😈

Drop back to the console with CTRL+ALT+F1, and stop GDM again, killing it if necessary.

sudo /etc/init.d/gdm stop
sudo killall gdm

This time it’s very important that you make sure X isn’t running at all. Use CTRL+ALT+F7 to make sure there’s no graphical environment left, and kill GDM again if there is.
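If you’d rather not rely on flipping to tty7 and squinting, the process list gives the same answer. This is a rough sketch, assuming the X server shows up as Xorg or X and the display manager as gdm, which is what I’d expect on Feisty but is worth checking on other releases. The function name is invented for this example.

```shell
# Reads a list of command names on stdin (one per line) and succeeds
# if any of them is the X server or GDM.
x_still_running() {
    grep -Eq '^(Xorg|X|gdm)$'
}

# From the console:
#   ps -e -o comm= | x_still_running && echo "X or GDM is still alive"
```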

Now let’s get rid of the package, then the module. At the same time, we’ll strip out anything even remotely close to the Ubuntu driver, even the restricted modules that come by default.

Remember: If you installed a different driver, change nvidia-glx to nvidia-glx-legacy or nvidia-glx-new or whatever. (I’ve also gone straight ahead and included linux-restricted-modules-generic and linux-restricted-modules-2.6.20-15-generic, since you can’t possibly be using a different kernel in a live environment; that would require a restart … I think. 😛 )
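Spelled out, a removal along those lines could look like the following. Treat it as a sketch rather than the exact command: it assumes aptitude’s purge action, uses the three package names mentioned above, and only prints the command so you can check it before running it. purge_command is a name invented for this example.

```shell
# purge_command [DRIVER_PKG] [KERNEL_RELEASE]
# Prints (does not run) an aptitude command removing the driver package
# and the restricted-modules packages. Defaults: nvidia-glx, `uname -r`.
purge_command() {
    driver_pkg="${1:-nvidia-glx}"
    kernel="${2:-$(uname -r)}"
    echo "sudo aptitude purge $driver_pkg" \
         "linux-restricted-modules-generic" \
         "linux-restricted-modules-$kernel"
}

# Example for someone who installed the legacy driver instead:
purge_command nvidia-glx-legacy 2.6.20-15-generic
```

Printing first is deliberate: in a live session a mistyped package name costs you another slow trip to the CD.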

Now let’s make sure that nvidia module isn’t inserted into the kernel.

lsmod | grep nvidia

Ideally, the results should be nothing. If you get anything that even remotely resembles this …
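The same check can be scripted so that it gives a plain yes or no. This sketch reads /proc/modules, which is where lsmod gets its list; module_loaded is an invented name. If the module does turn out to be loaded, the usual way to unload it is sudo rmmod nvidia, though that is my assumption about this step, and it will only work once nothing (X included) is still using the module.

```shell
# module_loaded NAME [TABLE]: succeed if NAME is listed as a loaded module.
# TABLE defaults to /proc/modules; the first field of each line is the name.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example:
#   module_loaded nvidia && echo "nvidia is still loaded"
```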

Once those are in place, copy the driver and the patch into your home directory. Then decompress the driver with this line.

sh NVIDIA-Linux-x86-1.0-8776-pkg1.run -x

Now change into the directory it makes, then into the usr/src/nv subdirectory.

cd NVIDIA-Linux-x86-1.0-8776-pkg1/usr/src/nv/

Patch the driver to run against the 2.6.20 kernel.

patch Makefile.kbuild ~/NVIDIA_kernel-1.0-8776-20061203.diff.txt

When patch finishes, jump back up three directories to where the installer program is.

cd ../../..

Now start the installer with root privileges.

sudo ./nvidia-installer

From there, the installer will take over. If you’ve done everything right thus far, it will build the patched driver, install it, insert the module and offer to tweak your xorg.conf file — which I usually let it do. When it’s done, start GDM up again with this familiar command, and everything should be golden.

sudo /etc/init.d/gdm restart

And if the driver works for you, this is what you should see, more or less.

Ta-da, again! That’s the same session, the same live environment, but now using the earlier driver. No reboot. And if something had gone wrong, I would have some way of checking to see if the problem lies with the driver, or the card, or elsewhere. I don’t have to wonder if some other tweak is still in the way.

And best of all, if I want to start over clean, I just reboot fresh, and I’m back to where I started.

So remember this the next time you’re laboring with a driver or a card, and it seems like things should be working but they’re not. If you think it might be helpful, go to a live environment and start clean.

Worked great installing 100.14.09 in my Feisty. The only thing I did not do was to patch the driver to 2.6.20 ( did not realize I had to download the file after half way thru the installation). I just answered yes to all the questions during the nvidia installer. I guess I will post if I get any errors.
Thanks a lot.