Sunday, May 25, 2008

I first upgraded X.Org on my live HDD and it was Good. The proverbial poop hit the fan when I happily tried doing the same on my laptop at home. I backed up /etc/X11/xorg.conf and re-configured X in order to force hardware auto detection:

dpkg-reconfigure xserver-xorg

I then logged out, in order to restart the GNOME display manager (GDM) and ... my computer hard-locked. No display, Caps Lock LED keeps blinking, no disk activity, no ping. Dead as a Dodo.

I turned the computer off, turned it on again and waited for GDM to come up. Same result - my laptop ended up quietly blinking its Caps Lock LED.

By now you've no doubt realized that this is the Bad part of my story. After the initial shock, I realized that my plan to go to sleep early that night was going to stay just that - a plan.

I first needed to get the laptop to boot at all. I cycled the laptop power and waited for the GRUB menu to come up. I then used the arrow keys to select the "Single User" kernel configuration and hit <Enter>. The boot sequence ended with a password prompt for the root user. I typed the password in and then realized that I had no idea what to do next.

I looked at ~/.xsession-errors, but it seemed to belong to an earlier (working) X session. I browsed /var/log/ for a relevant log file and found /var/log/Xorg.0.log - I hoped to see an error or warning message close to the end of the log file that would point me to the cause of the problem. No luck: the log file ended abruptly at some point, with no obvious problem indication.
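
For what it's worth, scanning the log by eye is not the only option. The X server tags warnings with (WW) and errors with (EE) at the start of a line, so a small filter can pull those out. This is only a sketch: the function name xorg_problems is made up for illustration, and the optional bracketed timestamp prefix is an assumption about newer log formats.

```shell
#!/bin/sh
# Hypothetical helper: print the last few warning/error lines of an X log,
# prefixed with their line numbers. Defaults to /var/log/Xorg.0.log.
xorg_problems() {
    log=${1:-/var/log/Xorg.0.log}
    # Match lines that begin with (EE) or (WW), optionally preceded by a
    # bracketed timestamp (an assumption about newer log formats).
    grep -nE '^(\[[^]]*\] )?\((EE|WW)\)' "$log" | tail -n 20
}
```

Running xorg_problems with no argument reads the default log. In a case like mine, where the log simply stops mid-way, it may well print nothing useful, which is a hint in itself.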

I then did something irrational:

invoke-rc.d gdm start

and to my surprise GDM came up, graphical display and all. WTF?!

I didn't know enough about the startup process to figure out the difference between "Single User" mode and the normal startup sequence. The only obvious difference was visual: a long while ago I added vga=791 to the default kernel command line in /boot/grub/menu.lst - this made the virtual terminals come up in 1024x768 resolution. I did not touch the single user command line, so it came up in the default low resolution (640x480 ?).

So I removed that extra command line parameter, ran update-grub, rebooted, and it "fixed" the problem: apart from the low resolution in the virtual terminal, X came up OK. A happy end?
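
Removing the parameter by hand in an editor is easy enough, but the same edit can be sketched as a shell function. The name strip_vga_option is hypothetical, and the sketch assumes that the option appears as a whitespace-separated vga=NNN token wherever it occurs in the file. It keeps a backup copy next to the original.

```shell
#!/bin/sh
# Hypothetical helper: remove every "vga=..." token from a GRUB menu file,
# leaving the original in <file>.bak.
strip_vga_option() {
    menu=$1                        # e.g. /boot/grub/menu.lst
    cp -p "$menu" "$menu.bak" || return 1
    # Drop the token together with the whitespace in front of it.
    sed 's/[[:space:]]\{1,\}vga=[^[:space:]]*//g' "$menu.bak" > "$menu"
}
```

On the real file this would be followed by update-grub, as described above. Trying it on a scratch copy of menu.lst first seems prudent.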

I guess that most users, at this point, would just settle for a low resolution console, and would go on with their lives, chalking this one up as just another Linux hardware incompatibility issue. I guess I would've done the same, if I didn't know for certain that this was a regression - it worked before, so there's no reason for it to be broken now! I wanted my high resolution console.

After browsing the bugs filed against the ATI display driver at the Debian Bug Tracking System I realized that I was pretty much on my own - my hardware is simply too old, and my setup (Debian GNU/Linux "testing" on a Compaq Presario 900 laptop, with an on-board ATI Radeon Mobility IGP320M U1 video adapter) is probably rather unique.

I was mentally ready to compromise. It seemed very likely that the problem is driver related, so I tried using the VESA display driver instead of the ATI display driver:

back up /etc/X11/xorg.conf

open the file for editing

find the line that starts with Driver in the section named "Device"

modify the string on the Driver line from whatever it is to "vesa"

save the file

restart X (e.g. by logging out and then hitting <Ctrl>-<Alt>-<Backspace>)
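
The editing steps can also be sketched as a shell function. In a typical xorg.conf the Driver line for the video card sits inside Section "Device", so the sketch only touches Driver lines in that section and leaves input devices alone. The name set_x_driver is made up for illustration, and the sketch assumes the usual layout with the driver name in double quotes.

```shell
#!/bin/sh
# Hypothetical helper: set the video driver in an xorg.conf-style file,
# keeping the previous version in <file>.bak.
set_x_driver() {
    conf=$1      # e.g. /etc/X11/xorg.conf
    driver=$2    # e.g. vesa
    cp -p "$conf" "$conf.bak" || return 1
    # Restrict the substitution to lines between Section "Device" and the
    # next EndSection, so that Section "InputDevice" is not affected.
    sed '/^[[:space:]]*Section[[:space:]]*"Device"/,/^[[:space:]]*EndSection/ s/^\([[:space:]]*Driver[[:space:]]*\)"[^"]*"/\1"'"$driver"'"/' "$conf.bak" > "$conf"
}
```

As root, set_x_driver /etc/X11/xorg.conf vesa would do the edit. X still has to be restarted afterwards, and copying the .bak file back undoes the change.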

I don't use any spiffy 3D stuff, so I figured I could live with a generic SVGA driver instead of the hardware specific driver. I reinstated the vga=791 kernel command line option, rebooted my box and it all seemed to work OK.

That is, until I tried playing a video file with mplayer - I hit 'f' to go fullscreen, and to my dismay the image was not stretched to fill the screen - instead it was centered, still at the same size, surrounded by a black frame that spanned the rest of the screen area. Apparently, hardware acceleration is used not just for spiffy 3D, but also for image scaling.

Video playback is more important to me than high resolution display in virtual terminals. But I just couldn't let it go. And this is where my story gets Ugly.

Friday, May 16, 2008

Like many other posts on this blog, this story begins with me upgrading something...

This time it's X - the windowing system for Linux and other Unix-like operating systems. I didn't know it at the time, but I was in for a rough ride. I decided to split this post into three parts, corresponding to the mental phases I went through.

Well, let's start off with the Good part.

In my live-HDD article, I described how to install Debian/testing on a USB hard drive. It turned out to be rather easy, with some rough edges that I needed to smooth. One of the problems was to get X to auto detect and configure the display adapter and the attached monitor. X, prior to version 7.3, can pull it off rather nicely, but you must invoke hardware detection manually.

The way I did it was to add the following command to /etc/init.d/bootmisc.sh:

dpkg-reconfigure -fnoninteractive xserver-xorg

This is no longer required with X.Org 7.3. You can get X to auto detect hardware when it starts by running the above command once, so that any reference to specific hardware is removed from the resulting X configuration file /etc/X11/xorg.conf. This is a one-shot deal - there's no need to do it again, and the changes to /etc/init.d/bootmisc.sh can be safely reverted.

I tried my live HDD on several machines, with different combinations of displays and display adapters, and I'm happy to report that it seems to work well.

Gone are the days of manual futzing with the X configuration file. After moving from Window$ to Linux, this was probably one of the most striking mis-features of X that I encountered. Good riddance.

Friday, May 9, 2008

I'm currently writing a tiny console application for Windows. It's written in C; I use the MinGW cross compiler to compile it, and Wine (the WINdows Emulator) to test it under Linux. I expect to have more to say about this application when it's ready, but till then let me just say that it's meant to read and write lots of files.

One problem that I've hit with Wine is that my program fails to access files with names that contain non-English characters. A quick search brought me to this message on the wine-users mailing list archive.

So here's how to make it work for Hebrew, with UTF-8 encoded file names on the disk:

Sunday, May 4, 2008

I'm a man of habits. For example, every morning, as soon as I get the chance, I go to my computer and check my local mailbox for email notifications from Bacula - it sends me a message for each of the three backup jobs that were run during the night. It's usually OK, but sometimes not, and then I have to deal with the problem.

Actually, I haven't had a problem for quite a while, until last Thursday. It's not that I got an error, I didn't get any message at all - my mailbox was empty.

My initial guess was that the Bacula director daemon was stopped for some reason, so that the backup jobs were never started. I promptly started the Bacula console program bconsole, and used the run command three times, to start each backup job.

And then it hit me: bconsole would've reported an error if it could not connect to the Bacula director daemon. I typed list jobs and sure 'nuff - last night's jobs were on the list, marked as successfully completed.

By this time the backup jobs that I manually launched earlier were also completed, and no - there were no new messages in my Inbox. Something was broken. It wasn't critical - after all, backup jobs ran to completion - but my habits were disrupted, and I didn't like it at all.

That evening I sat down to fix the issue, or at least figure out what went wrong. I suspected that Bacula's configuration was somehow broken, probably due to a recent upgrade. I fired up emacs, opened /etc/bacula/bacula-dir.conf and /etc/bacula/bacula-dir.conf.dist, and compared them with M-x ediff-buffers.

The file with the .dist extension is the one that would've been installed upon a fresh install of Bacula. This file is left behind after an upgrade in order to allow the system administrator to do exactly what I was doing: compare the current configuration with the pristine configuration, that's shipped with the package.
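
For those who don't live in emacs, the same comparison can be sketched with plain diff. The function name conf_drift is hypothetical. It assumes the pristine copy sits next to the live file with a .dist suffix, as it did here, unless a second argument says otherwise.

```shell
#!/bin/sh
# Hypothetical helper: show how a live config file differs from its
# pristine copy. Exit status is that of diff: 0 if identical, 1 if not.
conf_drift() {
    current=$1                     # e.g. /etc/bacula/bacula-dir.conf
    pristine=${2:-$1.dist}         # defaults to <current>.dist
    diff -u "$pristine" "$current"
}
```

Lines prefixed with - exist only in the pristine file, and lines prefixed with + exist only in the live one.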

Apart from the obviously different backup/restore job definitions, there was indeed a difference in the command line used to launch bsmtp - the utility that's shipped with Bacula and is used to send email notifications. Aha! Gotcha!

I "fixed" the command line, saved the file, restarted the director daemon

invoke-rc.d bacula-director restart

and launched a backup job using bconsole.

The job was completed with no errors, but I got no email notification. I read the log messages carefully and noticed that bsmtp reported a fatal error (it could not connect to localhost). So I copied the bsmtp command line from the configuration file and pasted it into a root console window. I got the same error message.

So now it looked like a problem either in bsmtp or exim4 (the mail server). I tried restarting the mail server:

So the culprit was exim4, not Bacula. I googled for the error message and hit Debian bug #476987. I read the thread of messages there, which mentioned several workarounds, till I got to the last message, which simply stated that the bug had been fixed in version 4.69-3. I used

apt-show-versions exim4

to find out that the version installed on my machine is 4.69-2.

I could've stayed with this version, getting along with no email notifications from Bacula, until the fix trickled in from Debian/sid. But sticking to my habits is a motivation too strong to ignore - I decided instead to install the fixed (unstable) version of exim4 (please consult my previous post about maintaining a mixed testing/unstable Debian system).

I used the following command to upgrade any already installed package with the string "exim4" in its name, to the version that's in the unstable repository:

aptitude install -t unstable "~nexim4 ~i"

I then restarted exim4, retried the backup job and finally got a notification email. Yay.

Fixing problems caused by recent upgrades is becoming another habit for me - it took maybe half an hour to sort this out, less than it took to write this blog post.

I was once told that pain is not habituating. Maybe I didn't get the point, but if you can't get used to pain then I guess this kind of activity can't be considered painful. But if this isn't pain then it must be fun, right?