The Indian restaurant opposite my house, which was called The Natraj, has reinvented itself as a trendy bar, called "The Buddha Lounge". This is ironic on multiple levels.

Firstly, the Five Precepts of Buddhism forbid the consumption of alcohol or intoxicating substances. Secondly, the stricter Eight Precepts encourage followers to abstain from music and dancing, and also from all sexual activity (and the main purpose of this type of bar is basically to find willing sexual partners). Finally, followers also refrain from "luxurious places for sitting or sleeping", so even the "lounge" part is out.

I was reading the Wikipedia article about Ken Silverman's PNGOUT, which is a program for creating optimised versions of PNG images. However, it was the screenshot in that article that intrigued me the most. Upon further investigation, it seems that a group of Wikipedia users have been running a minor contest amongst themselves to create the most optimised version possible of an image I originally uploaded three years ago.

Version 1.2.0 of my C Algorithms library is up. The biggest changes in this release are the improvements to the test suite. I've written a bit about the test process that I've been using for improving the library.

Learning about coverage tools has been an interesting process. I liken writing tests without using coverage analysis to trying to optimise code without doing any profiling. With optimisation, it's easy to pick something that you think is a bottleneck and waste lots of time optimising it; in the same way, I've found that it's possible to write tests that you think are exercising the code in a satisfactory way, but actually aren't. Just as profiling shows where the time really goes, coverage analysis shows exactly which code the tests really run. In the course of analysing the library, I found a bug that should have shown up in the tests, but didn't, because the tests weren't exercising all of the code as I assumed they were.

Testing how code behaves in failure conditions is as important as testing how it behaves normally, so I wrote some code that uses #define macros to wrap the standard C allocation functions and allow the tests to simulate memory allocation failures. Coverage analysis is helpful here, too, as it shows whether the failure-handling paths are actually being run.

All in all, I'm not entirely sure why I'm writing a data structures and algorithms library, considering that all of these things have already been implemented hundreds of times over by different people. I originally wrote the library to remove the dependency of Irmo on GLib. Since then it's taken on a life and direction of its own, probably due to my own slightly obsessive nature. I think I just like the process of crafting something to the highest quality I possibly can.

(Also: Open source software with a test process? World coming to an end!)

We're rapidly reaching (or have reached?) the point where it's standard to have at least 4 gigabytes of RAM in desktop PCs. This presents an interesting dilemma, because most people run 32 bit operating systems, and a 32 bit address can't refer to more than 4GB of RAM. The ideal alternative is to move to 64 bits; all modern desktop CPUs support x86-64. Unfortunately, it requires a massive porting effort to get everything working on x86-64 (drivers from third party vendors are likely to be the biggest problem), so we're not quite there yet.

In the meantime, there's a useful feature called PAE, which widens physical addresses to 36 bits and so allows up to 64GB to be addressed by a 32 bit OS. I was surprised to see, however, that neither Windows XP nor even Vista uses it to go beyond 4GB, although the server-based versions of Windows do!

The cynic in me wondered if this was a deliberate attempt by Microsoft to stop people from using the normal desktop version of Windows for running big servers, but this seemed a bit too much, even for them. But the Wikipedia article has the actual reason: "desktop versions of Windows (Windows XP, Windows Vista) limit physical address space to 4 GB for driver compatibility reasons".

So poor Microsoft appear to be stuck between a rock and a hard place. They cannot enable PAE, which is, in a sense, a backwards compatibility feature, because doing so would break driver backwards compatibility. This would appear to be an example of a situation where the Linux-style hatred of stable APIs wins over maintaining backwards compatibility. One part of the problem is that Microsoft relies on third-party vendors for drivers. They can't just update their platform and the drivers with it, because they don't have any control over them.

The Gnome 3.0 announcement is a win for sanity and demonstrates the maturity of the people running the project. There's an elegance about a project that aims to be boring-but-functional, rather than exciting-and-unstable. Rather ironic for a project that was once described as a "cascade of attention-deficit teenagers".

I think that almost all smart people have realised that scripting using the Bourne shell is a bad idea if the script in question is more complicated than simply automating what can be typed by hand. I mostly avoid writing shell scripts, preferring to write scripts in either Ruby or Python. However, the ability to write shell scripts is still a useful skill; there are certain situations where writing a shell script really is the easier thing to do - mostly situations that revolve around executing commands, or where shell scripts are the "standard" thing to use - init.d scripts, for example. It's also useful to be able to debug shell scripts that other people have written. To this end, I recently set about honing my shell scripting skills.

As practice, I wrote a script called branch_helper, which automates some of the drudgery of managing Subversion branches. The main aim of this was to make maintenance of Strawberry Doom easier, as it is developed as a branch within the Chocolate Doom repository and needs periodic updates.

The result is a script that is probably as complicated a shell script as I am ever going to write; certainly the most complicated that I am ever going to want to write. The process did, however, give me deeper insight into why shell scripts, as a "programming language", are quite so unscalable and only suitable for very simple scripts.

One of the most fundamental drawbacks of shell scripts is the lack of a proper list construct. Almost all programming languages give you arrays of some form or other; the closest that you can get with shell scripts is "a string containing a list of items separated by spaces". While this sort-of suffices for some situations, the most obvious drawback is that you can't put items in the "list" that contain spaces themselves. The result of all this is that almost all semi-complex shell scripts break if you try to use them with files/directories that contain a space. To demonstrate this, try running a configure script in Cygwin from a directory containing a space (e.g. "Documents and Settings").

Bash has arrays as an extension, but, obviously, that can't be relied on in other Bourne shells. However, the standard Bourne shell does have one type of list - namely, the list of arguments to a function. It's sometimes possible to make use of this if you structure the script in the right way.
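A minimal sketch of the argument-list trick, assuming nothing beyond standard Bourne shell features. The function name build_list and the file names are made up for illustration. Inside a function, "set --" replaces the function's own argument list, so it can be used as a small list that keeps items containing spaces intact:

```shell
# Hypothetical example: use a function's argument list as a list.
build_list() {
    # Start the list with two items; the second contains a space.
    set -- "notes.txt" "My Documents/report.txt"
    # Append an item by rebuilding the list from its current contents.
    set -- "$@" "third item"
    # "$@" expands to one word per item, spaces preserved.
    for item in "$@"; do
        echo "[$item]"
    done
}

list_output=`build_list`
echo "$list_output"
```

Each item comes out on its own line in brackets, including the ones with spaces. The catch is that there is only one such list per function, which is why the script has to be structured around it.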

Semi-related to the first problem is the problem of how variables are expanded. command "$arg" and command $arg have different meanings, for example, as they expand into either one argument or (potentially) several arguments, respectively. One useful thing to do when trying to write "correct" shell scripts is to continually ask yourself - "what would happen if this variable contained a space?"
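To make the difference concrete, here is a small sketch; count_args is a hypothetical helper that just reports how many arguments it was given, and the default word-splitting settings (IFS) are assumed:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo $#
}

arg="Documents and Settings"

quoted=`count_args "$arg"`    # expands to a single argument
unquoted=`count_args $arg`    # split on spaces into three arguments

echo "quoted: $quoted, unquoted: $unquoted"
# prints "quoted: 1, unquoted: 3"
```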

The inability to easily "return" useful information from a function is one annoying drawback. Every function acts as a "mini-subprogram", which is rather aesthetically pleasing in a way, and actually incredibly useful in some situations. However, it suffers from the fact that the only result that programs in Unix can return is a single 8-bit value (exit code).

The result is that the typical way to pass a value back from a function to its caller is to do something slightly hideous like this: result=`myfunction "$arg1" "$arg2"`

You can also get all kinds of insidious "gotchas" from the fact that the shell will sometimes fork. For example, the following give different output:

(In the latter, the loop runs in a separate process, so the "result" variable is set in that separate process, and the value lost when the loop finishes).
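The original pair of examples isn't reproduced here, so the following is only a sketch of the kind of difference being described, assuming a loop fed by a redirect versus the same loop fed by a pipe. The variable names are made up. The behaviour of the second loop is shell-dependent: bash and dash run it in a subshell, while some other shells (ksh, for example) run the last stage of a pipeline in the current shell.

```shell
# First form: input comes from a here-document, so the loop runs in the
# current shell and the assignment survives.
result_redirect=""
while read -r line; do
    result_redirect="$line"
done <<EOF
one
two
EOF
echo "redirect: '$result_redirect'"

# Second form: input comes from a pipe. In bash and dash the loop runs in a
# separate process, so the assignment is lost when the loop finishes.
result_pipe=""
printf 'one\ntwo\n' | while read -r line; do
    result_pipe="$line"
done
echo "pipe: '$result_pipe'"
```

Under bash or dash the first echo shows 'two' and the second shows an empty string.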

Handling error situations can also be problematic; this is really another manifestation of the previous problem. The simple requirement of "check if a program runs correctly; if it fails, exit the script with an error" can actually be quite tricky to achieve. As the shell can fork to run different parts of the script (especially if you use the backticks trick to pass back values from functions), the "exit" command does different things in different places. If you're in a main script, "exit" will exit the script, but if you're in a section of code that has been forked off into a separate process, it only exits from that other process.
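A short sketch of the problem, using a made-up function called get_value. Because it is called inside backticks, its "exit" only ends the forked-off process, and the main script carries on with whatever output had been produced so far:

```shell
# Hypothetical function that prints something and then tries to bail out.
get_value() {
    echo "partial output"
    exit 3    # only ends the process created for the backticks
}

status=0
value=`get_value` || status=$?

# The main script is still running at this point.
echo "still running: value='$value' status=$status"
# prints "still running: value='partial output' status=3"
```

The exit code does make it back to the caller as the status of the assignment, which is what makes it possible to chain the failure upwards by hand.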

I wrote a function called "error" to exit with an error, and used it to check that functions run correctly and, if they don't, chain back up to the top and exit properly. So in the end, calling a function looks like this:

result=`myfunction "$arg1" "$arg2"` || error
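The actual "error" function isn't shown above, so this is only a guess at its shape: a sketch that prints an optional message and exits non-zero. The functions find_branch and main, and the "branches/" path, are hypothetical and exist only to show the failure chaining back up through a backtick call.

```shell
# Assumed shape of "error": print a message, if given, and exit non-zero.
error() {
    if [ $# -gt 0 ]; then
        echo "error: $*" >&2
    fi
    exit 1
}

# Hypothetical function that fails when given an empty name.
find_branch() {
    [ -n "$1" ] || error "no branch name given"
    echo "branches/$1"
}

main() {
    # If find_branch exits inside the backticks, "|| error" exits here too.
    path=`find_branch "$1"` || error
    echo "using $path"
}

# main is run in subshells here only so that this demonstration keeps going
# after the failing case.
good=`main strawberry`
bad_status=0
( main "" ) 2>/dev/null || bad_status=$?

echo "good: $good, bad status: $bad_status"
# prints "good: using branches/strawberry, bad status: 1"
```

The message is printed at the point of failure, and each level above only needs a bare "|| error" to pass the failure along.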

Portability issues. This isn't so much of a problem nowadays because you can pretty much rely on bash being installed on most systems and take advantage of its extensions. However, if you really do want to write a proper "portable" Bourne shell script, there are some things that catch you out. For example, bash lets you define functions using "function myfunction() {" but this isn't supported elsewhere. Similarly, when doing comparisons, bash lets you do e.g. "[ "$value" == "shoes" ]" in addition to the standard syntax, which is "[ "$value" = "shoes" ]".

Some very old systems have quirky interpreters that mean you have to do tricks like "[ "x$value" = "xshoes" ]", because, without the "x", if "value" was empty, that would expand to "[ = shoes ]", which is a syntax error.
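Putting the portable forms together, a small sketch; is_shoes is a made-up function name. It uses the plain "name() {" definition, the single "=" comparison and the "x" prefix, so the test always sees a non-empty word that can't be mistaken for an operator:

```shell
# Portable definition: no "function" keyword.
is_shoes() {
    # The "x" prefix guards against empty values and values like "-n".
    [ "x$1" = "xshoes" ]
}

for value in "shoes" "" "-n"; do
    if is_shoes "$value"; then
        echo "'$value' matches"
    else
        echo "'$value' does not match"
    fi
done
```

Only the first value matches; the empty string and "-n" are compared as ordinary text.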

All in all, some rather nasty quirks that rapidly turn into gigantic annoyances when you try to do anything complicated. However, that's not to say that shell scripting is completely without merits.

Automake helpfully provides the ability to run tests with "make check" - you can give it a list of test programs to run, and it will go through each in turn and check that they exit with a success status (0). However, when running test cases for stuff written in C, it's nice to run them in Valgrind - that way, you can pick up on any memory leaks or other subtle memory errors that you wouldn't otherwise notice.

Automake allows you to set a variable called "TESTS_ENVIRONMENT" that is prefixed to all your test commands, so you can run your tests in valgrind with something like:

make check TESTS_ENVIRONMENT=valgrind

Unfortunately, this isn't perfect. First of all, it's rather tedious having to type that every time you want to run some tests, and secondly, it doesn't automatically fail in error cases.

So I wrote some automake magic to make it all a bit more streamlined. Firstly, a --enable-valgrind flag to configure, to run tests with valgrind. It's then a simple matter of tweaking Makefile.am to set TESTS_ENVIRONMENT when we have valgrind enabled. Finally, a short wrapper script for valgrind to fail the test on any valgrind error output. I run with the -q (quiet) option to hide the normal valgrind blurb.
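The wrapper script itself isn't shown here, so the following is a sketch of one way it could work rather than the actual script. It assumes valgrind's -q and --log-file options, and treats anything written to the log as a failure. The function name run_under_valgrind and the overridable VALGRIND variable are made up for illustration.

```shell
# Assumption: VALGRIND can be overridden, which makes the sketch easy to
# try out on a machine without valgrind installed.
VALGRIND="${VALGRIND:-valgrind}"

# Hypothetical helper: run a test program under valgrind and fail if
# either the program fails or valgrind reports anything.
run_under_valgrind() {
    log="${TMPDIR:-/tmp}/valgrind-wrapper.$$"
    status=0
    # With -q, valgrind only writes to the log when it finds a problem.
    "$VALGRIND" -q --log-file="$log" "$@" || status=$?
    if [ -s "$log" ]; then
        # Show the report and force a failure status.
        cat "$log" >&2
        status=1
    fi
    rm -f "$log"
    return $status
}

# In a real wrapper script the last line would be something like:
#   run_under_valgrind "$@"
```

Makefile.am would then point TESTS_ENVIRONMENT at the wrapper instead of at valgrind directly.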

One thing that is important is to ensure that the tests are real executables and not magic libtool wrapper scripts (automake does this if you build against a .la file). Valgrind gets confused otherwise.

All in all, fairly straightforward. I guess autotools isn't always such a pain after all.

His lifetime spans the entirety of World War II, the founding of the United Nations, the entirety of the cold war including the construction and demolition of the Berlin wall, the first man in space and the space race that followed, the American civil rights movement and all rock music ever made.