Thursday, February 24, 2011

There are some tremendous ideas behind the ubiquitous Unix shell (um, that would be Bourne, bash, (d)ash or maybe ksh?). The problem is that a lot of these ideas are very, very dated. Bash is probably the best example of how to keep a Unix Bourne dialect alive. Ksh was beastly (tons of features), but I think bash has finally passed it. But is this a good thing?

As I start writing more complex scripts I begin to feel the age of Bourne. I have been using (d)ash (it's the Busybox shell and much smaller than bash -- GNU seems set on adding the kitchen sink to every tool). You can pretty much do general purpose scripting with Bash, but still with the legacy syntax of Bourne. You might as well go with Perl or Python (and their associated huge installation footprints).

Then there is rc (the Plan 9 shell). It starts with Bourne and "fixes" things rather than tacking stuff on around the edges. It is very minimalistic and has a certain elegance I haven't seen since Awk. Plan 9's toolbox minimalism was an attempt to get back to the origins of Unix (lots of small single-purpose tools). The famous anti-example of this is probably GNU ls. Look at the options, the many, many options.

Rc isn't actively supported much (Plan 9 has since faded -- if it ever shone brightly to begin with), but it has the feel of something well thought out.

You'll hear more from me about that in upcoming posts.
Time to shut up and code.

Monday, February 21, 2011

These past few posts have been ramblings to myself at the cusp of starting a new CFT (Copious Free Time) project. I am weighing an "elegant" path (Haskell) against an "Old Unix hacker" path (shell scripts).

While the Haskell approach is alluring, there is a lot of learning to do there and I am an "Old Unix hacker". I am very familiar with the benefits of functional programming and have found the past 3 months doing Haskell (some on my day job) a lot of fun.

But, I know I can get more accomplished sooner if I take a "Unix hacker" approach.

Now, for the meat of this post (and an oft-raised argument against using shell scripts in critical environments): Safety.

Or, more specifically, what about all of the points of unchecked failure in a shell script?
Doesn't this betray the notion of an embedded system?

Well, there is the dangerous situation of uncaught typos, but let's say we are real careful. How do we handle problems like:
1. A process in the pipeline dies unexpectedly.
2. The filesystem becomes 100% full.

Interestingly, while something like "dd if=$1 | transform | gzip >$2" looks like it can be full of the above problems, I could argue that you have this problem using any programming language/approach.

However, because it is so difficult to catch "exceptional" errors in the shell, it starts to make me wonder how I would handle this in a language that supports "exceptions".

This is where things start to unravel (for me). What do you do in that exception? How do you recover?
Let's look at some approaches:

1. Unix approach: Wrap the "dd" line in a script and have a monitor start it, capture and log stderr and restart it if necessary (but not too aggressively -- maybe at some point give up and shutdown the system).
2. Erlang approach: Interestingly similar to above.
3. Language w/ exceptions: Catch the error, close the files and.... um, restart?

In the Unix approach, the cleanup is mostly done for you. Good fault tolerance practice (as suggested by Erlang) is pretty much handled by variants of init (I believe that daemontools' supervise has been doing this well for years).

I am sure there are holes in my argument, but for my CFT, I am persisting all important data on disk (an event queue is central to my system). Every change (addition, execution, removal) of an event is an atomic disk transaction. If any process dies, it can be relaunched and pick up where it left off.
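The "atomic disk transaction" part is also cheap to get from the shell, because rename(2) within one filesystem is atomic. A sketch, assuming a hypothetical queue directory where each event is a single file (all names here are made up for illustration):

```shell
#!/bin/sh
# Atomically publish one event into the queue directory: write it to a
# hidden temp file first, then mv it into place. A reader scanning the
# queue never sees a half-written event.
queue=/var/spool/events
event_id=$(date +%s).$$

tmp=$(mktemp "$queue/.tmp.XXXXXX")
printf '%s\n' "$1" > "$tmp"       # write the event body
mv "$tmp" "$queue/$event_id"      # atomic publish
```

If the writer dies before the mv, all that is left behind is a stray temp file, not a corrupt event; the relaunched process picks up where it left off.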

For fault tolerant (embedded) systems I am not sure what I would do in an "exception" handler... outside of clean up and die.

If you can afford to run a (multi-tasking, memory managed) Linux kernel in your embedded system and Busybox is there, then the shell (ash in this case) becomes a potential basis for a system architecture.

Of course, this is not breaking news. But I think it gets lost when we start taking a "single programming language" view of system development (as advocated by almost every modern programming language community). If you are trying hard to figure out how to get your favorite programming language to do what the shell does, then maybe it isn't the right tool for the job.

Sure, the "shell" isn't elegant and is full of pitfalls and gotchas once you use it beyond a couple of lines. But when your shell script starts to grow, you too should consider looking elsewhere for help (i.e. commands beyond what is built into the shell).

An example: Don't get caught up in gawk/bash's ability to read from TCP sockets; leverage netcat (nc) instead.
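That is, instead of bash's `/dev/tcp/host/port` redirection trick, let nc own the socket and keep the script portable to ash/dash. A sketch (example.com:80 is just an illustration):

```shell
#!/bin/sh
# Fetch a page by piping an HTTP request through netcat. The shell
# script never touches a socket; nc does the networking.
printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' | nc example.com 80
```

The same shape works for talking to any line-oriented TCP service, and the nc stage can be swapped out or mocked without touching the rest of the pipeline.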

I'm building an embedded soft-real-time control system. It will handle sensor events and provide feedback to the user using voice synthesis.

I really want to use Haskell for this CFT project, but I can get something running so much quicker by shell scripting. There won't be a lot of sophisticated algorithms and I don't see scalability as a concern.

When it comes down to it, I find it harder and harder to do systems programming in a "programming language" vs something in a shell (with support from awk and friends). Whether it is C or Haskell, it starts to feel like (once again) re-inventing the wheel.

As an example (and it has nothing to do with this current CFT project), consider this problem: I want to transform 1024 byte chunks of a file and write the results as a compressed file. The transformation doesn't matter, but let's say the transformation is written in C (or Haskell for that matter) and takes 50-100 ms per 1024 byte chunk.

I want to do this task as fast as possible. I have (at least) 2 CPU cores to work with. Let's look at two approaches:

Approach A: Write a Haskell/C program to read 1024 bytes at a time, perform the translation, then the compression, and write the compressed result to an output file.

Okay, so I need to link in a decent gzip compression library and I use an appropriate "opt" parser to grab the input and output file. Done.

Approach B: dd if=$1 bs=1024 | translator | gzip > $2

This assumes that I write the same core "translator" code as above, so we can ignore that and focus on reading, compression and writing.
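For completeness, here is Approach B written out as a runnable script; "translator" is the hypothetical filter from the text, stubbed here with cat so the sketch runs as-is:

```shell
#!/bin/sh
# Approach B: read in 1024-byte chunks, translate, compress.
# Usage: approach_b.sh INPUT OUTPUT
translator() { cat; }   # replace with the real transformation

dd if="$1" bs=1024 2>/dev/null | translator | gzip > "$2"
```

All three stages are separate processes connected by pipes, which is where the free concurrency discussed below comes from.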

You can guess which will take less time to implement, but which is more efficient?

Well, my wild guess would be Approach B. Why? I already have a couple of things going for me. One is that I have automatic concurrency! While "dd" is just sitting there reading the disk, translator is running and gzip is also doing its thing. If I have 3 cores, then I have a good chance that each process can run in parallel (for at least a little while before they block). There is some cost in the piping, but that is something Linux/Unix is optimized to perform.

Given that, "dd" has a good chance of doing more efficient file input buffering than my single-threaded app in Approach A. The dd process has disk buffering plus pipe buffering working for it, so it may fetch (and dispatch) several 1024-byte chunks before it blocks on a full pipe. A similar (but reverse) buffering is happening with gzip too.

So, you then consider rewriting Approach A but using a concurrency module/library. Ugh. Let's not go there.

So, if I take a scripting approach, my "controlling" part of the system can be written using the Shell and I can optimize to Haskell (or C) as needed.

Monday, February 07, 2011

I have some code that lazily transforms a fairly large list of data (from an IO source) that must be searched sequentially. Since it is lazy, the list isn't fully transformed until the first search. Since, I presume, a rather large chain of thunks is constructed instead, this first search takes a really, really long time. (It would be faster, I surmised, to transform the list strictly as it was being built, rather than lazily upon the first search.)

I started playing with `seq` but couldn't quite get the strictness right -- the code represented some of my first attempts at Haskell. So, I decided to refactor the code (replace naive tail recursion with maps, filters, folds, etc). I figured at this point I would be able to see more clearly how to avoid the lazy list.

Surprisingly, this refactoring was enough for the compiler to "do the right thing" and it sped my application up significantly. What was the compiler doing here? Did it remove the laziness? Or did it just optimize the hell out of what I thought was a lazy-vs-strict problem?