"Linux Gazette...making Linux just a little more fun!"

Introduction to Shell Scripting

"When the only hammer you have is C++, the whole world looks
like a thumb." -- Keith Hodges

The Myth of Design; the Mystery of Color

At this point in the series, we're getting pretty close to what I consider
the upper limit of basic shell scripting; there are still a few areas I'd
like to cover, but most of the issues involved are getting rather, umm,
involved. A good example is the `tput' command that I'll be covering this
month: in order to really understand what's going on, as opposed to just
using it, you'd need to learn all about the "termcap/terminfo" controversy
(A.K.A. one of the main arguments in the "Why UNIX Sucks" debate) - a
deep, involved, ugly issue. (For a fairly decent and simple explanation,
see Hans de Goede's fixkeys.tgz, which contains a neat little "HOWTO";
for a more in-depth study, the Keyboard-and-Console-HOWTO is an awesome
reference on the subject.) I'll try to make sense despite the confusion,
but be warned...

Affunctionately Yours

The concept of functions is not a difficult one, but is certainly very
useful: they are simply blocks of code that you can execute under a
single label. Unlike a script, they do not spawn a new subshell but
execute within the current one. They can be used within a script, or
stand-alone.

Functions may also be loaded into the environment, and invoked just
like shell scripts; we'll talk about sourcing functions later on. For
those of you who use Midnight Commander, check out the "mc ()"
function described in their man page - it's a very useful one, and is
loaded from ".bashrc".

Important item: functions are created as "function pour_the_beer ()
{ ... }" or "pour_the_beer () { ... }" (the keyword is optional); they
are invoked as "pour_the_beer" (no parentheses). Also, be very careful
(as in, _do not_ unless you really mean it) about using an "exit"
statement in a function: a function runs in the current shell, so if
you've loaded it into your interactive shell, "exit" will throw you out
of that shell (i.e. log you out, if it's the "login" shell); inside a
script, it ends the whole script rather than just the function. Exiting
this way can produce some very ugly results, like a `hung' shell that
has to be killed from another VT (yep, I've experimented). The statement
that will terminate a function without killing the shell is "return".
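
As a minimal sketch of the points above (the function body, the
argument check and the messages are invented for illustration):

```shell
#!/bin/bash
# Define a function; in "bash", "function pour_the_beer () { ... }"
# would work just as well.
pour_the_beer () {
    if [ -z "$1" ]; then
        echo "pour_the_beer: which glass?" >&2
        return 1    # leaves the function only; "exit 1" here would end the shell
    fi
    echo "Pouring a beer into the $1"
}

pour_the_beer mug    # invoked by name, no parentheses
```

Calling it without an argument prints the complaint to standard error
and hands back a non-zero status, but the shell itself stays alive.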

Single, Free, and Easy

Everything we've discussed in this series so far has a common
underlying assumption: that the script you're writing is going to be
saved and re-used. For most scripts, that's what you'd want - but what
if you have a situation where you need the structure of a script, but
you're only going to use it once (i.e., don't need or want to create a
file)? The answer is - Just Do It:

Typing a `(' character tells "bash" that you'd like to spawn a
subshell and execute, within that subshell, the code that follows -
and this is what a shell script does. The ending character, `)',
obviously tells the subshell to 'close and execute'. For an equivalent
of a function (i.e., code executed within the current shell), the
delimiters are `{' and `}'.
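
A small sketch of the difference between the two kinds of delimiters
(the variable name and its values are made up for the demonstration):

```shell
drink=water

# Parentheses: the commands run in a subshell, so the change is lost.
( drink=beer; echo "inside ( ): $drink" )
echo "after  ( ): $drink"

# Braces: the commands run in the current shell, so the change sticks.
# Note the space after "{" and the ";" before "}".
{ drink=beer; echo "inside { }: $drink"; }
echo "after  { }: $drink"
```

The first pair of lines reports "beer" and then "water"; the second
pair reports "beer" both times.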

Of course, something like a simple loop or a single 'if' statement
doesn't even require that:

"bash" is smart enough to recognize a multi-part command of this type
- a handy sort of thing when you have more than a line's worth of
syntax to type (not an uncommon situation in a 'for' or a 'while'
statement). By the way, a cute thing happens when you hit the up-arrow
to repeat the last command: "bash" will reproduce everything as a
single line - with the appropriate semi-colons added. Clever, those
GNU people...
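
For instance, a loop typed straight at the prompt might look like this
(the word list is arbitrary; "bash" normally shows its `>' continuation
prompt at the start of each unfinished line):

```shell
for word in one two three
do
    echo "item: $word"
done
```

Recalled with the up-arrow, it should come back as something like
for word in one two three; do echo "item: $word"; done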

No "hash-bang" ("#!/bin/bash") is necessary for a one-time script, as
it would be at the start of a script file. You know that you're
executing this as a "bash" subshell (at least I _hope_ you're running
"bash" while writing and testing a "bash" script...), whereas with a
script file you can never be sure: the user's choice of shell is a
variable, so the "hash-bang" is necessary to make sure that the script
uses the correct interpreter.

The Best Laid Plans of Mice and Men

In order to write good shell scripts, you have to learn good
programming. Simply knowing the ins and outs of the commands that
"bash" will accept is far from all there is - the first step of problem
resolution is problem definition, and defining exactly what needs to
be done can be far more challenging than writing the actual script.

One of the first scripts I ever wrote, "bkgr" (a random background
selector for X), had a problem - I'd call it a "race condition", but
that means something different in Unix terminology - that took a long
time and a large number of rewrites to resolve. "bkgr" is executed as
part of my ".xinitrc":

OK, by the book - I background all the processes except the last one,
"icewm" (this way, the window manager keeps X "up", and exiting it
kills the server). Here was the problem: "bkgr" runs, and "paints" my
background image on the root window; fine, so far. Then, "icewm" runs
- and paints a greenish-gray background over it (as far as I've been
able to discover, there's no way to disable that other than hacking
the code).

What to do? I can't put "bkgr" after "icewm" - the WM has to be last.
How about a delay after "bkgr", say 3 seconds... oh, that won't work:
it would simply delay the "icewm" start by 3 seconds. OK, how about
this (in "bkgr"):
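
A minimal sketch of that idea, assuming the loop simply polled the
process list once a second (the function name and the interval are
illustrative, not the original listing):

```shell
# Naive version: wait until "icewm" shows up in the process list.
wait_for_icewm_naive () {
    while [ -z "$(ps ax | grep icewm)" ]
    do
        sleep 1
    done
}

# Intended use inside "bkgr", before painting the root window:
#   wait_for_icewm_naive
```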

That should work, since it'll delay the actual "root window painting"
until after "icewm" is up!

It didn't work, for three major reasons.

Reason #1: try the above "ps ax|grep" line, from your command line,
for any process that you have running; e.g., type

ps ax|grep init

Try it several times. What you will get, randomly, is either one or
two lines: just "init", or "init" and the "grep init" as well, where
"ps" manages to catch the line that you're currently executing!

Reason #2: "icewm" starts, takes a second or so to load, and then
paints the root window. Worse yet, that initial delay varies - when
you start X for the first time after booting, it takes significantly
longer than subsequent restarts. "So," you'd say, "make the delay in
the loop a bit longer!" That doesn't work either - I've got two
machines, an old laptop and a desktop, and the laptop is horribly slow
by comparison; you can't "size" a delay to one machine and have it
work on both... and in my not-so-humble opinion, a script should be
universal - you shouldn't have to "adjust" it for a given machine. At
the very least, that kind of tuning should be minimized, and
preferably eliminated completely.

One of the things that also caused trouble at this point is that some
of my pics are pretty large - e.g., my photos from the Kennedy Space
Center - and take several seconds to load. The overall effect was to
allow large pics to work with "bkgr", whereas the smaller ones got
overpainted - and trying to stretch the delay resulted in a
significant built-in slowdown in the X startup process, an untenable
situation.

Reason #3: "bkgr" was supposed to be a random background selector as
well as a startup background selector - meaning that if I didn't like
the original background, I'd just run it again to get another one. A
built-in delay any longer than a second or so, given that a pic takes
time to paint anyway, was not acceptable.

What a mess. What was needed was a conditional delay that would keep
running as long as "icewm" wasn't up, then a fixed delay that would
cover the interval between the "icewm" startup and the "root window
painting". The first thing I tried was creating a reliable `detector'
for "icewm":
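
A sketch of such a detector, assuming a one-second polling interval
(the function name is invented; `$X' is used as in the description
that follows):

```shell
wait_for_icewm () {
    X="$(ps ax)"        # snapshot taken while no "grep" is running
    # Search the saved snapshot, not the live process list.
    while ! printf '%s\n' "$X" | grep -q icewm
    do
        sleep 1
        X="$(ps ax)"    # refresh the snapshot and test again
    done
}

# Intended use inside "bkgr":
#   wait_for_icewm
```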

'$X' gets set to the value of "$(ps ax)", a long string listing all
the running processes which we check for the presence of "icewm" as
the loop condition. The thing that makes all the difference here is
that "ps ax" and "grep" are not running at the same time: one runs
inside (and just before) the loop, the other is done as part of the
loop test (a nifty little hack, and well worth remembering). This
registers a count of only one "icewm" if it is running, and none if it
is not. Unfortunately, due to the finicky timing - specifically the
difference in the delays between an initial X startup and repeated
ones - this wasn't quite good enough. Lots of experimentation later,
here's a version that works:
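
What follows is a sketch of that approach rather than the original
listing: the function name is invented, the image path is whatever
tiny picture you have handy, and the "-root" and "-quit" options are
assumed to be how "xv" paints the root window and then exits.

```shell
# Wait until X will accept drawing requests, then pause briefly.
# $1: path to a tiny (e.g. 1x1-pixel) image of your choosing.
wait_for_x () {
    delay=0
    until xv -root -quit "$1" 2>/dev/null
    do
        delay=3       # we had to loop, so the window manager is still starting
        sleep 1
    done
    sleep "$delay"    # 0 if X was already up, 3 otherwise
}

# Intended use inside "bkgr", before painting the real background:
#   wait_for_x /path/to/1x1.png
```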

What I'm doing here is loading a 1x1-pixel image and checking to see
if "xv" has managed to do so successfully; if it has not, I continue
looping. Once it has - and this only means that X has reached a point
where it will accept those directives from a program - I stick in a 3
second delay (but only if we've done the loop; if "icewm" is already
up, no delay is necessary or wanted). This seems to work very well no
matter what the "startup count" is. Running it this way, I have not
had a single image "overpainted", or a delay of longer than a second
or so. I was a bit concerned about the effect of all those "xv"s
running one after another, but timing the X startup with and without
"bkgr" put that to rest: I found no measurable difference (as a
guess, when "xv" exits with an error code it probably doesn't take
much in the way of resources).

Note that the resulting script is only slightly longer than the
original - what took all this time was not writing some huge, complex
magical fix but understanding the problem and defining the solution...
even though it was a strange one.

There are a number of programming errors to watch out for: "race
conditions" (a security concern, not just a time conflict), the
`banana problem', the `fencepost/Obi-Wan error'... (Yes, they do have
interesting names; a story behind each one.) Reading up on a bit of
programming theory would benefit anyone who's learning to write shell
scripts; if nothing else, you won't be repeating someone else's
mistakes. My favorite reference is an ancient "C" manual, long out of
print, but there are many fine reference texts available on the net;
take a peek. "Canned" solutions for standard programming errors do
exist, tend to be language-independent, and are very good things to
have in your mental toolbox.

Coloring Fun with Dick and Jane

One of the things that I used to do, way back when in the days of BBSs
and the ASCII art that went with them, is create flashy opening
screens that moved and bleeped and blinked and did all sorts of things
- without any graphics programming or anything more complicated than
those ASCII codes and ANSI escape sequences (they could get
complicated enough, thank you very much), since all of this ran on
pure text terminals. Linux, thanks to the absolutely stunning results
of work done by Jan Hubicka and his friends (if you have not seen the
"bb" demo of "aalib", you're missing out on a serious acid trip. As
far as I know, the authorities have not yet caught on, and it's still
legal), has far outstripped everything even the fanciest ASCII artist
could come up with back then ("Quake" and fractal generators on
text-only terminals, as two examples).

What does this have to do with us, since we're not doing any
"aalib"-based programming? Well, there are times when you want to
create a nice-looking menu, say one you'll be using every day - and if
you're working with text, you'll need some specialized tools:

1) Cursor manipulation. The ability to position it is a must; being
able to turn it on and off, and saving and restoring the position are
nice to have.

If you copy and paste the above into a file and run it, you'll realize
why the insurance business is considered deadly dull. Erm, well, one
of the reasons, I guess. Bo-o-ring (apologies to one of my
former employers, but it's true...). Not that there's a tremendous
amount of excitement to be had out of a text menu - but surely there's
something we can do to make things just a tad brighter!

This is NOT, by any means, The Greatest Menu Ever Written - but it
gives you an idea of basic layout and color capabilities. Note that
the colors may not work exactly right in your xterm, depending on your
hardware and your "terminfo" version - I did this as a quick slapdash
job to illustrate the capabilities of "tput" and "echo -e". These
things can be made portable - "tput" variables are common to almost
everything, and color values can be set based on the value of `$TERM'
- but this script falls short of that. These codes, by the way, are
basically the same for DOS, Linux, etc., text terminals - they're
dependent on hardware/firmware rather than the software we're running.
Xterms, as always, are a different breed...

So, what's this "tput" and "echo -e" stuff? Well, in order to `speak'
directly to our terminal, i.e., give it commands that will be used to
modify the terminal characteristics, we need a method of sending
control codes. The difference between these two methods is that while
"echo -e" accepts "raw" escape codes (like '\E[H\E[2J' - same thing as
^[[H^[[2J, where `^[' is the escape character), "tput" calls them as
`capabilities' ("tput clear" does the same thing as "echo -e" with the
above code) and is (theoretically) term-independent (it uses the codes
in the terminfo database for the current term-type). The problem with
"tput" is that most of the codes for it are as impenetrable as the
escape codes that they replace: things like `civis' ("make cursor
invisible"), `cup' ("move cursor to row, column"), and `smso' ("start
standout mode") are just as
bad as memorizing the codes themselves! Worse yet, I've never found a
reference that lists them all... well, just remember that the two
methods are basically interchangeable, and you'll be able to use
whatever is available. The "infocmp" command will list the
capabilities and their equivalent codes for a given terminal type;
when run without any parameters, it returns the set for the current
term-type.
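
To show the two methods side by side, here is a small sketch (the
function names are invented) that asks "tput" first and falls back to
the raw ANSI codes if "tput" fails. It uses "printf" with `\033', the
same escape character that "echo -e" writes as `\E':

```shell
# Clear the screen: ask terminfo first, fall back to the raw codes.
clear_screen () {
    tput clear 2>/dev/null || printf '\033[H\033[2J'
}

# Move the cursor. "tput cup" takes ROW then COLUMN, counted from 0;
# the raw sequence counts from 1, hence the "+ 1".
move_to () {
    tput cup "$1" "$2" 2>/dev/null ||
        printf '\033[%d;%dH' "$(($1 + 1))" "$(($2 + 1))"
}

clear_screen
move_to 4 9
echo "Hello from row 5, column 10"
```

Run "infocmp" to see what your own term-type uses for `clear' and
`cup'; the fallback codes here are only the common ANSI ones.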

Colors and attributes for an ISO6429 (ANSI-compliant) terminal, i.e.,
a typical text terminal, can be found in the "ls" man page, in the
"DISPLAY COLORIZATION" section; xterms, on the other hand, vary so
much in their interpretation of exactly what a color code means, that
you basically have to "try it and see":

This little script will run through the gamut of colors of which your
termtype is capable. Just remember the number combos that appeal to
you, and use them in your "echo -e '\E[<bg>;<fg>m'" statements.

Note that the positions of the numbers within the statement don't
matter; also note that some combinations will make your text into
unreadable gibberish ("12" seems to do that on most xterms). Don't let
it bother you; just type "reset" or "tput sgr0" and hit "Enter".
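
A minimal sketch of a script along those lines, assuming the usual
ANSI numbering (30-37 for foreground colors, 40-47 for background
colors, 0 to reset); the function name is invented:

```shell
# Print every background/foreground pair, each labeled with its codes.
color_table () {
    for bg in 40 41 42 43 44 45 46 47
    do
        for fg in 30 31 32 33 34 35 36 37
        do
            # \033 is the escape character ("\E" to "echo -e")
            printf '\033[%s;%sm %s;%s \033[0m' "$bg" "$fg" "$bg" "$fg"
        done
        printf '\n'    # one row per background color
    done
}

color_table
```

Each cell shows the "<bg>;<fg>" pair that produced it, so you can copy
the numbers you like straight into your own statements.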

Wrapping It Up

Hmm, I seem to have made it through all of the above without too much
pain or suffering; amazing. :) Yes, some of the areas of Linux still
have a ways to go... but that's one of the really exciting things
about it: they are changing and going places. Given the amazing
diversity of projects people are working on, I wouldn't be surprised
to see someone come up with an elegant solution to the color
code/attribute mess.

Next month, we'll cover things like sourcing functions (pretty
exciting stuff - reusable code!), and some really nifty goodies like
"eval" and "trap". Until then -

Happy Linuxing to all!

Linux Quote of the Month

"The words `community' and `communication' have the same root.
Wherever you put a communications network, you put a community
as well. And whenever you take away that network - confiscate it,
outlaw it, crash it, raise its price beyond affordability - then
you hurt that community.

Communities will fight to defend themselves. People will fight
harder and more bitterly to defend their communities, than they
will fight to defend their own individual selves." -- Bruce Sterling, "Hacker Crackdown"