
Work the Shell - Web Server Tricks with $RANDOM

I just migrated onto a newer, bigger server (read that as “more
expensive”, of course, but because my traffic's justifying it, I'm
good with the change). To make matters more interesting, I also just bought a
new laptop (a MacBook Pro), and between the two migrations, I've been looking
through a lot of old directories and bumping into all sorts of scripts I've
written in the past few years.

The one I thought would be interesting to explore here is one
I wrote for a pal who was involved in a charity and wanted a way to have a
single URL bounce people 50/50 to one of two different Web pages—a sort of mini-load balancer, though his application wasn't quite
the same.
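The finished idea can be sketched as a tiny CGI script. This is only an illustration of the concept, not my pal's actual script, and both URLs here are made up:

```shell
#!/bin/bash
# Hypothetical CGI sketch of the 50/50 bounce; both URLs are invented.
if [ $(( RANDOM % 2 )) -eq 0 ]; then
  dest="http://example.com/page-one.html"
else
  dest="http://example.com/page-two.html"
fi
# A Location header tells the Web server to issue a redirect.
printf 'Location: %s\r\n\r\n' "$dest"
```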

The core piece of this is the $RANDOM shell variable that's actually kind of
magical—each time you reference it, you'll find it's different, even
though you aren't actually assigning a new value to it. For example:
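A quick demonstration at the command line (the specific numbers will, of course, differ every time you try it):

```shell
# Three references to the same variable, three different values:
echo $RANDOM $RANDOM $RANDOM
```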

This violates the core design principles of the shell and even the very definition of a variable (which is supposed to be predictable: if you assign it the value 37, it should still have that value 200 lines and 17 references later). A few other variables change value without you explicitly assigning anything, such as $PWD, but because that holds the present working directory, it's logical that its value changes as you move around the filesystem.

The RANDOM variable, however, is in a category of its own and makes it super easy to add some pseudo-randomness to your scripts and user interaction. (Whether it's truly random is a far more complicated, mind-numbingly complex issue. If you're interested, try Googling “determining the randomness of random numbers” to jump down that particular rabbit hole.)

In the Bourne Again Shell (bash), RANDOM values fall within the range 0..32,767 (the largest 16-bit signed integer). To chop that down and make it useful, you can simply take the remainder of dividing it by the maximum value you seek.

In other words, if you want a random number between 1..10, for example, use the
% “remainder” operator with a call to expr:
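A minimal sketch: the remainder of dividing by 10 gives 0..9, and adding 1 shifts that to 1..10.

```shell
# RANDOM % 10 yields 0..9; the + 1 shifts the range to 1..10.
value=$(expr $RANDOM % 10 + 1)
echo $value
```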

This brings us to the mini-load-balancer itself, which relies on the ruptime command to report the status and load of the machines on the local network. The first step is to screen out the hosts that aren't actually up at the present moment, then grab the first field (the output is sorted by how busy each system is at the current moment, so the first line names the least-busy host).

One approach to this could be to call ruptime every time a request
comes in and just grab the first value. This can be done like so:

$ ruptime -rl | grep -v down | head -1 | cut -d' ' -f1
host2

The trouble is that systems report uptime information only approximately once a
minute, so if you call ruptime dozens or hundreds of times per second, every
request during that minute goes to the same “least-busy” host, and it ends up
swamped. If you get a lot of traffic, that's not going to be a manageable
solution.

Here's where we could have our friend $RANDOM step back into the
picture. Instead of always simply picking the machine with the lowest load
average, let's randomly choose one of the three least-busy systems.
The core snippet would look like this:
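One way the core snippet might look, assuming ruptime is available. For illustration, sample ruptime -rl output is canned here with made-up hostnames; in a real script, you'd run ruptime itself in the pipeline:

```shell
# Canned stand-in for `ruptime -rl` output (hostnames invented),
# already sorted least-busy first:
ruptime_output='host2   up  1+02:10,  4 users,  load 0.10, 0.12, 0.08
host1   up  3+11:52,  2 users,  load 0.25, 0.31, 0.28
host3   up  0+18:45,  9 users,  load 0.90, 0.85, 0.80'

pick=$(( ( RANDOM % 3 ) + 1 ))       # a random line number: 1, 2 or 3
host=$(printf '%s\n' "$ruptime_output" | grep -v down | head -3 |
       sed -n "${pick}p" | cut -d' ' -f1)
echo "$host"
```

In the live version, the printf would simply be replaced by ruptime -rl.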

With a bit more code, you could bias it so that, say, 50% of the time it would
pick
the least-busy system, 33% of the time it would pick the second-least-busy system,
and 17% of the time it would pick the third-least-busy system. As time passed
and as the load moved around, these systems would keep changing, and you'd
achieve a crude but effective load-balancing system.
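That bias could be sketched like this: roll a number in 0..99, then bucket it into 50/33/17 ranges to choose which line of the ruptime output to use.

```shell
# Hypothetical weighting: 50% first host, 33% second, 17% third.
roll=$(( RANDOM % 100 ))
if   [ "$roll" -lt 50 ]; then line=1   # rolls 0..49  -> 50%
elif [ "$roll" -lt 83 ]; then line=2   # rolls 50..82 -> 33%
else                          line=3   # rolls 83..99 -> 17%
fi
echo "$line"
```

The resulting $line would then feed the same sed -n "${line}p" selection against the ruptime output.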

Knowing how easily you can select one of a number of possible paths randomly in
a shell script, what else can you imagine that would be helpful or just fun?

Dave Taylor has been involved with UNIX since he first logged in to the
on-line network in 1980. That means that, yes, he's coming up to the
30-year mark now. You can find him just about everywhere on-line, but start
here: www.DaveTaylorOnline.com. In addition to all
his other projects, Dave is now a film critic. You can
read his reviews at www.DaveOnFilm.com.