Building the Perfect Box: How To Design Your Linux Workstation

This article is a guide to building capable Linux workstations from cheap generic PC hardware.

Most of the good things about Linux flow
from the fact that it makes a full-featured Unix accessible on
inexpensive hardware. Accordingly, there's a huge amount of
documentation and folk knowledge in the Linux community about how
to get people who already have cheap hardware to use Linux on it.
Up to now there hasn't been much advice available on how to acquire
cheap hardware that is well-matched to Linux, for someone who
already knows Linux.

At today's prices, it's possible to put together a terrific
personal Unix platform for less than $2,000 US. If you're prepared
to go mail-order, shop carefully and make a few minor tradeoffs,
you can do it for $1,500 or even less. But beware. If you buy as
though for a DOS/Windows box, you won't get the best value or
performance. Linux works its hardware harder than DOS/Windows does, and
configurations that are marginal under DOS/Windows can cause
problems under Linux.

In this article, we'll develop a recipe for a cheap but
capable Linux workstation. While developing it, we'll discuss the
recipe choices in some detail, and see how to avoid common pitfalls
that can cause you grief.

We are going to stick to Intel hardware in this article.
Alphas are fast and have that wonderful 64-bit architecture, and
SPARCs too have earned their fans. However, I think PC hardware is
still overall the most cost-effective—cheapest to buy, easiest to
get serviced and best-tested with Linux. And, given the relative
sizes of the respective markets, PC hardware seems likely to hold
its lead for years yet.

For more detail on this subject, organized in a reference
rather than narrative format, surf to my PC-Clone UNIX Hardware
Buyer's Guide at
http://www.ccil.org/~esr/clone-hw-guide/contents.html. I've been
maintaining this document and its FAQ ancestor for longer than
Linux has existed, and have been running Unix on PC hardware since
shortly after it first became possible in the late 1980s.

What To Optimize

Most people think of the processor as the most important
choice in specifying any kind of personal-computer system. Our
first lesson in building Linux boxes is this: for Linux, the
processor type is nearly a red herring. It's far more important to
specify a capable system bus and disk I/O subsystem.

One important reason for this is that PC systems
are marketed in a way that presents processor speed as a primary
figure of merit. The result is that the development of processor
technology has naturally gotten pushed harder than anything else,
and off-the-shelf PCs have processors that are quite overpowered
relative to the speed of everything else in the system. Your
typical PC these days has spare CPU-seconds it will never use,
because the screen, disk, modem and other peripherals can't be
driven fast enough to tax it.

If you're already running Linux, you may find it enlightening
to keep top(1) running for a while as you use your machine. Notice
how seldom the CPU idle percentage drops below 90%.
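
For a rough programmatic version of the same observation, the sketch
below computes an idle percentage from two samples of the aggregate
"cpu" line in /proc/stat. It assumes a Linux kernel that exposes
/proc/stat with the idle counter as the fourth value on that line. The
function names are illustrative and are not part of top(1) or any
other tool.

```python
def parse_cpu_line(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Values after the label: user, nice, system, idle, ...
            # Only the first eight are summed (assumption: later columns,
            # where present, are guest time already counted in user time).
            ticks = [int(x) for x in fields[1:9]]
            return ticks[3], sum(ticks)
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before, after):
    """Percentage of elapsed ticks spent idle between two samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * idle_delta / total_delta


def sample(path="/proc/stat"):
    """Read one (idle, total) sample from the running system."""
    with open(path) as f:
        return parse_cpu_line(f.read())
```

Calling sample(), waiting a few seconds, calling it again and passing
both results to idle_percent() gives a figure comparable to the idle
percentage top(1) displays over the same interval.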

It's true that after people upgrade their motherboards, they
often report a throughput increase. But this is usually due more to
other changes that come with the processor upgrade, such as improved
cache memory or a faster system bus clock, which lets data get in
and out of the processor faster.

The unbalanced, processor-heavy architecture of PCs is hard
to notice under DOS and Windows 3.1, because neither OS hits the
disk very much. But any OS that uses virtual memory and keeps lots
of on-disk logs and other transaction states is a different
matter—it will load the disk more heavily and will suffer more
from the imbalance.

Linux is in this category, and I'd guess Windows NT and OS/2
are too. Assuming you're buying for Linux on a fixed budget, it
makes sense to trade away some excess processor clocks to get a
faster bus and disk subsystem.

The truth is that any true
32-bit processor now on the market is more than fast enough for
your disks under a typical Linux-like load, even if it's a lowly
386/25. Your screen, if you're running X, can be a bit more
demanding—but even a 486/50 will let you drag Xterm windows around
like paper. And that's a lot slower than the cheapest new desktop
machine you'll be able to find by the time this article hits
paper.

So buy a fast bus. And especially, buy fast disks. How does
this translate into a recipe? Like this:

Don't bother with the latest Pentium Pro whizbang 300MHz super-scorcher with a cooling fan bigger than it is.

Do get a PCI-bus machine.

Do get a SCSI controller.

Do get the fastest SCSI disks you can afford.

Buying PCI will get you maximum bus throughput, and makes
sense from several other angles as well. The doggy old ISA bus is
clearly headed for extinction at this point, and you don't hear
much about its other competitors (EISA, VESA local-bus video or
MCA) anymore. With PCI now being used in Macintoshes and Alphas as
well as all high-end Intel boxes, it's clearly here to stay, and a
good way to protect your investment in I/O cards from rapid
obsolescence.

The case for SCSI is a little less obvious, but is still
compelling. For starters, SCSI is still at least 10-15% faster than
EIDE running flat out. Furthermore, EIDE is still something of a
“jerry-rig”. Like Windows, it's layered over an ancestral design
(ST-506) that's antiquated and prone to failure under stress. SCSI,
on the other hand, was designed from the beginning to scale up well
to high-speed, high-throughput systems. Because it's perceived as a
“professional” choice, SCSI peripherals are generally better
engineered than EIDE equivalents. You'll pay a few dollars more,
but for Linux the cost is well repaid in increased throughput and
reliability.

For the fastest disks you can find, pay close attention to
seek and latency times. The former is an upper bound on the time
required to seek to any track; the latter is the maximum time
required for any sector on a track to come under the heads, and is
a function of the disk's rotation speed.
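
The dependence of latency on rotation speed is simple arithmetic: the
worst case is one full revolution, and the expected wait for a
randomly placed sector is half of that. The sketch below works this
out. The spindle speeds in the loop are illustrative values chosen for
the example, not figures from any particular drive.

```python
def max_rotational_latency_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60000.0 / rpm


def avg_rotational_latency_ms(rpm):
    """Half a revolution: expected wait for a randomly placed sector."""
    return max_rotational_latency_ms(rpm) / 2.0


# Illustrative spindle speeds (assumed for the example).
for rpm in (3600, 5400, 7200):
    print("%d rpm: max %.2f ms, average %.2f ms" % (
        rpm,
        max_rotational_latency_ms(rpm),
        avg_rotational_latency_ms(rpm),
    ))
```

Doubling the rotation speed halves both figures, which is why the
spindle speed on a spec sheet is worth checking alongside the seek
time.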

Of these, seek time is more important and is the one
manufacturers usually quote. When you're running Linux, a one
millisecond faster seek time can make a substantial difference in
system throughput. Back when PC processors were slow enough for the
comparison to be possible (and I was running System V Unix), it was
easily worth as much as a 30MHz increment in processor speed. Today
the corresponding figure would be higher.
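
To get a feel for why a single millisecond matters, here is a
deliberately crude model of a disk-bound workload. It assumes every
access costs one quoted seek plus half a revolution, and it ignores
transfer time, caching and controller overhead, so treat the result as
a rough bound rather than a prediction. The function names and the
sample numbers are hypothetical.

```python
def random_accesses_per_second(seek_ms, rpm):
    """Rough bound on random accesses per second for one disk.

    Assumed cost per access: seek time plus half a revolution.
    """
    service_ms = seek_ms + (60000.0 / rpm) / 2.0
    return 1000.0 / service_ms


def speedup_percent(old_seek_ms, new_seek_ms, rpm):
    """Percentage gain in access rate from a faster seek time."""
    old = random_accesses_per_second(old_seek_ms, rpm)
    new = random_accesses_per_second(new_seek_ms, rpm)
    return 100.0 * (new - old) / old


# Hypothetical drive: 10 ms seek improved to 9 ms at 6000 rpm.
print("%.1f%% more random accesses per second"
      % speedup_percent(10.0, 9.0, 6000))
```

Under these assumptions, shaving one millisecond off a 10 ms seek buys
roughly 7% more random accesses per second, and the gain grows as the
rest of the access time shrinks.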