Running the Numbers

According to Benjamin Disraeli, “There are three kinds of lies: lies, damn lies and statistics.” But according to Jon “maddog” Hall, “There are three kinds of lies: lies, damn lies and benchmarks.”

Businesspeople and journalists have at
least one thing in common—they love numbers. Their appetite for
numbers constitutes the enormous demand market that keeps think
tanks, research firms and other professional guessticians in
business.

The “suits” have a legitimate need for data to make
informed business decisions. Good data often mean the difference
between life and death for a company. We writers have a need no
less compelling, though far less legitimate. We need to tell
stories, and numbers make great story material—especially when we
turn number-packed spreadsheets into pretty pictures worth a
thousand words. Hey, it saves us from writing, and as the graphics
people will tell you, readers look at pictures first.

Admit it; you looked at this graphic first, didn't you? Of
course you did. Well, the graphic tells a lie. I made up the
numbers, the spreadsheet made up the graphic, the artist made it
pretty, and here we are, making a point. It's not just that
“figures lie and liars figure”. It's that we rely a great deal on
both. Take it from an old liar, or as I hate to admit, a PR guy.
Let me tell you a true PR story.

The year was 1988, and I was working for a hot networking
company that sold more than half the network connectors in its
market. Since that market consisted of boxes sold by one company
and sales figures for it were easily (though not precisely)
estimated, we could figure out our market penetration—or something
close to it.

However, there was a hitch: we competed directly with that
box company (and lived at their mercy), so we didn't want to
release our actual sales figures. Yet we wanted to publicize our
success. We knew our story would be easier for editors to accept if
our numbers were “objective”. Since editors go to industry
research firms for their objective data, we had to get those firms
to OEM our numbers for us.

We created a nice “internal” graphic showing our best
guess—55% market penetration—and called Analyst A at the biggest
research firm. Would he like to know how we were doing? Of course!
So we sent him the graph. A few days later, after the firm finished
digesting this “fact”, we started referring editors to the firm.
Sure enough, new graphs began to appear with stories of our
success, all listing the research firm as the “source”.

Were we lying? No. We were simply dropping our best facts
into the bottom end of the data food chain, having faith that they
would find their way to the top. Now here I am, at the top of the
same food chain, and I see very few Linux companies that know how
this system works. I think the problem is that techies are too honest
and too literal. Most Linux companies are run by techies—guys like
you—if we trust our own readership figures (from an “objective”
source, of course).

Take an issue like which web servers and operating systems hosts
are running. At Netcraft's site, http://www.netcraft.com/whats/,
you can discover, for example, that the Microsoft Network is
running Microsoft-IIS/4.0 on NT3 or Windows 95. Netcraft returned
that information when it interrogated the site host; I copied it
out of the browser and pasted it right into this text. Can we trust Netcraft,
or the information it automatically obtains? Our techies here at
Linux Journal say “No.” In fact, one techie
believes MSN may actually be hosted by UUNET using BSDI. There's no
way to tell by using Netcraft's method, because it's too easy for
the host to spoof out a wrong answer, just to be perverse.
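
To see why the techies are skeptical, here is a minimal Python sketch. It assumes a survey of this kind reads something like the HTTP Server response header; that is an assumption about the method, not a description of Netcraft's actual code. The names SpoofingHandler, reported_server and demo are hypothetical. The stand-in host below really runs Python's http.server but announces itself as Microsoft-IIS/4.0, and the client asking has no way to tell the difference.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class SpoofingHandler(BaseHTTPRequestHandler):
    """A stand-in host that lies about what software it runs."""

    def version_string(self):
        # Hypothetical banner; the real software here is Python's http.server.
        return "Microsoft-IIS/4.0"

    def do_HEAD(self):
        self.send_response(200)  # send_response() adds the Server header
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def reported_server(host, port):
    """Return the Server header exactly as the host chose to report it."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    finally:
        conn.close()


def demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), SpoofingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return reported_server("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Calling demo() returns the banner the stand-in host chose to send, not the software actually answering the request. Any survey built on self-reported banners inherits the same blind spot.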

I still wanted that kind of data, however qualified, so I
went ahead and put a chart together with the results of Netcraft's
interrogations of the top 25 U.S. hosts (see “Work Still Cut Out”
in upFRONT). Highly qualified findings are better than no findings.
Some interesting data are there, such as Hotmail, a Microsoft
property, running Apache on FreeBSD; and some significant ones,
such as Linux running on only two (Real Networks and Go2Net). These
findings address the concerns of several parties (regular readers
of Linux Journal, BSD advocates, “suits” who need unbiased numbers
and others) who feel that Linux Journal should remain as informative
and unbiased as possible, while still advocating Linux to the world.

No problem with doing that, but we need the numbers. The
numbers we get from research firms will be no better than those
obtained from Linux vendors and other involved parties.

Figure: Linux IPOs per Moon Phase

Let's look at two of the oldest and largest research firms
operating today: Gartner Group and International Data Corp. In
recent months, the Linux community has made the most of IDC's very
positive numbers, which show Linux as the fastest-growing server
operating environment. At the same time, many in the Linux
community bristled at Gartner's research findings, which called
Linux “hype du jour” (among other unkind things) in its report,
“Will Linux Be Viable Competition for Windows Desktops?” A Google
lookup finds over 2,000 pages mentioning both Gartner and Linux.
Most of those flame Gartner for its cluelessness. One anonymous
coward on Slashdot writes, “The reality is, while the Gartners of
the world are fudding away, Linux is already installed at all
levels. Get a clue, PHBs [bosses]. Now, why would anybody pay
Gartner for advice?”

The answer is simple: because they need objective information
from somebody, and they're not getting it from you. Flames are
easy. Facts are hard. Like them or not, Gartner and IDC traffic in
facts or the most educated possible guesses. If you have good hard
information on how Linux is changing the business world, let them
know. Or skip them, and let us know. We're in the same business.
That's what you—our readers—pay us for. No lie.

Doc Searls is the Senior Editor for Linux Journal. He can be reached
via e-mail at doc@ssc.com.