Bayes' theorem has proven an effective aid in many sorts
of pattern-recognition tasks, such as fingerprint identification, facial
matching, and handwriting analysis. Take the problem of teaching a computer to
read scribbled-on bank checks. As anyone who's tried to scan a magazine article
into a Word file knows, getting a computer to read clean, standard fonts
accurately is dodgy enough. Add in the variables introduced by eccentric writing
styles, ballpoint versus felt-tip pens, differing rates of ink absorption, plus
crumpling and folding of the pages, and traditional optical character
recognition can be rendered useless.

The Bayesian model allows a computer to incorporate prior
knowledge of the billions of ways handwritten letters and numbers can stray from
standard forms, training itself to read the writing on checks by
"seeing" lots of examples, thus building a base of prior probabilities
to factor into decisions. If, for example, a wavy shape the computer discerned
turned out to be an s in most of the past 1 million instances, then that
loopy figure on the check is probably an s, too - unless it's
followed by what's probably a 6, in which case it's more likely to be a
dollar sign or an 8.
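
The reasoning in that example can be sketched in a few lines of Python. This is only an illustration of Bayes' rule, not the code any check reader actually runs: the function name and every probability below are made-up assumptions chosen to mirror the wavy-shape example.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(hypothesis | evidence) is proportional to
    P(evidence | hypothesis) * P(hypothesis), normalized to sum to 1."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical prior: how often a wavy shape turned out to be each
# character in past examples (illustrative numbers only).
shape_prior = {"s": 0.90, "$": 0.05, "8": 0.05}

# Hypothetical likelihood: the chance that the next character looks
# like a 6, given what the wavy shape really is.
next_is_six = {"s": 0.01, "$": 0.40, "8": 0.30}

updated = posterior(shape_prior, next_is_six)

best_before = max(shape_prior, key=shape_prior.get)  # guess from the prior alone
best_after = max(updated, key=updated.get)           # guess after seeing the 6

print(best_before, best_after)
```

With these invented numbers, the prior alone favors an s, but once the neighboring 6 is taken into account the dollar sign becomes the most probable reading, with 8 in second place.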

In the emerging world of computing applications that
employ such networked Bayes reasoning engines, almost any observable phenomenon
can be read as a symptom of a hidden cause - whether characters in a
document, or the behavior of an office worker repeatedly clicking a button on
his keyboard when the computer refuses to respond. If an email message has
broken out in exclamation points, perhaps the disease is spam. In the case of
the office worker, the ailment might be a toxic interface. If traditional
computing seems designed for a binary universe only a microchip could love,
Bayes nets are made for the world of uncertainty, conflicting truths, static,
and frustratingly incomplete information sets we live in.

One of Peter Rayner's hobbies recalls the problem of the
short-order cook: using Bayesian methods to extract clear audio signals from the
thickets of random noise on old recordings, resurrecting the glory of '20s jazz
musicians from scratchy gramophone discs. As we sipped Côtes du Rhône in the
oak-paneled sanctuary of the Masters' Lodge, Rayner and Lynch discussed
approaches to boosting the sound quality of MP3 files, searching databases of
GIFs to find a particular image, and predicting the transport rate of
pharmaceuticals through blood.

At Cambridge, researchers have applied the reverend's
notions to problems as various as improving hearing aids and determining
whether a given dose of a drug will sufficiently anesthetize a surgery patient.
"This man of enormous importance for the 20th century - with a philosophy
so far-reaching it makes Marx pale into insignificance - was essentially
forgotten," Rayner tells me, adding that in a university environment
designed to churn out MBAs, the wider implications of Bayes' work would have
been overlooked long ago because it seemed to have few practical applications.

Lynch was recognized as an extraordinary, if unorthodox,
student while still an undergraduate in the '80s. Rayner - a ruddy,
alabaster-bearded, outspoken embodiment of the Cambridge lineage that produces
infidel mathematicians - recalls that his now notably successful former student
had a tough time getting out of bed. "Mike didn't do any work at all until
a quarter of an hour before the exam, when he was miles away from any
textbooks. But he used to invent these solutions which were very creative."

The broad sweep of Rayner's academic and cultural
interests was a powerful influence on the young engineer, who says his mentor's
insistence on problem solving over "hand-waving, headline-grabbing
rubbish" encouraged him to think of innovative and practical applications
for Bayes' work. It was over morning coffee with Rayner and other graduate
students, says Lynch, that he first considered applying the 250-year-old theorem
to the task of training computers to recognize patterns of meaning.

Lynch's first company - created in 1991 during his student
years and fortified with an impulsive £2,000 loan offered in a pub - is called
Neurodynamics. Working for, among others, companies in the British intelligence
and defense industries, Neurodynamics uses neural-network technology and
Bayesian methods to create applications that specialize in character,
handwriting, and facial recognition, as well as surveillance. Lynch enjoyed
cooking up solutions for high-level skunk works because, he says, "they
have the most interesting problems."

One of the interesting problems Lynch addressed for
British intelligence was how to enable computers to make sense of large volumes
of words in many languages for a top-secret project. The young entrepreneur, who
didn't have the necessary security clearances, was never told what sort of texts
the technology would analyze - intercepted email, faxes, leaked documents? - but
was instructed to perform his operations on newspaper stories from around the
world. Out of that work came the chunk of code called the Dynamic Reasoning
Engine, the Bayesian heart of every Autonomy product.

To determine whether two passages are concerned with the
same fundamental ideas, Lynch realized, you don't need to know the meaning of
each word. In fact, it's not even necessary to be able to speak the language. As
long as you can teach the computer where one word ends and another begins, it
can look at the ideas contained in a text as the outcomes of probabilities
derived from the clustering of certain symbols. The symbol penguin, for
instance, might refer to the Antarctic bird, a hockey team, or Batman's nemesis.
If it clusters near certain other symbols in a passage, however - say, ice,
South Pole, flightless, and black and white - penguin most
likely refers to the bird. You can carry the process further: If those other
words are present, there's an excellent chance the text is about penguins, even
if the symbol for penguin itself is absent.
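
A rough sketch of the idea, in Python, follows. It is not the Dynamic Reasoning Engine, whose internals the article does not describe; it swaps in a plain naive Bayes calculation, and the sense names, term probabilities, and the `UNSEEN` floor are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical probability of seeing each term in text about each sense
# of "penguin". Every number here is invented for illustration.
SENSES = {
    "bird": {"ice": 0.20, "south pole": 0.15, "flightless": 0.15,
             "black and white": 0.10},
    "hockey team": {"ice": 0.20, "puck": 0.20, "goal": 0.15},
    "batman villain": {"gotham": 0.25, "umbrella": 0.15,
                       "black and white": 0.05},
}
PRIOR = {sense: 1 / len(SENSES) for sense in SENSES}  # assume equal priors
UNSEEN = 0.001  # assumed floor probability for terms a sense does not list

def sense_posterior(terms):
    """Naive Bayes: treat terms as independent clues and combine them."""
    log_scores = {}
    for sense, term_probs in SENSES.items():
        score = math.log(PRIOR[sense])
        for term in terms:
            score += math.log(term_probs.get(term, UNSEEN))
        log_scores[sense] = score
    # Convert log scores back to probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {s: math.exp(v - top) for s, v in log_scores.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

# Note that the word "penguin" itself never appears among the clues.
clues = ["ice", "south pole", "flightless", "black and white"]
result = sense_posterior(clues)
likely = max(result, key=result.get)
print(likely)
```

Under these assumptions the surrounding terms alone point overwhelmingly to the bird, while a passage containing only "ice" and "puck" would tilt toward the hockey team.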

Lynch had zeroed in on the Achilles' heel of search
engines. A search on penguin is just as likely to generate a list of
pages about Penguin-Putnam books, the Purple Penguin Design Group, and why Linus
Torvalds chose the bird as Linux's symbol as it is to uncover useful data on
flightless aquatic birds of the tuxedoed, krill-munching variety.