ALL SYSTEMS GO

This past July, the annual conference of the American Association for
Artificial Intelligence was marked by an oddly heroic
event. Throughout the conference, held in Providence, Rhode Island,
world-class players of chess, checkers and other games pitted their
skills against state-of-the-art computer adversaries. For the most
part the humans either lost or barely eked out victories. Then a
twenty-seven-year-old native of New Mexico named Janice Kim challenged
a computer to the oldest, most complex game of all: go.

When a video image of the match flickered onto one of the oversize
monitors in the "Hall of Champions," audience members laughed and
shook their heads in disbelief: the computer was being granted a
twenty-five-stone handicap. Kim is an exceptional player--the only
female go professional the West has ever produced--but this seemed
ridiculous: the first stone she placed looked like a lone paratrooper
dropped into an enemy garrison.

I was the commentator for the match, yet I had not seen a
handicap like that in a long time. When I started playing go as a high
school sophomore living in Chappaqua, New York, my father, then an
experienced amateur, started me out at twenty-three stones. I did not
stay at that handicap for long. The rules of the game are easy to
learn--much simpler than those of chess--but playing go well requires
a combination of restraint and aggression, of quick intuition and cold
calculation that appealed to me. And I liked to win. I devoured all
the go books I could find and spent every weekend at the go club in
New York City. By 1988, still a high school senior, I was the U.S.
representative to the world go youth championship in Paris, and had
made arrangements to apprentice myself to a Japanese go master in the
fall.

Despite its prestige and history in Japan, its huge popularity
in Korea, and its cachet among European intellectuals, go is all but
unknown in the United States. As a result, we go players are used to
explaining ourselves. We can intone an (inevitably stilted) litany in
our sleep: Go is an ancient game of strategy that originated in China
some 4,000 years ago. The board is a nineteen-by-nineteen grid of lines; the stones are
black and white, all alike; and the subtleties that develop from those
simple elements make it the deepest game in the world.

Go is a game of construction--the board starts out empty, and the
opponents, one playing black, the other white, take turns placing
their stones on the intersections, trying to surround and control
unclaimed territories. Once played, the stones do not move. Go is a
game of construction, but it is also a game of war: once a group of
stones is completely surrounded, it is removed from the board. As
conflicts arise over territory, each player struggles to surround and
capture his opponent's stones while protecting his own. The game ends
by agreement, when neither player can improve his position.
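
The capture rule lends itself to a few lines of code. The sketch below is my own illustration, not part of any program mentioned here: it finds a group by flood fill, collects the empty points adjacent to it (its "liberties"), and removes the group when no liberties remain. The toy board and coordinates are invented for the example.

```python
# Illustrative sketch of go's capture rule: a group of connected
# stones is removed when it has no adjacent empty points left.

def group_and_liberties(board, row, col):
    """Flood-fill the group containing (row, col); return the set of
    its stones and the set of empty points adjacent to it."""
    color = board[row][col]
    size = len(board)
    group, liberties = set(), set()
    frontier = [(row, col)]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == color:
                    frontier.append((nr, nc))
    return group, liberties

# A lone white stone hemmed in on all four sides by black has no
# liberties, so it is captured and taken off the board.
board = [list(r) for r in (".B...",
                           "BWB..",
                           ".B...",
                           ".....",
                           ".....")]
group, libs = group_and_liberties(board, 1, 1)
if not libs:                      # captured: remove the whole group
    for r, c in group:
        board[r][c] = "."
```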

But that only begins to explain the true character of the game.
If chess is a battle, go is an entire campaign. It is a long game in
which a number of battles may rage simultaneously, none of them
decisive, and a single move can affect the entire board. There is a
saying (mostly among go aficionados, I have to admit) that go is to
chess as poetry is to double-entry accounting.

Unfortunately, a twenty-five-stone handicap gives the
game's true depth and beauty little chance to emerge. After a dozen
moves, Kim's stones were adrift without a base. It looked as if
Handtalk, the best go program in the world, would maintain control of
the entire board and deal her a crushing defeat. A few people in the
audience suggested that the game be abandoned, that the handicap was
too great for even a professional to overcome.

As commentator, I was torn. When I arrived at my go master's
house in Tokyo, tradition dictated that the apprentice paint a slogan
in Japanese calligraphy and hang it on the walls of the playing room. My
slogan, blobbed in a script more awkward than any Japanese first
grader's, was katsu, or "win." To begin training for go at the
professional level, you need only love the game; to survive for any
length of time, though, you have to hate to lose. My sympathies, in
other words, were entirely with Kim, but I was tempted to agree with
the skeptics.

My concern turned out to be unwarranted. Just as the spectators
began to lose heart, the computer made a bad move, and as Kim told us
after the game, she heard a little voice in her head saying, "Just put
one foot in front of the other, keep taking little baby steps across
the board, and you will win." And win she did. The loss with a
twenty-five-stone handicap gives the world's best go program a ranking
of around seventeen kyu--that of a beginner who has been playing for a
couple of months. In stark contrast, last year the world's best chess
program, Deep Blue, handily defeated the Russian world champion, Garry
Kasparov, with no handicap at all. Kasparov seemed to feel that his
was an epochal defense of humanity's intellectual supremacy, yet the
affair afforded go programmers little more than a rueful chuckle: the
day a go program can stump even an average amateur is still a long way
off.

If computers can be programmed to master chess, why are they so
weak at go? The answer, in a nutshell, is that Deep Blue's approach
simply will not work with go. Go is too complex, too subtle a game to
yield to exhaustive, relatively simpleminded computer searches. To
master it, a program has to think more like a person. That fact
represents a high hurdle to programmers--and an exciting opportunity.
Since the 1950s chess has posed a grand challenge to the field of
artificial intelligence, or AI. But the more chess programs have
improved, the more they have sidestepped the issues that once made them
seem worth designing. Go sends investigators back to the basics--to the
study of learning, of knowledge representation, of pattern recognition
and strategic planning. It forces us to find new models for how we
think, or at least for how we think we think.

Artificial intelligence has always joined mundane goals with lofty
ones. On one level the field has been about making clever devices:
computers that understand English, systems that diagnose diseases,
toys that play games. On another level, though, such devices are
supposed to serve as windows on the mind. AI investigators have long
maintained that intelligence can be understood strictly in terms of
computation--and apart from any physical systems that happen to
exhibit it. By dissecting and characterizing intelligence, then
building its components into devices that can do some of the things
that people do, engineers and programmers hope to succeed where
philosophers and psychologists have failed: they hope to develop a
scientific understanding of intelligence.

In the late 1950s, when computer programs began, however
tentatively, to approximate elements of human reasoning, AI
investigators made some bold predictions. Within a decade, many said,
computers would be as intelligent as humans: they would communicate
with people in English and discover important mathematical proofs. But
distilling the calculus of cognition proved much harder than most
people expected.

Intelligence demands knowledge, of course, but most of what
people know is so deeply ingrained that it hardly seems like knowledge
at all: that a cup of water, tipped over, will spread onto a table;
that painting a wall will not change the shape of the wall or cause a
flash flood in Texas. But those bits of common sense are embedded in a
nervous system that has been optimized over millions of years of
evolution. Reproducing such knowledge in a computer program is still
impossible. The result, for AI, is that programs designed to deal
with some part of the real world "intelligently" can do so only by
ignoring everything else. A program called MYCIN, for instance, can
correctly diagnose a patient with meningitis if certain symptoms are
described to it. But if you describe the rust spots on your car, MYCIN
will likely diagnose your car with meningitis as well.

To skirt the shortcomings of their machines, AI investigators
have had to focus on "toy domains"--highly simplified and idealized
worlds in which the rules governing reality can be expressed by a
handful of axioms. In the 1950s chess seemed an excellent choice. The
game is formally simple--played as it is with an eight-by-eight-square
checkerboard, thirty-two pieces and a few basic rules--yet mastering
it is a hallmark of human intellectual achievement. At the time,
machines that could evaluate billions of positions a minute were
practically inconceivable. To succeed at chess, it seemed, computers
would have to beat human players at their own game. Investigators
therefore began by focusing on planning, on the optimum ways of
representing tactical and strategic knowledge of the game, and on
sophisticated positional judgment, as well as the mechanics of
"search."

The term search, in this context, refers to a simple
algorithm known as "minimax," used in all successful modern chess
programs. When choosing from among several moves, the logic goes,
explore as many branches of the "game tree" as possible: first try all
your currently possible moves, then try all your opponent's possible
responses to each of your moves, then try your responses to those
responses, and so on, until you run out of time to think. Finally,
pick the move that leads to the best possible position, assuming
your opponent will offer the strongest possible resistance.
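
Stripped of chess-specific details, that recipe fits in a few lines. The sketch below is a generic illustration, not Deep Blue's code: the game tree is given directly as nested lists, and each leaf stands for the evaluated value of a final position.

```python
# Bare-bones minimax: alternate between taking the best value for
# yourself and granting the opponent the strongest resistance.

def minimax(node, maximizing):
    """Value of a game-tree node under best play by both sides.
    Interior nodes are lists of children; leaves are numeric scores."""
    if not isinstance(node, list):        # leaf: a judged final position
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two candidate moves for us, two replies each. The first branch
# guarantees at least 3 against the strongest reply; the second
# risks being held to 2 -- so minimax prefers the first.
tree = [[3, 5], [2, 9]]
best = minimax(tree, maximizing=True)     # → 3
```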

By the 1980s, chess-playing programs that focused on identifying
high-level goals and ways to achieve them were making only halting
progress. Programs that concentrated on search, meanwhile, were fast
becoming serious contenders. The story of computer chess had begun to
sound like the children's book Fortunately, by Remy Charlip:
Fortunately, the further computers could think ahead, the better their
moves were. Unfortunately, deeper searches took longer. Fortunately,
computer hardware doubled in speed every couple of years.
Unfortunately, that speed encouraged investigators to abandon
high-level processing and focus on search alone.

Today, forty-seven years after the minimax algorithm was first
proposed as an approach to computer chess, Deep Blue can examine a
staggering 200 million positions per second. That raw power, combined
with clever refinements in the algorithm, enables Deep Blue to think
fourteen or fifteen moves ahead--and as deep as forty or forty-five
moves when necessary. As a result, the program has beaten Kasparov,
arguably the strongest chess player ever. The grand challenge of
computer chess has been met. Unfortunately, it has left many of the
questions that inspired it unanswered.

Over the years, AI investigators have gotten used to
swallowing big, nasty doses of reality. Many of them, therefore, are
greeting Deep Blue's success as a refreshing change of pace. Still,
others are left cold. They deride Deep Blue for using "brute
force"--for having nothing whatever to do with human intelligence. If
the purpose of AI is to learn enough about intelligence to create
intelligent computer programs, as the pioneering investigator John
McCarthy, professor of computer science at Stanford University,
suggests, then computer chess has little to do with AI, and Deep Blue
is just a triumph of engineering. When human players look ahead, they
examine just a few moves at various choice points--or, more often,
just one. Instead of considering billions of positions, they consider
dozens. When a reporter asked the former world chess champion José
Capablanca how many moves ahead he looked while playing, Capablanca
replied: "Only one, but it's always the right one."

Other students of AI argue, however, that criticisms of brute
force are misguided, and that computer chess shows that the end
justifies the means. Since computers match human performance in
mechanistic ways, they say, people who are uncomfortable calling
computers intelligent will simply have to redefine what they mean by
intelligence. Not only is it unnecessary for computers to emulate
human thought, they argue, it is a bad idea. People have parallel
"hardware"--the brain--in which the basic computational events occur
slowly (on the order of milliseconds), but simultaneously, by the
billions. Digital computers have serial hardware, which carries out
fundamental operations one after the other, but in a matter of
nanoseconds. People are good at recognizing patterns; computers are
good at crunching numbers. Trying to get computers to solve problems
the way people do, such AI advocates conclude, makes as much sense as
trying to build an airplane that has feathers and flaps its wings.

That sounds reasonable, but it still smacks of retreat: if a
task seems too hard, Deep Blue fans seem to be suggesting, just
redefine it. Fortunately or unfortunately, depending on your
perspective, the game of go will not let programmers off that
easy. The most practical way for a computer to play go, it seems, just
happens to be the most interesting way.

A cursory analysis might suggest that the difficulty of computer go has
to do with combinatorics. The go board has a much larger grid than a
chessboard, so the number of possible moves is much larger. On average,
at any given moment in a chess game, a player has twenty-five legal
moves to choose from. Various refinements have enabled the minimax
algorithm to consider, in effect, only the five or six best choices,
pruning away the rest. In go, on average, a player can choose from 250
legal moves per turn. Even if a go algorithm could prune that number
down to fifteen or sixteen, and consider 200 million positions per
second, it would need seventy years to explore as deeply as Deep Blue
does in three minutes.
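
The arithmetic behind that estimate is easy to reproduce. The sketch below uses the rough figures quoted above (an effective branching factor of six for chess and sixteen for go, 200 million positions per second); depending on the exact depth and branching factor assumed, the go search comes out anywhere from a decade to a few centuries, against minutes for chess.

```python
# Back-of-the-envelope version of the estimate above, using the
# article's rough figures.
CHESS_BRANCH, GO_BRANCH = 6, 16     # effective choices per turn
POSITIONS_PER_SEC = 200e6           # Deep Blue's search speed
DEPTH = 14                          # plies routinely searched

chess_secs = CHESS_BRANCH ** DEPTH / POSITIONS_PER_SEC
go_secs = GO_BRANCH ** DEPTH / POSITIONS_PER_SEC
go_years = go_secs / (3600 * 24 * 365)
# chess_secs comes out on the order of minutes; go_years on the
# order of a decade or more -- the gap the text describes.
```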

Very well, one might say; computers will get faster soon
enough. But the problem goes deeper than that. Go programs not only
fail to evaluate 200 million go positions a second; they cannot
accurately evaluate a single one.

Remember that Deep Blue must not only follow those billions
of branches of the game tree; it must also calculate the value of the
final position in each branch, so that a choice of moves can be
made. In chess, positions are relatively easy to evaluate. The
identity of each piece is overwhelmingly important: being down a pawn
is acceptable, being down a queen is not. Throw in a few positional
factors, such as king safety and the number of squares the pieces on
each side can attack, and you have a definition of positional value
that, combined with massive search, is good enough to beat a grand
master.

In go, such a strategy will not work. Each stone is worth the
same amount, and there is no king to protect. The amount of territory
the players control determines who is ahead. The trick is deciding
just who controls what.

A group of stones can be surrounded and captured, but it can
also assume a defensive formation, called "making life," that protects
it from capture for the rest of the game. In order to make life, a
group has to claim two separate pieces of territory, or "eyes." But to
ensure a group's safety a player need not always waste moves forming
both eyes: he only has to know that he can do so if attacked--like a
castle guard who keeps the drawbridge down until invaders are
sighted. Similarly, groups of stones are "dead" when they have no hope
of making life. The opponent can capture and remove a dead group at
any time, but he may not bother to do so. Why waste lives storming
castle walls when the defenders have nearly run out of food?

Now take a particular group of black stones resting on the go board:
how should the computer count it? Is the group dead? If so, the space
it occupies and the territory it surrounds belong to white. Or is the
group alive? Then the territory it surrounds belongs to black. The
answers to such questions can be passing subtle. Is the group
defensible or indefensible? Does the group need to spend moves
defending itself, or can it become the aggressor? Hundreds of points,
at any given moment, can hinge on the status of groups that are
neither clearly alive nor clearly dead. How well a player analyzes the
signs helps determine how well he plays the game.

With experience, people can learn to judge a group's status
quickly and skillfully. But how can those judgments be turned into
numbers a computer can use? People are astonishingly good at pattern
matching: they can recognize a face at various angles, in shadows or
in light, smiling or frowning; they can recognize a voice across a
squealing subway car or a crackling telephone line. But people cannot
easily evaluate go positions at a glance. They must learn to combine
their pattern-matching abilities with logic and knowledge in order to
make deep but highly restricted searches of the game tree. Go
programs, on the other hand, depend largely on rough heuristics, or
rules of thumb, for deciding whether groups are alive or dead. That
tactic works in simple situations, but when used to evaluate even
slightly complicated positions it can fail disastrously.

Kim's game against Handtalk demonstrated the problem
perfectly. Early on, she trapped two of Handtalk's stones so that she
could capture them at any time. Handtalk recognized that those
stones were lost--that a rescue attempt would be pointless. But Kim's
configuration posed another, hidden threat: her stones could not only
capture Handtalk's stones, they could make life as well, thereby
threatening all the stones around them. Handtalk failed to notice
the dual nature of the attack. It wrote off the battle as an
isolated loss of two stones, and never bothered to defend the others
at risk.

I can sympathize with Handtalk in a strange way: I have had to learn
the difficulties of go twice, once as an aspiring player, and once as
a programmer trying to teach the game to a computer. When I moved to
Tokyo at the age of eighteen, I felt like a kind of retarded older
brother to my master's four other pupils, who ranged in age from
eleven to seventeen. Every weekend we joined some fifty other
aspirants at the playing hall of the Japanese Go Association for the
ongoing tournament that would, at the end of the year, determine who
was promoted to professional status and who was left to struggle on
for another year.

At the time, I was the only Westerner in the league. Most of
the young people I played with were hugely talented and had started
playing by the age of eight. Many of them had dropped out of high
school to concentrate on the game. Yet even among these professional
students, our house was considered the most serious.

My teacher's house was perhaps the last still following an
ancient Japanese instructional tradition of austerity. Our days were
spent kneeling on tatami mats in front of our go boards, playing
through masters' games and mulling over our victories and defeats of
the preceding weekend. Sometimes we would cross the street to play
ball in the park or visit the corner store, but we were implicitly
forbidden to have outside friends or interests.

When I got back to the United States and started college, I
tried playing go against the existing computer programs, but I was
appalled at their shortcomings. I was sure I could do better. After
writing down some ideas, I teamed up with an old friend, Tim Klinger,
now a doctoral candidate in computer science at New York University,
and we set about creating our own go program.

As an experienced player, I knew that certain configurations of stones
come up again and again. Tim and I decided to incorporate those
patterns into our program as a number of if-then rules: if pattern X
appears on the board, then Y is a good move to achieve goal Z. Because
I think in terms of groups of stones, our program also recognizes and
operates on groups. That general approach--explicitly seeking to model
human knowledge and decision-making processes--might be called the
traditional AI approach to computer go. All the best go programs have
used it.
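
In code, such a rule might look like the following. This is a purely illustrative sketch, not our program's actual representation: a pattern is a small grid of stone characters, and a rule ties any match to a suggested move (given as an offset into the pattern) and the goal it serves.

```python
# Illustrative if-then rule machinery: if pattern X appears on the
# board, then move Y is suggested in service of goal Z.

def find_pattern(board, pattern):
    """Yield the top-left coordinates of every place `pattern`
    matches the board. A '?' in the pattern matches any point."""
    ph, pw = len(pattern), len(pattern[0])
    for r in range(len(board) - ph + 1):
        for c in range(len(board[0]) - pw + 1):
            if all(pattern[i][j] in ("?", board[r + i][c + j])
                   for i in range(ph) for j in range(pw)):
                yield r, c

# Hypothetical rule: a lone white stone flanked by black on both
# sides suggests playing above it to press the attack.
board = ["....",
         "BWB.",
         "...."]
rule = {"pattern": ["BWB"], "move_offset": (-1, 1), "goal": "attack W"}
matches = list(find_pattern(board, rule["pattern"]))
# matches is [(1, 0)], so the suggested move is (0, 1) -- the point
# directly above the white stone.
```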

Because "life" and "death" are critical aspects of go, we
have concentrated mostly on them. Our life-and-death engine performs
deep, narrow search. Instead of assuming that all moves are
admissible, as a chess program does, it begins by assuming that none
are. First, the algorithm establishes a goal for the search--"capture
stone A," for instance. Then it decomposes that goal into subgoals,
sub-subgoals and so on. Once a "goal tree" has been established, the
program can draw on its various if-then rules to determine which moves
are most likely to achieve each minor goal in turn. Once all those
subsets of moves have been determined and compared, the program is
left with a few legal moves most likely to achieve the ultimate
goal.

The goal tree enables the program to recognize futile positions
quickly and abandon bad lines of play. Suppose, for instance, that the
program has to achieve subgoals A and B in order to achieve ultimate
goal C. If it can find no moves to achieve A, it can stop; there is no
need to waste time determining how to pursue B.
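
That early-abandonment logic can be sketched as follows. The rule base and goal names here are invented for illustration; the real engine is considerably more involved.

```python
# Toy goal-tree search: a goal is either a primitive lookup of
# candidate moves in an if-then rule base, or an AND-node whose
# subgoals must all be achievable. As soon as one required subgoal
# has no candidate moves, the whole line is abandoned -- the
# remaining subgoals are never examined.

def candidate_moves(goal, rules):
    """Return candidate moves for `goal` (one list per subgoal for
    an AND-node), or None if the goal is hopeless."""
    kind, body = goal
    if kind == "primitive":
        return rules.get(body, [])
    assert kind == "and"
    plan = []
    for subgoal in body:
        moves = candidate_moves(subgoal, rules)
        if not moves:             # unachievable subgoal: give up early
            return None
        plan.append(moves)
    return plan

# Hypothetical rule base: capturing stone A requires both filling
# its liberties and surrounding it, but no rule fills the liberties,
# so the ultimate goal is dropped before "surround" is considered.
rules = {"fill_liberties_A": [], "surround_A": ["D4", "E5"]}
goal = ("and", [("primitive", "fill_liberties_A"),
                ("primitive", "surround_A")])
```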

Our life-and-death engine is almost complete. Although the
knowledge base we have given it is made up of only about 200 patterns,
the system is solving problems successfully. Over the next six months
we hope to incorporate the engine into our go program and flesh out
the knowledge base. If all goes well, and if life and death in go are
as important as we think they are, the resultant program should easily
best every other in the field.

In the meantime, we have a number of problems to overcome. The
traditional AI approach relies on introspection: How do I evaluate a
position? How do I encode that method into software? But our powers of
introspection are extremely limited; if you doubt it, try explaining
exactly how you catch a ball. (You may be tempted to say, "I just
raise my hand and grab the ball as it flies by." But that only begins
to explain it. First you have to calculate the distance and trajectory
of the ball from the flat image projected onto your retina; then you
must use that trajectory to calculate how much the muscles in your
hand, shoulder, arm, back and legs need to flex in order to move your
limbs and bend your joints in just the right way at just the right
moment. As experts in robotics have found out, the whole process is
almost impossible to state as an equation, let alone to solve in real
time.) Perverse as it sounds, writing a go program is a constant
battle against common sense. At every turn, your brain tries to
convince you that what it is doing is complete and coherent--but it
never is.

Then there are the software engineering problems common to
any large programming project. Our program handles a lot of complex
knowledge, and while it is designed to do so as efficiently as
possible, unwanted interactions between its various parts are hard to
avoid. Countless parameters and pieces of code can interfere with
performance, and we spend a dismaying amount of time tracking down
problems caused by seemingly innocuous changes.

We are increasingly tempted to write a program that simply
teaches itself--either by analyzing the games of strong players or by
playing against itself. That approach may sound too simpleminded to
work, but a couple of remarkable programs have already taught
themselves to play other games. TD-Gammon, the best backgammon program
in the world, written by the physicist Gerald Tesauro of the IBM
Thomas J. Watson Research Center in Yorktown Heights, New York,
started out by making random moves. Then, after playing millions of
practice games against itself, the program bootstrapped itself into a
nearly perfect player. The "neural net" that encodes TD-Gammon's
expertise takes a long time to evaluate positions by the standards
of computer chess: TD-Gammon can search only three moves deep in
fifteen seconds. But the positional knowledge it encodes is so sophisticated
that its decisions are nearly always right.
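
The learning rule at TD-Gammon's heart is simple to state. The sketch below applies a tabular version of the temporal-difference update to a toy random walk, standing in for Tesauro's neural net and backgammon: after each random move, the estimated value of the old position is nudged toward the value of the new one, and over many self-played episodes the estimates converge.

```python
import random

# Tabular TD(0) on a random walk over states 0..6. Episodes start
# at state 3 and end at 0 (value 0) or 6 (value 1); the true value
# of state i is i/6. The state space, learning rate and episode
# count are invented for this toy.
random.seed(0)
V = [0.5] * 7
V[0], V[6] = 0.0, 1.0             # terminal values are fixed
alpha = 0.1                       # learning rate
for _ in range(5000):
    s = 3
    while s not in (0, 6):
        s2 = s + random.choice((-1, 1))   # a "random move"
        V[s] += alpha * (V[s2] - V[s])    # temporal-difference update
        s = s2
# V[1]..V[5] end up near 1/6 .. 5/6 -- learned purely from
# self-play, with no position evaluator supplied in advance.
```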

A few programmers have tried to apply the same techniques to
go, but their success has been limited. Even the most sophisticated
self-taught programs play like rank beginners; a bright newcomer to
the game of go might beat any of those programs in the very first
game. Yet machine learning remains a highly promising avenue of
research, and I am beginning to pursue it now, with the help of the
electrical engineers Donald C. Wunsch and Raonak Zaman of Texas Tech
University in Lubbock. Learning to play go from scratch may be too
much to ask of a neural net. But if such a net were added to our
already sophisticated program, it might gradually teach itself some of
the more intuitive points of go strategy--how to evaluate the strength
of a group or the value of a wall of stones, for instance.

No matter how you approach it, go is hard to program. Fortunately,
its difficulties happen to be the right kind of difficulties: they
force programmers to be innovative, to unify existing techniques, and
thereby to foster basic advances in artificial intelligence.

And yet, for the moment, computer go remains the campaign-finance
reform of AI: everyone agrees it is a good idea, but no one wants to
do it alone. It may be that computer go is simply too daunting, too
prone to disappointment, to attract the kind of concerted effort that
led to Deep Blue's triumph. Although a few tens of thousands of
dollars in prizes are given out every year in computer go
competitions, they are not enough to support even one programmer. The
Ing Foundation in Taiwan is one of the most generous supporters of
computer go, and offers a tantalizing $1.6 million prize for a program
that can beat a professional. Unfortunately, the offer ends in the
year 2000, and the goal is still so remote that the offer might as
well expire tomorrow.

As it stands, the field of computer go is dominated by
dedicated hobbyists. Handtalk was not written by a team of computer
scientists and engineers funded by a giant multinational corporation,
nor by a government-funded team of academics at a top research
institute. It was written by a retired chemistry professor from
Guangzhou, China, on his ancient PC, and in assembly
language because he did not have a C compiler. The bottleneck in
computer go has to do not with processor power, but with ideas.

Nevertheless, following gingerly in the footsteps of AI
optimists before me, I predict that thirty years from now a computer
will play go, without a handicap, against someone of Kim's
caliber. And the computer will win. Whether the essence of
intelligence can be captured in manipulated symbols, whether massive
search will be of any use outside of toy domains, whether self-taught
machines can ever solve real-world problems: those are all open
questions. But whatever the answers, computer go is poised to be an
integral part of the process of discovery.

David A. Mechner is a doctoral fellow at New York University's
Center for Neural Science.