Foresight Update 5

AI Directions

Artificial intelligence, like nanotechnology, will reshape our
future. Nanotechnology means thorough, inexpensive control of the
structure of matter, and early assemblers will enable us to build
better assemblers: this will make it a powerful and
self-applicable technology. Artificial intelligence (that is,
genuine, general-purpose artificial intelligence) will eventually
bring millionfold-faster problem solving ability, and, like
nanotechnology, it will be self-applicable: early AI systems will
help solve the problem of building better, faster AI systems.

AI differs from nanotechnology in that its basic principles are
not yet well understood. Although we have the example of human
brains to show that physical systems can be (at least somewhat)
intelligent, we don't understand how brains work or how their
principles might be generalized. In contrast, we do understand
how machines and molecules work and how to design many kinds of
molecular machines. In nanotechnology, the chief challenge is
developing tools so that we can build things; in AI, the chief
challenge is knowing what to build with the tools we have.

To get some sense of the possible future of AI--where research
may go, and how fast--one needs a broad view of where AI research
is today. This article gives a cursory survey of some major areas
of activity, giving a rough picture of the nature of the ideas
being explored and of what has been accomplished. It will
inevitably be superficial and fragmentary. For descriptive
purposes, most current work can be clumped into three broad
areas: classical AI, evolutionary AI, and neural networks.

Classical AI

Since its inception, mainstream artificial intelligence work
has tried to model thought as symbol manipulation based on
programmed rules. This field has a huge literature; good sources
of information include a textbook (Artificial Intelligence
by Patrick Winston, Addison-Wesley, 1984) and two compilations of
papers (Readings in Artificial Intelligence, Bonnie
Lynn Webber, Nils J. Nilsson, eds., Morgan Kaufmann, 1981, and Readings
in Knowledge Representation, Ronald J. Brachman, Hector J.
Levesque, eds., Morgan Kaufmann, 1985).

The standard criticism of AI systems of this sort is that they
are brittle, rather than flexible. One would like a system that
can generalize from its knowledge, know its limits, and learn
from experience. Existing systems lack this flexibility: they
break down when confronted with problems outside a narrow domain,
and they must be programmed in painful detail. Work continues on
alternative ways to represent knowledge and action, seeking
systems with greater flexibility and a measure of common sense.
(A learning program called Soar, developed by Allen Newell of Carnegie Mellon
University in collaboration with John Laird
and Paul Rosenbloom,
is prominent in this regard.) In the meantime, systems have been
built that can provide expert-level advice (diagnosis, etc.)
within certain narrow domains. Though not general and flexible,
they represent achievements of real value. Many of these
so-called "expert
systems" are in commercial use, and many more are under
construction.

Evolutionary AI

When one reads "artificial intelligence" in the
media, the term typically refers to expert systems. If this were
the whole of AI, it would still be important, but not potentially
revolutionary. The great potential of AI lies in systems that can
learn, going beyond the knowledge spoon-fed to them by human
experts.

The most flexible and promising learning schemes are based on
evolutionary processes, on the variation and selection of
patterns. Doug
Lenat's EURISKO program used this principle, applying
heuristics (rules of thumb) to solve problems and to vary and
select heuristics. It achieved significant successes, but Lenat
concluded that it lacked sufficient initial knowledge. He has
since turned to a different project, CYC, which
aims to encode the contents of a single-volume
encyclopedia--along with the commonsense knowledge needed to make
sense of it--in representations of the sort used in classical AI
work.

Another approach to evolutionary AI, pioneered by John Holland,
involves classifier
systems modified by genetic
algorithms. A classifier system uses a large collection
of rules, each defined by a sequence of ones, zeroes, and
don't-care symbols. A rule "fires" (produces an output
sequence) when its sensor-sequence matches the output of a
previous rule; a collection of rules can support complex
behavior. Rules can be made to evolve through genetic algorithms,
which make use of mutation and recombination (like chromosome
crossover in biology) to generate new rules from old. This work,
together with a broad theoretical framework, is described in the
book Induction: Processes of Inference, Learning, and
Discovery (by John H. Holland, Keith J. Holyoak, Richard
E. Nisbett, and Paul R. Thagard, MIT Press, 1986). So far as I
know, these systems are still limited to research use.
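The rule matching and genetic operators described above can be sketched in a few lines. This is a minimal illustration of the idea, not a reconstruction of Holland's actual systems; the rule strings and mutation rate are made up.

```python
import random

# A classifier rule's condition is a string over {'1', '0', '#'},
# where '#' is a don't-care symbol that matches either value.
def matches(condition, message):
    """A rule fires when every position matches or is a don't-care."""
    return all(c == '#' or c == m for c, m in zip(condition, message))

def crossover(parent_a, parent_b):
    """One-point crossover, loosely analogous to chromosome crossover:
    splice the front of one rule onto the back of another."""
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(condition, rate=0.1):
    """With small probability, replace a position with a random symbol."""
    return ''.join(random.choice('10#') if random.random() < rate else c
                   for c in condition)

rule = '1#0#'
print(matches(rule, '1101'))   # True: the '#' positions match anything
print(matches(rule, '0101'))   # False: the first position conflicts
```

Genetic algorithms generate new rules from old ones by applying `crossover` and `mutate` to rules that have proven useful, then letting the new rules compete.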

Mark S. Miller and
I have proposed an agoric approach to evolving software,
including AI software. If one views complex, active systems as
being composed of a network of active parts, the problem of
obtaining intelligent behavior from the system can be recast as
the problem of coordinating and guiding the evolution of those
parts. The agoric approach views this as analogous to the problem
of coordinating economic activity and rewarding valuable
innovation; accordingly, it proposes the thorough application of
market mechanisms to computation. The broader agoric open
systems approach would invite and reward human involvement in
these computational markets, which distinguishes it from the
"look Ma--no hands!" approach to machine intelligence.
These ideas are described in three papers ("Comparative
Ecology: A Computational Perspective," "Markets and
Computation: Agoric Open Systems," and "Incentive
Engineering for Computational Resource Management")
included in a book on the broader issues of open computational
systems (The Ecology of Computation, B. A. Huberman,
ed., in Studies in Computer Science and Artificial Intelligence,
North-Holland, 1988).

Ted Kaehler of
Apple Computer has used agoric concepts in an experimental
learning system initially intended to predict future characters
in a stream of text (including written dates, arithmetic
problems, and the like). Called "Derby," in part
because it incorporates a parimutuel betting system, this system
also makes use of neural network principles.

Neural nets

Classical AI systems work with symbols and cannot solve
problems unless they have been reduced to symbols. This can be a
serious limitation.

For a machine to perceive things in the real world, it must
interpret messy information streams--taking information
representing a sequence of sounds and finding words, taking
information representing a pattern of light and color and finding
objects, and so forth. To do this, it must work at a pre-symbolic
or sub-symbolic level; vision systems, for example, start their
work by seeking edges and textures in patterns of dots of light
that individually symbolize nothing.

Such tasks typically require a huge mass of simple,
repetitive operations before patterns can be seen
in the input data. Conventional computers simply do one operation
at a time, but these operations can be done by many simpler
devices operating simultaneously. Indeed, these operations can be
done as they are in the brain--by neurons (or neuron-like
devices), each responding in a simple way to inputs from many
neighbors, and providing outputs in turn.
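A neuron-like unit of the sort just described can be sketched as a weighted-sum-and-threshold device. The weights and threshold below are hand-picked for illustration only.

```python
def unit_output(inputs, weights, threshold):
    """A simple neuron-like unit: fire (output 1) if the weighted sum
    of signals from neighboring units exceeds a threshold, else 0."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

# A two-input unit whose hand-chosen weights make it behave like
# logical AND: it fires only when both neighbors are active.
print(unit_output([1, 1], [0.6, 0.6], 1.0))  # 1
print(unit_output([1, 0], [0.6, 0.6], 1.0))  # 0
```

A network wires many such units together, each unit's output serving as an input to its neighbors.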

Recent years have seen a boom in neural network research.
Different projects follow diverse approaches, but all share a
"connectionist" style in which significant patterns and
actions stem not from symbols and rules, but from the cooperative
behavior of large numbers of simple, interconnected units. These
units roughly resemble neurons, though they are typically
simulated on conventional computers, and the resemblance in
behavior is often very rough indeed. Neural networks have shown
many brain-like properties, performing pattern recognition,
recovering complete memories from fragmentary hints, tolerating
noisy signals or internal damage, and learning--all within
limits, and subject to qualification. A variety of neural network
models are described in the two volumes of Parallel
Distributed Processing: Explorations in the Microstructure of
Cognition (edited by David E. Rumelhart and James L.
McClelland, MIT Press, 1986). Neural network systems are
beginning to enter commercial use. Some characteristics of neural
networks have been captured in more conventional computer
programs (Efficient
Algorithms with Neural Network Behavior, by Stephen M.
Omohundro, Report UIUCDCS-R-87-1331, Department of Computer
Science, University of Illinois at Urbana-Champaign, 1987).
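The brain-like property of recovering complete memories from fragmentary hints can be made concrete with a Hopfield-style associative memory. This is a generic sketch of the technique, not a reconstruction of any system described in the book; the stored pattern and corrupted probe are made up.

```python
def train(patterns):
    """Hebbian learning: strengthen the connection between two units
    in proportion to how often they are active together."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                      # no self-connections
                    W[i][j] += p[i] * p[j] / n
    return W

def recall(W, probe, steps=10):
    """Start from a (possibly corrupted) probe and let every unit
    repeatedly update itself from its neighbors' signals."""
    state = list(probe)
    for _ in range(steps):
        state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                 for row in W]
    return state

memory = [1, -1, 1, -1, 1, -1]                  # stored +1/-1 pattern
W = train([memory])
hint = [1, -1, 1, -1, -1, 1]                    # last two units corrupted
print(recall(W, hint))                          # recovers the stored memory
```

Note that nothing here manipulates symbols or rules; the complete memory re-emerges from the cooperative behavior of the units.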

A major strength of the neural-network approach is that it is
patterned on something known to work--the brain. From this
perspective a major weakness of most current systems is that they
don't very closely resemble real neuronal networks. Computational
models inspired by brain research are described in a broad,
readable book on AI, philosophy, and the neurosciences (Neurophilosophy,
by Patricia Smith Churchland, MIT Press, 1986) and in a more
difficult work presenting a specific theory (Neural
Darwinism, by Gerald Edelman, Basic Books, 1987). A bundle
of insights based on AI and the neurosciences appears in The
Society of Mind (by Marvin Minsky, Simon and Schuster,
1986).

Some observations

For all its promise and successes, AI has hardly
revolutionized the world. Machines have done surprising things,
but they still don't think in a flexible, open-ended way. Why has
success been so limited?

One reason is elementary: as robotics researcher Hans Moravec of
Carnegie Mellon University has noted, for most of its history, AI
research has attempted to embody human-like intelligence in
computers with no more raw computational power than the brain of
an insect. Knowing as little as we do about the requirements for
intelligence, it makes sense to try to embody it in novel and
efficient ways. But if one fails to make an insect's worth of
computer behave with human intelligence--well, it's certainly no
surprise.

Machine capacity has increased exponentially for several decades,
and if trends continue, it will match the human brain (in terms
of raw capacity, not necessarily of intelligence!) in a few more
decades. Meanwhile, researchers work with machines that are
typically in the sub-microbrain range. What are the prospects for
getting intelligent behavior from near-term machines?
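The "few more decades" estimate can be checked with back-of-envelope arithmetic. Every figure below is my own illustrative assumption (a rough guess at brain capacity, a sub-microbrain machine, a rough doubling time), not a number from the article.

```python
import math

brain_ops_per_sec = 1e14     # assumed raw capacity of a human brain
machine_ops_per_sec = 1e7    # assumed 1980s machine, sub-microbrain range
doubling_years = 2.0         # assumed capacity doubling time

# Number of doublings needed to close the gap, and the years implied.
doublings = math.log2(brain_ops_per_sec / machine_ops_per_sec)
years = doublings * doubling_years
print(f"about {doublings:.0f} doublings, roughly {years:.0f} years")
```

On these assumptions the crossover lands a few decades out; the conclusion is insensitive to modest changes in the guesses, since each factor of two only shifts the answer by one doubling time.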

If machine intelligence should require slavish imitation of brain
activity at the neural level, then machine intelligence will be a
long time coming. Since brains are the only known systems with
general intelligence, this is the proper conservative
assumption, which I made for the sake of argument at one point in
Engines of Creation.
Nonetheless, just as assemblers will enable construction of many
materials and devices that biological evolution never stumbled
across, so human programmers may be able to build novel kinds of
intelligent systems. Here we cannot be as sure as in
nanotechnology, since we do not know what to build, yet
novel systems seem plausible. It is, I believe, reasonable to speculate
that there exist forms of spontaneous order in neural-style
systems that were never tested by evolution--indeed, that may
make little biological sense--and that some of these are orders
of magnitude better (in speed of learning, efficiency of
computation, or similar measures) than today's biological
systems. Stepping outside the neural realm for a moment, Steve
Omohundro (see above) has found algorithms that outperform
conventional neural networks in certain learning and mapping
tasks by factors of millions or trillions.

Algorithms of neural style
may exist that were never tested by evolution

Thus, although there is good reason to explore brain-like
neural networks, there is also good reason to explore novel
systems. Indeed, some of the greatest successes in current neural
network research involve multi-level versions of
"back-propagation" learning schemes that seem rather
nonbiological (and Omohundro's algorithms seem entirely
nonbiological).
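A back-propagation scheme of the kind mentioned above can be sketched as a small two-layer network trained by gradient descent. The network size, learning rate, and choice of XOR as the task are illustrative assumptions, not details of any particular system discussed here.

```python
import math, random

random.seed(1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

HIDDEN = 4   # illustrative: 2 inputs, 4 hidden units, 1 output
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
w_o = [random.uniform(-1, 1) for _ in range(HIDDEN + 1)]

def forward(x):
    """Forward pass: inputs -> hidden layer -> output (last weight = bias)."""
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sigmoid(sum(w_o[i] * h[i] for i in range(HIDDEN)) + w_o[HIDDEN])
    return h, o

def total_error(data):
    return sum((forward(x)[1] - t) ** 2 for x, t in data)

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR
lr = 0.5

before = total_error(data)
for _ in range(10000):
    for x, t in data:
        h, o = forward(x)
        # Error signal at the output, propagated back to each hidden unit --
        # this backward flow of error is what gives the scheme its name.
        d_o = (o - t) * o * (1 - o)
        d_h = [d_o * w_o[i] * h[i] * (1 - h[i]) for i in range(HIDDEN)]
        for i in range(HIDDEN):
            w_o[i] -= lr * d_o * h[i]
            w_h[i][0] -= lr * d_h[i] * x[0]
            w_h[i][1] -= lr * d_h[i] * x[1]
            w_h[i][2] -= lr * d_h[i]
        w_o[HIDDEN] -= lr * d_o
print(f"error before: {before:.3f}  after: {total_error(data):.3f}")
```

The sweep of activation forward and error backward has no obvious counterpart in real neural tissue, which is part of why such schemes seem rather nonbiological.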

In summary, AI research is rich in diverse, promising approaches.
Our ignorance of our degree of ignorance precludes any accurate
estimate of how long it will take to develop genuine, flexible
artificial intelligence (of the sort that could build better AI
systems and design novel computers and nanomechanisms). If
genuine AI requires understanding the brain and developing
computers a million times more powerful than today's, then it is
likely to take a long time. If genuine AI can emerge through the
discovery of more efficient spontaneous-order processes (or
through the synergistic coupling of those already being studied
separately) then it might emerge next month, and shake the world
to its foundations the month after.

In this, as in so many areas of the future, it will not do to
form a single expectation and pretend that it is likely ("We
will certainly have genuine AI in about 20
years"--poppycock!). Rather, we must recognize our
uncertainty and keep in mind a range of expectations, a range of
scenarios for how the future may unfold. Genuine AI may come very
soon, or very late; it is more likely to come sometime in
between. Since we don't know what we're doing, it's hard to guess
the rate of advance. Sound foresight in this area means planning
for multiple contingencies.