PSYCHOLOGICAL SCIENCE

General Article

WHAT IS AN "EXPLANATION" OF BEHAVIOR? By Herbert A. Simon

The cognitive "revolution" in psychology introduced a new concept of explanation and somewhat novel methods of gathering and interpreting evidence. These innovations assume that it is essential to explain complex phenomena at several levels, symbolic as well as physiological: complementary, not competitive. As with the other sciences, such complementarity makes possible a comprehensive and unified experimental psychology. Contemporary cognitive psychology also introduced complementarity of another kind, drawing upon, and drawing together, both the behaviorist and the Gestalt traditions.

Address correspondence and reprint requests to Herbert A. Simon, Department of Psychology, Carnegie-Mellon University, Pittsburgh, PA 15213.

I would like to begin with two comments on contemporary cognitive psychology, on where we stand. The evidence supporting these observations is so overwhelming that I will not bore you by rehearsing it. But we have some conventional, customary ways of talking about psychology that fly in the face of what I think are the facts, and I would like to distance myself from these ways of speaking, which I believe are harmful to the continued rapid progress of our science.

How often have you heard that "some day we will understand the mind," or that "the human brain is a great mystery that we must seek to solve"? In fact, psychology exists not in the future, but in the present. By any reasonable metric, we know more about the human mind and brain than geophysicists know about the plate tectonics that move the continents over the globe, far more than particle physicists know about elementary particles, or biologists about the processes that transform a fertilized egg into a complex multicellular organism.

We discount our knowledge because some of it is so commonplace, so familiar from our everyday acquaintance with ourselves and other people. We discount it also because it often is insufficient to permit predictions of behavior in important matters that concern us. The former is a great blessing to us, for it allows us to learn easily facts of sorts that other sciences have to tease out with great effort. The latter is a true limitation that we share with meteorologists, evolutionary biologists, and all those physical or biological scientists who venture outside the laboratory into the complexity of real-world phenomena.

Hence, my first comment: In the year 1991, we know a great deal about human thinking, and especially about the symbolic processes, involving selective heuristic search and recognition of familiar cues, that people use to solve problems, to design artifacts and strategies, to make decisions, to communicate in natural language, and to learn. How people solve problems is no great mystery; we know enough about it to create computer programs that do it, and do it in a way that closely simulates human performance, step by step. By the same test, we know how people design strategies, and even how they learn language and make scientific discoveries.

In all these cases, we have examples of computer programs that perform these tasks in humanoid ways. If you want evidence for this claim, I can refer you, for starters, to standard sources like Anderson's (1990b) cognitive psychology textbook or the recent Foundations of Cognitive Science, edited by Posner (1989). Cognitive psychology is not some dream of the future; it exists, and it allows us to explain a vast range of phenomena. It is not a finished science, thank goodness (what science is?), but each year adds to its store of knowledge and understanding, and its powers of prediction.

My second comment: Histories of psychology are fond of talking about "schools of thought," and their rise and fall, attributing to the chronology of our field a circular course, rather than the helical one (at worst) attributed to other sciences. In the histories there is not just psychology, the science of human behavior; there is introspectionist psychology, and behaviorist psychology, and Gestalt psychology, and information processing psychology, and connectionist psychology: schools without end, and without cumulation, each school combating and destroying the previous one, to be consumed, in turn, by its successor.

This circular view of history is wholly counterfactual. The "cognitive revolution" (I even used the phrase in my opening summary) did not destroy either behaviorism or

1992 American Psychological Society VOL. 3, NO. 3, MAY 1992


Gestalt psychology. It drew liberally upon both of them, both for experimental data and for concepts. The productions of information processing psychology are natural descendants of the familiar stimulus-response links of behaviorism (though not identical with them). Means-ends analysis, central to information processing theories of problem solving, was explored by Duncker (1945), and by Selz (1913) before him. The neural nets of current connectionist models have their origins in physiological probings of the nervous system, via the "cell assemblies" of Hebb (1949), and in notions, traceable back to Aristotle, of the associative structure of memory.

In the course of this paper, I refer to another currently fashionable novelty in psychology, so-called situated action and situated learning, and show that its antecedents are also very familiar. Psychology is as progressive and cumulative as any of the sciences, and we can today cite experiments of Ebbinghaus (1964), or Wundt (1902), or Hovland (1951), or Skinner (1938) as major sources of empirical support for contemporary theories.¹

1. For example, see the uses of Ebbinghaus (1964) and Hovland (1951) in testing the EPAM theory of verbal learning (Feigenbaum & Simon, 1984).

In our generation, we have discovered a mode of psychological theorizing that has greatly facilitated, and will continue to facilitate, the cumulation of knowledge and theory in psychology. Today we build computer models of both symbolic and connectionist systems. Instead of constructing microtheories for each phenomenon we observe (e.g., theories of retrospective inhibition), or macrostatements that are too simplistic and general to explain much (e.g., "forgetting follows a power law"), we construct computer programs that can be given complex cognitive tasks, identical to those given to our human subjects, and that will predict the temporal path of human behavior on those tasks (Newell & Simon, 1972).

Some cognitive psychologists today aspire to build "unified" models of this kind: The SOAR (Newell, 1990), ACT* (Anderson, 1983), and PDP (Rumelhart & McClelland, 1986) systems are familiar examples. Others of us aim at models of middle range: a GPS (Newell & Simon, 1972) to account for problem-solving phenomena; an EPAM (Feigenbaum & Simon, 1984) to account for verbal learning processes; an ISAAC (Novak, 1977) to explain how people understand problems described in natural language text, construct mental representations of those problems, and go on to solve them; and an INTERNIST (Miller, Pople, & Myers, 1982) or a MYCIN (Shortliffe, 1976) to describe the processes of expert medical diagnosis.

Whether comprehensive or not, such models enable cognitive psychology to organize large bodies of data around the mechanisms that produced them; and the availability and relevance of these large bodies of data provide powerful means to test the adequacy of the models. This tying together and relating of disparate bodies of experimental data with hypotheses about the causal mechanisms greatly facilitates cumulation.

So much for these two debilitating myths: that the mind is something we will understand in the future and that the path of psychology is circular, each new "school" tearing down and replacing the one it succeeds. Neither myth bears the slightest resemblance to the true state of affairs, and it is time that we put them to rest and get on with advancing still further a science that has made great strides in this century.

EXPLAINING A CONVERSATION

The scene is a street in Singapore. A woman is talking to two other women, talking in Tamil, a Dravidian language that is spoken in a large region of southern India around Madras and in parts of Sri Lanka. We wish to explain her behavior.

What is there to explain? For one thing, why Tamil? Why not English, or Chinese, or Malayan, the predominant languages in Singapore? An explanation would describe the migrations that brought large numbers of Tamils from India to this distant port. This "simple" explanation still presupposes some vital theoretical underpinnings. It assumes that under some circumstances migrants will retain, for a generation or even beyond, the language of their ancestors. What are those circumstances, and what conditions in Singapore satisfied them? And when will this woman, probably multilingual, use Tamil, and when one of the other languages of Singapore?

The explanation by migration also assumes conditions that caused the migrants to leave their homeland, and historical "laws" that would explain migration as a response to such conditions. What were those conditions, and what is the nature of such laws?

Some social psychologists undertake to answer questions like these. For the rest, these questions are usually left to history, sociology, and the other social sciences. But insofar as they involve things stored in the human memory, they are also part of cognitive psychology. It is proper that they be welcomed back into our science, as is being done by those who are now focusing on the psychology of everyday life.

The Structure of a Dynamic Explanation

Even answers to all of these questions will only begin to explain our Tamil woman's behavior, but before continuing, let us ask what has already been revealed.


Our explanation has the form of a fugue, with two intertwined themes. First, to explain an event, we refer to antecedent events, the initial conditions. To explain a Tamil-speaking woman's presence in Singapore, we find a migration from Madras. But that poses the new question of explaining the presence of Tamil-speaking people in southern India. So explanation by antecedent events takes us back to the explanation of those antecedents. If the data were available (they are not), they could take us step by step in an almost infinite regress to the cosmological Big Bang and beyond.

But (the second theme of the fugue) explanation by antecedent events also requires general laws to explain how each situation causes the succeeding one. What causes of migration can take people from one land to another? What laws determine the language that a person will speak in an ethnically foreign land, and when?

The natural sciences commonly employ this fuguelike structure of explanation. The differential equations of physics describe mechanisms that determine the next movements of the stars and planets, given the initial conditions: their present positions and velocities. The laws of genetics and Darwinian selection explain how a community of organisms (the initial conditions) evolves over years or millennia into a new and different community.

For systems that change through time, explanation takes this standard form: Laws acting on the current state of the system produce a new state, endlessly. Such explanations can be formalized as systems of differential equations or difference equations.

Explaining by Simulation

We return to our Tamil women, whom we left talking on the street. To understand their conversation, we must have some knowledge of the lexicon and syntax of their language. Tamil is one of about 20 highly inflected Dravidian languages spoken throughout southern India. To characterize its syntax, we build a computer program that parses the speaker's sentences. Such a parser is also a set of difference equations, playing the same role as the differential equations in physics.

But we might go even further in explaining Tamil. We might build a diachronic story (conceptually, another set of difference equations) to explain how the contemporary Dravidian languages evolved from some common ancestral base. This means postulating laws of linguistic transformation that cause language evolution. Since Chomsky's revolution, or even since Grimm's, explanation in linguistics has become another exercise in building and testing difference equations (Chomsky, 1957).

Another approach to these questions is to write computer programs that are capable of using and understanding, even learning, language. A computer program is (literally, not metaphorically) a system of difference equations. For each possible state of the computer, combined with the input at that instant, the program determines the next state of the computer. The computer's memory holds the initial conditions (the current state) and the laws of behavior (the program). Its input devices convey to it the external stimulus, which may, as in the case before us, take the form of sentences in a natural language.

Since a computer program is a system of difference equations, a properly programmed computer can be used to explain the behavior of the dynamic system that it simulates. Theories can be stated as computer programs.

Controlled experiments can be performed on computer programs, altering specified program components to determine how such changes affect the performance of tasks. The architecture can thereby be modified to simulate the human performance better.

There is no epistemological difference between using a program incorporating Newton's laws to explain the movements of Mars and using a program incorporating linguistic laws to explain how speech is generated or understood. But perhaps you are not familiar with the computer programs that have these linguistic capabilities. One example is ISAAC, written by Novak (1977), which reads the English language statements of problems in physics textbooks, forms internal representations ("mental pictures") of the problem situations, and then proceeds to derive the applicable equations and to solve them.

Another such program is ZBIE, written by Siklossy (1972), which reads a simple sentence in a natural language at the same time it inputs a diagrammatic representation of the scene described by the sentence (e.g., "The dog chases the cat."). ZBIE learns the meanings of the words in the sentences it reads (i.e., learns what objects or relations in the diagrams the words denote) and analyzes their grammatical structure. When it is later confronted with a scene it has not seen before, but one composed of familiar kinds of objects in familiar relations, it constructs an appropriate and grammatically correct sentence to describe the scene.²

2. Because the words in the sentences have denotations in the diagrams, ZBIE has a genuine understanding of the sentences it reads and those it constructs. It anticipates fully, and by a decade, the objections against machine understanding raised by Searle (1984) in his Chinese Room parable, and answers these objections decisively.

A remarkable feature of programs like ZBIE is that they not only explain how natural language is understood, they also understand it. The linguistic symbols are not translated into an esoteric formal language; hence, we do not have to numericize or otherwise encode the sentences whose production or understanding we wish to


explain. The programs use symbol structures that are isomorphic to those the human subject uses. All information processing theories of cognition have this property: They actually perform the tasks whose performance they explain.

Programs that simulate cognitive processes describe the processes in symbolic languages isomorphic to those being modeled, and hence, actually execute the processes. Consequently, they provide a rigorous test of the sufficiency of the hypothesized processes to perform the tasks of interest.

NEUROPHYSIOLOGICAL EXPLANATION

Simulating language behavior with a computer teaches us the properties an architecture must possess if it is to speak and listen, and what processes are employed by its program. It allows us to test, at the level of symbolic behavior, how closely these processes resemble those of human speakers or listeners. It does not tell us how the same structural conditions and programs are realized by the biological components known as neurons and the assemblages of components that make up the biological brain.

Explanation of cognitive processes at the information processing (symbolic) level is largely independent of explanation at the physiological (neurological) level that shows how the processes are implemented.

There is nothing mysterious about explaining phenomena at different levels of resolution. It happens all the time in the physical and biological sciences. A theory of genetics need not (fortunately) rely on a knowledge of quarks. As a matter of history, the former theory preceded the latter by many years. The theory of genetic processes was developed by Mendel, using genes as abstract primitive "atoms." Fifty years later, a microscopic foundation was provided for the theory by locating the genes in visible chromosomes. After another half century, the structure of chromosomes was elucidated in terms of the combinatorics of DNA, strands of four complex submolecules, nucleotides. Two levels of reduction and still no quarks! And no need of them, although we surely believe that nucleotides are made of atoms, which are made of neutrons and protons, which are made of quarks.

Explanation on different levels does not deny the possibility of reduction. Higher level theories use aggregates of the constructs at lower levels to provide parsimonious explanations of phenomena without explicit reference to the microconstructs. The lower level details do not show through to the higher level.

Of course, the higher level mechanisms are reducible to those of the lower level (at least in principle, although the computations can actually be carried out only in the simplest situations). But we do not require the reduction in order to explain the aggregate events at the higher level. We can write the system of difference equations for this higher level independently of any lower level explanation. Cognitive psychology (fortunately) does not have to stand still with breathless expectation until neurophysiology completes its work. As cognitive psychology has been doing, it can proceed with its task of explaining thought processes at the level of symbol systems.

Partitioning explanation into levels also points to a strategy for neurophysiological research. Neuropsychology has two main tasks. It must explain electrochemically how neurons and simple organizations of neurons store and transmit information. It must also help build the bridge theory that shows how the symbol structures and symbol-manipulating processes that handle information at a more aggregated level can be implemented by such neuronal structures and organizations. The bridge need not be built solely from one bank of the river; it can be constructed by cooperative effort of information processing psychologists with neuropsychologists. But if they are to cooperate, they must learn to read each other's blueprints.

This strategy relieves neuropsychology of the heroic, but impossible, task of climbing in a single step from neurons and nerve nets up to complex human behavior without inserting intermediate strata into the structure. Some neuropsychologists and connectionists do not yet accept the need for higher level aggregate theories, or the meaning of information processing programs as examples of such theories. Such misunderstanding forms a serious barrier to collaboration.

Nowadays, a discussion of neurophysiology necessarily raises the question of whether mental functions are to be modeled as parallel or serial systems. At the lowest level, the individual neuron demonstrably transmits signals longitudinally, in serial fashion. At the next level up, brain tissue forms a network of elements operating in parallel, and the same can be said of the eyes and ears. At the level of conscious reportable events, the bottleneck of attention and short-term memory again gives the mind the characteristics of a serial organization. It is worth pondering that the low-level anatomy of the conventional von Neumann "serial" computer looks every bit as parallel as a neural network; yet at the more aggregate, symbolic, level, it executes its processes sequentially, one or a few at a time.

From these observations, we can conclude, first, that at the level of the network of neurons, modeling will have to be largely parallel. It is not clear, as yet, how far we can abstract from the details of neural structure in our models, or how many structures the models will have to contain to simulate relevant events at this level.

Second, at the symbolic level, the level of events tak-


ing place in hundreds of milliseconds or more, modeling will continue to be largely serial, for the mind behaves like a serial system wherever the bottleneck of attention supervenes upon events. While most people can probably chew gum and walk at the same time, very few can carry on a technical conversation while maneuvering a car through heavy traffic.

Third, at the intermediate level of events milliseconds or tens of milliseconds in duration, the comparative advantages of parallel and serial modeling are not yet clear. This is the level of the EPAM program (Feigenbaum & Simon, 1984), which simulates learning and perception at the symbolic level, and the level of most connectionist systems. It is also the foundation level of SOAR (Newell, 1990), a unified control structure for cognition. Teasing out the respective roles of parallel and serial processors and their interface at or near this level is a major contemporary task for cognitive research.

Concern with architecture reminds us that not all theories take the form of difference equations. In fact, theories in psychology have traditionally had a quite different form. Typically, they make assertions such as "If the independent variable, x, increases, the dependent variable, y, will also increase." Laws of this form are very weak. They are also merely descriptive, not explanatory.

Much stronger claims are made by laws of the form "y = 80x + 300," where the parameters, 80 and 300, were known or estimated prior to the current experiment. If, in addition, these parameters describe structural characteristics of the system (e.g., the speed at which it can store or access information), then the law begins to explain as well as to describe. Let us call laws of this kind, with the numerical parameters taken seriously, models.

For example, Baddeley (1981) showed that the contents of short-term memory can be retained for only about 2 s without overt or covert rehearsal. This finding implies that the maximum capacity of short-term memory is whatever content can be rehearsed in this time. Other experiments have shown that it takes about 300 ms to recover a familiar "chunk" (e.g., a familiar word or phrase) from long-term memory, and about 80 ms per syllable to pronounce it. From these facts, there follows the law: 2,000 = 300C + 80S, where C and S are the numbers of chunks and syllables, respectively, in the longest strings that can be retained in short-term memory. The law can be tested using the standard immediate recall paradigm (Zhang & Simon, 1985).

Some of the properties of systems can be captured in static laws, preferably models, which specify the relations among variables, qualitatively or numerically.

EXPLAINING THINKING

Our Tamil women are still talking on the street in Singapore. So far, we do not know what they are saying. When we eavesdrop, we find that the speaker is explaining to her companions how to solve the Tower of Hanoi puzzle!³

3. It is widely believed on the Carnegie-Mellon campus that I cannot give a talk without mentioning the Tower of Hanoi within the first 15 min. I contribute this new evidence to support that belief.

By now, we know exactly how to theorize about this kind of behavior. We construct a set of difference equations (a computer program in a symbol-processing language) that simulates human behavior in solving the Tower of Hanoi problem. In fact, programs of this kind have existed for some years (Simon, 1975). Notice that I refer to "programs" in the plural, for different people may solve the problem in different ways, using different strategies.

Heuristic Search

[...]scription. Common to virtually all of the problem-solving strategies that people have been observed to use is a problem space and a search through this space until a solution is reached (Newell & Simon, 1972). The moves that change one situation into another in the Tower of Hanoi may be legal moves, as defined by the problem instructions, or they may be "wished-for" moves that change the current situation into a distant one in one step.

In some strategies, most of the problem solving takes place in the head, making use of symbolized goals and mental models, symbol structures describing the situation at each stage of the search. In other strategies, the subjects work directly from the physical Tower of Hanoi puzzle in front of them, using visual perception of the current arrangement of the disks to calculate a next move, and recording it by actually moving the disk. In currently fashionable terminology, the subjects who use the latter strategies are engaging in situated action.

There is a good deal of debate at present (under the rubric of situated action) as to whether problem solving requires the subject to create a mental problem space and to search in that space, or whether the search can be almost wholly external, with no significant problem representation in the head (Suchman, 1987; Winograd & Flores, 1986). Sometimes the debate is enlarged by challenging whether problem solving can be modeled at all by symbolic systems.

The best way to resolve the debate is to construct programs and observe what they can and cannot do. A


running program is the moment of truth. This particular debate has been largely resolved by programs already written and tested. Some strategies that have been written for the Tower of Hanoi depend on search through an internal representation of the problem, or even initial search through an abstracted representation to find a plan for the more detailed search. Other strategies that have been written search externally, representing internally only the "affordances" provided by the external objects and their relations (Simon, 1975). Hence, it has been demonstrated constructively that both situated action and strategies requiring planning and internal representations are realizable by symbol-processing systems.

What has not been settled, and cannot be settled without extensive empirical study, is the extent to which, and the circumstances under which, human beings will use one or another kind of strategy. Our Tamil woman is not carrying a physical Tower of Hanoi puzzle with her. She has no alternative, if she is to explain the solution to her friends, but to form a mental representation of some sort (a problem space) and to describe the moves in that space. Her friends have no alternative for understanding the explanation but to translate the description into their own mental representations. If a physical Tower of Hanoi puzzle were present, matters might be quite different. But life does offer us a great deal of variety. So much for situated action.

Different people, or the same people in different situations, can employ different strategies for performing a given task. A theory of their performance would include a computer program describing the strategy they are using in a given instance together with a specification of the circumstance under which this particular strategy will be used. The specification can include a variety of elements, including the subjects' previous experience and learning.

Expert Behavior

Actually, I was joking about the Tower of Hanoi. That is not what the Tamil women are talking about at all. In fact, the speaker is telling about a new recipe she has learned; her friends regard her as an expert in preparing gourmet meals.

The conversation is not a monologue. The expert does most of the talking, but her friends ask frequent questions, and she usually replies promptly. One of them asks how long the dish should remain in the oven. The expert answers, then says, "Of course, I don't have any systematic rules for determining such things. I just use my intuition. It's all a matter of experience."

The expert has just stated, very succinctly, the theory of expert performance that has emerged in recent years from psychological research and modeling. In everyday speech, we use the word intuition to describe a problem-solving or question-answering performance that is speedy and for which the expert is unable to describe in detail the reasoning or other process that produced the answer. The situation has provided a cue; this cue has given the expert access to information stored in memory, and the information provides the answer. Intuition is nothing more and nothing less than recognition.

We do not have conscious access to the processes that allow us to recognize a familiar object or person. We recognize our friend, but we do not know what traits and features, what cues, enable that recognition to occur. Nor can we describe these traits and features to other people accurately enough to enable them to recognize the same person. We are aware of the fact of recognition, which gives us access to our knowledge about our friend; we are not aware of the processes that accomplish the recognition.

The process of recognition (i.e., intuition) is readily realized in computer programs by means of so-called productions. A production is an (if → then), or (condition → action), statement that, at least superficially, resembles a (stimulus → response) linkage. For our present purposes, we need note only that, while the stimuli of classical behaviorism are in the environment, not in the head, the conditions that have to be satisfied to trigger the action of a production may be (but need not be) symbol structures held in memory. Productions can implement either situated action or internally planned action, or a mixture of these.

Quite general programming languages (e.g., the language OPS5; Brownston, Farrell, & Martin, 1985) can be constructed entirely of productions. The execution of a production can be made to depend on a context by including among the conditions for execution one or more goal symbols. The production will then be activated only in contexts where the appropriate goal is present. Conditions can also reflect other elements of contexts besides goals.

Consider a (simplified) expert modeled as a production system. Cues in the environment that the expert encounters trigger information in memory, hence, initiate actions appropriate to the situations marked by these cues. In its simplest form, the model produces situated action.

When the doctor notices some symptoms, a diagnosis is triggered, or, alternatively, information that is accessed indicates certain additional tests should be performed to reach a definitive diagnosis (a departure from pure situated action). When the doctor has reached a diagnosis, another production accesses information in memory about the prognosis and about appropriate courses of treatment.

Information organized in a production system of this kind (a sort of indexed encyclopedia) can produce expert behavior. Expert systems may, in addition, have

VOL. 3, NO. 3, MAY 1992 155


some capabilities for means-ends analysis or other forms of reasoning and heuristic search, but at their core is a production system capable of recognizing appropriate cues, hence, capable of acting intuitively.

There is no incompatibility between intuition and analysis. A chess master in a tournament does a good deal of analysis, of look-ahead to possible continuations of the game. The same chess master, playing simultaneously a number of weaker players, moves quickly, hardly analyzing ahead at all but selecting moves almost wholly on intuition in the form of recognition of weaknesses created by the opponents. This rapid play is weaker than the more analytic play of the tournament, but only a little weaker.

A large part of the chess master's expertise lies in his or her intuitive (recognition) capabilities, based, in turn, on large amounts of stored and indexed knowledge derived from training and experience. Under the conditions of rapid play, the chess master's behavior is a form of situated action; under tournament conditions, it is more planful.

Similarly, our expert Tamil gourmet, after a quick inventory of her refrigerator and kitchen cabinet, can rustle up a presentable and tasty meal in a hurry, relying on intuition: experience encapsulated in memory and evoked by the sight of familiar items of food. Of course, given some time to plan and prepare, she can usually produce an even more delicious meal.

The core of an expert system, in human or computer, is a system of productions that operates like an indexed encyclopedia. Cues in the situation (external or imagined) are recognized by the conditions of productions, triggering the actions associated with these conditions. The case in which the cues are predominantly external is sometimes called situated action.

The production system of an expert is generally associated also with reasoning (search) capabilities that support an adaptive system of analytic and intuitive responses.

ADAPTIVITY OF BEHAVIOR

The human mind is an adaptive system. It chooses behaviors in the light of its goals, and as appropriate to the particular context in which it is working. Moreover, it can store new knowledge and skills that will help it attain its goals more effectively tomorrow than yesterday: It can learn.

As a consequence of the mind's capacities for adaptation and learning, human behavior is highly flexible and variable, altered by both circumstances and experience.

Scientific laws, whether descriptive or explanatory, are supposed to capture the invariants of the phenomena, those underlying regularities that do not change from moment to moment. How does one find laws to describe or explain the behavior of an adaptive system?

The shape of a gelatin dessert cannot be predicted from the properties of gelatin, but only from the shape of the mold into which it was poured. If people were perfectly adaptable, psychology would need only to study the environments in which behavior takes place. Some of this viewpoint is reflected in the affordances of Gibson's (1979) theories of perception, and in the rational adaptation models of my colleague Anderson (1990a, 1991).

In its extreme form, this position eliminates the need to run laboratory experiments or to observe people. Merely examine the shape of the mold: Analyze the environment in which the behavior is to take place and the goals of the actor, and from these deduce logically and mathematically what the optimal behavior (and hence the actual behavior) must be.

Nowhere has this method of explaining human behavior been carried further than in modern neoclassical economics. The neoclassical theories also show the severe limits of the approach. First, the scheme works only if the actor's goals and the alternative behaviors available for choice are known in advance. Change either the goals or the alternatives and the optimal decision may change (Simon, 1991). Do we think that we can predict what the menu will be in the Singapore apartment tonight without knowing what is in the refrigerator, or what some of our gourmet's favorite recipes are? Can we predict it from a book on nutritionally optimal diets?

In most real choice situations, there is a multiplicity of goals, often partly conflicting and even incommensurable. A simple example is the trade-off between speed and accuracy: Unless we know their relative importance, we cannot select an optimal behavior.

Nor are the alternatives from which the actor might choose usually known in advance (even to the experimenter). Human beings spend much of their time inventing or discovering actions that fit the circumstances. The whole vast collection of human activities known as design, whether in architecture or engineering, or painting, or management, is aimed at synthesizing appropriate actions. In explaining or predicting behavior, whether optimal or not, we must know not only the design product (the alternative finally chosen) but the design process as well (Simon, 1981, chaps. 5 and 6).

The process of design is highly dependent on history and experience. Before Newton, designers did not use the calculus, and undoubtedly reached different solutions than in later ages when the calculus was available. So choice is always relativized to the current state of knowledge, and inventing new alternatives or even new processes for generating alternatives is very different from choosing among available and known alternatives.
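The production-system account of expert recognition given earlier in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Production` class, the `run` function, the rule names, and the cue and goal symbols are all hypothetical, and firing the first matching production is a simplification chosen here, not the conflict-resolution scheme of OPS5 or of any published model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    name: str
    conditions: frozenset  # symbols that must all be present in working memory
    adds: frozenset        # symbols the action deposits in working memory


def run(productions, memory, max_cycles=20):
    """Recognize-act cycle: on each cycle, fire the first production whose
    conditions are satisfied and whose action would add something new."""
    memory = set(memory)
    trace = []
    for _ in range(max_cycles):
        for p in productions:
            if p.conditions <= memory and not p.adds <= memory:
                memory |= p.adds
                trace.append(p.name)
                break
        else:
            break  # no production applies, so the system halts
    return trace, memory


# Hypothetical rules for the cooking example. The goal symbol gates both
# productions, so the cues alone trigger nothing.
RULES = [
    Production(
        "recognize-dish",
        frozenset({"goal:make-dinner", "cue:eggplant", "cue:coconut-milk"}),
        frozenset({"plan:eggplant-curry"}),
    ),
    Production(
        "start-cooking",
        frozenset({"goal:make-dinner", "plan:eggplant-curry"}),
        frozenset({"action:begin-curry"}),
    ),
]

trace, memory = run(RULES, {"goal:make-dinner", "cue:eggplant", "cue:coconut-milk"})
print(trace)
```

The first production is triggered by external cues (situated action), while the second is triggered by a symbol that the first one placed in memory, which is the sense in which conditions "may be (but need not be) symbol structures held in memory." Removing the goal symbol from the initial memory leaves both productions silent.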


Design does not aim at optimization. Almost always, the process must be halted and a solution selected long before all alternatives have been generated and compared. Even the idea of generating "all" alternatives is usually chimerical. Limits on human (and computer) calculation and incomplete information foreclose finding the best: Most often, a stop rule halts the search when a satisfactory alternative is found, one that meets a variety of criteria but maximizes none. So we should not expect the recipes of our expert gourmet cook to be optimal; but if she invites us to dinner at her Singapore home, the meal will be delicious; it will "satisfice."

The nonoptimality of behavior is obvious even in the simple Tower of Hanoi task discussed earlier. Many different strategies can be used to solve the problem; and even in identical circumstances, different subjects use different strategies, not all of which can be optimal. There is substantial empirical evidence that subjects also adopt a wide range of strategies, most of them suboptimal, in solving cryptarithmetic problems (Newell & Simon, 1972).

In complex adaptive behavior, the link between goals and environment is mediated by strategies and knowledge discovered or learned by the actor. Behavior cannot be predicted from optimality criteria alone without information about the strategies and knowledge agents possess and their capabilities for augmenting strategies and knowledge by discovery or instruction.

What constitutes an available alternative depends on the capabilities of the actor: such things as visual acuity, strength, short-term memory, reaction times, and speed and limits of computation and reasoning, to say nothing of expertise based on stored knowledge and skill. Before the exercise of optimizing can be carried out, all of these side conditions must be nailed down: goals, knowledge of immediately available alternatives, means for generating new alternatives, knowledge for predicting the outcomes these alternatives will produce, and limits on the ability of the actor to hold information in memory and to calculate.

The predictions of an optimizing theory depend as much on the postulated side conditions as on the optimization assumption. In fact, in most cases, if the correct side conditions are foreseen and predicted, the behavior can usually be predicted without any strict assumption of optimality; the postulate that people satisfice, look for "good enough" answers, is usually adequate to anticipate behaviors.

There is no way to determine a priori, without empirical study of behavior, what side conditions govern behavior in different circumstances. Hence, the study of the behavior of an adaptive system like the human mind is not a logical study of optimization but an empirical study of the side conditions that place limits on the approach to the optimum. Here is where we must look for the invariants of an adaptive system like the mind.

But does the point need to be belabored? Optimization is an ideal that can be realized only in (a) extremely simple worlds (if offered the choice, take a $10 bill in preference to a $1 bill) and (b) worlds having strong and simple mathematical structures that admit the computations required for optimization (e.g., worlds that can be described in terms of a linear objective function and linear constraints, so that solutions can be found by linear programming algorithms). These are not the worlds in which most human life is lived.

We would not think of trying to predict where the moon will be at midnight tomorrow night without knowing where it is tonight. In the same way, we should not presume to predict how a human being will solve a problem or learn a new skill without knowing what that human being already has stored in memory by way of relevant information and skills. Changing the information and skills will change the behavior. This principle is the basis for all of the differences observed between experts and novices.

To some extent, we can finesse this requirement for our research by restricting our study to the ubiquitous college sophomore, assuming that all college sophomores know roughly the same things, at least those that are relevant to the mainly contentless tasks we confront them with. When we want to go further to study individual differences in task performance or to study the effects of previous knowledge and skill on performance, we must face up to the boundary conditions outlined above.

COGNITIVE AND SOCIAL PSYCHOLOGY

Since adaptive behavior is a function of strategies and knowledge, both largely acquired from the social environment, there can be no sharp boundary between cognitive psychology and social psychology. The context in which knowledge is acquired and used, an exogenous variable in cognitive psychology, provides the endogenous variables for social psychology and sociology.

Studying expert behavior immediately begins to dissolve the boundary between cognitive psychology, on the one side, and social psychology (to say nothing of social and intellectual history), on the other. It is not an accident that histories of science provide an important part of the data used to test cognitive theories of scientific discovery (Langley, Simon, Bradshaw, & Zytkow, 1987). The histories do not draw a boundary around individual investigators, but encompass the sources of an investigator's knowledge and, more broadly, the social processes that direct the production of scientific knowledge and its communication.

But we have already seen this point illustrated in the


simple interaction among the Tamil women: their choice of language, their very presence in Singapore, the influence of their experience (itself a product of social environment) on what they can do and like to do.

As another example of this intermingling of the social with the cognitive, communication between different communities of experts involves translation, that is, understanding by members of one group of the language and concepts of the other. As Voss and his associates have shown, we can study one aspect of this phenomenon by observing how experts from different communities attack the same problem in quite different ways (Voss, Tyler, & Yengo, 1983). Another aspect, not yet much studied, would tell us how experts learn to translate from foreign dialects.

The flow between cognitive and social runs in both directions. Social psychologists have long been interested in how people form beliefs, or models, about other persons. Theories of person perception need to be integrated with cognitive theories about knowledge acquisition and formation of representations. There is no a priori reason to suppose that different processes are involved in the two cases.

DIVIDE AND CONQUER

In trying to understand the behavior of three women on a street in Singapore, we have already set a dizzying array of tasks for psychology: to explain the migrations of peoples; the origins and changes in their languages; their development as individuals in society; their gradual acquisition of values, skills (including skills of social interaction), knowledge, and attitudes; the adaptation of their behavior to their goals; and the physiological underpinnings of all of these processes. It appears that we are going to have to build computer programs, systems of difference equations, of immense complexity to explain such behavior.

Forms of Subdivision

Fortunately, we do not have to explain everything at once, or within the boundaries of a single program. We have already seen that complex phenomena can usually be segmented into levels from macroscopic to microscopic, separated by both the spatial and the temporal scales of the events they describe. Provided that the phenomena are roughly hierarchical in structure, as most natural phenomena are, we can build explanatory theories at each level, and then bridging theories that link the aggregated physiological behavior to the units of explanation at the symbolic level just above.

Above the symbolic level, we can study more comprehensive social phenomena on a different time scale, without serious interaction between our theories of social history, say, and our theories of problem solving. Only aggregative properties of the symbolic processes will enter into the explanation of the larger scale social phenomena (Simon & Ando, 1961).

We can divide up the task of explanation in other ways. Difference equations explain actions and their consequences as functions of the initial conditions; they explain the moment after in terms of the moment before. For many purposes, we can take the system's initial conditions, the contents and organization of memory when our observations begin, as given, and leave to another day and another theory the explanation of how those initial conditions came about.

Thus, we can study the behavior of an accomplished expert and compare it with the behavior of a novice, while putting aside the explanation of how the expert became so. We can study how different strategies (plans versus situated action, say) lead to different behaviors, but study separately how strategies are acquired.

Similarly, we can factor, if only incompletely, the syntax of language from its semantics, and thereby study how speech strings are processed more or less independently of our study of how large structures of knowledge are organized when they are stored in the human brain.

Unified Theories

In pointing to the virtues and even necessities of the divide-and-conquer strategy, I am not denigrating the efforts of others to build unified theories of cognition: Anderson's (1983) ACT*, Newell's (1990) SOAR, or Rumelhart and McClelland's (1986) connectionist systems, just to mention the efforts of some colleagues. But we must understand the goal of those efforts. The goal is not to erect a single system representing the "whole man." Rather, it is to show how a single control structure can handle all of the cognitive processes of which the human mind is capable. Perhaps the activity would be better understood if it were labeled "unified theories of the control of cognition." In any event, the effort to build such comprehensive control structures does not in any way make otiose or superfluous efforts to build explanatory theories of components of cognitive performances, and to build them at various levels of aggregation.

For a realistic conception of what unified might mean, we need to look over our shoulders at that most unified and parsimonious of sciences, physics, with its hundreds of pages of theory of specific phenomena at various levels of detail and resolution, all bound together rather shakily into the broader structures of quantum mechanics, relativity theory, and the still somewhat visionary unified field theories.


And if a look at physics does not persuade us that unified theories tell only a small part of the story, we can inspect chemistry, and biology, and geology, and genetics, where the point is even more glaringly obvious.

METHODS FOR THE STUDY OF BEHAVIOR

Our methods for gathering data to test our theories must fit the formal shapes of the theories. I limit my remarks to theories of symbolic cognitive processes. What are appropriate methods for testing the fit of computer programs (difference equations) to human behavior? The programs predict the next action a system will take as a function of its present state and current input; that is to say, they predict what production will fire at each successive moment. The fineness of resolution of symbolic programs is of the order of tens or hundreds of milliseconds: The programs predict what the subject will do each few hundreds of milliseconds.

Contemporary technology largely limits us to observing subjects' visible and audible behaviors, and the richest streams of such behaviors are verbalizations and eye movements. Under most circumstances, we do not yet know how to interpret in detail the information we get from electrical measurements on the scalp.

We can obtain data for analyzing the behavior of the Tamil women because one of them, not wanting to miss any of the details of the recipes, is tape-recording their conversation. Unfortunately, the available technology does not permit us to record eye movements on a street in Singapore.

Data on eye movements and verbalizations are still too coarse to capture all the behavior at the symbolic level. In eye movements, we may detect a new saccade every third of a second or so. In verbalization, subjects may utter a clause or phrase equivalent to a proposition every 2 or 3 s, at best. Much of our inference from behavior to the underlying program has to be indirect.

But that is no cause for dismay. In this regard, cognitive psychology is not different from the other sciences, which are always inferring underlying theoretical processes from gross observed events. At that future time when we shall obtain direct evidence, say, electrochemical evidence, identifying precisely the sequence of processes being executed, the game will be over and we will need to look for new domains of research. But we need not hold our breaths while waiting for that to happen.

We now know the difference between verbal protocols, interpreted as behavior, and introspection (Ericsson & Simon, 1984). Over the past quarter century, we have gathered vast experience in encoding verbal protocols and eye movement records at a level of detail that permits us to test what productions are being executed. We should strive to improve these methodologies, and they will continue to improve, but we do not need to be unhappy with our current ability to test our theories of cognition.

Along one dimension at least, considerable unhappiness is still expressed. How can we test the significance of the discrepancies we find between our models and the observed human behavior? Computer programs are complex, having many degrees of freedom. By taking advantage of this freedom, cannot we simply adjust the program ad hoc to fit any data?

A sound caution underlies this objection. Our confidence in a theory grows, and should grow, with increase in the ratio of the number of data points explained to the number of degrees of freedom in the theory. A theory expressed as a computer program has many degrees of freedom. But a human thinking-aloud protocol, or a set of such protocols, contains a great many data points. It is the ratio that counts, and that ratio can be very large.

Standard procedures for evaluating the fit of computer programs to data are lacking today. The familiar tests of statistical significance are inappropriate. The percentage of variance explained is more useful, but does not take into account the number of degrees of freedom. I have no precise solution to offer to the problem, but the direction in which we should look for one is obvious.

Search for alternative ways of testing our theories brings us back to more conventional psychological experiments. Conventionally, we observe a few behaviors (latencies, accuracies) over some minutes, then average the data over tasks and subjects, then compare the averaged numbers between control and experimental conditions. While this standard procedure is often useful and valuable, it also suffers from severe limitations. Its temporal resolution is very low; it can seldom be used to study individual events of a few seconds' duration.

More serious, conventional experimental methods do not deal with the serial dependency of events on this temporal scale. Since the execution of each production of the cognitive system can change memory contents, hence, change the conditions that determine what production will fire next, it is hard to test an explanation of the behavior unless this temporal dependency can be captured in the data. In particular, averaging over subjects is bound to destroy sequential contingencies. Verbal protocols and eye movement records are almost the only forms of data that give us any means for capturing these contingencies.

A principal means for testing theories of cognition at the level of elementary symbolic processes is to compare the successive behaviors the theories predict with the successive behaviors of subjects revealed by thinking-aloud protocols and eye movement records. The procedures for testing goodness of fit are not yet standardized, but the underlying principle is to demand a high ratio of


data points to numbers of productions in the simulation programs.

CONCLUSION

We have left our Tamil women standing on the street in Singapore, but I am sure that they will finish their conversation and return home before the heavy afternoon shower drenches them and refreshes the city. They have given us some hope that their behavior, as an example of the general run of human behavior, is explainable, and that today we already possess many important pieces of that explanation at the level of symbolic processes.

By way of summary, I recall here the main generalizations we reached along the way:

Computer Programs as Theories

For systems that change through time, explanation takes the form of laws acting on the current state of the system to produce a new state, endlessly. Such explanations can be formalized with differential or difference equations.

A properly programmed computer can be used to explain the behavior of the dynamic system that it simulates. Theories can be stated as computer programs.

Controlled experiments can be performed on computer programs to determine how such changes affect the performance of tasks. The programs can then be modified to simulate the human performance better.

Programs that simulate cognitive processes describe these processes in symbolic languages and actually execute the processes. Consequently, they test the sufficiency of the theory to perform the tasks.

Symbolic and Physiological Explanation

Explanation of cognitive processes at the information processing (symbolic) level is largely independent of explanation at the physiological (neurological) level.

Explanation on different levels does not deny the possibility of reduction. Higher level theories use aggregates of the constructs at lower levels. The lower level details do not show through to the higher level.

Some of the properties of systems can be captured in static laws that specify the relations among variables, qualitatively or numerically.

Dependence of Behavior on Knowledge

Different people, or the same people in different situations, can employ different strategies for performing a given task. A theory of their performance would describe their strategies and specify the circumstance under which each strategy will be used.

The core of an expert or expert system is a system of productions that operates like an indexed encyclopedia. External or imagined cues are recognized by the conditions of productions, triggering the associated actions. The case in which the cues are predominantly external is sometimes called situated action.

The production system of an expert is associated also with reasoning (search) capabilities that support an integrated system of analytic and intuitive responses.

Adaptive Systems

The human mind is an adaptive system that chooses behaviors in the light of its goals, and as appropriate to context. Moreover, it can store new knowledge and skills: It can learn.

The link between goals and environment is mediated by learned strategies and knowledge. Behavior cannot be predicted from optimality criteria without information about the strategies and knowledge agents possess or acquire.

The study of the behavior of an adaptive system is not a logical study of optimization but an empirical study of the side conditions that place limits on the approach to the optimum.

Cognitive and Social Psychology

Since strategies and knowledge are both largely acquired from the social environment, there can be no sharp boundary between cognitive psychology and social psychology. The context in which knowledge is acquired and used, an exogenous variable in cognitive psychology, provides the endogenous variables for social psychology and sociology.

Verbal Protocols as Data

Theories of cognition can be tested by comparing the behaviors they predict with the successive behaviors of subjects revealed by thinking-aloud protocols and eye movement records. Strictness demands a high ratio of data points to numbers of productions in the programs.

In summarizing at this high level of abstraction, I have left out all of the rich detail of the behavior we can explain: chess playing, medical diagnosis, problem solving in physics and mathematics, the use of diagrams in thinking, scientific discovery; yes, and even the Tower of Hanoi, and a conversation about cookery on a street in Singapore.
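The principle stated under "Verbal Protocols as Data" (compare predicted with observed step sequences, and demand many data points per production) can be illustrated with a small bookkeeping sketch. Since the article notes that standard procedures for evaluating fit are lacking, this is not such a procedure: the function name `fit_summary`, the step labels, and the position-by-position comparison are all assumptions made here for illustration, and a real analysis would need a way to align sequences that drift apart.

```python
def fit_summary(predicted, observed, n_productions):
    """Summarize how well a model's predicted step sequence matches a coded
    protocol, and how many data points there are per production."""
    if n_productions <= 0:
        raise ValueError("n_productions must be positive")
    if not observed:
        raise ValueError("observed protocol is empty")
    # Crude assumption: step i of the model is compared with step i of the
    # protocol, with no realignment after a mismatch.
    matched = sum(1 for p, o in zip(predicted, observed) if p == o)
    return {
        "data_points": len(observed),
        "matched": matched,
        "match_rate": matched / len(observed),
        "points_per_production": len(observed) / n_productions,
    }


# Hypothetical coded protocol and model trace
predicted = ["move-1", "move-2", "move-3", "move-4"]
observed = ["move-1", "move-2", "move-9", "move-4"]
print(fit_summary(predicted, observed, n_productions=2))
```

Because the comparison keeps the order of steps and is made per subject, it preserves the sequential contingencies that averaging over subjects would destroy. The ratio in `points_per_production` is the quantity the text asks to be high.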