THE AGE OF INTELLIGENT MACHINES | The Mechanics of Creativity

September 24, 2001

Roger C. Schank directs the Institute for the Learning Sciences at Northwestern University, where he is also John Evans Professor of Electrical Engineering and Computer Science, Psychology, and Education. Previously, he was Chairman of the Computer Science department at Yale University and Director of the Yale Artificial Intelligence Project. In addition, he was Assistant Professor of Linguistics and Computer Science at Stanford University. Schank holds a Ph.D. in Linguistics from the University of Texas at Austin. He is the founder of two businesses, Compu-Teach, an educational software company, and Cognitive Systems, a company specializing in natural language processing. An internationally acclaimed researcher in artificial intelligence, Schank is the author of numerous articles and books, including Dynamic Memory; Scripts, Plans, Goals, and Understanding with Robert Abelson; and The Cognitive Computer and The Creative Attitude with Peter Childers. Christopher Owens is engaged in AI research at the Artificial Intelligence Lab at Yale University, where he is currently completing a Ph.D. His primary interests are studying the organization of human memory and applying the principles thereof to the task of making machines smarter. His work focuses on people’s ability to reuse old knowledge to solve new problems, specifically the kind of frozen, culturally shared knowledge typified by the planning advice given in proverbs and other folk adages.

What exactly is creativity? Could a machine ever be creative? These are questions psychologists, philosophers, and AI researchers would all like to be able to answer. But what kind of answers are we looking for? The search for a rigorous philosophical definition of creativity has been overworked, and we don’t intend to further pursue that course here. On the other hand, redefining the question in AI terms and applying AI research methods might result in a new and useful kind of answer, or at least an interesting set of new questions to consider.

A persistent criticism of AI work has centered around arguments like, “By its very nature, a machine could never really be creative.” Since a basic assumption underlying work in computer science is that a machine can perform any task that can be described via sufficiently specific rules, people who make statements like the above mean that no rules can ever be found that will account for creativity and other quintessentially human behavior, that there is something inherently mystical in these abilities that cannot be expressed via rules and procedures. Or else they might mean that even if such a set of rules and procedures could be found, a machine that was obeying them would only seem to be creative. Its behavior, they say, would be a kind of elaborate parlor trick; it would be achieving its effect merely by fooling us.

But as AI researchers and cognitive scientists, our work is based upon the assumption that rules and procedures underlying human behavior can be found. Our job is to define problems in such a way as to maximize our chances of succeeding in this endeavor. Our goal is to come up with an algorithmic definition of creativity, a set of processes and steps that can account for the kind of creative thinking that we observe in people. Although the idea of a human or machine exhibiting creativity by following a set of rules seems on its face to be a contradiction, this is not necessarily so. If we can agree on some kinds of behavior that constitute creative thinking and can develop an algorithmic model that accounts for these behaviors, then we have an algorithmic theory of creativity and hence a first step toward creative machines. Whether or not a philosopher would agree that the resulting machine truly embodied creativity is almost irrelevant to us: building machines that act in ways that appear creative would be a significant step in itself.

Creativity is often associated with problem solving, science, and the arts. People often view creative thinking as something out of the ordinary, as a mode of reasoning in which completed thoughts suddenly spring to mind without being cued, thoughts perhaps having nothing at all to do with what the thinker was working on at the time the thought occurred. Often people implicitly assume that creativity represents some divine, unconscious, or other inspiration out of the control of one’s ordinary thought processes. Actual case studies of scientific and artistic creativity, however, support the idea that creativity springs not from any mystical source but from a certain set of cognitive skills. There is no principled distinction between creative and less creative thinking other than the degree to which this set of skills is applied. Highly creative individuals simply have these skills better developed than do the less creative. What, then, are these cognitive skills? How are they used? How can we program a computer to exhibit them? These are questions that we can study more fruitfully than the open-ended question, “What is creativity?” with which this chapter opened.

In our view, the basic cognitive skill underlying creativity is the ability to intelligently misapply things. A creative solution to a problem is one that uses an object, technique, or tool in a useful and previously undiscovered way. A creative work of art, similarly, might use some material or image in a new way. A creative piece of scientific research might involve, for example, applying a principle from one field to a problem in another. At Yale we are studying the cognitive skills underlying one particular type of creative behavior, the creation of novel explanations for unexpected events. Explanation is a kind of problem solving in which the problem is of a mental nature: “How can I connect this new and strange piece of knowledge with the rest of my knowledge so that it all makes sense?” or “What pattern of events do I know about into which I can fit this new fact?”

Of course, by this definition, many kinds of understanding can be seen as explanation, in that all understanding consists of integrating new facts into existing knowledge. But what we are interested in here is the kind of explanation that requires conscious work, the explanation of events that are at first puzzling. This kind of explanation may require searching for a missing piece of knowledge, or it may require finding some previously unseen causal connection. Often an explanation can be found by seeing one object or event as similar to another in a way that was not previously noticed, in other words, by making up and using a novel analogy. For example, when we asked people in the Yale AI lab to try to explain the unexpected death of the successful three-year-old race horse Swale, one student was reminded of the death of Jim Fixx, the runner. He reasoned that Swale was like Fixx in that both participated in regular strenuous activity, and that possibly Swale, also like Fixx, had a congenital heart defect.

This kind of reasoning from prior examples is very important to our approach to explanation. Although an understanding system could conceivably explain each new situation by reasoning from first principles, chaining together all the rules and pieces of knowledge it needed to build the entire explanation from small elements, this probably does not happen very often. One reason is that the computational complexity of this task is unreasonably large. Another is that people seem to be able to use remindings and analogical reasoning to construct explanations: they can use the same explanation over and over to cover a range of similar and thematically related situations.

For a second example of this kind of reasoning, consider the folk use of proverbs, which can be viewed as a kind of extremely abstract pattern used by people to explain unfamiliar situations in a familiar way. When someone standing in the rain beside his disabled car and wishing he had had that overdue tune-up analyzes the situation by saying “A stitch in time saves nine,” he has placed the situation in a context in which knowledge about topics like prevention and the bad effects of not taking precautions is available. The causal reasoning represented within this proverb is available without the effort of building an analysis or explanation from scratch.

In a manner similar to the way people might use proverbs, our systems store and reuse old explanations using a knowledge structure called an Explanation Pattern, or XP. Like a proverb, an XP is a frozen piece of causal reasoning. Because it is designed to be retrieved, modified, and applied by a computer program, it has the following basic components:

A characterization of a set of situations to which it is likely to apply, for example, deaths of athletes

A characterization of a broader set of situations to which, even if it does not apply, it is likely to be at least relevant, for example, unexpected bad outcomes

A causally annotated description of the event being explained. For example: Running involves physical exertion. Physical exertion strains the heart. Straining the heart combined with a heart defect can cause a heart attack. A heart attack can cause death.
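As a concrete (and entirely illustrative) sketch, the three components listed above might be represented as a small record. The class and field names here are our own, chosen for exposition; they are not the representation used by any actual system:

```python
from dataclasses import dataclass

@dataclass
class ExplanationPattern:
    """A frozen piece of causal reasoning, analogous to a proverb."""
    name: str
    trigger_features: set      # situations it is likely to apply to
    relevance_features: set    # broader situations it may still bear on
    causal_chain: list         # annotated causal links ending in the outcome

# The Jim Fixx pattern a student used to explain Swale's death:
fixx_xp = ExplanationPattern(
    name="exertion-heart-attack",
    trigger_features={"death", "athlete"},
    relevance_features={"unexpected-bad-outcome"},
    causal_chain=[
        "running involves physical exertion",
        "physical exertion strains the heart",
        "a strained heart combined with a heart defect can cause a heart attack",
        "a heart attack can cause death",
    ],
)
```

The two feature sets correspond to the two levels of match the text describes: a narrow set of situations the pattern directly explains, and a broader set for which it is merely relevant.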

Since we are viewing explanation as a kind of problem solving, and since a creative solution to a problem is one that uses an object, technique, or tool in a useful and previously undiscovered way, a creative explanation is one that uses an XP in a novel and previously unencountered way. If the basic idea of an explanation system is to select a relevant XP from memory and apply it to a new episode, then the basic idea of a creative explanation system is to intelligently misapply XPs; to try, if no relevant XP can be found, to modify and apply an XP that, although at first seemingly irrelevant, might nevertheless bear some thematic relationship to the episode being explained.

The idea of using near misses is an important one. Often people associate creativity with a simple relaxation of constraints on retrieval and pattern matching, with some kind of random process by which weird remindings can be generated. Creativity, according to this view, consists of being tolerant of inappropriate remindings, in being slow to discard an erroneous idea. But that tolerance is only half the process. Because processing power is finite, the key is to be intelligent about choosing which inappropriate XP to misapply. Creativity does not lie in floundering through memory, trying one randomly selected idea after the next; it lies in finding near misses that are reasonably close to the right idea and fixing them to fit.

This approach puts a large demand on memory, since the retrieval task is no longer simply to find the closest fit from among a library of XPs (itself an important task whose difficulty we do not mean to understate). The task of memory is now to fail gracefully: to find the closest fit, or if a close fit is not available, to find a near miss that nevertheless captures some important and relevant aspects of the situation being explained. This kind of near miss is the most likely candidate for modification.
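One simple way to picture graceful failure in retrieval is to score every stored pattern against the new situation and return the best scorer even when it falls short of a genuine fit, labeling it a near miss. This is a toy sketch under assumptions of our own (feature-set matching, a half-weight for merely relevant features), not the retrieval mechanism of the actual Yale systems:

```python
from collections import namedtuple

# Minimal XP record: features it directly applies to, plus broader
# features it may still bear on. (Illustrative names only.)
XP = namedtuple("XP", "name trigger relevance")

def xp_fits(xp, situation):
    # An XP fits outright only if every trigger feature is present.
    return xp.trigger <= situation

def retrieve(situation, library):
    """Return the best-scoring XP, labeled as a fit or a near miss."""
    def score(xp):
        # Trigger-feature matches count fully; relevance matches count half.
        return (len(situation & xp.trigger)
                + 0.5 * len(situation & xp.relevance))
    best = max(library, key=score)
    return best, ("fit" if xp_fits(best, situation) else "near-miss")

library = [
    XP("exertion-heart-attack",
       {"death", "runner", "strenuous-activity"},
       {"unexpected-bad-outcome"}),
    XP("old-age",
       {"death", "elderly"},
       {"unexpected-bad-outcome"}),
]

swale = {"death", "race-horse", "strenuous-activity", "unexpected-bad-outcome"}
best, kind = retrieve(swale, library)
# The Fixx-style pattern scores highest, but only as a near miss:
# "runner" is not literally among Swale's features.
```

Even this crude scorer shows the key property: memory never returns empty-handed, so there is always a candidate for tweaking.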

Along with a means for getting reasonable near misses, this approach also requires that the system, when presented with an inappropriate XP, be able to analyze what is wrong with it and to select an appropriate repair strategy based upon that analysis. The student who explained Swale’s death in terms of Jim Fixx’s death knew that Fixx was not a race horse and that the explanation would not apply directly without making the connection between a runner and a race horse. People do this so easily that we hardly think about it, yet the task is difficult. This modification of inappropriate XPs in order to adapt them to new situations, which we have been calling “tweaking,” is central to being a creative explainer.

Our algorithm for creativity must therefore embody three processes: a means of searching through memory for applicable patterns that returns a reasonable set of near misses, a means of evaluating the near misses and seeing what is wrong with them, and a means of modifying those inappropriate patterns to suit the current situation. Of course, for any of this to work reasonably well, our creative machine must have a rich memory of facts, experiences, and relationships it can draw upon as starting points for new explanations. Searching for and adapting patterns is a reasonable strategy only if the library of patterns is large enough that the near misses will nevertheless have at least something in common with the episode being explained. Our fourth requirement, therefore, is to have a large and richly indexed memory of explanation patterns and other knowledge gained from experience. How these patterns are learned is another interesting problem to be attacked.
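The three processes can be strung together end to end. The sketch below is a deliberately tiny illustration under assumptions of our own: situations and patterns are dicts of features and role-fillers, diagnosis is a role-by-role comparison, and tweaking is plain substitution of the new filler into the frozen causal chain. Real diagnosis and repair strategies are far richer than this:

```python
def explain(situation, library):
    """Toy creative explainer: retrieve a near miss, diagnose the
    mismatch, and tweak the pattern to fit the new situation."""
    # 1. Retrieve: pick the pattern sharing the most features.
    xp = max(library,
             key=lambda x: len(situation["features"] & x["features"]))

    # 2. Evaluate: find roles the new situation fills differently
    #    (e.g. the actor is a race horse, not a runner).
    mismatches = {role: situation["roles"][role]
                  for role, filler in xp["roles"].items()
                  if situation["roles"].get(role) != filler}

    # 3. Tweak: substitute the new role-fillers into the frozen
    #    causal chain to produce a candidate explanation.
    chain = xp["chain"]
    for role, new_filler in mismatches.items():
        old_filler = xp["roles"][role]
        chain = [step.replace(old_filler, new_filler) for step in chain]
    return chain

fixx = {
    "features": {"death", "exertion"},
    "roles": {"actor": "runner"},
    "chain": ["runner exerts heart",
              "defective heart fails under exertion"],
}
swale_case = {
    "features": {"death", "exertion", "horse"},
    "roles": {"actor": "race horse"},
}
explanation = explain(swale_case, [fixx])
```

The fourth requirement, a large and richly indexed pattern library, is exactly what keeps step 1 from returning a near miss too distant for steps 2 and 3 to repair.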