October 30, 2007

Last year I posted a series of thoughts about two story generation systems: Minstrel and Universe (1, 2, 3, 4, 5). I had some critical things to say about the Minstrel system, but they were based on my reading — I hadn’t yet been in contact with the system’s author, Scott Turner. This month I finally connected with Scott, and yesterday he sent me the following thoughtful response to the issues raised by our previous discussion on Grand Text Auto.

I’m particularly happy about this because Scott has graciously offered to try to respond to any further questions in the comments for this thread. Also, I’ll be paying close attention to the conversation, given I’m writing about Minstrel in my forthcoming book. Below are Scott’s thoughts.

I haven’t worked in AI for many years, but I was delighted when Noah contacted me and I had a chance to read the discussion on this blog of my dissertation work. At the time I did this work there was no Internet as we know it today, and in some sense I worked virtually in isolation. No one else was working on computer storytelling, creativity or related subjects such as interactive fiction. The best that I could hope for in the way of a community of interest was occasionally meeting up with folks like Michael Lebowitz at a conference. I can’t help but think that if I were doing my work today, the feedback I could get through the Internet would greatly improve my results. The Internet is truly wonderful in the way it can bridge space and economics to bring together similar interests in ways that could never happen in the physical world!

After Noah pointed me towards this blog I read through the discussion of Minstrel and found it very thought provoking. I thought I’d take a few minutes to share some insight into how Minstrel came to be and discuss some of the issues that Noah raised.

At the time I began my work on Minstrel the only previous work on storytelling was Meehan’s Talespin. Talespin was intriguing because it was clearly an incomplete approach to storytelling [1] and yet what it produced had many of the necessary elements of stories. Talespin’s stories were, if nothing else, internally consistent and about the sorts of things we expect in stories. What was missing was purpose. An author doesn’t just ramble a stream of character actions; he creates his story to try to achieve his own goals. [2]

Now, Talespin was essentially a planning engine [3], so it seemed reasonable to build a better storytelling program by simply augmenting the Talespin model with a “meta” [4] level of goals and plans representing what the author was trying to achieve with his storytelling. And, in fact, the first versions of Minstrel operated just this way.

One problem became immediately obvious with this approach: the stories weren’t original. They just regurgitated what Minstrel already knew. In truth, Talespin had this problem as well, but because Talespin had a fairly large dictionary of character goals, plans and actions it wasn’t as immediately obvious that it was only “shuffling the pieces around the board.” And while re-ordering knowledge might provide a low level of creativity, it clearly wasn’t sufficient for creating interesting stories.

Consequently my effort shifted from storytelling to creativity. Storytelling went from being an end in itself to being the domain in which Minstrel demonstrated creativity. [5]

Now, there are many fascinating questions about creativity, but one that intrigued me was how people could be creative across many different problem domains. (I’m speaking here of normal day-to-day creativity, not Thomas Edison-level creativity.) It seemed unlikely to me that people developed wholly different creative processes for different problem domains. Instead, I looked for ways in which creativity could be embedded in the low-level cognitive processes that underlie all intelligence.

The result was Minstrel’s notion of “creative memory.” In Minstrel, if you try to recall something and draw a blank [6], the creative process kicks in and tries to imagine something appropriate. This notion has a beautiful elegance to it, because (whether one believes specifically in case-based reasoning or not) it is clear that memory underlies intelligence. So a creative memory — imagination — immediately adds creativity to all the upper levels of cognition.
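The mechanism can be pictured as a memory lookup that falls back to transformation when direct recall fails. Here is a rough sketch of that idea; the case representation, the transform, and all the names below are invented for illustration and are much simpler than Minstrel’s actual schemas and TRAMs:

```python
# Sketch of a "creative memory": try direct recall first; if that
# draws a blank, transform (relax) the query, recall again, and adapt
# the recalled case back to the original query. Illustrative only.

def recall(memory, query):
    """Exact-match recall: return a stored case matching every query feature."""
    for case in memory:
        if all(case.get(k) == v for k, v in query.items()):
            return case
    return None

def creative_recall(memory, query, transforms):
    """If recall fails, 'imagine': relax the query, recall, then adapt."""
    case = recall(memory, query)
    if case is not None:
        return case
    for transform in transforms:
        case = recall(memory, transform(query))
        if case is not None:
            adapted = dict(case)
            adapted.update(query)  # adapt the recalled case to the original query
            return adapted
    return None

# Tiny illustrative memory and one transform ("ignore who the actor is").
memory = [{"actor": "knight", "action": "fight", "result": "injury"}]
drop_actor = lambda q: {k: v for k, v in q.items() if k != "actor"}

# Direct recall fails (memory has no princess cases), so the relaxed
# recall finds the knight case and adapts it to a princess.
result = creative_recall(memory, {"actor": "princess", "action": "fight"},
                         [drop_actor])
```

The point of the sketch is only the control flow: creativity lives at the memory level, so anything built on top of recall inherits it for free.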

There’s much more to be said (and argued) about creativity, but I will skip over that for the moment.

As I began to implement this model of creativity in Minstrel, I ran headlong into another difficult problem — knowledge and context.

Consider this (true) example of creativity:

A motorcyclist is motoring down a lonely road when his crankcase cracks. He pulls over in time to see all the oil drain from his engine. He has spare oil, but no welding equipment to fix his cracked crankcase. After some thought, he opens both ends of a can of beans, puts the cylinder on the road and builds a fire inside of it. He lets the fire burn for about a half-hour. Then he removes the can, uses a stick to scoop up the softened asphalt and patches the crack. He lets the patch dry, refills the oil, and limps into the next town for more permanent repairs.

Suppose we try to detail all the knowledge that the motorcyclist required to invent this solution to his problem: roads are made of asphalt, asphalt melts at a low temperature, asphalt is sticky when melted, asphalt adheres when dry, asphalt is impermeable to oil, food comes in cans that are cylindrical and made of metal, a can opener can remove the entire lid of a can, removing both ends of a can forms a cylinder… A full listing of all the knowledge needed to understand/create this example would cover pages.

To put it another way, before we can be creative, we need all the knowledge and cognitive processes to be non-creative. [7] Doug Lenat has had forty people working for twenty years on cataloging this sort of common-sense knowledge and hasn’t yet finished. So clearly I, the lonely graduate student, faced a serious problem.

The solution I chose was to limit Minstrel to a “toy domain” [8], and because of the depth of knowledge Minstrel required, it was a very tiny domain indeed. So when Noah asks why Minstrel was “starved for data” the primary answer is that I didn’t have the manpower to represent and encode more data. I had to choose between coding up more example stories, or implementing the processes of storytelling and creativity. In addition, I was wary of the criticism I might receive if I gave Minstrel “interesting” data to start with. Certainly if I’d started Minstrel with a wide variety of unique and clever story scenes, it could have produced unique and clever inventions. But would the creativity lie in Minstrel or in me? So I chose instead to limit Minstrel to a small set of “uninteresting” data to make it clear that any creativity displayed was due to Minstrel.

Nonetheless, Minstrel was a brittle program. My contention is that if you give me a robust, non-creative program that demonstrates all the world knowledge, intelligence and problem-solving ability of the average 21-year-old, I’ll be able to implement a robust creative program atop that. But I didn’t have that luxury. I had to build just exactly those parts of that robust intelligence I needed to demonstrate my thesis. Naturally, if you stray even slightly from those limits, things break.

Talking about Minstrel’s brittleness, Noah says:

But it is important to remember that this problem arose with the completed system (and not an incomplete one, as with the mis-spun tales of Tale-Spin reprinted by Aarseth, Murray, Bolter, and others).

This is, I think, a false distinction. In this sort of research, the programs we write aren’t products that are “complete” when the Ph.D. is awarded [9]. They are vehicles for implementing, experimenting with, and understanding theories of artificial intelligence. Meehan fixed many of his mis-spun tales along the way to the “final” version of Talespin, and I did the same with Minstrel. At any rate, I’m dubious (as was Meehan) about drawing anything more than entertainment from these mis-spun tales. As one of my committee members said [10], “Don’t these just prove that you’re a bad programmer?”

Noah also repeats the comment from Rafael Pérez y Pérez and Mike Sharples:

[T]he reader can imagine a Knight who is sewing his socks and pricked himself by accident; in this case, because the action of sewing produced an injury to the Knight, Minstrel would treat sewing as a method to kill someone.

Like some of the commenters on the previous thread, I actually find this an (unintentionally) wonderful example of creativity, and exactly the sort of thing Minstrel ought to be capable of creating. There’s an Irish folk song in which a woman imprisons her husband by sewing him into the bedsheets while he sleeps. Doesn’t that show exactly the same creative process (magnifying a small effect to create a large one)? The problem with Minstrel is that it would do a bad job of the magnifying. It would write something like:

The Knight killed himself by sewing on himself.

because it doesn’t have the knowledge and reasoning processes to create a more plausible “magnification” of the effect. But imagine a novel in which a terrorist in solitary confinement is given access to a sewing needle because “What could he do with a sewing needle?” He then manages to kill himself by plunging the needle into his carotid artery — seems like a fine plot device, yes? But all this really says about Minstrel is that it isn’t a complete reasoner — which I’d not dispute.

As to the larger question of whether it is even possible to capture the knowledge necessary for human-level storytelling, or whether “scruffy” AI is a bankrupt idea, I’ll not comment except to say that it takes the most highly-evolved learning machines on the planet at least 20 years to build that sort of knowledge base, and even then most of those do not become good authors. So expecting a complete solution from a program with a few grad-student-years of effort sets the bar unreasonably high.

Noah concludes by saying that whether Minstrel is an improvement over Talespin is “debatable.” That seems like hyperbole to me. I can’t imagine that anyone would seriously argue for a model of storytelling that didn’t include explicit author goals, or creativity, or the notion of boredom — all of which Minstrel brought to the table. There’s certainly plenty of room to criticize Minstrel, but I think it was a leap forward from Talespin and began to address some of the difficult (and fascinating) issues in both storytelling and creativity.

1. Indeed, the only way Talespin produced narratives that seemed at all like stories was to force the narrative into a supplied “storymold”.
2. It’s worth noting that Meehan understood this point very well.
3. My advisor once asked this question: “What are the fundamental differences between Talespin and blocks world planning?”
4. As we used to say in the UCLA AI Lab, “Anything you can do I can do meta.”
5. However, Minstrel was also capable of doing device invention using the same creativity techniques it used in storytelling, and lately I’ve been applying the same creativity techniques to music composition.
6. Or find the answer you’ve recalled “boring.”
7. Common-sense knowledge also serves as a back-end filter or “sanity check” on creativity, although Minstrel didn’t model this.
8. A very common solution to that problem in those days.
9. Indeed, Minstrel was more a tool set than a monolithic program. It was constantly being changed, updated and used in new ways as I explored different issues.
10. If I recall correctly. Perhaps this comment was made to Meehan and I’m suffering from “creative memory”.

15 Responses to “Scott Turner on Minstrel”

Scott, it’s a pleasure to have an opportunity for public discussion of these ideas with you. As you might have noticed in one of my comments on the previous discussion, the question for me, originally, was whether to view Minstrel primarily as an exploration of ideas of human creativity (simulating authors) or as an exploration of new possibilities for media (generating stories). It’s a very helpful clarification for me when you write about the point at which your

effort shifted from storytelling to creativity. Storytelling went from being an end in itself to being the domain in which Minstrel demonstrated creativity.

It’s also helpful for me that you frame questions about Minstrel in terms of the common-sense reasoning problem. Yes, limiting the domain to a microworld is a common way to try to get around this problem. In some ways this points to an exciting future for these techniques in areas of media (e.g., games). After all, games are microworlds — we can specify exactly how they behave, what objects exist, and so on. We can leave out unrelated knowledge (e.g., sewing doesn’t matter to the King Arthur domain) and lessen the chances of inappropriate generations.

However, the greater the leaps the system can produce (a knight eating a princess, suicide by sewing) the more we need to be able to reason about the appropriateness of what it produces. Which brings us to a reasoning problem that we can’t solve by limiting the domain — because the very nature of these leaps is that they stretch beyond the bounds of the expected within the domain. This means that Minstrel probably doesn’t point in the right direction for those seeking new routes to making media, even if it represents an exciting result in the area of computational creativity.

Or, at least, that’s how my thinking runs at the moment. I’d be very interested to hear if your thinking runs similarly.

Less generally, I’m also interested to read that, “At the time I began my work on Minstrel the only previous work on storytelling was Meehan’s Talespin.” Would that mean you don’t really count Natalie Dehn’s work? Reading back over the conversation from last year, I was reminded by Michael Joyce’s comment that her ideas influenced a number of people in the early 1980s.

Thanks for your post, Scott, it’s really interesting to hear more details about the creation of Minstrel. Several of us on this blog are working on or towards generative story systems, including myself, and I find Minstrel very informative and inspirational.

Noah, I wonder whether, even within a limited domain, e.g. a game microworld, a system like Minstrel could create some creative, unexpected, and entertaining results. If one adds a layer of reasoning about appropriateness to a Minstrel-like approach, as you suggest, I have optimism it can be used to give narrative intelligence to games / interactive stories. It would require significant R&D to achieve this, of course.

Andrew, I think my comment may have sounded too negative. I think you’re exactly right. The question of whether a Minstrel-style approach could be used for media-making is far from settled: it’s a research question. That research really needs to take the form of system building.

But, of course, the system building will be guided by an idea of what might work. Someone needs to have an “aha” moment — an insight into how you could keep a Minstrel-like approach freewheeling enough to create interesting results while somehow limiting inappropriate results. It might not seem as creative, in this new version, but it could serve the purpose of media-making (rather than exploring questions of computational creativity).

One possibility for this media-focused system might be to constrain the world in further respects. Another might be to somehow limit the freedom of movement given to TRAMs. A third might be some evaluation at the end that disposes of inappropriate generations.

At the time I wrote my comment I was thinking of only the last of these approaches. But perhaps the last is also the least tractable. I’m guessing it’s more likely that the first, second, or a combination of them might provide a realizable “aha” moment for someone out there.

If you’re willing to exclude some possibly good but unanticipated productions by writing down a set of criteria for what’s (minimally) acceptable and rejecting anything that fails them, the “evaluation at the end” machinery already exists in Minstrel, so architecturally that seems like it’d be the easiest route to try. The story-generation domain uses only a fairly minimal assessment, the “boredom” criterion (reject anything too similar to previous stories), and so generates pretty promiscuously. In principle the assessment could contain anything, though, and Turner gives an example in the tool-invention case study, where his assessments are much more concrete in order to reject tools that don’t work. (Pages 35-36 of The Creative Process give an overview of how assessment fits into the architecture.)
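For concreteness, the “boredom” style of end-of-pipeline assessment can be sketched as a similarity check against previously told stories. The representation (stories as sets of scene labels) and the overlap threshold below are invented for illustration; Minstrel’s actual assessment operates over its schema representations:

```python
# Sketch of a "boredom" assessment: reject any candidate story that
# shares too many scenes with a story already told. Representation
# and threshold are illustrative stand-ins, not Minstrel's own.

def too_boring(candidate, told_stories, max_overlap=0.5):
    """Return True if the candidate overlaps some prior story too much."""
    for prior in told_stories:
        overlap = len(candidate & prior) / max(len(candidate), 1)
        if overlap > max_overlap:
            return True
    return False

told = [{"knight-fights-troll", "knight-wins", "knight-marries-princess"}]

# A near-retread of the earlier story: rejected as boring.
retread = {"knight-fights-troll", "knight-wins", "knight-goes-home"}
# A structurally different story: passes the assessment.
fresh = {"princess-fights-troll", "troll-wins"}
```

In principle the predicate could test anything — plausibility, genre fit, working-tool constraints in the invention domain — which is what makes this the architecturally easiest place to add stricter criteria.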

Mark, I think I see your point — but to me it’s non-obvious how to do this for stories (as opposed to, say, for tools). Maybe it’s worth brainstorming here. Do you have some criteria you might imagine for judging acceptable stories that would be aimed at narrowing the range (rather than, as with boredom assessment, expanding it)?

In parallel I’ll go back and take a look at the passage you’re discussing. Recently I’ve also been looking at Scott’s dissertation, which goes into more detail (in some areas) than the book — so I’ll also check out the version there.

Ah yeah, I’ll admit I wasn’t thinking about stories per se. Your comment about “media-focused” systems and Scott’s comment about music composition sent me off into a mental tangent about using Minstrel-style approaches to generate components of multimedia art, rather than narratives. I could imagine more easily using the assessment approach for that sort of thing, since it aligns more with a tool metaphor. In particular it could be used as a component of some larger system that asks a Minstrel-like system to generate novel components (visual art, music, widgets, whatever) meeting certain specifications whenever it needs them.

In the more general narrative case I’ll agree it’s harder, since it’s basically a commonsense-reasoning problem to be able to reject stories that seem random, aren’t interesting, aren’t interpretable, etc. Michael and I have been running up against a similar problem in our game-generation work: even in simple games of the level of complexity of a WarioWare microgame, it’s difficult to find a way for a machine to distinguish “surprising but interpretable (or even clever)” from “completely nonsensical and seemingly random”. Even the largest commonsense-reasoning databases (like Cyc) are still far off from encoding enough relevant cultural and cognitive knowledge, imo, to be able to distinguish that sort of thing. I’m also skeptical that some general non-messy solution exists, since I think a lot of the judgments are based on bits of cultural history and human perception that didn’t have to be the case, rather than on universal truths about rationality or something. For example, a common interpretability strategy is to reference well-known game or film or story or comic tropes, but doing so effectively requires a system to know what such tropes are, and how to know when a person would recognize them (versus the reference being too subtle, or too obscure, say) — basically a huge knowledge-engineering problem in a poorly-mapped-out space.

From the Minstrel perspective though I think that might be focusing on a criticism of something that wasn’t really the intent. I read Minstrel mainly as a proposal for how a creative system should generate things given an appropriate assessment — the whole approach of failure-driven reasoning and whatnot depends on having some notion of what a failure is. To that end I saw the “boredom” heuristic more as a really simple test case than as the main result: it demonstrates that if your main criterion is avoiding boredom in this toy story world, then Minstrel finds a way to do so. If your criteria are something else, how to write them down is a whole different problem, though I’ll agree that if they turn out to be impossible to write down for a particular domain it’d point away from Minstrel being very useful in that domain.

(I haven’t gotten myself a copy of the dissertation yet, unfortunately, so everything I say is from reading the book.)

We can leave out unrelated knowledge (e.g., sewing doesn’t matter to the King Arthur domain) and lessen the chances of inappropriate generations.

Imagine that our cognitive processes have evolved to be very efficient for normal day-to-day situations. The picture we might have of “normal” cognition would be robust “common sense” processes operating on very relevant knowledge. So when we’re out for a walk and see someone hurl a stone at us, we can react quickly, using simple, fast cognition about how moving objects generally behave, the effects of getting hit by objects and so on. We’re not thinking about sewing, or elephants or anything else not obviously relevant.

Now we might also evolve different cognitive processes for dealing with unusual situations, particularly where our normal processes have failed. We’d expect these to be less efficient (in the sense of more often creating poor solutions) and to make use of less relevant knowledge (since we only fall back to these processes when using the relevant knowledge has failed).

If we call that second category “creativity” then it seems likely that building a computer model of creativity that uses only a very limited set of relevant knowledge in a microworld is self-defeating. And I think that’s a legitimate problem for Minstrel. To some extent you can get some interesting cross-fertilization even with a limited microworld, but it would be very interesting to see if a model could be built that could robustly reason creatively across very disparate domains.

However, the greater the leaps the system can produce (a knight eating a princess, suicide by sewing) the more we need to be able to reason about the appropriateness of what it produces.

If your criteria are something else, how to write them down is a whole different problem, though I’ll agree that if they turn out to be impossible to write down for a particular domain it’d point away from Minstrel being very useful in that domain.

Ah, but as Dave Jefferson challenged me on this issue: If you have the knowledge to recognize a bad solution, why did you create it in the first place? It’s hard to swallow a model of cognition along the lines of “generate a bunch of random answers and then filter out the ones that don’t work.” My answer goes back to the notion above about common-sense reasoning versus creativity reasoning, but I think it is a legitimate issue. And even if people do this sort of post-hoc editing of creative reasoning (and IIRC the psychologists are split on the issue), why should we build computer programs with the same limitations?

As I recall, one reason I had the post-hoc assessment in the case of device invention was that it let me re-use the TRAMs from storytelling essentially unchanged. If we borrow a notion from CBR that knowledge starts out very specific and becomes generalized as we experience it more often, then it may be that our creative processes are fairly specific (since we use creativity much less frequently than common-sense reasoning). But (as pointed out above) this is inconsistent with the notion that creativity must apply broadly across domains! So post-hoc assessment by common-sense reasoning might be a “hack” to compensate for this problem. This notion suggests some interesting experiments about people who are creative in several different domains. Have they generalized their creativity? Or are they especially good at applying their common-sense filters?

…it’s difficult to find a way for a machine to distinguish “surprising but interpretable (or even clever)” from “completely nonsensical and seemingly random”

In the case of Minstrel, it didn’t generate nonsensical ideas because everything was goal-driven. Killing yourself with a sewing needle might be naive, but it isn’t nonsensical. I always thought that Lenat’s EURISKO was interesting in this way — it couldn’t create a nonsensical solution, but it could creatively explore the solution space to find unlikely but effective solutions.

(I haven’t gotten myself a copy of the dissertation yet, unfortunately, so everything I say is from reading the book.)

The Tech Report Noah references above is the dissertation (in a more compact form).

I’ve actually thought only a little about how human creativity works. But I had a very interesting series of conversations earlier this year with Lev Manovich, Jim Hollan, and Falko Kuester about how we might examine creativity as something that happens across globally distributed communities. In some ways this connects to Lev’s interests in the mixing of media, the movements of fashion, and so on. In other ways it connects with Jim’s work on distributed cognition. Which is all to say, I’m intrigued by the idea that we might view a Minstrel-like system as simulating a social process as much as an individual process. People create lots of pretty wild things, other people serve as a filter for those creations, and the elements from things people view as successful get partially re-used in their own creations.

Less speculatively, one of the questions that interests me (and Andrew) is whether a Minstrel-style system might be useful in an interactive context. I think interaction is one of the ways that audiences can really understand the power of generative systems — the responses to their actions take a form based on the processes at work, with the back-and-forth starting to reveal some of the contours. This would be good for a system, like Minstrel, that has processes of an intriguing shape (whereas it’s bad for systems like Eliza, which have basically boring underlying processes).

Right now, the best way I can think of to create an interactive media experience around something like Minstrel is to offload the commonsense reasoning on the audience. Rather than presenting finished stories generated by the Minstrel-style system, the system would be presented as a tool or partner, generating candidate fictional elements which the audience could selectively accept or reject, allowing the audience to guide the results toward the traditional or the surreal. I’m finding myself intrigued by the idea…

Less speculatively, one of the questions that interests me (and Andrew) is whether a Minstrel-style system might be useful in an interactive context.

I’m fairly naive about IF, but I did speculate a bit about running Minstrel “backwards” to go from episodes to higher-level structures. In a very simplistic way this is how Minstrel begins a story — by getting a bit of input that reminds it of a specific memory, which in turn reminds it of a plot. Something of the same sort could be attempted for interactive fiction. Supposing that the human in the loop enters some action, Minstrel could then try to be reminded “backwards” of some plot structure that action could fit. For example, the player drinks a potion; that reminds Minstrel of “Romeo & Juliet.” Then Minstrel could try to inject elements into the story to complete a Romeo & Juliet-like plot. Creativity would be useful both in recognizing a past plot and creating the new plot elements.
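The “backwards” reminding described above can be sketched as indexing known plot schemas by the actions that appear in them, so a player action retrieves a plot to complete. The plot inventory, action labels, and function below are all hypothetical, invented only to show the lookup direction:

```python
# Sketch of "backwards" reminding for interactive fiction: a player
# action retrieves a plot schema containing that action, along with
# the remaining plot elements the system could try to inject.
# Plots and action labels are invented for illustration.

PLOTS = {
    "Romeo & Juliet": ["fall-in-love", "drink-potion", "die-tragically"],
    "Grail Quest":    ["receive-vision", "ride-out", "find-grail"],
}

def remind(action):
    """Return (plot name, remaining actions) for a plot containing
    the action, or None if nothing comes to mind."""
    for name, actions in PLOTS.items():
        if action in actions:
            i = actions.index(action)
            return name, actions[i + 1:]
    return None

# The player drinks a potion; the system is "reminded" of Romeo &
# Juliet and could then inject the remaining plot elements.
reminding = remind("drink-potion")
```

A creative-memory layer like the one Turner describes would sit on top of this: when no plot contains the literal action, the query could be transformed until some schema is recalled.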

Right now, the best way I can think of to create an interactive media experience around something like Minstrel is to offload the commonsense reasoning on the audience.

I had to be careful in creating Minstrel not to inject my own intellect into the results, but it seems clear to me that I could have improved Minstrel’s output considerably just with better use of language and “fleshing out” Minstrel’s reasoning. So the notion of a collaborative system which uses human reasoning to augment the computer’s creativity is intriguing. On the other hand, you’d probably get people arguing that creativity is exactly the part that people do best!
