Playing Phantom of the Arcade within the context of a web page has me thinking about the new things we could be doing with interactive fiction that simply weren’t possible in the heyday of text adventures.

IF is an interesting game type. The player isn’t just working to complete the story, but to experiment with the gameworld and read what the author has to say. Being stuck in IF can be just as rewarding as moving forward, provided the game has interesting things to say about it.

For example:

> use fork on light socket

I beg your pardon?

Isn’t very fun. You type in something crazy and the parser just shrugs. On the other hand:

> use fork on light socket

The electricity in this house is functioning much better than your common sense. Ergo, you have been shocked.

This might squeeze a smile out of the player. Even if it doesn’t, there is satisfaction in seeing the gameworld react to your input. You connect with the author over this mutually understood action. Feedback itself is a reward.

The challenge for the author is to fill the world up with entertaining feedback. Trying to write a response for every conceivable player action is a task that scales poorly. That’s simply too much to write, and a vast majority of them will never be seen. What the author needs is to be able to anticipate likely courses of action – correct or not – and compose responses for them.

I’ve always said that you can’t beat people in aggregate, no matter how smart you are. Anyone who has ever run a tabletop roleplaying game can tell you how impossible it is to anticipate player actions, even when there are only four of them and they’re all your friends. Anticipating the actions of thousands of strangers is an impossible task. Those people will try things that never occurred to you and will come up with (sometimes legitimate) solutions to problems you didn’t envision. Every “I beg your pardon” and “you can’t do what” on the part of the parser is a tiny failure on the part of the writer as he or she tries to outwit thousands of readers.

But games that run over the web give the author the ability to adapt the story in response to failed player input. What I would like to see is a system which gathers up failed actions (grouped by room) for the writer to review, sorted by frequency. So the designer can review this list and see that n number of people tried to use the glazed ham with the woodburner stove instead of using it to distract the rabid wiener dog. The author can either let the player cook the ham (which would not diminish its usefulness against the dog), offer up some sort of reasonable response as to why they can’t or shouldn’t cook the ham, or at the very least add some text so that the game can say “no” instead of “huh?”

To do this you’d just need a bit of functionality added to the parser: Whenever it encounters something it doesn’t understand, it needs to submit something to a database on the website with the subject, verb, and room. (And maybe a couple of other tidbits for housekeeping purposes.) Within a few hours of going live, the author should have a very clear picture of where the rough spots in the game are (what rooms had the most dud entries) and what the commonly attempted player actions are in those rooms. This would be much smoother and more seamless than simple playtesting, and would include the input of all players instead of just a handful of dedicated testers. It would make the designer better at their job, and help focus effort onto the most likely responses.
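A minimal sketch of what that logging could look like, assuming a tiny SQLite table; the column names, function names, and report shape are invented here, not taken from any real IF engine:

```python
import sqlite3

def make_db():
    """Create an in-memory database of failed parse attempts."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE failures (room TEXT, verb TEXT, subject TEXT)")
    return db

def log_failure(db, room, verb, subject):
    """Called whenever the parser fails to understand a command."""
    db.execute("INSERT INTO failures VALUES (?, ?, ?)", (room, verb, subject))

def report(db):
    """Failed actions grouped by room, most frequent first."""
    return db.execute(
        "SELECT room, verb, subject, COUNT(*) AS n FROM failures "
        "GROUP BY room, verb, subject ORDER BY n DESC"
    ).fetchall()
```

In practice the real parser would post these rows to a server-side database, but the grouping-and-sorting query is the whole trick: the author opens the report and sees the most common dud entries per room at the top.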

I always got very frustrated with the open-ended IF games. I was constantly asking to do things that the game didn’t understand. Eventually, when a player gets enough “I beg your pardon?” messages, they’re inclined to turn it off for good. Much better for me were the Microzine games for the old Apple IIc computers. They were literally “Choose Your Own Adventure” novels translated into text-game form with horrible dark-age graphics (“Is that the sea monster or…a big pixel blob or…wow, maybe there’s a dragon, too! Oh, nope.”) You got text and then you got multiple choices which you arrowed between. What you didn’t get was “I beg your pardon.” It was more limited, sure, but much less frustrating.

What you’re suggesting, though, would revolutionize IF games for me. They might actually become playable for people like me with a low threshold for failure and very limited patience. At least the game would reward my insanity with knowing what the heck I’m saying. Get on it, Shamus. I’d actually pay to play that game.

My limited programming knowledge makes me think that a simple if-then statement would work. If you made a new function (playerResponseEmail) or whatever, that says

    if gameresponse == "huh?":
        send email to programmer

I think this would take care of the problem. Just toss the function into the code for each room and off you go.

Obviously, it’s a lot more complicated than that, but it definitely seems within the capabilities of modern technology. Someone who can actually program should get on that.

Never seen a feedback parser, but there have been some awesome improvements in IF over the years. There are two big languages used in the DIY community: TADS and ADRIFT. TADS now has HTML support, so you can do hypertext formatting of text, including in-line images along with text. Images act as rewards for players, and a motivation for thorough exploration.

ADRIFT has an auto-map…the importance of which can’t be overstated. I always ended up making maps on paper anyway…so not having to do that anymore is a welcome respite.

Sending an email to the programmer would be a great way to inundate the programmer with emails. Shamus’ idea is for the parser to add to a database accessible to the programmer. This is a great idea, indeed.

Yeah, I think you’ve touched upon one of the more interesting arguments in the IF world, one that has been bandied about for some time: do we need better IF parsers?

I actually started writing a blog entry with this title a while back, but never finished it. The gist of it was going to be no, we don’t — you can argue that the tools already exist, it just requires more work and dedication on the part of the author to have a system that is better able to handle unrecognized input.

Players usually cite the parser as the main problem with IF, as you describe — we want to try different things, but when the parser just responds with a generic “I don’t know what you mean,” we get turned off, particularly when this happens over and over. Players get frustrated when the parser doesn’t provide sufficient feedback to understand why a particular action didn’t work or was not understood. They also get frustrated when the parser doesn’t recognize input that it probably should understand (like if you refer to an item described in the narrative, but it’s not implemented as an object in the game world). Players are also frustrated when the parser doesn’t accept a wide range of input, such as when players try to use adverbs (quickly, carefully, angrily, etc).

Some of these things are issues that the author needs to address, and the community has discussed for some time how adequate testing is needed to uncover most of them. Obviously, there’s only so much testing you can do — and often authors use other IF authors or veteran IF players for their testing, which doesn’t always reveal the problems that come up when less advanced players try the game. Still, after playing enough games you can generally spot the ones that have undergone thorough testing and those that haven’t. Many IF games and engines can capture the full text of a game from start to finish, which the author can then review to see what players did (or tried to do), and learn from that. A good example is Aaron Reed, whose 2005 IF game (“Whom the Telling Changed”) was a finalist at Slamdance in Park City. He saved all of the transcripts from when the game was on display at the competition, and eventually compiled and published the data on his web site. It was a very revealing look at how many (newbie) players approached IF.

Some of the other issues mentioned above, though, are less author-dependent, but there are still things that authors can do to prepare for such things. The Inform system, for instance, supports a huge library of extensions which can make the parser work better in some cases — particularly for newbies who don’t necessarily understand how parsers typically work. The aforementioned Aaron Reed, for instance, has written a customizable Inform extension called “Smarter Parser”, which allows the parser to understand a broader range of input and can direct newer players towards proper parser syntax. There are also extensions which can perform basic typo correction, so that players don’t have to re-type commands just because of a simple mistype. And there are plenty of others.

Part of the problem with IF parsers is that they need to operate under a defined set of rules, and players need to learn those rules. But I think it’s a valid argument to say that it’s largely the author’s responsibility to teach the player those rules, and the rules of that particular game world. Those rules are taught through sophisticated and comprehensive error trapping, so when the player tries to do something that breaks either the parser rules or the game world rules, the player is given a good explanation of why he or she cannot perform the desired action. If that happens, players are generally more accepting and willing to continue; in the absence of that, they are more likely to just say “screw it” and get back to blowing things up in an FPS.

This is not to say the solution you describe is not a useful idea — I think it’s a great idea, in fact. What it represents is just a unique way of expanding the “test base” for a game in a dynamic fashion. It hasn’t been done up to this point probably because playing IF within web browsers is still a very new advance for IF. As the technology develops (there are still a number of issues to resolve), I suspect we will definitely see more solutions like this.

I think unrecognized input would have to be handled a certain way, because the list of errors could potentially be huge. What would probably be more useful is a system that just saves every game transcript to a database and sends it to the author, and if there is any unrecognized input, it can be (for instance) highlighted in red so authors can spot it easily. That way, the author has the full context of the error on display, without having to figure out when, where, and why it happened.
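The highlighted-transcript idea could be sketched like this; the `!!` marker stands in for red highlighting, and the tuple format for exchanges is an invented assumption:

```python
def render_transcript(exchanges):
    """Render a game transcript for author review.

    exchanges: list of (command, response, understood) tuples, in order.
    Rejected commands are prefixed with '!!' so they stand out in context.
    """
    lines = []
    for command, response, understood in exchanges:
        marker = "" if understood else "!! "  # stand-in for red highlighting
        lines.append(f"{marker}> {command}")
        lines.append(f"{marker}{response}")
    return "\n".join(lines)
```

A web front end would swap the `!!` prefix for a red CSS class, but the point survives: the author scrolls one document and sees every failure with the surrounding play that led to it.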

But essentially, I think what we’re talking about is a different, more comprehensive approach to (continually) testing and refining the games that we make.

Hmm…maybe I’ll just blog this whole response. Sorry for rambling so long.

Let me recommend the interactive fiction of Andrew Plotkin (who also wrote a nifty duelling-wizards game called Spellcast) – a quick Googling will turn up his lair, called Zarfhome for reasons I don’t know.

He wrote one of the niftiest IFs I’ve ever played, called “Spider and Web”, which is a story told in the form of flashbacks: you are (MINOR SPOILER, but one found in the game’s reviews) being interrogated, and when you “lie” about something you did, you drop out of the flashback into the present, where your interrogator says, “Oh, come now, we know you didn’t do *that* — let’s try it again, shall we?”, after which you’re back in the flashback again. It’s been a decade or more since I played it, but it still stands out in my mind as a very clever demonstration of what can be done with IF beyond the “take grease; put grease in lock” style.

I would like to see an AI parser that tries to figure out what you’re talking about. That is, if you find a typewriter and enter in ‘write with typewriter’, even if the lexicon doesn’t understand “write with”, the parser will figure out you’re talking about the typewriter and give you some more information about it, such as:
“You are looking at an old gray metal Remington Royal typewriter with black keys. It is quite dusty. A yellowed sheet of paper has been inserted into the carriage. There is no ink ribbon.”
This is sort of the way people actually carry on conversations, in that if they come across something they don’t understand they normally ignore it and instead respond to what they did comprehend. So while “write on typewriter” doesn’t mean the same thing as “look at typewriter” or “examine typewriter”, responding to the latter might supply important information to someone attempting to do the former (in this case, that the ink ribbon is missing).

Obviously this is a fairly trivial example, but the point is with an AI parser you could reduce the user’s frustration by building on what is understood, rather than simply what is not understood.
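A toy version of that fallback behavior might look like this; the verb list, object names, and description text are all invented for illustration:

```python
# Known objects and their descriptions (invented examples).
DESCRIPTIONS = {
    "typewriter": "An old gray Remington with a yellowed sheet of paper "
                  "in the carriage. There is no ink ribbon.",
}
KNOWN_VERBS = {"look", "examine", "take"}

def parse(command):
    """Respond to what was understood, rather than reject what wasn't."""
    words = command.lower().split()
    if not words:
        return "I beg your pardon?"
    verb = words[0]
    nouns = [w for w in words if w in DESCRIPTIONS]
    if verb in KNOWN_VERBS and nouns:
        return DESCRIPTIONS[nouns[0]]
    if nouns:
        # Unknown verb, but we recognized an object: describe it anyway,
        # the way a person would respond to the part they understood.
        return DESCRIPTIONS[nouns[0]]
    return "I beg your pardon?"
```

So “write with typewriter” falls through to the description, which happens to mention the missing ribbon, exactly the information the player needed.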

Alas, IF is not so simple. Look at Inform. Games designed in Inform are output to Zcode, which is a bytecode format, which is then interpreted by a Zmachine interpreter. That is to say, Inform code is always run in a virtual machine, sort of like Java. The Zmachine has limitations to what it can do and how it does things. The language you code the Zcode in has limitations in terms of what functions of the Zmachine it can tap into.

Tracking feedback from commands, like say, logging interactions that the parser doesn’t understand, would require adding a little functionality to the Zmachine interpreter itself (Parser doesn’t understand? Add it to the log with this info). Adding web formatting code would require commandeering some unused Zcode opcodes and then adding functions to the language, like Inform, to allow those new functions to be accessed.

That said, IF coding is probably easier than a number of other projects, but you have to be careful about breaking existing models, because there’s such a mass of existing IF that might stop working if you don’t break things very carefully. And there are so many darned IF languages out there these days that reinventing the wheel would require getting just about everything right up front in order to win converts. And backwards compatibility with other interpreters probably wouldn’t hurt.

A friend of mine was doing some research on medical expert systems that actually worked along those lines. There was a technical term for it, but I forget what it was. The basic idea is that instead of trying to teach the computer to be smart like a human when you write the code, you make it easy for the expert using the system to train it to be better as they use it. Only in your example the expert is the storyteller, not a doctor.

So to extrapolate what they were doing to your idea Shamus, you’d encode a few basic properties into your items, and some responses based on that. Ham is a food, and trying to use a food on an item is, by default, trying to feed the food to the item. So when you try to use the food on the stove the system recognises this as something it doesn’t know what to do with, and then asks the expert for help:

“Should I say ‘The stove doesn’t seem interested in eating the ham.’?”
“No.”
“Why not?”
“A stove is not alive.”
“Oh. What should I say instead?”
“For the stove: The smell of cooking wafts tantalizingly through the room.”
“Which of these items are alive? ”

The neat thing about doing it this way is that, just like in your idea, it gives a default response for things it doesn’t know what to do with, and flags them for the writer to fix (and it asks about things in the order of how often they come up). But the other neat thing is that, in addition to working out what precisely to do with the ham-stove combo, the system also learns a bit more about the relevant properties of its world, and the _default_ response also gets better. So when you try to use the bagel on the washing machine, it will say “I don’t understand” instead of “You can’t feed the bagel to the washing machine”; using the bagel on the stove gets the cooking message; using the bagel on the dog gets the not-interested message (…and using the dog on the stove gets you the cooking message, which you may or may not need to sort out, depending on your sense of humor or lack thereof). And in all cases it flags these unknowns for expert review.
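That property-driven default could be sketched roughly like this; the item properties, responses, and review queue are all invented stand-ins, not any real expert-system API:

```python
class World:
    """Defaults driven by item properties, with unknowns queued for review."""

    def __init__(self):
        self.properties = {}    # item -> set of properties, e.g. {"food"}
        self.responses = {}     # (item, target) -> author-written response
        self.review_queue = []  # combos the author should look at

    def use(self, item, target):
        # An author-written response always wins.
        if (item, target) in self.responses:
            return self.responses[(item, target)]
        # Otherwise fall back on property-based default rules.
        if ("food" in self.properties.get(item, set())
                and "alive" in self.properties.get(target, set())):
            return f"You feed the {item} to the {target}."
        # No rule applies: flag the combo and give a neutral answer.
        self.review_queue.append((item, target))
        return "I don't understand."
```

The author answering “the stove is not alive” amounts to editing `properties`, and supplying the cooking message amounts to adding an entry to `responses`; both make every future default answer a little smarter.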

You could also, with-or-without the expert system idea, co-opt your players as co-authors. You give them an option to flag a response as inappropriate from within the game, and even enter a better one. So the first time you use the ham on the stove it tells your player “You can’t do that, but I don’t know why; do you?” and the player, being a smart-arse, flags this and enters for the new response “Because the moon is made of cheese.” The next person who tries it gets “You can’t do that; could it be because the moon is made of cheese?” If they answer yes you count it as a vote for, and eventually with a few votes you start answering “You can’t do that because the moon is made of cheese.” If they answer no you collect another possible response to collect votes for. If there’s a clear winner you just use it; if not, you flag it for the real author with the response ideas and votes. If you had to login to play the game, you could even keep a history of people’s ideas and how often they were voted up or down, and weight their future suggestions accordingly.

Have I thoroughly derailed your idea onto a tangent yet? I believe I have. Ah good – my work is done here now.

Where I initially thought you were going with this, by the way, was some sort of collaborative effort to populate the game world with responses to goofy attempted actions: I thought you were going to suggest that every time the game would normally say the equivalent of “Huh?”, it would instead ask the player to help out by contributing a little snippet of text that it could use the next time this came up. It would rely on the good will of its participants (always risky), but the developer could act as a moderator, and the nice thing would be that the world would get richer and more detailed the more people played the game.

Of course, this only works (or only works easily) for purely null results, like your fork-into-outlet example above: a player could have contributed that text. Once you get into altering the state of the game world as a result of your attempted actions, that would be a whole hairball, probably more trouble than it would be worth.

Instead of asking the developer to read all the failures, why not ask the community to provide answers to them, making it more “2.0”?
Every failure would be visible through a web page, and visitors would be able to submit a response to it, which would be displayed if the same failure happened again.
The submitted response wouldn’t have any impact on the story, but it would be more entertaining to read.
The developer would still have the opportunity to write something for the popular failures.

DevNull and 2.0: Good thinkin’! (Sorry Dev, I missed your earlier post that made a better version of my suggestion, and 2.0 seems to have made a similar suggestion at almost exactly the same time. Funny.)

I’ve been looking into Inform recently. If you HAD a parser with feedback, this specific “I beg your pardon?” response is trivial to fix – “understand using a thing on the light socket as plugging it into the socket”.

This is what testing is for, but due to the heavily authorial and obscure nature of IF, it’s often difficult to get good testing in.

Shamus, I think this could be applied to more than just IF. All we need is a game with online capabilities, and a team willing to patch it.

Remember those HL2 heat maps that showed where most players met their doom? Steam already has the capability to poll this sort of information, and Valve could use this data to release patches to, for example, add extra ammo/armor in the red spots of the map, re-arrange the cover, and so on.

This goes double for MMOs. Every time someone abandons a quest or a mission you could log this information, showing you which quests require tweaking.

Since taking up computer science, I have been shown time and again that feedback is a powerful thing. I would love to try out a game using the system described above, though my own skills are nowhere near good enough to make one on my own.

First, in order for responses to be sent to the internet, a person has to have an account on the game’s host site or such. This way we can sort of filter out the people with dumb responses like “because the moon is made of cheese” and keep suggestions relatively limited to people who really care about the game.

So as people go through the game, the game keeps record of all the things that the player tried to do. When the player quits, he or she can choose to fix some of the problems with a simple editor, or they can save a log and fix the problems later. The editor will be discussed later, though. Okay, so as the players go through the game and fix what they thought was wrong in the end, the game saves all the additions to a file, which basically consists of a bunch of commands, as well as those commands’ effects. If the player so chooses, that file can be sent to the internet and attached to that player’s account’s signature or something. So other people could download the file and use it as a sort of ‘mod’ for their game. Then you could add a popularity-tracker to the site and everything, and add bells and whistles and stuff to make it all pretty.

As for the editor, it would be limited, but would be able to shape the world without unbalancing anything. With the editor, you can add/change an event description, rename the item (so every item would have both a handle (which the game uses) and an alias (the name the player sees)), rename the item it’s used on (doing the same with this), or even destroy either of the items. The last part is optional, I guess, but it could add some authenticity if used properly.

Also, perhaps people without accounts could send their entries too: the attachment file would be sent to the server and kept on a page for a couple of days. If it gets downloaded by enough other people in that time (with a couple of members required, to dissuade artificially advancing your own attachment), it stays up until it goes long enough without being downloaded, to lessen strain on the servers.

IF probably wasn’t the first genre of gaming I encountered, but it’s fair to say it was the most time-consuming. I played Zork and HHGG far more than was healthy. It’s been a while, but my recollection of those games was that they had well-programmed responses even for some particularly idiotic moves. I think that’s absolutely critical to making the game enjoyable, because it feels more like you’re toying with the game engine and less like you’re fighting it.

Phantom of the Arcade doesn’t give me that impression at all. One of the very first moves I attempted was shooting a ghost with a light gun (“shoot ghost with light gun”). Now, in retrospect, this doesn’t really make a great deal of sense – the gun was unplugged and would have been a very unimpressive weapon. However, given the very nature of the device, the game should have recognized what I was trying to do and belittled me for expecting an unplugged device to function properly. Instead, the parser told me that it didn’t understand the word ‘shoot.’ In this case, it’s difficult to decide whether you’ve tried something completely unexpected, or merely worded an action poorly – if there’s a definitive negative response, you immediately know the action was foolish, and move on to something else.

I ran into the same hurdle again when I first encountered a game cartridge sitting in a crane game. I tried taking the cartridge, playing the game, inserting a quarter, breaking the glass on the game, and vandalizing the game. Because none of these gave me a definitive negative response, I carried on, confident that I was merely wording something in a way the moronic parser couldn’t grasp. If the game had laughed at me for wishing to earn fluffy dice – or, better yet, allowed me to win the dice as a useless item – I wouldn’t have wasted the time on it.

Another irritation was not being allowed to play any games. Of course it’s not a brilliant choice when I’m actually trying to escape, but it shouldn’t have been disallowed. The way it’s written makes everything feel extremely linear and forced – there’s only one way to play through the game, and it feels more like navigating an unlit maze than solving a puzzle.

I think your parser-feedback idea is excellent. It wouldn’t change the actual linearity of the world at all, and would certainly not change the overall plot…but it would make the world seem more alive and more interesting.

The timing of your post is uncanny, Shamus. Emily Short, co-creator and chief documenter of Inform 7, and arguably the foremost IF author and theorist of the contemporary movement, released a game about a week ago that features an experimental feedback parser that allows for rapid iterations of the game using distributed authorship. Read about it and download it here: http://emshort.wordpress.com/
It’s called Alabaster.

Eric Idema recently announced a project-in-progress to allow IF authors to do exactly this: he has developed a server-side z-machine, and one of its intended uses is to keep anonymous transcripts and to provide feedback to authors, including information on the most frequently used commands by location. I am not sure it currently ties in to parser errors, but it might (or it might be possible to add that functionality).

Reminds me of an internet-based “20 Questions” game I heard about. It sort of learns based on the answers people give to its questions and such. They made a toy version (I think it was based on the internet one), and it is extremely smart at guessing what you’re thinking.

Seems like a text adventure version (well, there’s no reason you couldn’t apply it to a graphical adventure game like Sam & Max) of Valve’s death maps. They take note when players die, and the locations of the deaths turn into heat maps they can study.

One interesting thing they found, for example, in Episode 2 was that a lot of players killed themselves by jumping off the waterfall, for no reason whatsoever. Turns out that players like playing Lemming with poor Dr. Freeman.

It’s useful feedback for them, so they can determine what players like doing en masse (turns out players like dying if they’re the ones causing it), and where the combat hot-spots are that tend to get players killed. (For example, if they notice a whole lot of players are dying in spot X, chances are there’s a bottleneck or enemy rush that’s not appropriately telegraphed, so players get stuck by it.)

This could be pretty useful in interactive fiction, but only to a point – if the game gets bogged down to the point that players are forced to resort to the brute-force method (use [everything] on [everything]), then it’s going to choke the parser, and all the developer will learn is that players are frustrated.

This comic is a silly, hilarious adventure based on the point-and-click/text-based adventures of old, but what’s unique about it is that each new page is the result of a different “command” being inputted into the “game.” And what’s interesting is that all the commands are based on reader suggestions in the site’s suggestion box, so the readers pretty much guide the story (well, for the most part. The creator admits to making up a couple of commands out of necessity to the plot).

This topic reminded me of it considering that literally any command can be suggested in this comic and something will happen, since the creator just draws what happens as a result. Even if the command is something silly like “Build a fort” (which really happened…and the creator made it one of the central elements of the plot later). It may not be interactive fiction in the sense we’re talking about, but it is very reader-driven, and seemingly silly commands later became a thematic device in the comic’s silly story. If you like point-and-click adventures, I recommend giving this thing a read. You’ll really enjoy it. Again, the link is http://www.mspaintadventures.com/


Brilliant idea, Shamus. I’ve been writing an IF adventure with the ADRIFT software for some two years now. Most of one’s time is spent anticipating and writing feedback to wrong actions. I’ve linked your post on the ADRIFT forum; hopefully their clever programmer (it’s a one-man band) will be able to implement it.

Interesting post and discussion. In 2003, I designed an experimental hyper-fiction adventure game together with Stephan P., “Tine wird älter”. We implemented many of the proposed features: the author gets feedback about unrecognised input (sorted by frequency) and can develop the story based on the ideas of the players. In fact, we started with very few story elements and used the system to gather even basic ideas (although we did this more or less internally – at that stage playing could be quite frustrating). The story was closely linked to the system itself: you play a four-year-old girl who wants to turn five and thus has to understand and cope with the strange language of grown-ups.

We implemented the parser in PHP/MySQL. Neither the code nor the interface are quite up-to-date but it still works – for those of you who speak German: http://www.tinewirdaelter.de

If someone is interested in using the system as an author, feel free to contact me. Having read this post and the discussion I am even tempted to redesign and recode the whole thing from scratch…

…and I’m the Stephan mentioned in the post above; two long-time readers pointed us to this blog entry.

As Thomas said, the original Tine is slowly growing obsolete, but the basic idea certainly proved worthwhile. Doing it in English should even be a bit easier than in German, as the sentence structure allows for greater regularity coupled with semantic flexibility. It is, however, incredibly time-consuming for the author at the beginning, before the basic structure has reached a certain density and before long-time players begin to join the author in adding content. A new version should probably put greater effort into finishing large parts of *one* solution to the game before it goes online, and then introduce several authors as soon as possible. Perhaps the best way to do that would be to start with a well-known IF adventure and build new possibilities from there?

There is a game coming out called Scribblenauts, in which an insane number of words you can write become objects that interact with the world to solve puzzles. In an interview, the developer said they basically took encyclopedias and dictionaries, extracted all the objects, created basic properties and interactions (edible? flammable? etc.), then applied them and made the sprites. So this should be an interesting demonstration of that principle. I know I’m getting it! :)

One Trackback

[…] Alabaster Processing Preparing a game for testing November 8, 2008 Shamus Young has some interesting comments on parsing in IF and how he thinks it could be improved; Mike Rubin has a response. Partly this […]