Monday, October 1, 2012

It was the middle of May when I first wrote about the ETS Math Assessment Game Challenge and my search for fun games that use traditional mathematical notation. At the time, I was contemplating which of several projects to undertake for my Summer of Professional Development. I had decided the previous Fall semester that I would not take on any extra obligations for Summer 2012, using it instead as a "summer sabbatical," knocking a few items off of my personal and professional punch lists.

As I considered where to invest my Summer efforts, I found my mind returning to the design of a math assessment game. I have been working with students and colleagues at Ball State for the last few years on educational games—that is, games that produce predictable learning outcomes—but I had not considered the challenges of making games for assessment.

There were two ways in which the contest sponsors made this more appealing to consider. First, there were two published learning progressions, based on sound educational research, so I knew I wouldn't have to try to invent a taxonomy of learning. Given the progressions, I framed the design problem as one of mapping the learning progression levels back to measurable in-game behaviors. The second factor was the appealing prizes: cash prize for first place, and top three get a trip to talk about games and education research with ETS staff. Cash is a powerful motivator when one has turned down work offers in order to focus on personal growth for the summer months, and professional networking is always in season.

I decided to pursue an entry to the challenge, figuring it would take about two weeks to generate a testable digital prototype. Spoiler alert: I had forgotten to take into account Hofstadter's Law.

First Attempt

I had been playing a lot of Flash Duel, and this inspired my first attempt. I worked out a game that embraced the board-as-number-line property of Flash Duel and Engarde, giving the player a hand of cards representing both numbers and numeric operations (represented by the Jack above). I wanted to take advantage of two-player competitive gameplay, perhaps even using combat as a metaphor, and this would also have allowed me to hit another item on my punchlist: make an online multiplayer game. However, I had a hard time making this prototype fun. I decided to put my Lego away for now and switch tactics.

Second Attempt

My wife had found out about the game Equate, which appears to marry Scrabble and algebra. I still have not played it, but I liked the idea of reinforcing well-formed mathematical "sentences" in the same way that Scrabble requires (and assesses) knowledge of well-formed English words. I became particularly interested in scoring systems that reflect increasingly mature understandings of algebraic relationships, particularly with respect to variables. I spent enough time with this prototype to try a digital version. This prototype had a nice feature, that it could be created as a single-player game in case there were problems with multiplayer.

I had recently become aware of the PlayN library for making cross-platform games. I love the technological wizardry of GWT, which allows you to write your application in one language and deploy to many, hiding the seamy side of cross-browser Web development; PlayN is essentially that, for games. Working with PlayN required me to learn Maven, which had also been on my imaginary punch list, through very far down. After having invested significant time into understanding how Maven works, it's hard to imagine doing another serious project without it, despite the headaches. Learning a new API and project automation platform took time and energy away from core game design and development tasks, but I have no regrets along these lines.

As I tinkered with technology, I also considered how to align learning outcomes with in-game actions. Achievements seemed a natural way to do this, and I developed several different achievement taxonomies over the subsequent weeks and months. The learning progressions necessarily combine evidence of learning and evidence of ignorance, and to capture this, I used a system of demerits and badges in my game. The complete details for these are provided on the "For Educators" section of the game's Web site.

Implementing the game was an opportunity to practice what I preach. I started with my domain model, developing it via TDD. Very quickly, I hit my first major impediment: in order to implement the achievements I had sketched, I would need to write my own expression parser. I started with an ad hoc approach that worked for simple, non-variable cases, and over time, I revised this to use the Shunting-yard algorithm to build an in-memory parse tree. This parse tree is traversed by various visitors to determine progress towards demerits and badges.

Debugging the parse tree

Without unit tests to save me from regression defects, I would certainly have not completed the project. Summer 2012 was supposed to be Summer Sabbatical Happy Learning Time, but it turned out to be Summer of Unexpected Interruptions. The good news is that we discovered that our master bathroom floor was rotting away before anyone fell through it. The bad news is that we had contractors in and around the house for about seven weeks to fix it, along with other planned maintenance and improvement projects. Combined with a death in the family and associated travel, it was not the focused and productive three months I hoped it would be. We got to spend some quality time with loved ones, which is really was more important than implementing a multiplayer mode for a summer project. In any case, keeping written a task list and a suite of passing unit tests allowed me to leave the project for over a week, then drop back into productive development very quickly. This worked much better than when, in the ignorance of youth, I tried to keep project plans entirely in my head—projects which never saw the light of day.

Unit tests cannot save one from bad user-interface design decisions. My first fully-playable prototype used a "drag and drop" motif for placing tiles on the board. As I manually tested each build, I was able to get the feature working properly, and I was quite pleased with myself. Then, I tried playing through a game or two. It is really tedious to drag and drop tiles with the mouse. It wasn't quite as bad on the Android tablet, but it was still awkward, especially because mine is pretty clunky and unresponsive. I decided to revise the user interface to use a click-select, click-place model. The result was a much improved user experience. Unfortunately, the drag-and-drop assumption had been buried very deeply into my software architecture, and I had to rewrite nearly all of the user-interface system.

Screenshot of the final version

The working title for the game was Algebra Game. Catchy, I know. Brainstorming with my wife helped me to realize I really wanted to highlight the fact that the game is about equations. Since the game takes places on squares, I ended up with the slightly punny Equations Squared. I hope no math zealots are upset that one cannot actually create quadratic equations in the game.

I had decided from early in the project that even though PlayN supported many different platforms, I would focus on the HTML target. Everybody has a browser, so I hoped this would give the biggest impact; I'm still interested in how the game can be adjusted for Android and iOS, but right now, it's HTML-only. PlayN builds upon the GWT compiler technology, and it has similar dependencies on browser implementation of CSS. Web developers: you can see where this is going. Alignment of symbols and handling of canvas were predictable on chromium-based browsers and Firefox, but IE had problems. For the alignment of text on tiles, I ended up replacing the dynamically-created tiles with an image sheet, which is all but guaranteed to work on all modern browsers. After my project was submitted to the challenge, I got a very helpful fix for other IE problems, but I decided to leave the code alone during judging; I ended up putting up a little warning message regarding IE not being fully supported, which hopefully does not drive too many people away.

Example boards

Late in the Summer, I started work on the Website for the game. I knew I wanted to host it on my departmental Web server because it was easy to integrate secure file transfer into an automated build process; this could have been done with other hosts, but I already had SSH set up for these machines. After a bit of fumbling around, I came across JQueryUI, which was a joy to use. There are a few niggling details that I wanted to fix on the site, but I'm happy with the result. The best part of it is under the hood. I wrote an internal domain-specific language in Javascript to abstract the construction of sample boards. For example, that first board is created with the following call:

new Board(5).horizontal(0,2,'1+1=2').caption('Horizontal')

This allowed me to abstract the construction of sample boards from the code that creates the table of CSS-colored content. With a little more TLC, I could have factored out the first two arguments that specify the starting coordinate of the equation, yielding a Clean DSL, but despite this, I'm pretty happy with the approach. "If it can be automated, it should be automated!" In case you're curious, this is similar to, but exactly the same as, the Java-based DSL I use in the production code to create and test game board configuration.

Here's a little quantitative breakdown. By the end of the project, I had 710 changesets in my Mercurial repository. According to SLOCCount, I ended up with 7,086 lines of platform-independent production code and 2,559 lines of unit tests, plus about 200 more to handle platform-specific idiosyncracies. (A brief aside: I've never used SLOCCount before, but it installed easily from the Ubuntu repositories. It claims the COCOMO total estimated cost to develop this software was $293,470. I guess I work cheap.)

A few days before the submission deadline, I put together an introductory video, as required for the competition. I spent hours searching for and tinkering with video editing software, but nothing seemed to work the way I wanted. I ended up doing this with RecordMyDesktop and a hand-held mic. This was about the thirtieth take.

The project took much longer than planned, but I did end up with a complete submission. I am happy with both the results and the process. I hope that the results are useful to others as well, and if you do end up using Equations Squared in a learning environment, please do let me know. I tell my students regularly that shipping is hard. It's easy to have a few ideas and write some throwaway code; it's a different matter altogether to actually build something of value. There are several ways in which I've considered extending the project, including multiplayer modes and mobile-native versions. Whether any of these are undertaken really depends on demand.

A big and public Thank you! to my alpha and beta testers. They provided invaluable feedback on platform issues, usability, and defect detection. Finally, thank you to my wife and kids for indulging my desire to spend the summer creating.

Author's Note: At the time of publication, there are less than 2.5 hours until the winner of the ETS Math Assessment Game Challenge is announced. In the spirit of unbiased personal reflection, I wanted to get this post up before the winners are announced. There are a few more details I was hoping to write about, such as the use of Google Web Fonts to do some nifty dynamic font loading, but I will delay that for another day.Update: Good news, everyone! Equations Squared won the grand prize! The contest entry page now has a pretty gold ribbon proclaiming the news.

A few of my testers expressed difficulty reading equations vertically. It never gave me any trouble, and it seemed to me that once the testers tinkered for a while, they got over it. The design intent is that one read horizontal and vertical equations the same way, but there's no doubt that the former is more familiar than the latter.

I'd like to get the game in front of a group of 5th-9th graders of mixed experience and see if this is a hindrance, or if it affects the relationship between demerits/badges and assessment. Alas, that's a study I don't quite have the time for right now.

About Me

I am an Associate Professor of Computer Science at Ball State University. Much of my recent work has been on the intersection of games, fun, and learning, leading interdisciplinary teams of undergraduates in studio-based learning experiences,