Monday, January 21, 2013

I stumbled across a new game worth mentioning on Ludi Berkeley for several reasons, but at the moment I'm most intrigued by its marketability. Take a moment to review the description of the game (from BGG) below:

"Compounded is a game about building chemical compounds through careful management of elements, a fair bit of social play and trading, and just a bit of luck. In Compounded, players take on the roles of lab managers, hastily competing to complete the most compounds before they are completed by others – or destroyed in an explosion. Some compounds are flammable and will grow more and more volatile over time; take too long to gather the necessary elements for those compounds and a lot of hard work will soon be scattered across the lab."

It sounds exciting, but does this theme severely limit the potential audience in an already niche market of designer board games?

The answer is yes, it most certainly does. This does not mean, however, that its success is limited any more than that of, say, the most universally accessible theme of all: a train game based in ancient Egypt.

All themes constrain marketability due to player taste, accessibility or cultural relevance. I would argue that in cases such as Compounded, the theme actually strengthens the potential for success.

While we cannot yet evaluate the quality of the game, we can make several observations:

Chemistry is relatively uncommon as a theme in gaming; the novelty of the idea was enough to intrigue me into writing a blog post about it. The board game market has long been known to support well executed ideas regardless of theme. We've seen commercially successful board games about German politics, irrigation, and bureaucratic paper pushing.

The game has made it this far through play testing without a theme change, indicating either the mechanics mesh well with the Chemistry theme or the designer is at least content with the design and finds this theme most fitting.

With the game in preparation for publishing, Compounded seems to already have its audience. At the moment I became aware of it on Kickstarter it had gained 398 backers in 24 hours and as I publish this blog post it has gained 577 backers in four days of support. I'm thoroughly looking forward to Compounded and many more exciting themes in 2013.

Sunday, January 13, 2013

This post is the last in a series of three discussing the most recent season of D&D Encounters. I've already posted my recap of how the season unfolded and a discussion of how the "adventuring day" math in 4th Edition affects the narrative character of the game. One of the conclusions I drew was that the math behind roleplaying games uniquely, and distinctly from most other categories of games, affects both the mechanics and narrative of the game.

If you were to distill the central mechano-narrative concept of just about any RPG, it might look something like this: when you do cool stuff, the numbers go up. Of course, every RPG defines "cool stuff" differently, and every one has its own method for making the numbers go up. In D&D, "cool stuff" might be slaying a monster, surviving a trap, or solving a puzzle. From there, you get two rewards, experience points and treasure. Finally, the numbers go up once you've accumulated enough experience points to gain a level or if you find a magic item in all that treasure.

Every edition of the game has worked like this, but the relative contribution of level- and item-derived bonuses to a character's statistics has changed with every incarnation of D&D.

3rd Edition

In 3rd Edition, at sufficiently high level, a character would likely have no fewer than half a dozen magic items that granted static, constant-effect bonuses to his numbers. The preponderance of such items gave rise to the derisive nicknames "Christmas tree" or "mannequin" effect, suggesting that the primary purpose of a character was to act as a stand to hang his magic items on.

There were two big problems with the Christmas tree effect, one narrative and one mechanical. The narrative problem was that a huge fraction of the magic items in 3E went entirely unused. Though the designers and developers tried their hardest to make items interesting enough to compete with bigger numbers, it just didn't work. As a 3E player, given a choice between outfitting your character with an intriguing though limited-use and situational item and one that boosts a number that your character uses often, it was the utilitarian "make the numbers go up" every time.

And the mechanical problem was that the math behind magic item contributions to character numbers was obscured to both players and DMs. There was presumably some internal idea of an "assumed magic item power" vs. "character level" graph, but we couldn't see it. We could posit a function's existence and set its upper and lower bound but had no idea what its shape or intermediate values were.

In that respect, DMing 3rd Edition was a bit like being an auto mechanic who was charged with taking care of a car's engine but couldn't lift the hood. You knew you needed to operate the car under a certain set of conditions, and you knew you needed to give the engine the right kind of oil. But you didn't know what kind, or how much, or how frequently, and you often resorted to making adjustments once things had started making too much noise.

4th Edition

As it was wont to do, 4E fixed most of 3E's problems by introducing some of its own. 4th Edition made two major simplifications to how the item structure worked. First, it reduced the number of types of items that could give a static bonus to a score from at least eight in 3E to only three. That change helped to alleviate the narrative problem from 3rd: no longer did you have to choose between a ring that granted you a brief moment of invisibility and a ring that enhanced your armor--because there was no such thing as a ring that enhanced your armor in 4th Edition. Instead, the ring slot was free to do all sorts of other interesting things besides just provide a numerical benefit.

So far, so good. 4th Edition's other big reform was to codify the exact character level for which each magic item was appropriate. In general, a +1 item (i.e., one that adds 1 to a d20 roll, or a 5% bonus to your attack or defense) was deemed appropriate for a 1st-level character, +2 for a 6th-level character, and so on up to +6 for level 26. So in 4th Edition, a DM had a much better idea of which magic items characters should have and when; for example, if your 7th-level fighter already has a +3 sword, he's slightly more powerful than average for his level.
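The schedule described above can be written as a one-line formula. This is a minimal sketch, assuming the bonus steps up by one every five levels between the two endpoints the post gives (+1 at level 1, +6 at level 26); the function names are hypothetical and are not taken from any rulebook.

```python
def expected_enhancement(level: int) -> int:
    """Enhancement bonus implied by the schedule above for a character level.

    Assumes one step every five levels: +1 at 1, +2 at 6, ... +6 at 26.
    """
    if not 1 <= level <= 30:
        raise ValueError("4E character levels run from 1 to 30")
    return 1 + (level - 1) // 5


def bonus_vs_expected(level: int, actual_bonus: int) -> int:
    """Positive if the character is ahead of the schedule, negative if behind."""
    return actual_bonus - expected_enhancement(level)


if __name__ == "__main__":
    for lvl in (1, 6, 11, 16, 21, 26):
        print(f"level {lvl:2d}: +{expected_enhancement(lvl)}")
    # The 7th-level fighter with a +3 sword from the example above.
    print("fighter is ahead by", bonus_vs_expected(7, 3))
```

Under that assumption, the 7th-level fighter with a +3 sword comes out one step ahead of the schedule, which matches the "slightly more powerful than average" reading.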

But what if a DM doesn't give out magic items at that canonical rate? What if he built his world to be comparatively low-magic, or he decides there's not a situation where the characters realistically would have found a magic sword in the past few levels? In the original conception of 4E, he would have broken the game. +3 vs. +2 doesn't seem like a big difference, but for every "unit" of each magic item lower than where you "should be" for your level, combat becomes 10% more difficult.

The much worse problem was a newly introduced narrative one. As the magic item math became more transparent and more tightly integrated with the character power math, DMs and players alike began to think more acutely about the "correct" magic item to have at a certain level. As a result, magic items felt less exciting, less mysterious--frankly, less magical--and more utilitarian, more simple tools to make the numbers go up.

Essentials (and looking forward to Next)

Sometime when we weren't looking, but probably around the time that the Essentials rules started worming their way into 4E, the "inherent bonuses" rule showed up, and it's a much better addition to the game than most of the rest of Essentials. All the rule does is decouple the "numbers going up" function of magic items from their narrative interest. The "expected" numerical increases from magic items just get rolled into level-derived bonuses. The reason that "inherent bonuses" is good is that it frees the DM even more to give or withhold magic items as he sees fit, or to make them focus on their special properties instead of just what they do for the characters' numbers.

War of Everlasting Darkness was the first Encounters season to mandate the use of inherent bonuses rather than assuming a glut of magic item-based numerical increase, and it worked beautifully. Only a handful of magic items existed in the season, and all of them served a deeper narrative purpose than simply making the characters more able to survive combat. My players really got into it, getting more excited about the items' ability to break the eponymous darkness than about their game statistics.

And that seems to be a more authentic expression of what a magic item in D&D should be, dating back to its Tolkien-esque roots. Sure, Sting was probably a little sharper and better balanced than the average short sword, but that's not why we remember it. Sting is a notable installation in the fantasy repertoire because it glows blue when goblins and orcs are around. Orcrist wasn't interesting because it gave Thorin Oakenshield a bonus to his attack rolls; it was interesting because the orcs and goblins knew of its might and feared it.

Next seems to be moving back toward the "interesting effects" function of magic items instead of the "numerical bonuses" function; its most recent set of playtest rules limits the highest bonus from a magical weapon (even ones of legendary rarity) to +3. Meanwhile, the rules are full of neat descriptors of who created the item, what it's made from, how it's used, and what its properties and "quirks" might be.

That's exactly the direction D&D should be going: making the math behind magic items both transparent and decoupled from the game's core mechanics. The holy avenger sword is a part of D&D lore because of its epic history with the forces of good, not because of some number it adds to your attack rolls.

Friday, January 11, 2013

As a quick review of Part I, I went over five of the seven criteria I use to evaluate board games. This is a scale with a maximum score of five broken down by the following categories:

Originality (0 - 1 pt) - How exciting is the unique combination of ideas that bring this game off the shelf? Does it stand out from similar games?
Theme (0 - 0.5 pt) - Do the thematic elements blend into the game play? Is the theme fitting? Does it increase my interest in playing or is it a last minute addition?
Pure Fun (0 - 1 pt) - Do I enjoy the game? Is it a go to game when I have the necessary player count? Does the game play move along or does it often move too slow?
"Re-play-ability" (0 - 1 pt) - Do I feel the need to revisit it in order to try new options and strategies? Is it predictable? Do I want to play again immediately?
Strategy to Luck Ratio (0 - 0.5 pt) - Does the game present itself well as far as the impact of strategy vs. luck? Is the amount of "luck" adequate for the game length?

Now the final two categories:

Player Scaling (Up to 0.5 pt)

When I'm looking to acquire a new game, one of the first items I look at is the intended number of players. 2-4 is limiting when you have game nights of five or six people.

There are fun party games that are 4-8+ but they really mean 7+ for an ideal experience. Games that have 2-6 or 3-5 often really need the upper end of the range to put all the player actions or roles into play or the lower end of the spectrum as additional players largely only add time with little to no additional player interaction.

This is a measure of how viable it is to get a game to the table: the larger the "optimal" player range, the better the game scores under this rating. Two player games do get a pass here unless they are functionally broken. There are no bonus points, but I do enjoy it when a game has different pacing, or alternate strategies emerge as more viable, at different player counts. Tongiaki is almost a completely different game when played with two players versus six players.

Parity (Up to 0.5 pt)

This final category is mostly about ensuring there is tension in a game. As I've mentioned in a previous article about final scoring, it is often the final controllable factor of whether the designed experience was enjoyable or not. Scoring shouldn't have a runaway leader problem, nor should it signal to a beginner twenty minutes in that they are in for the long haul with no shot at the podium.

I have no qualms with defeat normally but if the session commonly turns into "I gotta catch Bill as he rolled three sixes to start" for an hour I have an issue.

Games that have an all or nothing achievement, such as conquering the globe in Risk or bankrupting everyone else in Monopoly, can leave players slightly jaded at the end, as one person comes out with a glorious victory while at least several others were more witnesses or victims than participants. This is a negative in my book, as it doesn't leave others with a sense of "what if?" that motivates them the next time, a thought of "I see I could have done this for four more points and that for three more that may have been the difference."

0.5 - Transamerica, In the Year of the Dragon, End of the Triumvirate
0.25 - Kahuna, Jambo, Rosenkonig
0.0 - Risk, Jenga, Monopoly

[Figure: My BGG ratings distribution; axis label: Games Rated]

Now the criticisms are fair; I don't exactly follow the standard BGG.com "recommended rating" method, and on rare occasions I've received messages that respectfully disagree with my personal rating system, but it does have its own argument to make. Let's look at several examples:

BGG Recommended Rating: 10 - "Outstanding. Always want to play and expect this will never change."
My Rating:
This is of course very fitting as one of my few perfect tens. It holds a special position on the shelf.

Power Grid
Originality (0.75/1.0) - Mechanics mesh well, little innovation
Theme (0.5/0.5) - Plays like I'm supplying energy to cities
Pure Fun (0.75/1.0) - Decisions can get bogged down at times
"Re-play-ability" (0.5/1.0) - Game length a hindrance
Strategy/Luck Ratio (0.5/0.5) - Strategy is strongly rewarded
Player Scaling (0.5/0.5) - Neat intricacies between player counts
Parity (0.5/0.5) - Feels close throughout, strong catch-up mechanic
My Rating: Overall 4.0/5.0 = 8 out of 10
BGG Recommended Rating: 8 - "Very Good Game. I like to play. Probably I'll suggest it and will never turn down a game."
While Power Grid is a very good game, I'd certainly turn it down if two good shorter games were an alternative. I enjoy playing it, but there are many games I'd suggest over it, some of which I rate lower than Power Grid but which play much faster.
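The arithmetic behind these totals is simple enough to sketch. The snippet below is an illustration only, assuming the seven category caps listed in this series and quarter-point scoring; the names `MAX_POINTS` and `rate` are hypothetical. The Power Grid scores are the ones given above.

```python
# Category caps from the rubric (they sum to 5.0).
MAX_POINTS = {
    "Originality": 1.0,
    "Theme": 0.5,
    "Pure Fun": 1.0,
    "Re-play-ability": 1.0,
    "Strategy/Luck Ratio": 0.5,
    "Player Scaling": 0.5,
    "Parity": 0.5,
}


def rate(scores: dict[str, float]) -> tuple[float, float]:
    """Return (total out of 5, equivalent rating out of 10)."""
    if set(scores) != set(MAX_POINTS):
        raise ValueError("scores must cover exactly the seven categories")
    for category, score in scores.items():
        if not 0.0 <= score <= MAX_POINTS[category]:
            raise ValueError(f"{category} score out of range: {score}")
        if not (score * 4).is_integer():
            raise ValueError(f"{category} must be scored in quarter points")
    total = sum(scores.values())
    return total, total * 2  # the 5-point scale is doubled for the 10-point scale


POWER_GRID = {
    "Originality": 0.75,
    "Theme": 0.5,
    "Pure Fun": 0.75,
    "Re-play-ability": 0.5,
    "Strategy/Luck Ratio": 0.5,
    "Player Scaling": 0.5,
    "Parity": 0.5,
}

if __name__ == "__main__":
    total, out_of_ten = rate(POWER_GRID)
    print(f"Power Grid: {total}/5.0 = {out_of_ten:g} out of 10")
```

Quarter-point values are exactly representable as floats, so the sum needs no rounding.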

Ticket to Ride
Originality (0.75/1.0) - Love the mechanics, but very simple
Theme (0.5/0.5) - Solid theme
Pure Fun (0.75/1.0) - Very enjoyable game for its length
"Re-play-ability" (0.5/1.0) - Initial intrigue is short lived
Strategy/Luck Ratio (0.5/0.5) - Great strategy/luck mix
Player Scaling (0.25/0.5) - Race with 2 and logjam with 5
Parity (0.25/0.5) - One player can go unchecked too often
My Rating: Overall 3.5/5.0 = 7 out of 10
BGG Recommended Rating: 7 - "Good game, usually willing to play."
I would say my feelings about Ticket to Ride match the BGG recommended rating. It is a good game and I'd usually be willing to play, though it does need some variety periodically to keep it from getting stale.

My Rating: Overall 2.5/5.0 = 5 out of 10
BGG Recommended Rating: 5 - "Average game, slightly boring, take it or leave it."
I probably feel a little more fervor about Blokus than one may think from this rating, but it suffers in getting to the table. With four being its optimal player count, it competes against a strong field of four-player games that have more depth and intrigue. It loses out as an abstract, but overall I probably enjoy this game at about a 5.5 or 6.0 level.

Saint Petersburg
Originality (0.5/1.0) - Ahead of its time, similar games are superior
Theme (0.0/0.5) - I just don't feel it, could have been any theme
Pure Fun (0.25/1.0) - Theme probably carries over here, feels stale
Replayability (0.25/1.0) - Repetitive and cumbersome
Strategy/Luck Ratio (0.25/0.5) - Interesting decisions rarely show
Scalability (0.5/0.5) - Works nicely with 2, 3 & 4 Players
Parity (0.0/0.5) - Arguable, but I see too many blowouts
Overall 1.75/5.0 = 3.5 out of 10
BGG Recommended Rating: Somewhere between:
3 - "Likely won't play this again although could be convinced. Bad." and
4 - "Not so good, it doesn't get me but could be talked into it on occasion."
I fall right between these two descriptions, as it has never been a game I've particularly enjoyed, and if given nearly any other option I'd probably elect it over this. I should clarify that this is not a bad game at all; it is a very functional and sound design for what it is intended to do. It just doesn't captivate my interest.

Conclusion

Imperial
Originality (1.0/1.0) - A true pioneer in many ways
Theme (0.5/0.5) - I'll give it the benefit of the doubt
Pure Fun (0.75/1.0) - The fun value has decreased
"Re-play-ability" (0.75/1.0) - The excitement has faded
Strategy/Luck Ratio (0.5/0.5) - No luck
Player Scaling (0.5/0.5) - An interesting experience with each player count
Parity (0.5/0.5) - With plenty of common property, it's hard to break free
Overall 4.5/5.0 = 9 out of 10
BGG Recommended Rating: 9 - "Excellent game. Always want to play it."
As I mentioned briefly in the intro, I've recently come to terms with a loss of affection for Imperial. It has served its purpose masterfully, but I no longer reach for it at each gaming opportunity. Underneath the investment wargame facade exists an investment game that uses war to determine the market valuations. Imperial isn't enough of an investment game for the time it takes to play it, and the military actions literally go around in a circle and have begun to feel like a chore. I hope to get Imperial back to the table several times this year and re-evaluate, but for now it is trending downward.

Now in hindsight it seems kind of silly to write two articles over lowering a game rating by 10%, but I'm done and I need to go start my taxes now. Thanks for reading.

Thursday, January 10, 2013

In the previous post, I discussed my experience DMing the War of Everlasting Darkness season of D&D Encounters. In this and the next post, I'm delving a bit deeper into the math behind 4E--and why some of the season's grand experiments worked better than others.

War of Everlasting Darkness's "schtick" was that it would play around with some of the "sacred cows" of 4th Edition, especially the concept of the fixed-encounter adventuring day and the contributions of magic items to character statistics. It was a noble experiment, attempting to feature the "exploration" and "interaction" pillars of D&D as heavily as the "combat" pillar, and using magic items more for their flavor than for their mechanical effect, two concepts that feature heavily in D&D Next.

4E (and to a lesser extent, every edition of D&D that has preceded it) is based around the concept of the "adventuring day," the idea that the characters have a certain slate of abilities they can use every day, and once those are exhausted, it's time to go to sleep and get back into it tomorrow. Therefore, DMs face a tight balancing act: too many combat encounters in one day, and the difficulty ramps exponentially with each; too few, and each is trivially easy. Most "adventuring days" in 4E are assumed to have four combat encounters, give or take, and Encounters has dutifully implemented exactly three to five combat encounters per adventuring day in nearly every season until this one.

It turns out that 4E is exquisitely balanced to handle exactly the right number of combats per day: characters have just enough powerful daily-use abilities to make it through the day, but not so many that combat becomes far too easy. When the system works, it works brilliantly: at every level of the game, combat has a flavor of peril and a sense that things could go terribly awry, but rarely does it become overwhelmingly and frustratingly difficult.

War of Everlasting Darkness, though, turned this assumption on its head, to mixed results. Each adventuring day had a number of combat encounters that could happen. But sometimes combat only triggered if the characters acted a certain way or said a certain thing. Sometimes a usually-hostile monster wouldn't be so bent on killing the characters if they'd handled a past situation particularly deftly. And sometimes, three branches of the same tunnel could lead to an easy combat, a hard combat, or no combat at all, essentially at random.

As a result, most adventuring days this season had too few combat encounters compared to the "four per day" heuristic. That typically resulted in one of two outcomes. In some cases, when the players knew they'd only be facing one or two combats of any significance in the day, all the characters could use their most powerful daily-use abilities at once, and combat was over practically before it started. On the other hand, sometimes the module over-compensated for this eventuality by making that once-daily encounter far too difficult.

In 4E, you gain around a 3-4% increase per level to the average d20 roll, not to mention increases in damage and new powers that let you do cool things. An encounter four levels above the characters' level is entirely winnable but on the order of 30% more difficult--since you're less prepared to deal with the monster, and the monster is simultaneously more prepared to deal with you. That translates into a lot of battles won or lost more on luck than on clever strategy or careful planning. Winning because of luck is the least satisfying way to win, and losing because of it is the most frustrating way to lose. 4E in its "sweet spot" is very good at avoiding those outcomes. It's clear that War of Everlasting Darkness saw 4E operate quite far from that comfort zone.

In other words, if your game is a heavily structured one, based on definite combats that will absolutely happen, then 4E is a fantastic rule set for your game. For other situations, it works less well. What if the characters are on a long journey and the DM wants them to have one single combat encounter in a given day, fighting some trolls in a roadside ambush? What if the characters are in some tight spots but opt to talk through them or run away rather than fight it out? Or what if the characters have a bellicose streak and create combat where you didn't plan on one existing?

4th Edition is much more poorly suited to running those sorts of games. And there's a lot to be said for those sorts of games: they appeal to players who are interested in non-combat situations. They give characters who are built around something other than fighting a chance to shine. Most importantly, the game feels a lot more organic when the characters are working things out their own way rather than being conducted along the DM Railroad.

Being witness to this great experiment has led me to appreciate something unique about roleplaying games compared to other categories of games. When you alter the math in a board game, you're altering the mechanics, and that's it. If you make the starting cost of coal in Power Grid 2 Elektro rather than 1, all you've done is make coal power plants a comparatively worse early-game strategy. You haven't changed the narrative tone of the game.

In D&D and in other roleplaying games, though, when you alter the math, you've altered not only the mechanics but also the narrative tone. In making 4E so explicitly structured around the correct number of combat encounters per adventuring day, 4E became very much about combat encounters. That, in turn, encourages players to create characters best served to engage in combat encounters. A version of D&D less carefully balanced on the precipice of the perfect adventuring day would result in not only the (mechanical) change in how the adventuring day played out but also the (narrative) change in what the experience of playing 4E is like.

This isn't a knock on 4th Edition. To its credit, its balance really is very good, as long as its strictures about including the correct number and difficulty of combats per adventuring day are obeyed. But because in a roleplaying game, the math is intrinsically tied to both the mechanics and the narrative, 4th Edition is simply more capable of handling certain styles of gaming than others. In the third and final post in the series, we'll look at another D&D concept, the math behind magic items and "inherent bonuses," and how it might affect the look and feel of D&D Next.

Last December, D&D Encounters concluded probably its most ambitious experiment yet: an arc of three seasons that told different sides of the same (very drow-y) story. The first, Web of the Spider Queen, started the story with the PCs as crusading anti-drow heroes; the second, Council of Spiders, turned the tables and had the PCs play as a bunch of villainous drow. The third and most recent (and most ridiculously named), War of Everlasting Darkness, wrapped things up by putting the PCs back on the anti-drow side, though the drow were mostly a background, behind-the-scenes threat for a sizable chunk of the season.

This seven-month-long drow arc saw quite a few D&D firsts for me: the first time I've played the game with my girlfriend (!), the first time in my twelve-year history with the game that I've actually played as a drow character, and the first time I've DMed an extended 4th Edition campaign. I was an early supporter of 4th Edition, believing that it was just as fun as the game's previous incarnations (albeit in a different way), but the constant Encounters grind of "make a 1st-level character, advance to 3rd, and do it all again" started to get old. When our usual Encounters DM, Matt, offered to let me DM this season, I jumped at the chance, mostly to get to do something different for a change.

The Season

Structured entirely differently from previous Encounters seasons, this one had us play eight sessions, with each character gaining a level after every one. That meant we actually got to explore levels 4-8, unprecedented in Encounters-land. The leveling schedule was clearly accelerated, with all the experience-point math hand-waved, but neither players nor DM seemed to mind too terribly much. It was yet another piece of evidence in favor of a DMing strategy I prefer: keep the XP behind the scenes, and tell players when they've gained a level. No player likes being 50 XP short of a new level, and no DM wants to have to design his game around that possibility.

I found the game exceedingly easy to run, though that's not saying much since the War of Everlasting Darkness module told me what to do at every turn. To keep the game dynamic, I did go "off-module" more than most DMs might have: if the characters had a clever solution to an ostensible combat encounter that didn't involve fighting, I let them pursue it. If they had a creative skill to use in a skill challenge that wasn't explicitly enumerated in the module, I let them use it.

My table composition changed more drastically from week to week than I might have liked: one week, we had twelve people show up with all the combat roles at least doubly covered, and we had to split into two tables. The next week, without warning, we were down to three (naturally, two of whom were playing strikers). But lack of continuity is a hallmark of Encounters, often leading to more humorous self-effacing comments (looks like Marshall's character was too busy drinking with his new dwarf buddies to join us this time!) than actual confusion. And in a season like this one, where each session happens in a different place and time, it's even more plausible that a certain character might be missing.

All in all, I had a fantastic time DMing. Of course, the situations I created and the story I told were only part of it. A far bigger reward was knowing my players had a good time. Comments like "we quit our other weekly game because Encounters here has much better DMs" and "I really appreciate all the extra work you've put into this game beyond just what's in the book" make me know I've done a good job.

Wednesday, January 9, 2013

Last week, I posed the question "if I have twelve cards numbered 1-12, draw three per day without replacement, record their numbers, and then replace them and shuffle, how many days will it take before I should expect to see all twelve?" I was interested in seeing either a simulated or analytical result, and I got convincing ones of each.

Analytical solution

My friend Angelo is an astrophysics/applied math double major, so it's pretty safe to say he's better at math than I am. Here's his analytical solution to the problem, slightly limited by the "three cards per day" restriction.

$$P(x) = \frac{\#\{\text{strings of length } x \text{ that contain each digit between 1 and 12 at least once}\}}{\#\{\text{all possible strings of length } x\}} = 1 - \frac{\#\{\text{strings of length } x \text{ missing one or more digits}\}}{12^{x}}$$

Actually, as he points out, this yields the number of cards you'd need to look at, not the number of days--but that's easy enough to fix by dividing by three. Here's the plot of the analytical cdf:

which appears to cross P(x) = 0.5 around card number 35, or roughly 11 or 12 days. Here's his analytical pdf along with a simulation (n = 10000):

They're a little off, and the simulation isn't quite smooth at n = 10000, but the agreement is reasonably good. For the "magic number" of P = 0.5, Angelo's analytical solution gives 11.7 days, and the simulated solution gives 12.7. Here's his full write-up.
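For readers who want to reproduce the analytical curve, the count of "strings missing one or more digits" can be expanded by inclusion-exclusion. The snippet below is a sketch of that standard expansion, not Angelo's own code; `p_all_seen` and `cards_needed` are hypothetical names. Like the formula above, it treats the cards as one long string drawn with replacement, so it ignores the fact that the three cards seen on a single day are always distinct.

```python
from math import comb

N_CARDS = 12


def p_all_seen(x: int, n: int = N_CARDS) -> float:
    """P(a random length-x string over n digits contains every digit).

    Inclusion-exclusion over the set of digits that are missing:
    P(x) = sum_k (-1)^k * C(n, k) * ((n - k) / n)^x
    """
    return sum((-1) ** k * comb(n, k) * ((n - k) / n) ** x for k in range(n + 1))


def cards_needed(threshold: float = 0.5, n: int = N_CARDS) -> int:
    """Smallest number of cards x with P(x) >= threshold."""
    x = n  # fewer than n cards can never show all n digits
    while p_all_seen(x, n) < threshold:
        x += 1
    return x


if __name__ == "__main__":
    x = cards_needed(0.5)
    print(f"P(x) first reaches 0.5 at x = {x} cards, about {x / 3:.1f} days")
```

Run as written, the crossing should land close to the card-35 reading taken from the plot above.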

More simulations

I did a simulation of my own with the following logic: start with an array of twelve zeros, pick three of those zeros at random to become ones, and repeat the process until all the zeros are ones. Then do the whole thing over again, 50000 times. My simulation was done in MATLAB. Here are the distributions I came up with:

I get basically the same shape as Angelo's graphs, though the extra 40000 simulations help make the pdf a lot smoother. It's most likely to take 11 or 12 days, and the mean value of the pdf is 12.7, just like with Angelo's simulation. Strangely, my cdf crosses P(x) = 0.5 at x = 11.7 days, exactly one day too soon, so it's likely I messed up the integration of the pdf.
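The original MATLAB code isn't reproduced here, so the following is a Python sketch of the same idea rather than a port. It assumes one reading of the logic above: each day, three distinct positions are chosen from all twelve (whether or not they are already ones) and set to one. The names are hypothetical. It also fixes a counting convention: the day of the first draw is day 1. A counter that is bumped once more after the final draw would report every result exactly one day later, which is worth ruling out whenever two curves disagree by a single day.

```python
import random
from statistics import mean


def days_to_see_all(rng: random.Random, n_cards: int = 12, per_day: int = 3) -> int:
    """Simulate one run; return the day on which the last unseen card appears."""
    seen = [0] * n_cards
    days = 0
    while not all(seen):
        days += 1  # day 1 is the day of the first draw
        # Three distinct cards per day, drawn without replacement.
        for i in rng.sample(range(n_cards), per_day):
            seen[i] = 1
    return days


def simulate(trials: int = 50_000, seed: int = 0) -> list[int]:
    """Repeat the experiment `trials` times with a reproducible seed."""
    rng = random.Random(seed)
    return [days_to_see_all(rng) for _ in range(trials)]


if __name__ == "__main__":
    results = simulate()
    print("mean days:", round(mean(results), 2))
    print("fastest run:", min(results), "days; slowest run:", max(results), "days")
```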

I got a nice verification of my simulation from another friend, Forrest, who actually does numerical simulations for a living, so there's a good chance his is far more rigorous than mine. I don't have his simulation code, but after one million simulations, his cdf looks similar enough to mine (note the change in scale on the x-axis):

Forrest concludes that the "magic number" is somewhere between 11 and 12, putting his best estimate at 11.3.

Conclusion

The analytical solution and all three simulations put the P = 0.5 threshold somewhere between 11 and 13 days, and the mode or most likely occurrence of both pdfs is essentially a tossup between days 11 and 12. If we're looking for a single numerical answer to the question, I'm putting it at day 12: you're most likely to have seen all twelve cards on day 12. (Follow-up question: is it a coincidence that the number of the "critical" day is also the number of cards in the stack?)

Other interesting facts: it is in fact possible to be done on day 4, but it appears to be exceedingly rare. Only once in 50000 trials (or 0.002% of the time) did my simulation see all twelve cards in only four days. It's equally rare to have to spend more than 50 days: one simulation in 50000 (again, 0.002%) took 57 days to complete, but no other simulations took longer than 48 days.
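The three-cards-per-day restriction can also be handled exactly, which gives a check on tail frequencies like these. This is a sketch under the assumption that each day's hand is equally likely to be any of the C(12, 3) = 220 possible three-card hands; the function names are hypothetical, and day 1 is again the day of the first draw. A given set of k cards stays unseen for a day with probability C(12 - k, 3) / 220, and inclusion-exclusion over the unseen set does the rest.

```python
from math import comb


def p_done_by(day: int, n_cards: int = 12, per_day: int = 3) -> float:
    """Exact P(all cards have been seen by the end of `day`)."""
    hands = comb(n_cards, per_day)
    return sum(
        (-1) ** k * comb(n_cards, k) * (comb(n_cards - k, per_day) / hands) ** day
        for k in range(n_cards + 1)
    )


def first_day_reaching(threshold: float = 0.5) -> int:
    """First day on which the cdf is at least `threshold`."""
    day = 0
    while p_done_by(day) < threshold:
        day += 1
    return day


def expected_days(horizon: int = 400) -> float:
    """Mean finishing day, as the sum of P(not done by day t) for t = 0, 1, 2, ...

    The tail shrinks geometrically, so truncating at `horizon` loses
    a negligible amount.
    """
    return sum(1.0 - p_done_by(t) for t in range(horizon))


if __name__ == "__main__":
    print(f"P(done by day 4) = {p_done_by(4):.6%}")
    print("first day with cdf >= 0.5:", first_day_reaching(0.5))
    print(f"expected days: {expected_days():.3f}")
```

If a simulated frequency differs noticeably from these exact values, the day-counting convention and the way each day's three cards are chosen are the first things worth checking.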

Thanks to everyone who helped solve this problem! I'm both pleased and amazed that I found such an abstract, academic problem that is nevertheless relevant to real-world game design.

Monday, January 7, 2013

It finally happened: I've admitted I lost some love for Imperial. Up until now it had been one of four titles I had given a perfect 10/10 out of the 140+ I have rated on BGG.

During my latest re-evaluation I realized I've gone 6 months without giving a thorough explanation of games I enjoy. How can I expect others to follow my thoughts if I keep my game evaluations to myself?

Now before I get into my rating scale (which may elicit plenty of fair criticism), let me explain my criteria for deciding whether to rate a game at all.

1. I play every game at least 3 times before rating it.

2. I decline the opportunity to rate a game if I still don't feel I have a grasp of the depth of the game, no matter how many times I've played it. A great example is Go, for which I have played maybe six full games and always feel I'm probably unqualified to play.

3. I do not rate games that are ambiguous to grade. I'd give the concept of Poker at least a 9.5/10, but there are so many variations that I could never settle on a rating for the BGG page "Poker". I also play Warhammer 40k, but that is a game for which the experience is almost entirely determined by opponent, terrain, painting/modelling, etc., which is certainly difficult to quantify with a single number.

4. I do rate games for which I have only played a virtual implementation. I only do so if the implementation demonstrated great quality and captured the "spirit" of the game.

5. I do not rate card games played with an ordinary deck of cards. They are almost endless and I try to reserve my ratings for the evaluation of the hard work and creativity that went into the intellectual property on the market. This may sound a bit sanctimonious but I promise I don't intend for it to be.

Now my rubric for rating for the past 6 years has been comprised of four main measures: Originality, Fun Factor, Replay Value & Game Composition. It is measured out of a maximum of 5 and each category is scored in quarter-point increments.

Originality (Up to 1.0 pt)

Measuring originality has become more subjective for me over time, but in theory it is how unique is the blend of mechanics and how the game differentiates itself from other games on the market. I try to give credit to games who pioneered new ideas, but inevitably over time the original titles get dinged about a quarter point if something comes along that re-implements an idea significantly better.

Theme (Up to 0.5 pt)
Theme is a bit easier to judge; a game usually either has a fitting theme or it doesn't. I do give partial credit for trying, and so even if a theme isn't ideal I will give a quarter point if it improves enjoyment beyond just an abstract.

A simple number that takes into account how much I enjoy the game play and how frequently I want to play the game when opportunities arise. There is probably a correlation here with overall score, as naturally I want to rate games higher based on how much I enjoy them. Some of this rating has to do with how much fun it is for the designed playing duration. Games do have a tendency to sag in this category over time, but the time-tested games seem to stay put.

"Re-play-ability" (Up to 1.0 pt)
I don't believe this is a word, but it does have expressive value until we find a suitable replacement. It's hard to imagine a game being fun but not "re-playable" in the sense that it isn't worth playing over and over. What I am looking for is whether there are multiple strategies worth coming back to try, and whether there is enough player-driven chaos to warrant a different game experience each time. Modular boards and card variety earn points, as do optional inclusions and alternate rule sets.

A perfect score is in sight if a game has a small learning curve and interesting strategies that the player must adapt to as the game goes on. The game should be unsolvable and dependent on the decisions of the players you are gaming with. Similar to the Pure Fun rating, a shorter, deeper game has an advantage over a longer game with the same depth.

Strategy to Luck Ratio (Up to 0.5 pt)
The purpose of this category is to help value how well a game measures up to its intended quantity of luck and impact of strategic decisions. I'm certainly not alone in my disdain for three hour games whose outcome is largely impacted by luck. My intention is to differentiate games whose style seems to indicate they are comprised of significant strategy when they are actually closer to a lottery drawing each time they are played.

Perhaps an example may help. In the game Simply Suspects, each player gains a secret identity and through the course of the game moves evidence in order to place the blame on other players. Frequently a grand jury is triggered, in which identities that have accumulated enough evidence are eliminated. The game often turns into everyone throwing evidence onto identity #1 until it is eliminated, then rinsing and repeating with identity #2. If you are unlucky enough to draw identity #1, you can move evidence off of your identity onto another, but it is often futile: you will usually be outnumbered, and you will appear suspicious.

Now this is hardly a deal-breaker for a game with a length of 15 to 20 minutes, but it does demonstrate how sometimes games can appear to be vastly more strategic than how they often play out. It's not a negative for a game like Yahtzee to be very luck driven, as the initial impression is congruent with that idea. Games get the full points in this category by default unless they prove otherwise by not giving players adequate ability to mitigate luck.

I've broken this post into two parts as it has become lengthier than expected. In Part II I will discuss the final two criteria in my process for evaluating board games and delve into some criticisms of this particular grading scale. In the meantime, what do you think? Can games be graded based on preset guidelines or is the purpose lost when many judgments are subjectively swayed anyway?

Friday, January 4, 2013

In my discussion of Kingdom Builder yesterday, I mentioned that only after a few months of playing the game did we finally draw all twelve of the scoring objective cards, and that seemed unusual. That got me thinking: how unusual is it really? How many games of Kingdom Builder should I expect to have played before I encountered all twelve at random? Here's the same problem, rephrased to make it more general:

I have twelve cards, each marked with a different integer 1 through 12. The cards are identical except for the markings. Every day, I draw three cards at random, without replacement. I record the number on each card then replace the cards into the deck. Then, I shuffle the deck. The next day, I repeat the same steps: draw without replacement, record the numbers, replace, and shuffle. We'll say for the sake of argument that I shuffle thoroughly enough to make the deck truly random, and that today's picks are independent of what yesterday's were.

How long should I expect to keep drawing cards before I've drawn all twelve? A couple of attributes of the probability distribution are intuitively obvious. The minimum number of days would be four: if you drew three different cards for each of four different days, you would get to twelve. But that has to be a pretty unlikely event; by day three, you're more likely than not to repeat at least one number. At the other end, the distribution should trail toward infinity: after an infinite number of days, you're guaranteed to have drawn all twelve, but the "guarantee" only comes at infinity. After a hundred days, or a thousand, it's less and less likely but still possible that there's one recalcitrant card that hasn't shown up.

I'd be interested in seeing the entire probability distribution, either the pdf (differentiated form) or cdf (integrated form). In particular, I want to know where the cdf crosses 0.5--that is, at what day it becomes more likely than not that I have seen all twelve. It would be an easy enough system to model, but I'm also interested in an analytical solution if it exists.
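One standard way to attack the analytical side is inclusion-exclusion. Under the independence assumption above, the chance that a particular set of k cards all stay hidden for n days is (C(12-k, 3) / C(12, 3))^n, so the chance of having seen all twelve by day n is the alternating sum of C(12, k) times that quantity over k = 0 through 12. The Python below is only a sketch of that formula, not necessarily the derivation anyone else would use; the function names (`cdf`, `pmf`, `first_day_at_least`) are made up for illustration.

```python
from fractions import Fraction
from math import comb


def cdf(days, n_cards=12, draw=3):
    """Exact P(all n_cards have appeared within `days` days)."""
    if days <= 0:
        return Fraction(0)
    total = comb(n_cards, draw)       # number of possible daily draws
    result = Fraction(0)
    for k in range(n_cards + 1):
        # chance that one particular set of k cards is missed on a given day
        miss_one_day = Fraction(comb(n_cards - k, draw), total)
        result += (-1) ** k * comb(n_cards, k) * miss_one_day ** days
    return result


def pmf(days, **kwargs):
    """Exact P(the last unseen card first appears on exactly `days`)."""
    return cdf(days, **kwargs) - cdf(days - 1, **kwargs)


def first_day_at_least(p, **kwargs):
    """Smallest whole day on which cdf(day) >= p."""
    day = 0
    while cdf(day, **kwargs) < p:
        day += 1
    return day
```

Here `first_day_at_least(Fraction(1, 2))` returns the first day on which seeing all twelve is more likely than not, and `float(pmf(4))` is the chance of finishing in the minimum four days. Using `Fraction` keeps the alternating sum exact, which matters because the individual terms are much larger than the final answer.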

Thursday, January 3, 2013

When we published our review and discussion of Kingdom Builder, one of our readers gave a great suggestion: in addition to our first-impression style review, it might be interesting to go back and write a more detailed review later on, incorporating more discussion of strategy and how our opinions of the game have changed over time. Discussing strategy and particular optimization decisions is coming; it's on my list of resolutions for the new year. But the idea of revisiting old games to see how well they've held up or how our approach to them has changed is especially intriguing, since a particular game could lose its luster after playing it only a few times, or the strategic implications of a game might only become clear after running through it once or twice.

Kingdom Builder

Kingdom Builder has been a frequent installation in our all-too-seldom game nights thanks to its straightforward mechanics and reasonably fast play. The analysis paralysis I mentioned in the initial review has abated a bit as we've gotten more familiar with the rules, which has made Kingdom Builder quicker and more fun. We're still getting new combinations of map tiles and scoring objectives--in fact, in our most recent game, we drew the "Citizens" objective for the first time ever--so the game continues to feel fresh. Most importantly, we haven't played with the disastrous combination of Paddock and Hermits again, but we have played with both Paddock and Hermits alone, and the game worked just fine.

Verdict: Kingdom Builder is still good. Now that the rules are more familiar to us, and now that we've identified the rare combinations of mechanics that break the game, it's actually getting better. Almost as importantly, Kingdom Builder is short enough that we can play it multiple times in the same night and engaging enough that we actually want to.

Small World

My co-author Alex first introduced me to this fantasy-themed territory-control game, and I played it again for the first time in a couple of years when we got together over the Christmas holidays. Like Kingdom Builder, Small World features enough mechanical combinations that each game feels different from the last, which is emerging as a pattern for making games continue to shine far past their purchase date. Another similarity to Kingdom Builder is that some of these combinations are far more powerful than others, but such game-breaking scenarios seem rare.

Verdict: I still enjoy Small World also, but it's clear to me that it would require several, perhaps dozens of, play-throughs before you could develop anything approaching an optimal strategy. That, combined with Small World's inherent low variance, could make this another game that's worth coming back to. I'd love to try Small World in its iPad incarnation; like Ticket to Ride, this game has to be more fun when the computer takes care of the constant but unexciting numerical computation for you.

Tongiaki

Another one I first played with Alex, Tongiaki is a constant fixture in our game days due to its low-pressure environment and hilarious, constantly shifting alliances. We've played this one enough that we have developed some ostensibly optimal strategies: some of us like the "build up your forces on a single island" approach while others go with "explore outward as much as possible," and above everything else, "keep 5-point islands to yourself at all costs." Of course, like finches in the Galapagos with no external pressure, we've converged onto the same strategy without knowing how anybody else does it, and it would be fascinating to play Tongiaki with a new player just to see what new strategies emerge.

Verdict: we're not playing Tongiaki for the deeply complex tactical experience; as far as I can tell, we've exhausted all the possible strategies at this point, and we know that winning it is as much a function of variance as of good gameplay. Instead, the reason we keep coming back to it is that it's fun to make your friends' boats sink, and it's a refreshing "palate-cleanser" between more demanding games.

Blokus

Blokus is more of a visual puzzle game than a strategy board game. I've played it with a few different groups, most recently with my parents. It's worth playing mostly because it tests a completely different set of skills than the intensely mathematical strategy contained in most of the Euro games in our arsenal. The real strategy behind the small 1x1 square piece is starting to become clear (use it to escape jams in enemy territory!). Also becoming clear is the game's biggest limitation: despite the rules workarounds for two- and three-player games, it really does need four players to work correctly.

Verdict: Blokus is a game that is probably going to appeal to an entirely different set of people than the "heavy" Euro games will, and it's a bit unusual in that it doesn't neatly fit into the strictures of "Euro" or "American die-roller" or "party". Play this one if you're into visual-spatial puzzlers. Don't play it if you don't have exactly four people.

Despite our league's earlier tweaks to the BQBL draft and weekly matchup structure, we used the Grantland standard scoring rules, mostly because Grantland's weekly posts tabulate the score for us--plus, none of us wanted to go through the exercise of coming up with an alternate scoring system, and we wouldn't know how to "fix" it even if the balance were off. Throughout this season, though, a few imbalances in the scores became especially clear.

1. Too much difference between "interception" and "interception return for TD"

Maybe a receiver runs a wrong route, maybe he just gets beat, or maybe he catches a bad break, but there's some element of bad quarterback play in just about every interception. But whether a pick turns into a pick-six isn't just on the quarterback. It's a function of very many things including field position, proximity of other players, and speed/size/ball-handling skill of the intercepting defender, not to mention a bit of luck.

In other words, the quarterback has basically nothing to do with whether an interception is returned for a touchdown or for no gain whatsoever. It's not as if the egregiously terrible throws result in defensive touchdowns while the merely bad throws result in simple interceptions.

This is the Bad Quarterback League, though, and we do want to reward miscues, poor decisions, and errors by the QB that hurt the team. There's no question that a pick-six hurts a quarterback's team more than a regular-flavor pick--and there's no question that it's more fun to watch Brady Quinn put points on the board for the other guys than for Kansas City--so some premium for "interception return for TD" over "interception" makes sense. But the standard scoring for an INT-TD, 25 points, is much higher than the score for an INT, 5 points. That seems like too much of a premium for something largely out of a quarterback's control.

2. Too much difference between "interception return for TD" and "fumble return for TD"

An interception and a fumble are functionally the same: both are rooted at least in part in poor decision-making or bad ball control, and both result in your team not having the ball anymore. That's bad, and it results in 5 BQBL points in both cases. An INT-TD and fumble returned for a touchdown are functionally the same, too: both cost your team six points.
In standard BQBL scoring, though, the INT-TD is worth 25 points, while the fumble-TD is worth a mere 10. They're equally harmful to your team, and they result from equally bad quarterbacking, so it only makes sense that they should have the same BQBL value too.

3. Sometimes you earn more points for fumbling than for keeping the ball

Mark Sanchez is backed up against his own end zone. He drops back, looking for some receiver downfield but doesn't find one. Eventually, he hears the foreboding footsteps of a 280-pound defensive lineman bearing down on him, and he knows he has no play. Sanchez takes the sack, goes down in the end zone, and gives up two points. He earns your BQBL team 20 points!
In another version of this sad tale, Sanchez tries to make a play, pitching out to Shonn Greene, but dropping the ball on the ground instead of completing the lateral. The astute defensive lineman falls on the fumble, and Sanchez has now given up six points. But he only earns your BQBL team 10.

Like we've just talked about, a fumble returned for a touchdown is worth 10 points in BQBL, regardless of how long that return happens to be. Getting sacked for a safety in the end zone is worth 20. That's an especially bad quarterback play, because a quarterback on top of his game should always throw the ball away rather than take the sack and the safety. But here's another case where the BQBL score doesn't necessarily accurately reflect just how badly a quarterback has been playing. Giving away six points is always worse than giving away two, so a lost fumble in the end zone should always be worth more than getting sacked for a safety.

4. Completion percentage doesn't always tell the whole story

There's a nice little BQBL windfall (5 points) for completing less than half of your passes, and even more points for completion percentages under 40 (15 points) or 30 (25 points). And clearly, there needs to be some cutoff if the BQBL is going to award points for poor completion percentage. But is a quarterback who completes 18 of 35 passes really having a better day than the one who completes 17 of 35? How about the one who completes 4 of 10?

Consider Sam Bradford's Week-5 brilliance, in which he completed 7 of 21 passes. Seven. On twenty-one attempts. Bradford did get the rare sub-40-percent bonus, but so would a quarterback who completed 17 of 43, which is awful but somehow doesn't seem as bad. The 17-for-43 guy at least tried the whole game. Mr. 7-for-21 might not have.
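To make the cutoff problem concrete, here is a small Python sketch of the completion-percentage bonus using the point values quoted above. It assumes the tiers are exclusive (a quarterback gets only the highest tier he qualifies for) and that "under" means strictly under; the function name `completion_bonus` is made up for illustration and is not part of any official scoring tool.

```python
def completion_bonus(completions, attempts):
    """BQBL points for poor completion percentage (highest tier only)."""
    if attempts == 0:
        return 0
    # integer comparisons avoid floating-point trouble right at a cutoff
    if completions * 10 < attempts * 3:   # under 30 percent
        return 25
    if completions * 10 < attempts * 4:   # under 40 percent
        return 15
    if completions * 2 < attempts:        # under 50 percent
        return 5
    return 0


# the stat lines discussed in the text
examples = [(18, 35), (17, 35), (4, 10), (7, 21), (17, 43)]
for made, tried in examples:
    pct = 100 * made / tried
    print(f"{made}-for-{tried}: {pct:.1f}% -> {completion_bonus(made, tried)} points")
```

Under those assumptions a single completion separates 18-for-35 (no bonus) from 17-for-35 (5 points), while 7-for-21 and 17-for-43 land in the same 15-point tier even though the second quarterback threw twice as many passes.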

5. "Benched" needs a more consistent definition

Sometimes, "benching" is obvious: John Skelton's 4 interceptions and 18.2 quarterback rating earn Ryan Lindley a ticket into the game. That's worth a cool 35 BQBL points. The Seahawks' 35-point lead in the same game earns Russell Wilson a trip to the bench and Matt Flynn some playing time; that's worth zero points.

It's not always as clear, though. Aaron Rodgers' 219 yards and 81.9 rating were probably among the worst in his career, but pulling him with 5 minutes to go in the game had as much to do with the Packers' offensive futility and not wanting him to get injured as with his mediocre play. Blaine Gabbert's 7-for-19 and a whopping 53 yards was a truly miserable performance, but his giving way to Chad Henne in the 4th quarter had at least as much to do with a legitimate injury as with his stats.

Most decisions in the BQBL are obvious because they're quantitative, but this one is a little subjective, so awarding it is a tougher call. For a player to earn "benched" points, he must have been pulled because of sufficiently poor play. That requires a judgment call, and it's important that the judgment be applied in the same way in every case.

Recommendations for future BQBL seasons

Reduce INT-TD scoring to 10 points. This brings it in line with fumble-TD and reduces a scoring discrepancy based largely on factors out of the QB's control.

Keep "sacked in the end zone for a safety" scoring at 20 points, but add the condition that if a quarterback fumbles as he's sacked, he gets the full 20 points anyway. This eliminates the QB "incentive" for losing the football.

Add a scoring mechanism for completing a sufficiently small number of passes, perhaps "fewer than 10 completions: 10 points".

Retain the 35-point score for benching, but don't award it in cases of injury or when the benched QB's team has no chance of winning the game anyway. Maybe "benched before the 4th quarter" or something similar.
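As a rough illustration of the first recommendation, the Python sketch below puts the point values quoted in this post into a table and compares standard scoring with the proposed INT-TD change. It assumes each event is scored as its own line item (a pick-six counts as one 25-point event, not 5 plus 25), and the dictionary keys and the `score` helper are names invented here, not anything from Grantland.

```python
# point values as quoted in this post
STANDARD = {
    "interception": 5,
    "lost_fumble": 5,
    "interception_td": 25,
    "fumble_td": 10,
    "safety_sack": 20,
    "benched": 35,
}

# proposed change: bring INT-TD in line with fumble-TD
PROPOSED = {**STANDARD, "interception_td": 10}


def score(events, table):
    """Total BQBL points for a list of event names under a scoring table."""
    return sum(table[event] for event in events)


bad_day = ["interception", "interception_td", "lost_fumble"]
print("standard:", score(bad_day, STANDARD))
print("proposed:", score(bad_day, PROPOSED))
```

The other recommendations (the sack-fumble condition, a low-completions bonus, and a stricter benching rule) depend on game context and judgment calls, so they are left out of this table.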

Did anyone else out there run a BQBL this year, with standard or modified rules? Any other recommendations or improvements you've identified?

Wednesday, January 2, 2013

I've got a bit of an annual routine during the holidays: I like to clean things up in preparation for the upcoming year.

I've been revising my collection over the past year, looking to sell or trade away a few of the games I feel less and less compelled to bring out and looking to eliminate some redundancy in the collection. Of course this is all short-lived, and I've been spending some of the last few months already looking for suitable replacements.

Theme is often a make or break aspect of a new game on the market for me. I pass over Mediterranean merchants, Middle Ages worker placements, Abstract construction titles and Nineteenth Century Train games all the while knowing I'm missing out on some great gaming, but unable to persuade myself and others to give them a try. Let's take a look at a few themes in particular that have caught my eye recently.

The Great Fire of London
Synopsis:
Fresh off a successful reprint Kickstarter campaign, The Great Fire of London appears to have gained a whole new audience just in time for 2013. Players control buildings and battle to keep the flames from swallowing their property while innocently steering the fire toward their opponents.

Why do I LOVE this theme?
Fire is a natural villain and while it isn't quite an untapped theme, it does allow for interesting options with thematic mechanics. I particularly enjoy the historical element of this as it blends history with the feelings of tragedy as your stomach drops when your key stronghold turns into ashes.

Functioning as low-level criminals, players slowly increase their notoriety by taking down increasingly tougher challenges alone until they reach a point at which they have to trust one another in order to succeed.

Why do I LIKE this theme?

While it also sounds like a movie theme, I love games that have some negotiation element to them but that aren't entirely dedicated to it. What better way to evoke charismatic negotiations among backstabbing players than with a guild of thieves?

Salmon Run
Synopsis:
Due out in 2013, Salmon Run is a race upstream for players as they play special abilities and avoid obstacles on the way to their breeding grounds.

Why do I LOVE this theme?
Why not? I like a good racing game every now and then and at least this one has a great purpose. I think modular boards are always a neat aspect for replay value and it looks like a neat gateway game.

Players fight an epidemic by adding on to their hospital in order to keep up with the incoming patients. Players bid on hospital add-ons in an attempt to best

Why do I LIKE this theme?

The idea of improving infrastructure of a hospital in order to keep up with demand is appealing to me. As a tile laying game it could be great or fall flat with me but the theme definitely would get me in the door for a play to see how infectious the game really is.

Players juggle the role of an entrepreneur by jumping into a start up company and attempting to hire the best personnel from opponents. Players hire engineers to improve the product, salespeople to sell the product and executives to improve productivity.

Why do I LOVE this theme?

As an aspiring entrepreneur I really enjoy the idea of pursuing a scarce pool of talent and battling with opponents to keep them on board. As employees stay, their costs increase as they become vested in the company. I think it's a great business concept to develop and I hope the gameplay lives up to it.

Happy 2013 from Ludi Berkeley! To ring in the new year, and to celebrate the start of Ludi Berkeley's first full year, here are thirteen gaming resolutions I'll try to follow in 2013.

1. Host and attend more game nights. I'm in an engineering graduate program in the San Francisco Bay area. It's likely that at no time in my life will I be in such a board-game friendly situation, and I've done a less good job taking advantage of that in 2012 than I did in 2011. There is no reason that my apartment shouldn't be packed to the walls with board gaming at least once per month.

2. Expand my collection. It's always the goal of any board gamer--or video gamer, movie buff, or avid reader--but in 2012, the only new additions to my collection were Pandemic: On the Brink and Kingdom Builder. The grad student stipend isn't one of fantastic wealth, but neither is it a shackle to poverty, and acquiring three or four new games in the new year seems reasonable. I already have my eye on Seasons, which is beautifully designed, has similar mechanics to 7 Wonders and Ascension, and has been recognized as the best new game of 2012.

3. Play the Spiel des Jahres winners and nominees. The Spiel des Jahres overall winner (Kingdom Builder in 2012, Dominion in 2009, and so on) and/or the Kennerspiel des Jahres winner ("Enthusiast Game of the Year"; 7 Wonders in 2011, Agricola in 2008, and so on) tends to be very, very good and a lasting fixture in board gaming years after its award. I'm making a point of getting in on the ground floor with both the Spiel des Jahres and the Kennerspiel des Jahres, and the more of the nominees I can also play, the better.

4. Tackle more of the BoardGameGeek top 20. Despite the changing tastes of gamers and the weekly entries of new games, the top 20 games on BoardGameGeek remain surprisingly constant from year to year. I've played Agricola, Power Grid, Dominion (and its expansion Intrigue), 7 Wonders, and Race for the Galaxy, and I was able to add Puerto Rico this year. Some of the games on there are expensive (Eclipse at $99 MSRP) or don't look like my cup of tea (Twilight Struggle, a two-player war game). But they're all at the top of their game (so to speak) for a reason. At the top of my list among these games are Le Havre, Caylus, and Ora et Labora.

5. Be less scared of very long games. Agricola isn't my favorite board game out there, but that has as much to do with its length as its unforgiving mechanics. Power Grid is without a doubt one of my five favorite board games. I end up enjoying every minute of it every time I play, but all too often I veto it because it takes too long. And I adore the immersive aesthetic and beautiful artwork in La Citta, but all too often I let it get vetoed because it takes too long. It's time to dust off these evening-killers and accept that we're only going to get one game in.

6. Get around to playing video games I purchased in 2012 (or longer ago). I'm not as huge a video gamer as I was in middle or high school, but I own a Playstation 3 and a PC well-suited to playing video games, plus plenty of games to play on both. I started but didn't finish Skyrim's Dawnguard expansion (and haven't scratched Dragonborn yet). I loved Mass Effect 1 and 2 but haven't even installed 3 yet despite having pre-ordered it back in the spring. And that used copy of Bioshock I picked up along with my PS3 has been sitting patiently on the shelf for more than two years now.

7. Delve deeper into low-leverage indie gaming. Heartfelt storytelling, innovative gameplay, fantastic aesthetics, and a focus on experience rather than difficulty are a set of ideals very much in vogue in independent game design these days--and a set of ideals that very much appeals to me. I've been unfortunately out of the loop playing this sort of game for the past several years, and I want to get back into it. In particular, I'm interested in getting hooked on a point-and-click adventure game, King's Quest or Monkey Island style.

8. Play at least one game of 1000 Blank White Cards. Essentially do-it-yourself Fluxx, 1kBWC is the ultimate in creative party gaming. Artists have a chance to create pretty pictures, comedians and raconteurs can entertain and engage, and we gaming geeks can play around with clever mechanics and rules interactions. I've loved the game since high school and have played it scant few times since college.

9. Play-test D&D Next. Here's a chance to do something exactly in my wheelhouse: delve into the underlying design decisions behind the next iteration of the greatest and most popular tabletop RPG out there. I've followed the open beta since its beginning back in May but, despite my best efforts to organize a group, haven't been able to actually sit down and play it. 2013 might give me a better chance, as the next season of Encounters includes the D&D Next beta among its optional rules.

10. Play a D&D Lair Assault. The name "Lair Assault" has always put me off: it implies that it's going to be a session of combat and nothing else, within a version of D&D that's soundly criticized (sometimes fairly) as focusing too much on combat. However, all of my D&D-playing friends who have done Lair Assaults have reported having a great time, and with 4th Edition starting to make its exit, now is as good a time as any to give one a shot.

12. Blog more about game strategy. One idea I had starting this blog was to relate specific situations and anecdotes from games and pose them to the blog's readers, asking for a wider audience to consider the optimum strategy. (Exactly how much is power plant 13 worth in Power Grid?) Ludi Berkeley has been all too light on that sort of post so far.

13. Redesign the blog. The neutral gray was a placeholder layout I picked back in June when I started the blog, and unfortunately I haven't had time to make it more interesting-looking. Expect an aesthetic explosion in the New Year!

Let me know what your gaming resolutions for 2013 are, and I wish you good luck in fulfilling all of them! I appreciate everyone who's followed along with the blog so far, and I encourage you to follow the blog and spread the word about it if you've enjoyed reading.