Thursday, February 23, 2012

40k Metrics: The Data We Want Versus the Data We Have

Warhammer 40,000 has a lot of aspects that can be reduced to numbers through basic probability and algebra. Analysis of in-game unit or army statistics definitely has a place and plenty of good uses, and the end result is (hopefully) a more accurate prediction of army performance on the tabletop.

But why are we doing all this mathhammer? To help us make better choices, in order to win more often. Duh.

It's still all about the numbers.

With that said, doesn’t it strike you as hilarious that we don’t have even rudimentary statistics for some of the most basic things that other competitive sports or games take for granted? We are woefully ignorant about decisions that other competitive activities have put to bed.

Let’s talk about the most important one first, the giant elephant in the tournament hall: who should take the first turn? In chess, the player going first scores roughly 52 to 56%, counting a draw as half a win. Why don’t we have this basic statistic for 40k? It wouldn’t be hard to determine. On your tournament score sheet, have the players check a box to record who went first. Once we have the raw data for who went first and who eventually won, the data mining potential is endless.
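To make the idea concrete, here is a minimal sketch of what that check box buys you. The `Game` record, its field names, and the sample rows are all made up for illustration (they are not real tournament results), and draws are assumed to be left out of the data.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """One row from a hypothetical results slip."""
    first_codex: str        # codex of the player who took the first turn
    second_codex: str       # codex of the player who went second
    first_player_won: bool  # the new check box, combined with the result


def first_turn_win_rate(games):
    """Fraction of games won by the player who went first."""
    if not games:
        raise ValueError("no games recorded")
    wins = sum(1 for g in games if g.first_player_won)
    return wins / len(games)


# Invented placeholder rows, not real tournament results.
games = [
    Game("Imperial Guard", "Eldar", True),
    Game("Space Marines", "Orks", True),
    Game("Eldar", "Imperial Guard", False),
    Game("Orks", "Space Marines", False),
]

print(f"First turn win rate: {first_turn_win_rate(games):.0%}")
```

Four games prove nothing, of course. The point is only that one extra column per game is enough to answer the question once a few hundred rows exist.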

Just knowing whether or not the player who went first won more often doesn’t actually help you make a choice, though. Let’s hypothetically say that the player who goes first in the NOVA Open wins 55% of the time: great! I choose to go first.

But wait!

Let’s say that an army like Eldar happens to win 60% of the time when going second, due to the particular capabilities of the units in its codex. Knowing that the player going first wins more often overall isn’t relevant to an Eldar player. By the same token, a Guard player might win 70% of the time when going first, making it even more of a no-brainer. So clearly, once we have the raw data, the first order of business is to sort it by codex. Having access to the results of taking the first turn on a codex-by-codex basis is data worth having.
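Sorting by codex is not much more work. The sketch below assumes each game is stored as a `(first_codex, second_codex, first_player_won)` tuple, which is my own hypothetical layout, and the sample rows are invented placeholders rather than real results.

```python
from collections import defaultdict


def win_rates_by_codex(games):
    """Return {codex: {"first": rate, "second": rate}}.

    A rate is None when that codex never played in that role.
    """
    # tallies[codex][role] = [wins, games_played]
    tallies = defaultdict(lambda: {"first": [0, 0], "second": [0, 0]})
    for first_codex, second_codex, first_won in games:
        tallies[first_codex]["first"][1] += 1
        tallies[second_codex]["second"][1] += 1
        if first_won:
            tallies[first_codex]["first"][0] += 1
        else:
            tallies[second_codex]["second"][0] += 1
    return {
        codex: {role: (w / n if n else None) for role, (w, n) in roles.items()}
        for codex, roles in tallies.items()
    }


# Invented placeholder rows, not real tournament results.
sample_games = [
    ("Eldar", "Imperial Guard", False),
    ("Imperial Guard", "Eldar", True),
    ("Imperial Guard", "Eldar", False),
    ("Eldar", "Orks", True),
]

for codex, rates in sorted(win_rates_by_codex(sample_games).items()):
    print(codex, rates)
```

Returning `None` for a role a codex never played keeps "no data" visibly separate from "0% win rate", which matters a lot for the less popular books.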

And that is just the surface. The next level is to sort the data by player record. How often did the top 10% of finishers play first or second? Was that number greater or less than the tournament or codex average? Knowing this would allow us to see if the first (or second) turn has greater predictive power for highly skilled players. If the tournament average says 55% of Space Marine players who go first win, but at the top tables that figure climbs to 65%, it would suggest that the turn decision matters even more among the best players. If it turns out to be lower, it would be relevant to know that among the top players, turn decision is less important. Either way, the answer is worth knowing.
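The top-table comparison could look something like the sketch below. It assumes final standings are available as a list of player names ordered best to worst, and that each player's games are stored as `(went_first, won)` pairs. Both layouts, the function names, and the sample data are hypothetical.

```python
import math


def first_turn_rate(records):
    """Win rate over games where the player went first; None if there are none."""
    results = [won for went_first, won in records if went_first]
    return sum(results) / len(results) if results else None


def top_finishers(standings, fraction=0.10):
    """Names in the top slice of the standings (always at least one player)."""
    cutoff = max(1, math.ceil(len(standings) * fraction))
    return set(standings[:cutoff])


def compare_top_to_field(standings, player_games, fraction=0.10):
    """Return (field_rate, top_rate) for players going first."""
    top = top_finishers(standings, fraction)
    everyone = [r for games in player_games.values() for r in games]
    top_only = [r for name, games in player_games.items()
                if name in top for r in games]
    return first_turn_rate(everyone), first_turn_rate(top_only)


# Invented placeholder data, not real tournament results.
standings = ["A", "B", "C", "D"]
player_games = {
    "A": [(True, True), (True, True), (False, True)],
    "B": [(True, True), (False, False)],
    "C": [(True, False), (False, True)],
    "D": [(True, False), (False, False)],
}

field_rate, top_rate = compare_top_to_field(standings, player_games)
print(f"Field: {field_rate:.0%}, top finishers: {top_rate:.0%}")
```

With only a handful of top-table games, a gap like this is just noise. A comparison of this kind only starts to mean something once several large events are pooled.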

This wouldn’t be too difficult to do: add a single check box on a results slip and a column in the Excel spreadsheet or database program. In a day and age where sabermetricians can tell you the precise amount that home field advantage benefits a high school basketball team, it is fairly egregious that 40k players (for whom nerdy things like statistics and spreadsheets ought to be second nature) don’t have a definitive answer for the simple question, “should you take the first or second turn?”

So who is going to be the TO who wants to give the community this huge gift?

5 comments:

This sort of data would indeed be neat to have. I'm not a big tourney player, but it would be interesting. That's a good start, plus being able to break down a codex's wins not just by win/loss but also by what book they played against. Seeing that (for example) GK beat SM 70%, Orks 55% and Tau 35% would be interesting to see what kinds of R/P/Scissors charts could be constructed!

Yeah, it definitely doesn't take much data mining skill to get some really valuable information out of results data. I still think knowing your codex's first/second turn choice is a huge benefit.

Once we get that, then you can refine it to first/second turn choice vs. each enemy codex. That would probably take a larger sample size than even a big tournament like a NOVA/Adepticon. But if we had 3 or 4 such tournaments we would probably be able to say, "I'm playing Marines vs. Tau, if I go first I am a 15% favorite, if I go second I'm an 8% underdog." That would be damn valuable information.

I come from a fighting game background where we have exactly this type of stat. Results from big tournaments are collected and put into grids basically showing how likely any given character was to beat another character. Something like this: http://iplaywinner.com/storage/thumbnails/2739927-8163688-thumbnail.jpg?__SQUARESPACE_CACHEVERSION=1282064042277.

Over enough time and data, a fairly accurate tier list can be generated. New technology is discovered all the time, though, so you still have to take it with a grain of salt.

Having something similar for 40k would be awesome, but I don't know if it would really be that useful. Where Chess and Street Fighter are static games, I think 40k has too many variables to consider. Showing that Orks beat Eldar 78% of the time doesn't really seem all that meaningful if we don't know what the list compositions were like.