Friday, October 02, 2009

Empirically Measuring Class DPS

I saw a really interesting discussion on EJ about those Top 50 DPS class leaderboards. The argument is that, because the leaderboards take the top 50 parses, classes with high variance in their DPS (perhaps due to a high crit rate) end up higher on the boards than classes with low variance.

For example, take two classes that both average 5000 DPS, one with a 50% crit rate and the other with a 20% crit rate. If each class did 100 fights, and you picked the Top 20 fights, the class with the 50% crit rate is going to be on top. This is because the Top 20 fights are the fights with a greater than average critical strike rate, which inflates the damage done.
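To see how much picking only the best fights can skew things, here is a minimal sketch of the selection effect. It makes some assumptions that are not in the discussion: per-fight DPS is treated as normally distributed, and the two standard deviations (100 and 400) are made-up numbers standing in for a "steady" class and a "swingy" class. It does not model crit rates directly.

```python
import random

def top_k_mean(samples, k):
    """Mean of the k largest values in samples."""
    top = sorted(samples, reverse=True)[:k]
    return sum(top) / k

def simulate(mean_dps, sd, fights, rng):
    """Draw one DPS value per fight from a normal distribution (an assumption)."""
    return [rng.gauss(mean_dps, sd) for _ in range(fights)]

rng = random.Random(2009)  # fixed seed so the run is repeatable

# Both hypothetical classes average 5000 DPS; only the spread differs.
steady = simulate(5000, 100, 100, rng)  # low-variance class (made-up sd)
swingy = simulate(5000, 400, 100, rng)  # high-variance class (made-up sd)

steady_top = top_k_mean(steady, 20)
swingy_top = top_k_mean(swingy, 20)
print(f"steady top-20 mean: {steady_top:.0f}")
print(f"swingy top-20 mean: {swingy_top:.0f}")
```

Both classes have the same true average, but the Top 20 for the swingy class should come out noticeably higher, which is the bias the leaderboard argument is about.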

So if the Top 50 boards aren't a reliable way to measure Class DPS, what would be?

One idea is to look at the average DPS across *all* parses. The problem here is that as you drop lower and lower, skill and gear become the dominant factors. At least the Top 50 parses are probably going to be equally skilled and geared, so you're only looking at the differences between classes. But maybe you can make the assumption that each class is equally likely to have under-skilled and under-geared players. Is that a good assumption? I'm not sure.

10 comments:

My suggestion would be to add up the item levels of players wearing all purple, divide the player base into ranges about 50 or 100 item levels wide, and publish an average for each range. Be sure to include a standard deviation so we can see how much variance exists inside each range.

And, if each range contains a large enough number of players, it is fine to assume each class will have a reasonably similar distribution of geared toons.
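The bucketing idea above could be sketched roughly like this. The input format (pairs of total item level and DPS), the function name, and the sample numbers are all hypothetical; the population standard deviation is used here, which is one reasonable choice among several.

```python
from collections import defaultdict
from statistics import mean, pstdev

def summarize_by_item_level(parses, width=50):
    """Group (total_item_level, dps) pairs into ranges `width` wide.

    Returns {range_start: (count, mean_dps, std_dev)}.
    """
    buckets = defaultdict(list)
    for item_level, dps in parses:
        start = (item_level // width) * width  # lower edge of the range
        buckets[start].append(dps)
    return {
        start: (len(values), mean(values), pstdev(values))
        for start, values in sorted(buckets.items())
    }

# Made-up parses: (summed item level across all slots, DPS)
sample_parses = [(3400, 4000), (3420, 4200), (3460, 4600), (3490, 4800)]
summary = summarize_by_item_level(sample_parses, width=50)
for start, (count, avg, sd) in summary.items():
    print(f"{start}-{start + 49}: n={count}, mean={avg:.0f}, sd={sd:.0f}")
```

Publishing the count next to each average makes it easy to see which ranges have too few players to trust.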

If you created a cut-off, such as all parses where the player didn't die and was in reasonable gear and maybe even already had the achievement, and then randomly selected 50 of those, you'd end up with a better estimate.

Selecting the highest adds some selection bias to your results. Selecting randomly would eliminate a lot of that. If the number of samples was reasonably high (say 50), then the chance of accidentally getting ones that weren't representative would be fairly low.
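A rough sketch of that cut-off-then-sample approach is below. The field names (`died`, `item_level`, `dps`), the item level threshold of 200, and the generated data are all hypothetical stand-ins, not anything the parse sites actually expose.

```python
import random

def random_sample_of_qualifying(parses, n=50, min_item_level=200, seed=None):
    """Keep parses that pass the cut-off, then pick n of them at random."""
    qualifying = [
        p for p in parses
        if not p["died"] and p["item_level"] >= min_item_level
    ]
    rng = random.Random(seed)
    # Never ask for more parses than actually qualify.
    return rng.sample(qualifying, min(n, len(qualifying)))

# Made-up data: 400 parses with varying gear, some with a death.
all_parses = [
    {"player": f"p{i}", "dps": 4000 + i, "died": i % 4 == 0,
     "item_level": 190 + i % 30}
    for i in range(400)
]
sample = random_sample_of_qualifying(all_parses, n=50, min_item_level=200, seed=1)
average_dps = sum(p["dps"] for p in sample) / len(sample)
print(f"{len(sample)} parses sampled, average DPS {average_dps:.0f}")
```

Because the 50 parses are drawn at random rather than ranked, a lucky fight is no more likely to be included than an unlucky one.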

Statistics has solved this problem already. Sadly, I'm not a statistician.

I would suggest the following procedure: take the entire universe of parses, and then calculate the average DPS per player. That is, for player xyz, calculate his average DPS across all parses available for that character (so in a sense, calculate his average lifetime DPS rather than his average DPS for a single instance run). Do this for every character in the sample.

Then take the top 50. By Chebyshev's Weak Law of Large Numbers, you are assured convergence to the true mean, so if there are enough parses for a character (20 or so should be more than enough), then the effects of variance should be eliminated (since you will likely have both higher-than-average crit rates and lower-than-average crit rates in your lifetime DPS history).
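As a rough illustration of this procedure, the sketch below averages each character's parses first and only then ranks them. The input format, the function name, and the sample data are made up; the 20-parse minimum comes from the comment above.

```python
from collections import defaultdict
from statistics import mean

def top_characters_by_lifetime_dps(parses, top_n=50, min_parses=20):
    """Average DPS per character, then return the top_n averages.

    `parses` is an iterable of (character, dps) pairs. Characters with
    fewer than min_parses parses are skipped, since their average is
    still dominated by luck.
    """
    per_character = defaultdict(list)
    for character, dps in parses:
        per_character[character].append(dps)
    averages = {
        character: mean(values)
        for character, values in per_character.items()
        if len(values) >= min_parses
    }
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Made-up data: C has one huge parse count shortfall and is excluded.
example = (
    [("A", 5000)] * 20
    + [("B", 4000)] * 10 + [("B", 5000)] * 10
    + [("C", 9000)] * 3
)
ranking = top_characters_by_lifetime_dps(example, top_n=50, min_parses=20)
print(ranking)
```

Character C's three lucky 9000 DPS parses never reach the board, which is the point of averaging before ranking.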

I just realized something: in the procedure I have above, you would have a natural upwards drift in DPS as an individual character gears up, so you would not be taking IID samples, but autocorrelated and heteroskedastic samples (since the variance depends on how much SP/AP you have).

Nonetheless, the procedure should still hold if you correct for the above.

I did a post, a while back, on why I thought the 5% Hybrid DPS tax didn't work, but its conclusion lends itself to your topic as well.

The answer is that it is difficult to even parse an "average" DPS. Fight mechanics and different assignments can all lead to parses that vary wildly from one day to another. I know that I can pull nearly 5k on certain bosses in Naxx 10-man. But on others, only maybe 2.5k - just because I am assigned to be at the back kiting the zombie chows.

1. The original EJ assumption is probably incorrect because critical rates don't actually vary that much from the mean due to the number of times the player launches any given attack. It would take a truly enormous number of player parses to generate the kinds of extremes being speculated.

2. Averages across large bodies of data wouldn't work all that well because the further you get from the leading edge performance, the more room there is for 'junk data', including under-geared or under-skilled players.

3. Your 'all classes have an equal chance to be geared' assumption has at least one flaw: healing/tanking classes. DPS variants of classes that can otherwise tank or heal have two possibilities - someone geared for pure DPS, and someone who is using their secondary gear because they normally tank/heal.

4. The Top 20 lists are meaningless for another reason: the results are all 'fake' since they almost certainly involve inefficient use of various buffs/abilities to magnify the individual's dps.
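Point 1 in the list above can be put into rough numbers. The sketch below assumes every attack crits independently with the same chance (a simple binomial model, which ignores real mechanics), and the attack counts are made up. Under that assumption, the typical swing in a fight's observed crit rate is sqrt(p * (1 - p) / n) for crit chance p and n attacks.

```python
import math

def crit_rate_std_error(crit_chance, attacks):
    """Standard deviation of the observed crit rate over `attacks` attacks,
    assuming each attack crits independently (binomial model)."""
    return math.sqrt(crit_chance * (1 - crit_chance) / attacks)

# Hypothetical attack counts for a short, medium, and long fight.
for attacks in (25, 100, 400):
    spread = crit_rate_std_error(0.5, attacks)
    print(f"{attacks} attacks at 50% crit: +/- {spread * 100:.1f} points")
```

With 400 attacks at a 50% crit chance the spread is 2.5 percentage points, so the observed crit rate in a long fight stays fairly close to the mean.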

Ummm, I think you're wrong about whether the 20% or 50% crit rate would be higher in the skewed sample.

If they are equal at their average crit rates, then if both went up 1%, the lower crit rate class should increase by more. I expect that the high crit rate class will start to saturate its proc-on-crit abilities.

Also the lower crit rate has a larger ability to get "lucky" streaks of crits.

Although you could be correct if what you mean is that the crit-dependent class will have the most data points that scatter high. If a class is highly crit-dependent, then presumably the players will stack crit.

I think Yyidth has the right idea - split it up by item level, and generate a range of values for each average item level step.

The real problem with "Top 50" isn't a freak variation in number of crits seen or some other RNG luck, it's that you can't generalize "the best players in the best gear" to "good players in decent gear", much less "average players in marginal gear". The question you're wondering about when you're measuring class DPS is "are good, geared Xs being sat in favor of less-skilled or less-geared Ys".

I don't think anyone in that Top 50 chart is gonna find themselves on the bench, and I don't expect to be able to compete with them for a raid spot, so Top 50 isn't a good standard here. What we're looking for is "70% of AIL 200 rogues are out-dpsing the average AIL 207 shadow priests", and for that math we need to factor in gear and skill, not just focus on the ultra-high-end.
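A comparison of that "X% of group A beat the average of group B" kind could be computed along these lines. The function name and all of the DPS numbers are invented for illustration; they are not real rogue or shadow priest data.

```python
from statistics import mean

def share_beating_average(group_dps, reference_dps):
    """Fraction of group_dps values that exceed the mean of reference_dps."""
    reference_average = mean(reference_dps)
    beating = sum(1 for dps in group_dps if dps > reference_average)
    return beating / len(group_dps)

# Made-up numbers standing in for AIL 200 rogues and AIL 207 shadow priests.
rogues = [4100, 4300, 4500, 3900, 4700]
shadow_priests = [4000, 4200, 4400]

share = share_beating_average(rogues, shadow_priests)
print(f"{share:.0%} of the rogues beat the average shadow priest")
```

Run per item level range, a number like this speaks directly to the "who gets sat" question in a way a Top 50 chart cannot.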