Monday, August 28, 2006

In 2005, Ken Krzywicki authored an excellent hockey shot quality study (blog entry) that produced a formula for predicting the probability of a goal, based on shot type, shot distance, whether it was a rebound, and whether or not it was on a power play.

“It is often said that playoff hockey is a different brand of hockey,” he starts out. But, then, instead of rerunning his full analysis on playoff data, he uses the same regular-season formula, as if the playoffs are the same brand of hockey.

Is that a problem? No, not if the formula works just as well. After all, it’s pretty much assumed that Runs Created and other such baseball formulas work fine for World Series games. Is the same true for hockey? Does a 55-foot slapshot have the same chance of scoring in the playoffs as it does in the regular season?

The answer, as Krzywicki finds, is no – in general, the chance of scoring on any given shot is lower in the playoffs. The study divides shots into deciles, based on scoring probability. In eight of the ten deciles, the chance is lower. In the other two, the chance is only very slightly higher.

So clearly, shots are less deadly in the playoffs. There are two possibilities:

1. The style of play is different in the playoffs; perhaps checking is tighter, and so shots are of lower quality, even from the same distance.

2. The shot quality is the same, but the goalies are better at stopping those shots.

It seems to me that number 1 is more likely. But Krzywicki argues that it’s number 2 – that the data shows the difference is improved goaltending. And that’s where I disagree.

Krzywicki’s reasoning is that statistical tests on playoff data using the regular-season shot formula give results that show a good fit – and therefore, we should assume that the shot probabilities haven’t changed. Specifically, the Kolmogorov-Smirnov (KS) statistic is only 2.8% worse for the playoff data; the population stability index is a low 0.0114, which indicates a stable population between playoff and non-playoff shots; and the “deviation in the logit” – a measure of how much the coefficients vary between the two datasets – is -0.0313, “a small number.” Therefore, since the statistical tests show non-significant differences, the old model is a good enough fit for the playoff data.
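For readers who haven't run into the population stability index, here is a minimal Python sketch of how it is conventionally computed in scoring work. I'm assuming the study uses that standard definition; the function name and the playoff bin shares below are made up for illustration and are not Krzywicki's data.

```python
import math

def population_stability_index(expected, actual):
    """Conventional PSI: sum over bins of (a - e) * ln(a / e).

    `expected` and `actual` are the shares of shots falling in each
    score bin (each list should sum to 1, with no zero shares).
    """
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))

# Hypothetical bin shares, NOT taken from the study.
# Regular-season shots fall evenly into deciles by construction.
regular = [0.10] * 10
# Invented playoff shares across those same ten score bins.
playoff = [0.12, 0.11, 0.10, 0.10, 0.10, 0.10, 0.10, 0.09, 0.09, 0.09]

psi = population_stability_index(regular, playoff)
print(f"PSI = {psi:.4f}")
```

Note that the only inputs are the shares of shots in each bin. Under this definition, the index describes how the mix of shots moved between the two samples.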

My response, as I argued a few days ago in the case of correlation coefficients, is that the results of statistical tests, taken alone, can’t provide sufficient evidence that the fit is good enough for the purpose at hand. We already know that there is a high degree of similarity between playoff shots and regular season shots, just by the fact that hockey rules don’t change and nobody has noticed a serious difference even after watching hundreds of games. But that intuitive gut feel isn’t enough – if it were, we wouldn’t need this study. So how do we know that 2.8% goes farther than that, that it’s low enough to prove that there’s really no difference between the styles of play? Maybe 2.8% is huge – the difference between playing an AHL team and playing the New Jersey Devils. It looks small in the statistical sense, but it might be huge in the hockey sense. You’d need a whole nother study to show whether 2.8% really is small enough to ignore.

As for the conclusion about goaltending, suppose that there are two alternate universes. In one universe, playoff goaltending improves by 15% but defense stays the same (as Krzywicki argues here). In the other universe, goaltending stays the same, but defense improves by 15% – that is, shots are 15% less likely to score because the defense forces softer shots, even given the type and distance.

Both those cases will result in exactly the same play-by-play data, and the same results for the statistical tests! There’s no way of figuring out which of the two universes we live in without additional data (such as shot speed and accuracy, for instance).
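To make the two-universes point concrete, here is a small simulation sketch in Python. It assumes both effects act as a simple multiplier on the formula's scoring probability, which is my simplification and not something the study states. The function name, the factors, and the shot probabilities are all hypothetical.

```python
import random

def simulate_goals(base_probs, quality_factor, goalie_factor, seed=0):
    """Simulate goal/no-goal for each shot.

    base_probs: regular-season scoring probability of each shot,
                as a formula based on type and distance would give it.
    quality_factor: multiplier for softer shots forced by the defense.
    goalie_factor:  multiplier for better goaltending.
    Both multipliers are assumptions made for illustration.
    """
    rng = random.Random(seed)
    return [rng.random() < p * quality_factor * goalie_factor
            for p in base_probs]

# Invented mix of shots; the play-by-play data records only these.
shots = [0.03, 0.08, 0.15, 0.25] * 2500

# Universe A: goaltending improves 15%, shot quality unchanged.
universe_a = simulate_goals(shots, quality_factor=1.0, goalie_factor=0.85)
# Universe B: defense forces 15% softer shots, goaltending unchanged.
universe_b = simulate_goals(shots, quality_factor=0.85, goalie_factor=1.0)

print(sum(universe_a), sum(universe_b))  # same goal count in both universes
print(universe_a == universe_b)          # same shot-by-shot record
```

Only the product of the two factors reaches the data, so any test run on the shot records returns the same answer in either universe.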

So, I would argue, this study leaves the question unanswered. The difference might be mostly the playoff style of play. It might be mostly goalie improvement. Or it might be some significant combination of both.

1 Comment:

First, I'm no hockey buff (or anything close) but this is something I've pondered in all sports.

Would looking at data from the games played between playoff teams in the regular season change the results?

For example, if the Devils and Rangers played X number of games in the regular season, would the percentages in those games be more or less similar to the games they play against each other in the playoffs?

Essentially, couldn't the data from the playoff games be the same as the data from regular-season games between playoff teams, with the regular-season numbers being skewed by the teams that didn't make the playoffs, so that it looks like there is a difference when there really is none?