Wednesday, November 3, 2010

Score Effects and Minor Penalties

The “playing to the score” effect has received a fair amount of coverage from the more statistically inclined members of the hockey blogosphere over the last two years or so (see here, here and here for some good overviews on the subject). In short, playing from behind tends to have a favourable effect on a team’s shot ratio, whereas playing with the lead tends to have the opposite result. The effect increases linearly as a function of goal margin, and is exaggerated in the third period.

The majority of the analysis conducted on the subject thus far has focused on the effect of game score on shot ratio. Little, if anything, appears to have been done to determine the extent to which the effect operates in other aspects of the game. It’s conceivable, for example, that game score could have an analogous effect on team penalty percentage (defined as minor penalties drawn / (minor penalties drawn + minor penalties taken)).
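As a small illustration of that definition, here is a sketch in Python. The function name and the sample counts are made up for the example and are not taken from the data used in this post.

```python
def penalty_percentage(drawn: int, taken: int) -> float:
    """Minor penalties drawn / (minor penalties drawn + minor penalties taken)."""
    total = drawn + taken
    if total == 0:
        raise ValueError("no penalties recorded")
    return drawn / total


# Hypothetical counts: a team that drew 120 minors and took 100.
print(round(penalty_percentage(120, 100), 3))
```

A value above 0.5 means the team drew more minors than it took.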

In order to answer the above question, I looked at the NHL.com play-by-play data from the 2007-08 and 2008-09 seasons and created a script in Excel to determine how many penalties each team drew and took during that period. However, because I was only interested in penalties that provided one of the teams with a manpower advantage (relative to the situation that existed before the penalty was called), I only counted “unique” penalties. I defined a unique penalty as a penalty not accompanied by the calling of any other penalty at that specific point in time. This obviously fails to account for situations where multiple penalties are called and one team emerges with a powerplay, but such situations are rare enough that they do not materially affect the data.
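The “unique” penalty filter can be sketched roughly as follows. This is a Python stand-in for illustration, not the Excel script itself, and the record layout (game id, period, seconds elapsed, penalized team) is an assumption about how the play-by-play rows might be stored.

```python
from collections import Counter
from typing import NamedTuple


class Penalty(NamedTuple):
    # Hypothetical record layout, assumed for this sketch.
    game_id: str
    period: int
    seconds: int          # seconds elapsed in the period when the call was made
    penalized_team: str


def unique_penalties(penalties):
    """Keep only penalties with no other penalty called at the same game time."""
    counts = Counter((p.game_id, p.period, p.seconds) for p in penalties)
    return [p for p in penalties if counts[(p.game_id, p.period, p.seconds)] == 1]


# Made-up rows: the two calls at 5:12 of the second period coincide and are dropped.
sample = [
    Penalty("G1", 1, 245, "HOME"),
    Penalty("G1", 2, 312, "HOME"),
    Penalty("G1", 2, 312, "AWAY"),
    Penalty("G1", 3, 900, "AWAY"),
]
print(len(unique_penalties(sample)))
```

As noted above, this also discards the rare cases where several penalties are called at once and one team still ends up with a powerplay.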

I then determined the goal state prevailing at the time that the penalty occurred: in other words, whether the team that drew/took the penalty was trailing, leading or tied at the time that the penalty was called. Here are the combined results for the two seasons, broken down at the team level.

Evidently, the trailing team does significantly better in terms of penalty ratio than does the leading team. During the period in question, every single team did better in terms of penalty percentage when trailing than when leading. In aggregate, the trailing team drew roughly 54.5% of all penalties, making the magnitude of the effect similar to that observed with respect to shot percentage (the trailing team had an aggregate Corsi percentage of 55.2 during the same timeframe).

One final question remains: does the trailing team earn its penalty advantage on merit, or is it a product of referee bias? Given that the trailing team also enjoys an advantage in terms of Corsi, and therefore spends more time in the opponent’s end than its own (an area of the rink in which a disproportionate percentage of penalties are drawn), one might be inclined to favour the former explanation. However, as demonstrated in the table below, there isn’t much of a relationship between Corsi percentage and penalty percentage when the score margin is other than zero.

In other words, while the trailing team tends to both outshoot and “outdraw” the leading team, teams that outshoot the opposition by a large margin when playing from behind don’t do significantly better with respect to (trailing) penalty ratio than teams that outshoot the opposition to a lesser extent.

As alluded to above, the second possibility is that the penalty advantage accruing to the trailing team is the result of referee bias. If this explanation is correct, then the team-to-team variation in both leading and trailing penalty percentage would be the product of randomness and of team differences in the ability to draw more penalties than the opposition, and nothing else. To test this hypothesis, the following experiment can be performed:

Take each team’s penalty percentage with the score tied over the two seasons in question and regress each value 60% to the mean. The resulting values provide an estimate of each team’s underlying ability to draw more penalties than the opposition. The regression is necessary given that approximately 60% of the team-to-team variation in penalty percentage with the score tied (in the sample in question) can be attributed to luck.

Simulate the two seasons such that every “unique” penalty that occurred when each team was trailing constitutes an individual trial.

Designate the probability of drawing any given penalty as that team’s “true ability” penalty percentage (as determined above) plus 0.045 (0.045 being the magnitude of the referee bias).

Calculate the average team-to-team spread (standard deviation) in penalty percentage after conducting a sufficiently large number of simulations.

Compare the predicted standard deviation to the actual value.

Repeat the above with respect to all penalties that occurred when each team was leading.
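The steps above can be sketched roughly in Python. This is a minimal illustration under stated assumptions, not the code actually used: the team names, tied-score percentages and penalty counts are made up, the spread is computed as a population standard deviation, and a bias of -0.045 for the leading case is inferred by symmetry rather than spelled out in the steps.

```python
import random
import statistics

REGRESSION = 0.60  # share of tied-score variation attributed to luck (from the post)
BIAS = 0.045       # trailing-team advantage in penalty percentage (from the post)


def regress_to_mean(tied_pct, regression=REGRESSION):
    """Pull each team's tied-score penalty percentage toward the league mean."""
    mean = statistics.fmean(tied_pct.values())
    return {team: mean + (1 - regression) * (pct - mean)
            for team, pct in tied_pct.items()}


def simulated_spread(true_pct, n_penalties, bias, n_sims=200, seed=0):
    """Average team-to-team standard deviation of penalty percentage over n_sims runs."""
    rng = random.Random(seed)
    spreads = []
    for _ in range(n_sims):
        season = []
        for team, ability in true_pct.items():
            p = min(max(ability + bias, 0.0), 1.0)  # chance of drawing each penalty
            n = n_penalties[team]                   # one trial per unique penalty
            drawn = sum(rng.random() < p for _ in range(n))
            season.append(drawn / n)
        # Population standard deviation across teams (an assumption of this sketch).
        spreads.append(statistics.pstdev(season))
    return statistics.fmean(spreads)


# Hypothetical three-team league, for illustration only.
tied = {"A": 0.55, "B": 0.50, "C": 0.45}
trailing_counts = {"A": 200, "B": 220, "C": 180}
true_ability = regress_to_mean(tied)
print(round(simulated_spread(true_ability, trailing_counts, BIAS), 4))
```

The leading case would reuse `simulated_spread` with the leading penalty counts and `bias=-BIAS`. The predicted spread is then compared with the standard deviation of the observed trailing (or leading) penalty percentages.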

The results:

This essentially confirms the above hypothesis in that the predicted standard deviations are virtually identical to the actual values. As such, an analogy can be drawn between the trailing team advantage in penalty percentage and home ice advantage. The probability of a team winning any given game is approximately 5% higher on home ice relative to neutral ice, but all teams benefit from the effect equally (that is, the team-to-team variation in home vs. road winning percentage is entirely random). Similarly, the referee bias in favour of the trailing team appears to be even across the board.

10 comments:

Very cool stuff. The biggest surprise for me is that there isn't a good correlation between Corsi (i.e. zone time) and drawing penalties. Do you know if there's much correlation between the two with the score tied?

The correlation between the two variables is larger when the score is tied - 0.25 in 07-08 and 0.44 in 08-09.

Of course, as team penalty ratio with the score tied over the course of a single season is mostly luck-based, the "true" correlation between the two variables is much stronger.

If each team's regular season schedule consisted, for example, of 10 000 games (or any sufficiently large number) instead of 82, the correlation between the two would be 0.51 (if 0.25 is used as the attenuated value) or 0.89 (if 0.44 is used as the attenuated value).
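For readers who want to see the mechanics, the standard correction for attenuation divides the observed correlation by the square root of the product of the two variables' reliabilities. The sketch below is a Python illustration, not the exact calculation behind the figures above: the reliabilities are not stated in this thread, so the code only backs out the factor implied by the quoted numbers, and the 0.5 reliabilities in the last line are purely hypothetical.

```python
import math


def disattenuate(observed_r: float, reliability_x: float, reliability_y: float) -> float:
    """Standard correction for attenuation: r_true = r_obs / sqrt(rel_x * rel_y)."""
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Attenuation factor implied by the quoted pairs (observed r, corrected r).
for observed, corrected in [(0.25, 0.51), (0.44, 0.89)]:
    print(round(observed / corrected, 2))

# Hypothetical reliabilities of 0.5 for each variable, for illustration only.
print(disattenuate(0.25, 0.5, 0.5))
```

Both quoted pairs imply roughly the same factor (about 0.49), which is what one would expect if the same reliabilities were applied to each season.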

1) I am not able to get accurate data for when the score is tied (like you have). How much is the average effect (i.e. trailing teams generate what % more shots)? Could one generate a reasonable approximation from knowing time trailing and overall even strength shot rate?

2) I have home ice at around 11-12%. I believe I did Pythagorean based on GF and GA over 3 years: it was 55.5% to 44.5%. (Just wondering, how did you get your %?)

3) I appreciate why you only use data when the game is tied; however, does this cause a sample size problem? 3a) Doesn't the type of scoring chances/shots change when a team is protecting a lead? My reasoning is: even though we have had post-lockout rule changes, and now you have shown referee bias for the trailing team and increased shots, it is still very hard to come back: after 1 (26% chance), after 2 (8% chance). Shouldn't teams come back more often?

1. As mentioned in the post, the trailing team generated 55.2% of all shots directed at the net (goals + saved shots + missed shots + blocked shots) during the two seasons examined, but that's for even strength only.

The overall figure would be slightly higher, given that the trailing team tends to draw more powerplays than the leading team.

You would likely be able to generate a reasonable approximation using those variables (although if you're only estimating shot ratio then time trailing doesn't matter), but it's not something I've looked at specifically.

2. The 5% figure I quoted was in reference to home ice advantage relative to neutral ice.

3. Sample size is a concern in terms of using each team's penalty ratio with the score tied as a measure of its ability to draw penalties. But my model took this into account by regressing each team 60% to the mean.

3a. The degree of bias is significant, but not overwhelming - the trailing team only slightly outdraws the leading team. Also, I suspect that the effect is larger earlier in the game (and perhaps also when the score margin is greater).

The 0.045 figure represents the advantage in penalty ratio enjoyed by the trailing team within the sample analyzed.

If you look at the chart in the post, the trailing team drew 54.5% of all penalties (and therefore had an advantage of 0.045).

If my prediction is correct that the bias in favour of the trailing team operates across the board, then the team-to-team variation in trailing penalty ratio should be the product of team differences in the ability to draw penalties and nothing more. That's why all teams received the same adjustment under the simulation.

In terms of your second question, looking strictly at the 2007-08 season, there were 10084 penalties that met my definition. 3556 of those occurred with the score tied. Of the remaining 6528, the trailing team drew 3587, some 323 more than we'd expect if there were no bias.
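The arithmetic behind those 2007-08 figures can be checked with a few lines of Python, reading "no bias" as an even 50/50 split of the non-tied penalties (an assumption of this sketch; the variable names are only illustrative).

```python
total_unique = 10084    # unique penalties in 2007-08 (from the comment above)
score_tied = 3556       # of which called with the score tied
trailing_drawn = 3587   # non-tied penalties drawn by the trailing team

non_tied = total_unique - score_tied   # penalties called with one team ahead
expected_no_bias = non_tied / 2        # even split between trailing and leading
excess = trailing_drawn - expected_no_bias
trailing_share = trailing_drawn / non_tied

print(non_tied, expected_no_bias, excess, round(trailing_share, 3))
```

The trailing share for that single season works out to a little under 55%, in line with the two-season aggregate of roughly 54.5%.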

It would seem that Phoenix is quite a bit better than its record. While the Corsi ranking can give a pretty good idea of the ability of teams, I think special teams play is an important factor as well. Click on my name for an article regarding this.

You're right - in assessing team quality in a general sense, special teams necessarily factor into the equation.

One problem, however, is that special teams performance is largely luck driven, even over the course of an entire NHL season. See, for example, this post by Gabe Desjardins, in which he finds that over half of the team-to-team variation in powerplay efficiency can be attributed to luck.

I read your post and see that you've looked at the correlation between special teams performance (PP% + PK% - 100) and standings points. The relationship is substantial, but much of that is mediated by luck.

If you looked at special teams performance over the 1st half of the season and how it correlated with point totals over the second half, the relationship would be mild, perhaps not even statistically significant for some years. This is because the luck involved in team-to-team differences in 1st half performance doesn't sustain itself to the 2nd half of the season.