Saturday, February 20, 2010

The Relationship Between Outshooting and Outscoring over Time

Derek Zona from coppernblue had a great post last month that examined the relationship between outshooting and outscoring at even strength over time. Specifically, he looked at aggregated EV goal and shot ratios for each team over the last three seasons as a whole (2007-08, 2008-09 and 2009-10, I presume). He found that the teams with the best EV goal ratios during this period were overwhelmingly teams that outshot the opposition at EV.

I think that his point is an important one. While the relationship between outshooting and outscoring may not be apparent over brief periods, the teams that succeed at even strength over the long run are those that spend more time in the opposition's end than their own.

Whereas Derek examined the strength of this relationship over the last three seasons taken together, I thought it would be interesting to look at how this relationship varies over the course of an individual season.

The above chart shows the correlation between EV goal ratio and Corsi ratio at the team level over certain game segments. The first bar shows the correlation between these variables over games 1-100 (-0.09). The second bar shows the same for games 1-200. The last bar shows the correlation over the entire season (games 1-1230).
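For anyone who wants to reproduce this kind of chart, here is a minimal sketch of how one bar could be computed. The row layout `(game_no, team, gf, ga, cf, ca)`, the function names, and the choice to express each ratio as a share (for / (for + against)) are all assumptions made for illustration; they are not necessarily how the numbers above were produced.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either list has no variation.
    return sxy / sqrt(sxx * syy)


def segment_correlation(rows, last_game):
    """Correlate team EV goal share with Corsi share over games 1..last_game.

    rows: iterable of (game_no, team, gf, ga, cf, ca), one row per team per
    game, where game_no is the league-wide game number (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for game_no, team, gf, ga, cf, ca in rows:
        if game_no <= last_game:
            t = totals[team]
            t[0] += gf
            t[1] += ga
            t[2] += cf
            t[3] += ca

    goal_share, corsi_share = [], []
    for gf, ga, cf, ca in totals.values():
        # Skip teams with no EV goals or shot attempts yet in this segment.
        if gf + ga == 0 or cf + ca == 0:
            continue
        goal_share.append(gf / (gf + ga))
        corsi_share.append(cf / (cf + ca))
    return pearson(goal_share, corsi_share)


# Made-up toy data, purely to show the call shape.
toy_rows = [
    (1, "A", 3, 1, 30, 20),
    (1, "B", 1, 3, 20, 30),
    (2, "C", 1, 3, 30, 20),
    (2, "D", 3, 1, 20, 30),
]
print(segment_correlation(toy_rows, 1))
print(segment_correlation(toy_rows, 2))
```

Calling `segment_correlation` with cutoffs of 100, 200, and so on up to 1230 would give one value per bar.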

It's apparent that the correlation between EV goal ratio and Corsi ratio increases over the course of the regular season. The increase is more or less linear over the first 1000 games, at which point it levels off.

The increase is even more pronounced if one looks at the relationship in terms of variance explained (r^2), rather than as a correlation. While Corsi ratio only accounts for roughly 9-15% of the variance in EV goal ratio over the first several hundred games, the r^2 value for the entire sample is in the range of 35-40%.
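To see why the jump looks bigger in r^2 terms, it helps to convert the quoted percentages back into correlations. The short sketch below does only that arithmetic; the helper name is made up, and it assumes the correlations in question are positive, since a square root cannot recover the sign.

```python
from math import sqrt


def r_from_r2(r2):
    """Return the magnitude of r implied by a given r^2 (sign is not recoverable)."""
    return sqrt(r2)


# r^2 ranges quoted in the text: 9-15% early on, 35-40% for the full season.
early_low, early_high = r_from_r2(0.09), r_from_r2(0.15)
full_low, full_high = r_from_r2(0.35), r_from_r2(0.40)

print(f"early season: r^2 of 9-15%  -> |r| of about {early_low:.2f}-{early_high:.2f}")
print(f"full season:  r^2 of 35-40% -> |r| of about {full_low:.2f}-{full_high:.2f}")
```

Because r^2 is the square of r, any growth in the correlation is squared when expressed as variance explained, which is why the same data looks steeper on that scale.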

I've also included a chart that shows the same data for the 2008-09 season. While the increase isn't as sharp as that observed in 2007-08, the overall message is the same: as the season moves forward, the relationship between outshooting and outscoring at EV grows stronger.

This is just the effect of a larger sample size. Over a very short period of time, random variation swamps any real effect, so the explanatory power of any variable will be close to zero; over a long period the noise averages out and the explanatory power becomes much higher (assuming the variable actually is correlated with the dependent variable).
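The sample-size point can be illustrated with a toy simulation. Everything in it is an assumption chosen for illustration rather than anything measured from NHL data: 30 teams, each with a fixed "true" goal share drawn around 50% with a spread of 3 percentage points, and goals treated as independent coin flips at that share. The underlying share stands in for something like Corsi ratio.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean_correlation(events_per_team, n_teams=30, talent_sd=0.03,
                     trials=40, seed=0):
    """Average correlation between underlying and observed goal share.

    events_per_team is the number of goals (for plus against) each team has
    been involved in; all other parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each team's underlying share of goals, fixed for the whole trial.
        talent = [rng.gauss(0.5, talent_sd) for _ in range(n_teams)]
        # Observed share: each goal goes to the team with probability p.
        observed = [
            sum(rng.random() < p for _ in range(events_per_team)) / events_per_team
            for p in talent
        ]
        total += pearson(talent, observed)
    return total / trials


for n in (20, 200, 1000):
    print(n, round(mean_correlation(n), 2))
```

The underlying relationship is identical at every sample size in this model; only the amount of noise in the observed goal share changes, and the measured correlation should climb as the number of goals per team grows.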