An initial disclaimer:

This piece is for discussion. Statistical operations can be tricky and there can be a number of ways to do things. I am not claiming to be right or wrong on anything, yet. If you have some advice, please provide comment.

Round 2

So, after my article on the Caps picking up Grabovski and me not thinking it was as big of a deal as others were making it, the response was brutal. I take some credit for that by putting out an unpolished piece. In the end, I stand by my argument that the idea Grabovski would go from a career 45-50 point scorer to a 60-70 point guy was hyperbole.

Some people argued that his Corsi and Fenwick ratings, plus the fact that Washington had a lot more offensive zone faceoffs than Toronto (which should lead to more chances), would make him an improvement over Ribeiro. I basically argued that despite the improved advanced stats, it seemed crazy that any one person’s numbers would jump that high; thus, the Caps roster is at a net loss without Ribeiro, add Grabo.

To that end, I wanted to examine this further. Here is my idea: under the “Grabovski hypothesis” (the “he makes his teammates better” argument), better Corsi, Fenwick and offensive zone faceoff numbers should lead to more team goals. If this is true, we should be able to perform a linear regression and see how a variety of statistics affect the number of goals a team scores (goals for). In other words, I wanted to see what happens when we regress a team’s “goals for” for a season (y-variable) on a set of variables, including those mentioned above (X-set).
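To make the mechanics of "regress goals for on a set of variables" concrete, here is a minimal sketch in plain Python. It is not the regression run for this post: the team rows, the variable choice and the "true" coefficients are all invented, and the goals are generated from those coefficients so that the fit can be checked against a known answer. A real fit would also report standard errors and p-values, which is what "significant at the 0.05 level" refers to; this sketch computes only the coefficients.

```python
# Sketch only: every number below is invented for illustration.

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Clear this column in every other row
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical team-seasons: (shots for in hundreds, shooting %, east dummy)
teams = [(24.0, 9.0, 1), (25.5, 8.0, 0), (23.0, 9.5, 1),
         (27.0, 8.5, 0), (24.5, 7.5, 0), (26.0, 10.0, 1)]

# Made-up "true" coefficients: intercept, shots, shooting %, east
TRUE_COEFS = [-200.0, 9.0, 24.0, 3.0]

X = [[1.0, s, p, e] for s, p, e in teams]  # leading 1.0 is the intercept
goals_for = [sum(c * x for c, x in zip(TRUE_COEFS, row)) for row in X]

coefs = ols(X, goals_for)
for name, b in zip(["intercept", "shots_for", "sh_pct", "east"], coefs):
    print(f"{name:>10}: {b:8.3f}")
```

Because the invented goals are an exact linear function of the inputs, the fitted coefficients match the made-up ones. With real team data there is noise, and the question becomes which coefficients can be distinguished from zero.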

Thus, I went to stats.hockeyanalysis.com and grabbed team stats for all teams from the 2007-2008 season through last season. I added all of HA’s data (see legend below) and added some dummy variables, which is common when analyzing panel data.

Items in red are in the data table, but were not used in the regression so there weren’t correlation issues between the x-variables.

Dummy Variables

east – Eastern Conference (0=No, 1=yes)

west – Western Conference (0=No, 1=yes)

yr** – year dummy for the year the data was taken (0 = not year **, 1 = year**) – one dummy variable for each of the six years
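As a rough illustration of how dummies like these are built, the sketch below encodes one team-season as 0/1 indicators. The season labels (`"0708"` and so on) and the `encode` helper are hypothetical names, not taken from the data set. One caveat worth flagging: when the regression has an intercept, a complete set of dummies (both east and west, or all six years) always sums to 1, which makes the columns perfectly collinear, so one category per set is usually left out as the reference. Whether that was done for this post is not stated, so the `drop_reference` flag is an assumption.

```python
# Hypothetical season labels for 2007-08 through 2012-13
SEASONS = ["0708", "0809", "0910", "1011", "1112", "1213"]

def encode(conference, season, drop_reference=True):
    """Return the 0/1 dummy variables for one team-season."""
    row = {"east": 1 if conference == "East" else 0}
    if not drop_reference:
        # Full set: west is always 1 - east, so it adds no new information
        row["west"] = 1 - row["east"]
    # With drop_reference, the first season is the baseline and gets no column
    seasons = SEASONS[1:] if drop_reference else SEASONS
    for s in seasons:
        row["yr" + s] = 1 if season == s else 0
    return row

print(encode("West", "0910"))
print(encode("East", "0708", drop_reference=False))
```

With the reference categories dropped, the coefficient on `east` reads as the difference from the Western Conference, and each year coefficient reads as the difference from the baseline season.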

Results

Looking from the 2007-2008 season through the 2012-2013 season, the regression results only showed statistically significant results (at the 0.05 level) for shooting percentage and shots for (see “regressions results with lockout year” below).

I thought maybe the lockout-shortened season last year might have messed with things a bit, so I removed it and ran it again. The only statistically significant variables are again shooting percentage and shots for. Fenwick-for and Corsi-for are statistically significant at the 0.1 level, which is usually not accepted. Let’s say we do accept the stats at this level. A team would gain 1.7 goals per season for every additional 1,000 Corsi-for (1,000 shots directed at the net), or one goal per season for every 333 additional Fenwick-for (333 shots directed at the net, excluding blocked shots).

Grabovski

If I did this correctly, then only those old-fashioned statistics of shots on goal and shooting percentage matter for how many times a team scores. Offensive zone faceoff percentage does not matter. Corsi and Fenwick are not statistically significant. Even so, Grabovski and his improvement on other players would have to add 1,000 shots directed at the net to gain an additional 1.7 goals per season (or 333 shots not including blocks).

This does not say whether or not Grabovski will be better or worse than Ribeiro. But, as it stands, Grabovski’s addition to the team, based on the advanced stats, does not have a statistically significant effect. What will matter? Whether he can get people the puck to score at a high percentage or put a lot more pucks on net, unblocked. We know he is not an assist guy, so I think it can be deduced that he will not likely raise the shooting percentage for others (give them good chances). Ribeiro, on the other hand, is a distributor, based on his higher assist numbers throughout his career.

With the regression, as it is, I think my argument stands….the Washington Capitals roster is worse minus Ribeiro, plus Grabovski. The boys still have to play this out on the ice….

12 Responses to “More on Grabovski – Do advanced stats say anything about a team scoring goals?”

I’m not completely versed in this sort of stuff, so sorry if these are dumb questions.

1) When you say they’d need 333 extra unblocked shots for an extra goal, does that mean, if we plot the data, goals for = 1/333 x unblocked shots for ?

2) Are you running a multivariate regression for all these variables all at once? I’m thinking that if Fenwick for and shots for are very similar (one is shots, the other is shots + missed shots), maybe the coefficient for shots for eats up a lot of the significance of Fenwick for, meaning that the coefficient you get for Fenwick for is basically the coefficient you’d get if you used only missed shots instead of Fenwick.

Just thinking about this in hockey terms…getting an extra ~2 goals from ~300 shots means you’re shooting 0.75% on those marginal shots, which seems awfully small for any shot on goal.

Off the top of my head, a simple linear regression between wins and goal differential suggests 5-6 goals per win. A simple linear regression between wins and Corsi has a r value of something like 0.5. (That means it “explains” 25% of winning, right?) The spread between teams in Corsi differential for 2011-12 was something like +600 at best to -600 at worst. According to your conversion between goals and Corsi, that’s only worth ~3-4 goals, or a little more than half a win. Yet, the spread between those top teams (like LA and Detroit) and bottom teams (like Edmonton and Minnesota) was far greater than half a win. How do you reconcile all this information?
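The back-of-the-envelope numbers in the paragraph above can be checked in a few lines. This is only a sketch of the commenter's arithmetic: the correlation of 0.5, the ±600 Corsi spread and the 5-6 goals per win are the rough figures quoted there, the 1/333 conversion is the approximate one from the post, and 5.5 is simply the midpoint of 5-6.

```python
r = 0.5                      # rough wins-vs-Corsi correlation quoted above
r_squared = r ** 2           # share of variance "explained" in a simple regression

corsi_spread = 600 - (-600)  # best minus worst Corsi differential
goals_per_corsi = 1 / 333    # rough conversion from the post
goals_per_win = 5.5          # midpoint of the 5-6 goals-per-win estimate

goal_spread = corsi_spread * goals_per_corsi
win_spread = goal_spread / goals_per_win

print(r_squared)              # 0.25
print(round(goal_spread, 1))  # 3.6
print(round(win_spread, 2))   # 0.66
```

So an r of 0.5 does correspond to 25% of the variance in a simple regression, and a 1,200-point Corsi spread converts to about 3.6 goals, or roughly two-thirds of a win, under these inputs.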

3) Can we flip this around for goals against? I don’t think the argument has been that Grabovski = more Corsi For, necessarily, but that the Caps will have a (far) greater proportion of Corsi events go in their favor with Grabovski than with Ribeiro. So there’s a defensive benefit as well. I’d say that probably exceeds the offensive benefit.

Also, why include defensive numbers like shots against and sv% in a regression for goals for? Just for completeness’ sake, or is there a “hockey reason” ?

4) “In the end, I stand by my argument that the idea Grabovski would go from a career 45-50 point scorer to a 60-70 point guy was hyperbole.”

I don’t like throwing out seasons entirely, but it seems like there are pretty good reasons to *heavily* discount Grabovski’s ’13 season. Do that, and he’s slightly more than a 45-50 point player. More like 50-55 (assuming ~75 GP). Throw in a bonus for better team and linemates and you get to 60-65, as long as age hasn’t taken much out of him.

“I basically argued that despite the improved advanced stats, it seemed crazy that any one person’s numbers would jump that high; thus, the Caps roster is at a net loss without Ribeiro, add Grabo.”

A 10 point jump isn’t crazy, IMO. Ribeiro had a sharper increase moving from Dallas to Washington. We even see those sorts of fluctuations by a single player all the time without a big change in teammates or situations, year-to-year. All it takes is Grabovski having a bit of an up year.

5) “We know he is not an assist guy, so I think it can be deduced that he will not likely raise the shooting percentage for others (give them good chances).”

Ribeiro does a lot of that on the power play–he has that skillset. I think it’s more apt, on the power play, to compare what Ribeiro has done to what Erat and Perreault–other skilled playmakers–can do, since I don’t think Grabovski will merely replace Ribeiro in all situations–I bet he’ll play less on the PP and more on the PK.

At even strength, from 2009-2012, Grabovski’s on-ice shooting% was actually a hair higher than Ribeiro’s.

I think saying the Caps will be worse without Ribeiro and with Grabovski, will be worse with Grabovski than they were last season, and will be a worse team are all different things.

It should be obvious that team results and actual team quality aren’t always the same. Sometimes, teams get fortunate and win games they probably shouldn’t have, and when there are 30 teams, there are bound to be a few that win a lot they shouldn’t have, or lose a lot they shouldn’t have.

I think this idea is relevant to the first and third statements. There’s evidence that the Caps were in the first category–the Caps excelled in the things that are pretty random (shooting% on the power play, for example) and not so much in the things that are far less random (Corsi–although they got better as the season went along, they weren’t ever good, exactly). It makes intuitive sense to me to equate team quality with all the attributes that fall in the second category.

Will the Caps be a better Corsi team with Grabovski than Ribeiro? We all agree, yes. Does that mean the team is better? The Corsi bump is supposed to be bigger than the shooting% decline from losing such a skilled playmaker, so I think it does, on the whole. Does that mean we think the roster is better? Absent significant coaching/strategy changes, yes, I think that’s the logical conclusion.

The second statement is different because you’re comparing to how you did last season. If a team underperforms (i.e. gets unlucky) one year, they could not change a single thing and get better results the next. But a team that overperforms could improve their team quality but get worse results in the standings. (It is just a sample of 48, or 82, games we’re looking at, after all.) I think the Caps fall in the second camp with this signing.

6) What of the effects of full seasons of Martin Erat (who is a perennial 45-55 point player himself) and Brooks Laich (who was an ironman before the ’13 season) ?

1) When you say they’d need 333 extra unblocked shots for an extra goal, does that mean, if we plot the data, goals for = 1/333 x unblocked shots for ?

With this regression, each independent variable has a coefficient: a number that says how many goals one additional unit of that variable adds. (I used the standard errors instead of the coefficients in my writing, but I will correct that.) Nevertheless, for Corsi, the regression without last season shows that each Corsi point adds 0.00292 goals. Divide 1 (one goal) by 0.00292 and you get 342 Corsi points. In other words, for every 342 shots at the goal as defined by Corsi, one goal is scored on average.
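The conversion in this reply is just the reciprocal of the coefficient. A small sketch, using the 0.00292 figure quoted above (the two helper function names are made up for illustration):

```python
# Coefficient quoted in the reply above: goals added per extra Corsi event
corsi_coef = 0.00292

def events_per_goal(coef):
    """Invert a per-event coefficient into events needed for one extra goal."""
    return 1.0 / coef

def extra_goals(coef, extra_events):
    """Goals implied by a given number of additional events."""
    return coef * extra_events

print(round(events_per_goal(corsi_coef)))       # 342
print(round(extra_goals(corsi_coef, 1000), 2))  # 2.92
```

Read the other way around, the same coefficient implies about 2.9 extra goals per 1,000 additional Corsi events.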

2) Are you running a multivariate regression for all these variables all at once? I’m thinking that if Fenwick for and shots for are very similar (one is shots, the other is shots + missed shots), maybe the coefficient for shots for eats up a lot of the significance of Fenwick for, meaning that the coefficient you get for Fenwick for is basically the coefficient you’d get if you used only missed shots instead of Fenwick.

Just thinking about this in hockey terms…getting an extra ~2 goals from ~300 shots means you’re shooting 0.75% on those marginal shots, which seems awfully small for any shot on goal.

Off the top of my head, a simple linear regression between wins and goal differential suggests 5-6 goals per win. A simple linear regression between wins and Corsi has a r value of something like 0.5. (That means it “explains” 25% of winning, right?) The spread between teams in Corsi differential for 2011-12 was something like +600 at best to -600 at worst. According to your conversion between goals and Corsi, that’s only worth ~3-4 goals, or a little more than half a win. Yet, the spread between those top teams (like LA and Detroit) and bottom teams (like Edmonton and Minnesota) was far greater than half a win. How do you reconcile all this information?

All good stuff. I am going to run the regression without Corsi and Fenwick. I will use shots for, blocked shots and missed shots. I will post and try to discuss some of these results. It is a multivariate regression.
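The overlap raised in the question can be written out directly. Taking Corsi as shots on goal plus missed shots plus blocked shots, and Fenwick as shots on goal plus missed shots (the definitions used in this thread), a sketch of the algebra for a model that contains both shots for and Fenwick for, with S, M and B as shorthand symbols introduced here:

```latex
\begin{aligned}
\text{Corsi} &= S + M + B, \qquad \text{Fenwick} = S + M \\
\text{GF} &= \beta_0 + \beta_S\, S + \beta_F\,(S + M) + \dots \\
          &= \beta_0 + (\beta_S + \beta_F)\, S + \beta_F\, M + \dots
\end{aligned}
```

Here S is shots on goal, M is missed shots and B is blocked shots. Once shots for is already in the model, the Fenwick coefficient only measures what missed shots add, which is why splitting the regression into shots for, missed shots and blocked shots is the cleaner specification.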

3) Can we flip this around for goals against? I don’t think the argument has been that Grabovski = more Corsi For, necessarily, but that the Caps will have a (far) greater proportion of Corsi events go in their favor with Grabovski than with Ribeiro. So there’s a defensive benefit as well. I’d say that probably exceeds the offensive benefit.

Also, why include defensive numbers like shots against and sv% in a regression for goals for? Just for completeness’ sake, or is there a “hockey reason” ?

I am going to look at shots against, blocked shots for and the other team’s missed shots. I will report these. The defensive stats are for completeness. Nothing is significant, so they shouldn’t make much more of a difference than those dummy variables.

4) “In the end, I stand by my argument that the idea Grabovski would go from a career 45-50 point scorer to a 60-70 point guy was hyperbole.”

I don’t like throwing out seasons entirely, but it seems like there are pretty good reasons to *heavily* discount Grabovski’s ’13 season. Do that, and he’s slightly more than a 45-50 point player. More like 50-55 (assuming ~75 GP). Throw in a bonus for better team and linemates and you get to 60-65, as long as age hasn’t taken much out of him.

“I basically argued that despite the improved advanced stats, it seemed crazy that any one person’s numbers would jump that high; thus, the Caps roster is at a net loss without Ribeiro, add Grabo.”

A 10 point jump isn’t crazy, IMO. Ribeiro had a sharper increase moving from Dallas to Washington. We even see those sorts of fluctuations by a single player all the time without a big change in teammates or situations, year-to-year. All it takes is Grabovski having a bit of an up year.

Good points….let’s keep this going and see what can be explained.

5) “We know he is not an assist guy, so I think it can be deduced that he will not likely raise the shooting percentage for others (give them good chances).”

Ribeiro does a lot of that on the power play–he has that skillset. I think it’s more apt, on the power play, to compare what Ribeiro has done to what Erat and Perreault–other skilled playmakers–can do, since I don’t think Grabovski will merely replace Ribeiro in all situations–I bet he’ll play less on the PP and more on the PK.

At even strength, from 2009-2012, Grabovski’s on-ice shooting% was actually a hair higher than Ribeiro’s.

It is hard to go from individual impact to team stats. What I am trying to show is what the effect on offense and defense will be if one player increases or decreases any stat (either alone or by making others better). I will run this on goals against as well and put that in my next post.

I think saying the Caps will be worse without Ribeiro and with Grabovski, will be worse with Grabovski than they were last season, and will be a worse team are all different things.

It should be obvious that team results and actual team quality aren’t always the same. Sometimes, teams get fortunate and win games they probably shouldn’t have, and when there are 30 teams, there are bound to be a few that win a lot they shouldn’t have, or lose a lot they shouldn’t have.

I think this idea is relevant to the first and third statements. There’s evidence that the Caps were in the first category–the Caps excelled in the things that are pretty random (shooting% on the power play, for example) and not so much in the things that are far less random (Corsi–although they got better as the season went along, they weren’t ever good, exactly). It makes intuitive sense to me to equate team quality with all the attributes that fall in the second category.

Will the Caps be a better Corsi team with Grabovski than Ribeiro? We all agree, yes. Does that mean the team is better? The Corsi bump is supposed to be bigger than the shooting% decline from losing such a skilled playmaker, so I think it does, on the whole. Does that mean we think the roster is better? Absent significant coaching/strategy changes, yes, I think that’s the logical conclusion.

The second statement is different because you’re comparing to how you did last season. If a team underperforms (i.e. gets unlucky) one year, they could not change a single thing and get better results the next. But a team that overperforms could improve their team quality but get worse results in the standings. (It is just a sample of 48, or 82, games we’re looking at, after all.) I think the Caps fall in the second camp with this signing.

I am hoping the luck factor gets taken out by using the larger sample and looking at the stats both with and without last season. So far, the lockout season messes up the advanced stats (makes some things statistically insignificant).

6) What of the effects of full seasons of Martin Erat (who is a perennial 45-55 point player himself) and Brooks Laich (who was an ironman before the ’13 season) ?

These should be included in the larger sample of stats. In the long run, what I am thinking is that it would require the entire team, with the coaching impact, Grabo, Laich, etc., 340+ Corsi points to get one goal. So how much can one player impact the score? Not much, of course, no matter who it is. But maybe we can get some insight into what matters and what doesn’t.

I think you’d enjoy a read of the body of knowledge out there already about what stats correlate to winning (yes, goal percentage is #1 for teams) and which stats persist from season to season (for individual players, it is NOT shooting %).

What we know is that an individual’s shooting percentage is largely luck-driven in the NHL, as it doesn’t persist from season to season, whereas puck possession is more stable and ultimately more useful in projecting future performance.

I was trying to explore more into this Grabovski thing and see how some of the Corsi/Fenwick, etc. are supposed to lead to Caps improvement….

According to the regression I ran, while possession may be more stable, it doesn’t have a statistically significant effect on goals for or goals against. I understand that if we are trying to predict a player’s performance based on his shooting percentage, it won’t help us. However, teams don’t score more or get scored on less based on any of the Corsi/Fenwick stats. Maybe pure time of possession makes a difference, but I don’t have those stats.

What I am getting at is that you are probably right that Corsi and Fenwick can help us determine a player’s future performance best. But, how a player increases Corsi or Fenwick for his team does not affect the scoring according to this regression. Thoughts?

I use a five-season sample (w/o the lockout season last year, n=150) and a six-season sample (with last year, n=180), so the small sample size shouldn’t be too much of an issue. I assume you are saying I should use averages because these stats are luck-based and fluctuate on chance. I will do some work to look at the stability of this stat in my next run/post.

I was actually planning on writing my own piece on this issue, albeit without the advanced stats work. Great job and thanks for doing so much leg work.

I am a proponent of greater advanced stats usage, but ultimately it’s production that counts. I’m excited for Grabo on the Caps, but I certainly agree that it’s a net loss without Ribeiro. All the advanced stats in the world don’t mean that Ribz’s shockingly consistent goal and assist production aren’t as good as Grabovski’s lesser production. I think your work here goes toward explaining why that discrepancy exists.

1. you’ve vastly narrowed your argument. your initial argument was a broad assailment of all things “Capital.” You can hardly say that your article was just “we cannot assume that Grabovski will get more points with the Caps,” which seems to now be the only point which you claim to have ever tried to make. I don’t think I have to use specifics here – as there has already been point/counterpoint again and again addressing the extremely incorrect assumptions you made about the Caps, in a broader sense than Grabo/Ribs. your initial rant against the Caps was what got you attacked.

2. the entire premise of this analysis is flawed. Since when is “Goals-For” a fundamental and ironclad predictor of a team’s success? Yes, sometimes the top teams in the league score the most goals, but also many times that is not the case (goal differential is far more important – I see goals for % listed as a variable, but was it incorporated into the y axis, or was it just a regression analysis against “Goals For” and nothing else?). I would argue, and I know others have as well, that Goals For/Against is not a great predictor of what makes a “good” team (those stats can easily be skewed – imagine a team that is very good, but has a few bad games in which they collapse and lose by like 6+ goals, or goes through a period of time when they have injuries and don’t score much, but are incredibly good when healthy). Also, you seem to be addressing only the point of regular season stats – not of playoff success, which is very different, and requires more than just offensive talent.

3. you are analyzing team stats… when we are talking about the effect that one single individual might have in his localized environment. your argument basically goes like this: “because over the last few seasons team corsi/fenwick numbers have not significantly influenced the Goals-For stat for teams, the argument that Grabovski might be a better fit for the Caps than Ribeiro is flawed, as it is based partly on those stats.” Tampa scores a lot of goals… and they usually suck. this analysis literally has NOTHING to do with the case of Ribeiro/Grabovski. also, anyone can see the flaw here via your absurd conclusion: that only if Grabo and his influences added 1000 shots directed at the net would you see an increase of 1.7 goals per season. that is a conclusion that shows that this statistical analysis you have done is not of substance…

4. All of this debate is outside of reality… in my opinion, the argument was never about individual stats, and whether Grabovski will DEFINITELY have better numbers than Ribeiro (no one can know that, and even if he DOES have better stats, that doesn’t even mean for sure he is a downgrade). For one, Ribeiro is gone. that’s said and done, and it’s a different argument. was Ribeiro bad? no. Should the Caps have given him what Phoenix gave him? In my opinion, no. The point is that Grabovski has a lot of characteristics that really work with the Caps, and he will be a great replacement, and much better than locking in Laich as a replacement. this has turned into a bizarre discussion of advanced statistics, when it started as just an all out rant against the Caps (and to some extent, against Caps fans).

5. here is how Grabo can influence his team with possession stats, but without scoring: 2nd line comes up, drives possession, spends lots of time in the offensive zone, wears down other team, taxes the goalie. For whatever reason, they have a low shooting % or bad luck, or many of their shots miss, or whatever – but the net effect they have is that the other team isn’t as dangerous, they are tired out more and chased around more, regardless of *exactly* how good the 2nd line stats are, or Grabo’s individually. Other lines come on and score in general slightly more often because the other team is that much less dangerous after getting dominated by a more consistent second line. take out Ribeiro’s PP production (which is in a bubble) and their line really didn’t drive the game that way – that is I think our main point. it’s not that ONLY because of advanced stats are we happy about Grabo – and that by (not really at all) “concluding” that those stats are meaningless, you think we should change our minds and be like “oh ok, corsi isn’t a predictor of team goals over the season – therefore we should’ve kept Ribs.”

sorry for the long response – but I feel that if you took the time to write another article, I will take the time to respond.

Hi Nathan…thanks for the detailed reply. I appreciate it a lot as I am trying to make all of this fit together from a stats perspective. With the stats, I am trying to be unbiased….where my article was opinionated (with what I felt was some decent evidence). I replied to each of your points below:

1) I have begun to peel the layers on my argument. I still think the Caps roster is at a net loss with Grabo, minus Ribs. I was hit by advanced stats folks because I said it was grasping at straws to jump over Ribs’ career production and try to predict Grabs would be better, when his stats never show that….well, the plain stats anyway. I am sticking with this, but I am trying to get to the root of what some said: the “Grabs makes people better” theory, the PP vs. regular strength argument, etc. This is a start…I could be wrong, but I want to examine the things we hear to see if they are accurate. Maybe in the end we can get to the Caps roster being at a net loss/ejected from first-round of the playoffs argument.

2) I put together 2 more shorter posts today addressing some of these things. Looking at both Goals-for and Goals-against, there was no effect from Corsi/Fenwick. As far as ironclad goes…that is a hard standard to get to. To your larger point, I agree. Goal differential has the same problem, though. What if you win games by a goal, but get smashed when you lose…you could have a decent record and a pretty bad goal differential. I think the way to analyze this would be to have these stats by game for the six seasons, but not sure that is available. I will run a regression and post the results though with season goal differential.

3) I agree with what you are saying here for the most part. What I am trying to see is if the rise in team Corsi or team Fenwick over a season affects goal scoring or goal defense…and if any one player can influence those things that much…or if those things even matter. So far, they don’t influence goals for or against…the shots they leach from shots on goal is the only thing that gets them close to being statistically significant. It appears shot selection and accuracy are the big factors. Still more to examine.

4) I don’t consider this discussion bizarre. Grabovski is an upgrade to Laich, ok. Ribs was a fairly sizable loss. I just don’t think he gets them any further….

5) This is possible….but Corsi or Fenwick doesn’t say a lot to me about spending a lot of time with the puck or being in the offensive zone. Are those quality shots that missed or were blocked? Did the team set up, pass the puck around a bit…I don’t know. I think actual time numbers would be better here, but they are not in HA’s data set. Are players just tossing the puck at the net? I don’t know. Because of the salary cap, I don’t know whether Ribs could have stayed….I don’t think Grabo is a suitable replacement to advance the Caps from their first-round rut.