She Was Too Good To Me: The Mirage of the Krueger Power Play

I write about hockey. I love the Oilers. My dad taught me that the Canucks are and always will be a bunch of classless bums. I also love jazz, films, coffee and comics. Twitter: @RomulusNotNuma. Email: romulus @ theoilersrig.com (no spaces)

*As head coach for the 2012-13 season, Krueger continued to ice a successful power play, good enough for 8th in the NHL (20.1%)

*Since firing Krueger, the Oilers’ power play has hit the skids and fallen to a paltry 21st in the NHL (17%)

Overall, the narrative reads like this: Krueger was a power play ace and his loss is directly responsible for the lacklustre power play performance of the team.

Two powerful lines of thought reinforce this narrative.

1. The young, talented Oilers’ forwards surely possess a natural talent for power play success. Failure to produce with this group is seen as a special brand of failure, which only serves to highlight all the more the apparent successes of the Krueger years.

2. The Oilers of the last several years have been so woefully bad that any mark of success stands out as a Herculean feat.

The narrative, cut through with these lines of thought, lends the Krueger years an aura of respectability in the midst of years of failure.

“Well,” the story reads, “at least Krueger got something, anything, right! None of the other guys, including the current guy, can even manage that!”

There are other narratives about the Krueger years [the 24th overall standings’ finish; the “handling” of Yakupov; the use of the “Kid Line”] that deserve some sober second thought, but it is the persistence and ubiquity of the power play myth that is in the greatest need of a review.

An Aside on the Moral Psychology of Sports Chatter

One of the major pitfalls radiating out of nearly all traditional sports coverage is an uncritical commitment to a rather stale conception of subjectivity, wherein the individual operates as some kind of back room agent for their body, freely gazing over the field of possible future events and in clear command of actual outcomes. Let’s call this The Agent-Centered Model. This model fetishizes the individual and its rational command over both body and environment.

The benefit of such a model is twofold. It reduces the complexity of events in the world down to a simple agent-centered model of actions willfully performed by agents (X did Y). And it opens the door for the intrusion of moral interpretations of events (X did Y, or X is responsible for Y).

Such a model, wherein individuals are in control of themselves and their environments and in wilful command of their actions, lends itself to a simple series of narratives. Each event allows for the allocation of praise/blame based on outcomes. The knotty confusion and noise of how events come about is reduced down to a simple story: X is to be praised for performing Y, just as Z is to be blamed for failing to stop the performance of Y.

This is the kind of thinking that turns players into heroes who “rise to the occasion” and villains who “collapse under the pressure.”

The major benefit here, aside from the tidiness of the storyline, is the ego-benefit the readers/viewers get from this kind of analysis. We all get to indulge in the fantasy of the free individual mastering the universe.

The major cost is the self-congratulatory tunnel-vision occasioned by this kind of thinking. Simple narratives (especially ones that flatter the egos of fans: “You are in control of your life!”) wear deep, deep grooves that make it increasingly difficult to add information and critical perspectives.

On Luck and Repeatability

One of the strongest arguments the advanced stats revolution has made against traditional stats and the narratives that attend them revolves around the questions of luck and repeatability.

The reason luck is such a powerful anti-narrative concept is that it throws a wrench into the workings of The Agent-Centered Model. The clean lines of subject -> action -> assignation of praise/blame are disturbed by luck. Once luck enters the picture, one is made to come to terms with how limiting the stories of The Agent-Centered Model are, i.e., how little we actually learn from them about what has taken place.

What do we mean by “luck”?

A persistent problem in any sports related conversation about luck (aside from the line of argument that simply dismisses luck out of hand) revolves around framing the issue. There are two frameworks in which luck gets brought up and the two are often confused.

1. The most common use of luck, one familiar to even those who reject or downplay its effects, is in relation to the singular event. For example, let’s say a puck takes an errant bounce off a skate and goes in the net. In this case, we find ourselves hard pressed to unreservedly praise the goal scorer for his good fortune, even if we accept the adage “one creates their own luck.” Something holds us back from committing ourselves to the Agent-Centered Model here. Mentally, we add an asterisk to the event and make a note that reads: luck.

2. When an advanced stats proponent uses the term luck, however, they tend to be referring to a series of events. Over a long enough stretch of track, any given player is going to accumulate a record of performance, what we might call their “true talent.” For example, with several NHL seasons of data, we can come to some fairly definitive conclusions about a player’s strength as a goaltender by looking at their average save percentage (at evens) over that period. The large sample size smooths out the uneven performances of daily life and fairly reliably tells you who will perform well/poorly in the future.

In this case, “luck” refers to outliers within the larger pool of data. These outliers are either bracketed off (as in the case of situational play, i.e., power play and penalty kill in favor of even strength), or smoothed out (as in the case of truly great and lousy performances). The goal of identifying luck in this regard is to reach a conclusion about a player’s true talent, i.e., a repeatable talent. This approach straightforwardly challenges any robust Agent-Centered Model insofar as it attempts to account for all manner of factors outside the control of individuals. Built into this perspective is a rejection of the individual as master of his destiny.

The difference between the two kinds of luck involves making a change of perspective from evaluating a local phenomenon (an isolated event, in which a question of praise and blame is at stake) to evaluating a global phenomenon (a series of events, in which a question of repeatability is at stake).

Let me put it this way: in the first instance, the one most are familiar with, the goal of the conversation about luck is to determine whether a player is responsible for a particular outcome. In the second instance, the goal of the conversation about luck is to determine a repeatable level of performance.

To be clear, in this article we are concerned with the second kind of luck.

The Power Play

The reason I bring up the Agent-Centered Model is the following: I believe the narrative surrounding Krueger’s power play success is heavily luck-dependent. In a narrative scheme whereby “the results speak for themselves” and agents are unambiguously responsible for outcomes, Krueger is easily the hero of the movie. Once we account for luck, however, the entire narrative apparatus collapses on itself.

Let’s go through some background here. Some of the big kids of the advanced stats community have done the heavy lifting for us and the results are clear.

Power play efficiency (a team’s power play conversion rate) and power play shooting percentage are not reliable indicators of the strength of a team’s power play. From Desjardins’ study:

Shooting rate – shots for per 60 mins at 5-on-4 – is by far the most persistent talent. PP efficiency and especially shooting percentage regress much more heavily to the mean. A team’s shooting percentage in one half of its games has almost no predictive value for the other half of its games… Shooting percentage has very little predictive value as we might have expected, but surprisingly (or not surprisingly to those who follow the line of reasoning behind shot differential metrics), the rate at which teams shoot on the PP is a better indicator of their future power-play “efficiency” than their past power-play efficiency.
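For readers who want to see what a "one half of its games versus the other half" check looks like in practice, here is a minimal sketch. It is not Desjardins' code: the odd/even game split, the function names and the input layout (a list of per-game numerator/denominator pairs for each team) are all assumptions made for illustration. Feed it (shots, minutes) pairs to test shot rate, or (goals, shots) pairs to test shooting percentage.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def half_rate(games):
    """Pooled rate for one half: total numerator over total denominator."""
    return sum(num for num, _ in games) / sum(den for _, den in games)


def split_half_correlation(per_team_games):
    """Correlate each team's rate in odd-numbered games with its rate in
    even-numbered games.

    per_team_games: one list per team, in schedule order, of
    (numerator, denominator) pairs. The odd/even split is an assumption;
    the quoted study may have split its games differently.
    """
    first_half = [half_rate(games[0::2]) for games in per_team_games]
    second_half = [half_rate(games[1::2]) for games in per_team_games]
    return pearson(first_half, second_half)
```

A value near 1 means the stat carries over from one half to the other (a repeatable skill); a value near 0 means it tells you next to nothing about the other half. Note that the sketch divides by zero if every team posts the identical rate in a half, so it needs real variation between teams.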

Just like Gabe Desjardins found, shot production is a better predictor of future powerplay success relative to raw performance (with respect to 40 game sample sizes). And while missed shots have some informational value, blocked shots do not.

it appears that Fenwick For/60, with misses and blocked shots adjusted for scorer bias, is the best predictor of power-play success (GF/60), and well correlated with winning (Pts/game).
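Since FF/60 carries the rest of the argument, it may help to spell out the arithmetic. The sketch below assumes the usual definition of Fenwick: unblocked shot attempts, meaning shots on goal (goals included) plus missed shots, scaled to 60 minutes of ice time. The function name and the sample inputs are made up for illustration, and the scorer-bias adjustment mentioned in the quote is not attempted here.

```python
def fenwick_for_per_60(shots_on_goal, missed_shots, toi_minutes):
    """Unblocked shot attempts per 60 minutes of ice time.

    shots_on_goal: shots that reached the net, goals included
    missed_shots:  attempts that missed the net (blocked shots are excluded)
    toi_minutes:   time on ice in the situation being measured, e.g. 5-on-4
    """
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return (shots_on_goal + missed_shots) / toi_minutes * 60


if __name__ == "__main__":
    # Hypothetical inputs, not real Oilers numbers.
    print(fenwick_for_per_60(shots_on_goal=300, missed_shots=100, toi_minutes=400.0))
```

Because the denominator is time rather than number of opportunities, a 15 second power play and a full two minute one are weighted by how long they actually lasted.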

If we make a chart of the Oilers’ power play stats, including efficiency, shooting percentage and Fenwick For per 60 (FF/60), for the past 4 years, we get the following:

[note: I’ve included the pre-Krueger year for reference. While Krueger was the Oilers’ associate coach during the 2010-11 season, I cannot find a record of whether he was in fact responsible for running the power play or not during that season. What is clear, however, is that the Krueger-Power-Play-Narrative is contingent upon Krueger taking over the power play and dramatically fixing it only the following year. See these narrative reference points here, here and here]

Looking at this chart, we can point out a couple of things.

1. The Oilers’ power play under Krueger enjoyed an unprecedentedly good run of shooting. Good enough for 1st and 2nd in the league.

2. Despite this, the Eakins year, lamentable as it is, actually managed to produce more unblocked shot attempts per 60 than either of the Krueger years.

3. During the Krueger years, the Oilers managed their remarkable shooting percentage while ranking among the league’s worst in unblocked shot attempts.

4. For reference, during the Krueger years, the elite power play teams managed c. 75 FF/60 (see here and here).

5. Based on FF/60, the Oilers of recent vintage have absolutely not had anything like a good power play, including under Krueger.

She Was Too Good To Me

The problem with looking at the power play in terms of efficiency and shooting percentage is that it lulls you into the mysteries of a fickle lover. Shooting percentage (as those who have watched Jordan Eberle or Nail Yakupov with any rigor are no doubt aware), especially on the power play, can vary widely from year to year.

The Oilers today are the same as they were before and during Krueger’s tenure: unable to reliably generate unblocked shot attempts on the power play. The story of the Krueger years is not “radically improved power play” but rather “exceptionally lucky power play.” Until the Oilers find a way to radically improve their shot generation abilities with the man advantage they are doomed to wait on luck, a mistress unlikely to strike thrice.

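There has been a lot made of the recent hiring of assistant coach Craig Ramsay, who is said to be taking over the Oilers’ power play. As we wait to see what he comes up with, let’s be clear about a few things:

1. The power play was ghastly under Eakins (26th in FF/60)––the league worst 13 short handed goals against does not help matter.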

2. The power play under Eakins was no worse, however, than it was under Krueger. It was, in fact, marginally better.

3. As Ramsay casts about for power play models of success, the last thing he should do is home in on what Krueger did. That kind of luck does not come with a blueprint. He should look to what San Jose has been doing for the past 5 years instead (they just happen to be the number one team in FF/60 five years in a row).

The Blinders of Narratives

Part of the project of advanced stats is to trouble the stories we tell about hockey with reality and all its thorns. Bringing luck into the conversation accomplishes two things: it strips away factors that are beyond an individual’s control and it lays bare the narratives that are built upon illusory foundations.

Accounting for luck, we can say Krueger constructed a poor power play that was unlikely to repeat its apparent success if he remained head coach. We can also see that results (i.e., the power play efficiency rate) are far less reliant upon an individual’s actions than the narrative assumes. Put another way, we can see that the Agent-Centered Model is deeply flawed.

Hey Rom, interesting article. One quibble – I always have concerns with the use of the word ‘luck’, which creates this picture to those unfamiliar with numbers of a completely random event uncorrelated with skill, effort, etc. As in “winning the lottery is luck”. In the case of items like PP sh% and shot rate, the differences in sh% have been shown to be ‘luck’ (non repeatable), but PP shot performance might better be termed ‘random variation’ rather than luck. Good teams can do poorly and bad teams well over a season, but in the big picture, it *is* repeatable, it’s the only thing shown to be truly repeatable, and perhaps that’s the point.

Just for sh*ts and giggles, I took your premise and reversed the approach as a thought experiment.

What if you had 30 teams, all with *exactly* the same PP opportunities (3 2-min powerplays per game), exactly the same shot rate (0.9/min) and exactly the same sh% (10%). If you simulated a season applying these rates, how much variation could you expect to see in a season? (The simulation simulates the shot/goal percentage for each team’s 82-game season, with a powerplay ending if/when a goal is scored):

Bear in mind, these 30 teams are EXACTLY the same, yet you see some wide variation in results (exactly what a statistician would expect to see):

Stat: Min, Max, Range
Mins: 426.6, 459.3, 32.7
Shots: 354.0, 436.0, 82.0 – the best team is a shot per game better on the PP, EVEN THOUGH THEY ARE EXACTLY THE SAME
Goals: 31.0, 53.0, 22.0 – the best team almost doubled the worst team, EVEN THOUGH THEY ARE EXACTLY THE SAME
Eff%: 6.8, 12.4, 5.6
Sh%: 7.7, 13.3, 5.6
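For anyone who wants to try this thought experiment at home, here is a rough sketch of one way to set it up. It is not the commenter's actual simulation, which was not posted. The parameters (30 teams, 82 games, three 2-minute power plays a game, 0.9 shots a minute, 10% shooting, power play ends on a goal) come from the comment above; modelling shots as arriving at random with exponential gaps between them is an assumption, as are all the function names.

```python
import random


def simulate_team_season(rng, games=82, pps_per_game=3, pp_minutes=2.0,
                         shots_per_min=0.9, shooting_pct=0.10):
    """Simulate one team's power plays for a season.

    Assumption: shots arrive at random, with exponentially distributed
    gaps averaging 1 / shots_per_min minutes. Each shot scores with
    probability shooting_pct, and a goal ends the power play early.
    """
    minutes = 0.0
    shots = 0
    goals = 0
    for _ in range(games * pps_per_game):
        clock = 0.0
        while True:
            clock += rng.expovariate(shots_per_min)
            if clock >= pp_minutes:
                # Penalty expired before the next shot arrived.
                minutes += pp_minutes
                break
            shots += 1
            if rng.random() < shooting_pct:
                goals += 1
                minutes += clock  # power play ends at the goal
                break
    return {"minutes": minutes, "shots": shots, "goals": goals}


def simulate_league(seed=0, teams=30):
    """Every team gets identical talent; only the dice differ."""
    rng = random.Random(seed)
    return [simulate_team_season(rng) for _ in range(teams)]


if __name__ == "__main__":
    league = simulate_league()
    for key in ("minutes", "shots", "goals"):
        values = [team[key] for team in league]
        spread = max(values) - min(values)
        print(f"{key}: min={min(values):.1f} max={max(values):.1f} range={spread:.1f}")
```

The exact numbers will differ from the ones above depending on the seed and on modelling details, but the point of the exercise is the same: change the seed and watch how far apart thirty identical teams can land.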

Luck is a bit of a strange word to use, but I have no idea what a good alternative is. I think what gets people is that they view it as a catch all for the whole thing (so they think it means the whole Oilers PP was based on luck), when really it refers to the fact that it beat the expectations.

Eberle has a monster year and everyone starts talking about 35 goals and 80 points. He’s never come close to those since. He had a career year because his shooting % was high, so were the guys on his line. That year was a product of luck, which can manifest itself anywhere on the ice. Sometimes you fall down behind the net, get up slowly and start skating back only to have the puck end up on your stick right in front of the goalie. Sometimes you get the puck settled at the right instant and get a quicker, harder shot off that beats the goalie. You get enough of these little breaks over the course of the year and it adds up to a good season, one that you might have issue with replicating in the future.

The “luck” card that advanced stat junkies apply when they are just plain wrong is pathetic. Some players (and some systems) generate better quality scoring chances and will therefore have better “luck”. Krueger’s powerplay generated “luck” two years in a row (would that qualify as repeatability?)

How about the shorthanded goals given up by the Eakins “power”play? Was that just bad luck?

Despite your long-winded attempt to prove that up is down the fact remains that Eakins had problems with systems last season. He tried to fit players to his system when a good coach adapts the system to the player’s strengths. Hopefully some deranged advanced stats guy doesn’t convince him that he should just stay the course and let the law of averages fix the powerplay.

By the way, I am not an advanced stats hater. The advanced stats are useful tools, but one should not overestimate the value of any one evaluation tool as each has its weaknesses. This would clearly be a case where you rely too heavily on Fenwick. Until they start awarding points based on Fenwick stats, I will take those numbers into consideration, but not rely on them too heavily.

You are quite confused here, which appears to be a product of not reading the article.

1. Of course some players and teams enjoy better finishing capability. This isn’t in question in my piece at all. The question is, what is more reliably repeatable: efficiency, sh% or shot attempts? And which correlates more strongly with PP success (scoring goals)?

The answer is clear.

2. As mentioned: “The power play was ghastly under Eakins (26th in FF/60)––the league worst 13 short handed goals against does not help matter.”

I think the point that Rom is making here, kobo, is this: If every season you both made a bet as to who could predict the top 5 powerplays of the league… and he used Fenwick as his predictor and you used something else… 10 years from now Rom would have all of your money.

In other breaking news, the fossil record indicates that e.g., modern humans and modern great apes have a single common ancestor. Also, bears sh*t in the woods.

The other thing to consider is that PP% is a pretty poor indicator of power play success. For example, you are playing 4 on 4 and your guy comes out of the box and the other team still has 15 seconds, you get a 15 second PP. If you don’t score, you are 0/1.

Or your team gets an unusual amount of 5 on 3 or 4 on 3 PP’s in a year, both of which are much better opportunities than 5 on 4. You’ll see an uptick in your %.

So you might benefit one year with a shooting % on 5on4, you might benefit another year by having more time 5on3/4on3.

For instance, in 2013-14 the Oilers had 12:39 of 5on3 time and 8:51 of 4on3 time. In the lockout shortened 2012-13 season, the Oilers had 10:59 of 5on3 time and 7:22 of 4on3 time.

Non-5v4 situations went from 7% of their total PP time in 2012-13 to 4.2% in 2013-14.

You expect a better shooting % in those 5on3/4on3 situations, so the 2012-13 season isn’t as surprising as the 11-12 one was. That was the year RNH basically got to sit on the half wall and toss daisies into the middle of the PK box and the Oilers were just killing it.

Holy hell that was a long winded explanation of what you deemed to be luck … With nothing to really back up your assessment except the hard numbers that conclude Ralph having a better pp … Oh the dog days of summer as a hockey fan. Thanks for confusing the hell out of me and wasting my time Rom

What was confusing about my discussion of luck? I’d be happy to take another run at it if you are confused.

The “hard numbers” show two things:

1. The Krueger years produced one of the worst power plays in recent history. It was abominable at generating shot attempts. We know from the studies cited, this (unblocked shot attempts) is the most reliable indicator of PP strength (or, it is the most repeatable skill).

2. The Krueger years also enjoyed one of the best shooting % power plays in recent history. It was exceptional. We know from the studies cited, this (shooting % on the PP) is a heavily luck-dependent factor (or, it is unrepeatable luck).

Ok I’ll bite. The thing that hard stats don’t account for is quality of shots/chances. You can’t tell me the system that gave up the most shortys out of the whole league is superior to the one Ralph had. It’s just categorically wrong. Case in point: Eakins never used Yak’s shot on the pp…. Almost all year! That shows a blatant disregard for a successful attempt at the season, and IMHO should have cost the man his job. I agree with 90% of what you post Rom but I couldn’t let this one go.

“You can’t tell me the system that gave up the most shortys out of the whole league is superior to the one Ralph had.”

I never claimed such a thing.

I simply stated the fact that in this one category Eakins’ team managed to marginally out-produce Krueger.

The entire point is the following: Krueger did not offer a PP model any team could reliably look to for future success. The fact that Eakins didn’t either (which is mentioned in the piece) is irrelevant to that fact.

All the same, thanks for reading and taking the time to engage. Appreciate it