When one looks further at the top 10 best goal scoring seasons of all time, it becomes clear that there are systematic problems. Nine of the ten best goal scoring seasons occurred since 1970. None occurred in the original six years. The other year was Babe Dye's 38 goal year back in 1924/25. Seven of the ten best years on the list were within three years of an expansion. It's clear that players tended to score more adjusted goals in expansion-weakened years. During the original six years, there were no expansions, so those seasons didn't make this list.

There are three specific years from the original six era that I was expecting to see on or near the top ten. I was expecting that Maurice Richard's 50 goals in 50 games season in 1945, Bernie Geoffrion's 50 goal season tying his record in 1961 and Bobby Hull's 1966 54 goal season breaking that 50 goal mark for the first time might feature prominently on the list. Surprisingly, none of these three years are near the top in Peter Albert's top 50 list (only Hull made it, in 40th spot). There are a couple of other original six seasons represented on the list. Gordie Howe's 49 goal year in 1953 and Jean Beliveau's 47 goal year in 1956 find spots on the list in 12th and 30th position respectively. How could an entire era be so underrepresented? Was there really no great goal scorer in that era, or does the analysis systematically overlook them? I think the era is overlooked. With only six teams in the NHL, there were fewer bottom-feeding teams to play against where a goal scorer could pad his totals. In more recent times, particularly during expansions, some bottom-feeder teams existed. This is not to say that the average team or average player is any worse in either era; it is a reflection of the fact that more bad players and teams have existed in the larger league.

How would one attempt to adjust for the quality of opposition, specifically for the presence or absence of bad players and teams that one could pad statistics against? It's not an easy question. In baseball (where sabermetrics is a much more exact science) there are a few indicators of league quality that naturally come out of the statistics. For example, one can use the ratio of double plays to errors as a measure. In a good league (i.e. the majors) there are more double plays than errors, but in a beer league errors are common and double plays are rare. In hockey, there isn't a clear example of a statistic that has been accurately kept since the beginning of the pro game and that can capture the quality of the league.

The best I came up with after thinking about this problem for a while is the standard deviation of the players' ages from a defined mean age for a pro hockey player. In weak years of the NHL, more really young players make the league and more aging players are able to hang on and continue their careers after their value is depleted. In strong years, there are few young players and the aging players tend to be forced into retirement instead of having the opportunity to stick around for another year or two. This is shown in the example of Gordie Howe. He was 18 when he first made the NHL in 1946. This was a league that was beginning to grow in strength after the Second World War had depleted its ranks for a few years. Gordie played until age 43, when he retired from the Detroit Red Wings in 1971. Two years later, he was lured out of retirement to play in the weaker WHA. He continued to have successful years in the WHA until it folded in 1979. He played one more year in the NHL when it expanded to include the surviving WHA teams and retired in 1980 at age 52. These final years would likely not have happened if the NHL (and WHA) had not been weakened by the rapid expansion of the 1970s.
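As a rough sketch of the calculation I have in mind, the spread of ages can be measured around a fixed reference age instead of around each season's own average. The reference age of 27 and both rosters below are made-up illustrations, not real data, and the function name is my own.

```python
import math

# Hypothetical "defined mean age" for a pro player; an assumption, not a measured value.
REFERENCE_AGE = 27.0

def age_spread(ages, reference_age=REFERENCE_AGE):
    """Root-mean-square deviation of player ages from a fixed reference age."""
    if not ages:
        raise ValueError("need at least one player age")
    squared = [(age - reference_age) ** 2 for age in ages]
    return math.sqrt(sum(squared) / len(squared))

# Made-up rosters for illustration only.
strong_league = [24, 25, 26, 27, 28, 29, 30]   # few teenagers, few veterans
weak_league = [18, 19, 24, 27, 30, 38, 43]     # very young and very old players hang on

print(age_spread(strong_league))
print(age_spread(weak_league))
```

Under this sketch, a bigger number would suggest a weaker league, because more players sit far from the reference age in either direction.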

I think this technique of using a standard deviation in age as a proxy for league quality would fail in the early days of pro hockey. In those days, pro players often played around 20 games, and their incomes from hockey were not enough to sustain them for the whole year, so they also worked a second job. It was not uncommon for a still talented player to retire because of pressures from that other job. Moving to a new city for hockey would interfere with their other career. Maybe they were at a point in that career where taking the time off for a 20 game season was too much. So many players retired to pursue their second, often more lucrative, career. This keeps the standard deviation of ages low while also keeping the quality of the league low.

Although I think using the standard deviation of players' ages as a proxy for league quality is probably the best way to try to correct the data for quality of opposition, I don't think it would fully solve the problem. It is a lot of work for little gain, and the problem would still exist after this attempted correction.

The list of the best goal scoring seasons that the hockey outsider produced is a good one. It is good work. Its most glaring problem is the lack of a correction for quality of opposition. This correction is not an easy one to make. It may not be possible to make it in an unbiased manner from existing statistics. I would love some smart person to prove me wrong, but I am not sure it's likely. This method produced a list of players who tend to be from expansion seasons, and it overlooks players from the original six era. Nevertheless, it is a pretty good attempt at solving a complex problem.

As usual, I think pnep provided a wealth of information but without enough explanation of it.

When you list the standard deviation in age, do you include every player who played even one shift in the NHL? Or is there a cutoff for number of games played to be included? In the earliest years of the NHL, does information exist to give us accurate ages of all players? If not, how is this handled? Is the mean age used in these figures the same for each season, or does it fluctuate year by year as the population fluctuates? I assume the units are years, right?
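To show why I am asking, here is a small sketch of how much those choices can move the number. The (age, games played) pairs, the 20 game cutoff and the fixed mean of 27 are all made up for illustration; I do not know which choices were actually used in the posted figures.

```python
import statistics

# Made-up (age, games_played) pairs for one season; not real data.
season = [(18, 2), (19, 3), (24, 60), (26, 70), (27, 68), (29, 65), (41, 4)]

def age_std(players, min_games=1, fixed_mean=None):
    """Population standard deviation of ages (in years) for players meeting a games cutoff.

    If fixed_mean is given, deviations are measured from it instead of
    from the mean age of the players who pass the cutoff.
    """
    ages = [age for age, games in players if games >= min_games]
    if not ages:
        raise ValueError("no players meet the cutoff")
    center = statistics.fmean(ages) if fixed_mean is None else fixed_mean
    return (sum((a - center) ** 2 for a in ages) / len(ages)) ** 0.5

print(age_std(season))                    # every player, season's own mean
print(age_std(season, min_games=20))      # regulars only, their own mean
print(age_std(season, fixed_mean=27.0))   # every player, fixed mean age
```

In this toy season, the teenagers and the 41 year old who played only a handful of games account for most of the spread, so a games cutoff changes the answer a lot.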

Now, the standard deviation of adjusted team wins. You are adjusting for what? Length of schedule? Changes in rules adding shootouts and points for losing in overtime? Anything else? These numbers are in the thousands, yet teams don't win thousands of games a year. Why?

I think the standard deviation in adjusted wins is a measure of competitive balance in the league. Is there parity? That does not necessarily show that the league quality is good. You can have parity in a bad league. You can have one or two really great teams in a high quality league.
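Here is a minimal sketch of what I mean by a balance measure, assuming the only adjustment is for schedule length (by using winning percentage) and ignoring ties, shootouts and overtime points. The win totals are made up, and I am not claiming this is how the posted numbers were calculated.

```python
import statistics

def win_pct_spread(wins, games_per_team):
    """Population standard deviation of team winning percentages."""
    pcts = [w / games_per_team for w in wins]
    return statistics.pstdev(pcts)

# Made-up win totals for two six-team leagues, for illustration only.
balanced_70 = [36, 35, 35, 35, 35, 34]   # 70 game schedule, near parity
lopsided_80 = [60, 55, 40, 40, 25, 20]   # 80 game schedule, two dominant teams

print(win_pct_spread(balanced_70, 70))
print(win_pct_spread(lopsided_80, 80))
```

Either set of win totals could come from a strong league or a weak one. The spread only describes how evenly the wins are shared.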

I want to better understand these numbers. I think they might be very useful, but I need to know exactly what they represent before I misuse them.