Monte Carlo simulations are a fun way to make predictions about almost anything, and hockey is no exception. Many websites and analysts use Monte Carlo simulations to project the final standings and point totals of teams. In this experiment, I build my own Monte Carlo simulation to see whether I can make similar predictions with reasonable outcomes. I then apply the method to the Vancouver Canucks and to the Canadian teams in general.

To learn more about my method, continue past the jump:

Monte Carlo

What is a Monte Carlo? The basic idea is you repeatedly sample something with random input. Wikipedia has a good article on how they work and its basic description is that Monte Carlo simulations are:

a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results; i.e., by running simulations many times over in order to calculate those same probabilities heuristically just like actually playing and recording your results in a real casino situation: hence the name.

We can use this method by iterating over the remainder of a hockey season's schedule many times (10,000 to 25,000 iterations or more) and randomly determining the winner of each game. Over many iterations, the results converge toward a stable distribution of outcomes. This is what I did for the NHL, the OHL, and the AHL. The key to a successful Monte Carlo is choosing a way to pick each game's winner at random that still realistically represents real life.
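As a rough illustration of the loop described above, here is a minimal Python sketch. It is not the code used for this article: the two-team schedule, the starting points, and the helper names `simulate_season` and `run_monte_carlo` are hypothetical, and every game is a plain coin flip worth two points to the winner.

```python
import random
from collections import Counter

def simulate_season(schedule, current_points, rng):
    """Play out the remaining schedule once and return final points per team."""
    points = dict(current_points)
    for home, away in schedule:
        # Placeholder rule (assumption): every game is a 50/50 coin flip.
        winner = home if rng.random() < 0.5 else away
        points[winner] += 2  # two points for a win
    return points

def run_monte_carlo(schedule, current_points, iterations=10_000, seed=0):
    """Repeat the season many times and average each team's final points."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(iterations):
        season = simulate_season(schedule, current_points, rng)
        for team, pts in season.items():
            totals[team] += pts
    return {team: total / iterations for team, total in totals.items()}

# Hypothetical example: four remaining games between made-up teams A and B.
schedule = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]
current_points = {"A": 40, "B": 38}
average_points = run_monte_carlo(schedule, current_points)
print(average_points)
```

With a coin flip deciding every game, team A's average should settle near 44 points (40 plus half of the 8 points still available), which is the kind of convergence the method relies on.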

#FancyStats Method

There are many ways to do this. For example, every game could be a coin flip, with both teams having a 50% chance of winning. It is well known that the home team wins 56% of games, so you could give the home team a 56% chance of winning. You may also have your own idea of how to determine the winner, such as weighting the team that isn't the Oilers a bit more heavily in games involving Edmonton.

I based my method on my previous research, in which I showed that the NHL is statistically identical to a league where 24% of games are determined by the better team and 76% of games are determined at random (21%/79% in the AHL, 43%/57% in the OHL).
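The two simple rules mentioned above could be sketched like this in Python. The function names are hypothetical, and the 56% figure is the home-ice number quoted in the text; the short check at the end only confirms that the simulated home win share lands near that input.

```python
import random

HOME_WIN_PROBABILITY = 0.56  # home-ice figure quoted in the text

def coin_flip_winner(home, away, rng):
    """Both teams have a 50% chance of winning."""
    return home if rng.random() < 0.5 else away

def home_ice_winner(home, away, rng, home_win_probability=HOME_WIN_PROBABILITY):
    """The home team wins with the given probability."""
    return home if rng.random() < home_win_probability else away

rng = random.Random(42)
games = 100_000
home_wins = sum(home_ice_winner("HOME", "AWAY", rng) == "HOME" for _ in range(games))
home_win_share = home_wins / games
print(f"Simulated home win share: {home_win_share:.3f}")
```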

For every game, a random number between 1 and 100 was generated. If the number was less than or equal to 24 (the NHL figure), the game was won by the stronger team, defined as the team with the higher Score-Adjusted Fenwick (estimated possession for the AHL/OHL). If the number was higher, the game was decided by a coin flip in which either team had a 50% chance of winning. OT points were also accounted for: in each game there was a 24% chance that the losing team would earn 1 point (22% for the AHL, 20% for the OHL).
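The rule above can be sketched in Python as follows. This is a reconstruction from the description, not the original code: the function name, the team labels, and the possession ratings are made up, and ties in possession are resolved in favour of the first team listed, which is an assumption.

```python
import random

def simulate_game(team_a, team_b, possession, rng,
                  skill_percent=24, loser_point_percent=24):
    """Return (winner, loser, loser_points) for one game.

    possession maps team name to a possession rating; the higher-rated
    team is treated as the stronger one.
    """
    stronger, weaker = sorted((team_a, team_b),
                              key=lambda team: possession[team], reverse=True)
    if rng.randint(1, 100) <= skill_percent:
        winner, loser = stronger, weaker   # decided by the better team
    elif rng.random() < 0.5:
        winner, loser = stronger, weaker   # coin flip, stronger team wins
    else:
        winner, loser = weaker, stronger   # coin flip, weaker team wins
    # Separate roll for the overtime "loser point".
    loser_points = 1 if rng.randint(1, 100) <= loser_point_percent else 0
    return winner, loser, loser_points

# Hypothetical ratings for two made-up teams.
ratings = {"STRONG": 55.0, "WEAK": 45.0}
rng = random.Random(1)
n_games = 100_000
results = [simulate_game("WEAK", "STRONG", ratings, rng) for _ in range(n_games)]
strong_share = sum(winner == "STRONG" for winner, _, _ in results) / n_games
loser_point_share = sum(points for _, _, points in results) / n_games
print(f"Stronger team wins {strong_share:.3f}, loser point in {loser_point_share:.3f}")
```

Under these settings the stronger team should win about 62% of the time (24% outright plus half of the remaining 76%), whatever the size of the possession gap.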

Results - Ontario Hockey League

To test the model, I ran the numbers on the Ontario Hockey League (OHL) to see what kind of output I could generate. My OHL Eastern Conference predictions from the Christmas hockey break are as follows:

| Team | Predicted Rank | Probability |
|---|---|---|
| Oshawa | 1 | 92.00% |
| Kingston | 2 | 56.44% |
| Sudbury | 3 | 38.00% |
| North Bay | 4 | 41.76% |
| Barrie | 5 | 53.56% |
| Peterborough | 6 | 63.08% |
| Mississauga | 7 | 38.28% |
| Niagara | 8 | 32.00% |
| Belleville | 9 | 36.88% |
| Ottawa | 10 | 54.82% |

and the Western Conference:

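| Team | Predicted Rank | Probability |
|---|---|---|
| Erie | 1 | 80.88% |
| Guelph | 2 | 45.50% |
| London | 3 | 44.58% |
| Windsor | 4 | 55.76% |
| Sault Ste. Marie | 5 | 65.62% |
| Saginaw | 6 | 48.52% |
| Owen Sound | 7 | 38.58% |
| Kitchener | 8 | 54.94% |
| Plymouth | 9 | 53.10% |
| Sarnia | 10 | 61.74% |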

Each OHL team is listed with its most likely final conference placing and the probability that it finishes in that position.
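One way to tally a "most likely rank" and its probability from a set of simulated seasons is sketched below in Python. The three seasons and the teams X, Y, and Z are hand-made stand-ins rather than real output, and ties on points are broken alphabetically, which is an assumption and not the league's actual tie-break procedure.

```python
from collections import Counter, defaultdict

def rank_teams(points):
    """Order teams from most to fewest points (alphabetical on ties, an assumption)."""
    return sorted(points, key=lambda team: (-points[team], team))

def most_likely_ranks(simulated_seasons):
    """Map each team to (most common rank, share of simulations at that rank)."""
    rank_counts = defaultdict(Counter)
    for points in simulated_seasons:
        for rank, team in enumerate(rank_teams(points), start=1):
            rank_counts[team][rank] += 1
    total = len(simulated_seasons)
    summary = {}
    for team, counts in rank_counts.items():
        rank, count = counts.most_common(1)[0]
        summary[team] = (rank, count / total)
    return summary

# Three hand-made "simulated seasons" for hypothetical teams X, Y and Z.
seasons = [
    {"X": 90, "Y": 85, "Z": 80},
    {"X": 88, "Y": 91, "Z": 79},
    {"X": 92, "Y": 84, "Z": 86},
]
summary = most_likely_ranks(seasons)
print(summary)
```

Here X finishes first in two of the three seasons, so it is reported at rank 1 with a probability of two thirds. The same counting applied to thousands of simulated seasons gives one row per team.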

Results - American Hockey League

The next step was to run the model on the AHL to see how likely the Utica Comets, the Canucks' farm team, are to make the playoffs.

| In Playoffs | Miss Playoffs |
|---|---|
| 12.396% | 87.604% |

We can also use the Monte Carlo to see their odds of finishing in each conference position:

| Position | % Chance |
|---|---|
| 1 | 0% |
| 2 | 0.044% |
| 3 | 0.204% |
| 4 | 0.536% |
| 5 | 1.188% |
| 6 | 1.988% |
| 7 | 3.404% |
| 8 | 5.084% |
| 9 | 6.92% |
| 10 | 9.332% |
| 11 | 12.072% |
| 12 | 14.168% |
| 13 | 15.608% |
| 14 | 16.408% |
| 15 | 13.044% |

From this we can see that the Comets are not likely to finish in last place in the conference (about a 13% chance). The model reaches this result because of how much better they are in puck possession than the teams ahead of them in the standings, and how close they are to those teams in points.

Results - National Hockey League

Running the Monte Carlo on the NHL, we can estimate the number of points the Vancouver Canucks will earn as well as their chance of making the playoffs. This Monte Carlo was run during the Christmas break, so the numbers will have shifted slightly by the time you read this, as more games have been played.

| In Playoffs | Out of Playoffs | Points |
|---|---|---|
| 90.40% | 9.60% | 101 |

Similar to the AHL, we can estimate their chance of finishing in each division position:

| Division Place | % Chance |
|---|---|
| 1 | 2.25% |
| 2 | 7.05% |
| 3 | 17.87% |
| 4 | 58.59% |
| 5 | 14.10% |
| 6 | 0.14% |
| 7 | 0.00% |

And just for fun, here is each Canadian team's chance of making the playoffs along with its estimated points. And no, I didn't need the Monte Carlo to know that Edmonton had no chance.

| Team | In Playoffs | Out of Playoffs | Points |
|---|---|---|---|
| Vancouver | 90.40% | 9.60% | 101 |
| Montreal | 89.41% | 10.59% | 95 |
| Ottawa | 18.70% | 81.30% | 83 |
| Toronto | 17.41% | 82.59% | 83 |
| Winnipeg | 12.23% | 87.77% | 87 |
| Calgary | 0.12% | 99.88% | 74 |
| Edmonton | 0.02% | 99.98% | 70 |

Conclusion

My Monte Carlo seems to have had some success. Compared with others, such as SportsClubStats (which uses a simpler method of determining game winners) and more advanced ones such as PuckPrediction, the results seem to be in line. Regardless of the method used to pick individual game winners, the Monte Carlos used by hockey experts appear to agree with each other. It will be interesting to see at the end of the season whether any of them are close to the final outcome.

These aren't my opinions on what the final outcome of these teams will be. They are the statistical results of my model. It's a fun way to predict final outcomes in hockey and seems like a worthy method to try again in the future.

Of course, as with any prediction, a lot can change between now and then. If there's anything I've learned this year from leagues such as the AHL, it is just how volatile they are and how quickly a team's possession rating can change (likely due to large roster fluctuations). Prediction is easier said than done for some leagues.

I am a Van Fan in Bytown. Living in Ottawa for work, I research Sports Analytics and Machine Learning at the University of Ottawa. I play hockey as well as a Timbit, but I compete in rowing with hopes of Olympic gold. Follow me on Twitter at @joshweissbock and feel free to send any questions or comments my way.

it seems, the only reason we haven't been using scientific data to make decisions in hockey, or much else in this world, is because people deemed important by making important decisions will not relinquish their importance to the importance of scientific analysis. that's how you get the "best" american hockey minds talking about dreams they had to determine who should and should not play on their olympic team, a la brian burke and david poile in that illuminating burnside/lebrun article. and the one guy, dean lombardi, to bring up statistics and even provide a portfolio of information on keith yandle is chastised for his efforts by the other know it alls. complex? i think so.

By this method, the winning percentage of a 50% possession team facing a 49.997% possession team would be exactly the same as a 60% team facing a 40% team. In both cases, 24% of the time you assign them a win-by-skill and half of the other 76% you assign them a win-by-luck.
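Written out with the article's NHL figures (24% skill, 76% coin flip), the stronger team's win probability contains no term for the size of the possession gap:

```latex
P(\text{stronger team wins}) = 0.24 + 0.76 \times 0.5 = 0.62
```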

That can't be right.

The model should be set up so that you're adding a random number to the observed difference in skill, with the magnitude of the random number set such that the differences in skill on average will account for 24% of the results. Or maybe a little less than 24%, since possession isn't the entirety of the skill difference between teams.
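A minimal Python sketch of that idea follows. The comment above does not say which distribution the random number should come from, so the Gaussian noise here is an assumption, as are the function names. `NOISE_SD` is an arbitrary placeholder and has not been calibrated so that skill accounts for 24% of results.

```python
import random
from statistics import NormalDist

def noisy_skill_winner(team_a, team_b, possession, rng, noise_sd):
    """Pick a winner by adding Gaussian noise to the possession gap."""
    skill_gap = possession[team_a] - possession[team_b]
    return team_a if skill_gap + rng.gauss(0.0, noise_sd) > 0 else team_b

def win_probability(skill_gap, noise_sd):
    """Closed-form chance that the first team wins under Gaussian noise."""
    return NormalDist(mu=0.0, sigma=noise_sd).cdf(skill_gap)

NOISE_SD = 15.0  # arbitrary placeholder, not calibrated to any league

# The two matchups from the comment above.
close_matchup = win_probability(50.0 - 49.997, NOISE_SD)
lopsided_matchup = win_probability(60.0 - 40.0, NOISE_SD)
print(f"50 vs 49.997: {close_matchup:.4f}")
print(f"60 vs 40:     {lopsided_matchup:.4f}")

# Cross-check the closed form against simulation for the lopsided case.
rng = random.Random(7)
ratings = {"A": 60.0, "B": 40.0}
n_games = 100_000
simulated_share = sum(
    noisy_skill_winner("A", "B", ratings, rng, NOISE_SD) == "A"
    for _ in range(n_games)
) / n_games
print(f"Simulated 60 vs 40: {simulated_share:.4f}")
```

Unlike the fixed 24%/76% rule, this gives the near-even matchup a win probability barely above 50% and the lopsided one a much higher figure. The exact numbers depend entirely on the placeholder noise level.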

Maybe I'm replying out of my place, but the math behind this prediction is pretty simple (not even calculus is really required) - the more difficult part of Monte Carlo is in getting the required number of simulations.

You could do this manually if you had a random number generator handy. It would just take you a long time to do it (think taking 3 random numbers for each game in the season, then repeating it 10,000 times).
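To get a feel for how many simulations are "enough", the binomial standard error gives a rough yardstick for the sampling noise in any reported probability. The Python sketch below assumes the runs are independent and uses the 90.40% Canucks playoff figure from the article purely as an example input. The helper name is hypothetical.

```python
import math

def standard_error(probability, iterations):
    """Binomial standard error of a probability estimated from independent runs."""
    return math.sqrt(probability * (1.0 - probability) / iterations)

# 0.904 is the Canucks playoff estimate from the article, used as an example.
for iterations in (100, 10_000, 25_000):
    se = standard_error(0.904, iterations)
    print(f"{iterations:>6} iterations: +/- {100 * se:.2f} percentage points")
```

At 10,000 iterations the sampling error on a figure like this is roughly 0.3 percentage points. This covers only the noise from the random draws and says nothing about whether the underlying game model is right.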

The various percentages you see in the article (56% of times the home team wins, etc.) are simply taken from historical data (i.e. the results of past games).

Side note: I agree that the method could be even more #FancyStats if you took into account the relative differences in possession, rather than simply picking the higher number. I doubt the final results would be majorly different, but it might swap a few teams around.