Month: January 2016

With the regular Allsvenskan season and the qualification play-off both having ended months ago, I thought that instead of doing a season summary (fotbollssiffor and stryktipset i sista stund have already done that perfectly well), I’d see how my model has performed on the betting market this season. Since my interest in football analytics comes mainly from its use in betting, this is the best test of a model for me. Though I usually don’t bet on Allsvenskan, if the model can beat the market, I’m interested.

Game simulation

To do this, I should first say a few things about how I simulate games. I want my simulations to resemble whatever they are supposed to model as much as possible, and because of this I’ve chosen not to use a Poisson regression model or anything remotely like that. Instead I’ve built my own Monte Carlo game simulation in order to emulate a real football game as closely as possible.

I won’t go into any details about exactly how the simulation is done, but the main steps include:

Weight the data for both sides to account for home field advantage.

Predict starting lineups for each team using their most recent lineup, minutes played and known unavailable players.

Simulate a number of shots for each player, based on his shot numbers and the attacking and defensive characteristics of both teams.

Simulate an xG value for each shot, based on the player’s xG numbers and the attacking and defensive characteristics of both teams.

Given these xG values, simulate the outcome of each shot and record any goals.

Each game is simulated 10,000 times, obviously based only on data available prior to that particular game.
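The steps above can be sketched roughly as follows. This is a simplified illustration, not the actual simulation: the function names, the per-player `(mean_shots, mean_xg)` inputs, the binomial shot count and the normally distributed shot quality are all assumptions made for the example, and lineup prediction and home field weighting are assumed to have happened already.

```python
import random
from collections import Counter

MAX_SHOTS = 20  # assumed cap on shots per player per game


def simulate_team_goals(players, rng):
    """Simulate one team's goals in a single game.

    `players` is a list of (mean_shots, mean_xg) tuples: hypothetical,
    already-adjusted shots per game and average xG per shot.
    """
    goals = 0
    for mean_shots, mean_xg in players:
        # Shot count: MAX_SHOTS attempts, each taken with probability
        # mean_shots / MAX_SHOTS (requires mean_shots <= MAX_SHOTS).
        shots = sum(rng.random() < mean_shots / MAX_SHOTS
                    for _ in range(MAX_SHOTS))
        for _ in range(shots):
            # Shot quality: drawn around the player's average xG and
            # clamped to a valid probability.
            xg = min(max(rng.gauss(mean_xg, 0.05), 0.01), 0.99)
            # The shot becomes a goal with probability xg.
            if rng.random() < xg:
                goals += 1
    return goals


def simulate_game(home_players, away_players, n_sims=10_000, seed=None):
    """Return a dict mapping (home_goals, away_goals) to probability."""
    rng = random.Random(seed)
    scores = Counter()
    for _ in range(n_sims):
        home_goals = simulate_team_goals(home_players, rng)
        away_goals = simulate_team_goals(away_players, rng)
        scores[(home_goals, away_goals)] += 1
    return {score: count / n_sims for score, count in scores.items()}


# Made-up inputs for three players per side, purely for illustration.
home = [(3.0, 0.12), (2.0, 0.10), (1.0, 0.08)]
away = [(2.5, 0.11), (1.5, 0.09), (1.0, 0.07)]
score_probs = simulate_game(home, away, n_sims=10_000, seed=1)
```

Because every goal comes from an individual player's shots, removing a player from the input list is all it takes to simulate a game without him.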

The biggest advantage of this approach is that it’s easy to account for missing players; in fact, it is done automatically. It also seems more straightforward and easier to understand than other methods, at least to me. Another big plus is that it’s fairly easy to modify the Monte Carlo algorithm in order to try new things and incorporate different data. The drawbacks include the time it takes to simulate each game. At 10,000 simulations per game it takes about a minute, meaning that simulating a full 240-game Allsvenskan season would take at least 4 hours. Also, since my simulations rely heavily on up-to-date squad info, a squad database has to be maintained, but this can be automated if you know where to look for the data.

For each game, the end result of all these simulations is a set of probabilities for each possible (and impossible!?) result, which can then be used to calculate win percentages and fair odds for any bet on the 1X2, Asian Handicap and Over/Under markets.
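As a rough illustration of that last step, here is how a table of scoreline probabilities could be turned into 1X2 and Over/Under probabilities and fair odds. The function names and the toy scoreline distribution are made up for the example, and only a half-goal total line is handled; quarter lines need the stake split across two neighbouring lines.

```python
def market_probabilities(score_probs, total_line=2.5):
    """Collapse {(home_goals, away_goals): probability} into market probabilities."""
    home = sum(p for (h, a), p in score_probs.items() if h > a)
    draw = sum(p for (h, a), p in score_probs.items() if h == a)
    away = sum(p for (h, a), p in score_probs.items() if h < a)
    over = sum(p for (h, a), p in score_probs.items() if h + a > total_line)
    return {"1": home, "X": draw, "2": away, "over": over, "under": 1 - over}


def fair_odds(probability):
    """Decimal odds at which a bet with this win probability breaks even."""
    return 1 / probability


# Toy distribution (not real model output); probabilities sum to 1.
toy_scores = {(1, 0): 0.25, (1, 1): 0.25, (0, 1): 0.20, (2, 1): 0.20, (0, 0): 0.10}
markets = market_probabilities(toy_scores)
odds = {name: fair_odds(p) for name, p in markets.items()}
```

Here the home win probability is 0.45, so the fair home odds come out at about 2.22.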

As an example of how the end result of the simulation looks, I’ve simulated a fictional Stockholm Twin Derby game, Djurgården vs. AIK. Here’s how my model would predict this game if it were to be played today (using last season’s squads; I haven’t accounted for new signings and departures yet):

Given these numbers, the fair odds for the 1X2 market would be about 2.31-3.62-3.44, while the Asian Handicap would be set at Djurgården -0.25 with fair odds of about 1.99-2.01 for the home and away sides respectively. The total would be set at 2.25 goals, with fair odds for Over/Under of about 2.04-1.96.
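The Asian Handicap odds follow directly from the 1X2 numbers. A -0.25 bet is split in half between the 0 and -0.5 lines, so on a draw the home backer loses half the stake and the away backer wins half. Setting the expected profit to zero gives the fair odds. The sketch below uses standard quarter-line settlement rules; the function names are my own for this example.

```python
def implied_probabilities(odds_home, odds_draw, odds_away):
    """Convert fair decimal 1X2 odds to probabilities, normalised to sum to 1."""
    raw = [1 / odds_home, 1 / odds_draw, 1 / odds_away]
    total = sum(raw)
    return [p / total for p in raw]


def fair_odds_ah_quarter(p_home, p_draw, p_away):
    """Fair decimal odds for home -0.25 and away +0.25."""
    # Home -0.25: full win on a home win, half the stake lost on a draw,
    # full stake lost on an away win.
    home_odds = 1 + (0.5 * p_draw + p_away) / p_home
    # Away +0.25: full win on an away win, half win on a draw,
    # full stake lost on a home win.
    away_odds = 1 + p_home / (p_away + 0.5 * p_draw)
    return home_odds, away_odds


p_home, p_draw, p_away = implied_probabilities(2.31, 3.62, 3.44)
home_odds, away_odds = fair_odds_ah_quarter(p_home, p_draw, p_away)
print(round(home_odds, 2), round(away_odds, 2))  # 1.99 2.01
```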

Backtesting against the market

My odds history database contains odds from over 50 bookmakers, and timing and exploiting odds movements is a big part of a successful betting strategy, so properly backtesting a model over a full season is not a simple task. I have, however, tried to keep it as simple as possible and set out some rules for the backtesting:

The backtest is based on 1X2, Asian Handicap and Over/Under markets.

Only odds from leading bookmaker Pinnacle and betting exchange Matchbook are used. In a later post I may run the backtest against every available bookmaker to find out which are best and worst at setting their lines.

Two variations of the Monte Carlo match simulation are tested, where Model 2 weights home field advantage more heavily.

Only opening and closing odds are used in an attempt at simulating a simple, repeatable betting strategy.

For simplicity, the stake of each bet is 1 unit.

Since my model seems to disagree quite strongly with the bookies on almost every single game, high-value bets seem to exist suspiciously often. To get the number of bets down to a plausible level, I’ve applied a minimum Expected Value threshold of 0.5. As EV this high is usually only seen in big underdogs, this may be an indicator that my model is good at finding these kinds of bets, or that it is completely useless.
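A minimal sketch of such a filter is shown below. It assumes EV is measured as expected profit per unit staked (model probability times decimal odds, minus one), which is one common convention; the function names and candidate bets are made up for illustration. It covers simple win-or-lose bets only, not quarter-line handicaps.

```python
MIN_EV = 0.5  # minimum expected value threshold from the rules above


def expected_value(model_probability, decimal_odds):
    """Expected profit per unit staked on a simple win-or-lose bet."""
    return model_probability * decimal_odds - 1


def select_bets(candidates, min_ev=MIN_EV):
    """Keep only the candidate bets whose EV reaches the threshold."""
    return [c for c in candidates
            if expected_value(c["p"], c["odds"]) >= min_ev]


# Hypothetical candidates: model win probability and offered decimal odds.
candidates = [
    {"bet": "away win", "p": 0.30, "odds": 6.0},  # EV = 0.80, kept
    {"bet": "home win", "p": 0.45, "odds": 2.5},  # EV = 0.125, dropped
]
selected = select_bets(candidates)
```

With this definition, a threshold of 0.5 means the model's probability has to be at least 1.5 times the probability implied by the odds, which is why mostly long shots survive the filter.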

So let’s take a look at the results of the backtest. First off we have the bookmaker Pinnacle. Here are the results plotted over time:

Pinnacle

We can immediately see from the results table that the model indeed focuses on underdogs and higher odds. Set against Pinnacle, both variations of the model seem to be profitable on the 1X2 market, with Model 2 (with more weight on home field advantage) performing better with a massive 1.448 ROI.
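For reference, with flat 1-unit stakes the bookkeeping behind such a figure is simple. The sketch below assumes ROI means net profit divided by total stakes, which is one common convention and may not be exactly how the table was computed; the function name and the four settled bets are made up for the example.

```python
def settle_flat_bets(bets, stake=1.0):
    """Return (profit, roi) for a list of (decimal_odds, won) bets at a flat stake."""
    profit = sum(stake * (odds - 1) if won else -stake for odds, won in bets)
    staked = stake * len(bets)
    return profit, profit / staked


# Toy record of four underdog bets (illustrative, not backtest data).
bets = [(6.0, True), (4.5, False), (8.0, False), (5.0, True)]
profit, roi = settle_flat_bets(bets)
print(profit, roi)  # 7.0 1.75
```

A couple of winning long shots are enough to produce a very high ROI on a small number of bets, so figures like this should be read together with the number of bets behind them.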

Both models recorded a loss on the Asian Handicap market and only Model 1 made a profit in the Over/Unders – a disappointment as these are the markets I mostly bet on.

The table above contains bets on both opening and closing odds. Let’s separate the two and see what we can learn:

Looking at these numbers we see that both models perform slightly better against the closing odds on the 1X2 market, while Model 2 actually made a tiny profit against the closing AH odds. We can also see that Model 1’s profit on Over/Unders came mostly from opening odds.

But what about the different outcomes to bet on? Let’s complicate things further:

So what can we learn from this ridiculous table? Well, the profit in the 1X2 market comes mainly from betting on away teams, which fits the notion that the model is good at picking out highly underestimated underdogs. In contrast to the 1X2 market, betting on home sides in the Asian Handicap market seems more profitable than betting on away sides. Lastly, the model has been more profitable betting overs than unders.

As we’ve seen, my model seems to be good at finding underestimated underdogs, and at Pinnacle this bias mostly exists in the 1X2 market, hence the huge profit.

Matchbook

But what about the betting exchange Matchbook, where you actually bet directly against other gamblers?

The 1X2 market seems to be highly profitable at Matchbook too, and Model 1 actually made a nice profit on AH, especially on away sides, in contrast to the results at Pinnacle. Also, the mean odds here are centered around even money. Over/Unders again seem to be a lost cause for my model.

Conclusion

As I’ve mentioned, the model seems best at finding underdogs whose odds are simply too high, and looking at the time plots we can see that these bets occur mostly in the opening months of the season. This may be an indicator of how the market, after some time, adjusts to surprise teams like this season’s Norrköping.

For a deeper analysis of the backtest I could have looked at how results differed for minus vs. plus handicaps on the AH market, and for high vs. low O/U lines. Using different minimum EV thresholds would certainly change things, and different staking plans like Kelly could also have been included, but I left it all out so as not to overcomplicate things.
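For readers unfamiliar with Kelly staking, the standard formula for a simple win-or-lose bet is easy to state. The sketch below is only an illustration of that formula and was not part of the backtest; the function name and the example numbers are made up.

```python
def kelly_fraction(model_probability, decimal_odds):
    """Fraction of the bankroll to stake under the Kelly criterion.

    Returns 0 when the bet has no positive expected value.
    """
    edge = model_probability * decimal_odds - 1  # expected profit per unit
    if edge <= 0:
        return 0.0
    return edge / (decimal_odds - 1)


# Example: a 30% model probability at decimal odds of 6.0.
fraction = kelly_fraction(0.30, 6.0)  # about 0.16 of the bankroll
```

Since Kelly stakes scale with the model's estimated edge, any overconfidence in the model's underdog probabilities would be amplified, which is one reason flat stakes make for a cleaner first backtest.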

I feel I should emphasize that the conclusions about betting strategy drawn from this backtest apply only to my model, and not to Allsvenskan or football betting in general.

As we’ve seen, an Expected Goals model and Monte Carlo match simulation can indeed be used to profit on Allsvenskan. However, the result of any betting strategy depends heavily not only on the model, but also on when, where and what you bet on.