Modeling Tennis

Your faithful tourneygeek will go to the ends of the earth to bring you the best information on tournament design. Or at least as far as Ohio. For the next week, tourneygeek will be operating from temporary quarters in the Baymont Inn & Suites in Mason, Ohio. I’m here to watch the Western and Southern Open – the last important tennis tourney leading up to the U.S. Open. Where better to expand on the work started with Wimbledon, studying the effect of the rather odd seeding system used in high-level professional tennis?

First, I need to fine-tune my simulator for elite tennis – that’s the subject of this post.

One of the benefits of studying tournaments and sports that other people actually care about is that there’s a lot more data to work with. Months ago, I tuned my match model for backgammon by using a single, anecdotal statistic about the relative success of the person who is, by common agreement, the best backgammon player in the world. I think I can do better with tennis.

For past models, I’ve generally assumed one of two distributions for the skills of the participants. For general tournaments, where anyone might enter, I assumed a normal distribution of skill levels. But for elite tournaments, where it was reasonable to assume that only relatively skillful players would enter, I set a threshold of zero for the Z-levels representing the skill levels of the participants. I still assume that skill is normally distributed, but that a particular event attracts entrants only from the top half of that distribution.
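For readers who want to experiment, here is a minimal Python sketch of drawing skill levels from a normal distribution cut off at a threshold. It is an illustration under stated assumptions, not the simulator's actual code: the function name sample_skills is hypothetical, and skills are treated as standard-normal z-scores.

```python
import random
from statistics import NormalDist


def sample_skills(n, threshold, rng=None):
    """Draw n skills from a standard normal, keeping only values >= threshold.

    Hypothetical helper: it samples by inverse CDF, so no draws are wasted
    even when the threshold is far out in the tail.
    """
    rng = rng or random.Random()
    std = NormalDist()               # mean 0, standard deviation 1
    lo = std.cdf(threshold)          # probability mass below the threshold
    skills = []
    for _ in range(n):
        # Uniform draw restricted to the part of the CDF above the threshold.
        u = lo + (1.0 - lo) * rng.random()
        # inv_cdf requires 0 < u < 1, so clamp away from both endpoints.
        u = min(max(u, 2.0 ** -53), 1.0 - 2.0 ** -53)
        # Guard against rounding that lands a hair below the threshold.
        skills.append(max(threshold, std.inv_cdf(u)))
    return skills
```

Calling sample_skills(64, 0.0) gives a field drawn from the top half of the distribution, which is the elite assumption described above; a very low threshold such as -10 behaves like an open tournament.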

To model the Western and Southern Open, a stricter threshold seems appropriate. The folks who play at tourneys like this one aren’t just better than average at tennis, they’re a whole lot better than average. So I begin by setting the elite threshold at two rather than at zero. Even the worst player at this tourney is exceptionally good.
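To put the two thresholds in perspective, this short Python check computes how much of a normally distributed population sits above each one. The helper name share_above is made up for illustration; the only assumption is that the thresholds are z-scores on a standard normal scale.

```python
from statistics import NormalDist


def share_above(threshold):
    """Fraction of a standard-normal population with skill above threshold."""
    return 1.0 - NormalDist().cdf(threshold)


print(f"{share_above(0.0):.4f}")  # 0.5000, the top half
print(f"{share_above(2.0):.4f}")  # 0.0228, roughly the top 2.3%
```

So moving the threshold from zero to two narrows the pool of possible entrants from half the population to about one player in 44.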

That still leaves the problem of choosing a suitable luck factor for the model. Here I’m guided by information from Betfair, a clearinghouse for sports wagering in places where that’s legal. The odds on offer at Betfair imply that either Rafael Nadal or Roger Federer will win the Western and Southern about 45% of the time. After fussing with my simulator, I find I need a luck factor of about 1.15 to make it so that the top two seeds win that often.
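The tuning loop can be sketched in Python along the following lines. This is a rough illustration built on assumptions the post does not spell out: each match is decided by skill plus luck times a standard-normal draw, seeds follow true skill exactly, the field is a 64-player single-elimination bracket with no byes, and only the top two seeds are placed (in opposite halves). All function names are hypothetical, and a different match model would calibrate to a different luck factor.

```python
import random
from statistics import NormalDist


def draw_field(n, threshold, rng):
    """Skills from a standard normal truncated at threshold, best player first."""
    std = NormalDist()
    lo = std.cdf(threshold)
    skills = []
    for _ in range(n):
        u = lo + (1.0 - lo) * rng.random()
        u = min(max(u, 2.0 ** -53), 1.0 - 2.0 ** -53)  # keep inv_cdf in range
        skills.append(std.inv_cdf(u))
    return sorted(skills, reverse=True)  # index 0 is the top seed


def play_match(a, b, skills, luck, rng):
    """Assumed match model: higher (skill + luck * noise) wins."""
    perf_a = skills[a] + luck * rng.gauss(0.0, 1.0)
    perf_b = skills[b] + luck * rng.gauss(0.0, 1.0)
    return a if perf_a >= perf_b else b


def run_tournament(skills, luck, rng):
    """Single elimination; n must be a power of two. Returns the winner's seed index."""
    rest = list(range(2, len(skills)))
    rng.shuffle(rest)
    bracket = [0] + rest + [1]  # simplification: only seeds 1 and 2 are placed
    while len(bracket) > 1:
        bracket = [play_match(bracket[i], bracket[i + 1], skills, luck, rng)
                   for i in range(0, len(bracket), 2)]
    return bracket[0]


def top_two_share(luck, threshold=2.0, n=64, trials=2000, seed=0):
    """Fraction of simulated tournaments won by one of the top two seeds."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        skills = draw_field(n, threshold, rng)
        if run_tournament(skills, luck, rng) < 2:
            wins += 1
    return wins / trials


def scan_luck(candidates, target=0.45, **kwargs):
    """Return the candidate luck factor whose simulated share is closest to target."""
    return min(candidates,
               key=lambda luck: abs(top_two_share(luck, **kwargs) - target))
```

Something like scan_luck([0.5, 0.75, 1.0, 1.25, 1.5]) then picks the candidate that best matches the 45% target under these assumptions. Because the match model here is a guess, there is no reason to expect it to land on 1.15.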

So, that’s the model for elite tennis: an elite threshold of 2, and a luck factor of 1.15. Watch this space for the results from this model, which should shed additional light on the unusual seeding practices of professional tennis.