Thursday, March 22, 2012

At the simplest level, it means a lower-seeded team beating a higher-seeded team. This can happen for two reasons. First, the committee may have "blown" the seedings -- as they arguably did with Texas / Cincinnati and Purdue / St. Mary's this year, two games that most of the machine predictors thought would be upsets. Second, an upset can happen when the weaker team plays well and/or the better team plays poorly. College basketball teams don't play at their mean performance every game. Some games are better and some are worse, and this can lead to an unexpected result. This understanding suggests that upsets may be more likely when two inconsistent ("volatile") teams meet.

Imagine two hypothetical teams that played the same schedule. Team A averaged 84 points per game and scored between 81 and 88 points every game. Team B also averaged 84 points per game, but scored between 28 and 96 points. Now both these teams play Team C, which averaged 70 points per game against the same competition. Which team is Team C more likely to beat? It seems reasonable to guess Team B.
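This intuition is easy to check with a quick simulation. The sketch below models each team's score as an independent normal draw; the standard deviations are illustrative guesses chosen to roughly match the ranges above, not real data:

```python
import random

def win_prob(mean_a, sd_a, mean_b, sd_b, trials=100_000):
    """Estimate the probability that the first team outscores the
    second, modeling each score as an independent normal draw."""
    rng = random.Random(0)  # fixed seed for repeatability
    wins = sum(rng.gauss(mean_a, sd_a) > rng.gauss(mean_b, sd_b)
               for _ in range(trials))
    return wins / trials

# Team C (mean 70) against a consistent and a volatile 84-point team.
p_vs_consistent = win_prob(70, 8, 84, 2)    # Team A: tight 81-88 range
p_vs_volatile   = win_prob(70, 8, 84, 17)   # Team B: wild 28-96 range
```

Under these assumptions Team C beats the volatile Team B several times more often than the consistent Team A, even though both opponents average 84 points.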

So how can we identify these "volatile" teams? The obvious method is to measure something like the standard deviation of a team's performance over the course of the season. But we have to be careful in how we do this. For example, measuring the standard deviation of points scored might be very misleading because of pace issues.

Fortunately for me, I already have a good measure of team performance that includes a standard deviation: TrueSkill. This probably isn't a perfect proxy for a team's consistency, but it's certainly good enough for a quick investigation into the merits of predicting upsets by looking at consistency. (It's easier to think of this measure as volatility rather than consistency, so that higher values mean more volatility.)

I took all of this year's first-round games, ranked them by the combined volatility of the two teams involved, and then identified the most volatile game at each seed differential to see how well this predicted upsets:

| Seeding | Most Volatile Game by Seed Differential | Upset? |
|---------|-----------------------------------------|--------|
| 8-9 | Kansas St. - Southern Miss | N |
| 7-10 | St. Mary's - Purdue | Y |
| 6-11 | Murray St. - CSU | N |
| 5-12 | Vanderbilt - Harvard | N |
| 4-13 | Wisconsin - Montana | N |
| 3-14 | Marquette - Iona | N |
| 2-15 | Missouri - Norfolk St. | Y |
| 1-16 | Syracuse - NC Asheville | N |

This seems mildly promising. It identifies two upsets correctly, including the Missouri-Norfolk St. upset. This is particularly interesting because that upset was not on anyone's radar. Most of the other games are at least "reasonable" upset choices at their seed lines. (It also identifies CSU over Murray St., which may explain that pick by AJ's Madness in the Machine Madness contest.)

One problem with this approach is that seeding is a rather broad measure of team strength. For example, Duke was by far the weakest of the #2 seeds. It might be productive to use a more accurate measure of the strength difference between the teams. We can use the mean TrueSkill measure for each team to do that, ranking games by the sum of the two teams' standard deviations divided by the difference of their means. That results in this table:
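As a concrete sketch of that ranking metric -- assuming each team's TrueSkill rating provides a mean and a standard deviation -- games could be scored like this (the ratings below are made up for illustration):

```python
def upset_metric(favorite, underdog):
    """Score a game by combined volatility relative to the strength
    gap: the sum of the two teams' rating standard deviations divided
    by the difference of their rating means.  Higher values suggest a
    more upset-prone game.  Each team is a (mu, sigma) pair."""
    (mu_f, sigma_f), (mu_u, sigma_u) = favorite, underdog
    return (sigma_f + sigma_u) / (mu_f - mu_u)

# A narrow gap between two erratic teams scores far higher than a
# wide gap between two steady teams.
shaky = upset_metric((25.0, 4.0), (23.0, 4.0))   # -> 4.0
safe  = upset_metric((30.0, 1.0), (20.0, 1.0))   # -> 0.2
```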

| Seeding | Most Volatile Game by Strength Differential | Upset? |
|---------|---------------------------------------------|--------|
| 8-9 | Creighton - Alabama | N* |
| 7-10 | St. Mary's - Purdue | Y |
| 6-11 | SDSU - NC State | Y |
| 5-12 | Temple - USF | Y |
| 4-13 | Michigan - Ohio | Y |
| 3-14 | Georgetown - Belmont | N |
| 2-15 | Duke - Lehigh | Y |
| 1-16 | North Carolina - Lamar | N |

\* One-point win for Creighton

This works remarkably well for this year's first round -- especially considering that there were no upsets in the 3-14 or 1-16 matchups. Of course, identifying the most likely upset at a particular seeding isn't quite the same as identifying the most likely upsets across the whole bracket, so let's look at the top 8 upsets predicted by this metric across the entire first round:

| Seeding | Most Volatile Games Overall | Upset? |
|---------|-----------------------------|--------|
| 5-12 | Temple - USF | Y |
| 6-11 | SDSU - NC State | Y |
| 7-10 | Notre Dame - Xavier | Y |
| 7-10 | St. Mary's - Purdue | Y |
| 8-9 | Creighton - Alabama | N* |
| 7-10 | Florida - Virginia | N |
| 6-11 | Cincinnati - Texas | N |
| 8-9 | Memphis - St. Louis | Y |

\* One-point win for Creighton

Again, this is pretty good performance -- 75% correct in the first four picks and 50% correct in the first eight.

To a certain extent, a good predictor is going to capture some of this anyway (the Pain Machine identified the three correct upsets in the first four picks), but looking at the volatility of team performance may be good additional information in predicting tournament upsets.

Wednesday, March 21, 2012

In a previous posting I took a closer look at how the Pain Machine predicts upsets in the tournament and how effective it was this year. I thought it might also be interesting to look at how the top competitors in the Machine Madness contest predicted upsets. The first table below shows every upset picked by at least one of the six competitors, how many competitors picked it, and whether the upset actually happened. The second table summarizes each competitor's upset prediction rate, current score, and possible points.

| Upset Pick | Competitors Picking It | Correct? |
|------------|------------------------|----------|
| Texas over Cincy | 5 | N |
| Texas over FSU | 2 | N |
| WVU over Gonzaga | 3 | N |
| Purdue over St. Mary's | 5 | Y |
| NC State over SDSU | 1 | Y |
| South Florida over Temple | 2 | Y |
| New Mexico over Louisville | 2 | N |
| Virginia over Florida | 1 | N |
| Colorado State over Murray State | 1 | N |
| Vandy over Wisconsin | 1 | N |
| Wichita State over Indiana | 1 | N |
| Murray State over Marquette | 2 | N |

| | Pain Machine | Predict the Madness | Sentinel | Danny's Conservative Picks | AJ's Madness | Matrix Factorizer |
|---|---|---|---|---|---|---|
| Upset Prediction Rate | 43% | 25% | 33% | 0% | 25% | 29% |
| Current Score | 42 | 43 | 42 | 41 | 41 | 39 |
| Possible Points | 166 | 155 | 166 | 161 | 137 | 163 |

(I'm not counting #9 over #8 as an upset. That's why Danny has only 41 points; he predicted a #9-over-#8 upset that did not happen.)

So what do you think?

One thing that jumps out immediately is that the competitors predicted many more upsets this year than in past years. Historically we'd expect around 7-8 upsets in the first two rounds. Last year the average number of upsets was about 2 (discounting the Pain Machine and LMRC). The Pain Machine is forced to predict this many, but this year the Matrix Factorizer also predicts 7, and Predict the Madness and AJ's Madness predict 4. From what I can glean from the model descriptions, none of these models (other than the Pain Machine) force a certain level of upsets.

Monte's model ("Predict the Madness") seems to use only statistical inputs, without any team-strength or strength-of-competition measures. This sort of model values raw statistics over strength of schedule, so you might see it making upset picks that don't agree with the team strengths (as proxied by seeds).

The Sentinel uses a Monte Carlo method to predict games, so rather than always producing the most likely result, it is merely more likely to produce the most likely result than any other. (If that makes sense :-) The model can be tweaked by choosing how long to run the Monte Carlo simulation. With a setting of 50 it seems to produce about half the expected number of upsets.
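I don't know the Sentinel's internals, but the general idea can be sketched as follows: instead of deterministically picking the favorite, simulate the game some number of times and go with the majority. With few simulation runs the pick itself is noisy, which naturally yields occasional upset picks. (The win probability and run counts here are made up.)

```python
import random

def pick_by_simulation(p_favorite, runs, seed=2012):
    """Simulate a game `runs` times, where the favorite wins each
    simulated game with probability `p_favorite`, and pick whichever
    side wins the majority of the simulations.  Short simulations
    sometimes return the underdog; long ones almost never do."""
    rng = random.Random(seed)
    favorite_wins = sum(rng.random() < p_favorite for _ in range(runs))
    return "favorite" if favorite_wins * 2 > runs else "underdog"
```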

Danny's Dangerous Picks are anything but; it is by far the most conservative of the competitors. The pick of Murray State over Marquette suggests that Danny's asymmetric loss function component might have led to his model undervaluing strength of schedule.

AJ's Madness model seems to employ a number of hand-tuned weights for different components of the prediction formula. That may account for its predicted upsets, including the somewhat surprising CSU over Murray State prediction.

The Matrix Factorizer has two features that might lead to a high upset rate. First, there's an asymmetric reward for getting a correct pick, which might skew it towards upsets. Second, Jasper optimized his model parameters based upon the results of previous tournaments, which presumably built in a bias towards making some upset picks.

What's interesting about the actual upsets?

First, Texas over Cincy and Purdue over St. Mary's were consensus picks (excepting Danny's Conservative Picks). This suggests that these teams really were mis-seeded. Purdue vs. St. Mary's is the classic trap seeding problem for humans -- St. Mary's has a much better record, but faced much weaker competition. Texas came very close to beating Cincinnati -- they shot 16% in the first half and still tied the game up late -- which would have made the predictors 2-0 on consensus picks.

Second, the predictors agreed on few of the other picks. Three predictors liked WVU over Gonzaga, and the Pain Machine and the Matrix Factorizer agreed on two other games. Murray State over Marquette is an interesting pick -- another classic trap pick for a predictor that undervalues strength of schedule -- and both Danny's predictor and the Matrix Factorizer "fell" for this pick.

So how did the predictors do?

The Pain Machine was by far the best, getting 43% of its upset predictions correct. Sentinel was next at 33%. Perhaps not coincidentally, these two predictors have the most possible points remaining.

In terms of scoring, the Baseline is ahead of all the predictors, so none came out ahead (so far) due to their predictions. The PM and Sentinel do have a slight edge in possible points remaining over the Baseline.

So who will win?

The contest winner will probably come down to predicting the final game correctly. There's a more interesting spread of champion predictions than I expected -- particularly given the statistical dominance of Kentucky.

If Kentucky wins, the likely winner will be the Baseline or Danny. If Kansas wins, the Pain Machine will likely win unless Wisconsin makes it to the Final Four, in which case AJ should win. If Michigan State wins, then the Sentinel will likely win. And finally, if Ohio State wins, then Predict the Madness should win.

For the past three years that the Pain Machine has participated in the Machine Madness contest, I've maintained (without any real justification) that the proper strategy is to pick the correct upsets -- as opposed to simply picking the most likely outcome, which will be the higher seed in every case where the committee hasn't completely blown the seeding. In light of that, I wanted to review the PM's upset-picking strategy and see how it has worked out this year.

The PM predicts the Margin of Victory (MOV) for each tournament game. With two exceptions this year, the predicted winner was the higher-seeded team. Historically, the upset rate in the first round has been around 22%, and the upset rate for the whole tournament around 15%. (An upset here is when a team seeded at least two lines below its opponent wins the game; a #9 over a #8 is not considered an upset.) In light of this, I force the PM's tournament picks to include 6 upsets in the first round and 5 more in the rest of the tournament.

The picking strategy is fairly straightforward. First, any games where the PM actually predicts an upset are marked as upsets. After that, the PM marks as upsets the remaining first-round games with the lowest predicted MOVs until it reaches 6, and then (after recalculating the rest of the bracket based upon those upsets) fills in the remaining 5 upsets across the rest of the bracket by the same criterion.
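A minimal sketch of that strategy, ignoring the bracket-recalculation step (the game labels and margins below are made up):

```python
def force_upsets(predictions, quota):
    """predictions: list of (game, predicted_mov) pairs, where a
    negative margin of victory means the model already picks the
    lower seed to win.  Mark every genuinely predicted upset, then
    flip the closest remaining games (lowest predicted MOV) until
    the quota is reached."""
    picked = [game for game, mov in predictions if mov < 0]
    closest_first = sorted((p for p in predictions if p[1] >= 0),
                           key=lambda p: p[1])
    for game, _ in closest_first:
        if len(picked) >= quota:
            break
        picked.append(game)
    return picked

# One genuine predicted upset plus two narrow favorites flipped to
# reach a quota of three.
slate = [("A-B", 7.5), ("C-D", -1.0), ("E-F", 2.0),
         ("G-H", 0.5), ("I-J", 12.0)]
```

With this slate, a quota of three marks the predicted upset C-D and then flips the two closest favorites, G-H and E-F.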

This year, that resulted in these upset picks (predicted MOV shown in parentheses, correct picks bolded) for the first round:

The PM picked 3 of these 6 upsets correctly: USF, NC State and Purdue. Texas shot just 16% in the first half and still managed to tie the game in the second half but couldn't finish the rally. The other two games were not very close. Still, getting 50% correct on upsets is probably pretty good performance.

The Norfolk State win really stands out here as the outlier -- it was at least twice as unlikely as the Duke-Lehigh upset. I don't have the statistic handy, but wins by 23-point underdogs have to be rarer than 1 in 1,000 historically. (The beating Norfolk St. took in the next round is indicative of how anomalous the first-round upset was.) VCU was a darling upset pick for many, in part due to their Cinderella status last year. This year's VCU team was considerably weaker, and the win over Wichita State was another very unlikely result. The Georgetown upset was the least surprising; the 5-point differential is well within the ~10-point error margin of the PM's predictions.

Overall, I give the PM a very positive grade for its upset picks. It's clearly able to identify games where upsets are likely. I may have to work on how it selects upsets, though. There isn't a strong correlation between the magnitude of the MOV and the likelihood of an upset when the MOV is under about 6 points, so it may not make sense to pick the games with the lowest MOVs. It may make more sense to pick upsets based upon other factors.

Tuesday, March 20, 2012

It's a common assumption that neutral-court games should be treated differently from games played at one team's home court, but is that really true? The SI article that looked at home court advantage (HCA) concluded that it was primarily due to the referees treating the home team differently. That jibes with something I found -- that large home underdogs don't get an HCA.

Presumably the refs don't give the benefit of the doubt when they know the home team is overmatched.

I did some other experiments (prior to starting the blog, so they aren't documented here) where I trained a predictor on regular-season games using just a strength measure for each team, so that the prediction equation looked like this:

MOV = C1 * (home team strength) + C2 * (away team strength) + C3

C2 was negative, and C3 (along with any C1/C2 ratio) was the "home court advantage".

I then tested the accuracy of this predictor on NCAA tournament games, first treating the higher seed as the home team, then the lower seed as the home team, and then washing out HCA altogether by dropping C3 and forcing C1 & C2 to be equal.
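Assuming the fitted equation has the linear form described above, the three variants can be sketched like this (the coefficients and strength ratings are made-up illustrations, not the fitted values):

```python
def predict_mov(home_strength, away_strength, c1, c2, c3):
    """Predicted margin of victory for the nominal home team."""
    return c1 * home_strength + c2 * away_strength + c3

# Illustrative coefficients: C2 negative, C3 the home-court bump.
C1, C2, C3 = 1.0, -1.0, 3.5
high_seed, low_seed = 28.0, 24.0  # hypothetical strength ratings

# Variant 1: treat the higher seed as the home team.
mov_high_home = predict_mov(high_seed, low_seed, C1, C2, C3)
# Variant 2: treat the lower seed as the home team.
mov_low_home = predict_mov(low_seed, high_seed, C1, C2, C3)
# Variant 3: neutral court -- drop C3 entirely.
mov_neutral = predict_mov(high_seed, low_seed, C1, C2, 0.0)
```

The comparison is then which variant's predictions best match actual tournament results.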

What I found was that the best prediction was made treating the higher seed as the home team. This makes some intuitive sense -- the refs are giving the benefit of the doubt to the team that they "know" is the better team. So I'm a little dubious that there's really no "HCA" in tournament games, although I don't know that anyone else has looked at it.

Friday, March 16, 2012

You should be able to see (I believe) the Pain Machine's tournament picks here.

For the tournament, I don't like to make all "chalk" picks so I am set up to force a certain number of upset picks. (This is complicated somewhat by the play-in games.) If the PM predicts an outcome where a lower seed beats a higher seed, that counts as one of the upset picks. Otherwise the PM converts the weakest wins to losses until it has the requisite number of upsets. This year, the PM's predicted first-round upsets (in order of likelihood) were:

There are a couple of upsets in the later rounds, notably Kansas over UK (although that game is a near coin-flip according to the PM -- UK by 0.6 points).

Texas came very close to beating Cincy. The PM actually also had Texas upsetting FSU, so that loss hurts.

Cal didn't even make it to the Temple game, turning in one of the worst tournament performances ever in losing to USF. Nonetheless I kept this prediction, now as USF over Temple. NC State beat SDSU. The Purdue / St. Mary's game is tonight. I was at the WVU / Gonzaga game -- Gonzaga simply outplayed WVU and shot very, very well.

The PM is once again competing in the Machine Madness contest, which you can follow here. As with the past few years, the "chalk" picks are leading the contest, but that will change if any of the competitors get a few upset picks right. A quick look suggests that so far only a few entries have gotten an upset pick correct -- the PM has the NC State result and AJ's Madness got the VCU pick correct. (The bottom couple of entries seem fairly random.) The overall winner will probably be determined by who wins the tournament. If Kentucky wins it will be Danny or the Matrix Factorizer (depending upon the outcome of Duke/Baylor). Otherwise it will go to the predictor who got the winner correct.