The Numbers Don't Lie

Sunday, April 6, 2014

Suppose I give you $1000 and then give you 2 choices, A and B. If you choose A, there is an 80% chance of keeping your money. If you choose B, there is a 50% chance of keeping your money. In either case, if you lose your money, you will have an equal chance of getting it back in the future. Obviously, choice A is the way to go.

This is the choice Bo Ryan and the Wisconsin Badgers had in front of them when they were ahead by 2 with 16 seconds left and Kentucky had possession. On the floor for the Wildcats were 5 freshmen: Julius Randle, James Young, Dakari Johnson and the Harrison twins, Andrew and Aaron. Dakari Johnson makes 45.7% of his free throws. His chance of making 2 free throws in a row is about 21%, so Wisconsin would have had a nearly 80% chance of keeping their lead, assuming they gathered the rebound if Johnson missed his second free throw. By not fouling, Wisconsin accepted the 50% probability of Kentucky tying the game, based on Kentucky's FG shooting percentage for the game. Worse, they handed Kentucky a 35% chance to take the lead, based on the probability that one of the Harrisons or James Young would attempt and make a 3-pointer.

The choice becomes a little less obvious if they were forced to foul Kentucky's best free throw shooter and the hero of the game, Aaron Harrison. The odds of him making both free throws and tying the game increase to 63%, better odds perhaps than a Kentucky basket, but fouling still takes away almost any opportunity for Kentucky to take the lead (there remains the small chance of an offensive rebound and putback on a missed second free throw). Intuition and "the book" say not to foul. An intuition informed by rather simple math realizes you can almost completely avoid the worst case (losing in regulation) and live to play at least 5 more minutes of basketball.
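For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the two free throws are independent attempts at the shooter's season percentage, which is a simplification, and it takes the 45.7% and 63% figures straight from the post. The function and variable names are my own illustration.

```python
import math


def prob_make_both(ft_pct: float) -> float:
    """Chance of making two free throws in a row.

    Assumes the two attempts are independent and equally likely to go in.
    """
    return ft_pct ** 2


# Dakari Johnson's free throw percentage, as quoted in the post.
johnson_ft = 0.457

# Chance Johnson ties the game by making both shots.
p_tie_johnson = prob_make_both(johnson_ft)

# Chance Wisconsin keeps the lead, assuming they rebound any miss.
p_keep_lead = 1 - p_tie_johnson

# Working backwards: a 63% chance of making both shots implies this
# single-shot percentage for Aaron Harrison (assumption: independence).
harrison_ft_implied = math.sqrt(0.63)

print(f"Johnson makes both:   {p_tie_johnson:.1%}")
print(f"Wisconsin keeps lead: {p_keep_lead:.1%}")
print(f"Implied Harrison FT%: {harrison_ft_implied:.1%}")
```

The same function can be reused for any other shooter on the floor by passing in his free throw percentage.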

Saturday, March 30, 2013

Louisville is the sole #1 seed remaining in a tournament that has had its fair share of surprises. But it was only 2 years ago that Kansas was in a similar role and they were subsequently sent home by the Cinderella #11 seed VCU.

As the chart below shows, it is rare to have only one remaining #1 seed by the time we reach the Elite Eight and more common to have all 4 remaining, as we did from 2007 to 2009. A more interesting measure is the sum of the seeds, which is a rough indicator of both the parity in the field and, possibly more telling, how well (or not) the Selection Committee did in its seeding. Comparing the totals, the Committee came in slightly worse than average this year. Best year: 2007. Worst year(s): 1990, 2000.

Monday, March 25, 2013

1. Florida Gulf Coast is the first #15 seed to reach the Sweet 16. #14 seeds have made it twice (1986 Cleveland St. lost to #7 seed Navy, 1997 Tenn.-Chattanooga lost to #10 seed Providence).

2. La Salle vs. Wichita State marks the first matchup of a #13 and a #9 seed. #13 seeds have played #1 seeds 4 times and lost each time: 2012 Ohio (vs. UNC), 2006 Bradley (vs. Memphis), 1999 Oklahoma (vs. Michigan State), 1988 Richmond (vs. Temple).

3. La Salle vs. Wichita State is the first matchup of non-power conferences since 2006, when George Mason beat Wichita State and Memphis beat Bradley.

4. This is the 4th year in a row that the Sweet 16 has had three double-digit seeds. It's pretty common, occurring 13 times since 1985. There were 5 double-digit seeds in the 1999 Sweet 16 (Gonzaga, Miami (OH), Missouri State, Oklahoma and Purdue).

5. Excluding results from the play-in round, teams in the 1st and 2nd rounds averaged 65.3 points per game, which was the lowest total since 1985, the first year the tourney expanded to 64 teams.

Taking a look at the first round throughout history shows a predictable pattern. The #1 seed is 112-0 versus the #16 seed and the winning percentage decreases at a steady pace from there, with the noticeable exception of the #5 seed. #12 seeds have upset the #5 seed just as often as #11 seeds have upset the #6, and the #8-#9 matchups have been pretty even. Despite the complaints over the years, these numbers appear to validate the selection committee's seedings.

Monday, March 18, 2013

The grid below shows the head-to-head records by conference in the NCAA tournament since 1985, winners by row and losers by column. Select one or more of the values in the Rounds filter to see results for particular rounds.

1. The ACC, the top performing conference with a .647 overall winning percentage, has a winning record against every other conference with the exception of the SEC, against which it is only 15-25.
2. Head-to-head matchups within the same conference are rare. Hard to believe that ACC teams have only met twice since 1985 (in 1995 and 2001) and two Pac-12 teams have never met.
3. In games beyond the Sweet 16, the SEC is 8-3 against the ACC, the ACC is 11-1 against the Big 10 and the Big 10 is 9-1 against the Big East.

My next post will explore how the seeds have fared throughout history.

Sunday, March 17, 2013

March Madness is upon us and I thought it would be a good time to look at what the numbers say over the history of the NCAA tournament since it was expanded to 64 teams in 1985. The first glance I will take is to compare wins and losses by the 6 power conferences and the group of teams belonging to other conferences, sometimes confusingly called mid-majors. The chart compares the winning percentage of each, broken down by round of the tournament. The number of appearances for each conference is shown in parentheses.

1. Clicking on a conference in the color legend will highlight the results for the chosen conference. In general, winning percentage decreases as teams get deeper in the tournament, which is not surprising. There are a couple of exceptions to this trend. For example, the Big 10 takes a big jump from the Sweet 16 to the Elite 8 and the SEC jumps to the head of the pack in the Final 4 and Championship rounds.

2. The power conferences have consistently outperformed the others in the First Round of the tournament, but the other conferences catch up to a degree as they get deeper, winning 6 of their 13 appearances in the Final Four.

3. Of the six power conferences, the Pac-12 is the worst performer in terms of both games played (252) and overall winning percentage (.567). A true comparison between the power conferences should probably take into consideration the number of bid opportunities, since there are differences in the number of teams in each conference through the years.

4. The ACC separates itself from the pack when comparing overall winning percentage, but that is mainly due to their high rate of success in the early rounds. They fall behind the Big 12 once they reach the Sweet 16.

5. The Big 12 (which also includes teams from the SWC and Big 8 in the early years) takes the biggest jump from one round to the next, leading the pack in the Sweet 16 (.604) and falling to the bottom (.394) in the next round.

Note: I ignored results from the Play-In rounds, which were just recently added to the tournament.

Saturday, February 16, 2013

With their victory in Super Bowl XLVII, the Baltimore Ravens moved ahead of the Indianapolis Colts and Pittsburgh Steelers and inched closer to the New England Patriots on the updated version of the NFL Moneyball Dashboard.

The Moneyball Dashboard compares the success of each NFL franchise since the year 2000 by measuring their performance in the post-season against their average payroll. Teams are awarded 1 point for qualifying for the playoffs and 1 point for each post-season win, so a Super Bowl win is currently worth 5 points. Teams landing in the upper left quadrant can be considered over-achievers as they have exceeded the NFL average for playoff success while their average payroll falls below the NFL average. On the flip side, teams in the lower right quadrant could be considered under-achievers. The size of the circle for each team helps to compare their regular season success by measuring their wins per season for each million dollars of payroll. Use the power of Tableau to compare teams by selecting them from the bubble chart or using the filters to the right.
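The scoring rules above are simple enough to sketch in a few lines of Python. This is only an illustration of the rules as described, not the dashboard's actual Tableau calculations; the class and function names are hypothetical, and I am assuming payroll is expressed in millions of dollars.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TeamSeason:
    made_playoffs: bool   # did the team qualify for the post-season?
    playoff_wins: int     # post-season wins that year (0 if none)


def playoff_points(seasons: Iterable[TeamSeason]) -> int:
    """1 point for each playoff berth plus 1 point for each post-season win."""
    return sum((1 if s.made_playoffs else 0) + s.playoff_wins for s in seasons)


def quadrant(points: float, payroll: float,
             avg_points: float, avg_payroll: float) -> str:
    """Label a team relative to the league averages for points and payroll."""
    if points > avg_points and payroll < avg_payroll:
        return "over-achiever"    # upper left: more success, less money
    if points < avg_points and payroll > avg_payroll:
        return "under-achiever"   # lower right: less success, more money
    return "neither"


def wins_per_million(wins_per_season: float, payroll_millions: float) -> float:
    """Bubble size: regular season wins per season per million dollars of payroll."""
    return wins_per_season / payroll_millions


# A champion that wins four post-season games earns 1 + 4 = 5 points.
champion_year = TeamSeason(made_playoffs=True, playoff_wins=4)
print(playoff_points([champion_year]))
```

Summing `playoff_points` over every season since 2000 gives a team's position on the vertical axis, and `quadrant` compares that total and the team's average payroll against the league averages.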

1. Since the Houston Texans only joined the NFL in 2002, I think it is a little unfair to label them under-achievers. The Texans have finished .500 or above 5 of the last 6 years and have won a playoff game each of the last 2 years. Even though they have the biggest payroll in the NFL, they are surely headed in the right direction.

2. Due to a lack of franchise payroll data, the dashboard only reaches back to the year 2000. The New England Patriots benefit the most from this limitation, having started their incredible run of post-season success in 2001. An analysis that includes measures dating back to Super Bowl I would perhaps provide a fairer comparison.

3. What do the post-season numbers say about competitive balance? About half of the teams find themselves clustered between 6 and 15 playoff points. Seven teams (NE, BAL, PIT, IND, PHI, GB, NYG) exceed 19 points and could be called high achievers. Eight teams, excluding Houston (KC, JAX, CIN, MIA, WAS, CLE, DET, BUF), have yet to earn 5 playoff points and only 1 team, Buffalo, has not gained a post-season berth since 2000. Without any deeper statistical analysis, I think we could say these numbers support the idea that the NFL has achieved a fairly good level of competitive balance.

4. How does post-season success correlate with payroll? From a quick glance at the distribution, there is little or no correlation. While it is true that most of the poor performers have payrolls below the NFL average, so do the two most successful teams, the Patriots and Ravens. Interesting note: The best-fit line for just the NFC teams actually has a negative slope, suggesting that the cheaper teams actually do better. Either way, there is really no evidence of correlation.
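To go beyond a quick glance, the correlation and the slope of the best-fit line can be computed directly. The sketch below uses plain Python and ordinary least squares; the payroll and points values are made-up placeholders, not numbers from the dashboard, and the function names are my own.

```python
from typing import Sequence


def _deviations(xs: Sequence[float], ys: Sequence[float]):
    """Return the sums of squares and cross-products around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between xs and ys."""
    sxy, sxx, syy = _deviations(xs, ys)
    return sxy / (sxx * syy) ** 0.5


def best_fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the least-squares line predicting ys from xs."""
    sxy, sxx, _ = _deviations(xs, ys)
    return sxy / sxx


# Placeholder data only: average payroll ($ millions) and playoff points.
payroll = [80, 90, 100, 110]
points = [12, 8, 9, 5]

r = pearson_r(payroll, points)
slope = best_fit_slope(payroll, points)
print(f"r = {r:.2f}, slope = {slope:.2f} points per $1M")
```

A value of r near zero would back up the "little or no correlation" reading, while a negative slope, like the one seen for the NFC teams, means playoff points fall as payroll rises.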

Would love to hear suggestions on further analysis or ways I could improve the dashboard.

About Me

I'm currently employed with Illumination Works as an IT consultant helping Federal and commercial clients maximize their data assets. I am also an adjunct instructor at Xavier University where I teach students how to have fun with data.