March 13, 2018

SportsRatings Genetic Algorithm for NCAA Prediction

Announcing: SportsRatings' new Genetic Model for NCAA prediction

We've been working on a new model for NCAA prediction, and the preliminary results are in. The model tests several dozen "genes" (i.e., heuristics) related to NCAA basketball, from power ratings to polls to combinations of different data. It combines them as "offspring", then kills off the weaker ones so that the stronger genes in the "population" survive.
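To make the breed-and-cull loop concrete, here is a minimal sketch of a generic genetic algorithm. It is not our production model: the gene names, the toy game data, and the population settings are all made up for illustration, and fitness is simply the number of past games in which the higher-scored team won.

```python
import random

# Hypothetical heuristic names; the real gene list is not shown here.
GENES = ["power_rating", "poll_rank", "tempo_adj_offense", "tempo_adj_defense"]

# Made-up (winner_features, loser_features) pairs, one value per gene.
TOY_GAMES = [
    ([0.9, 0.8, 0.7, 0.6], [0.4, 0.9, 0.5, 0.5]),
    ([0.7, 0.3, 0.6, 0.4], [0.5, 0.6, 0.4, 0.6]),
    ([0.8, 0.5, 0.9, 0.3], [0.6, 0.7, 0.2, 0.8]),
    ([0.6, 0.4, 0.5, 0.7], [0.3, 0.8, 0.6, 0.2]),
]

def random_individual(rng):
    """An individual is one weight per candidate heuristic."""
    return [rng.random() for _ in GENES]

def team_score(weights, features):
    """Weighted sum of a team's heuristic values."""
    return sum(w * f for w, f in zip(weights, features))

def fitness(weights, games):
    """Count games where the higher-scored team actually won."""
    return sum(
        1 for winner, loser in games
        if team_score(weights, winner) > team_score(weights, loser)
    )

def crossover(a, b, rng):
    """Each offspring weight is inherited from one parent at random."""
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(weights, rng, rate=0.1):
    """Occasionally replace a weight with a fresh random value."""
    return [rng.random() if rng.random() < rate else w for w in weights]

def evolve(games, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the stronger half; the weaker half "dies off".
        population.sort(key=lambda w: fitness(w, games), reverse=True)
        survivors = population[: pop_size // 2]
        # Refill the population with mutated offspring of survivor pairs.
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = survivors + children
    return max(population, key=lambda w: fitness(w, games))

best_weights = evolve(TOY_GAMES)
print(dict(zip(GENES, best_weights)), fitness(best_weights, TOY_GAMES))
```

Because the survivors carry over unchanged each generation, the best fitness in this sketch can never decrease from one generation to the next.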

In the end, the model outputs a set of heuristics and associated weights with which to score the teams in the field. The initial results for the 2018 season are below; the "Score" column shows how close the teams are in the model's estimation. After running the results through the 2018 bracket, the "Finish" column shows how far each team is expected to advance.
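As a rough illustration of how scores turn into a "Finish", the sketch below advances the higher-scored team in each pairing. The function name is hypothetical, and it assumes the higher score always wins, which may be simpler than what the model actually does. The four scores come from the table; the Virginia vs. Arizona pairing is the one described in this post, while the Cincinnati vs. Tennessee pairing is an assumption about the bracket.

```python
def play_bracket(teams, scores):
    """Advance the higher-scored team in each pairing until one remains.

    `teams` lists the field in bracket order (adjacent teams meet first),
    and its length must be a power of two. Returns {team: games won}.
    """
    wins = {team: 0 for team in teams}
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        # Pair off neighbours: (0, 1), (2, 3), ...
        for a, b in zip(alive[0::2], alive[1::2]):
            winner = a if scores[a] >= scores[b] else b
            wins[winner] += 1
            next_round.append(winner)
        alive = next_round
    return wins

# Scores taken from the table in this post.
scores = {"Virginia": 64.9, "Arizona": 69.6, "Cincinnati": 56.9, "Tennessee": 32.2}
sweet_sixteen = ["Virginia", "Arizona", "Cincinnati", "Tennessee"]

wins = play_bracket(sweet_sixteen, scores)
print(wins)
```

In this four-team slice Arizona wins twice (reaching the Final Four), Cincinnati wins once (Elite Eight), and Virginia and Tennessee go out in the Sweet Sixteen, matching the "Finish" column for those teams.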

According to our model, we're in for a repeat of the 2016 final, with Villanova beating North Carolina. Duke makes the Final Four and the Arizona Wildcats crash it, too. Villanova's score is well ahead of the Tar Heels', but from there the field is tight.

A few teams rank very high, but the bracket isn't their friend. Top-ranked #1 seed Virginia is 4th in the model but meets Arizona in the Sweet Sixteen and is eliminated. Another key Sweet Sixteen matchup is Duke vs. Michigan State; should the Spartans win, they'd be favored to make the Final Four. Top seeds Kansas and Xavier only make the Elite Eight.

| Seed | Team | Genetic Rank | Score | Finish |
|------|------|--------------|-------|--------|
| 1 | Villanova | 1 | 99.4 | Winner |
| 2 | North Carolina | 2 | 74.0 | Runner-up |
| 4 | Arizona | 3 | 69.6 | Final Four |
| 1 | Virginia | 4 | 64.9 | Sweet 16 |
| 2 | Duke | 5 | 62.8 | Final Four |
| 3 | Michigan St. | 6 | 61.9 | Sweet 16 |
| 1 | Kansas | 7 | 59.4 | Elite Eight |
| 2 | Cincinnati | 8 | 56.9 | Elite Eight |
| 1 | Xavier | 9 | 56.7 | Elite Eight |
| 2 | Purdue | 10 | 54.2 | Elite Eight |
| 4 | Gonzaga | 11 | 48.0 | Sweet 16 |
| 3 | Michigan | 12 | 39.1 | Sweet 16 |
| 4 | Wichita St. | 13 | 38.2 | Sweet 16 |
| 4 | Auburn | 14 | 33.4 | Sweet 16 |
| 3 | Tennessee | 15 | 32.2 | Sweet 16 |
| 6 | Florida | 16 | 31.5 | Sweet 16 |
| 6 | Miami FL | 17 | 31.2 | Second Round |
| 3 | Texas Tech | 18 | 31.2 | Second Round |
| 5 | Kentucky | 19 | 29.9 | Second Round |
| 5 | Clemson | 20 | 28.7 | Second Round |
| 6 | TCU | 21 | 28.2 | Second Round |
| 5 | West Virginia | 22 | 27.5 | Second Round |
| 8 | Seton Hall | 23 | 25.0 | Second Round |
| 5 | Ohio St. | 24 | 23.7 | Second Round |
| 6 | Houston | 25 | 23.0 | Second Round |
| 7 | Nevada | 26 | 18.0 | Second Round |
| 7 | Texas A&M | 27 | 17.8 | Second Round |
| 7 | Rhode Island | 28 | 17.3 | Second Round |
| 8 | Missouri | 29 | 15.6 | Second Round |
| 10 | Butler | 30 | 13.7 | Second Round |
| 8 | Creighton | 31 | 13.5 | Second Round |
| 7 | Arkansas | 32 | 13.1 | First Round |
| 11 | UCLA | 33 | 12.0 | First Round |
| 8 | Virginia Tech | 34 | 11.9 | Second Round |
| 9 | Alabama | 35 | 11.9 | First Round |
| 9 | Kansas St. | 36 | 10.6 | First Round |
| 9 | Florida St. | 37 | 9.9 | First Round |
| 10 | Texas | 38 | 8.9 | First Round |
| 10 | Providence | 39 | 8.6 | First Round |
| 10 | Oklahoma | 40 | 8.1 | First Round |
| 11 | St. Bonaventure | 41 | 6.8 | First Four |
| 11 | Loyola Chicago | 42 | 6.6 | First Round |
| 12 | New Mexico St. | 43 | 5.8 | First Round |
| 11 | San Diego St. | 44 | 5.7 | First Round |
| 11 | Syracuse | 45 | 5.0 | First Round |
| 11 | Arizona St. | 46 | 5.0 | First Four |
| 12 | South Dakota St. | 47 | 4.4 | First Round |
| 9 | North Carolina St. | 48 | 3.9 | First Round |
| 12 | Murray St. | 49 | 3.7 | First Round |
| 14 | Montana | 50 | 2.8 | First Round |
| 13 | Buffalo | 51 | 2.3 | First Round |
| 14 | Stephen F. Austin | 52 | 2.2 | First Round |
| 13 | College of Charleston | 53 | 1.9 | First Round |
| 12 | Davidson | 54 | 1.9 | First Round |
| 13 | UNC Greensboro | 55 | 1.4 | First Round |
| 14 | Wright St. | 56 | 1.1 | First Round |
| 13 | Marshall | 57 | 1.0 | First Round |
| 15 | Georgia St. | 58 | 0.5 | First Round |
| 16 | Radford | 59 | 0.5 | First Round |
| 14 | Bucknell | 60 | 0.4 | First Round |
| 16 | Penn | 61 | 0.4 | First Round |
| 15 | Cal St. Fullerton | 62 | 0.4 | First Round |
| 16 | Texas Southern | 63 | 0.3 | First Round |
| 16 | UMBC | 64 | 0.2 | First Round |
| 16 | North Carolina Central | 65 | 0.1 | First Four |
| 15 | Iona | 66 | 0.1 | First Round |
| 15 | Lipscomb | 67 | 0.0 | First Round |
| 16 | LIU Brooklyn | 68 | 0.0 | First Four |

The model is very conservative in its early picks, probably due to the heavier weights assigned to scoring the entire tournament (there are variations of the model that score only the First Round, the Final Four, etc., and we will explore these later). The model picks only one first-round upset, 10-seed Butler over 7-seed Arkansas; even the 8s are all stronger than the 9s! Only one interloper makes the Sweet Sixteen: 6-seed Florida. It's not until the Elite Eight, where top seeds Kansas and Xavier fall, that things get interesting.

So far the best "genes" are variations on our Strength power rating, several variants on Ken Pomeroy's data (dealing with tempo-corrected offense and defense especially), and the AP poll, both pre-season and final rankings. Also, we've found that for top teams, consistency is very important. We will continue to add new heuristics to the mix and see what works and what doesn't—or rather, which genes live and which ones die off.