Wednesday, September 24, 2014

As some long-term readers of this blog may know, I'm a professor at Texas Tech University and I meet occasionally with Red Raider volleyball coach Don Flora to discuss statistical aspects of the sport and find out what kind of analyses he might be interested in at a given time. We last met this past spring and he told me his big question: "What wins in the Big 12?" I took the meaning of the question to be: what combinations of success at hitting, blocking, digging, serving, etc., were associated with winning conference matches in the Big 12? I told Coach Flora I would have something for him, and proceeded to start thinking about how I would conduct analyses.

Now, with the Red Raiders opening their Big 12 portion of the schedule by hosting TCU tonight, I have the fruits of my inquiry. I first created a database of all 72 conference matches played a year ago (despite its name, the Big 12 has only 10 schools and one, Oklahoma State, doesn't field a women's volleyball team; nine teams playing a double round-robin schedule of 16 matches yields 72 total matches). For each team in a given match, I recorded its hitting percentage; blocks, digs, aces, and service errors per game; whether the team won or lost the match; and the number of games it took. Note that with 72 matches and two teams per match, there were 144 records or "stat-lines" possible.
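For anyone who wants to verify the schedule arithmetic, here is a minimal Python sketch. It assumes a pure double round-robin in which every ordered (home, away) pairing of the nine teams is one match; the variable names are my own for illustration.

```python
from itertools import permutations

n_teams = 9  # ten Big 12 schools minus Oklahoma State

# Each ordered (home, away) pair is one match in a double round-robin.
matches = list(permutations(range(n_teams), 2))

# Matches involving any single team (team 0 here), home or away.
matches_per_team = sum(1 for home, away in matches if 0 in (home, away))

# One stat-line per team per match.
stat_lines = 2 * len(matches)

print(len(matches), matches_per_team, stat_lines)
```

The three printed numbers correspond to the total matches, the matches per team, and the number of stat-lines described above.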

One very basic comparison, among the techniques used by Penn State coach Russ Rose in his 1978 Master's thesis at Nebraska, is to see how often the team that outperformed its opponent on a given statistic won the match. As shown in the following chart, the team with the better hitting percentage in a match won nearly all the time (68 out of 72 matches). Having more blocks, digs, and aces also conferred sizable advantages, but not as powerfully as out-hitting one's opponent.
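The tally just described can be sketched in a few lines of Python. The three matches below are made up purely for illustration (they are not Big 12 data), and the function name `better_stat_record` and the dictionary keys are hypothetical. I also assume that matches in which the two teams tie on a statistic are simply skipped.

```python
def better_stat_record(matches, stat):
    """Return (wins, decided): how often the team with the higher `stat` won.

    `matches` is a list of (winner_stats, loser_stats) pairs.
    Matches tied on the statistic are left out of the count.
    """
    wins = decided = 0
    for winner, loser in matches:
        if winner[stat] == loser[stat]:
            continue
        decided += 1
        if winner[stat] > loser[stat]:
            wins += 1
    return wins, decided


# Made-up example matches: (winner's stats, loser's stats).
matches = [
    ({"hit_pct": 0.280, "digs": 15.0}, {"hit_pct": 0.190, "digs": 16.3}),
    ({"hit_pct": 0.245, "digs": 17.5}, {"hit_pct": 0.210, "digs": 14.0}),
    ({"hit_pct": 0.150, "digs": 18.0}, {"hit_pct": 0.200, "digs": 13.7}),
]

hit_record = better_stat_record(matches, "hit_pct")
dig_record = better_stat_record(matches, "digs")
print(hit_record, dig_record)
```

For a statistic where lower is better, such as service errors, the comparison inside the loop would need to be reversed.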

Hitting does not occur in isolation, however. Some teams might hit well, but not block well; or hit well and not dig well; etc. To probe this issue, I looked at the 144 stat-lines referred to above. To illustrate a stat-line, let's look at Texas Tech's (focal team) when it visited Kansas:

I then submitted the 144 stat-lines to a cluster analysis, a technique that attempts to sort cases (or stat-lines) into groups with other similar cases. In other words, the stat-lines within a group should end up relatively similar to each other (i.e., within-group homogeneity), but the different clusters of stat-lines will be dissimilar from each other (i.e., between-group heterogeneity). I obtained 10 clusters, but two of them had only four cases each, which is too small for statistical analysis. Ultimately, our interest will be in seeing the win-loss records of the eight viable clusters, but let's review some basics first. The following graphic illustrates the membership of Cluster 9, as an example (unless you have exceptionally strong eyesight, you'll want to click on the chart to enlarge it).

Seventeen stat-lines ended up in Cluster 9. Each focal team (to whom the stat-line belongs) is highlighted in yellow, with its opponent for the particular match appearing in the second column. Note that many different schools can appear in the same cluster. We're grouping performances, not teams per se. Averages for the complete sample of 144 stat-lines on the various volleyball performance measures are shown in red above each column.

Probably the most fun aspect of conducting cluster analyses is that you get to make up names for the clusters, based on their statistical properties. As seen in the above chart, I named Cluster 9 "Slightly Above-Average Hitting, Below Average Blocking, VERY GOOD DIGGING, HIGH ACES." The digs/game for the 17 cases are shown above in red outline; they range from 17.33 (Baylor, playing at West Virginia) to 20.50 (Texas Tech, hosting Oklahoma). All of these dig statistics exceeded the complete-sample average of 14.92, illustrating why a major part of this cluster's "identity" would consist of "very good digging." Apparently as a result of the digging, the teams in this cluster went 11-6 in the relevant matches, despite hitting only slightly above average in them (the average hitting percentage for the Cluster-9 teams was .223, compared to a complete-sample average of .214).
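For readers curious about how stat-lines get sorted into clusters in the first place, here is a rough Python sketch. To be clear, it is not the procedure used for the analysis in this post: I am substituting plain k-means (with z-score standardization and a deterministic start) as a stand-in, the six stat-lines are invented, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores so no statistic dominates by scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two stat-lines."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every stat-line to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels


# Invented stat-lines: hitting pct, blocks, digs, aces, service errors (per game).
stat_lines = [
    (0.268, 3.5, 14.8, 0.6, 1.7),
    (0.130, 1.0, 11.8, 0.8, 1.3),
    (0.260, 3.3, 15.1, 0.7, 1.8),
    (0.125, 1.1, 11.5, 0.7, 1.2),
    (0.275, 3.6, 14.5, 0.5, 1.6),
    (0.140, 0.9, 12.0, 0.9, 1.4),
]

labels = kmeans(standardize(stat_lines), k=2)
print(labels)
```

With these invented numbers, the strong-hitting, strong-blocking stat-lines land in one group and the weak ones in the other, which is the within-group similarity and between-group difference described earlier. Standardizing first matters because digs per game are on a much larger numerical scale than hitting percentage.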

I've placed all the detailed statistics on the clusters below in an Appendix, for anyone who is interested (once again, please click on the graphic to enlarge it). In the remainder of this posting, I provide brief summaries of the clusters:

Cluster 1. Plagued by below-average digging (12.92/game) and a high rate of service errors (2.71/game, compared to the complete-sample average of 1.73), teams whose stat-lines were in this cluster went 5-12 in the relevant matches.

Cluster 2. Cases in this group displayed great hitting on average (.253, compared to the full-sample mean of .214). They also served aces at a higher-than-average rate (1.93/game, compared to 1.14 for the full sample), but also committed more service errors (2.20/game) than the overall average (1.73). The kind of seemingly powerful/aggressive play exhibited in this cluster produced a 12-6 record.

Cluster 3. Characterized by below-average blocking (1.49/game, compared to the overall average of 2.19) and few aces (.74/game), cases in this cluster went 6-13.

Cluster 4. Though this cluster contained only four cases, the signs of poor play were quite vivid (e.g., .031 average hitting percentage, 1.08 blocks/game, a paltry 8.42 digs/game). Although caution is warranted due to the small size of this cluster, the results are just as one would expect: 0-4.

Cluster 5. This cluster excelled in most every way (.256 hitting percentage, 3.12 blocks/game, 1.36 aces/game with only 1.56 service errors), except for digging (12.02/game). The focal teams went 10-5 in the relevant matches.

Cluster 6. This group combined weak hitting (.130), blocking (1.04/game), and digging (11.78/game), with apparent caution from the service line (only .75 aces and 1.27 service errors, per game). This is not a pattern to emulate, as the teams went 1-9.

Cluster 7. Cases in this cluster hit at the overall average (.214), blocked (2.47) and dug (16.68) somewhat above average, but also showed caution when serving (.89 aces and 1.32 errors, per game). I would have expected these cases to have a winning record, but they didn't, going 15-16.

Cluster 8. Cases here hit (.268) and blocked (3.58) extremely well, rarely served aces (.58/game), and were pretty average on the other metrics. Dominating the net paid off big, as these cases went 8-1.

Cluster 9. Discussed above.

Cluster 10. The other cluster with only four cases, the teams here played great defense (22.31 digs, and 2.60 blocks, per game) and went 4-0.
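As a quick consistency check on the ten records just listed, the following Python sketch totals the wins and losses and ranks the clusters by winning percentage. The records are copied from the summaries above; the variable names are my own.

```python
# (wins, losses) for each cluster, as reported in the summaries above.
records = {
    1: (5, 12), 2: (12, 6), 3: (6, 13), 4: (0, 4), 5: (10, 5),
    6: (1, 9), 7: (15, 16), 8: (8, 1), 9: (11, 6), 10: (4, 0),
}

total_wins = sum(w for w, _ in records.values())
total_losses = sum(l for _, l in records.values())
total_cases = total_wins + total_losses

# Rank clusters from best to worst winning percentage.
ranked = sorted(records, key=lambda c: records[c][0] / sum(records[c]),
                reverse=True)

print(total_wins, total_losses, total_cases)
for c in ranked:
    w, l = records[c]
    print(f"Cluster {c}: {w}-{l} ({w / (w + l):.3f})")
```

The totals should match the 72 matches (one winner and one loser each) and 144 stat-lines in the database. Keep in mind that the winning percentages for the four-case clusters (4 and 10) rest on very little data.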

In conclusion, to answer Coach Flora's question, there are multiple ways to win in the Big 12 (see Clusters 2, 5, and 8), but they all seem to revolve around great hitting. One way to increase the sample size and achieve greater precision in a future study would be to look at win-loss records of games rather than matches. Box scores typically include team hitting percentages by game (to correlate with the winning of games), but blocks, digs, and serving statistics are only reported for the match as a whole. One final issue is that the present analysis tells us nothing about whether the findings are in any way unique to the Big 12; the same relationship between performance metrics and winning might emerge for other conferences, as well. We just don't know.

Tuesday, September 23, 2014

It's a busy week in women's college volleyball, with conference play opening up around the country. The Pac 12 schedule has each team starting off league play against its respective traditional/geographic rival. A pair of matches will be held tonight, featuring Cal (8-2 in nonconference) at No. 1 Stanford (10-0), and No. 20 UCLA (9-2) at No. 9 USC (7-3). Other Pac 12 rivalry matches will be held on Wednesday and Thursday. I already wrote about Stanford's fast start this season, so I will discuss UCLA and USC (among other teams) in the present posting.

The Big 10 (or B1G) begins play with matches Wednesday and Friday. The marquee match-up of the week, not just in the conference, but nationally, features a rematch of last December's national championship tilt between No. 3 Penn State (12-1) and No. 5 Wisconsin (9-1), in Madison. The Nittany Lions' only loss so far this season was in a five-gamer to Stanford, whereas the Badgers' only setback was to Washington, likewise in five games.

The following chart (on which you can click to enlarge) displays information on hitting percentages associated with Penn State, Wisconsin, UCLA, and USC, with each team having its own column.

Looking at PSU in the far left column, for example, we see that the Nittany Lions hit an amazing .395 as a team during nonconference play, with four players, led by middle-blocker Nia Grant (.525), exceeding .350. And this is without last year's seniors Deja McClendon, Ariel Scott, and Katie Slay. Talk about reloading rather than rebuilding! Meanwhile, Penn State has held its opponents to an aggregate .125 hitting percentage. The Nittany Lions' schedule has been moderately tough, including matches against two NCAA Sweet Sixteen teams from a year ago -- American and Kansas -- one against traditional power UCLA, and the aforementioned match with Stanford.

Penn State's gaudy hitting percentages derive partly, but certainly not entirely, from matches against weaker teams. As a team, the Nittany Lions hit .442 against UCLA, with four PSU players each hitting .444 or higher. In the Kansas match, PSU came out smoking on serve-receipt, siding out on 100% (10-of-10) of the Jayhawks' Game-1 serves. Grant hit .467 in this match, Aiyana Whitney, .571, and the Lions as a team, .319.

Wisconsin, whose most impressive wins include a sweep of No. 7 Colorado State and a four-game victory over USC, is hitting .296 as a team, with three players at or near .400. Washington held the Badgers to a .178 team hitting percentage, however, outblocking them 22.0 to 7.5. Even on such a bleak hitting night for the Badgers, setter-turned-outside-hitter Courtney Thomas hit .406. Thomas was profiled in the September 11, 2014 issue of Wisconsin's Varsity Magazine. In beating 'SC, Wisconsin's hitting was at a more characteristic .320.

Finally, we have UCLA and USC. The Bruins' two losses were sweeps at the hands of Penn State and, very unexpectedly, Loyola Marymount (now ranked No. 21 in the nation), whereas the Blue and Gold's best wins have been over No. 16 Illinois and No. 25 Hawai'i. Senior Karsta Lowe is pacing the Bruin offense, not only hitting a team-leading .368, but also taking 30.7% of UCLA's spike attempts (367/1195). Younger players Claire Felix (So.) and Olga Strantzali (Fr.) are also contributing well offensively.

USC recently experienced a three-match losing streak, falling at home to Texas A&M (3-2) and Florida (3-0) two weekends ago and then to Wisconsin last week. The Trojans' best win so far, at least in terms of rankings, was at No. 14 Kentucky. In sweeping the Wildcats, 'SC hit .313 while holding UK to .090.

During the losing streak, the Trojans faltered both offensively and defensively. Sophomore Ebony Nwanebu has hit above .300 for 'SC since returning from early-season injury problems, but Florida kept her totally in check (5 kills and 5 errors on 21 attempts, for a .000 evening). Also, though not as dramatically, Wisconsin contained 'SC junior Samantha Bricio (11-5-46, .130). Defensively, during their losing streak, the Trojans let all three opponents exceed .300 in hitting percentage (Aggies, .319; Gators, .303; and Badgers, .320).
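For readers new to the statistic, hitting percentage is conventionally computed as kills minus errors, divided by total attempts. Here is a small Python sketch that applies that formula to the two kills-errors-attempts lines quoted above; the function name `hitting_pct` is just an illustrative choice.

```python
def hitting_pct(kills, errors, attempts):
    """Standard volleyball hitting percentage: (kills - errors) / attempts."""
    if attempts == 0:
        raise ValueError("hitting percentage is undefined with zero attempts")
    return (kills - errors) / attempts


# Kills-errors-attempts lines quoted in the paragraph above.
nwanebu_vs_florida = hitting_pct(5, 5, 21)
bricio_vs_wisconsin = hitting_pct(11, 5, 46)

print(f"{nwanebu_vs_florida:.3f}", f"{bricio_vs_wisconsin:.3f}")
```

Because errors are subtracted, a hitter with as many errors as kills ends up at .000 no matter how many swings she takes.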

Bruins and Trojans, Nittany Lions and Badgers. Pretty good matches to begin play in the nation's major conferences!

Thursday, September 11, 2014

Heading into the third weekend of the 2014 women's college season, Stanford has been the most impressive team thus far, dominating the most recent AVCA national poll. The Cardinal (4-0) is by no means the only undefeated team; 10 teams in the Top 25 have perfect records. However, it's the difficulty of Stanford's opposition -- Iowa State (in Ames), Nebraska (in Lincoln), Penn State, and Illinois -- that makes the Cardinal's record so noteworthy.

Another challenge Stanford has overcome thus far is the absence of two of last year's seniors, three-time All-American MB Carly Wopat and All-Pac 12 honorable mention OH Rachel Williams. Let's explore how the Cardinal has adapted offensively. I first compared Stanford's offensive statistics for 2013 and 2014 (the latter statistics, based on only four matches, should of course be taken with caution). As the first graph shows, the Cardinal has not hit at quite as high a clip as last year, while its opponents (in the aggregate) have hit a little better this year than last. Still, this year's Stanford squad has outhit the opposition by a sizable margin (.275 to .186).

As the next chart shows, Williams (807 total spike attempts) and Wopat (606) together took roughly 35% of the Cardinal's 4,005 total swings last year. These two non-returning players from 2013 are shown in different shades of grey below, and the percentages on the second line below each player's name signify the share of the team's total swings they have taken (you can click on the graphics to enlarge them).
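The share-of-swings figure is simple division, sketched below in Python using the attempt totals quoted above (the variable names are my own).

```python
# 2013 spike-attempt totals quoted in the paragraph above.
team_attempts = 4005
williams_attempts = 807
wopat_attempts = 606

williams_share = 100 * williams_attempts / team_attempts
wopat_share = 100 * wopat_attempts / team_attempts
combined_share = williams_share + wopat_share

print(f"Williams {williams_share:.1f}%, Wopat {wopat_share:.1f}%, "
      f"combined {combined_share:.1f}%")
```

The same calculation, an individual's attempts divided by the team's attempts, underlies the percentages shown beneath each player's name in the chart.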

That's a lot of offense to replace. Three key Stanford returnees are Brittany Howard (shown in dark cardinal red in the chart), Jordan Burgess (pink), and Inky Ajanaku (bright red). The fact that Burgess's and Ajanaku's line-segments are wider this year than last signifies that they are each taking on a larger share of the Cardinal hitting. Whereas Ajanaku took 12.8% of Stanford's spike attempts last year, she is taking 16.7% of them this year. Burgess's share has gone from 20.3% to 28.1%. (It's pretty common for outside hitters such as Burgess to take more attempts than middle blockers such as Ajanaku.)

Morgan Boukather, who attempted only 48 spikes (around 1% of the team's total) in 2013, is way more active this season, having taken 17.7% of the Cardinal's attempts. Note that Boukather hits from the right side (opposite the setter in the rotation), whereas Williams and Wopat hit, respectively, from the left and middle positions on the front line.

Beyond how frequently an attacker is called upon to hit, there is the question of how effectively she is doing it. Ajanaku has upped her hitting percentage from the already high .438 in 2013 to .474 this season. Burgess, though getting more swings, is not hitting as efficiently this year (a hitting percentage of .194, compared to .294 last year). We'll see if she continues to get so many attempts. Boukather has been a bit up-and-down so far this season, hitting .167 vs. Iowa State, .417 vs. Nebraska, .353 vs. Penn State, and .097 vs. Illinois.