The Art of Logic

Wednesday, May 16, 2018

Side assignment is easy, right? In odd rounds, assign teams to sides at random. In even rounds, assign each team to the opposite side from the previous round. What could be easier?

The problem is that this makes even rounds harder to pair. Any tournament director can tell you that even rounds often "lock up" and that one has to break brackets to make matches. I know I've sat at a screen, wishing the two 5-0s that are both due Aff could hit, instead of each getting a pull-up.

I stumbled on an alternative, what I call the constrained side equalization (C.S.E.) method. Instead of balancing Aff-Neg rounds at the end of even rounds, this method works its magic at the end of odd rounds. Here's the C.S.E. in action:

Rd 1 - paired at random

Rd 2 - paired at random, ignoring sides. If both teams were Aff in round 1, or both Neg in round 1, it's a computer flip-for-sides. If one team was Aff and the other was Neg, then the sides are equalized.

At the end of round 2, about 25% of teams will have two Affs, 25% two Negs, and 50% will be balanced. (It depends on the random pairings.)

Rd 3 - Teams with two Affs must go Neg; teams with two Negs must go Aff. The balanced teams are not assigned to either side. If a balanced team is matched against a two-Aff team, then the two-Aff team goes Neg. Likewise, if a balanced team is matched against a two-Neg team, then the two-Neg team goes Aff. If a two-Aff team is matched against a two-Neg team, then the sides are equalized. And if a balanced team is matched against a balanced team, then it's a computer flip-for-sides.

At the end of round 3, every team will either have had two Affs and one Neg, or two Negs and one Aff. In other words, at the end of an odd round, the sides are "equalized."

The cycle repeats. Round 4 is paired at random, ignoring sides. Round 5 has the constraint that teams with three Affs must go Neg and teams with three Negs must go Aff; otherwise, any team can be paired against any other. If the tournament ends on an odd round, there's no special other consideration. If the tournament ends on an even round, you'd want to pair teams in the typical way for the final prelim.

Mathematically, it is as simple as this rule:

If the Aff rounds - Neg rounds is 2 or -2, then the team is assigned a side first, then paired with an opponent; otherwise, a team is assigned an opponent first, then assigned a side (to equalize if necessary).

This works in odd or even rounds.
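The rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `can_pair`, `assign_sides`, and `simulate` are made up for this sketch, pairings are purely random, and it ignores power-matching, repeat opponents, and byes.

```python
import random

def can_pair(diff_a, diff_b):
    """Two teams may meet unless both are due the same side (both +2 or both -2)."""
    return not (diff_a == diff_b and abs(diff_a) >= 2)

def assign_sides(team_a, team_b, diff, rng=random):
    """Return (aff, neg) for one pairing; diff[t] is Aff rounds minus Neg rounds."""
    if not can_pair(diff[team_a], diff[team_b]):
        raise ValueError("both teams are due the same side")
    if diff[team_a] > diff[team_b]:
        return team_b, team_a  # team_a has had more Affs, so it goes Neg
    if diff[team_a] < diff[team_b]:
        return team_a, team_b  # team_b has had more Affs, so it goes Neg
    # Equal histories: computer flip-for-sides.
    return (team_a, team_b) if rng.random() < 0.5 else (team_b, team_a)

def simulate(num_teams=8, rounds=5, seed=0):
    """Pair an even number of teams at random each round, reshuffling on a side conflict."""
    rng = random.Random(seed)
    teams = list(range(num_teams))
    diff = {t: 0 for t in teams}
    for _ in range(rounds):
        while True:
            rng.shuffle(teams)
            pairs = list(zip(teams[0::2], teams[1::2]))
            if all(can_pair(diff[a], diff[b]) for a, b in pairs):
                break  # every pairing respects the side constraint
        for a, b in pairs:
            aff, neg = assign_sides(a, b, diff, rng)
            diff[aff] += 1
            diff[neg] -= 1
    return diff
```

Calling `simulate(rounds=5)` should leave every team at +1 or -1 (two or three Affs), and stopping after an even round should leave every team at +2, 0, or -2.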

But why go to all this bother? The reason is simple: constraints.

         Odd      Even     Avg.
Trad.    100%     25%      62.5%
Alt.     87.5%    100%     94%

In a traditional method, in odd rounds, 100% of possible matches (n (n - 1)) could be considered. There are no side constraints in odd rounds, so anyone could be matched against anyone. But in an even round, a tournament is limited to a fourth of (n (n - 1)). A due-Aff team can only be matched against a due-Neg team. This is a huge constraint.

Using the C.S.E. method, in odd rounds, teams with more Affs must go Neg and vice versa. Aside from this small constraint (only about one-eighth of possible matches ruled out), nearly anyone can debate anyone. And in even rounds, it's 100% of possible matches that can be considered. The C.S.E. method has much lower overall constraints than the traditional method.
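The one-eighth figure can be checked with a little arithmetic. The sketch below assumes the rough 25/25/50 split described earlier and treats the pool as large enough that the share of matches between two groups is just the product of their shares; the variable names are illustrative.

```python
from fractions import Fraction

# Assumed share of teams in each state going into an odd C.S.E. round.
share = {
    "two_aff": Fraction(1, 4),
    "two_neg": Fraction(1, 4),
    "balanced": Fraction(1, 2),
}

# A match is ruled out only when both teams are due the same side.
ruled_out = share["two_aff"] ** 2 + share["two_neg"] ** 2
allowed_odd = 1 - ruled_out

# Even C.S.E. rounds have no side constraint at all.
allowed_even = Fraction(1)
average = (allowed_odd + allowed_even) / 2

print(ruled_out, allowed_odd, float(average))  # 1/8 7/8 0.9375
```

The average of 93.75% is the figure the table rounds to 94%.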

In other words, the odd C.S.E. round is considerably easier to pair than the even traditional round (21 times better odds of finding a good pairing, in fact). If a side assignment for C.S.E. happens to not turn up a suitable pairing, why, you can reshuffle the teams--switching some randomly selected teams' side, excepting the couple side-constrained teams--and try again. This works whether it's an odd or an even round. In the traditional method, you can only reshuffle with an odd round. You're stuck with the even round side assignments you get with the traditional method. This inability to reshuffle the teams means the tournament can lock up. In the C.S.E. method, because any round can be reshuffled, there's always another chance to find a good pairing.

I worked out an example here. At the end of five rounds of C.S.E., every team had either two or three Affs. The method yielded side "equivalence."

But, intriguingly, the teams took different paths to get there. Some went Aff two times in a row. Some alternated. Although all the paths end with one of two correct results--two or three Affs--there were more path types to get there and thus more options to pair the teams. More paths = more flexibility. We've been doing side assignment the hard way!

Saturday, April 7, 2018

One method for ranking teams that I introduced to the debate community is the logit score. The logit score is derived from a logistic regression. The logit score combines a team's record, speaker points, and its opponents' strength into a single number. Because the logit score factors in record and points, it is performance-based, but that record is adjusted by opponent strength, making the logit score more fair than record alone. A win against a good team is "worth" more than a win against a weak team. If you take the worst opponent a team beat and the best opponent it lost to, and average those together along with the team's average speaker points, then you're approximating the team's logit score. Due to how the logit score is calculated, it is the likeliest team strength that explains its results: its record and its points.

I had previously looked for empirical support for the logit score in a college debate season. I took the real results for the entire season and used them to calculate each team's logit score. I then used those to retrodict the winner in every single match-up that had actually happened, with the team ranked higher by logit score retrodicted to win the round. The logit score did this better than every other ranking method I tested, slightly edging out median speaker points, and doing better by a goodly margin than the win-loss record. Despite this success, there was the nagging concern that the logit score was being derived from an entire season's worth of information. This empirical support could not show if the logit score would work for a single tournament.
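For readers who want to see what such a retrodiction check looks like, here is a minimal sketch. The function name, the data layout, and the sample ratings are all hypothetical; the only idea taken from the text is "the higher-rated team is retrodicted to win."

```python
def retrodiction_accuracy(ratings, results):
    """Fraction of rounds in which the higher-rated team actually won.

    ratings: dict mapping team -> rating (any ranking method's score).
    results: list of (winner, loser) pairs for rounds that really happened.
    Ties in rating are counted as misses in this sketch.
    """
    correct = sum(1 for winner, loser in results if ratings[winner] > ratings[loser])
    return correct / len(results)

# Made-up ratings and results, purely for illustration.
ratings = {"A": 1.2, "B": 0.3, "C": -0.8}
results = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "C")]
print(retrodiction_accuracy(ratings, results))  # 0.75
```

Running the same function with each ranking method's ratings gives directly comparable accuracy figures.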

Therefore I set out to do an experiment. I created a simulation tournament in a program, and ran and re-ran it hundreds of times. I tested various tournament conditions, from random prelims to a typical method of power-matching to pre-matching (like a round robin). I looked to see whether in these kinds of conditions--using only the information available in a tournament--the logit score fared as well in comparison to record-based rankings and to speaker point-based rankings.

The results are that, in any condition, the logit score is a vast improvement on the win-loss record, but not quite as good as speaker points. It may surprise people to realize that speaker points, even though they vary considerably from judge to judge, are the best information to rank teams. A team's median speaker points isn't affected too much by one judge. Speaker points are rich data when you only have six or eight rounds to rank a team.

However, I believe many in the community would not prefer to use speaker points alone. If nothing else, ignoring wins and losses gives a perverse incentive to teams to speak pretty and ignore winning key arguments. The logit score is a solid, thoughtful compromise. The logit score is based on both wins and points, so there's no perverse incentive to ignore key arguments--nor is there an incentive to ignore effective, mellifluous communication. Although the logit score is slightly less accurate for a single tournament than speaker points alone, the logit score is far more accurate than win-loss record is. The logit score is, in other words, a vast improvement on the status quo method--a compromise in name only.

Saturday, November 11, 2017

I've written about gerrymandering before (and solutions to it), but the more I think about it, the best way to fix the problem is to remove the incentive. Proportional representation is good (and here), but the simplest change to the system is mixed member representation.

Here's how the system could work:

Expand the size of the House to 540 districted members. This means the smallest district (Wyoming's) is everyone's size--about 600,000 constituents. The current number of House districts--435--has been fixed since 1911. Some things have changed since then... (I guess this isn't strictly necessary to a mixed-member system.)

Have independent commissions in each state draw the districts, with a first priority to keep communities of interest together, although districts need to have the same 600,000 people, so it won't be perfect. This means you'll have some heavily African-American districts, heavily Latino districts, and big rural districts. Geographical compactness can take a back seat. Yes, that's right--we might have some ugly, squiggly districts. Trust me, this will work. Independent commissions' proposal should be approved by 2/3 of the votes in the state legislature, which should be easily achieved because everyone can recognize that districts are reasonable communities of interest.

Modify ballots to include two questions for the House races: (a) Which candidate do you support for your House district? This could use approval voting to allow selection of more than one candidate in multi-candidate races. (b) Which party would you like to see control the House? This must be a single selection.

Winners on question (a) win their district and get the seat. Because of the way we've drawn the districts, we're more likely to see black representatives run and get elected, Latino representatives, etc.

We've got district races done and can look at the composition of the entire House so far. For example, the House might be 290 seats for party A, 250 for party B. This establishes the seat share for party A of 53.7%. In the next stage--the mixed member part--votes on question (b) are now compiled nationally. If the vote share and the seat share are not the same, then the party that is underrepresented in the House has at-large seats added. Seats are added until the seat share is within 1% of the vote share. In our example, let's say party A won 55% of the vote. Party A would have 4 seats added: 294 out of 544 seats is 54.04%. If two or more parties are underrepresented, whichever one is farther behind has a seat added first. Two parties might ping-pong back and forth in adding seats.
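The seat-adding step is easy to sketch in Python. This is one possible reading of the rule, not a finished statute: the function name `add_at_large_seats` is made up, "within 1%" is taken to mean one percentage point, and ties between equally underrepresented parties go to whichever is listed first.

```python
def add_at_large_seats(district_seats, vote_share, tolerance=0.01):
    """Add at-large seats until no party trails its vote share by more than tolerance.

    district_seats: dict of party -> seats won in district races.
    vote_share: dict of party -> national share of the party vote (0 to 1).
    """
    seats = dict(district_seats)
    while True:
        total = sum(seats.values())
        # How far each party's seat share lags its vote share.
        deficit = {p: vote_share[p] - seats.get(p, 0) / total for p in vote_share}
        party, worst = max(deficit.items(), key=lambda item: item[1])
        if worst <= tolerance:
            return seats
        # The party that is farthest behind gets the next seat.
        seats[party] = seats.get(party, 0) + 1

result = add_at_large_seats({"A": 290, "B": 250}, {"A": 0.55, "B": 0.45})
print(result)  # {'A': 294, 'B': 250}
```

With the numbers from the example, the loop stops at 294 of 544 seats, because 293 of 543 (53.96%) is still more than a point short of 55%.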

Once the total number of at-large seats each party gets is decided, then those new members are selected. The at-large members are chosen from the party's candidates who lost but received the most votes. In other words, party A's four additional seats would go to whichever of its candidates lost very close district races.

Stop and think about the incentives of this proposal. There's actually a triple incentive to draw fair districts. Independent commissions want to get the districting plans to supermajority status; there's no reason to draw unfair districts, as you'll lose any gains in the at-large seats part of the plan; and having several competitive districts might increase your state's representation in Congress. States would want to draw at least a few competitive districts to get one over on the neighboring states.

In theory, it's possible that you have to seat 519 additional members (party A wins 49.9999% of national House votes but loses every single district race), but in all likelihood, we're talking about an extra 5% of seats--perhaps 20-30 additional seats. Altogether, a 570-member House is about 30% larger than today's. It's big but still manageable. And it's gerrymander-proof. The incentive to gerrymander disappears.

And here's the most exciting part: You can vote for a third party to have seats in Congress, even if no one runs (or has a shot) in your district. Let's say you want to vote for the Democratic candidate but throw your party support to the Greens. Or for the Republican candidate and put party support behind the Libertarians. Nationally, those parties will pick up enough votes to amount to at least a few seats. All they have to do is field some candidates in some districts, who will lose, but get picked up in the at-large representatives process.

My hope would be that third parties win enough support to deny either of the two major parties an outright majority, forcing the major parties to form coalition governments with third parties. Suddenly, we're looking at a system that doesn't freeze third parties out of power entirely; we're looking at a system that gives third parties enough seats in Congress to be involved in some leadership decisions. Support for a major party's Speaker might come at the cost of a committee leadership position. The Green party might demand leadership of the Natural Resources committee to support a Democratic speaker. The Libertarian party might demand leadership of Judiciary to support a Republican speaker. It seems likely, though, that this system creates more third party involvement.

Thursday, March 23, 2017

I believe that simple things done right are the bedrock of society: the bus line that's always running; the convenience store around the corner that's never out of bread, milk, or toilet paper, even during the worst snowstorms; or the reliable local newspaper. But there's perhaps no greater collective failure in this country than our massive incompetence at naming streets properly. Naming streets should be as simple as 1-2-3:

1. A contiguous street gets a single name.

2. A name is used only on one street per city.

3. Name them in a pattern that's helpful for navigation.

To be clear, I'm talking about street names, not route designations, like U.S. 52 or State Route 39. A road could have a street name as well as a route designation, or even two route designations or more if geography forces the routes to consolidate for a stretch: Johnson Pass Rd. could be U.S. 52 and S.R. 39 all at the same time.

These rules seem clear, right? Rule 1 requires a little definition: a "street" may pass through multiple intersections in a straight or gently curving manner but must actually cross the other street. In other words, a "street" doesn't take a right angle at an intersection. Rule 2 requires a little clarification; let's allow "Maple Avenue" and "Maple Place" as two separate names, provided that they follow rule 3 by being close together--maybe even intersecting. But I don't believe cardinal indicators--"West Maple Avenue" and "East Maple Avenue"--ought to be allowed for separate streets. Those should be reserved for different sections of the same road.

The most common way the rules are violated is that two non-contiguous streets will get the same name. On a map, they're a straight shot, right in line with each other, but maybe there's a natural obstacle in the way, like a river. If I can't drive (or at least walk) from one end to another without turning, it's not one street; it's two. Give the two streets on the opposite river banks two different names.

You may think this doesn't seem like a big deal, but maybe I'll change your mind when I present to you the worst named street in the United States: Old Hickory Boulevard in Nashville, Tennessee. Look upon these maps and despair for your sanity. Our journey begins at Whites Creek, to the north of Nashville.

Crossing Eatons Creek Rd:

Crossing route 12, you may start to get an ominous feeling, noticing the Cumberland River to both the west and east:

Sure enough, you've hit a dead end:

This is west of Nashville.

Old Hickory Boulevard now jumps the river:

Please note: route 251 south of Old Charlotte Pike is Old Hickory Boulevard. Route 251 north of Old Charlotte Pike is a different road.

Old Hickory Boulevard jumps here, and gets a new route designation: route 254.

Next, OHB meanders along the south side of Nashville. Granny White is not exactly due south, but pretty close.

OHB now winds through Brentwood.

True story: I remember sitting in a Pargo's in Brentwood as a child when a tourist came into the restaurant in tears. "I've driven from one end of Old Hickory Boulevard to the other and I can't find this address!"

The manager took one look at her address. "Oh, this address is Whites Creek. That's the north side of town. This here's the south side." Hope you aren't in a hurry...

Now, watch what happens carefully after crossing 41A.

Did you see it? Old Hickory, which was route 254, took a right turn. Route 254 is now Bell Road.

OHB takes another jump:

As far as I can tell, that little section there is Pettus Rd.

Keep your eyes peeled:

Boom! Another right turn for you! Can't you just imagine a couple driving south on Old Hickory after getting off I-24 and the navicomputer is telling them to turn right onto OHB?

"But TomTom, I'm already on Old Hickory!" as they just breeze right onto Burkitt Road.

Maybe they'd have better luck if they got off I-24 going north on Old Hickory?

Nope.

BTW, Route 171 is now the third route designation. So what happens after that right turn off 171?

Old Hickory Boulevard vanishes at the star. The road used to continue, but then T.V.A. built a dam on the Cumberland river, creating Percy Priest lake to the southeast of Nashville. A section of OHB still exists under that lake. Does it confuse boat tourists as much as the land sections confuse car tourists?

Wait, Old Hickory was a ring road. Does it continue on the other side?

Hello? Anyone seen a crappily named road?

Oh, there you are!

And another jump!

And we're back on solid land. You'll notice Old Hickory now has its fourth route designation, route 265.

We'll just cross I-40. Now you'll recall that OHB already crossed I-40 once before (when OHB was route 251). That means we're now on the opposite side of Nashville: the east.

We just follow OHB north for a bit.

Hermitage, by the way, is the name of Andrew Jackson's house/plantation. Andrew Jackson was nicknamed Old Hickory because he was nuttier than a squirrel's poop.

Let's see... we'll just keep going north.

"Wait, WTF? We're on route 45 now? I thought we were on route 265... We must've changed back there. TomTom still says we're on Old Hickory, hon. Good ole Old Hickory won't let us down, right?"

OHB, now route 45, takes a northwest hook here because of the Cumberland river on both sides. (Like Old Hickory, it's everywhere in Nashville.) Here's the map:

"Oh look, dear, there's a neighborhood called Old Hickory! Oh, how cute."

"Son of a..."

Now, it happens to be worth zooming in a little bit on Lakewood neighborhood first:

That's right, folks. It has two names. Hadley Avenue and OHB. It's officially broken all the simple naming convention rules and spiked the ball in the end zone.

But now, let's see what happens a little to the north, in Old Hickory neighborhood:

Nothing good for our tourists. OHB just disappears. (Hadley Avenue, the jerk, continues to the right.) Why is the neighborhood called "Old Hickory" when Old Hickory Boulevard doesn't run through it!!

Where did that wascally street go?

Oh, it magicked itself across its eponymous neighborhood. Right. To be clear, that whole section of route 45 I haven't marked is all Robinson Road. All the time. Sure, the locals who are just running down to the Piggly Wiggly know they turn on OHB which then becomes Robinson. But streets aren't named for locals, are they?

In case you're wondering, Old Hickory Community is where all the lost tourist children go to live, if their mums or dads can't navigate the streets of Nashville and pick them up by closing time.

Surely, surely, surely, OHB has pulled its last trick?

This one is a doozy. You'll notice an East Old Hickory Boulevard to the south of route 45. That's odd. Why would the East OHB be south of regular OHB?

Because route 45 ain't OHB any more.

East OHB is it. The best part is what happens inside that star. The name jumps from 45 to the surface street--but there's no physical connection. (Also, let's point out the East OHB goes around its corner, and that non-intersection changes its name to Sandhurst Drive.)

"Getting lost is... just a way to have an adventure, dear! Just... um, wasn't planning this and we're low on gas..."

"Oh look, hon, an Old Hickory Community. Maybe they can help us!"

If there's an East OHB, is there a West OHB? Indeed there is:

but you gotta take another jump.

OHB is nearly out of tricks, though:

At the star, it changes names from West OHB just back to plain vanilla Old Hickory Boulevard. BTW, crossing I-65 a second time means that we're on the north side of Nashville again.

A few more miles--crossing I-24 a second time--and we're back to Whites Creek:

You can almost hear the tourists wailing: "I just wanted... [sob] just to see... some country music stars' homes! I didn't want to drive all around creation!"

"And where are our children?!"

Let's do the numbers:

Route designations: Five (251, 254, 171, 265, and 45)

Two street names simultaneously: Yes (OHB and Hadley)

Street takes a right turn: Three times (all between 41A, I-24, and 171 in southeast Nashville)

Jumps over water: Three (Cumberland river, Percy Priest lake twice)

Jumps over other roads: Four (251 to 254; over Pettus Rd.; from route 45 to East OHB; from East OHB to West OHB)

Jumps over neighborhoods: One--but double points because it's eponymous

Switching names while driving down the same street, not otherwise covered: Two (West OHB turning back into OHB; East OHB turning into Sandusky Rd. The West OHB to regular OHB could be OK, I guess... No one is going to get lost if the numbering makes sense... which it doesn't.)

I think this deserves a total of 15 naming violation points: +1 for two names simultaneously, +3 for right turns, +9 for jumps, +2 for two name switches. (Or maybe 14 points, if you're cool with West OHB to OHB.)

I defy anyone to come up with a worse named street in the U.S. Map-based proof required.

BTW, in case you couldn't tell, I'm originally from Nashville. No offense is intended; I think it's fair to poke a little fun at your hometown.

Sunday, February 26, 2017

A federal court recently struck down a gerrymandering scheme in Wisconsin in a case that could set a major precedent for the country. Once every ten years, after each Census is completed, the boundaries for House of Representatives districts have to be re-drawn to keep their populations equal. The U.S. Constitution leaves it to state legislatures to decide how to draw these districts. Gerrymandering is the intentional abuse of that power; legislatures might gerrymander to keep minority groups out of power or to benefit one political party. The longtime practice of gerrymandering has always had its critics. As President Obama recently said, “Politicians should not pick their voters; voters should pick their politicians,” though Obama didn’t coin the phrase and wasn’t the first to express exasperation about gerrymandering.

Contrary to popular opinion, gerrymandering isn’t about protecting incumbents by giving them safe districts. The actual process of gerrymandering involves two steps: packing and cracking. Packing is the placement of your opposition’s voters into a few, concentrated districts. Cracking is the distribution of the remaining opposition voters into districts that they can’t win. Here’s what a gerrymandering scheme using packing and cracking could look like:

Possible party B gerrymander

            Votes for party
District    A      B      Winner
1           95     5      A
2           45     55     B
3           45     55     B
4           45     55     B
5           45     55     B
Total       275    225

District 1 is packed with party A supporters. Party A’s remaining voters are cracked across the other four districts, which they can’t win. Even though party A received 275 out of 500 votes, or 55%, they win only one district out of five, or 20%. There’s no way party B could have gerrymandered this any better. Four districts are safe enough that party B will likely never lose those races, even in a bad election year for their party. Trying to give their party a bigger margin in any of those races would only make another race closer. Getting the right vote totals in each district may require drawing some unusually shaped districts. Gerrymandering gets its name from an 1812 Massachusetts district map, approved by Governor Gerry, with one district that looked like a salamander. The map benefited his party, even though Gerry lost his own office for it.

The U.S. Supreme Court has never struck down a gerrymandering scheme that attempted partisan gain, only gerrymandering done to deprive minority groups of voting power. The Voting Rights Act prohibits racially motivated gerrymandering, and justices have also looked to the Equal Protection Clause of the Fourteenth Amendment. The Court has allowed the creation of districts where a minority group is a near majority of the voters to ensure that minority groups can elect their own representatives to Congress. In southern states, for example, African Americans vote so heavily Democratic, and white people vote so heavily Republican, that some districts must approach a 50-50 racial mix in order to elect black congresspeople. The Supreme Court has allowed this as long as race isn’t the primary factor in making the districts. Two racial gerrymandering cases will be heard by the Court soon, Bethune-Hill v. Virginia State Board of Elections and McCrory v. Harris, so the standards might be changing soon.

Partisan gerrymanders, however, have long been ignored, although Justice Kennedy has indicated that if a clear standard for judging gerrymandering’s severity could be found, he would rule against partisan gerrymandering as well. Along with the four liberal justices on the Court, Kennedy might bring forth a new Supreme Court precedent. The Court, by the way, cannot decline to make some ruling on the Wisconsin case.

The Wisconsin case is the result of an unlikely group of statisticians, political scientists, and lawyers attempting to serve up to Justice Kennedy a standard for judging gerrymandering. Their work is premised on the concept of a “wasted vote”: any votes above 51% or any vote in a lost race are considered “wasted.” In the hypothetical gerrymandering scenario, this is what the wasted votes look like:

Wasted votes in party B gerrymander

            Votes for party          Wasted votes for
District    A      B      Winner     A      B
1           95     5      A          44     5
2           45     55     B          45     4
3           45     55     B          45     4
4           45     55     B          45     4
5           45     55     B          45     4
Total       275    225               224    21

Party B gerrymandered the districts to waste 224 of party A’s 275 votes. Party A’s wasted votes almost equal the total votes party B received! Of course, the plaintiffs would also have to prove that gerrymandering happened intentionally, but proving too many votes are wasted is the necessary first step. No mathematical evidence, no case.
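The wasted-vote tally for the hypothetical districts can be reproduced with a short Python sketch. It follows the convention used here (the winner needs 51% of the votes cast, everything above that is wasted, and every losing vote is wasted); the function name is illustrative, and an exact tie is arbitrarily treated as a win for party B.

```python
def wasted_votes(votes_a, votes_b):
    """Return (wasted_a, wasted_b) for a single two-party district."""
    total = votes_a + votes_b
    needed = (51 * total + 99) // 100  # 51% of the votes cast, rounded up
    if votes_a > votes_b:
        return votes_a - needed, votes_b  # A's surplus, all of B's votes
    return votes_a, votes_b - needed      # all of A's votes, B's surplus

# (party A votes, party B votes) for districts 1 through 5.
districts = [(95, 5), (45, 55), (45, 55), (45, 55), (45, 55)]
wasted = [wasted_votes(a, b) for a, b in districts]
total_wasted_a = sum(w[0] for w in wasted)
total_wasted_b = sum(w[1] for w in wasted)
print(total_wasted_a, total_wasted_b)  # 224 21
```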

Using the wasted votes standard proposed in the Wisconsin case, seven states have Congressional districts that are suspicious: Florida, Michigan, North Carolina, Ohio, Pennsylvania, Texas, and Virginia—all of them pro-Republican gerrymandering. In Pennsylvania, the Republican Senate candidate won 51% of the two-party vote, as did Trump. The Pennsylvania House delegation, on the other hand, will be thirteen Republicans to five Democrats, or 72% Republican. One reason all the current gerrymandering schemes are Republican is that the G.O.P. controlled more state legislatures after 2010, when the last re-districting was done.

Another standard proposed for measuring gerrymandering is to look at the median district. In the hypothetical gerrymander given before, the median district—the middle in a list from party A’s worst to best district—is a 55% to 45% result in favor of party B. Yet party A received 55% of the overall votes. This gap of 10 percentage points between the median district and statewide total is sizable.
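The median-district measure is just as quick to compute. The sketch below uses the hypothetical five districts of 100 voters each; the variable names are illustrative.

```python
from statistics import median

# Party A's votes in each hypothetical district (100 voters per district).
a_votes = [95, 45, 45, 45, 45]
voters_per_district = 100

median_share = 100 * median(a_votes) / voters_per_district
overall_share = 100 * sum(a_votes) / (voters_per_district * len(a_votes))
gap = overall_share - median_share

print(median_share, overall_share, gap)  # 45.0 55.0 10.0
```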

Packing isn’t necessarily bad. A district could reflect a real community of interest, a group of people with similar social, economic, and political interests. For example, in Oregon, the Democratic candidate for Portland’s congressional district ran unopposed. The people of Portland share a similar enough view with the Democratic candidate that it deterred any Republicans from challenging the seat. The Supreme Court has ruled that predominantly African American or Hispanic districts can ensure minority representation in Congress and can serve a community of interest’s needs.

Likewise, cracking isn’t necessarily bad either. It depends on the ratio. A 50-50 split district is competitive. Even a 52-48 split could swing to the other party in some years. The real question is about one party being systematically disadvantaged by packing and cracking. So how does Oregon fare?

Oregon 2016 results in U.S. congressional races

            % votes for*                       % wasted votes for
District    Democrat   Republican   Winner     Democrat   Republican
1           62         38           D          11         38
2           28         72           R          28         21
3           100        -            D          49         0
4           58         42           D          7          42
5           55         45           D          4          45

* This is the two-party vote share; third party and write-in results excluded for simplicity.

District 3 is “packed” for the Democratic candidate who ran unopposed. Offsetting this is the fact that District 2—all of eastern Oregon—is packed for the Republican. However, Republican voters seem to be “cracked” into Districts 4 and 5, central-west and southwest Oregon respectively.

How does Oregon look on either measure of gerrymandering? The Democrats took 58% of the two-party vote share. The median district is district 4, and Democrats won 58% there, so the gap is zero. However, on the wasted votes measure, Oregon is not doing as well. Democrats wasted 326,030 out of 991,008 votes statewide, or 33% wasted. Republicans wasted 524,332 out of 709,716 votes, or 74% wasted. Ideally, both parties would waste about 50%. The divergence between the two measures of gerrymandering (one good, one not-so-good) is why the Supreme Court wants to settle on one standard, not two or more competing definitions, of partisan gerrymandering.
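The statewide percentages follow directly from the vote totals quoted above, as this small check shows (the variable names are illustrative):

```python
# Statewide two-party totals and wasted votes quoted in the text.
dem_wasted, dem_total = 326_030, 991_008
rep_wasted, rep_total = 524_332, 709_716

dem_wasted_pct = 100 * dem_wasted / dem_total
rep_wasted_pct = 100 * rep_wasted / rep_total
dem_vote_share = 100 * dem_total / (dem_total + rep_total)

print(round(dem_wasted_pct), round(rep_wasted_pct), round(dem_vote_share))  # 33 74 58
```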

Based on Oregon Republicans winning 42% of the two-party vote, the state might be expected to have about two Republican congresspeople out of five. One could imagine an alternative to the current district 4 and 5 arrangement that shuffled counties into two new districts: a greater Willamette Valley district comprising Salem, Albany, Corvallis, and Eugene, solidly Democratic; and a U-shaped Cascades, south-central, and coastal Oregon district, leaning Republican. This would move one of the districts into the Republican column. However, it’s often difficult to shift a few voters around and create balance as measured by wasted votes. The standards that people have proposed only kick in when gerrymandering creates a two-seat difference or more because it isn’t always possible in small states to make districts balanced. Geography can get in the way.

Some political scientists have proposed using computer programs to draw district boundaries, but this doesn’t solve the root of the problem. For example, a program might try to create more compact districts. That tends to pack Democrats into small, round city districts, wasting Democratic votes. Alternatively, a program might try to create short, straight-line district boundaries, cutting a state into districts like you might cut a cake into irregular polygons. That tends to pack Republicans into large, rectangular rural districts, wasting Republican votes. The bias in the program comes from preferring one type of shape to another. Natural and human geography can necessitate all different shapes to reflect real communities of interest. An eastern Oregon district makes sense, as does a coastal Oregon one, but one district is a near square and the other would be pencil-shaped.

The best hope is for states to put non-partisan commissions, not state legislatures, in charge of drawing reasonable boundaries. Iowa has a long-standing commission; Arizona, California, and New Jersey have newer commissions. There are strengths and weaknesses to each state’s set up for its commission, but the outcomes have been better with commissions than without. Perhaps the threat of losing a federal case for gerrymandering will persuade more state legislatures to enact a non-partisan option, only 204 years after Governor Gerry learned his lesson the hard way at the hand of Massachusetts voters.

Saturday, February 11, 2017

I recently published an article on a new debate team-rating method I invented, called the logit score. I hope the logit score will take its place among win-loss record, average speaker points, median speaker points, opponent wins, ranks, and so on as an effective way to rate (and thus rank) debate teams at a tournament.

What is the logit score?

The basic idea is simple: the logit score combines win-loss record, speaker points, and opponent strength into one score using a probability model. In other words, the logit score is the answer to the question, "Given these speaker points and these wins and losses to those particular opponents, what is the likeliest strength of this team?"

Let's take a step back and acknowledge a truth not universally acknowledged in debate: results should be thought of as probabilities, not certainties. A good team won't always beat a bad team--just usually. Off days, unusual arguments, mistakes, and odd judging decisions all contribute to a slight risk of the bad team winning. The truly better team won't always prevail. That means actual rounds need to be thought of as suggesting but not definitively proving which team is better. Team A beats team B. Team A is probably better, but then again, team B could have had an off day, been surprised by a weird argument, or had a terrible judge. If team A got much, much higher speaker points, it was very likely the better team. If team A only edged out team B by a little bit, then the uncertainty grows.

That's where the logit score comes in. Estimating team A's actual, true strength depends on putting together all of those probabilities and uncertainties into one model. I won't get into the specifics (the details are in the article), but the basic idea is to use a logistic regression to combine the probabilities of the wins and losses against specific opponents with the probabilities of the specific speaker points received. The logit score for a team means: "If team A were estimated to be stronger, these results would be a bit more likely, but those other results would be far less likely. If team A were estimated to be weaker, these results would be far less likely, even though those other results would be a bit more likely. This logit score is the proper balance that makes all the results most likely overall." Because it factors in all the results in one probability model, the logit score isn't sensitive to outliers: unusually high or low speaker points, losses to outstanding teams, and wins over terrible teams don't affect the logit score much at all.
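
To make the "proper balance" idea concrete, here is a minimal sketch in Python. It is not the model from the article: it assumes the opponents' strengths are already known, it ignores speaker points entirely, and it uses a plain logistic win probability. The function names and the sample numbers are made up for illustration.

```python
import math

def win_probability(strength, opponent_strength):
    """Logistic chance that a team beats an opponent (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(strength - opponent_strength)))

def log_likelihood(strength, results):
    """Log-probability of the observed results if the team had this strength.

    results is a list of (opponent_strength, won) pairs.
    """
    total = 0.0
    for opponent_strength, won in results:
        p = win_probability(strength, opponent_strength)
        total += math.log(p if won else 1.0 - p)
    return total

def likeliest_strength(results, low=-10.0, high=10.0, tolerance=1e-9):
    """Find the strength that makes the results most likely overall.

    The slope of the log-likelihood is (actual wins - expected wins), and it
    only decreases as strength rises, so bisection on the slope works.
    An undefeated or winless team has no finite best strength in this toy
    model; the search then simply stops at the high or low bound.
    """
    def slope(strength):
        return sum(
            (1.0 if won else 0.0) - win_probability(strength, opp)
            for opp, won in results
        )

    while high - low > tolerance:
        middle = (low + high) / 2.0
        if slope(middle) > 0:
            low = middle   # results say the team is stronger than this
        else:
            high = middle  # results say the team is weaker than this
    return (low + high) / 2.0

# Hypothetical team: two wins and one loss, all against average (0.0) opponents.
results = [(0.0, True), (0.0, True), (0.0, False)]
estimate = likeliest_strength(results)
```

In this toy case the estimate lands near 0.69 (the natural log of 2), which is the strength at which a team would be expected to win two rounds out of three against average opponents. Nudging the estimate up or down from there makes the three results, taken together, less likely.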

Does the logit score have any empirical results to back it up?

I took a past college debate season, used those results to give every team a logit score, and then looked to see how well logit scores "retrodicted" the actual results in a season. That is to say, how often did the higher logit scoring team win rounds against the lower logit scoring team? As a baseline of comparison, I also did the same kind of analysis by ranking the teams by win-loss record.
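
The retrodiction check itself is easy to sketch. The Python below is only an illustration of the counting, not the code used for the article; the team names, ratings, and rounds are invented, and skipping rounds between equally rated teams is an assumption of this sketch.

```python
def retrodiction_accuracy(ratings, rounds):
    """Share of rounds won by the team with the higher rating.

    ratings maps team name -> rating (a logit score, a win count, etc.).
    rounds is a list of (winner, loser) pairs.
    Rounds between equally rated teams are skipped, since the rating
    makes no call either way (an assumption of this sketch).
    """
    correct = 0
    counted = 0
    for winner, loser in rounds:
        if ratings[winner] == ratings[loser]:
            continue
        counted += 1
        if ratings[winner] > ratings[loser]:
            correct += 1
    return correct / counted if counted else 0.0

# Made-up example data, not results from the article.
rounds = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "A")]
wins = {"A": 2, "B": 1, "C": 1}
hypothetical_scores = {"A": 1.2, "B": -0.4, "C": 0.1}

accuracy_by_wins = retrodiction_accuracy(wins, rounds)
accuracy_by_scores = retrodiction_accuracy(hypothetical_scores, rounds)
```

The same function works for any rating, so the win-loss baseline and the logit scores can be compared on identical rounds.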

The logit score rankings got slightly more rounds correct than the win-loss record rankings.

The slightly higher accuracy is not, on its own, a reason to rush to adopt logit scores. It merely shows that the logit scores aren't doing anything crazy. For the most part, the logit scores reshuffle teams ever so slightly with their nearest peers. The moves are slight ups or downs, not drastic shifts.

The real reason to consider using logit scores is that (a) they are less sensitive to outliers, which can matter a lot for a six- or eight-round tournament; and (b) they factor in more information. Win-loss records only use speaker points as a tiebreaker; they're secondary. Measures of opponent strength usually come third. In other words, a team that has a really tough random draw and goes 4-2 as a result of dropping the first two rounds might miss out on breaking if no 4-2s break--win-loss record comes first and opponent strength won't factor in in that scenario. The logit score on the other hand--because wins, points, and opponents are all factored in at once--could reflect that this team is in fact very strong because it only lost two rounds to very good opponents. (See how important it is to be less sensitive to outliers?) More information also rewards well-rounded teams: those that win rounds on squeakingly close decisions and don't receive great speaker points are penalized more under a logit score system than under a win-loss-then-speaker-points system.

Thursday, March 31, 2016

It's been a while since I've written anything--life gets in the way. Mostly, I've been working on my new book, Statistics for Debaters and Extempers, which is 23/29 written. I keep writing chapters but adding new ones to the list. It's like the Winchester House. However, I do have some thoughts I want to share about teaching.

One post I'm proud of is the one about grading. Percent grades are not very informative for teachers. Standards-based grading (SBG) is far better. If you're not familiar with SBG, let me explain it really briefly. The idea is that for each standard (a skill or piece of knowledge students are supposed to learn) on each assignment, you mark the score that the student earns. These scores are often 1 to 4, where 1 is "not demonstrated at all"; 2 is "developing"; 3 is "demonstrated"; and 4 is "mastery". Or some such other scheme. For example, on a math test on fractions, a student might receive a 4 on the adding fractions standard but a 3 on the multiplying fractions standard. All the other standards for the year for that test would be left "N/A". SBG can exist side-by-side with a percent grade, too.

Ideally, students would be assessed on each standard multiple times. They could demonstrate mastery on the standard on tests, homework, or projects. Students should be able to show at least a 3 on a standard multiple times, say three times, to earn an overall 3 on it. An SBG scheme might also look only at the most recent three times a standard has been assessed. For example, a {2, 3, 3} could be coded as a 2, a {3, 4, 3} coded as a 3, and a {3, 4, 4} coded as a 4. The student earning a 2 wouldn't be penalized; they'd be given another chance to earn a 3. The other two students who earned 3's and 4's wouldn't need another assessment.
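
Here is one way such a coding rule might look in Python. The rule is an assumption, chosen only because it reproduces the three examples above: if any of the most recent three scores is below 3, report the lowest one; otherwise report the middle score. A real scheme could easily differ, and the function name is hypothetical.

```python
def overall_score(scores, window=3):
    """Code a standard from its most recent assessments.

    Assumed rule, chosen only because it matches the three examples:
    if any recent score is below 3, report the lowest one; otherwise
    report the middle (median) score. Returns None until the standard
    has been assessed `window` times.
    """
    recent = scores[-window:]
    if len(recent) < window:
        return None  # not enough evidence yet
    if min(recent) < 3:
        return min(recent)  # the student still needs another chance
    return sorted(recent)[len(recent) // 2]

# The three examples from the text.
example_codes = [
    overall_score([2, 3, 3]),
    overall_score([3, 4, 3]),
    overall_score([3, 4, 4]),
]
```

Because only the latest three scores count, a rough start on a standard drops out of the picture once the student has shown three solid results in a row.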

One thing I hadn't thought about before: SBG opens the door to indicating to students which test, quiz, and homework questions reveal which level. For example, one could mark questions as 2's, 3's, and 4's. A teacher could explain that getting all the 2's right is a necessary developmental step but not an endpoint. A student who can answer all the 2-level questions right should recognize the achievement but push himself or herself to do the 3-level questions. Likewise, a student getting all the 3-level questions right should recognize the achievement but push to do 4's. Basically, to use a buzzword, it allows the teacher and student to differentiate the work they do. Kids at the top could be told, "When you do your homework, spend half the time on 3's to prove you can do them, and spend the rest of your time doing the 4's for exercise." Kids in the middle could be told, "Spend a third of your time on 2's to prove you can do them, a third on 3's to really exercise, and a third on 4's to see if you can really stretch." Kids at the bottom could be told to spend equal time on 2's and 3's. It gives kids of every ability a chance at both comfortable practice and practice for growth.

* * *

A completely random idea: why do we have the S.A.T.? I think the biggest reason colleges want to keep it is that it is hard to know what schools' curricula cover and what their grading means. Grades from one school aren't really comparable to grades from another.

But what if the S.A.T. 1 format (you know, one hour each of math, reading, and writing) were basically ditched in favor of the S.A.T. 2 / A.P. subject style tests? Colleges could verify what each school's transcript actually meant. Even if the tests aren't necessarily accurate for individual kids, they would be accurate for an entire school's worth of test-takers.

Here's how I imagine it working. Gone are Saturday tests. Gone are students being solely responsible to sign up (this harms poor kids and kids who are the first in their families to go to college). It is the school's responsibility to look at the different test options and sign the kids up for the right tests. These tests would happen in May, during the school day, just like the A.P. tests do.

Math, English, and foreign languages would only need to be tested in the May of junior year. Obviously there would need to be a different exam for each foreign language. The English exam could have two options, say, a regular level exam and an honors level exam. (I imagine a vast chunk of material that overlaps between the two so that scores are comparable.)

Math would be a bit tricky. There would need to be several different exams reflecting the fact that juniors end up in very different places. The school would be responsible for guiding students in the different classes to pick the right exam. I imagine these tests would be about three hours, like the current A.P. tests are.

Sciences and history would be even trickier. Every student basically takes biology, chemistry, and physics, but the order differs from school to school. Most schools do biology in freshman year, but some start with physics. In history, the usual sequence is world history, European history, and U.S. history, but there are many deviations from that pattern. However, this seems like a surmountable problem for the test designers. The bigger problem to me is making sure that these subject tests don't get bloated and require extensive cramming of facts and instead test higher-level scientific and historical reasoning skills. (These subjects are the A.P. tests that come in for the most abuse for this issue.) To keep things balanced and prevent bloat, each of these tests would be kept to one hour.

Basically, I'm talking about expanding the A.P. tests for all students, not just at the honors level but also at the regular level. Everyone submits ten scores: math, English, foreign language, three sciences, three histories, plus one more of their choice (could be computer science, or economics, or art history--whatever they want). Junior year, we're talking about a week of testing, but in sophomore and freshman year, it would only be two hours of testing (science plus history), so they would more or less have normal classes during that week. It's even possible to devise a basic schedule:

People complain about the inequity of A.P. testing, and I agree. But making the A.P. tests mandatory and putting the burden on schools solves that problem. And my system obviates the need for giving the S.A.T. 1, which is inequitable because preparing for it requires work outside of school. This hurts the poor kids who won't be able to get any additional help for it.