Thanks, XimenBao, for accepting this debate. You have not explained whether you will be defending the current system or arguing for an alternative one. I assume that you will do that in this round.

Any average debater, on his day, can upset a top debater. Ranking debaters objectively is an interesting problem. A statistical solution to this problem was developed by Arpad Elo. He devised a system where ratings are decided by looking at the ratings of the opponents a particular debater has lost to or defeated.[1]

The rough idea is that the probability of winning against your rival is estimated from the current ratings using statistical techniques. If your probability of winning was low and you win, you get a big boost in rating. On the other hand, if the probability of winning was high and you win, it is not considered a big achievement and you get a small boost. Similarly, losing against a weak opponent causes a big point loss, but losing against a strong opponent causes a small point loss.

Permit me to explain the incentives another way:

If you contest against a weak debater: Your chances of winning are high (EASY for you) but on winning, the point increase is small (BAD).

If you contest against a strong debater: Chances of winning are low (CHALLENGING) but on winning, the point increase is large (GOOD).

Thus this system provides an incentive to try tough things and avoid the easy path.
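To make the rough idea above concrete, here is a minimal Python sketch of the textbook Elo update. The K-factor of 32, the 400-point scale, and the sample ratings of 1400 and 1800 are assumptions chosen for illustration; nothing here is taken from debate.org or from the references cited in this round.

```python
def expected_score(rating, opponent_rating):
    """Estimated chance that `rating` beats `opponent_rating` (logistic Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(rating, opponent_rating, score, k=32):
    """New rating after one debate. `score` is 1 for a win, 0.5 for a tie, 0 for a loss.

    k=32 is an assumed K-factor; real deployments pick their own value.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    weak, strong = 1400, 1800  # hypothetical ratings
    print("weak beats strong:   ", round(update(weak, strong, 1) - weak, 1))
    print("strong beats weak:   ", round(update(strong, weak, 1) - strong, 1))
    print("weak loses to strong:", round(update(weak, strong, 0) - weak, 1))
    print("strong loses to weak:", round(update(strong, weak, 0) - strong, 1))
```

Under these assumptions the upset win is worth roughly ten times as many points as the expected win, and the expected loss costs roughly a tenth of the upset loss, which is the incentive structure described above.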

==Advantages==

1. Fair: The current system is based on number of wins. This is heavily biased towards older players. In the Elo system new players can top the tables quickly if they consistently defeat top debaters.

2. Well established: This system is working in several sports. Examples include chess, football, baseball. Many online games also use this system[2].

3. Reward for extraordinary debater: If a new debater challenges and defeats top 10 debaters for 10 consecutive matches, we can say he should be in the top 10. However, the current system will record his/her performance as just 10 wins, which will give him a very average percentile.

4. Emphasis on skill: It will force top debaters to maintain their form. They will not be at the top just because they are fairly good debaters who joined 2 years ago and kept debating regularly. Rather, they will have to be the best debaters every time. I believe they will find this challenge rewarding.

5. Opponent strength: Currently a person can increase his rank by taking part in a large number of debates against low-ranked debaters. On the other hand, a debater who dares to debate better ones is penalised, as chances of victory are lower. Moreover, debating against a top debater will usually consume more time and research. In this system the reward of winning against a top debater will compensate for that.

6. Professional: Debaters can use their Elo rating on a resume. Saying that I have an Elo rating of 2600 on debate.org sounds much more professional than saying I have a rank of 20 on debate.org.

7. Easy to implement: The implementation is simple. Stable GPL implementations are available[3]. Implementation from scratch is also not tough. (I can do it, and my major is not computers!) It can be crowdsourced as well to a website like 'topcoder'[4], which incidentally uses a variant of Elo for rating its coders!

8. The old system is there: The Elo ratings can appear in a new column in the leaderboard. If someone really wants to sort on the basis of wins only, they can easily do so.
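Points 1, 3, and 5 can be illustrated with a small simulation. The sketch below is hypothetical: the starting rating of 1500, the opponent ratings of 2200 and 1100, and the K-factor of 32 are all assumed values, and opponents' ratings are held fixed for simplicity. It compares ten straight wins against strong opponents with ten straight wins against weak ones, which a pure win count treats as identical.

```python
def expected_score(rating, opponent_rating):
    """Estimated chance that `rating` beats `opponent_rating` (logistic Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def simulate_streak(start, opponent_ratings, k=32):
    """Rating history of a debater who wins every listed debate.

    Assumption: opponents' ratings stay fixed; a real system would update them too.
    """
    history = [start]
    for opponent in opponent_ratings:
        current = history[-1]
        history.append(current + k * (1.0 - expected_score(current, opponent)))
    return history


if __name__ == "__main__":
    vs_top = simulate_streak(1500, [2200] * 10)   # ten wins over strong debaters
    vs_weak = simulate_streak(1500, [1100] * 10)  # ten wins over weak debaters
    print("after 10 wins vs 2200-rated:", round(vs_top[-1]))
    print("after 10 wins vs 1100-rated:", round(vs_weak[-1]))
```

Both streaks count as 10 wins, but under these assumptions the first lifts the rating far more than the second. How quickly a newcomer would actually reach the top of the table depends on the K-factor and the rating spread the site chooses.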

Thank you for your argument, Baggins. I will be arguing first in favor of the current system and then for two alternatives: no rating system at all and the Glicko-2 system. All of these are superior to the Elo system. My burden is to make a convincing and well-supported case that at least one of them is superior to the Elo system. Pro’s burden is to show that the Elo system is superior to all of my proposed alternatives.

Status Quo:

Before I begin to discuss the problems with Elo or the benefits of the alternatives, the question of what goal a system is trying to accomplish needs to be explored. While Pro never explicitly states his vision of a ranking system’s goal, I think it’s fair to conclude from his argumentation that he considers a ranking system’s goal to be to evaluate and report a competitor’s skill level in a given activity, primarily because this is the goal of the Elo system. My defense of the status quo will discuss why this is not the current goal of debate.org, meaning a ranking system with that goal will be at odds with the purpose of the website and thus inappropriate for adoption.

Debate.org is defined in R1 as “the website http://www.debate.org...” This refers to the website as it currently exists, with its current rules, guidelines, goals, and expectations. It does not refer to an alternative version of the website that’s changed to accommodate either of our visions of what it should be, apart from adopting a new rating system.

As it stands, the goal of debate.org is not to allow members to measure their skill as debaters against each other. The reason for debate.org is “to enable a person of any creed, nationality, gender, or sexuality to have a platform to voice their opinions and to share ideas on any topic they choose.” Its mission is “to be the premier website for online debates across thousands of topics.(1)” There is an absence of any language suggesting that ranking or competition is a goal of the site.

If matches of debating skill were the point of this website, the allowed debates would be limited to those which require debating skills. Instead, troll debates (2), discussions of pop culture (3), statements of preference regarding games (4), rap battles (5), and joke debates where facetious logic and humor lead to victory (6) are not only allowed, but count towards rankings. In addition, votes are allowed to stand even when voters are explicit that their RFD was because of their preformed opinions of the topic rather than the debating skills of the participants (7).

Debate.org is not intended to be a place to accurately determine the strengths of debaters. The two purposes are to be a platform for sharing ideas, and to be as popular as possible. These purposes are apparent in the choices debate.org has made. All of the debate practices linked above are allowed in order to encourage as many people to create as many debates as possible, even when they aren’t really even debates. The current ranking system works towards the goals of the website, as acquiring a high ranking requires long-term membership and frequent visits to the site, much more so than the Elo system would. Given this, debate.org should not adopt the Elo system because it runs contrary to the goals that it has set for itself and should remain with the status quo because it is consistent with those goals.

No System:

If my opponent argues that some other set of goals should be used instead of debate.org’s, in addition to supporting that, he must contend with the fact that debate.org as it stands provides a venue for a different kind of activity than chess, or sports, or other competitions Elo is used to rank. As Pro’s definition stated, the Elo system is to be used in “calculating the relative skill levels of players in two-player games such as chess. (8)” However, debates at debate.org are not games in the sense that chess is a game, and the system does not apply to activities like this.

Chess, sports, and other activities rated in Elo systems have objective winners and set rules. This leads to comparative outcomes. The reason you can say that a chess player with a higher Elo rating is better at chess than a player with a low rating is because you know that games of chess with identical rules were played and the winner increased his Elo rating appropriately. However, in debates the better debater may not ‘win’ and at debate.org wins may not represent the same game being played.

There is not an objective winner in debates. Instead the outcome is in the hands of other site members. Vote bombs can easily skew the result of a debate. Forfeits may even count as ties if no one votes on them. While this is an issue in all aspects of debate, in collegiate and even high school debate RFDs are usually given in person, and there are feedback mechanisms which generally prevent completely biased judges from voting on the topic rather than the debaters. Such judges would not likely be invited back to a high school debate and complaints would be made if they were collegiate coaches.

There are very few set rules. Debate.org does not differentiate between a joke contest, a rap battle, or an actual debate. This would be analogous to an Elo system being used in chess that doesn’t differentiate between a chess puzzle, a chess match, and a 4 year old playing ‘horsie’ around the kitchen using the knight.

The website simply isn’t the kind of thing that can be rated with an Elo system. An Elo system will not trigger Pro’s advantages because it doesn’t show the same kind of results that it does in other games Pro uses for the basis of his argument. Having no rating system is more accurate, more fair, and thus better than having a rating system used for an inappropriate purpose.

Glicko-2:

Even if Pro wins that the goals of debate.org should not be used in determining which ranking system to use on debate.org and that the Elo system is applicable in the context of debate, it still isn’t the best system to use.

Glicko-2 is a system designed to expand the capabilities of the Elo system and address weaknesses in its mathematical model. The Elo system assumes that its ratings are a reliable indicator of player ability, ignoring situations where that is not the case, such as a match between evenly ranked players where one has been playing constantly and the other is returning from a long stretch of not playing.

This means the reliability of the first player’s score is greater than the second player’s score. If the second player wins, his score should increase more than the first player’s should decrease, as the win shows that his strength is probably greater than it is currently scored. Conversely, since the first player’s score is already reliably determined, and he is playing against an unreliably scored player, losing doesn’t reveal much new information and so his score should not be reduced as much.

The Glicko system uses the concept of ratings deviation to calculate how reliable a player’s rating is and how many points should be gained or lost in consideration of that fact, creating a range of possible scores at a given confidence level (9). Glicko-2 then adds the further improvement of a volatility factor, which increases the accuracy of the confidence level further by taking into account erratic performances (10).
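The effect of ratings deviation can be sketched in a few lines of Python. This is a simplified single-game version of the original Glicko (Glicko-1) update, not Glicko-2: it leaves out the volatility factor and the step that widens a deviation during inactivity. The ratings of 1500 and the deviations of 50 (active player) and 300 (returning player) are assumed values for illustration.

```python
import math

Q = math.log(10) / 400.0  # scaling constant used by the Glicko formulas


def g(rd):
    """Discount factor: the less certain the opponent's rating, the less a result counts."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def glicko_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update. `score` is 1 (win), 0.5 (tie) or 0 (loss).

    Returns (new_rating, new_ratings_deviation).
    """
    g_opp = g(opp_rd)
    expected = 1.0 / (1.0 + 10 ** (-g_opp * (r - opp_r) / 400.0))
    inv_d2 = Q ** 2 * g_opp ** 2 * expected * (1.0 - expected)  # this is 1 / d^2
    denom = 1.0 / rd ** 2 + inv_d2
    new_r = r + (Q / denom) * g_opp * (score - expected)
    new_rd = math.sqrt(1.0 / denom)
    return new_r, new_rd


if __name__ == "__main__":
    # Hypothetical match: equal ratings, very different reliability.
    returning = glicko_update(1500, 300, 1500, 50, 1.0)  # returning player wins
    active = glicko_update(1500, 50, 1500, 300, 0.0)     # active player loses
    print("returning player gain:", round(returning[0] - 1500, 1))
    print("active player loss:   ", round(1500 - active[0], 1))
```

With these assumed inputs the returning player's rating moves much further than the active player's, and both deviations shrink after the game, which matches the asymmetry described above.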

This rating system would then capture all of Pro’s advantages, as it’s the same type and kind of system, but is more advanced and fixes problems in the Elo system.

Conclusion:

Vote Con because the goals of debate.org to maximize popularity and be a platform for speech are better served by the status quo than a system focused on competitive ability, because rating systems like Elo cannot fairly apply to debates here and we’d be better off with no ratings, or because the Glicko-2 system is a newer, improved system that fixes weaknesses in the Elo system and has all the advantages Pro listed for Elo.

In my post I had asked Con to take a position so that we may debate accordingly. In reality, XimenBao has taken three different positions! I will suggest that the three positions are unlikely to simultaneously reflect his true opinion. He is not sure of any of his arguments and is just hedging his bets, hoping I will fail to defend the Elo system against at least one of the propositions. The drawback of this approach is that he has not been able to dedicate himself to any of the arguments. This has also changed this into a lengthy debate (so much so that my opponent had to post references in the comments section). The debate would have been more interesting and intense had he chosen his most preferred option from the three and argued it thoroughly. Still, I will do my best in this 3 in 1 debate.

==Rebuttals==

Status Quo: 'To maintain the status quo is to keep the things the way they presently are' [1].

1. The goal of debate.org: Con has referred to the purpose of debate.org to argue that 'there is an absence of any language suggesting that ranking or competition is a goal of the site.' However, as the leaderboard already exists and users can check their ranks or percentiles (even if on a flawed basis), this argument is irrelevant. I am asking for an improvement of an existing feature, not the addition of a completely new feature.

2. Kinds of debate: Troll debates and inconsistent voting are areas of concern in this community[2][3]. All other kinds of debate do have a win-loss outcome, which ultimately contributes to rank, and are taken quite seriously. Joke debates actually need lots of skill. Since these results already contribute to the ranking, they cannot be used to justify the status quo.

3. Motivation for long term membership: My opponent believes that 'high ranking requires long-term membership and frequent visits to the site'.

a) High rank is not based on the number of debates but on the number of wins. So obviously that is not the whole purpose.

b) My opponent concedes that it is not an accurate measure of skill. Since it is not an accurate measure of skill, it fails as a tool of motivation for the best debaters.

c) I have explained why an Elo based system is better for motivation and how the current system can be abused. My opponent has conceded all my arguments by not countering them.

To add another small point, the current format encourages debaters to accept troll challenges. This is because a win against a top 10 debater is equal to a win against a troll. This encourages trolls. Is that the goal of debate.org?

No system: Here I will be arguing that any system is better than no system.

1. Not sport: My opponent argues that debates are not sports like chess. A debate on debate.org has the following features.

a) Each debate ends in a win, a loss, or a tie.

b) In general, a better debater is likely to win against a worse debater.

If you agree with the two statements above, consistent rankings can be prepared, just like chess. If you disagree with them, then you should start a separate debate arguing that debate.org should do away with the win/lose concept. If that happens, my guess is only trolls would remain here.

2. Vote bombs, No votes, Biased votes: I admit these are problems. However, these are problems with the debate.org voting system. The debate.org community is concerned about problems like these and is trying to combat them. You can see several discussions on this issue in the debate.org forum. If you think these issues render debate.org results totally meaningless, you should argue to abolish the win/loss concept as such.

All the problems that Con has highlighted are with the concept of winning and losing in debate.org, not with a ranking system. So I think we will gain nothing by debating it here. You are welcome to challenge me to a separate debate about that.

I agree with Con that debates provide lots of opportunity to learn new things and interact with people apart from win/loss. However, I do not see why having a good statistical ranking system will end that.

Glicko-2 system: As per Prof. Mark E. Glickman, author of the Glicko and Glicko-2 systems (from the link provided by you):

"I created the Glicko rating system in response to a particular deficiency in the Elo system which I describe below..."

and further,

"The Elo system, coincidentally, turns out to be a special case of my system."

The Glicko system is just a variant of the Elo system. The original Elo system is a special case of the Glicko system. In my original resolution I had not said 'Elo system' but rather 'Elo based rating system'.

Interestingly, I provided a reference to topcoder (an online forum for programmers who compete with each other and also collaborate to solve industry problems). They don't really use a pure Elo system. They have an in-house variant[4] which is far more sophisticated than either pure Elo or Glicko. They don't really have two-player matches. The programmers submit tasks, which are then awarded points based on correctness and the time taken to submit. These points are then compared to the top and average scores to calculate ratings (they also have a volatility feature).

Debate.org can also modify the Elo system to consider the number of votes polled. For example, if the result of this debate is 1200 to 1000 votes, both should gain points as both performed better than expected. However, going for this approach may be more expensive. The Elo system is available for free and can be implemented easily.

I thank Baggins for finishing the debate. I’ll shortly move to countering his rebuttals, but first I want to address his opening complaints.

First, he never ties these complaints to a reason to vote for him, so a vote should not be given on this basis.

Second, as Con there is nothing wrong with advancing multiple arguments. It’s part of the Pro/Con balance of advantages. Pro gets to pick the topic and take as much time to prepare as he wishes prior to posting his initial argument. Con has the burden to negate, to show that there is at least one argument that defeats Pro’s. To limit Con to affirming a single proposition is to take away fair ground for Con in favor of Pro.

Third, the instigator has control over the technical format aspects of debate. If he didn’t want an 8000 character debate, he could have picked a shorter format.

As I move to Pro’s rebuttals, note that he has not strengthened or expanded on his own arguments, so if his rebuttals fail to counter any of my arguments, mine are the only arguments left in play and this would necessitate a Con vote.

Status Quo:

I will address Pro’s rebuttals by his numbering system in this section.

1. Pro responds to my description of the goals of debate.org as a target for what a rating system is trying to accomplish by pointing out that he is asking for a change in an existing feature and not the addition of a new feature.

This does not address the point, meaning that debate.org’s goals of creating a popular site that serves as a platform for different views remain unchallenged as the metric for judging different ranking systems.

2. Pro acknowledges that troll debates and inconsistent voting are areas of concern, but gives no reason to disregard them. He also argues that joke debates require skill and are taken seriously, but neither of these arguments refutes the point that skill at telling jokes is not the same as skill at debating, and thus is inappropriate to include in an Elo ranking system which is supposed to be measuring the outcome of the same kinds of games. This also shows that the purpose of debate.org is not to measure competitors’ debate skills, but to encourage people to take part in as many ‘debates’ as possible regardless of kind.

3. Pro challenges my contention that high rankings on debate.org require many visits to the site as part of debate.org’s goal to increase its popularity through encouraging long term membership.

However, in 3a, Pro acknowledges that the number of wins is the basis of ranking. You can’t win if you don’t play, so this doesn’t counter my argument. In fact it strengthens it, because one of Pro’s advantages to an Elo system is that fewer debates are required for a high rank, making the status quo more aligned with debate.org’s goals.

In 3b and 3c, Pro alludes to measures that do not accurately measure skill not being adequate motivation for the best debaters. However, Pro has not made an argument about motivation, nor shown how motivation for the best debaters fits into the goals of debate.org or why you should vote on it.

Pro also claims I’ve conceded all his arguments by not countering them. This leads me to believe Pro has not understood my arguments. If his ranking system is worse than the status quo by the unchallenged metric of debate.org’s goals, his arguments are countered because his advantages are irrelevant in context. If his system cannot measure debates because they are too different compared to chess and sports, then none of his advantages apply because his system doesn’t activate and his arguments are countered. If the Glicko-2 system captures all of his advantages and provides further benefits, his arguments are countered.

To close the status quo section, Pro never countered debate.org’s goals as a valid metric, or advanced a coherent argument why an Elo system fits those goals better than the status quo. Thus extend my R2 arguments on this as a reason for a Con vote.

No system:

Here, all of Pro’s arguments come down to a claim that if we have no ranking system we should also do away with the concept of win/loss and leave the site for the trolls. However, he doesn’t make an argument why that’s the case or why trolls would take over, just assertions. I’m not saying you can’t win a joke contest or a rap battle, but since debate.org produces results from them identical to actual debates, using a ranking system like Elo to measure debate skill that requires contests of the same nature doesn’t work. It’s not win/lose that’s the problem, it’s the ranking.

Pro also seems to have missed that my argument was “no system is better than a bad system.” Pro pointed out that debates end in win/lose/tie and better debaters are more likely to win debates, but he didn’t address my arguments why that isn’t enough for an Elo system to work.

Elo requires the games it measures to have an objective result and be the same kinds of games measured, and the bulk of the R2 No System argument was spent showing this. Pro does not dispute or challenge this contention and thus my argument stands.

Elo does not apply here, Pro gets none of his advantages, and the rankings would be inaccurate and misleading, hence worse than no system.

Glicko-2:

First, I don’t see how the topcoder references impact the arguments in play and will not address them.

Second, Pro misreads the link he uses to claim Glicko-2 is a variant of Elo and to claim that he should get to count Glicko-2 as part of his argument.

The Glicko-2 system did start as a response to weaknesses in Elo systems, but that doesn’t make it a variant. Pro is relying on a quote from the Glicko-2 creator that says that, through a coincidental and so undesigned route, if his system is used in a certain way, it can create a special instance where parameters and input result in a duplication of the Elo system. The Glicko-2 system is different. The Glicko-2 system is not a variant. But the Glicko-2 system is flexible enough that given the right conditions it will look and act like the Elo system even though most of the time it won't.

This was Pro’s only rebuttal to the Glicko-2 system. He did not dispute that it is better than an Elo system, so the Glicko-2 system captures all his advantages, adds value, and wins a Con vote.

Conclusion:

I thank Pro for providing an interesting debate and making me learn something more about ranking systems.

As his rebuttals did not counter my R2 arguments, the voting choices remain the same:

Vote Con because the goals of debate.org to maximize popularity and be a platform for speech are better served by the status quo than a system focused on competitive ability, because rating systems like Elo cannot fairly apply to debates here and we’d be better off with no ratings, or because the Glicko-2 system is a newer, improved system that fixes weaknesses in the Elo system and has all the advantages Pro listed for Elo.

I don't agree that it is the job of the opponent to claim rule violations. Debaters may not know what the rules are, and it's up to those judging the debates to decide on the fine points. Leaving it to the debaters encourages having a separate debate on what the rules are. In tennis it isn't up to the other player to claim if a ball is out, for much the same reason that we shouldn't require that in debates.

My interpretation is that 8000 characters mean a total of 8000 characters. I think it helps readers a lot to have the debates self-contained. Maybe it would be a worthwhile improvement in DDO to exclude the characters in links from the total count.

It's not a big deal. When both debaters put a lot of effort into a debate, I try to put more effort into the judging, so as to improve the craft.

It is a little bit of a hassle for people who want to check your references, especially after the debate has been heavily commented on. It's also something PRO could have cried abuse on, claiming it gives you an unfair comparative advantage. He didn't cry abuse though, and personally I hold debaters responsible for arguing about what's abuse and what's not. As it is he only briefly brought it up as evidence that you'd made the debate long - he didn't even complain about it. If PRO had successfully convinced me that it was abusive, I probably would have given him conduct.

I think you'll find that a lot of people have much different voting standards, and they'll determine abusiveness and conduct themselves rather than expecting it to be discussed in the debate. If that's how I rolled, I'd probably have given conduct to PRO, if not for the abusiveness of gaining extra characters then for making late voters have to sift through a bunch of comments if they want to check your references.

I've been away for a while, but when I was here a year ago, no one complained about sources in comments and it was pretty common. I figured that was still the case. If standards have shifted, I apologize for the indiscretion.

A very good debate. Both sides debated well. I wish we had more debates of this quality on the site.

Putting references in comments defeats the character limits, unless the debaters have agreed to special rules. I think that it is a conduct violation.

In a "should" resolution, saying that the current purpose of the site would not be served is an irrelevant argument. Pro pointed that out. It largely defeats the first two options Con proposed.

So the question is whether Glicko-2 is better, or sufficiently different, than Elo. That's tough. I'm going to say the difference is mainly a technicality from the viewpoint of the spirit of the debate. A close call, I'll admit. The advantages of Glicko-2 were not made clear enough in the debate to be persuasive that it was a significant difference and an improvement.

Let's say the purpose of golf as announced by some association is to provide a sport for everyone to play. So does that preclude having a rating system that determines the best golfers? Similarly, having competition in a debate site doesn't preclude the objective of letting everyone debate. Pro pointed out that the old system can be kept with the new. Anyone who doesn't like ratings already has the option of not looking at them.

Could a "joke debater" unfairly top the leader board? Joke debates are very difficult to win consistently, because what is being judged is whether or not the joke can be sustained in the face of real arguments. Anyone who can do that deserves to be on top, but I don't think it would even happen. Real debaters would pick off the jokester whenever he started to become highly rated.

I voted args to CON, and didn't vote on the other issues because I don't feel like judging them.

I had no major problems with PRO's case when it was originally presented. However, I don't think he was prepared for CON to use a multi-pronged attack against him. The burden of proof is such that even if I agree with PRO on two prongs, I still have to vote for CON to win based on the prong he won; if any of CON's prongs stands, then the resolution is negated.

During R2, when CON presented his counter-plans, I thought his defense of the status quo was horrible. It seemed to quite transparently be a partially circular argument; the current ranking policies of the site were actually cited as one of the ways we can know the site's intent, and then the site's intent was used to justify sticking with the current policies.

CON also argued that the purpose of the site was not to measure oneself against other debaters, and while it might not be the entire purpose of the site, it's clear that it's something the site wanted to allow people to do - otherwise they'd never make a leaderboard. I think PRO adequately addressed this point, and I gave this prong to PRO.

===

CON's second prong was no system. I thought a lot of his points here were strong. He showed in this section that an Elo system would be unuseful, as it does not differentiate between debates with different rulesets. A poor debater who wins a bunch of joke contests against top 10's would be ranked highly. While CON may not have shown here why having no system is ideal, I think he showed that an Elo system is unideal, and I thought PRO's responses were inadequate. Gave this prong, and thus the debate, to CON.

===

The Glicko-2 point was a little murky. After hearing the args, I'm unsure whether I'd consider Glicko-2 to be "Elo-based," and since CON already won on prong 2, I didn't find it necessary to sift through sources to determine it myself.

Reasons for voting decision: Putting references in comments defeats the character limits, unless the debaters have agreed to special rules. I think that is a conduct violation. In a "should" resolution, saying that the current purpose of the site would not be served is an irrelevant argument. So the question is whether Glicko-2 is better, or sufficiently different, than Elo. That's tough. I'm going to say the difference is mainly a technicality from the viewpoint of the spirit of the debate. A close call, I'll admit.

Reasons for voting decision: RFD in the comments; short version is that I gave args to CON because I thought his points were strong on "no ranking system" and I don't think PRO rebutted that point well.

Reasons for voting decision: Very strong first rebuttal from Con, and excellent response from Pro, though the site purpose was not as well dealt with as the Elo-variant. 2:1 for Con simply as Pro had the BoP and the site purpose went to Con.
