Who Writes Wikipedia? — Responses

First, on a personal note, let me simply say thanks. I probably put more work into yesterday’s post than anything else I’ve ever written. In addition to the research I describe, I’ve spent my free time the past few weeks going over the text of the article again and again, agonizing about the proper phrasing, getting everything just right. It was definitely worth it. My sincere thanks to everyone who made it possible.

Further research is needed

Getting down to business, many are interested in pursuing this line of quantitative research. The work I did was intended for an article, not a formal paper, and while I’m fairly confident the basic principles are correct there’s plenty more work to be done.

I was heartened to discover research by Seth Anthony which, independently and more formally, came to largely the same conclusions. As he explained on Reddit: “Only about 10% of all edits on Wikipedia actually add substantive content. Roughly a third of those edits are made by someone without an account, half by someone without a userpage (a minimal threshold for considering whether someone is part of the “community”). The average content-adder has less than 200 edits: much less, in many cases.”

One of the more interesting things Anthony did was look at the work of admins in detail. In his sample, he noticed that none of the genuinely substantive edits were done by official site admins. He found that when admins originally joined the site, they contributed a lot less frequently and consistently but created a lot more substantive content. After they became admins, however, they turned into what Anthony calls “janitors”.

One of the wonderful things about Wikipedia is that literally all of the data, every single edit and practically every discussion made on or about the site, is easily available. So there’s an enormous amount more to learn about how it gets written. (In addition to nailing down what we know so far a little better.) If you’re interested in contributing to further research on this and related topics, send me an email and I’ll try to coordinate something.

Who gets to vote?

Another response was to think about the implications for who gets to vote in Wikipedia elections. ‘I tried to vote,’ commented Eric, ‘but since I am one of your “occasional contributors” (I’ve edited only one article to make content changes), I am not eligible. It appears that the opinions of “occasional contributors” will not be heard.’ Others, including William Loughborough and Jason Clark, expressed similar sentiments. ‘HURRAH, I am DISENFRANCHISED’, complained Bill Coderre.

Alienating the world

But by far the most common response was people sharing their experience trying to contribute to Wikipedia, only to see their contributions be quickly reverted or rewritten.

‘You can definately tell the “regulars” on Wikipedia’, joshd noted. ‘They’re the ones who … delete your newly reate[d] article without hesitation, or revert your changes and accuse you of vandalis[m] without even checking the changes you made.’ ‘Every modification I made was deleted without any comment’, complained CafeCafe. ‘I know there are a lot of people like me willing to help, but unless there is a real discussion behind, I won’t waste my time to help anymore which is a sad thing.’

bowerbird complained that ‘my contributions … have been warped by people who merely want to “make it sound like an encyclopedia” without having any knowledge of the topic’ while Ian ‘got fed up of the self-appointed officious jobsworths who [rewrite your] things [to] fit “their vision” … . My time is too valuable to argue with these people…’

Bill Coderre told of how he wrote entire articles from scratch, only to see them ruined ‘by some super-editors, who removed content, and turned what I thought was gosh-darn good writing into crap. … These people, by and large, “edited” thousands of articles. In most cases, these edits were to remove material that they found unsuitable. Indeed, some of the people-history pages contained little “awards” that people gave each other — for removing content from Wikipedia.’

And it seems like half of all the people I meet have a story about being listed for deletion and the nasty insults that ensued. Seriously, there have been numerous times I’ve said something about Wikipedia to a relatively well-known person and they responded back with a story about how someone insulted and deleted them. ‘[T]here are culture vultures overlooking Wkidpedia waiting to kill anything that doesn’t fit the norm’, wrote Mediangler.

Why does this matter? Why should we listen to the angry complaints of random people on the Internet? If occasional contributors are the lifeblood of Wikipedia, as the evidence suggests, then alienating such people just can’t be healthy for the project. As Ian wrote, ‘if we are to invest our valuable time contributing some expert knowledge on some subject, we want to know that our work will remain there for others, and not just keep getting reverted out in seconds by some control freak that knows nothing about the actual subject. … your article proves the exist[e]nce of this “inner gang” that I feel are actually holding Wikipedia back. To allow Wikipedia to grow and really pick the brains of the experts around the world, you need to do something to break up this inner gang and the mini empires they are building for themselves.’

Perhaps we can improve things with new rules (not only should you not bite the newcomers, you shouldn’t even bark at them) and new software (making it easier to discuss changes and defend contributions), but most importantly, it’s going to require a cultural shift. Larry Sanger famously suggested that Wikipedia must jettison its anti-elitism so that experts could feel more comfortable contributing. I think the real solution is the opposite: Wikipedians must jettison their elitism and welcome the newbie masses as genuine contributors to the project, as people to respect, not filter out.

Comments

Why does this matter? Why should we listen to the
angry complaints of random people on the Internet?

That is actually a lot more important than just a rhetorical question. You have linked to a lot of claims, but before taking any of them at face value, you should investigate them. And I am not saying this to insult the people that commented, but because of my own experience administering/moderating a medium-sized community (small by wikipedia standards).
As you would know from politics, there are radically different ways of looking at and describing certain actions and deletions. And it is especially the casual users who (obviously) lack detailed knowledge of rules and conventions (again, this is not to put anybody down; it’s quite normal for people who do not invest time in a project/community). I mean, everybody agrees that those people shouldn’t feel that way; disagreements are on how to achieve that end (in balance with the other goals).

One more thing: You seem to have learned your lesson from following politics. ;) You bring up a common issue that is phrased generally enough that many people can identify with it (“Yeah, I wrote something once, and somebody else edited it out”). Yet, even if you had actually proposed measures against the problem (which you didn’t), those measures could only affect a minority of edits, because by and large (I am guessing here) the big majority of edits seem actually necessary to most people familiar with Wikipedia, probably including you. Would you actually improve the situation such that all those people you quote from the comments would have a better experience next time? Hard to say, because neither you nor I actually know which rules those people ran up against.

Being newbie friendly (which I feel is where this is going) all the time is a super-hard job, and finding people that want to (or even are able to) spend a lot of their time being newbie-friendly is exceptionally hard. The problem here is not process or rules, it’s the sheer number of newbies and their (on average) stupidity. Are you familiar with the Usenet expression “the eternal september”…?

I am sure I don’t have to link to Shirky’s piece on the worst enemy of a group. I’ll just quote No. 2 of his conclusions - though I’ll leave it open on how to interpret and apply that to Wikipedia:

2.) The second thing you have to accept: Members
are different than users. A pattern will arise
in which there is some group of users that cares
more than average about the integrity and
success of the group as a whole. And that
becomes your core group, Art Kleiner’s phrase
for “the group within the group that matters
most.”

I would be interested in how you bring this together with how you would want to see Wikipedia organized in terms of rules, process or people (if you agree with Shirky on that, that is).

I think Sencer’s comment brings a bit of balance to the discussion, but as somebody who has had material immediately reversed out on wikipedia - it was around a piece of corporate information - I would like to assure you it happened and it was shocking how quickly it happened. My concern with that is it felt like censorship. What can be done differently - I think there has to be scope to allow competing or divergent views on a subject or issue such that the keeper of the main post cannot be the keeper of a divergent post.

Still, the Gang of 500 or 1000 might remain strong or good enough even if you get a position. What substantial analysis is there of those people? Who are they, and why did they come to behave and act in such ways? Is that how ‘experts’ grow and come to hold certain mindsets? (Lewontin, I think, talks about this problem [not solutions or answers, though].) Or what did they go through? How does it look from their positions and their side of the story? Is no effective communication possible with those who are labeled as ‘gangs’?

And if you get elected, how would that really, substantially change things? Do you need to persuade the 500 gang members, or do you not have to do that: you just keep the low entry setting open, make software design tweaks, and then that gang of 500 would just change (or melt away)?

Before saying things like ‘Alienating the world’, can we know how Jimbo Wales and his 400 or 500 close Wiki tops are talking and discussing? Are the contents of those discussions accessible to everyone or not? (No monthly letter from Wiki HQ? Do we only have Jimbo’s rubber stamp talks at several hundred locations?)

Aren’t these related to the issue of handling of POVs - neutrality, impartiality and so on?

A finer point, but maybe also a crucial one: the old Ward Cunningham Wiki had a policy that discouraged deletion; even advanced users, or ‘gangs’, weren’t encouraged to delete. It said deletion was against the Wiki’s spirit. Rather, if you are advanced,

1: you need to be able to handle newbies, and get them to learn how to do difficult stuff. (say POVs handling, moderation over disagreement…)

2: don’t delete - keep it recorded somewhere and work from there, with ‘newbies’ or someone with a (radically?) different perspective, information, opinion and so on.

And they talked about this as an important thing, so that Wiki wouldn’t become something like Usenet, which sank into a pool of roaming regular flamers. (How did that happen among people? Usenet had some ideals and spirit too…)

I wonder how not to repeat that, and what could help us bring change in all this.

I don’t know what kind of mining efforts there have been on the wiki corpus so far, but I can offer a couple of suggestions that spring to mind.

I’m often driven to distraction by wikipedia’s clumsy prose. A random example: “Since 1992, the current leader and Secretary-General of Hezbollah is Sheikh Sayyed Hassan Nasrallah.” It would be interesting to focus on those edits that don’t alter the substance of a sentence. E.g., I ran that one through Lingua::LinkParser, and it failed. But then I fixed it up a bit, and it parsed just fine: “Sheikh Sayyed Hassan Nasrallah has been the leader and Secretary-General of Hezbollah since 1992.” In a perfect world, you’d have a routine to preemptively suggest better grammar. But in our imperfect world, authors should get fine-grained feedback about these problems.

On a related note, I notice an awful lot of passive voice on wikipedia. “It is claimed that [Anderson] Cooper ‘doesn’t drink hot beverages’” or “His height is reported to be no more than 5’ 10″”. A lot of that may simply indicate unsourced claims, but I suspect overall there’s more to it: that it’s a form of defensive prose, perhaps to ward off summary deletions. It would be fun to compare a sample of authors’ wiki posts to their blog posts, if that information were available, to spot any interesting grammatical differences.
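A comparison like that needs some way to count passive constructions. What follows is only a rough sketch, assuming a crude regular-expression heuristic (a form of "to be" followed by something that looks like a past participle) rather than a real parser. The names `passive_phrases` and `passive_rate` and the short list of irregular participles are made up for illustration, and the pattern will misfire on phrases such as "is red".

```python
import re

# Forms of "to be" that can introduce a passive construction.
BE_FORMS = r"(?:is|are|was|were|be|been|being)"
# Crude participle guess: anything ending in "-ed" plus a few irregular forms.
# This is an illustrative assumption, not a complete list.
PARTICIPLE = r"(?:\w+ed|said|known|thought|held|shown|given|written)"
# Allow one optional "-ly" adverb in between, as in "is widely reported".
PASSIVE_RE = re.compile(
    rf"\b{BE_FORMS}\s+(?:\w+ly\s+)?{PARTICIPLE}\b", re.IGNORECASE
)


def passive_phrases(text):
    """Return the substrings of text that look like passive constructions."""
    return [m.group(0) for m in PASSIVE_RE.finditer(text)]


def passive_rate(sentences):
    """Fraction of sentences containing at least one passive-looking phrase."""
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if PASSIVE_RE.search(s))
    return hits / len(sentences)
```

Running `passive_rate` over a sample of one author's wiki sentences and over a sample of the same author's blog sentences would give two numbers to compare, with the caveat that the heuristic is noisy in both directions.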

I also wonder if there’s been any fine-grained analysis of the more volatile pages, the ones whose content changes the most. Can this lead to any predictive routines about a text’s level of controversy? Perhaps even more useful information than a section-level notice of a dispute. I noodled around with a routine to stratify argument threads within blog comment sections, but didn’t get this far. I assume others have worked on this problem, though.
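One possible starting point for measuring volatility, sketched under loud assumptions: the input is a hypothetical list of (page title, revision text) pairs in chronological order, and a "revert" is approximated as any revision whose text exactly matches an earlier revision of the same page. The function names are invented for this example, and real revision data would need more careful handling.

```python
from collections import defaultdict
import hashlib


def volatility_by_page(revisions):
    """Count revisions and apparent reverts per page.

    revisions: iterable of (page_title, revision_text), oldest first.
    Returns {page_title: (revision_count, revert_count)}.
    """
    seen = defaultdict(set)               # page -> hashes of earlier revision texts
    counts = defaultdict(lambda: [0, 0])  # page -> [revisions, reverts]
    for title, text in revisions:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        counts[title][0] += 1
        if digest in seen[title]:
            # Text identical to an earlier state: treat it as a revert.
            counts[title][1] += 1
        seen[title].add(digest)
    return {title: tuple(c) for title, c in counts.items()}


def rank_by_reverts(stats):
    """Page titles ordered from most to least reverted (ties: more revisions first)."""
    return sorted(stats, key=lambda t: (stats[t][1], stats[t][0]), reverse=True)
```

A high revert count relative to the revision count is only a crude proxy for controversy; it says nothing about which side of a dispute is right, and it misses disputes fought through partial rewrites.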

Bravo for this series of Wikipedia articles and the research behind them, Aaron. I for one was terribly disappointed to find that I was ineligible to vote for you. I wonder how many Wikipedia super-editors would have the knowledge required to knit together the genetics, pathology, biochemistry, and related articles that I have. It is also apparently too bad for me that I do many of my edits from the medical library computers and don’t bother to sign into Wikipedia before making the edits as well. (This is also from a library computer, so maybe you can find how many Wikipedia edits are from the block of IP addresses that this letter is sent from.)

Aaron, this isn’t a wikipedia election. It’s a Wikimedia Foundation election. There’s a difference: one is the name its authors give to the content it’s known for, the other is the web host for the content. With some people involved in both.

The vast majority of these complaints against wikipedia are complaints for which there are no possible solutions.

“My article got deleted, therefore wikipedia sucks”. “I added content to an article, which was promptly removed, therefore wikipedia sucks”. “I took some content out of an article which I thought was false, and it was put right back in, therefore wikipedia sucks”.

How can any of the above possibly go any differently? Articles are going to be deleted, as wikipedia cannot simply have articles on everything, from “my dog Bonkers” to “George Bush is a tard!”. Editors are going to disagree with one another and revert each other’s edits, that is the nature of wikis.

As for all the “I couldn’t vote, I don’t have enough edits, this is unfair”, how else is vote stacking going to be prevented? If people with few or no edits can vote, they can make 1000 accounts and stuff the ballot for their candidate.

These complaints are like someone new to the process of driving saying, “This road system sucks, I want to be able to drive on both sides of the road, and these stop signs and stop lights are wasting everyone’s time, and other drivers are driving badly and I hate that and something should be done, and also I don’t want to ever get a traffic ticket, and the roads should be 10 times wider, except my taxes shouldn’t be raised to pay for those wide roads”.

xyzzyplugh’s analogy to driving unintentionally illustrates one of the big flaws common in community developed knowledge bases—the tendency for “groupthink” to oppose ideas that run contrary to that which “everyone knows”. There is evidence to suggest that removing traffic lights, stop signs, lane markers, etc. actually improves safety and reduces accidents. See the story on one such experiment, in The Netherlands, here:

I hate wiki, and know at least a couple dozen professionals like myself who work in our respective fields who also groan or wince a little whenever wiki is merely mentioned.

And I think that the wiki people know what the problem is, I mean, come on guys, all you need to do is look at who it is that’s deleting and abusing new contributors. But nobody is ever actually held accountable, and wiki quickly takes on the feel to newcomers of a really crappy and elitist early 90’s BBS.

Now that wiki is removing so many useful links, many of which clearly show that the little wiki nazis have edited the article to be decidedly non-NPOV, wiki has become a really sad joke.

Good luck with removing links and telling specialists in their fields that they can’t make corrections, and with the little awards for what, for making entries not just mediocre, but inaccurate and in many cases absurd.

I am an expert in a particular field. I found the quality of wikipedia information highly variable. Some articles were good. Others had inaccurate and sometimes completely wrong information.

Still other articles are zealously guarded by a group who are able to effectively censor information through a variety of techniques. This discourages experts. As a result, I only contribute occasionally. When I do, it’s for unimportant stuff, like cartoon characters and TV shows.

To be honest, Wikipedia is a place where, if more users agree with you on anything, they become correct. It doesn’t matter if it’s a picture of a penis ejaculating on the front page of an ejaculation page, if dogs are used as meat in the US, or if Christian bands play Christian music. Wikipedia’s editors have no credentials. I’m saying this because I used to edit Wikipedia for years until I realized that what I did had little effect on anything.