Wikipedia Will Fail Within 5 Years

Over the weekend I had dinner with Mike Godwin, one of the most influential figures in the development of cyberlaw and a longtime friend. Mike and I were discussing Wikipedia, the community-edited and community-maintained encyclopedia. I like Wikipedia a lot and use it pretty frequently. However, as recent events have indicated, Wikipedia is far from perfect.

In particular, I remarked to Mike that Wikipedia inevitably will be overtaken by the gamers and the marketers to the point where it will lose all credibility. There are so many examples of community-driven communication tools that ultimately were taken over (USENET and the Open Directory Project are two that come immediately to mind) that I didn't imagine my statement would be controversial or debatable. Instead, I was surprised when Mike disagreed with my assertion. Mike's view is that Wikipedia has shown remarkable resilience to attacks to date, and that this resilience is evidence the system is more stable than I think it is.

Here's my thinking. As Wikipedia grows in traffic, outlinks from Wikipedia become more valuable, both for the direct referrals and, perhaps more importantly, for the PageRank that flows from the links. Therefore, marketers will inevitably try to stuff links into Wikipedia. Because there are no barriers to editing Wikipedia, this is trivially easy for marketers to do. Eventually, marketers will build scripts that edit Wikipedia pages to insert links, conducting automated attacks on Wikipedia.

So long as the marketers' scripted/repeated activity is trivial in quantity, the self-policing community of Wikipedia will patiently delete those attacks, just like we delete spam from our in-boxes today. But over time, as the attacks become more determined and more automated, the Wikipedia community will become less enthusiastic about undoing the marketers' changes. At this point, one of two things will happen:

1) Wikipedia will have to change its open-access nature: either lock down lots of pages from being edited at all, or install some reputation-management system that limits who has the right to post or edit content.

2) Alternatively, Wikipedia community members will progressively do less spam clean-up. This will lead to a gradual but ultimately irreversible downward spiral: as more pages are taken over by marketers, the database's credibility decreases, and as credibility decreases, community members feel less incentive to clean up the pages.

Mike and I made a (wagerless) bet that on December 2, 2010, we will see where Wikipedia stands and decide the winner. If you have your own prediction on the fate of Wikipedia, please leave a comment.

UPDATE: So just about the same time I posted this initially, Jimmy Wales of Wikipedia announced that registration will be required before a person can create a new entry. Needless to say, I'm hardly surprised by this move, but it's far too little to solve the real problem, and I'm confident that soon it will be followed by more limits on posting and editing. Marketers can still game the site by (1) creating fake accounts, and (2) editing existing postings (which is what I'd do if I were gaming the system); and both of these steps can be automated. Wikipedia is a fantastic idea with a finite life built into its architecture: it can be open access or spam-free, but not both. We've already seen the first significant step towards restricting access.

UPDATE 3: Wikipedia has offered a “semi-protection” option for “vandalized” pages, which restricts the ability of some people to edit pages. This is yet another step towards shutting down open access, but it’s a very small response to a very small subset of the problems Wikipedia faces. Look for continued expansions of limits on open access as the full scope of the problem becomes clear.


There is middle ground. Wikipedia could institute technological measures to keep spamming to a minimum: monitoring the IP addresses of edits to prevent too many edits to too many articles within a given timeframe; requiring a registered and confirmed account before being allowed to edit; a captcha before changes go live (ugh). These types of measures (and new ones we haven’t thought of yet) could be implemented to keep the spamming to a manageable level so that the users of the site can continue to weed them out. Just a thought…
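The IP-throttling idea in this comment can be sketched as a sliding-window rate limiter. This is only an illustration of the proposal, not anything Wikipedia actually runs; the limit of 5 edits per 60 seconds and all names here are invented for the example.

```python
import time
from collections import defaultdict, deque

EDIT_LIMIT = 5         # illustrative: max edits per IP per window
WINDOW_SECONDS = 60.0  # illustrative window length

# Maps each IP address to a queue of its recent edit timestamps.
_recent_edits = defaultdict(deque)

def allow_edit(ip, now=None):
    """Return True if this IP is still under its edit quota, recording the edit."""
    now = time.time() if now is None else now
    edits = _recent_edits[ip]
    # Discard timestamps that have aged out of the window.
    while edits and now - edits[0] > WINDOW_SECONDS:
        edits.popleft()
    if len(edits) >= EDIT_LIMIT:
        return False  # over quota: reject (or escalate to a captcha)
    edits.append(now)
    return True
```

A real deployment would also have to cope with the objections Eric raises below: forged addresses and attacks distributed across many IPs defeat any purely per-IP scheme.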

http://strangnet.se/blog Patrick

There's no real gain in inserting links into Wikipedia articles for PageRank reasons, because Wikipedia has implemented rel="nofollow" on all external links.
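The nofollow mechanism works on the crawler side: a search engine building its link graph simply skips anchors carrying that rel value, so the link passes no PageRank. A minimal sketch of that filtering, using Python's standard-library HTML parser (the class name and sample URLs are made up for the example):

```python
from html.parser import HTMLParser

class FollowableLinks(HTMLParser):
    """Collect hrefs from <a> tags that do NOT carry rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel is a space-separated token list, e.g. 'nofollow noopener'.
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" not in rel_tokens and "href" in attrs:
            self.links.append(attrs["href"])

parser = FollowableLinks()
parser.feed('<a rel="nofollow" href="http://spam.example/">spam</a>'
            '<a href="http://good.example/">good</a>')
# parser.links now contains only the followable URL.
```

Note that this only removes the PageRank incentive; the links still appear on the page, so direct referral traffic remains a motive for spammers.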

http://www.ericgoldman.org Eric Goldman

Dave!, IP addresses can be easily forged or attacks can be distributed. New accounts can be automated (indeed, I’m assuming Wikipedia will have to introduce some robot checker into the new account creation process). So I think the attacks will become more automated, more determined and harder to manage.

Patrick, do you know if the nofollow attribute is used by the various sites that republish Wikipedia entries (e.g., Answers.com)? If not, there could still be a PageRank benefit to inserting links.

http://strangnet.se/blog Patrick

You're correct regarding Answers.com, and it probably applies to all the other sites that republish Wikipedia content. I guess they'll have to take responsibility, too.

Tariq

Anonymous editing really is a big problem for Wikipedia. There are already numerous “Wiki Wars” – mostly on ideological/political issues, but also authors changing articles about themselves to remove criticism, for example. These abuses/debates will likely continue to get worse.

What Wikipedia needs is a reputation system, like eBay's seller ratings. People (registered users) can rate your edits. If you continuously make partisan or biased edits, you will presumably get a low score, and maybe be prevented from making further edits.

This too is open to gaming – anything on a free, open system is – but it provides one additional check in the system.
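Tariq's eBay-style proposal can be sketched in a few lines: peers vote on an editor's edits, and editors whose cumulative score falls below a threshold lose the right to edit. Everything here (the class, the threshold of -3) is a toy illustration of the idea, not an existing Wikipedia feature.

```python
class EditorReputation:
    """Toy per-editor reputation: peers rate edits +1/-1; low scorers lose editing rights."""

    def __init__(self, threshold=-3):
        self.threshold = threshold  # illustrative cutoff
        self.scores = {}

    def rate(self, editor, vote):
        """Record a peer rating (+1 helpful, -1 partisan/biased)."""
        self.scores[editor] = self.scores.get(editor, 0) + vote

    def may_edit(self, editor):
        # Unknown editors start at 0 and are allowed; repeat offenders are cut off.
        return self.scores.get(editor, 0) > self.threshold
```

As both Tariq and Eric note, this is gameable too: a distributed attack that uses each account only once never accumulates a bad score, so reputation is at best one check among several.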

http://www.ericgoldman.org Eric Goldman

Tariq, thanks for the comments. I think the current Wiki Wars are a small sample of Wikipedia’s future. It’s one thing when the wars are started by individuals with a political axe to grind or out of personal malice; these interventions can be easily corrected. But when the changes are widespread robotic attacks by marketers, the changes will be so pervasive and repetitive that fighting them will be much, much harder.

I think your proposed reputational system would be a partial solution. Notice that if a marketer attack is distributed across many IP addresses or accounts, a reputational system measuring repeated interactions won’t catch the one-off change. So inevitably Wikipedia will have to close down access to the one-off editor altogether or have some other system to prevent the first-time editor from being able to affect the publicly-displayed pages automatically. For example, at Epinions, ultimately we hid new posts from some forms of public view until they had been graded positively by another user.
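The Epinions-style mechanism Eric describes, where a first-time contributor's submission is held out of public view until another user grades it positively, can be sketched as a simple moderation queue. This is a hypothetical illustration of that design, not Epinions' or Wikipedia's actual code:

```python
class ModerationQueue:
    """Hold edits from untrusted contributors until a reviewer approves them."""

    def __init__(self):
        self.trusted = set()  # editors whose edits go live immediately
        self.pending = []     # (editor, edit) pairs awaiting review
        self.live = []        # edits visible to the public

    def submit(self, editor, edit):
        if editor in self.trusted:
            self.live.append(edit)
        else:
            self.pending.append((editor, edit))

    def approve(self, index):
        """A reviewer grades a pending edit positively: publish it and trust its author."""
        editor, edit = self.pending.pop(index)
        self.live.append(edit)
        self.trusted.add(editor)  # illustrative policy: one approved edit earns trust
```

The design choice here is the key point of Eric's reply: the one-off, first-time editor is exactly the account a distributed attack would use, so the gate has to sit in front of first-time contributions rather than rely on reputation accumulated over repeated interactions.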

Eric.

Kevin Sours

You seem to be focusing a lot on automated attacks, but my impression is that these should be relatively preventable on a technological level. It may be a pain to require a “bot check” on every save to the db, but I think it is doable. And I don't think it will overly burden the people doing the editing (and to the extent that it does, I think it may improve matters). I believe that the current bot detection systems are well ahead of the ability of scripts to defeat them. I think that Wikipedia is going to face some significant challenges, but I'm not sure that automated edits are the most significant.

Hick Ninja

Couldn’t they just implement the “type in the text in this picture” checks for each edit submission? Then it would be nearly impossible to automate edits.