I was reading an interview today with Jorge Cauz, the president of Encyclopedia Britannica, in which he describes some of the Web 2.0-y tools that the company is preparing to roll out to enable readers to contribute to the encyclopedia’s content. (I’m on Britannica’s board of editorial advisors.) The interview touches, as you’d expect, on the great success that Wikipedia has achieved on the Web and, in particular, on its ever increasing dominance of Google search results. Cauz calls the tie between Wikipedia and Google “the most symbiotic relationship happening out there” – and I think he’s right.

Cauz’s remark reminded me that it’s been some time since I updated my informal survey of Wikipedia’s ranking on Google. A couple of years ago, I plucked from my brain, in as random a fashion as I could manage, ten topics from a range of knowledge domains: World War II, Israel, George Washington, Genome, Agriculture, Herman Melville, Internet, Magna Carta, Evolution, Epilepsy. I then googled each one to see where Wikipedia’s article on the topic would rank.

I first did the searches on August 10, 2006. The results showed that Wikipedia did indeed hold a strong position for each of the ten subjects:

World War II: #1

Israel: #1

George Washington: #4

Genome: #9

Agriculture: #6

Herman Melville: #3

Internet: #5

Magna Carta: #2

Evolution: #3

Epilepsy: #6

I next did the searches on December 14, 2007, and found that Wikipedia’s dominance of Google searches had, over the course of just a year and a half, grown dramatically:

World War II: #1

Israel: #1

George Washington: #2

Genome: #1

Agriculture: #1

Herman Melville: #1

Internet: #1

Magna Carta: #1

Evolution: #1

Epilepsy: #3

Today, another year having passed, I did the searches again. And guess what:

World War II: #1

Israel: #1

George Washington: #1

Genome: #1

Agriculture: #1

Herman Melville: #1

Internet: #1

Magna Carta: #1

Evolution: #1

Epilepsy: #1

Yes, it’s a clean sweep for Wikipedia.

The first thing to be said is: Congratulations, Wikipedians. You rule. Seriously, it’s a remarkable achievement. Who would have thought that a rag-tag band of anonymous volunteers could achieve what amounts to hegemony over the results of the most popular search engine, at least when it comes to searches for common topics?

The next thing to be said is: what we seem to have here is evidence of a fundamental failure of the Web as an information-delivery service. Three things have happened, in a blink of history’s eye: (1) a single medium, the Web, has come to dominate the storage and supply of information, (2) a single search engine, Google, has come to dominate the navigation of that medium, and (3) a single information source, Wikipedia, has come to dominate the results served up by that search engine. Even if you adore the Web, Google, and Wikipedia – and I admit there’s much to adore – you have to wonder if the transformation of the Net from a radically heterogeneous information source to a radically homogeneous one is a good thing. Is culture best served by an information triumvirate?

It’s hard to imagine that Wikipedia articles are actually the very best source of information for all of the many thousands of topics on which they now appear as the top Google search result. What’s much more likely is that the Web, through its links, and Google, through its search algorithms, have inadvertently set into motion a very strong feedback loop that amplifies popularity and, in the end, leads us all, lemminglike, down the same well-trod path – the path of least resistance. You might call this the triumph of the wisdom of the crowd. I would suggest that it would be more accurately described as the triumph of the wisdom of the mob. The former sounds benign; the latter, less so.

UPDATE: Interestingly, Britannica and Wikipedia seem to be headed toward a convergence in their editorial rules and regulations. After Wikipedia erroneously declared both Ted Kennedy and Robert Byrd dead on Inauguration Day, the Register noted that an embarrassed Jimmy Wales intensified his push to get the Wikipedians to adopt a policy of Flagged Revisions, which would require edits of sensitive articles, including those on living persons, to be vetted by editors before being incorporated into the Wikipedia site. (In what may be a preview of Wikipedia’s future, the Flagged Revisions policy has already been adopted by the German Wikipedia for all articles.) Such a move would, of course, represent a continuation of Wikipedia’s ongoing tightening of editorial controls over its content.

32 Responses to All hail the information triumvirate!

Their use of canonical URLs is very powerful. You can predict the URL for any given topic automatically, which allows you to sprinkle lots of links to Wikipedia into your hypertext. Your Content Management System can do it for you.

I wish more sites were as fanatical about their canonical URLs as Wikipedia is. Librarians could help, if they overcame some of their reservations about such things and settled for “worse is better.”
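To make the point about predictable URLs concrete, here is a minimal sketch of how a content management system might derive a Wikipedia link from a topic name. It assumes the English-language site and the familiar convention of underscores for spaces and a capitalized first letter; the function name `wikipedia_url` and the constant `BASE` are hypothetical, chosen only for illustration, and the sketch does not check that an article actually exists or handle ambiguous topics.

```python
from urllib.parse import quote

# Assumption: English Wikipedia's article path prefix.
BASE = "https://en.wikipedia.org/wiki/"


def wikipedia_url(topic: str) -> str:
    """Guess the canonical article URL for a topic name (hypothetical helper)."""
    # Collapse runs of whitespace and join the words with underscores.
    title = "_".join(topic.split())
    # Capitalize only the first character; leave the rest untouched.
    title = title[:1].upper() + title[1:]
    # Percent-encode anything that is not safe in a URL path.
    return BASE + quote(title)


if __name__ == "__main__":
    for topic in ["World War II", "Magna Carta", "epilepsy"]:
        print(wikipedia_url(topic))
```

A guess like this works only for topics with an unambiguous title; a name with several meanings would land on a disambiguation page or on the wrong article.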

While the obvious answer to “Is culture best served by an information triumvirate?” almost inevitably has to be, “No, of course it isn’t!” I believe the real answer has to lie in the provision of a viable alternative. Wikipedia is (potentially for discernible reasons) not dissimilar to democracy, in that it is the ‘best worst’ option, and therefore the first port of call. While the editorial processes of the big Wik might not be without their faults, people at least know that there will be something written there, and the caveats on many pages provide a useful steer. However, I’m not sure it would be groundless to believe that if Britannica were able to deliver a similar quantity of topics, its brand would win in terms of quality – which would one rather cite, or indeed trust, given the choice?

I’m all for a bit of competition. Perhaps the starting point is to emphasise editorial and sourcing policy, as you point out it’s crowd distillation vs qualified expertise. I’m reminded of Randy Pausch’s last lecture (http://uk.youtube.com/watch?v=ji5_MqicxSo), in which he pointed out how he learned a lot about EB’s editorial policy when he actually became a contributor. I’m sure that was tongue in cheek… wasn’t it?

Tim Bray’s comments are, as usual, insightful, and I’m sure the way URLs are formed contributes to the phenomenon we’re seeing. But I think what Tim calls “default linking” is the real issue. Wikipedia has, simply, become the default link for many people. I’m certain, in fact, that many people who link to a Wikipedia article do so without having actually read the Wikipedia article and, certainly, without having taken the time to compare different sources and choose the best one. You might call this “linking without thinking,” and it spells the doom of an information economy in which links are the main currency.

See also this small study: “We selected 100 terms from prominent U.S. and world history textbooks… Google listed Wikipedia as the number-one hit a remarkable 87 times out of 100. The encyclopedia came in second 12 times and third once. In other words, the Wikipedia site was listed among the top three Google hits 100 percent of the time.”

I believe you are thinking about it a bit out of sequence. Each of the big 3 you mention is a delivery channel, not the source of the information. Wikipedia has a lot of people writing and editing its content, many, many more than participated in the old physical medium environment.

Google shows Wikipedia because so many people select it as their source of information. One must assume they do that because they believe they are getting a high quality of information from that source (ignoring the potential for manipulation for a minute). How is that worse than picking a specific brand of dictionary or encyclopedia as your source?

And the technology infrastructure makes the processes cheap and fast enough so that many many more can take part as producers or consumers.

And this low cost makes it much easier for either Google or Wikipedia to be replaced if something better comes along.

Also, all of our family arguments over facts can now get settled with a quick search!

Remember when IBM and AT&T ruled us all? But then platforms changed, and Microsoft moved to the top. Now it is Google. (Yes, of course there are many differences; the story has rich detail; but the point is that things change.) Empires come and go. I remember a talk by Udi Manber, referring to a New Yorker cartoon. If I remember correctly, the cartoon had several frames. Initially, it was a shop named “Grandma’s Pies”. Next, a big storefront with the same name; later a frame with an immense factory, possibly named “Grandma Pie Enterprises”. In the final frame, there is a small store in front of the factory, perhaps with the name “Aunt Ellen’s Homemade Pies”.

And so it is possible that the game will change, the nature of authoritative explication will change (did you see the NYT article about how kids look things up first on YouTube, in preference to searching on Google?), and the dominance of Wikipedia will not be a worry.

When googling for Nicholas Carr, roughtype.com is still first and second, with Wikipedia sitting in third place. When Wikipedia is the first result when googling for your name, instead of your blog, people will probably understand there is a problem.

billpetro : Are you sure? Nick tells a very different story about that history. And I’ve found the EB people quite interested in the times these days. But they do want a business model – not web-evangelist’s ranting, which only pays the bills of said hucksters.

(not saying you’re guilty of that, just a general comment – the whole social media bibble-babble is very much being revealed to make money only for the blatherers, if that).

Um. Frankly, I don’t understand the fuss. Wikipedia is the #1 link on many search results. OK. So what? It’s not like it is the only result. When I am searching for something I either want to know the bare minimum, in which case almost anything will suffice (ergo, it does not matter who is in the first place), or I do care enough to scan at least the first half a dozen results (so, again, it does not particularly matter who is first). As for the people who always mindlessly click on the first search result and expect to see the ultimate truth… Well, I am afraid they can’t be helped. And most likely are never going to read (or care about) this here argument.

The more interesting story to me is the complete failure of every “create authoritative articles that will rank highly in Google” business to displace Wikipedia. (Examples: Squidoo, Mahalo, Google’s own Knol)

This is a response to a comment earlier in this thread. I have to disagree with Tim Bray’s assertion that Wikipedia’s dominance in Google is due to their “superior information architecture”. On the contrary they could adopt such a simple URL structure because there was not much information architecture going on. For example, if you look at any term that has more than one meaning, like “Paris”, the URLs are no longer obvious for the user to guess (e.g. /wiki/Paris,_Texas). Moreover, making the topic itself the unique identifier in the URL can create problems of persistence, because topics may change their name without changing their meaning. This URL structure does not offer that flexibility.

I do agree that some people may generate Wikipedia URLs directly, even through programmatic means, and it would work most of the time, which is a great advantage. The reason it works is not superior structure, but the sheer number of articles there. You can pick almost any topic at random, and there is an article out there on that topic, even if it is just a stub. It is this coverage that makes it possible for people to generate these URLs directly.

Just a brief comment about the search engine “juice” driven by the wiki architecture… I run a Semantic Mediawiki site (MyWikiBiz.com), and while it is only the 126,000th most popular site, according to Alexa, it achieves ridiculously strong results in the Google and Yahoo search algorithms. So, my site is living proof that Internet search is tripping over itself to prioritize wikis in the results. (And I’m not saying this is a good thing for society, either.)

I want to thank Nicholas Carr for so succinctly putting these concerns about dominance into layman’s terms. Just as species survive calamity thanks to variation in the gene pool, I fear that our “information gene pool” is extremely vulnerable now, thanks to Wikipedia.

I’m not sure how someone can consider either “the web” or Wikipedia as single entities. The biggest advantage of both of these entities is their distributed nature. Wikipedia doesn’t “report” anything; its members do. The web is controlled by no one person, company, country or group. And this may be nitpicking, but Google does not facilitate the “navigation” of the web, merely indexing and searching. We are free to navigate through countless other methods.

Wikipedia has simply replaced the World Book Encyclopedia at the local public library.

When I was growing up, my town library could not afford EB, so they bought WBE, and tried (frequently failing) to buy the update volume each year. If I needed information, I went to the library, read the WBE, and then searched the library for other sources of information. Some of my classmates would quickly write down bits of information from the WBE and leave. Others would stick around because their parents told them “stay for 2 hours”. Others like me would geek out and go from source to source willingly. Ultimately, some folks double-checked their research, and some took the easy route and took the first thing they found.

Wikipedia is that same “first thing they found.” Anybody with more than a vacuum in their head will know to check other sources, and to not take Wikipedia (or any other wiki) as gospel. Check the referenced sources on the Wikipedia site; check the other Google hits; go wild and crazy and do the same search at MSN or Yahoo; or, maybe take a trip to the local public library!

So the difference between 2009 (Google/Wikipedia) and 1974 (local library)? It is much easier for stupid and foolish people to be really, really lazy today. It is also much easier for wise and inquiring people to NOT be lazy, and find multiple sources of information.

Arguably, Wikipedia is not a single source, since no single person holds responsibility for it, or controls its contents. Every page is edited by different people. While it clearly has some problems, I think that unfortunately, many people are prone to believe everything they read on the internet anyway, and they could do worse than getting their information from Wikipedia. (Not everyone is as educated as you and I.)

As a side note, is the web really a single medium? Books and television are on it…

In sum, I strongly disagree with thekohser, in that I believe that our information gene pool has never been bigger or stronger. It would be a lot worse if there were a group of say 100 people who decided where our information comes from (e.g., what the Wikipedia articles say).

For most purposes, wikipedia searches are probably adequate. They may not be entirely comprehensive, they may not even be entirely accurate, but they are good enough, they come back fast, they aren’t filled with ad-spam, and the price is right. If you want a thoroughly comprehensive review of some topic written by a world expert, wikipedia may not provide it, but my guess is that that’s not what most internet searches are looking for.

You forgot the fourth co-conspirator — English language. Have you noticed how this technical standard operates with the same self-reinforcing monopolistic dynamics? The more folks use it, the more you HAVE to use it. Another stake in the heart of democracy?

You state: “It’s hard to imagine that Wikipedia articles are actually the very best source of information for all of the many thousands of topics on which they now appear as the top Google search result.”

It is not hard at all. Wikipedia articles are, on average, the best source for what most people want at the time: an easy to read overview that is as unbiased yet comprehensive as possible. What they, you and I don’t want at this time is a very creative, arty essay. Just some facts, ma’am.

You can prove this assertion very easily. Take your list and search for the article or web page you believe IS the best source. And time your search. I’d be very interested in seeing which articles you deem better than Wikipedia. We could then vote on whether others agreed with your choice.

Kevin Kelly – No, it isn’t. You’re misunderstanding how Google works. The model you give is a simplistic version, which is wrong at this level. There are many value-laden choices it makes in its algorithm. If it decided to rank news organizations over Wikipedia, it could do that.