Monthly Archives: May 2009

The thing about censorship is that, when done well, no one really knows what’s being censored. This is why last week’s leaked documents from Baidu, the largest Chinese-language search engine and blogging site, are so titillating. Maybe someone screwed up bad, or maybe someone on the inside had an attack of transparency; whatever the reason, we now have a huge pile of documents detailing Baidu’s censorship policy during the period from November 2008 to March 2009.

Whee!

The documents, now safely ensconced in a permanent home on Wikileaks, reveal for the first time a detailed inventory of the Chinese government’s priorities for, er, harmonization. There is a blacklist of 798 specific URLs, most of which seem to be recent news articles and discussion forum posts on sites both inside and outside of China. Far more interesting is a long list of sensitive keywords. Included policy documents suggest that the appearance of any of these terms in a blog post triggers a manual review by the staff of Baidu’s censorship team, whose names are listed in another of the leaked documents! While some of these topics have long been outright censored, such as “Tiananmen Square,” others are more general categories to be watched. Taken together, these sensitive terms are a fascinating portrait of China’s institutional paranoia.

Some categories are obvious, such as “Taiwan” and “naked chat.” Other areas are shockingly broad, such as “power” and “tyranny.” Certain media outlets such as Voice of America are considered unacceptable, and “SMS the answer” is forbidden within the “exam information” section. Also, China does not have any ketamine, AIDS, or ethnic conflict, and frowns upon one-night stands. The main document of interest begins,

And I can’t read that either, so below is an automated translation, via The Dark Visitor, who clearly used something more formidable than Google Translate. Still, machine translation really doesn’t work as well as one might like, or perhaps “electric chicken” makes perfect sense in context.

We dream of the internet as a great public meeting place where all the world’s cultures interact and learn from one another, but it is far less than that. We are separated from ourselves by language, culture and the normal tendency to seek out only what we already know. In reality the net is cliquish and insular. We each live in our own little corner, only dimly aware of the world of information just outside. In this the internet is no different from normal human life, where most people still die within a few kilometers of their birthplace. Nonetheless, we all know that there is something else out there: we have maps of the world. We do not have maps of the web.

I have met people who have never seen a world map. I once had a conversation with herders in the south Sahara who asked me if Canada was in Europe. As we talked I realized that the patriarch of the settlement couldn’t name more than half a dozen countries, and had no idea how long it might take to get to any of the ones he did know. He simply had no notion of how big the planet was. And to him, the world really is small: he lives in the desert, occasionally catches a ride to town for supplies, and will never leave the country in which he was born.

Online, we are all that man. Even the most global and sophisticated among us does not know the true scope of our informational world. Statistics on the “size” of the web are surprisingly hard to come by and even harder to grasp; learning that there are a trillion unique URLs is like being told that the land area of the Earth is 148 million square kilometers. We really have no idea what we’re missing, no visceral experience that teaches our ignorance.