In the past I sued Sohu for deleting my blog posts, but now I want to
praise them. Sohu is the only BSP that posts notices on my blog saying,
“this post has been hidden/removed for certain reasons.” As a result,
when web users visit my Sohu blog, they can know that a post has been
hidden by Sohu. I think Sohu is brave to do this. I also run blogs on
Sina, Ifeng, and others, but they simply delete blog posts without
notifying my readers.

Last year I met Liu around the time he was trying to sue Sohu for deleting some of his blog posts. He argued that the censored posts, analyzing criminal court cases, did not contain material that violated Chinese law, and that Sohu was therefore violating his user agreement. The lawsuit didn't get very far. Liu told me at the time that he writes on a dozen or so different blog hosting services because they all seem to have different censorship criteria - so for Chinese bloggers maintaining multiple blogs is the best way to keep your writing alive on the web.

My conversation with Liu inspired a systematic study of how blog-hosting companies serving mainland China censor their users' content. All Chinese blog-hosting companies are required by government regulators to censor their users' content in order to keep their business licenses. But as Liu discovered, they all make different choices not only about how to implement censorship requirements, but also how to treat the users who get censored.

Most Chinese bloggers who want an audience inside mainland China use domestic Chinese blog-hosting services - only a very tiny minority use overseas services like Blogger or WordPress.com because they tend to be blocked, and even fewer have the tech skills to do their own custom WordPress installation on their own rented server space. The aim of my research was to look at the Chinese blog-hosting services (which include foreign brands offering services inside China to the Chinese market) and establish how much variation there is in terms of what gets censored and how it gets censored. Since it's not in the interest of people who work at blog-hosting companies to tell the truth about these things in great detail to a foreign researcher, I decided that the best way to do this would be to post a range of content across a number of blog-hosting services and track who censored what and how. With the help of John Kennedy, Ben Cheng, and some student research assistants, my team posted more than 100 pieces of content - passages from news items, blogs, and chatrooms of varying political sensitivity - consistently across 15 different Chinese blog-hosting platforms. We found that censorship levels and methods vary tremendously from company to company. I have written about some of the interesting findings that came up as we went along here, here, and here.

If I publish a chart naming who censors more than whom, it is likely
that those who censor less will get in trouble with the authorities. Therefore in the chart at right I have changed all the
company names to letters. Of 108 pieces of content on a variety of public affairs and news-related subjects from a variety of sources (ranging from Xinhua to dissident websites), the most censor-happy company deleted over half, while the most laid-back company censored only one. (Note that I only posted one item about FLG and one about Tiananmen because most bloggers expect those to be censored - it's more interesting to see how censorship works on topics that Chinese bloggers interested in current events might write about.)
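
A minimal sketch of how such a tally might be kept, assuming each test post is logged as a (host, item, outcome) record. The outcome labels, the function name `censorship_summary`, and the sample rows are hypothetical placeholders for illustration only; they are not data or tooling from the study.

```python
from collections import Counter

# Hypothetical observation log: (anonymized host letter, item id, outcome).
# These rows are made-up placeholders, not results from the research.
observations = [
    ("A", "item-001", "published"),
    ("A", "item-002", "deleted_after_publishing"),
    ("B", "item-001", "blocked_on_submit"),
    ("B", "item-002", "held_for_moderation"),
    ("C", "item-001", "published"),
    ("C", "item-002", "published"),
]


def censorship_summary(records):
    """Count, per host, how many test posts were censored and by what method."""
    totals = Counter()
    censored = Counter()
    methods = {}
    for host, _item, outcome in records:
        totals[host] += 1
        # Any outcome other than "published" is treated as censorship.
        if outcome != "published":
            censored[host] += 1
            methods.setdefault(host, Counter())[outcome] += 1
    return {
        host: {
            "posted": totals[host],
            "censored": censored[host],
            "rate": censored[host] / totals[host],
            "methods": dict(methods.get(host, {})),
        }
        for host in sorted(totals)
    }


summary = censorship_summary(observations)
for host, row in summary.items():
    print(host, row["censored"], "of", row["posted"], row["methods"])
```

Keeping the method alongside the count matters because two hosts with the same censorship rate may treat their users very differently, for example by blocking a post at submission versus quietly deleting it later.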

UPDATE: In reaction to numerous queries I'm willing to disclose
the list of blog hosts tested, but I will not say here on a public blog
which ones correspond to the letters on the chart. The blog hosts are,
in alphabetical order: Baidu, Blogbus, BlogCN, iFeng, Mop, MSN Live,
MySpace, Netease, QZone, Sina, Sohu, Tianya, Tom, Yahoo! China, YCool.

Below are updated slides from a presentation I gave at a recent conference discussing the details of my research results. A number of the slides illustrate how blog hosts not only censor different amounts of content but also use different censorship practices, with wide variation from service to service and depending on the nature of the content.

I am writing an academic paper about this research which - given the slow gears of academic journal publishing - will probably take a year or more to get published. Given how quickly things change, however, it's useful to many people for me to share my findings and solicit feedback now. Please post your feedback in the comments section at the bottom of this post, or send me thoughts by e-mail (see my "about" page).

Earlier this Fall I took my preliminary findings and showed them to a number of people who work for Chinese Web companies, to solicit their views on why different Chinese blog hosts are censoring their users' content so differently. In my presentation I offer several conclusions to be drawn from what was a very experimental and relatively small-scale project:

Internet filtering (“the great firewall”) is only one part of Chinese Internet censorship.

Domestic web censorship is not centralized at all.

Domestic web censorship is outsourced by government to the private sector.

Domestic web censorship is inconsistent - if you can't post successfully in one place, it's usually possible to post your content somewhere else, at least for a while.

The system of “managing” user-generated web content in China appears to follow a logic and approach similar to that of the system for controlling professional news media.

I will elaborate on these in my paper. I also identified a number of implications for researching Chinese censorship "inside the great firewall":

Need to do more to foster a global “user rights” movement demanding greater transparency and accountability by Internet companies on privacy and free expression. The Global Network Initiative is a good start in this regard but we need much more.

There is also a set of more global questions:

Where else in the world is this kind of political censorship by web service companies of user-generated content happening? (Companies in the West already censor for child porn, copyright violations and sometimes hate speech.)

Will the “Chinese model” - in which governments demand censorship by web companies - spread globally?

What issues in this vein should the advocacy community be preparing for?

What further research needs to be done to better understand global trends?

Got any views or anything to add about any of the above? Please hit the comments section.

UPDATE (Dec. 1st): The good folks at YeeYan have translated this blog post into Chinese, here.

Comments

Very interesting, impressive and important, Rebecca! Any chance you might do a voiceover on the slideshow? Then it could serve as an introduction to the issues for a much broader audience, I believe.

Great post, sounds like a fascinating project. One thing that I was thinking of as I read the post and looked through the slides was ordering some of the censorship by type.

It seems like there is a range from automated filtering (if there is a keyword match in the title/body, the entry cannot be posted, or the keywords are starred out) to manual deletion (where entries are looked at by a human and approved before they are publicly posted, or are outright deleted, sometimes at a later time). It would be nice to see a breakdown of the types.

I'm also wondering about the TCP resets. It might be nice to geolocate the IPs that one connects to when making a post. My guess is that the ones displaying the TCP resets are hosted outside of China, indicating that the "firewall" could be adding yet another layer to blog censorship.

Thanks Nart! Insightful thoughts as always. Yes there is definitely a range of automated and human methods as you say. Oftentimes it also seems like there's a combination - something is flagged by keyword then confirmed by a human. Other cases are hard to tell whether a human was involved or not without getting confirmation from the companies. I should be able to break it down a bit for the paper - though in some cases not conclusively.

Yes, the one blog host displaying TCP resets appears to be hosting outside China. However, interestingly, the same content posted on some other blogs does not trigger the GFW, so it's something the company itself is doing.

Today's NYT article on Google censorship (http://tinyurl.com/54svs7) gives some further insight into the types of censorship, especially the role that the community has - Google will make human decisions about material flagged as inappropriate. Saudi Arabia's state censors are doing a similar thing: http://techdirt.com/articles/20081116/1953102841.shtml

Also, on the topic of "user rights" I think there is much to be done. See the story of Michelle Malkin in the above NYT link. Even when Google notifies that results have been censored or videos removed, the notice lacks the granularity it could have. For example, what law was violated? What term of service was violated? Who, specifically, complained?

Rebecca MacKinnon's empirical research on Chinese Blog Service Providers' censorship patterns is arguably the first systematic attempt to map what I call "special speech zones" (SSZs). By special speech zones I mean areas (virtual or physical) set aside by authorities or hosts where certain speech is allowed or disallowed. While MacKinnon aims to map out which content items are allowed or disallowed in which major Chinese blog hosting sites, she has in effect initiated a "cat-chase-mouse" game of exposing the special speech zones.