Like offline communities, online communities need structure and social norms to remain useful and viable for their members. The technology that has enabled these new types of social formations to emerge has also created new types of problems for these communities. Online communities can be vulnerable to several types of abuse, including malicious postings and spam, and present new problems of information overload. As online communities grow from dozens of individual participants to thousands, the technical workload of limiting abuse can become problematic. Heterarchical moderation (whereby many, most, or all community members are given a small amount of power and responsibility for maintaining social norms and useful discussion) has recently emerged as an option to help limit abuse and promote community goals. This thesis examines three different online communities that employ heterarchical moderation regimes to help maintain community discussion norms. A large sample of conversations from each community is subjected to both quantitative and qualitative analyses to provide an overview of the implications of heterarchical discussion moderation.

I looked at a six-week sample of comments posted to Slashdot, Kuro5hin, and the now-defunct(?) Plastic.com to try and get at some of the discursive and social dynamics of distributed comment-rating systems. Looking back at it, I recognize a lot of things I could have done better, especially with my so-called "quantitative" analyses (hey, I was in a humanities program). But I had a lot of fun with the project, and some very smart referees were generous enough to pass it, so I'm not too embarrassed to open it to wider scrutiny.

I trimmed the lengthy appendices from the PDF -- the Slashdot and K5 data are still available in those sites' archives.

Business 2.0 revealed their list of the 50 people who matter -- the moguls, geeks, organizations, and groups who are defining the media landscape. It's interesting to see the world of new media through the eyes of business journalists -- they are wary of internet hype after getting burned so often in the last decade, but still willing to engage in Web 2.0 hyperbole. "You! Consumer as creator" (should that be produser?) takes the #1 spot. The writeup on Jimmy Wales and Kevin Rose caught my eye.

The New New Media:
Kevin Rose (Digg) and Jimmy Wales (Wikipedia)

Old media is all about reinforcing the importance of the institution as the editorial filter. The new new media is all about the importance of the reader as the editorial filter. Tens of millions of users can create a collaborative intelligence that's far smarter than any one editor could ever hope to be.

Why does that seem so familiar? Oh yeah, Rolling Stone crowned Rob Malda (creator of Slashdot) the "king of new new media" back in 2000. (Sorry I don't have a link). The coronation was based on similar grounds: the primacy of "users" on Slashdot, the key element of participation, and so on. So, actually, Digg and Wikipedia are the new new new media. In Web2.0world, everything new is new again, and again. (Incidentally, those classy journos at Business 2.0 also released a list of people who don't matter anymore. Malda made this list.)

I've talked with colleagues often about the concepts of novelty and banalization. Banalization, or the loss of novelty for users of a technology, is a treacherous path for Web 2.0 applications. Technologies that survive the process flourish -- email, mobile phones, TiVo, maybe Wikipedia and Flickr. Other technologies fail once the novelty wears off (see the endless parade of social networking services) -- the service they provide is not something that can become effectively normalized, but is novelty itself. Massive investments are being made into social media, and it's not clear that these are much less risky than those made in the late 1990s. Cool still !necessarily= profit.

While p2p filesharers are clearly a key potential constituency, other issues, such as a desire to reform the pharmaceutical patent system, indicate the party has a notion of the broader implications of the platform (or at least the PR savvy to tap into them).

The Pirate Party has a chance (albeit a very, very small chance) of putting a member in Parliament in the September general election. But even if they don't succeed, they have effectively forced the issue into the general political sphere. Other minor parties have adjusted their platforms to accommodate the Pirate Party's agenda and avoid losing votes. I'll be following the Party's progress this summer with interest.

As one might expect, citizens of other filesharing nations have followed suit and started their own Pirate Parties. There are even rumblings of a US version. While I doubt the PPUS will have a significant impact on US policy, I see nothing wrong with the creation of an overtly political organization dedicated to copyright reform. In a recent Slashdot discussion, one commenter noted that the US already has "legitimate organizations" like the EFF working towards these types of goals. But one reply was: "Well on the other hand, 'Electronic Frontier Foundation' doesn't make headlines. 'Pirate Party' does." I know I'll never have a chance to cast a vote for a Pirate, but a group like this has the potential to provoke a broader public debate on IP issues.

For one of my research projects this summer, I'm looking at Wikipedia content. Rather than tax WP's servers with thousands of queries, I thought it might be useful to run a local WP mirror. Because I am interested in the functioning of the MediaWiki software as well as WP content, I wanted a local MediaWiki install, and not just a database mirror. Getting this to work has been a challenge, so I thought I'd note a few things I've learned here for future reference.

1. HTTP server and MediaWiki
Used a default Apache install with PHP and MySQL connectors through Synaptic. MediaWiki 1.4 is also available in the repositories, but those clever MediaWiki hackers have already released versions 1.5 and 1.6. Wanting the latest and greatest, I installed 1.6 manually (very easy, using MW's slick mostly-browser-based installer). Later, as I had trouble using various import scripts, I installed version 1.5. In hindsight, I probably should have just used the package install.

2. MySQL
Installed MySQL through Synaptic. I ended up changing some of the default settings to speed the import process. Basically, I made my /etc/mysql/my.cnf file match the settings in the "my-large.cnf" example configuration file. Also, I disabled the log-bin option. Note that you will want to save your original my.cnf and change most of these options back after you complete the import, as these changes basically allow MySQL to use as much of your system's resources as it wants. Also double-check the MySQL data directory -- even a basic WP mirror will eat up a lot of gigs. I ended up sacrificing a 20GB hard drive to the ravenous Wikipedia database.
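For anyone who would rather script the log-bin change than edit the file by hand, here is a minimal Python sketch. It is not what I actually ran. The function names and the ".orig" backup suffix are made up for illustration, and it assumes the option sits on its own line as "log-bin" or "log_bin". It saves a copy of the original file first, since the settings need to go back after the import.

```python
import re
import shutil

# Matches an active (uncommented) log-bin line, with or without a value.
LOG_BIN_LINE = re.compile(r"^\s*log[-_]bin\s*(=|$)")


def disable_log_bin(config_text: str) -> str:
    """Return my.cnf-style text with any active log-bin line commented out."""
    patched = []
    for line in config_text.splitlines():
        if LOG_BIN_LINE.match(line):
            patched.append("# " + line.lstrip())
        else:
            patched.append(line)
    return "\n".join(patched) + "\n"


def backup_and_patch(path: str) -> None:
    """Hypothetical helper: keep the original as <path>.orig, then patch in place."""
    shutil.copy2(path, path + ".orig")
    with open(path, "r", encoding="utf-8") as handle:
        original = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(disable_log_bin(original))


# Small demonstration on an in-memory sample, so nothing on disk is touched.
sample = "[mysqld]\nlog-bin = /var/log/mysql/mysql-bin.log\nkey_buffer = 256M\n"
print(disable_log_bin(sample))
```

Calling backup_and_patch("/etc/mysql/my.cnf") would need root privileges, and MySQL has to be restarted before the change takes effect.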

3. Wikipedia dump
Dumps of Wikimedia Foundation project content are available at http://download.wikimedia.org/enwiki. I just wanted the most recent revisions of English Wikipedia articles, so I snagged http://download.wikimedia.org/enwiki/20060607/enwiki-20060607-pages-articles.xml.bz2. Save your file somewhere where you won't forget it.
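Before committing to a long import, it can help to know how many pages the dump holds. The sketch below is one possible way to find out, using only the Python standard library. It streams the compressed file instead of unpacking it to disk. The function name is hypothetical, and the code assumes the dump wraps each article in a "page" element inside some XML namespace, whose exact URI depends on the dump version.

```python
import bz2
import xml.etree.ElementTree as ET


def count_pages(dump_path: str) -> int:
    """Stream a bzip2-compressed MediaWiki XML dump and count its <page> elements."""
    pages = 0
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Tags look like "{namespace-uri}page"; compare only the local name.
            if elem.tag.rsplit("}", 1)[-1] == "page":
                pages += 1
                # Drop the finished page's text and children to limit memory use.
                elem.clear()
    return pages
```

Usage would be something like count_pages("enwiki-20060607-pages-articles.xml.bz2"). Expect it to take a while on a full dump, since the whole file has to be decompressed once.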

4. Import the dump using a script
The big database dumps are provided in XML format, which requires some massaging if you want to cram it back into a SQL database. A PHP script is provided with MediaWiki 1.5 and higher, in /maintenance/importDump.php. This script works as advertised, but is very slow (as few as 8 pages inserted per second -- with over 3 million pages in my smallish dump, this would be a long, lonely road).

An alternative is MWDumper, a Java program that imports pages much faster. You'll need to install Java, of course; I have the Blackdown JRE package installed, and it seems to work. Using MWDumper, I am currently getting about 100 pages imported per second, and am watching red links in local WP articles slowly turn blue.
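To put the two import rates in perspective, here is a back-of-the-envelope calculation in Python. It is only a rough sketch: it rounds the dump down to an even 3 million pages and treats 8 and 100 pages per second as constant rates, when in practice the first is a worst case and the second is just what I happened to observe.

```python
def import_hours(total_pages: int, pages_per_second: float) -> float:
    """Estimated wall-clock hours to import total_pages at a steady rate."""
    return total_pages / pages_per_second / 3600


TOTAL_PAGES = 3_000_000  # "over 3 million pages", rounded down for simplicity

for label, rate in (("importDump.php", 8), ("MWDumper", 100)):
    print(f"{label}: about {import_hours(TOTAL_PAGES, rate):.1f} hours")
```

At those rates the PHP script would need roughly 104 hours (more than four days), against a little over 8 hours for MWDumper.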