Notabilia

Visualizing Deletion Discussions on Wikipedia

As Doc Searls recently put it, Wikipedia is, like the protocols of the Net, "a set of agreements". A Web protocol defines the way in which computers communicate with each other and make decisions to ensure successful transactions. Wikipedia policies have the same purpose, but instead of transactions between machines, they regulate human decisions. An important part of these decisions bears on which topics are suitable for inclusion in Wikipedia and which are not. The present project looks into the nature and shape of collective decisions about the inclusion of a topic in Wikipedia.

Notable topics and AfD discussions

Like a garden, an online encyclopedia needs constant weeding. Unlike a garden, an online encyclopedia has thousands of potential gardeners. Over the years Wikipedia has developed guidelines and policies to help editors collectively decide whether topics are suitable for inclusion or not. All articles, especially new ones, are reviewed by the community to determine if they meet Wikipedia's notability guidelines. Any editor can nominate an article for deletion and, if this nomination is legitimate, a community discussion takes place where fellow editors have the opportunity to make their voices heard. The usual process is a week-long discussion during which community members can argue for or against keeping the article. At the end of this period an administrator reviews the discussion and delivers the final verdict.

We analyzed and visualized Articles for Deletion (AfD) discussions in the English Wikipedia. The visualization above represents the 100 longest discussions that resulted in the deletion of the respective article. Each AfD discussion is represented by a thread starting at the bottom center. Each time a user joins an AfD discussion and recommends keeping, merging, or redirecting the article, a green segment leaning towards the left is added. Each time a user recommends deleting the article, a red segment leaning towards the right is added. As the discussion progresses, the length of the segments and their angle slowly decay.
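The drawing rule described above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the project's actual code: the function name `afd_thread`, the single-letter codes "K" and "D", the starting length, the turning angle, and both decay factors are assumptions chosen for the example. It also assumes that each segment turns relative to the previous one, which the description does not state explicitly.

```python
import math

def afd_thread(recommendations, length=10.0, turn=math.radians(15),
               length_decay=0.95, turn_decay=0.99):
    """Return the (x, y) points of one discussion thread.

    recommendations: iterable of "K" (keep, merge, or redirect) and "D" (delete).
    All numeric defaults are hypothetical values picked for illustration.
    """
    x, y = 0.0, 0.0
    heading = math.pi / 2              # the thread starts pointing straight up
    points = [(x, y)]
    for rec in recommendations:
        if rec == "K":
            heading += turn            # counter-clockwise: lean to the left
        elif rec == "D":
            heading -= turn            # clockwise: lean to the right
        else:
            raise ValueError(f"unknown recommendation: {rec!r}")
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
        length *= length_decay         # later segments get shorter
        turn *= turn_decay             # later segments bend less
    return points

# Example: two Deletes, then a Keep, then another Delete.
thread = afd_thread("DDKD")
```

The returned list holds one more point than there are recommendations, so consecutive pairs of points can be drawn as green or red segments by any plotting library.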

Whether consensus has been reached is decided by the administrator closing the AfD discussion, not by a headcount. As a result, the proportion of Keeps and Deletes may be at odds with the final decision, as illustrated by the above visualization. AfD discussions also take a variety of shapes depending on how they evolve over time.

Varieties of AfD patterns

Controversial

Particularly controversial discussions, where Deletes and Keeps alternate, tend to follow a straight line as opposing opinions balance each other and consensus is hard to reach.

Swinging

Discussions follow an S-shaped trajectory when a series of Deletes is followed by a consistent series of Keeps or vice versa. This pattern may indicate that participants with similar opinions flock together and join the discussion at the same time.

Unanimous

Unanimous discussions tend to get curly. An ideal discussion in which there is total agreement among participants will look approximately like a logarithmic spiral.

The following visualization represents the 100 longest discussions that did not result in the deletion of the article (i.e., the article was kept, merged or redirected):

Some interesting facts about AfD discussions

Articles can be nominated for deletion multiple times. A decision to keep an article resulting from an AfD can be overturned by a new nomination at a later time: the article "Sexuality of Robert Baden-Powell", for example, was deleted at the fourth nomination, after two discussions ending with no consensus and one ending with a Keep decision.

Long AfD discussions are exceptional, and not all AfD discussions are as crowded as those represented above. The longest AfD discussions do not give a representative picture of how this process works in most cases. The analysis of a large sample of AfD discussions (200K discussions that took place between November 2002 and July 2010) suggests that most of these discussions end after only a few recommendations are expressed.

Splitting up the chart by the outcome of the discussion, and the vote distribution, reveals some interesting facts:

Not all AfD discussions start with a Delete recommendation: sometimes the initial nominator suggests merging or redirecting the entry.

A typical discussion consists of only three or four recommendations. In most cases these discussions are so short simply because the topic under discussion is trivially non-notable.

Quite a few discussions are closed with a Keep decision right after the initial nomination (thereby overriding the nomination). Deletion decisions occurring after only one recommendation is expressed, in comparison, are much less frequent.

Delete decisions tend to be fairly unanimous. In contrast, we found many Keep decisions resulting from a discussion that leaned towards deletion: this can happen if no clear consensus emerged from the discussion, or if participants advocating deletion did not appeal to the right policy. In general, this means that administrators tend to apply the deletion guidelines in a conservative way.

Further analysis shows that consensus is often hard to reach. The figure below (left) shows the distribution of the duration (in seconds) of AfD discussions. While the vast majority last less than one week, a sizable minority take two weeks or more. Some discussions took more than one year to be closed!

How passionate are these discussions? The figure below (right) shows the distribution of activity rates (in "votes" per second), i.e. how fast new participants join a discussion. In general, one new participant per day is the norm, but in many cases the activity rate rises to one new participant every half hour.
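To relate the per-second units of the figure to the everyday rates quoted above, the following Python sketch computes an activity rate from a list of recommendation timestamps. The exact definition used for the figure is not given in the text, so this is an assumption: the rate is taken to be the number of recommendations after the first, divided by the seconds elapsed between the first and the last. The name `activity_rate` and the two constants are hypothetical.

```python
from datetime import datetime

def activity_rate(timestamps):
    """Return "votes" per second for one discussion.

    Assumed definition: recommendations after the first one, divided by the
    seconds between the first and the last recommendation.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two recommendations")
    ordered = sorted(timestamps)
    elapsed = (ordered[-1] - ordered[0]).total_seconds()
    if elapsed <= 0:
        raise ValueError("recommendations must span a positive duration")
    return (len(ordered) - 1) / elapsed

# The two rates mentioned in the text, expressed in votes per second.
ONE_PER_DAY = 1 / (24 * 60 * 60)     # about 1.16e-5
ONE_PER_HALF_HOUR = 1 / (30 * 60)    # about 5.56e-4

# Example: three recommendations, one day apart.
votes = [datetime(2010, 1, 1), datetime(2010, 1, 2), datetime(2010, 1, 3)]
rate = activity_rate(votes)
```

Under this definition, a discussion running at one new participant every half hour is 48 times as active as one running at the typical rate of one per day.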