Word Count: A great way to waste time

By Jim Regan, csmonitor.com
August 5, 2004

HALIFAX, NOVA SCOTIA - Procrastinators of the world, rejoice. This week we have a website with an attractive, minimalist design, and both more and less content than almost anything else on the Web. And yet this is a site that, for most visitors, will serve absolutely no practical use even as it eats up otherwise productive time. Welcome to WordCount.

WordCount is, in the words of creator Jonathan Harris, "an artistic experiment in the way we use language." Onscreen, it's a scrolling list of the 86,800 most commonly used words in the English language - arranged from the most frequently repeated, "the," to the least, "conquistador." (So while there may be no content in the form of plot or point, WordCount is uniquely comprehensive in its meaninglessness.)

The information on which this ranking is based comes from the British National Corpus, and includes every word that appears two or more times within that resource. A somewhat larger collection of words (verbs, nouns, proper names, even trademarks), the BNC was compiled between 1991 and 1994 from unscripted conversation as well as such written sources as newspapers, periodicals, academic and popular books, letters and memoranda, and even school and university essays.
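For the curious, the ranking rule described above (count every word, keep those that appear two or more times, order from most to least frequent) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method WordCount or the BNC actually uses: the function name `rank_words`, the `min_count` parameter, the simple tokenizer, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def rank_words(text, min_count=2):
    """Return words ordered from most to least frequent.

    Words seen fewer than min_count times are dropped, mirroring the
    "two or more times" cutoff mentioned in the article. The tokenizer
    is a deliberately naive assumption: lowercase letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    kept = [(word, n) for word, n in counts.items() if n >= min_count]
    # Highest count first; ties broken alphabetically so the order is stable.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in kept]


# A made-up sample sentence, not BNC data.
sample = "the cat and the dog and the bird saw a cat"
ranking = rank_words(sample)
print(ranking)
```

On a real corpus the tokenizer would matter a great deal (proper names, trademarks, hyphenated words), which is one reason any such ranking is approximate.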

Naturally, due to shifts in language related to time and geography, these figures aren't absolute. For example, "Internet" will certainly have climbed the rankings since '94, while a survey done in the US would doubtless reveal a (sadly) higher ranking for "dude" (which currently rests just ahead of "ursodeoxycholic" at 35,307) and a lower one for "spiffing" (which is already near the bottom of the list at 78,084). Still, unless you have some specific scientific need for accuracy, the collection more than serves its stated purpose - to make the visitor feel "embedded in the language, sifting through words like an archaeologist through sand."

The means for sifting this information (and the only active page of the site) is a Flash interface offering multiple methods for exploration. The words themselves are displayed in a format that surfers will be familiar with in the context of an interactive timeline, with user commands scrolling the visible contents from one segment of the line to another. Words are presented in a continuous string, without spacing, but alternating text colors of grey and black effectively delineate one word from the next.

Words are also scaled in accordance with popularity, getting smaller as their number climbs. (Naturally, this renders the text unreadable very quickly - so once a particular destination on the list is located, the words onscreen are enlarged to a legible size.)

As mentioned, visitors have several options when it comes to navigating the collection. The first is the simple act of clicking at any spot along the timeline - which should be a comfortable option for anyone who has ever chosen a travel destination by closing their eyes and throwing a dart at a map. (The unpredictable nature of the results has its attractions, though my attempts to randomly locate "serendipity" were unsuccessful.)

A Find Word search offers the best option for the goal-oriented, and a By Rank search exists for those who might simply want to know what word corresponds with a number like 3,992. (Look it up yourself.) Finally, Previous and Next arrows are the most basic options, and while moving through 87,000 words via this method is only for the true, dedicated procrastinator, the arrows can be useful in ferreting out the coincidental relationships between a chosen word and those around it. (Such as the 705-706 pairing of "computer" and "security" or the 992-995 set of "America ensure oil opportunity.")
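The three ways of getting around (by word, by rank, and by stepping to neighbors) boil down to lookups on a single ordered list. Here is a minimal Python sketch, assuming the ranking is already available as a list ordered from most to least frequent. The class `RankedList` and its method names are hypothetical and say nothing about how the site's Flash interface is actually built; the four-word toy list is invented and is not real WordCount data.

```python
class RankedList:
    """Toy model of a frequency-ranked word list (ranks start at 1)."""

    def __init__(self, words_in_rank_order):
        self._by_rank = list(words_in_rank_order)
        # Reverse index: word -> rank, for the "Find Word" style lookup.
        self._by_word = {w: i + 1 for i, w in enumerate(self._by_rank)}

    def find_word(self, word):
        """Return the rank of word, or None if it is not on the list."""
        return self._by_word.get(word.lower())

    def by_rank(self, rank):
        """Return the word at the given rank, or None if out of range."""
        if 1 <= rank <= len(self._by_rank):
            return self._by_rank[rank - 1]
        return None

    def neighbors(self, word, radius=1):
        """Return the word together with up to radius words on each side."""
        rank = self.find_word(word)
        if rank is None:
            return []
        start = max(0, rank - 1 - radius)
        return self._by_rank[start:rank + radius]


# Invented placeholder words, not actual rankings.
toy = RankedList(["alpha", "beta", "gamma", "delta"])
print(toy.find_word("gamma"), toy.by_rank(2), toy.neighbors("gamma"))
```

Keeping both the ordered list and the reverse dictionary makes each kind of lookup a single step, at the cost of storing every word twice.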

But "nearest neighbor" relationships aren't the only way to squander the hours at WordCount. Many visitors are seeking validation by checking out the popularity of their names. (Condolences to any Alonsos out there.) Political name counters on the left will be pleased that John (266) ranks higher than George (913), while those on the right can point out that Republican (4,634) scores better than Democrat (7,135). If you were born from January to September, you can see what word corresponds to your five-digit birth date - if October or later, you can still find a match for month and day. (Mine happened to be "positive," which, while uncannily accurate, succeeded in putting a lot of pressure on me that I REALLY don't need in my life at this particular moment...)

If you enjoy Googlewhacking, you might also try your hand at finding words which didn't make it into the top 86,800 (like "procrastinate"), and if you have a desire to truly annoy your friends, you can start sending letters consisting entirely of numbers.

The possibilities are, as they say, endless, and there's also the option to debate what, if anything, the list tells us about our culture at this moment in history. Does the fact that "love" (384) scores better than "sex" (1,236) reveal a greater desire for emotional than physical intimacy in today's society - or simply that very few people tend to say things like, "I want to have sex with my new car!"?

Future plans include the ability to measure word counts within specific texts, websites, and even the Internet as a whole. (At which point "htm" will probably rocket into the top 100.) Until then there is much time to be wasted, and no time to lose.