On “Geek” Versus “Nerd”

To many people, “geek” and “nerd” are synonyms, but in fact they are a little different. Consider the phrase “sports geek” — an occasional substitute for “jock” and perhaps the arch-rival of a “nerd” in high-school folklore. If “geek” and “nerd” are synonyms, then “sports geek” might be an oxymoron. (Furthermore, “sports nerd” either doesn’t compute or means something else.)

In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:

geek – An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.

nerd – A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.

Both are dedicated to their subjects, and sometimes socially awkward. The distinction is that geeks are fans of their subjects, and nerds are practitioners of them. A computer geek might read Wired and tap the Silicon Valley rumor-mill for leads on the next hot-new-thing, while a computer nerd might read CLRS and keep an eye out for clever new ways of applying Dijkstra’s algorithm. Note that, while not synonyms, they are not necessarily distinct either: many geeks are also nerds (and vice versa).

An Experiment

Do I have any evidence for this contrast? (By the way, this viewpoint dates back to a grad-school conversation with fellow geek/nerd Bryan Barnes, now a physicist at NIST.) The Wiktionary entries for “geek” and “nerd” lend some credence to my position, but I’d like something a bit more empirical…

“You shall know a word by the company it keeps” ~ J.R. Firth (1957)

To characterize the similarities and differences between “geek” and “nerd,” maybe we can find the other words that tend to keep them company, and see if these linguistic companions support my point of view?

Data and Method

(Note: If you’re neither a geek nor a nerd, don’t be scared by the math. It’s not too bad… or you can probably just skip to the “Results” subsection below…)

I analyzed two sources of Twitter data, since it’s readily available and pretty geeky/nerdy to boot. The first is a background corpus of 2.6 million tweets collected via the streaming API between December 6, 2012, and January 3, 2013. The second is a sample of tweets collected via the search API matching the query terms “geek” and “nerd” during the same time period (38.8k and 30.6k total, respectively). Yes, yes, yes… I collected all the data six months ago but just now got around to crunching the numbers. It’s been a busy year!

A great little statistic for measuring how much company two words tend to keep is pointwise mutual information (PMI). It’s commonly used in the information retrieval literature to measure the co-occurrence of words and phrases in text, and it also turns out to be a good predictor of how humans evaluate semantic word similarity (Recchia & Jones, 2009) and topic model quality (Newman et al., 2010).

For two words w and v, the PMI is given by:

$$\mathrm{PMI}(w, v) = \log \frac{P(w, v)}{P(w)\,P(v)} = \log P(w \mid v) - \log P(w),$$

where in this case $P(\cdot)$ is the probability of the word(s) in question appearing in a random tweet, as estimated from the data. For instance, if we let v = “geek,” we compute the log-probability of a word w in the “geek” search corpus, and subtract the log-probability of w in the background corpus.
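For the curious, here is a minimal Python sketch of that calculation. It is an illustration rather than the script I actually ran: the function names, the lowercase-and-split tokenization, and the toy tweets are all assumptions made up for this example. Each word's probability is estimated as the fraction of tweets that contain it, and words missing from the background corpus are skipped because their PMI is undefined.

```python
import math
from collections import Counter


def tweet_probabilities(tweets):
    """Estimate, for each word, the fraction of tweets that contain it."""
    counts = Counter()
    for tweet in tweets:
        # Count each word once per tweet (assumed tokenization: lowercase + split).
        counts.update(set(tweet.lower().split()))
    total = len(tweets)
    return {word: n / total for word, n in counts.items()}


def pmi_scores(search_tweets, background_tweets):
    """PMI(w, v) = log P(w | v) - log P(w), for words seen in both corpora."""
    p_given_v = tweet_probabilities(search_tweets)    # P(w | v), from the search corpus
    p_background = tweet_probabilities(background_tweets)  # P(w), from the background
    return {
        word: math.log(p_given_v[word]) - math.log(p_background[word])
        for word in p_given_v
        if word in p_background
    }


# Toy data, invented purely for illustration.
geek_tweets = ["geek collection boxset", "geek gadget", "geek lunch"]
background = ["lunch today", "lunch plans", "new gadget",
              "boxset sale", "rain", "lunch again"]

scores = pmi_scores(geek_tweets, background)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word:10s} {score:+.2f}")
```

In this toy example “gadget” shows up in a larger share of the “geek” tweets than of the background tweets, so its score is positive, while “lunch” is more common in the background and scores negative.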

Results

The PMI statistic measures a kind of correlation: a positive PMI score for two words means they “keep great company,” a negative score means they tend to keep their distance, and a score close to zero means they bump into each other more or less at random.

With that in mind, here is a scatterplot of various words according to their PMI scores for both “geek” and “nerd” on different axes (ignoring words with negative PMI, and treating #hashtags as distinct):

Many people have asked for a high-res PDF of this plot, so here you go.

Moving up the vertical axis, words become more geeky (“#music” → “#gadget” → “#cosplay”), and moving left to right they become more nerdy (“education” → “grammar” → “neuroscience”). Words along the diagonal are similarly geeky and nerdy, including social (“#awkward”, “weirdo”), mainstream tech (“#computers”, “#microsoft”), and sci-fi/fantasy terms (“doctorwho,” “#thehobbit”). Words in the lower-left (“chores,” “vegetables,” “boobies”) aren’t really associated with either, while those in the upper-right (“#avengers”, “#gamer”, “#glasses”) are strongly tied to both. Orange words are more geeky than nerdy, and blue words are the opposite. Some observations:

Collections are geeky. All derivatives of the word “collect” (“collection,” “collectables”, etc.) are orange. As are “boxset” and “#original,” which imply a taste for completeness and authenticity.

The science & technology words differ. General terms (“#computers,” “#bigdata”) are on the diagonal — similarly geeky and nerdy. As you splay up toward more geeky, though, you see products, startups, brands, and more cultish technologies (“#apple”, “#linux”). As you splay down toward more nerdy you see more methodologies (“calculus”).

#Hashtags are geeky. OK, sure, hashtags are all over the place. But they do tend toward the upper-left. And since hashtags are “#trendy,” I take it to mean that geeks are into trends. (I take this one back. The average PMI score for all hashtags is 0.74 with “geek” but 0.73 with “nerd.” The difference isn’t statistically significant using a paired t-test or Wilcoxon test, or practically significant using a common-sense test.)

Hobbies: compare the more geeky pastimes (“#toys,” “#manga”) with the more nerdy ones (“chess,” “sudoku”).

Brains: the word “intelligence” may be geeky, but “education,” “intellectual,” and “#smartypants” are nerdy.

Reading: “#books” are nerdy, but “ebooks” and “ibooks” are geeky.

Pop culture vs. high culture: “#shiny” and “#trendy” are super-geeky, but (curiously) “cellist” is the nerdiest…

The list goes on. If you want to poke around yourself, download the raw PMI scores (4.2 MB) and let me know in the comments what you find. Since many people have asked: I computed PMI for all words appearing in the search tweets with “geek” and “nerd” (millions) and then manually scanned roughly 7,500 words with positive PMI scores for both. The scatterplot contains about 300 words that I hand-picked because they made sense.

(Update: I learned that Olivia Culpo — a self-described “cellist nerd” — was crowned Miss Universe on December 20, 2012. The event was heavily tweeted smack in the middle of my data collection, so that probably explains the correlation between “cellist” and “nerd” here. It also underscores the limitations of time-sensitive data.)

Conclusion

In broad strokes, it seems to me that geeky words are more about stuff (e.g., “#stuff”), while nerdy words are more about ideas (e.g., “hypothesis”). Geeks are fans, and fans collect stuff; nerds are practitioners, and practitioners play with ideas. Of course, geeks can collect ideas and nerds play with stuff, too. Plus, they aren’t two distinct personalities as much as different aspects of personality. Generally, the data seem to affirm my thinking.

I wonder how similar the results would be if you applied this method to the Google Books Ngrams corpus, or something more general instead of a niche medium like Twitter. I also wonder what other questions might be answered with this kind of analysis (for example, my wife and I have a perennial disagreement over which word is wetter: “moist” or “damp”).

Finally, when I mentioned to a friend that I was going to write up this post, she said “Well, I guess we know which one you are.” But do we really? I may be a science nerd, but I’m probably a music geek…

Update (June 25, 2013): Woah. This has gotten more attention than I ever anticipated. A few impressions. (1) Prior to writing this, I had no idea there was a “geek vs. nerd” holy war in certain corners of the Internet; fueling these flamewars was certainly not my intent. Lighten up! (2) I fear I’ll be better known for this diversion than for any of my “real” research. To be clear: this was a fun way to kill a few hours on a Saturday afternoon, not necessarily my best science. I think the writeup here is sound and self-evident, but I’m the first to acknowledge that there are better corpora, methods, and analysis techniques — which could use a grant, grad student, and/or more than an afternoon — for uncovering this all-important “Truth.” (3) For those interested in the etymologies of “geek” and “nerd,” I found this cool writeup.

Very interesting and thought-provoking. Thank you. I never really thought about what the differences are between geek and nerd. Your post is an eye-opener, showing that near-synonyms can be analyzed statistically like this.

Nice, but there is a problem with your analysis if the number of occurrences of a word is low. Then your log-odds can become high even though there is in fact no interesting association between the word in question and either geek or nerd. This could explain the curious ‘cellist’, which is not nerdy at all… You could filter out things that appear too few times, or, somewhat nerdier, add pseudo counts to word occurrences.
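Both remedies suggested in this comment can be sketched in a few lines of Python. This is a hypothetical illustration, not part of the original analysis: the function names, the add-`pseudo` smoothing of a present/absent count per tweet, the `min_count` threshold, and the toy corpora are all assumptions. Pseudo-counts pull the scores of rare words slightly toward zero, while the minimum-count filter drops them outright.

```python
import math
from collections import Counter


def tweet_counts(tweets):
    """Count, for each word, the number of tweets that contain it."""
    counts = Counter()
    for tweet in tweets:
        counts.update(set(tweet.lower().split()))
    return counts


def smoothed_pmi(search_tweets, background_tweets, pseudo=1.0, min_count=0):
    """PMI with optional pseudo-counts and a minimum-count filter (a sketch)."""
    search = tweet_counts(search_tweets)
    background = tweet_counts(background_tweets)
    n_search, n_background = len(search_tweets), len(background_tweets)
    scores = {}
    for word, count in search.items():
        if count < min_count:
            continue  # too rare in the search corpus to trust
        # Add `pseudo` to both outcomes (word present / word absent in a tweet).
        p_search = (count + pseudo) / (n_search + 2 * pseudo)
        p_background = (background[word] + pseudo) / (n_background + 2 * pseudo)
        if p_background == 0:
            continue  # unseen in the background and no smoothing: PMI undefined
        scores[word] = math.log(p_search) - math.log(p_background)
    return scores


# Toy corpora, invented for illustration: "cellist" appears once in each.
search = ["nerd chess"] * 40 + ["nerd cellist"] + ["nerd"] * 59
background = ["chess"] * 50 + ["cellist"] + ["hello"] * 9949

raw = smoothed_pmi(search, background, pseudo=0.0)
smooth = smoothed_pmi(search, background, pseudo=1.0)
filtered = smoothed_pmi(search, background, pseudo=1.0, min_count=5)
```

On this toy data the smoothed score for “cellist” is only a little lower than the raw one, so with corpora of very different sizes the minimum-count filter is the more decisive of the two remedies.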

It would be interesting to see a variant of the analysis that, to your point, shows relative frequency information, perhaps in this visualization it could be achieved by word size or transparency (e.g. low frequency = more transparent text in the graph).

I suspect the cellist-nerd pair would still show up due to something occurring right in the middle of the time period those tweets cover:

Adding a time distribution element to the results would uncover the persistence or transience/emergence of each correlation, so you’d be able to identify the sudden emergence of the cellist nerd on Dec 20th.

Nice find! I’m pretty sure that the cellist-nerd correlation has something to do with this event, then. And yes, there are all kinds of interesting variants one can imagine… among other things, to overcome the problems of analyzing such a short time window.

Woah!! This is brilliant! I can totally relate to it when you talk about the drastic difference between geek and nerd. Well, I am kind of both because I am a statistician by profession and also a tennis player! So, I can go from being a total nerd from 9 a.m. to 5 p.m. to being a sports geek on the tennis court! At times I am amazed myself at the stark contrast in my personality at work and on the field. Kudos to a great post and congrats on getting freshly pressed.

Interesting study. I’m glad I can now use the terms in a well-defined manner.

I have to say I’d question the reliability of your data. Intermittent trends could cause correlations to increase for a short period. Since internet trends are generally quite short-lived, I’m not sure how relevant this would be to your data, but sampling over a longer time-period would correct for this.

I’d be interested to see this kind of sampling done with other words too. It could actually throw up all sorts of things, such as changes in the perception of words, how words are changing in meaning, the evolution of new words, etc.

Perhaps, gathering the statistics and formulating them, and finding a display for them, categorizes you as a geek, but the impetus, and the probable enjoyment of the results, marks you as a nerd. LOL Great piece of work (and now I’m not sure which one I am)….

Nice article. I’m a clinical human factors engineer by trade, a data visualization enthusiast by design, and an avid sci-fi/music/Ameritrash boardgame geek by coincidence. Generally, the term “nerd” applies to the sciences that interest me at work, while the term “geek” applies to the sciences that interest me at home. P.S. Geeks around here tend to be drawn to R, nerds to Stata. 😉

‘moist’ and ‘damp’ are often used in slightly different contexts, but they still occur together frequently enough on the Web for us to determine significant differences in intensity. We recently wrote a paper about how this can be done: http://www.demelo.org/gdm/intensity/.

In this case, it seems that “damp” is more frequently thought of as stronger.

While that’s true, I meant “cultish technology” in the same way you refer to a “cult movie” like The Princess Bride or The Big Lebowski. Not that it’s on the fringes, but that it has a cult following within and beyond the mainstream.

It’s just astonishingly remarkable how you shed light on these two words that have always beat the world. I can’t exactly place myself as either a nerd or a geek, but one thing is certain… everybody has got both a nerd and a geek in them! Thanks, pal

My impression is that the term “nerd” has changed its meaning over the last few decades, from something rather negative to something that’s almost mainstream. That’s why I’m not sure what people are referring to when they use the term “nerd”. I’ll probably link to this article in the future 🙂

It seems to me that a “sports nerd” might describe someone who obsesses over sports scores and statistics but does not actually participate in sports. I’m not one myself, so I don’t have a lot of insight into the psychology, but it was the first thing I thought of.

I disagree with the primary conclusion that geekiness is about collecting stuff, though. (Anecdotally, I don’t collect anything, and I’m definitely a geek about a lot of things.) The collection angle may still be true for some kinds of geeks, but the key distinction I see (in this data and more generally) is that geekiness is associated with technology and an outward-looking focus. Nerdiness is associated with an inward-looking focus on a usually academic topic.

Nice post! I knew there was a difference between those two terms, but believe it or not, it’s very easy to be both. I consider myself both a geek and a nerd… one with more strength than the other, but still the two…

Well, to my mind a geek is a SPECIALIST – having an obsessive special interest in one or two topics (such as Spiderman comics of the late 60s) – while a nerd is a GENERALIST – trying to be pedantically knowledgeable about anything and everything.

Found you in the “Freshly Pressed” email. I rarely click on any of those links, but I’m so glad I read this. This should go viral. I am comforted that one can be styled a geek/nerd. I have my own dilettante/professional struggle. Maybe I should pray, “Grant me the geekiness to enjoy the things I cannot master, the nerdiness to persist with the things I can, and the patience to tolerate the difference.”

Very interesting! It would be great to keep this list of words and compute the Google distance instead to see if it corroborates your results. http://en.wikipedia.org/wiki/Normalized_Google_distance
If you have the list of words, I will be happy to contribute to your investigations.

First of all, totally nerdy that you used .png for the graphic: I love it!

Next, I am clearly more of a nerd, but I want to be a geek, at least where books are concerned. I am definitely a music geek (I have a pretty huge record collection), but have longed to become a sci-fi fantasy geek. Sadly I’m too busy being a nerd about science and other topics to spend much time geeking.