On “Geek” Versus “Nerd”

To many people, “geek” and “nerd” are synonyms, but in fact they are a little different. Consider the phrase “sports geek” — an occasional substitute for “jock” and perhaps the arch-rival of a “nerd” in high-school folklore. If “geek” and “nerd” are synonyms, then “sports geek” might be an oxymoron. (Furthermore, “sports nerd” either doesn’t compute or means something else.)

In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:

geek – An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.

nerd – A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.

Both are dedicated to their subjects, and sometimes socially awkward. The distinction is that geeks are fans of their subjects, and nerds are practitioners of them. A computer geek might read Wired and tap the Silicon Valley rumor-mill for leads on the next hot-new-thing, while a computer nerd might read CLRS and keep an eye out for clever new ways of applying Dijkstra’s algorithm. Note that, while not synonyms, they are not necessarily distinct either: many geeks are also nerds (and vice versa).

An Experiment

Do I have any evidence for this contrast? (By the way, this viewpoint dates back to a grad-school conversation with fellow geek/nerd Bryan Barnes, now a physicist at NIST.) The Wiktionary entries for “geek” and “nerd” lend some credence to my position, but I’d like something a bit more empirical…

“You shall know a word by the company it keeps” ~ J.R. Firth (1957)

To characterize the similarities and differences between “geek” and “nerd,” maybe we can find the other words that tend to keep them company, and see if these linguistic companions support my point of view?

Data and Method

(Note: If you’re neither a geek nor a nerd, don’t be scared by the math. It’s not too bad… or you can probably just skip to the “Results” subsection below…)

I analyzed two sources of Twitter data, since it’s readily available and pretty geeky/nerdy to boot. This includes a background corpus of 2.6 million tweets via the streaming API from between December 6, 2012, and January 3, 2013. I also sampled tweets via the search API matching the query terms “geek” and “nerd” during the same time period (38.8k and 30.6k total, respectively). Yes, yes, yes… I collected all the data six months ago but just now got around to crunching the numbers. It’s been a busy year!

A great little statistic for measuring how much company two words tend to keep is pointwise mutual information (PMI). It’s commonly used in the information retrieval literature to measure the co-occurrence of words and phrases in text, and it also turns out to be a good predictor of how humans evaluate semantic word similarity (Recchia & Jones, 2009) and topic model quality (Newman et al., 2010).

For two words w and v, the PMI is given by:

$$\mathrm{PMI}(w, v) = \log \frac{P(w, v)}{P(w)\,P(v)} = \log P(w \mid v) - \log P(w),$$

where P(·) in this case is the probability of the word(s) in question appearing in a random tweet, as estimated from the data. For instance, if we let v = “geek,” we compute the log-probability of a word w in the “geek” search corpus, and subtract the log-probability of w in the background corpus.
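For the curious, here is a minimal Python sketch of that calculation. It is not the code I actually ran: the whitespace tokenizer, the function names, and the toy tweets are all assumptions made up for illustration. It estimates the probability of each word appearing in a random tweet, then subtracts the background log-probability from the search-corpus log-probability.

```python
import math
from collections import Counter

def tweet_probabilities(tweets):
    """Estimate P(word appears in a random tweet) for each word in a corpus."""
    counts = Counter()
    for tweet in tweets:
        # Count each word at most once per tweet; "#tag" stays distinct from "tag".
        counts.update(set(tweet.lower().split()))
    total = len(tweets)
    return {word: n / total for word, n in counts.items()}

def pmi_scores(search_tweets, background_tweets):
    """PMI(w, v) = log P(w | v) - log P(w), for words seen in both corpora."""
    p_given_v = tweet_probabilities(search_tweets)    # P(w | v), from the search corpus
    p_background = tweet_probabilities(background_tweets)  # P(w), from the background corpus
    return {
        w: math.log(p_given_v[w]) - math.log(p_background[w])
        for w in p_given_v
        if w in p_background  # skip words with no background estimate
    }

# Toy data, invented purely for illustration.
geek_tweets = ["geek #cosplay collection", "geek collection #gadget", "geek lunch"]
background = ["lunch today", "lunch again", "my collection", "#cosplay photos"]
scores = pmi_scores(geek_tweets, background)
```

In this toy example “collection” comes out positive (it is more common in the “geek” tweets than in the background), while “lunch” comes out negative. Words missing from the background corpus are simply skipped here; a real analysis would need some smoothing or a minimum-count threshold instead.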

Results

The PMI statistic measures a kind of correlation: a positive PMI score for two words means they “keep great company,” a negative score means they tend to keep their distance, and a score close to zero means they bump into each other more or less at random.

With that in mind, here is a scatterplot of various words according to their PMI scores for both “geek” and “nerd” on different axes (ignoring words with negative PMI, and treating #hashtags as distinct):

Many people have asked for a high-res PDF of this plot, so here you go.
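If you would like to build a similar plot yourself, the selection step might look roughly like the sketch below. It assumes two hypothetical dictionaries mapping each word to its PMI score (the scores shown are invented), keeps only words with positive PMI for both terms, and puts the “nerd” score on the horizontal axis and the “geek” score on the vertical axis.

```python
def scatter_points(geek_pmi, nerd_pmi):
    """Return (word, x, y, leaning) for words with positive PMI on both axes."""
    points = []
    # Only words scored against both terms can be placed on the plot.
    for word in sorted(geek_pmi.keys() & nerd_pmi.keys()):
        g, n = geek_pmi[word], nerd_pmi[word]
        if g > 0 and n > 0:
            # x = nerd PMI (left to right), y = geek PMI (bottom to top)
            leaning = "geeky" if g > n else "nerdy" if n > g else "even"
            points.append((word, n, g, leaning))
    return points

# Hypothetical scores, not taken from the real data.
geek_pmi = {"#cosplay": 2.1, "calculus": 0.3, "chores": -0.5,
            "#computers": 1.0, "spiderman": 0.8}
nerd_pmi = {"#cosplay": 0.4, "calculus": 1.8, "chores": 0.1,
            "#computers": 1.0, "#spiderman": 0.6}
points = scatter_points(geek_pmi, nerd_pmi)
```

Note that “spiderman” and “#spiderman” are different keys, so in this toy input neither makes it onto the plot. That is one side effect of treating hashtags as distinct words.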

Moving up the vertical axis, words become more geeky (“#music” → “#gadget” → “#cosplay”), and moving left to right they become more nerdy (“education” → “grammar” → “neuroscience”). Words along the diagonal are similarly geeky and nerdy, including social (“#awkward”, “weirdo”), mainstream tech (“#computers”, “#microsoft”), and sci-fi/fantasy terms (“doctorwho,” “#thehobbit”). Words in the lower-left (“chores,” “vegetables,” “boobies”) aren’t really associated with either, while those in the upper-right (“#avengers”, “#gamer”, “#glasses”) are strongly tied to both. Orange words are more geeky than nerdy, and blue words are the opposite. Some observations:

Collections are geeky. All derivatives of the word “collect” (“collection,” “collectables”, etc.) are orange. As are “boxset” and “#original,” which imply a taste for completeness and authenticity.

The science & technology words differ. General terms (“#computers,” “#bigdata”) are on the diagonal — similarly geeky and nerdy. As you splay up toward more geeky, though, you see products, startups, brands, and more cultish technologies (“#apple”, “#linux”). As you splay down toward more nerdy you see more methodologies (“calculus”).

#Hashtags are geeky. OK, sure, hashtags are all over the place. But they do tend toward the upper-left. And since hashtags are “#trendy,” I take it to mean that geeks are into trends. (I take this one back. The average PMI score for all hashtags is 0.74 with “geek” but 0.73 with “nerd.” The difference isn’t statistically significant using a paired t-test or Wilcoxon test, or practically significant using a common-sense test.)

Hobbies: compare the more geeky pastimes (“#toys,” “#manga”) with the more nerdy ones (“chess,” “sudoku”).

Brains: the word “intelligence” may be geeky, but “education,” “intellectual,” and “#smartypants” are nerdy.

Reading: “#books” are nerdy, but “ebooks” and “ibooks” are geeky.

Pop culture vs. high culture: “#shiny” and “#trendy” are super-geeky, but (curiously) “cellist” is the nerdiest…

The list goes on. If you want to poke around yourself, download the raw PMI scores (4.2 MB) and let me know in the comments what you find. Since many people have asked: I computed PMI for all words appearing in the search tweets with “geek” and “nerd” (millions) and then manually scanned roughly 7,500 words with positive PMI scores for both. The scatterplot contains about 300 words that I hand-picked because they made sense.
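If you do download the scores, the hashtag check from the observations above is easy to try. Each hashtag has two measurements (its PMI with “geek” and its PMI with “nerd”), so a paired test looks at the per-hashtag differences. The sketch below computes only the paired t statistic using the standard library; the function name and the five PMI pairs are invented for illustration and are not taken from the real data.

```python
import math
import statistics

def paired_t_statistic(xs, ys):
    """t statistic for the mean of the paired differences xs[i] - ys[i]."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n))

# Hypothetical (geek PMI, nerd PMI) pairs, one per hashtag.
hashtag_pmi = {
    "#gadget": (1.9, 0.6),
    "#books": (0.4, 1.5),
    "#computers": (1.1, 1.0),
    "#trendy": (1.2, 0.9),
    "#science": (0.5, 1.3),
}
geek = [g for g, _ in hashtag_pmi.values()]
nerd = [n for _, n in hashtag_pmi.values()]
t = paired_t_statistic(geek, nerd)
```

With these made-up numbers the two averages are close and the t statistic is near zero, which is the same flavor of “no systematic difference” that I found for hashtags overall. To turn the statistic into a p-value you would compare it against a t distribution with n − 1 degrees of freedom; a library routine such as SciPy’s `scipy.stats.ttest_rel` does both steps at once.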

(Update: I learned that Olivia Culpo — a self-described “cellist nerd” — was crowned Miss Universe on December 20, 2012. The event was heavily tweeted smack in the middle of my data collection, so that probably explains the correlation between “cellist” and “nerd” here. It also underscores the limitations of time-sensitive data.)

Conclusion

In broad strokes, it seems to me that geeky words are more about stuff (e.g., “#stuff”), while nerdy words are more about ideas (e.g., “hypothesis”). Geeks are fans, and fans collect stuff; nerds are practitioners, and practitioners play with ideas. Of course, geeks can collect ideas and nerds play with stuff, too. Plus, they aren’t two distinct personalities as much as different aspects of personality. Generally, the data seem to affirm my thinking.

I wonder how similar the results would be if you applied this method to the Google Books Ngrams corpus, or something more general instead of a niche medium like Twitter. I also wonder what other questions might be answered with this kind of analysis (for example, my wife and I have a perennial disagreement over which word is wetter: “moist” vs. “damp.”).

Finally, when I mentioned to a friend that I was going to write up this post, she said “Well, I guess we know which one you are.” But do we really? I may be a science nerd, but I’m probably a music geek…

Update (June 25, 2013): Woah. This has gotten more attention than I ever anticipated. A few impressions. (1) Prior to writing this, I had no idea there was a “geek vs. nerd” holy war in certain corners of the Internet; fueling these flamewars was certainly not my intent. Lighten up! (2) I fear I’ll be better known for this diversion than for any of my “real” research. To be clear: this was a fun way to kill a few hours on a Saturday afternoon, not necessarily my best science. I think the writeup here is sound and self-evident, but I’m the first to acknowledge that there are better corpora, methods, and analysis techniques — which could use a grant, grad student, and/or more than an afternoon — for uncovering this all-important “Truth.” (3) For those interested in the etymologies of “geek” and “nerd,” I found this cool writeup.

Nerd all the way. Except in the case of music, where the stacks of records in my studio and the pile of synthesizers in my basement indicate I cover the board on all sides. Including vegetables. Excepting cellist.

nerdy! i like this distinction. i think of damp as a description of something I would expect to be dry, a surface condition, whereas moist is a degree of wetness and refers more to the thing itself. I personally think of moist as being wetter than damp.

Because there are two measurements for each #hashtag: the PMIs for “geek” and for “nerd.” The paired t-test gives us a more meaningful measure of whether the PMI for one is systematically higher than the other.

What’s not on that chart, and should be, is Shakespeare. I am both a Shakespeare geek and nerd (although my geekiness and nerdosity is not limited to that). I’m not alone. And there is a surprising degree of overlap between that Bill and various other Bills.

Bill is surprisingly absent from Twitter, it would seem. That, or there were too many variations of his name to show up on the chart. Bill, William, Shakespeare, misspellings of Shakespeare (so many), Billy, Will, #BillShakespeare, #WillShakespeare, #WilliamShakespeare…who knows how many others.

So… nobody’s referenced the song “Pencil-neck geeks”? In that song, there’s no difference between the two–it’s pretty old, though–70s or 80s. Back then, I don’t remember making any distinction between the two. I wonder when this started?

Is the meaning of the words consistent over time? If we took some other data that spanned many years and partitioned it in five-year slices, would we then see the meaning change over time? I imagine geek has become much more positive over the last 10 years while nerd may have begun to take on even more the meaning of just being socially awkward without requiring any of the other traits described.

I really don’t consider your analysis to be all that accurate. In fact, judging by the way you wrote the article (and correct me if I’m wrong), you already had your conclusion set before performing the experiment.

Yes, according to your data the geeks tend to focus on collections, but I find the evidence suggests that geeks are more *practitioners* of their subjects than nerds. For example, both cosplay and #cosplay appear deep in the geek spectrum. Cosplay is inarguably a form of “practicing” a fandom.

Nerds are not necessarily achievement oriented, but specifically *academic achievement* oriented. The presence of entrepreneur and #kickstarter on the geek side demonstrates a strong achievement orientation in the geek spectrum as well as a firm emphasis on the consideration of ideas.

My own conclusion from the data is that nerds tend to think and study their subjects academically while geeks tend to focus on entertainment and actively involve themselves in their subjects culturally. For example, a Dr Who geek would dress up as a blue phone box and go to conventions. Nerds tend to focus on intellectual subjects and study the trivia of their subject. For example a Dr Who nerd would know every fact and figure and obscure canon reference for the series but may never have a costume or participate in any convention or event. Both would likely cite me for not spelling out Doctor Who.

Just as a side note, had this study been done 10 years ago, the results would have been significantly different. “Geek” has become pop culture, and thus many of the traditionally “geeky” key words have migrated closer to the nerd side of the spectrum as more fashion-centric keywords have invaded the geek side.

I would like to see a broader study done with different data sets. I don’t use twitter, pinterest, or google so, using me as an example of a class, there is a large subset of geek/nerd culture that would not be included in such studies. I do use facebook (it’s how I found this study), but I don’t believe facebook has any open data sets to work with. If you could combine data sets from Google, Yahoo! and Bing that may provide a more comprehensive working set.

I conclude by suggesting that when speaking of my entertainment interests and hobbies (movies, music, games etc) I am a geek; when speaking of my intellectual interests and hobbies (languages, cultures, religions, meaningless research like this) I am a nerd. What does that make me? A “gerd”? A “neek”? Perhaps I’m a greek, if you grok.

Actually, I like this interpretation and I don’t think it’s that far off from what I had imagined… perhaps I just didn’t articulate as well. Hence why I consider myself more of a science nerd but a music geek.

I actually thought it was a fine distinction. I’m not entirely sure “practicing” fandom (via cosplay) can be considered equivalent to practicing what the cosplay/fandom actually is.

For example, I often cosplay as a Klingon [cellist] with my Klingon band, which isn’t the same thing as actually writing/performing Klingon music or analyzing Klingon music theory (activities which I also do). While the latter might be closer to being a form of [nerdy] practice, I rarely feel like the former is.

All that just boils down to the issue that, well, Klingons are a fictional race anyway so how can my Klingon music activities and practice really be, well, a form of practice? For example, Cosplaying a Klingon warrior isn’t the same thing as practicing (i.e. training) to be a Klingon warrior.

Then there’s simply the uneasy relationship I have with various fandoms. Sometimes I wonder how different the world might be if all the folks putting so much energy into a culture centered around fictional cultures would actually learn about ‘non-fictional’ cultures–especially those in their own regions. I’m often bemused that folks know more about me-qua-Klingon than me-qua-Thai.

interesting comparison of common usage of the terms. I haven’t read all the comments but maybe there is a degree of introversion or extroversion in the equation. Nerds are simply into their own thing whereas Geeks are more in your face about their obsessions?

A geek cares too much about things no one else knows about, a nerd knows too much about things no one cares about.

For instance, a grammar nerd would rewrite the above sentence so that it doesn’t end in a preposition. A grammar geek would go on a tirade about the fact it ends in a preposition. It is possible to be both.

Thank you so much for accidentally helping me with my MA thesis!
A friend pointed out this article to me for fun reading. Then I saw your link to your “real research” and I thought, well ok, maybe a nerd has something cool to contribute to my extremely nerdy MA thesis in linguistics. I’m writing about pragmatic speech barriers in the communication of humans and machines with a special focus on AI. And voilà, you’ve written several really cool and helpful articles about machine learning! So thank you, thank you, thank you :-)

Nice and interesting chart, however you should delete autism and autistic. Use of the words geek or nerd in reference to the medical terms autism and autistic may increase the undeserved stigma patients of autism endure.

The presence of these words indicates an association, whether good or bad, between autism and geek/nerd-ism. There’s actually evidence to support a medical association between them (i.e. geeks/nerds tend to fall into the autism spectrum more than the general public).

But whether or not that is the case, to delete the terms would be tampering with his research results, which is counter-productive to science. If you don’t want autistic persons to have a stigma, then don’t stigmatize them.

Yes, negative PMI is possible: that implies that the occurrence of “geek” (or “nerd”) in a tweet decreases the chance of seeing the other word. I link to a spreadsheet in the article that has a bunch of the raw PMI scores, including negative ones.

perhaps you should mention “data noise”, or limitations or something caused by the short 1 month timeframe. My understanding is that Olivia Culpo (Miss Universe Dec 2012) is a self-described “Cellist Nerd”.

Nice. I have always taken nerd to mean someone who had shortcomings in the areas of social interaction and Geek to be someone who is intensely interested in some subject. So frequently someone is such a geek that he/she has little time for socializing and thus is a nerd too. But then you get someone who is very socially apt but also “into” something. And of course you have people who are just not very social.

your color code should represent the continuum too. It only discriminates between geek and nerd but it doesn’t represent the intensity. According to colors, for instance “vegetables” is as nerdy as “cellist”. You should implement gradients

I found that funny, too. I haven’t had the chance to look for the tweets responsible to see the context, but my guess is that there was a spike of tweets in December about “nerdy schoolgirl” type fashion, which tipped it well above the baseline probability of “miniskirt” in the 2.6M corpus (similar to “cellist” as explained above). For instance, nerd mini skirts on Etsy.

A post on Gizmodo brought me to this post which sent me to your google scholar page which led me to your home page which led me to Duolingo which I never would have found without your post. What a wonderful thing the internet is. A truly wonderful thing.

Your statistics show correlation, not causality as no variable was manipulated a priori… (i.e. phrases such as “Collections ARE geeky” and “Academic fields ARE nerdy” cannot be stated, only that they are correlated…) Thoroughly interesting read though! 🙂 #nerdforlife #geekforever

Good point. So, if you have any ideas about how to do a controlled study that manipulates hundreds of thousands of word variables with real human subjects, I’m sure the entire field of distributional semantics would love to talk to you about it. 🙂

As someone who could justly be accused of being a nerd, a geek, AND a sports fan… I’m thinking of the frequent reference to the practitioners of advanced analytics (especially in baseball) as “sports/stats/numbers geeks” or “[…] nerds,” pretty much indiscriminately as far as I can tell. (Though I haven’t done anything close to this kind of study, I just read too many sports blogs when I should be doing my more conventionally nerdy work.)

Would it make sense to try it without distinguishing the hashtags, i.e. to include both spiderman and #spiderman as the same thing? I don’t understand this technique well enough to know if it would matter…and maybe these things are sufficiently different that your way is better justified.

BTW, this is Sarah Hamersma, who you may or may not remember from Madison. My husband Rob somehow came upon this. You are internet famous!!!

Hi Sarah… of course I remember you guys! Collapsing hashtags is a fine thing to do… I think I mainly wanted to see if there were subtle distinctions between a word and its hashtag, or to see if hashtags in general were more associated with one over the other. In the end, I think separating them probably just introduced noise into the analysis. 😦

When I saw the Simpsons characters I wondered what I am, since I’m like the guy at the left, dressed like the one at the right. When I saw the graphic, reality smacked me in the face. I cover almost all of it! LOL.

Fascinating stuff! A very interesting read – I always find the geek/nerd divide a very interesting area, and I love the idea that the nerdy side contains more abstract terms than the geeky side which is full of ‘stuff’ as you put it. Well done on this.

I always thought of myself as a geek (sort of a generalist though…), but your detailed evaluation paints me as walking the line between geek and nerd. I always wondered why even my geeky friends think I’m so weird. Thanks for the clarification!

I keep saying that I want a neon sign ‘Geek’ flag for my office once I become a professor, as I fly that flag proudly. However the fact that I am an academic (and got excited over this data) puts me squarely in the ‘Nerd’ camp. As usual, I find myself hard to codify!