He Counts Your Words (Even Those Pronouns)

Tuesday

Oct 14, 2008 at 12:01 AM; Oct 15, 2008 at 5:23 AM

James W. Pennebaker looks at every single word people use — even the tiny ones — and is leading a resurgent interest in text analysis.

James W. Pennebaker’s interest in word counting began more than 20 years ago, when he did several studies suggesting that people who talked about traumatic experiences tended to be physically healthier than those who kept such experiences secret. He wondered how much could be learned by looking at every single word people used — even the tiny ones, the I’s and you’s, a’s and the’s.

That led Dr. Pennebaker, a professor of psychology at the University of Texas, down a winding path that has taken him from Beatles lyrics (John Lennon’s songs have more “negative emotion” words than Paul McCartney’s) all the way to terrorist communications. By counting the different kinds of words a person says, he is breaking new linguistic ground and leading a resurgent interest in text analysis.

Take Dr. Pennebaker’s recent study of Al Qaeda communications — videotapes, interviews, letters. At the request of the F.B.I., he tallied the number of words in various categories — pronouns, articles and adjectives, among others.

He found, for example, that Osama bin Laden’s use of first-person pronouns (I, me, my, mine) remained fairly constant over several years. By contrast, his second-in-command, Ayman al-Zawahri, used such words more and more often.

“This dramatic increase suggests greater insecurity, feelings of threat, and perhaps a shift in his relationship with bin Laden,” Dr. Pennebaker wrote in his report, which was published in The Content Analysis Reader (Sage Publications, July 2008).

Kimberly A. Neuendorf, a professor of communications at Cleveland State University who has extensively studied content analysis, agreed with that assessment. Mr. Zawahri, she said, “is clearly repositioning himself to provide a singular platform for his opinion” and “reaffirming his status as an important individual in the dynamic.”

Because it is hard for the human brain to count and compare all the I’s, a’s and the’s in a sample of speech or writing, Dr. Pennebaker had to invent a software program to do it. The program, Linguistic Inquiry and Word Count (LIWC, pronounced luke), contains a vast dictionary, with each word assigned to one or more categories.

There are social words (talk, they), biological words (cheek, hands, spit), “insight” words (think, know, consider) and dozens of other groupings. LIWC compares a text sample to its dictionary and, within seconds, provides a readout of how many words appear in each category.
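The dictionary-lookup idea described here can be illustrated with a short sketch. It is not LIWC itself: the tiny dictionary below is hypothetical, built only from the example words quoted in this article, and the function names (tokenize, count_categories, category_rates) are made up for illustration. It assumes plain English text and a simple letters-and-apostrophes tokenizer.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary. The example words come from the article;
# the real LIWC dictionary is far larger. A word may sit in more than
# one category ("they" is both a social word and a pronoun here).
CATEGORY_WORDS = {
    "first_person": {"i", "me", "my", "mine"},
    "pronoun": {"i", "me", "my", "mine", "he", "she", "they"},
    "article": {"a", "the"},
    "social": {"talk", "they"},
    "biological": {"cheek", "hands", "spit"},
    "insight": {"think", "know", "consider"},
    "causal": {"because", "cause", "effect"},
}


def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count_categories(text):
    """Return (total word count, Counter of hits per category)."""
    tokens = tokenize(text)
    counts = Counter()
    for token in tokens:
        for category, words in CATEGORY_WORDS.items():
            if token in words:
                counts[category] += 1
    return len(tokens), counts


def category_rates(text):
    """Return each category's share of all words, as a percentage."""
    total, counts = count_categories(text)
    if total == 0:
        return {category: 0.0 for category in CATEGORY_WORDS}
    return {
        category: 100.0 * counts[category] / total
        for category in CATEGORY_WORDS
    }


if __name__ == "__main__":
    sample = "I think they talk because the cause is mine."
    for category, rate in category_rates(sample).items():
        print(f"{category}: {rate:.1f}% of words")
```

Reporting percentages rather than raw tallies is a design choice in this sketch: it lets a short letter and a long interview be compared on the same scale, which matters for comparisons like the pronoun trends described above.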

To test-drive the program, Dr. Pennebaker, a pioneer in the field of therapeutic writing, asked a group of people recovering from serious illness or other trauma to engage in a series of writing exercises. The word tallies showed that those whose health was improving tended to decrease their use of first-person pronouns through the course of the study.

Health improvements were also seen among people whose use of causal words — because, cause, effect — increased. Simply ruminating about an experience without trying to understand the causes is less likely to lead to psychological growth, he explained; the subjects who used causal words “were changing the way they were thinking about things.”

Dr. Pennebaker, 58, has conducted numerous studies since then, all of them demonstrating that it’s not just what we say that matters but how we say it. Where traditional linguistics “is really more interested in context, how sentences are put together and what a meaningful phrase is,” he said, “our approach is simply counting words.”

In study after study, the articles and pronouns, which text analysts often call “junk words,” have proved crucial.

For example, Dr. Pennebaker has found that men tend to use more articles (a, the) and women tend to use more pronouns (he, she, they). The difference, he says, may suggest that men are more prone to concrete thinking and women are more likely to see things from other perspectives.

Jeffrey T. Hancock, an associate professor of communication at Cornell, uses word counting to study language and deception, particularly on the Internet.

Liars, he says, use more “negative emotion” words (hurt, ugly, nasty) and fewer first-person singulars. “These very simple dimensions have emerged again and again,” he said, “despite the fact that there were 40 years of research before this.”

Dr. Pennebaker says that because speech patterns are akin to a personal signature, his software might be used to identify authors of anonymous blogs and e-mail messages, and as supporting evidence in legal testimony. But he acknowledges that it cannot be definitive; too much depends on probability.

Still, the technique is drawing attention from a variety of sectors. Dr. Pennebaker has received a grant from the Army Research Institute to study the language of social dynamics, particularly how leaders use language. Joseph Psotka, a research psychologist at the institute, said that over time, this kind of study “could be very helpful for training and leadership development, but precisely how we don’t know yet.”

Dr. Pennebaker’s program has been translated into several languages, with an Arabic version in the works; Dr. Pennebaker notes that his Qaeda analysis was constrained by its reliance on English translations.

“Function words vary between one language and another and reveal a lot about another culture,” he said.

Dr. Psotka said counting and categorizing the words used by a foreign speaker could provide clues about “the subtle attitudes, not just the meaning of the words — to get a sense of whether or not negotiation or discussion is going smoothly.”

Dr. Pennebaker has also turned his word-counting machinery toward the presidential campaign (at wordwatchers.wordpress.com), and he likes to look at age-old questions like whether Shakespeare had a co-playwright, who wrote the Federalist Papers and even whether a couple will stay together.

“The more similar they are in terms of language,” Dr. Pennebaker said, “the more likely they are to be together several months later.”
