
While people keep banging on about Chomsky as being the be all and end all of linguistics (I’m looking at you philosophers of language), there have been many linguists who have had a much more substantial impact on how we actually think about language in a way that matters. In my post on why Chomsky is not really a linguist at all I listed a few.

Sadly, one of these linguists died yesterday. It was Charles J. Fillmore, who was a towering figure among linguists without writing a single book. In my mind, he changed the face of linguistics three times with just three articles (one of them co-authored). Obviously, he wrote many more but compared to his massive impact, his output was relatively modest. His ideas have been with me all through my life as a linguist and, on reflection, they form the foundation of what I know language to be. Therefore, this is not so much an obituary (for which I’m hardly the most qualified person out there) as a manifesto for a linguistics of a truly human language.

The case for Fillmore

The first article, more of a slim monograph at 80-odd pages, was Case for Case (which, for some reason, I first read in Russian translation). Published in 1968, it was one of the first efforts to find deeper functional connections in generative grammar (following on his earlier work with transformations). If you’ve studied Chomskyan Government and Binding, this is where thematic roles essentially come from. I only started studying linguistics in 1991, by which time Case for Case was already considered a classic, particularly in Prague where function was so important. But even after all those years, it is still worth reading for any minimalist out there. Unlike so many in today’s divided world, Fillmore engaged with the whole universe of linguistics, citing Halliday, Tesnière, Jakobson, Whorf, Jespersen, and others while giving an excellent overview of the treatment of case by different theories and theorists. But the engagement went even deeper: the whole notion of ‘case’ as one “base component of the grammar of every language” brought so much traditional grammar back into contact with a linguistics that was speeding away from all that came before at a rate of knots.

From today’s perspective, its emphasis on the deep and surface structures, as well as its relatively impoverished semantics, may seem a bit dated, but it represents an engagement with language used to express real meaning. The thinking that went into deep cases transformed into what has become known as Frame Semantics (“I thought of each case frame as characterizing a small abstract ‘scene’ or ‘situation’, so that to understand the semantic structure of the verb it was necessary to understand the properties of such schematized scenes” [1982]) which is where things really get interesting.

Fillmore in the frame

When I think about frame semantics, I always go to his 1982 article Frame Semantics published in the charmingly named conference proceedings ‘Linguistics in the Morning Calm’, but it had its first outing in 1976. George Lakoff used it as one of the key inspirations for his idealized cognitive models in Women, Fire, and Dangerous Things, which is where this site can trace its roots. As I have said before, I essentially think about metaphors as a special kind of frame.

In it, he says:

By the term ‘frame’ I have in mind any system of concepts related in such a way that to understand any one of them you have to understand the whole structure in which it fits; when one of the things in such a structure is introduced into a text, or into a conversation, all of the others are automatically made available. I intend the word ‘frame’ as used here to be a general cover term for the set of concepts variously known, in the literature on natural language understanding, as ‘schema’, ‘script’, ‘scenario’, ‘ideational scaffolding’, ‘cognitive model’, or ‘folk theory’.

It is a bit of a mouthful but it captures in a paragraph the absolute fundamentals of the semantics of human language as opposed to projecting the rules of formal logic and truth conditions onto an impoverished version of language that all the generative-inspired approaches try to do. Also, it brings together many other concepts from different fields of scholarship. Last year I presented a paper on the power of the concept of frame where I found even more terms that have a close affinity to it which only underscores the far reaching consequences of Fillmore’s insight.

As I was looking for some more quotes from that article, I realized that I’d have to pretty much cut and paste in the whole of it. Almost every sentence there is pure gold. Rereading it now after many, many years, it’s becoming clear how many things from it I’ve internalized (and, frankly, how many ideas I reinvented having forgotten they had been there).

Constructing Fillmore

About the same time, and merging the two earlier insights, Fillmore started working on the principles that have come to be known as construction grammar. Although by then the ideas were some years old, I always think of his 1988 article with Paul Kay and Mary Catherine O’Connor as a proper construction grammar manifesto. In it they say:

The overarching claim is that the proper units of a grammar are more similar to the notion of construction in traditional and pedagogical grammars than to that of rule in most versions of generative grammar.

Constructions, according to Fillmore, have these properties:

They are not limited to the constituents of a single syntactic tree. Meaning, they span what has been considered as the building blocks of language.

They specify at the same time syntactic, lexical, semantic and pragmatic information.

Lexical items can also be viewed as constructions (this is absolutely earth shattering and I don’t think linguistics has come to grips with it, yet).

They are idiomatic. That is, their meaning is not built up from their constituent parts.

Although Lakoff’s study of ‘there constructions’ in Women, Fire, and Dangerous Things came out a year earlier (and is still essential reading), I prefer Fillmore as an introduction to the subject (if only because I never had to translate it).

The beauty of construction grammar (just as the beauty of frame semantics) is that it can bridge much of the modern thinking about language with grammatical insights and intuitions of generations of researchers from across many schools of thought. But I am genuinely inspired by its commitment to language as a whole, expressed in the 1999 article by Fillmore and Kay:

To adopt a constructional approach is to undertake a commitment in principle to account for the entirety of each language. This means that the relatively general patterns of the language, such as the one licensing the ordering of a finite auxiliary verb before its subject in English as illustrated in 1, and the more idiomatic patterns, such as those exemplified in 2, stand on an equal footing as data for which the grammar must provide an account.

(1) a. What have you done? b. Never will I leave you. c. So will she. d. Long may you prosper! e. Had I known, . . . f. Am I tired! g. . . . as were the others h. Thus did the hen reward Beecher.

(2) a. by and large b. [to] have a field day c. [to] have to hand it to [someone] d. (*A/*The) Fool that I was, . . . e. in x’s own right

Given such a commitment, the construction grammarian is required to develop an explicit system of representation, capable of encoding economically and without loss of generalization all the constructions (or patterns) of the language, from the most idiomatic to the most general.

Notice that they don’t just say ‘language’ but ‘each language’. Both of those articles give ample examples of how constructions work and what they do and I commend them to your linguistic enjoyment.

Ultimately, I do not subscribe to the exact version of construction grammar that Fillmore and Kay propose, agreeing with William Croft that it is still too beholden to the formalist tradition of the generative era, but there is something to learn from on every page of everything Fillmore wrote.

Once more with meaning: the FrameNet years

Both frame semantics and construction grammar impacted Fillmore’s work in lexicography with Sue Atkins and culminated in FrameNet, a machine-readable frame semantic dictionary providing a model for a semantic module to a construction grammar. To make the story complete, we can even see FrameNet as a culmination of the research project begun in Case for Case, which was the development of a “valence dictionary” (as he summarized it in 1982). While FrameNet is much more than that and has very much abandoned the claim to universal deep structures, it can be seen as accomplishing the mission of a language with meaning that Fillmore set out on in the 1960s.

Remembering Fillmore

I only met Fillmore once when he came to lecture at a summer school in Prague almost twenty years ago. I enjoyed his lectures but was really too star struck to take advantage of the opportunity. But I saw enough of him to understand why he is remembered with deep affection and admiration by all of his colleagues and students whose ranks form a veritable who’s who of linguists to pay attention to.

In my earlier post, I compared him in stature and importance to Roman Jakobson (even if Jakobson’s crazy voluminous output across four languages dwarfs Fillmore’s – and almost everyone else’s). Fillmore was more than a linguist’s linguist, he was a linguist who mattered (and matters) to anyone who wanted (and wants) to understand how language works beyond a few minimalist soundbites. Sadly it is possible to meet graduates with linguistics degrees who never heard of Jakobson or Fillmore. While it’s almost impossible to meet someone who doesn’t know anything about language but has heard of Chomsky. But I have no doubt that in the decades of language scholarship to come, it will be Fillmore and his ideas that will be the foundation upon which the edifice of linguistics will rest. May he rest in peace.

Post Script

I am far from being an expert on Fillmore’s work and life. This post reflects my personal perspective and lessons I’ve learned rather than a comprehensive or objective reference work. I may have been rather free with the narrative arc of his work. Please be free with corrections and clarifications. Language Log reposted a more complete profile of his life.

Note: This was intended to be a brief note. Instead it developed into a monster post that took me two weeks of stolen moments to write. It’s very light on non-blog references but they exist. Nevertheless, it is still easy to find a number of oversimplifications, conflations, and other imperfections below. The general thrust of the argument however remains.

How Far Can You Trust a Neuroscientist?


A couple of days ago I watched a TED talk called The Linguistic Genius of Babies by Patricia Kuhl. I had been putting it off, because I suspected I wouldn’t like it, but I was still disappointed at how hidebound it was. It conflated a number of really unconnected things and then tried to sway the audience to its point of view with pretty pictures of cute infants in brain scanners. But all it was is a hodgepodge of half-implied claims that is incredibly similar to some of the more outlandish claims made by behaviorists so many years ago. Kuhl concluded that brain research is the next frontier of understanding learning. But she did not give a single credible example of how this could be. She started with a rhetorical trick: she mentioned an at-risk language with a picture of a mother holding an infant facing towards her. And then she said (with annoying condescension) that this mother and the other tribe members know something we do not:

What this mother — and the 800 people who speak Koro in the world — understand that, to preserve this language, they need to speak it to the babies.

This is garbage. Languages do not die because there’s nobody there to speak it to the babies (until the very end, of course) but because there’s nobody of socioeconomic or symbolic prestige children and young adults can speak the language to. Languages don’t die because people can’t learn them, they die because they have no reason (other than nostalgia) to learn them or have a reason not to learn them. Given a strong enough reason they would learn a dying language even if they started at sixteen. They just almost never are given the reason. Why Kuhl felt she did not need to consult the literature on language death, I don’t know.

Patricia Kuhl has spent the last 20 years studying pretty much one thing: acoustic discrimination in infants (http://ilabs.washington.edu/kuhl/research.html). Her research provided support for something that had been already known (or suspected), namely that young babies can discriminate between sounds that adults cannot (given similar stimuli such as the ones one might find in the foreign language classroom). She calls this the “linguistic genius of babies” and she’s wrong:

Babies and children are geniuses until they turn seven, and then there’s a systematic decline.

First, the decline (if there is such a thing) is mostly limited to acoustic processing and even then it’s not clear that the brain is the thing that causes it. Second, being able to discriminate (by moving their head) between sounds in both English and Mandarin at age 6 months is not a sign of genius. It’s a sign of the baby not being able to differentiate between language and sound. Or in other words, the babies are still pretty dumb. But it doesn’t mean they can’t learn a similar distinction at a later age – like four or seven or twelve. They do. They just probably do it in a different way than a 6-month-old would. Third, in the overall scheme of things, acoustic discrimination at the individual phoneme level (which is what Kuhl is studying) is only a small part of learning a language and it certainly does NOT stop at 7 months or even 7 years of age. Even children who start learning a second language at the age of 6 achieve a native-like phonemic competence. And even many adults do. They seem not to perform as well on certain fairly specialized acoustic tests but functionally, they can be as good as native speakers. And it’s furthermore not clear that accent deficiencies are due to the lack of some sort of brain plasticity. Fourth, language learning and knowledge is not a binary thing. Even people who only know one language know it to a certain degree. They can be lexically, semantically and syntactically quite challenged when exposed to a sub-code of their language they have little to no contact with. So I’m not at all sure what Kuhl was referring to. François Grosjean (an eminent researcher in the field) has been discussing all this on his Life as a Bilingual blog (and in books, etc.). To have any credibility, Kuhl must address this head on:

There is no upper age limit for acquiring a new language and then continuing one’s life with two or more languages. Nor is there any limit in the fluency that one can attain in the new language with the exception of pronunciation skills.

Instead she just falls back on old prejudices. She simply has absolutely nothing to support this:

We think by studying how the sounds are learned, we’ll have a model for the rest of language, and perhaps for critical periods that may exist in childhood for social, emotional and cognitive development.

A paragraph like this may get her some extra funding but I don’t see any other justification for it. Actually, I find it quite puzzling that a serious scholar would even propose anything like this today. We already know there is no critical period for social development. Well, we don’t really know what social development is, but there’s no critical brain period to what there is. We get socialized to new collective environments throughout our lives.

But there’s no reason to suppose that learning to interact in a new environment is anything like learning to discriminate between sounds. There are some areas of language linked to perception where that may partly be the case (such as discriminating shapes, movements, colors, etc.) but hardly things like morphology or syntax, where much more complexity is involved. But this argument cuts both ways. Let’s say a lot of language learning was like sound development. We know most of it continues throughout life (syntax, morphology, lexicon) and it doesn’t even start at 6 months (unless you’re a crazy Chomskyan who believes in some sort of magical parameter setting). So if sound development was like that, maybe it has nothing to do with the brain in the way Kuhl imagines, although she’s so vague that she could always claim that that’s what she’d had in mind. This is what Kuhl thinks of as additional information:

We’re seeing the baby brain. As the baby hears a word in her language the auditory areas light up, and then subsequently areas surrounding it that we think are related to coherence, getting the brain coordinated with its different areas, and causality, one brain area causing another to activate.

So what? We know that that’s what was going to happen. Some parts of the brain were going to light up as they always do. What does that mean? I don’t know. But I also know that Patricia Kuhl and her colleagues don’t know either (at least not in the way she pretends). We speak a language, we learn a language and at the same time we have a brain and things happen in the brain. There are neurons and areas that seem to be affected by impact (but not always and not always in exactly the same way). Of course, this is an undue simplification. Neuroscientists know a huge amount about the brain. Just not how it links to language in a way that would say much about the language that we don’t already know. Kuhl’s next implied claim is a good example of how partial knowledge in one area may not at all extend to knowledge in another area.

What you see here is the audio result — no learning whatsoever — and the video result — no learning whatsoever. It takes a human being for babies to take their statistics. The social brain is controlling when the babies are taking their statistics.

In other words, when the children were exposed to audio or video as opposed to a live person, no effect was shown. At 6 months of age! As is Kuhl’s wont, she only hints at the implications, but over at the Royal Society’s blog comments, Eric R. Kandel has spelled it out:

I’m very much taken with Patricia Kuhl’s finding in the acquisition of a second language by infants that the physical presence of a teacher makes enormous difference when compared to video presence. We all know from personal experience how important specific teachers have been. Is it absurd to think that we might also develop methodologies that would bring out people’s potential for interacting empathically with students so that we can have a way of selecting for teachers, particularly for certain subjects and certain types of student? Neuroscience: Implications for Education and Lifelong Learning.

But this could very well be absurd! First, Kuhl’s experiments were not about second language acquisition but sensitivity to sounds in other languages. Second, there’s no evidence that the same thing Kuhl discovered for infants holds for adults or even three-year-olds. A six-month-old baby hasn’t learned yet that the pictures and sounds coming from the machine represent the real world. But most four-year-olds have. I don’t know of any research but there is plenty of anecdotal evidence. I have personally met several people highly competent in a second language who claimed they learned it by watching TV at a young age. A significant chunk of my own competence in English comes from listening to radio, audio books and watching TV drama. How much of our first language competence comes from reading books and watching TV? That’s not to say that personal interaction is not important – after all we need to learn enough to understand what the 2D images on the screen represent. But how much do we need to learn? Neither Kuhl nor Kandel has the answer but both are ready (at least by implication) to shape policy regarding language learning. In the last few years, several reports raised questions about some overreaching by neuroscience (both in methods and assumptions about their validity) but even perfectly good neuroscience can be bad scholarship in extending its claims far beyond what the evidence can support.

The Isomorphism Fallacy

The fundamental problem underlying the overreach of basic neuroscience research is the fallacy of isomorphism. This fallacy presumes that the same structures we see in language, behavior, and society must have structural counterparts in the brain. So there’s a bit of the brain that deals with nouns. Another bit that deals with being sorry. Possibly another one that deals with voting Republican (as Woody Allen proved in “Everyone Says I Love You”). But at the moment the evidence for this is extremely weak, at best. And there is no intrinsic need for a structural correspondence to exist. Sidney Lamb came up with a wonderful analogy that I’m still working my way through. He says (recalling an old ‘Aggie’ joke) that trying to figure out where the bits we know as language structure are in the brain is like trying to work out how to fit the roll that comes out of a tube of toothpaste back into the container. This is obviously a fool’s errand. There’s nothing in the toothpaste container that in any way resembles the colorful and tubular object we get when we squeeze the paste container. We get that through an interaction of the substance, the container, external force, and the shape of the opening. It seems to me entirely plausible that the link between language and the brain is much more like that between the paste, the container and their environment than like that between a bunch of objects and a box. The structures that come out are the result of things we don’t quite understand happening in the brain interacting with its environment. (I’m not saying that that’s how it is, just that it’s plausible.) The other thing that lends it credence is the fact that things like nouns or fluency are social constructs with fuzzy boundaries, not hard discrete objects, so actually localizing them in the brain would be a bit of a surprise. Not that it can’t be done, but the burden of evidence for making this a credible finding is substantial.

Now, I think that the same problem applies to looking for isomorphism the other way. Lamb himself tries to look at grammar by looking for connections resembling the behavior of activating neurons. I don’t see this going anywhere. George Lakoff (who influenced me more than any other linguist in the world) seems to think that a Neural Theory of Language is the next step in the development of linguistics. At one point he and many others thought that mirror neurons say something about language but now that seems to have been brought into question. But why do we need mirror neurons when we already know a lot about the imitative behaviors they’re supposed to facilitate? Perhaps as a treatment and diagnostic protocol for pathologies but is this really more than story-telling? Jerome Feldman described NTL in his book “From Molecule to Metaphor” but his main contribution seems to me to lie in showing how complex language phenomena can be modelled with brain-like neural networks, not in saying anything new about these phenomena (see here for an even harsher treatment). The same goes for Embodied Construction Grammar. I entirely share ECG’s linguistic assumptions but the problem is that it tries to link its descriptive apparatus directly to the formalisms necessary for modeling. This proved to be a disaster for the generative project that projected its formalisms into language with an imperfect fit and now spends most of its time refining those formalisms rather than studying language.

So far I don’t see any advantage in linking language to the brain in either the way Kuhl et al. or Feldman et al. try to do it (again with the possible exception of pathologies). In his recent paper on compositionality, Feldman describes research that shows that spatial areas are activated in conjunction with spatial terms and that sentence processing time increases as the sentence gets removed from “natural spatial orientation”. But brain imaging at best confirms what we already knew. How useful is that confirmatory knowledge? I would argue not very. In fact there is a danger that we will start thinking of brain imaging as a necessary confirmation of linguistic theory. Feldman takes a step in this dangerous direction when he says that with the advent of new techniques of neuroscience we can finally study language “scientifically”. [Shudder.]

We know there’s a connection between language and the brain (more systematic than with language and the foot, for instance) but so far nobody’s shown convincingly that we can explain much about language by looking at the brain (or vice versa). Language is best studied as its own incredibly multifaceted beast and so is the brain. We need to know a lot more about language and about the brain before we can start projecting one into the other.

And at the moment, brain science is the junior partner here. We know a lot about language and can find out more without looking for explanations in the brain. It seems as foolish as trying to illuminate language by looking inside a computer (as Chomsky’s followers keep doing). The same question that I’m asking for language was asked about cognitive processes (a closely related thing) by William Uttal in The New Phrenology, who asks “whether psychological processes can be defined and isolated in a way that permits them to be associated with particular brain regions” and warns against a “neuroreductionist wild goose chase” – and how else can we characterize Kuhl’s performance – lest we fall “victim to what may be a ‘neo-phrenological’ fad”. Michael Shermer voiced a similar concern in Scientific American:

The brain is not random kludge, of course, so the search for neural networks associated with psychological concepts is a worthy one, as long as we do not succumb to the siren song of phrenology.

What does a “siren song of phrenology” sound like? I imagine it would sound pretty much like this quote by Kuhl:

We are embarking on a grand and golden age of knowledge about child’s brain development. We’re going to be able to see a child’s brain as they experience an emotion, as they learn to speak and read, as they solve a math problem, as they have an idea. And we’re going to be able to invent brain-based interventions for children who have difficulty learning.

I have no doubt that there are some learning difficulties for which a ‘brain-based intervention’ (whatever that is) may be effective. But these are such a small part of the universe of learning difficulties that they hardly warrant a bombastic claim like the one above. I could find nothing in Kuhl’s narrow research that would support this assertion. Learning and language are complex psycho-social phenomena that are unlikely to have straightforward counterparts in brain activations such as can be seen by even the most advanced modern neuroimaging technology. There may well be some straightforward pathologies that can be identified and have some sort of treatment aimed at them. The problem is that brain pathologies are not necessarily opposites of a typically functioning brain (a fallacy that has long plagued interpretation of the evidence from aphasias) – it is, as brain plasticity would suggest, just as likely that at least some brain pathologies create new qualities rather than simply flipping an on/off switch on existing qualities. Plus there is the historical tendency of the self-styled hard sciences to horn in on areas where established disciplines have accumulated lots of knowledge, ignore the knowledge, declare a reductionist victory, fail and not admit failure.

For the foreseeable future, the brain remains a really poor metaphor for language and other social constructs. We are perhaps predestined to finding similarities in anything we look at but researchers ought to have learned by now to be cautious about them. Today’s neuroscientists should be very careful that they don’t look as foolish to future generations as phrenologists and skull measurers look to us now.

In praise of non-reductionist neuroscience

Let me reiterate, I have nothing against brain research. The more of it, the better! But it needs to be much more honest about its achievements and limitations (as much as it can given the politics of research funding). Saying the sort of things Patricia Kuhl does with incredibly flimsy evidence and complete disregard for other disciplines is good for the funding but awful for actually obtaining good results. (Note: The brevity of the TED format is not an excuse in this case.)

A much more promising overview of applied neuroscience is a report by the Royal Society on education and the brain. It is much more realistic about the state of neurocognitive research, and its authors admit at the outset: “There is enormous variation between individuals, and brain-behaviour relationships are complex.”

The report authors go on to enumerate the things they feel we can claim as knowledge about the brain:

The brain’s plasticity

The brain’s response to reward

The brain’s self-regulatory processes

Brain-external factors of cognitive development

Individual differences in learning as connected to the brain and genome

Neuroscience connection to adaptive learning technology

So this is a fairly modest list, made even more modest by the formulations of the actual knowledge. I could only find a handful of statements made to support the general claims that do not contain a hedge: “research suggests”, “may mean”, “appears to be”, “seems to be”, “probably”. This modesty in research interpretation does not always make its way to the report’s policy suggestions (mainly suggestions 1 and 2). Despite this, I think anybody who thinks Patricia Kuhl’s claims are interesting would do well to read this report and pay careful attention to the actual findings described there.

Another possible problem for those making wide-reaching conclusions is the relative newness of the research on which these recommendations are based. I had a brief look at the citations in the report and only about half are actually related to primary brain research. Of those exactly half were published in 2009 (8) and 2010 (20) and only two in the 1990s. This is in contrast to language acquisition and multilingualism research which can point to decades of consistently replicable findings and relatively stable and reliable methods. We need to be afraid, very afraid of sexy new findings when they relate to what is perceived as the “nature” of humans. At this point, as a linguist looking at neuroscience (and the history of the promise of neuroscience), my attitude is skeptical. I want to see 10 years of independent replication and stable techniques before I will consider basing my descriptions of language and linguistic behavior on neuroimaging. There’s just too much of ‘now we can see stuff in the brain we couldn’t see before, so this new version of what we think the brain is doing is definitely what it’s doing’. Plus the assumption that exponential growth in precision brain mapping will result in the same growth in brain function identification is far from being a sure thing (cf. genome decoding). Exponential growth in computer speed only led to incremental increases in computer usability. And the next logical step in the once skyrocketing development of automobiles was not flying cars but pretty much just the same slightly better cars (even though they look completely different under the hood).

The sort of knowledge needed to learn and do good neuroscience is staggeringly awesome. The scientists who study the brain deserve all the personal accolades they get. But the actual knowledge they generate about issues relating to language and other social constructs is much less overwhelming. Even a tiny clinical advance, such as helping a relatively small number of people to communicate who otherwise wouldn’t be able to express themselves, makes this worthwhile. But we must not confuse clinical advances with theoretical advances and must be very cautious when applying these to policy fields that are related more by similarity than a direct causal connection.

I have a number of pet peeves about how people use language. I am genuinely annoyed by the use of apostrophes before the plural of numerals or acronyms like 50’s or ABC’s. But because I understand how language works, I keep my mouth shut. The usage has obviously moved on. I don’t think ABC’s is wrong or confusing, I just don’t like the way it looks. But I don’t like a lot of things that there’s nothing wrong with. I get over it.

Recently I came across a couple of blog posts pontificating on the misuse or overuse of the word literally. And as usual they confuse personal dislike with incorrect or confusing usage. So let’s set the record straight! No matter what some dictionaries or people who should know better say, the primary function of the word “literally” in the English language is to intensify the meaning of figurative, potentially figurative or even non-figurative expressions. This is not some colloquial appendage to the meaning of the word. That’s how it is used in standard English today. Written, edited and published English! Frequently, it is used to intensify expressions that are for all intents and purposes non-figurative or where the figurative nature of the expression can be hypothesized:

1. “Bytches is literally a record of life in a nineties urban American community.” [BNC]

2. “it’s a a horn then bassoon solo, and it it’s a most worrying opening for er a because it is. it is literally a solo, er unaccompanied” [BNC]

3. “The evidence that the continents have drifted, that South America did indeed break away from Africa for instance, is now literally overwhelming” [BNC, Richard Dawkins]

The TIME magazine corpus can put paid to the nonsense about “literally” as an intensifier being new or colloquial. The use of the word in all functions does show an increase from the 40s, peak in the 1980s and 2000s returning to the level of 1950s. I didn’t do the counting (plus it’s often hard to decide) but at a glance the proportion of intensifier uses is if anything slightly higher in the 1920s than in the 2000s:

4. This is almost literally a scheme for robbing Peter to pay Paul. [TIME, 1925]

5. He literally dropped the book which he was reading and seized his sabre. [TIME, 1926]

But there are other things that indicate that the intensifier use of literally is what is represented in people’s linguistic knowledge. Namely collocations. Some of the most common adverbs preceding literally (first 2 words in COCA) are graded: 1. quite (558), 2. almost (119), 5. so (67), 7. too (54), 9. sometimes (42), 12. more, 15. very, 16. often.

7. Squeezed almost literally between a rock and a hard place, the artery burst. [COCA, SportsIll, 2007]

Another common adverbial collocate is “just” (number 4) often used to support the intensification:

8. they eventually went missing almost just literally a couple of minutes apart from one another [COCA, CNN, 2004]

Other frequent collocates are non-gradual: “up”, “down”, “out”, “now” but their usage seems coincidental – simply to be attributed to their generally high frequency in English.
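For anyone who wants to try this sort of count on their own texts, here is a minimal sketch of a preceding-collocate counter. It is not how the COCA interface works internally; the function name `preceding_collocates`, the crude regex tokenizer and the three sample sentences are all assumptions made up for illustration, and a real corpus query would also filter by part-of-speech tag.

```python
import re
from collections import Counter


def preceding_collocates(text, node="literally", window=2):
    """Count the words found within `window` tokens before each `node`.

    Hypothetical helper: the tokenizer is a crude lowercase regex,
    not the tagging that a corpus such as COCA relies on.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            # Take up to `window` tokens immediately to the left.
            counts.update(tokens[max(0, i - window):i])
    return counts


# Invented mini-sample, loosely modelled on the examples above.
sample = (
    "Squeezed almost literally between a rock and a hard place. "
    "It was quite literally true. "
    "They went missing almost just literally minutes apart."
)

counts = preceding_collocates(sample)
for word, freq in counts.most_common():
    print(word, freq)
```

Widening `window` is what lets more distant collocates, such as “figuratively” in “literally and figuratively”, show up, although this sketch only looks to the left of the node word.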

The extremely unsurprising finding is that if we don’t limit the collocates to just the 2 preceding words, by far the most common collocate of literally is “figuratively” (304). Used exclusively as part of “literally and figuratively”. This should count as its own use:

9. A romantic tale of love between two scarred individuals, one literally and one figuratively. [COCA, ACAD, 1991]

But even here, sometimes both possible senses of the use are figurative but one is perceived as being less so:

10. After years of literally and figuratively being the golden-haired boy… [COCA, NEWS, 1990]

This brings us to the secondary function (and notice I don’t use the word meaning, here) of “literally”, which is to disambiguate statements that in the appropriate context could have either figurative or literal meaning. Sometimes, we can apply a relatively easy test to differentiate between the two. The first sense cannot be rephrased using the adjective “literal”. However, as we saw above, a statement cannot always be strictly categorized as literal or figurative. For instance, example (2) above contains a disambiguating function although it is not between figurative or non-figurative but rather between two non-figurative interpretations of two situations that it may be possible to describe as a ‘solo’ (one where the soloist is prominent against background music and one where the soloist is completely unaccompanied). Clear examples are not nearly as easy to find in a corpus as the prescriptivist lore would have us believe, and neither is the figurative part clear cut:

11. And they were about literally to be lynched and they had to save their lives. [COCA, SPOK, 1990]

12. another guy is literally a brain surgeon [COCA, MAG, 2010]

Often the trope does not include a clear domain mapping, as in the case of hyperbole.

13. I was terrified of women. Literally. [COCA, LIFE, 2006]

This type of disambiguation is often used with numerals and other quantifiers where a hyperbolic interpretation might be expected:

14. this is an economy that is generating literally 500,000 jobs because of our foreign trade [COCA, SPOK, PBS, 1996]

15. While there are literally millions of mobile phones that consumers and business people use [COCA, MAG, 2008]

“Literally” also has a technical sense meaning roughly “not figuratively” but that has nothing to do with its popular usage. I could not find any examples of this in the corpus.

The above is far from an exhaustive analysis. If I had the time or inclination, we could fine tune the categories but it’s not all that necessary. Everyone should get the gist. “Literally” is primarily an intensifier and secondarily a disambiguator. And categorizing individual uses between these two functions is a matter of degree rather than strict complementarity.

None of the above is hugely surprising, either. “Literally” is a pretty good indicator that figurative language is nearby and a less good indicator that strict fact is in the vicinity. Andrew Goatly has described the language of metaphor including “literally” in his 1997 book. And the people behind the ATT-META Project tell me that they’ve been using “literally” as one of the indicators of metaphoric language.

Should we expect bloggers on language to have read widely on metaphor research? Probably not. But by now I would expect any language blogger to know that to look up something in a dictionary doesn’t tell them much about the use of the word (but a lot about the lexicographer) and the only acceptable basis for argumentation on the usage of words is a corpus (with some well recognized exceptions).

The “Literally Blog” that ran out of steam in 2009 was purportedly started by linguistics graduates who surely cannot have gotten very far past Prescriptivism 101. But their examples are often amusing. As are the ones on the picture site Litera.ly that has great and funny pictures even if they are often more figurative than the phrases they attempt to literalize. Another recent venture “The literally project” was started by a comedian with a Twitter account on @literallytsar who is also very funny. Yes, indeed, as with so many expressions, if we apply an alternative interpretation to them, we get a humorous effect. But what two language bloggers thought they were doing when they put out this and this on “literally”, I don’t know. It got started by Sentence First, who listed all the evidence to the contrary gathered by the Language Log and then went on to ignore it in the conclusion:

Well this is pretty much nonsense. You see, “pretty much” in the previous sentence was a hedge. Hedges, like intensifiers, might be considered superfluous. But I chose to use that instead of a metaphor such as “pile of garbage”. The problem with this statement is twofold. First, no intensifiers add anything to what they intensify. Except for intensification! What if we used “really” or “actually” – what do they add in that “literally” doesn’t? And what about euphemisms and so many other constructions that never add anything to any meaning. Steven Pinker in his recent RSA talk listed 18 different words for “feces”. Why have that many when “shit” would suffice?

Non-literal literally amuses, too, usually unintentionally. The more absurd the literal image is, the funnier I tend to find it. And it is certainly awkward to use literally and immediately have to backtrack and qualify it (“I mean, not literally, but…”). Literally is not, for the most part, an effective intensifier, and it annoys a lot of people. Even the dinosaurs are sick of it.

What is the measure of the effectiveness of an intensifier? The examples above seem to show that it does a decent job. And annoying a lot of prescriptivists should not be an argument for not using it. These people are annoyed by pretty much anything that strikes their fancy. We should annoy them. Non-sexist language also annoys a lot of people. All the more reason for using it.

“Every day with me is literally another yesterday” (Alexander Pope, in a letter to Henry Cromwell)

For sure, words change their meanings and acquire additional ones over time, but we can resist these if we think that doing so will help preserve a useful distinction. So it is with literally. If you want your words to be taken seriously – at least in contexts where it matters – you might see the value in using literally with care.

But this is obviously not a particularly useful distinction and never has been. The crazier the non-intensifier interpretation of an intensifier use of “literally” is, the less of a potential for confusion there is. But I could not find a single example where it really mattered in the more subtle cases. But if we think this sort of thing is important why not pick on other intensifiers such as “really”, “virtually” or “actually” (well, some people do). My hypothesis is that a lot of prescriptivists like the feeling of power and “literally” is a particularly useful tool for subjugating those who are unsure of their usage (often because of a relentless campaign by the prescriptivist). It’s very easy to show someone the “error” of their ways when you can present two starkly different images. And it feels like this could lead to a lot of confusion. But it doesn’t. This is a common argument of the prescriptivist but they can rarely support the assertion with more than a couple of examples if any. So unless a prescriptivist can show at least 10 examples where this sort of ambiguity led to a real consequential misunderstanding in the last year, they deserve to be told to just shut up.

Which is why I was surprised to see Motivated Grammar (a blog dedicated to the fighting of prescriptivism) jump into the fray:

Non-literal “literally” isn’t wrong. That said… « Motivated Grammar Non-literal literally isn’t “wrong” — it’s not even non-standard. But it’s overused and overdone. I would advise (but not require) people to avoid non-literal usages of literally, because it’s just not an especially good usage. Too often literally is sound and fury that signifies nothing.

Again, I ask for the evidence of what constitutes good usage? It has been good enough for TIME Magazine for close to a century! Should we judge correct usage by the New York Review of Books? And what’s wrong with “sound and fury that signifies nothing”? How many categories of expressions would we have to purge from language, if this was the criterion? I already mentioned hedges. What about half the adverbs? What about adjectives like “good” or “bad”. Often they describe nothing. Just something to say. “How are you?”, “You look nice.”, “Love you” – off with their heads!

And then, what is the measure of “overused”? TIME Magazine uses the word in total about 200-300 times a decade. That’s not even once per issue. Eric Schmidt used it in some speeches over his 10-year tenure as Google’s CEO and if you watch them all together, it stands out. Otherwise nobody’s noticed! If you’re a nitpicker who thinks it matters, every use of “literally” is going to sound too much. So, you don’t count. Unless you have an objective measure across the speech community, you can’t make this claim. Sure, lots of people have their favorite turns of phrases that are typical of their speech. I rather suspect I use “in fact” and “however” far too much. But that’s not the fault of the expression. Nor is it really a problem, until it forces listeners to focus on that rather than the speech itself. But even then, they get by. Sometimes expressions become “buzz words” and “symbols of their time” but as the TIME corpus evidence suggests, this is not the case with literally. So WTF?
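To put a rough number on the “not even once per issue” remark, here is a quick back-of-the-envelope sketch. The figure of 52 issues a year is my assumption (a weekly magazine with no skipped or double issues), and the function name `uses_per_issue` is made up for illustration; only the 200-300 uses per decade comes from the corpus.

```python
def uses_per_issue(uses_per_decade, issues_per_year=52):
    """Average occurrences per issue, assuming a weekly publication."""
    issues_per_decade = issues_per_year * 10
    return uses_per_decade / issues_per_decade


# The low and high ends of the range quoted above.
for uses in (200, 300):
    print(f"{uses} uses per decade: {uses_per_issue(uses):.2f} per issue")
```

Even at the top of the range, that works out to a bit more than one use every other issue.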

Conciliatory confession:

I just spent some time going after prescriptivists. But I don’t actually think there’s anything wrong with prescriptivism (even though their claims are typically factually wrong). Puristic and radical tendencies are a part of any speech community. And as my former linguistics teacher and now friend Zdeněk Starý once said, they are both just as much a part of language competence as the words and grammatical constructions. So I don’t expect they will ever go away nor can I really be too critical of them. They are part of the ecosystem. So as a linguist, I think of them as a part of the study of language. However, making fun of them is just too hard to resist. Also, it’s annoying when you have to beat this nonsense out of the heads of your students. But that’s just the way things are. I’ll get over it.

Update 1:

Well, I may have been a bit harsh on the blogs and bloggers I was so disparaging about. Both Sentence first and Motivated grammar have a fine pedigree in language blogging. I went and read the comments under their posts and they both profess anti-prescriptivism. But I stand behind my criticism, savagery included, of the paragraphs I quoted above. There is simply no plausible deniability about them. You can never talk about good usage and avoid prescriptivism. You can only talk about patterns of usage. And if you want to quantify these, you must use some sort of representative sample. Not what you heard. Not what you or people like you think. Evidence. Such as a corpus (or better still, corpora) can provide. So saying you shouldn’t use literally because a lot of people don’t like it needs evidence. But what evidence there is suggests that literally isn’t that big a deal. I did three Google searches on common peeves and “literally” came third: +literally +misuse (910,000), preposition at the end of a sentence (1,620,000), and +passive +misusewriting (6,630,000). Obviously, these numbers mean relatively little and can include all sorts of irrelevant examples, but they are at least suggestive. Then I did a search for top 10 grammar mistakes and looked at the top 10 results. Literally did not feature in any of these. Again, this is not a reliable measure, but it’s at least suggestive. I’m waiting for some evidence to show where the confusion over the intensifier and disambiguator use has caused a real problem.

Update 2:

A bit of corpus fun revealed some other interesting collocate properties of literally. There are some interesting patterns within individual parts of speech. The top 10 adjectives immediately following are:

TRUE 91

IMPOSSIBLE 24

STARVING 14

RIGHT 9

SICK 8

UNTHINKABLE 8

ALIVE 6

ACCURATE 6

HOMELESS 6

SPEECHLESS 6

The top 10 nouns are all quantifiers:

HUNDREDS 152

THOUSANDS 118

MILLIONS 55

DOZENS 35

BILLIONS 17

HOURS 14

SCORES 14

MEANS 11

TONS 11

The top 10 numerals (although here we may run up to the limitations of the tagger) are:

ONE 25

TWO 12

TENS 9

THREE 8

SIX 8

10 7

100 7

24 6

NEXT 5

FIVE 4

These are the top adverbs:

JUST 91

OVERNIGHT 24

ALMOST 19

ALL 17

EVERYWHERE 14

NEVER 13

DOWN 12

RIGHT 12

SO 11

ABOUT 11

And the top 10 preceding adverbs:

QUITE 552

ALMOST 117

BOTH 91

JUST 67

TOO 50

SO 38

MORE 37

VERY 31

SOMETIMES 30

NOW 26

One of the patterns in the collocates suggests that “literally” often (although this is only a significant minority of uses) has to do with scale or measure. So I was wondering whether it is possible to use the intensifier literally incorrectly (in the sense that most speakers would find the intensity inappropriate). For example, is it OK to intensify the height of a person in any proportion? Is there a difference between “He was literally 6 feet tall” (disambiguator), “He was literally seven feet tall” (intensifier requiring further disambiguation) and “He was literally 12 feet tall” (intensifier)? The corpus had nothing to say on this, but Google provided some information. Among the results of the search “literally * feet tall” referring to people, the most prominent height related to literally is 6 or 7 feet tall. There are some literally 8 feet tall people and people literally taller because of some extension to their height (stilts, helmet spikes, etc.). But (as I thought would be the case) there seem to be no sentences like “He was literally 12 feet tall.” So it seems “literally” isn’t used entirely arbitrarily with numbers and scales. Although it is rarely used to mean “actually, verifiably, precisely”, it is used in proportion to the scale of the thing measured. However, it is used both when a suspicion of hyperbole may arise and where a plausible number needs to be intensified. And most often a mixture of both. But it is not entirely random. “*Literally thousands of people showed up for dinner last night” or “*We had literally a crowd of people” is “ungrammatical” while “literally two dozen” is OK even if the actual number was only 18. But this is all part of the speakers’ knowledge of usage. Speakers know that with quantifiers, the use of literally is ambiguous. So if you wanted to say “The guy sitting next to me was literally 7 feet tall”, you’d have to disambiguate and say “And I mean literally, he is the third tallest man in the NBA.”

Somebody commented on the Language Log saying “of course [...] Chomsky was a massively gifted linguist” http://j.mp/9Q98Bx and for some reason, to use a Czech idiom, the handle of the jar repeatedly used to fetch water just fell off. Meaning, I’ve had enough.

I think we should stop thinking of Chomsky as a gifted linguist. He was certainly a gifted mathematician and logician, and he still is a gifted orator and analyst of political discourse (sometimes putting professionals in this area to shame). But I honestly cannot think of a single insight he’s had about how language works as language. His main contribution to the study of language (his only one really) was a description of how certain combinatorial properties of English syntax can be modeled using a particular formal system. This was a valuable insight but as has been repeatedly documented (e.g. Newmeyer 1986) its runaway success was due to a particular historical context and was later fed by the political prominence of its originator. Unfortunately, everything that followed was predicated on the model being isomorphic with the thing modeled. Meaning all subsequent insights of Chomsky and his followers were confined to refining the model in response to what other people knew about language and not once that I can think of using it to elucidate an actual linguistic phenomenon. (Well, I tell a lie here, James McCawley who worked with GB – and there must have been others – was probably an exception.) Chomsky’s followers who actually continued to have real insights about language – Ross, Langacker, Lakoff, Fillmore – simply ceased to work within that field – their frustration given voice here by Robin Tolmach Lakoff:

[Generative approaches to the 'science' of language meant] “accepting the impossibility of saying almost everything that might be interesting, anything normal people might want or need to know about language.” (Robin Tolmach Lakoff, 2000, The Language War)

So who deserves the label “gifted linguist” defined as somebody who repeatedly elucidates legitimate language phenomena in a way that is relevant across areas of inquiry? (And I don’t mean the fake relevance followers of the Universal Grammar hypothesis seem to be finding in more and more places.)

Well, I’d start with MAK Halliday who has contributed genuine insights into concepts like function, cohesion, written/spoken language, etc. Students on “linguistics for teachers” courses are always surprised when I tell them that pretty much all of the English as first or second language curriculum used in schools today was influenced by Halliday and none by Chomsky – despite valiant efforts to pretend otherwise.

But there are many others whose fingerprints are all over our thinking about language today. The two giants of 20th century linguistics who influenced probably everyone were Roman Jakobson and Charles Fillmore – neither of whom established a single-idea school (although Jakobson was part of two) but both were literal and metaphorical teachers to pretty much everybody. Then there’s William Labov who continues to help shift the “language decline” hypothesis on which much of 19th century philology was predicated. And, of course, there are countless practicing linguists who have interesting things to say about language every day – one needs to look no further than the contributors to the excellent Language Log. I don’t want to list any others off the top of my head lest I forget someone important, but here are some of my favorites:

My personal favorite linguist has long been Michael Hoey whose “lexical priming” hypothesis deserves more discussion and a lot more following than it has received. I got a real chill of excitement reading William Croft’s “Radical Construction Grammar”. It is probably the most interesting and innovative view of language that has come about since de Saussure.

Most of my thinking about language has been influenced by George Lakoff (so much I translated his thickest book into Czech – http://cogling.info) and Ronald Langacker who could both be said to be ‘single-theory’ thinkers but are actually using this theory to say interesting things about language rather than using language to say interesting things about their theory.

I have said to people at one point or another, you should read one of these linguists to understand this point about language better. I have never said that about Chomsky. Not once. I have said, however, you should read this thing by Chomsky to understand Chomsky better. (Not that it always helps, I’ve come across a book called Structure of Language whose authors’ sparse reference list includes all of Chomsky’s books but who refer to his work twice and get it wrong both times.) There is no denying Chomsky’s multi-disciplinary brilliance but a particularly gifted linguist he is not. He is just the only one most people can think of.

BTW: Here’s why I think Chomsky’s wrong. But that wasn’t really the point. Whether he’s right or wrong, he’s largely irrelevant to most people interested in language, and the sooner they realize they’re wasting their time, the better.