November 30, 2006

Group glee

One of the best things about teaching undergraduates is how much you learn. Yesterday, a student and I were discussing possible sources for his term paper on "intrinsically funny words", and as we poked around on Google Scholar, we stumbled over Lawrence W. Sherman, "An Ecological Study of Glee in Small Groups of Preschool Children", Child Development, 46(1) 53-61 1975. The abstract:

A phenomenon called group glee was studied in videotapes of 596 formal lessons in a preschool. This was characterized by joyful screaming, laughing, and intense physical acts which occurred in simultaneous bursts or which spread in a contagious fashion from one child to another. A variety of precipitating factors were identified, the most prevalent being teacher requests for volunteers, unstructured lags in lessons, gross physical-motor actions, and cognitive incongruities. Distinctions between group glee and laughter were pointed out. While most events of glee did not disrupt the ongoing lesson, those which did tended to produce a protective reaction on the part of teachers. Group glee tended to occur most often in large groups (7-9 children) and in groups containing both sexes. The latter finding was related to Darwin's theory of differentiating vocal signals in animals and man.

The exact definition of "group glee", from the body of the paper:

A general description of group glee was first established as a very intense, joyfully affective state maintained throughout a majority of the group (one-half or more). To isolate a critical incident of group glee, two criteria, noted in the codes as behavioral manifestation and ratio, had to be present.

Behavioral manifestation. -- Three categories of overt behaviors through which group glee manifested itself were laughter (Laf), screaming (Scr), and intense physical acts (Phys). Laughter was limited to instances of vigorous and joyful laughter. Screaming was limited to ebullient vocalizations which were emitted either in an organized, chantlike fashion or in random disarray. Intense physical actions were described as joyful physical behaviors such as hand clapping, jumping up and down, or other intense physical expressions. [...]

Ratio. -- If one of the behavioral manifestations or combinations thereof was recognized, a ratio of the number of children involved in the incident to the number of children present at its occurrence was calculated. If this ratio was 50% or more, an incident of group glee was noted as being present.
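For readers who like to see a coding scheme spelled out, the two criteria quoted above can be sketched in a few lines of Python. This is only an illustration of the definition as quoted, not Sherman's actual coding procedure: the `Incident` class, its field names, and the `is_group_glee` function are hypothetical, and only the three behavior codes (Laf, Scr, Phys) and the 50% threshold come from the paper.

```python
from dataclasses import dataclass, field

# Behavior codes taken from the quoted definition: laughter, screaming,
# intense physical acts.
GLEE_BEHAVIORS = {"Laf", "Scr", "Phys"}


@dataclass
class Incident:
    """Hypothetical record of one coded incident in a lesson."""
    behaviors: set = field(default_factory=set)  # codes observed, e.g. {"Laf"}
    children_involved: int = 0
    children_present: int = 0


def is_group_glee(incident: Incident, threshold: float = 0.5) -> bool:
    """Apply the two quoted criteria: behavioral manifestation and ratio."""
    # Criterion 1: at least one of the three behavioral manifestations.
    if not (incident.behaviors & GLEE_BEHAVIORS):
        return False
    # Guard against an empty room (an assumption; the paper does not discuss it).
    if incident.children_present <= 0:
        return False
    # Criterion 2: involved/present ratio of 50% or more.
    ratio = incident.children_involved / incident.children_present
    return ratio >= threshold
```

Under this sketch, four laughing children out of eight count as an incident of group glee, while three out of eight do not, however loud they are.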

Sherman found ten categories of "precipitating causes". All of them are familiar -- group physical activity (like dancing lessons), cognitive incongruity (painting with a string, or making a speech error), taboo-breaking (transgressing the teacher's authority or using a taboo word like "stinky-pu"), suspense-resolution and terminal points of activities, etc. The student's concept of "intrinsically funny word" came up in a couple of different categories: nonsense words and nonsensical phrases are examples of "cognitive incongruity", whereas words like "underwear" would be in the taboo-breaking category. (The commonest single category among Sherman's precipitating factors was the "me, me, me" response to requests for volunteers, which seems like a somewhat different kind of behavior from the other sorts of group glee discussed.)

What caught my attention was the table reproduced below:

This shows a slight but significant tendency for group glee to occur more often in mixed-sex groups. You may be able to see the pattern a bit more clearly if we redo the table in terms of the percent of lessons in which group glee occurred:

Group composition    Group glee (percent of lessons)
All girls            19%
Girls > boys         28%
Girls = boys         38%
Girls < boys         30%
All boys             12%

Sherman's paper was published in 1975, and there seems to have been relatively little work on the subject since then. If I was the Emperor of Academia, we'd have Departments of Group Glee Studies, Institutes for Interdisciplinary Group Glee Research, international workshops on Cross-cultural Group Glee Investigations, annual meetings of the American Group Glee Association (and its splinter group, the Association for Group Glee Science)... Well, maybe not. But at least there'd be some more research. Just think of the applications in stand-up comedy, for example.

November 29, 2006

Fear and loathing on Massachusetts Avenue

It's just three days before an invited lecture by a linguist at MIT's
Brain and Cognitive Sciences (BCS) department, and suddenly (this
was on Tuesday) someone from
outside the MIT community sends a message to all the lists usually
reserved for advertising talks to MIT Linguistics faculty, students, and
visitors, and attempts to send it to the BCS department too (though that
list turns out to be closed). But the message is not an
announcement. It's a diatribe claiming that
the invited lecturer is a liar who falsifies the cultural and linguistic
evidence (even, curiously, when conflicting
evidence is available in his own earlier published
works). The sender is concerned that he and his friends
might not get a chance to expose the
lies and make all these allegations from the floor of the lecture room
before being "cut off", so he is getting them in early by email. He ends
his barrage of charges with a sarcastic mock advertisement for
exploitation of native peoples for personal gain:

You, too, can enjoy
the spotlight of mass media and closet exoticists! Just find a remote
tribe and exploit them for your own fame by making claims nobody
will bother to check!

What we have here is something I've never seen before: an attack
ad against a linguist. What the hell is going on? Who is this hated
linguist who is publicly libeled and branded as a self-promoting
fabricator before he can even arrive at Logan Airport and take the
taxi ride to MIT to give his talk?

The linguist at the sending end of this nasty piece of dirty
campaigning will remain nameless here (and I repeat, he is not
a member of MIT's excellent department).
But I can name the linguist under attack: it's
Daniel Everett, now at Illinois State University. He's a distinguished
scholar in the field of Amazonian linguistics whom I first came to know
when Desmond Derbyshire and I published, in Handbook of Amazonian
Languages, Volume 1 (Mouton, 1986), a remarkable 200-page chapter of
Everett's about the grammar of a fascinating and quite unusual language
called Pirahã. Everett's recent work has been discussed in a
number of magazine articles in the past few months (that's where the
"spotlight of mass media" comes in), and several times here on Language Log
(this post
includes a reference list of our various mentions of him).

The work Everett will be talking about at MIT has to do with what he
says are cultural factors that help to make the Pirahã and their
language so unusual. He presents an account of his claims in print, in
Current Anthropology 46(4), 2005, pp. 621-647, where
you can also read critical comments by other linguists (pp. 635-641),
plus Everett's reply to them (pp. 641-644). A very brief paraphrase
of his claims follows (in my words, not his, but I'm doing my best
simply to report his views).

According to Everett,
the Pirahã place such a high cultural value on concreteness and
immediacy that they have essentially no time for considerations of matters
historical, artistic, or mathematical. They couldn't care less how the
universe began or who was alive two hundred years ago; they don't engage
in aesthetically motivated pictorial or dramatic art; and they don't do
mathematics — they don't even count. This focus on the here and
now, the real and concrete, is a positive value for them, not a
lack or deficit, so it has
great force, enough to account for the fact that
their complete absence of interest in the
hypothetical and the abstract has remained stable for some 200 years
of contented and almost totally monolingual and monocultural
life. Some of
the features of their language are results of the influence of this
culture: they don't have names for the colors of the rainbow, they don't
have a system of names for numbers, and their language lacks certain
grammatical features that most linguists think are universal —
notably, there aren't any tenses that involve reference points distinct
from the point of utterance (as in It had vanished, which says
that the vanishing took place earlier than some reference point in the
past, which itself is earlier than the moment of utterance), and there
also isn't any recursive syntactic subordination (clauses [which are
inside clauses [which are inside other clauses [which are inside other
clauses]]], and so on). These, roughly, are the claims that Everett is
prepared to defend in detail in his lecture.

So if Everett has it right, some languages are less like well-studied European
languages than we thought any languages were. Fair enough,
you might think. But what's all this about
lying and self-promoting and exoticizing and misleading the gullible
public? Why the hostility and abuse by advance email attack ad?

The attack message seemed to me to reveal a certain anxiety, even
panic, which had spilled over into anger. I have ruminated on what is
driving this, and I am led to consider three possible motivating forces,
perhaps all simultaneously involved: (1) on the linguistics front,
a certain defensiveness concerning cherished hypotheses about linguistic
universals; (2) on the political and ideological side, a strong
reaction against any perceived negative criticism of a Third World
people's culture; and (3) with regard to religion, a prejudice
against (particularly fundamentalist Protestant) Christianity. I am
not going to try and adjudicate things here (though my revulsion against
advance discrediting of visiting speakers by defamatory email should be
clear). Let me just try to explain, as briefly and neutrally as I can,
what I mean by (1)-(3), and where these forces seem to come from, and
how they apply to this case. I'll leave it at that.

(1) Linguistic universals.
The business about linguistic universals is perhaps the most reasonable
of the causes for angst. For those many linguists who closely follow the
thinking of Noam Chomsky, it is an article of their scientific creed that
languages are really very similar under the skin. Superficially they
look wildly different, and the feeling may not very rapidly recede even
after some effort at trying to learn a new language, but if you can just
attain a deep enough theoretical insight into how things work, you will
be able to perceive that in profound ways all languages share the same
organizing principles. It is not just that they constitute what
philosophers call a natural kind, but that they are skeletally identical.
Distracting lexical and morphological variety may blind us to that, but
deep down it will be found (through difficult theoretical work) to be
true. They regard it as extremely important for their whole branch of
the field of linguistics that this should be so, and it threatens very
deep beliefs of theirs when a linguist comes along saying something
really shocking, like that he knows of a language with, say, no
subordinate clauses of any sort.

Everyone should agree that remarkable claims should occasion remarkable
amounts of debate. Such debates are what theoretical linguistics is all
about. But wait: at MIT there are talks modifying or jettisoning hypothesized
linguistic universals every week. Why has this particular talk attracted
such a rare thing as a public message of abuse sent out in advance?
I think we have to consider the other two points as well.

(2) Political sensibilities.
The majority of academics today, especially in the social sciences
(including linguistics), bend over backwards to express — and I
have no doubt they actually feel — an enormously strong intuitive
revulsion against saying anything that might be perceived as even remotely
critical of another ethnic, racial, or cultural group or its cultural
products — particularly criticism of a poor or Third World culture
by people from a dominant Western culture. It is extremely unusual to
find anthropologists who will come out and directly attack a culture they
know well (it has happened, as in the case of Colin Turnbull's surprising
condemnation of the Ik in his book The Mountain People, but it's
very rare).

This sensibility is well-meant. People who are hostile to the rights
and status of minorities talk about it scathingly using the term
"political correctness" (PC), but I'm not interested in giving any
coarse anti-PC tirade here; I just want to get a clear view of what's
been going on.

The fact is that academics in fields like linguistics, anthropology,
and comparative psychology have often spent years trying to be sensitive
to the merits and capabilities of despised groups of people. They're not
wrong to feel that way, and to some extent I hold views of the same sort
myself. I have met ordinary people in Australia who don't know what
they're talking about who would be happy to tell me how the Aborigines
are vile, stupid, drunken, violent, worthless savages. This disgusts me;
but I can just imagine how much more it must infuriate linguists who
have devoted thousands of difficult hours studying the astoundingly
wonderful languages of Australian Aboriginal tribes, and getting to
know and like the people who speak them. Use of the dread phrase
"primitive language", or even a hint of it, will often send a linguist
over the edge with anger. I think some younger linguists may pick
up this attitude of infuriatedness even before they have spent
thousands of hours on fieldwork that has taught them just how complex
little-studied languages of preliterate peoples can be. That may have
happened here.

The malicious sender has picked an odd target in Everett, however.
Everett has done his thousands of hours, and he
is not saying anything denigratory about the Pirahã.
He likes and admires them. He has found their language astonishing,
and extremely complicated. Here's what he said about them in his
Current Anthropology paper:

I thank the Pirahã for their friendship and help for more than
half of my life. Since 1977 the people have taught me about their
language and way of understanding the world. ...
No one should draw the conclusion from this paper that the Pirahã
language is in any way "primitive." It has the most complex verbal
morphology I am aware of and a strikingly complex prosodic system. The
Pirahã are some of the brightest, pleasantest, most fun-loving
people that I know. The absence of formal fiction, myths, etc., does not
mean that they do not or cannot joke or lie, both of which they particularly
enjoy doing at my expense, always good-naturedly. Questioning Pirahã's
implications for the design features of human language is not at all
equivalent to questioning their intelligence or the richness of their
cultural experience and knowledge.

That's what Everett actually says. Yet still, I think, it is so hard
for a sensitive linguist or social scientist to hear claims that might
be construed as critical of a culture that the deep revulsion may spring
up unbidden before the actual claims are even put out on the table, before
the lecture is even given, before the taxi has even left the airport.

(3) Religion.
And so we come to anti-Christian prejudice. Earlier in his life,
Everett was a missionary linguist working with the Summer Institute of
Linguistics, the organization of fundamentalist Protestants founded
by Kenneth Pike for the purpose of analyzing the remaining undescribed
indigenous languages of the world and translating the Bible into each
of them. Not everyone knows that Everett left SIL many years ago, and
now does not believe Christian doctrines or practice a religious faith.
Still, the SIL's work goes on, and it did provide Everett's original
motivation to go to Brazil, back in the 1970s, and commence working on
the Pirahã language.

In that context, consider these three propositions, which I think are
true. (a) There is a tendency for more people to be atheists in
academia than in the rest of the population (it doesn't matter why).
(b) There is a tendency for social scientists to believe (with some
justification, of course) that missionaries over the centuries have done
great harm to indigenous people around the world, particularly in earlier
centuries. (c) Christian fundamentalists have been doing their own
cause great harm among intellectuals by repeatedly attempting crazy
things like taking over school boards and pushing creationist or
cryptocreationist ideas illicitly into science classes. If you put
(a), (b), and (c) together, you have some basis for a certain amount of
suspicion toward anyone who is thought to be an active, practising,
Protestant-fundamentalist missionary operating or appearing within the
academic sphere.

I say you have a basis for some suspicion; I don't say you have an
excuse for being prejudiced. I personally think prejudice against
Christians is as unedifying and immoral as prejudice against Jews.
Suppose (contrary to fact) that Everett were still a missionary: suppose
he really did still think that God had instructed him to ensure that the
Pirahã can read the Gospel according to St Mark in their
native tongue. Would this be grounds for a prejudice so deep that it
would insist that everything he did was evil and twisted, and everything
he said about the language was some devious lie? I've worked with
linguists who are practising Christians. I don't share their religious
beliefs, but they seem to be perfectly honest. I don't know why they
would lie about something as banal as subordinate clauses (as opposed to
the origin of the universe or the issue of whether we have immortal
souls). Where's the gain for God in telling lies about
tensed complements?

I think that if you now consider the effects of the linguistic,
political, and religious points working together, you might be able to
put together the rudiments of a sort of psychiatric explanation of the
message sender's outburst of anxiety and rage — though not a
justification for it.

Anyway, whether that's right or not, I do know this: the lucky people
who live in the Boston area (I regret that I now do not) have a chance
to hear Everett in person on Friday, because despite the hate campaign
he still plans to get in that taxi at Logan Airport and take it to MIT's
Building 46. His lecture is called "Culture and Grammar in
Pirahã", and it's on Friday, December 1, from noon to 1:30 p.m.,
in room 46-3310 at MIT (that is, Room 3310 of building 46; MIT people do
have a system of number names, and they use them to name buildings).
Language Log readers in New England who get there early enough to find a
seat can check out what Everett actually says, rather than what his
enemies say he says, and then make up their own minds.

[Update:
Dan Everett's talk did take place as scheduled on December 1;
it was not boycotted by the linguists in the area; about 125 people
showed up, in fact; and a good, spirited discussion followed in the
question period. You can actually listen to it, and look at the
handout, thanks to Ted Gibson's lab:
handout
in PDF form here, and audio
for Windows Media Player here.]

Censorship at the Daily Mail?

This is amusing. Apparently whoever moderates readers' comments over at the Daily Mail doesn't want Fiona Macrae's carelessness and credulousness to be exposed. She's the writer who basically copied out the press release for Louann Brizendine's book The Female Brain, using as her lede a factoid (about women talking three times more than men) which has repeatedly been debunked, most recently the day before in the Guardian, and which Dr. Brizendine herself has withdrawn after I pointed out that no actual studies support any similar numbers. (See this Language Log post for a list of links that go into mind-numbing detail on the factual background -- which is that the numbers reproduced in the Daily Mail piece are a pseudo-scientific urban legend, unconnected to any actual study; and that the many studies that do correlate talkativeness and sex find only small differences, often in the direction of more words from men.)

As of this writing, there are some 20 comments on Macrae's Daily Mail article, generally along the lines of

Like we didn't know this already?

Only three times as much?

Why spend money on studying the obvious?

Someone had to do a study to figure this out?

You know... in the 90's they said that women spoke about 6000 words a day, while men spoke around 2000. It seems that the count is different, but the ratio stayed the same. Interesting.

I don't really think that it took several doctors doing a clinical study or writing a book to conclude that women talk more then men! Good Grief ask any husband or honest woman.

Several Language Log readers have sent me email to the effect that they attempted to submit a comment (using the facility at the bottom of the online Daily Mail article), referencing my Boston Globe article ("Sex on the brain", 9/24/2006), or the 11/27/2006 Guardian article, or some of the Language Log posts on the subject. Some have also attempted to correct Macrae's careless mis-copying of the book's name and author, which she renders as The Female Mind by Louan Brizendine instead of The Female Brain by Louann Brizendine -- but none of these comments have appeared on the Daily Mail's site, now nearly two days after the first of them was submitted. Perhaps the intern who moderates comments has gone on to other things.

The many unmoderated comments at Fark on Macrae's story are generally similar to those at the Daily Mail, though some of the farkers are even more straightforwardly misogynistic:

Nah, the male scientists just stopped paying attention at three.

The funny thing is, while women *TALK* three times as much as men, they don't really communicate three times as much information. Thus, a lot of what they say is either redundant or null data.

That is why our eyes glaze over, and we say "yes, dear".

Take all your clothes off and we will pay rapt attention to whatever you are saying.

I saw something about this on 20/20 weeks back - they said the female brain releases a chemical like endorphins when talking, so the vimens actually catch a buzz by yapping so much. Eventually they will evolve where their tongues are hung in the middle so they can flap on both ends...

I thought it just felt like three times as long. My wife's stories take longer for her to tell than the actual event being described.

Listening to women speak is like torture. It is the worst torture.

Its not the endless yapping that really gets to me, its the shrieking laughter that gets louder and more shrill the more women are in a group.

Just have her sit so there's a TV on behind her and over her shoulder, nod, smile, etc. Just try not to exclaim, "God, I'd like to fark that!" when a hottie appears on the TV and the GF is talking about her mother.

"There are, however, advantages to being the strong, silent type. Dr Brizendine explains that testosterone also reduces the size of the section of the brain involved in hearing - allowing men to become "deaf" to the most logical of arguments put forward by their wives and girlfriends."
I call shenanigans, This is feminist crap. I cannot become "deaf" when a harpy is around. "Logical". Ha. No doubt.

From observing my ex-girlfriends, it's like they just don't feel right if they aren't vomiting up an endless stream of words at all times. It's almost always due to expanding every posiible tangent in their "stories" into an avalanche of pointless detail.
STFU, women. Just STFU. The one-sided therapy sessions that you have the nerve to call "conversations" make your boyfriend/hubby fantasize about ditching your chatty ass.

I took the zipped-lip picture from a comment at fark, which was intended (I guess) as an imaginary solution to the wish expressed in the last comment. But a couple of the fark commenters did link to a Language Log post debunking the factoid.

Somewhere, I expect, there is a web forum where the anti-male counterparts of these comments are on display, in reaction to the same story. The old-fashioned version is something along the lines of "that's because we have to repeat everything three times to get it through men's thick skulls", but there are many newer ones, spinning the "women talk three times more" factoid in terms of men's stereotypically lower verbal ability, men's stereotypical inability to create rapport through communication, men's stereotypical difficulty in expressing or even understanding their emotions, and men's brutish characteristics in general. In fact, come to think of it, that's pretty much the theme of the part of Dr. Brizendine's book that discusses men.

It's interesting how much people enjoy hearing about "scientific studies" that confirm their prejudices, and how easily this allows pseudo-scientific urban legends to be established. The public's appetite for stereotype-affirming bamboozlement explains a lot about science journalism -- and pop psychology books as well. What with population increases and all, there are now thousands of suckers born every minute.

[Update -- Looking around the rest of the Daily Mail's site, it seems that they routinely stop accepting comments after the first ten or twenty, thus maintaining the traditional one-way correction-free newspaper model, while giving the impression of reader participation without the expense and trouble of actually allowing it to take place.]

Normally when I hear about the latest study confirming some female stereotype, I don't bat an eye. So, we talk more than men, whatever. Maybe it's true, maybe it will be debunked. But peeling back the onion of the book's press coverage gave me pause. At a moment, when enthusiastic publicity is given to studies concluding women spend eight and half years of their lives shopping and proponents of single-sex classrooms argue that boys should be allowed to roughhouse while girls should not, the tenacity of idiotic stereotypes is unsettling. No doubt the study of differences between women's and men's brains will unravel untold wonders, but it's hard to underestimate how rife with scientific imposters the path there will be.

November 28, 2006

Yet another epicene pronoun: Hu are we kidding?

On his excellent "Web
of Language" site, Dennis Baron writes
of the latest effort to introduce a non-gender-specific (or "epicene")
singular third-person pronoun into English. D.N. DeLuna, a part-time
writing teacher at Johns Hopkins University,
has proposed using "hu" as a concise replacement for "he or she." As
she told the Chronicle
of Higher Education, DeLuna intends "hu" to be pronounced as "huh"
— "except with not as much aspiration." (Huh?) In addition to
the Chronicle, DeLuna has managed to get press attention from the Los
Angeles Times and the Hartford
Courant. But despite the flurry of interest, Baron convincingly argues
that we shouldn't count on "hu" being any more successful than the dozens
and dozens of other epicene pronouns that have been proposed over
the past century and a half.

The neuroendocrinologist formerly known as Prince

I love when a book explains supposedly scientific information in language that approximates that Prince song "International Lover."

That's such a nice summary, it doesn't matter that Ann gets the name of the book wrong (it should be "The Female Brain", not "The Female Mind"), or that she quotes the Daily Mail's bullet points as if they were true.

If you care about the science as well as the rock lyrics, there are some links here.

Regression to the mean in British journalism

Ironically, just a day after the article by Stephen Moss in the Guardian, "Do women really talk more?" (11/27/2006), which quotes Dr. Louann Brizendine retracting her assertion that "A woman uses about 20,000 words per day while a man uses about 7,000", the Daily Mail published an article by Fiona Macrae, "Women talk three times as much as men, says study" (11/28/2006), which presents Dr. Brizendine's assertion as fact. (For more on the relevant science, see the links here.)

Russell Craig, who describes himself as "a devoted Language Log reader", sent in this note about some of the ripples from the Daily Mail's belated splash into the sex-words pool:

On the Drudge Report yesterday was the following headline: "Women 'talk three times as much as men'...". Of course Language Log has alerted me to be wary of such claims. I checked out the link, which is a story in the UK Daily Mail talking about Dr. Luan Brizendine's infamous book. The opening lines:

It is something one half of the population has long suspected - and the other half always vocally denied. Women really do talk more than men.

In fact, women talk almost three times as much as men, with the average woman chalking up 20,000 words in a day - 13,000 more than the average man.

The story doesn't mention that the study has its nay-sayers, so I decided to post a helpful comment with a link to Language Log. This web site reviews and filters comments, so when the comment had not appeared later in the day, I sent in a new comment, this time with a specific link to some of your articles specifically discussing this book. When I checked back this morning, the option to post comments on this story had been taken away.

This does not say much for the quality of journalism at the UK Daily Mail, particularly their "Femail" division.

Well, as I've often had occasion to remark, the traditional media will never be able to fulfill their undoubted
promise as an information source until they can find a way to impose
some of the elementary standards of accuracy and accountability that we take for granted in the blogosphere.

[By the way, the misspelling of Dr. Brizendine's first name, which should be "Louann" with two n's, is not Russell's fault -- he got it from the Daily Mail article.]

[Update -- John Lawler points out that

just noticed that there are already 9 pickups of the Daily Mail article on Google News (grouped with one about women's response to porn; one wonders how the content-matching program was trained), from all over the Anglophone world. You can't keep a good factoid down, apparently.]

An early New Year's resolution

Note to self: when talking to the press, never mention Eskimos and their words for snow.

In an otherwise clear and fair article on sex differences in talkativeness ("Do women really talk more?", The Guardian, 11/27/2006), Stephen Moss manages to misrepresent both me and the poor old Inuit:

In the end, [Liberman] concluded that the figures were probably based on guesswork, likening the "fact" that women talk more than men to the often stated "fact" that the Inuit have 17 words for snow. Both, he said, were myths. The Inuit actually have only one word for snow; and research shows only minute differences between the amount that men and women talk. "Whatever the average female v male difference turns out to be," he concluded, "it will be small compared to the variation among women and among men; and there will also be big differences, for any given individual, from one social setting to another."

Here's how I used the Eskimo-snow-words analogy in the Boston Globe article from which Moss takes that concluding quote:

EXPERTS TELL US the Eskimos have about four dozen words for snow. Or is it 200? Or seven? Or maybe four? Here's a hint: It's roughly the same number as in English. And here's another hint: Most of the people who throw Eskimo snow-word numbers around don't know anything about it, and haven't bothered to look it up.

A summary of the truth about Inuit snow vocabulary can be found in an earlier Language Log post by Geoff Pullum, "Sasha Aikhenvald on Inuit snow words: a clarification", 1/30/2004. I believe that Moss' reference to the number 17 comes indirectly from a Language Log post by Arnold Zwicky, "Only 17 words for snow", 1/9/2006, referring to a classic snowclone sighting in a book review by Christopher Buckley:

The Inuit language contains -- what? -- 17 different words for "snow"? The AD's must have twice that many for "vomit."

On the phone with Stephen Moss, I may well have mentioned "17" as one of the smallest of the many invented-and-unsourced counts for Eskimo snow vocabulary; and I may have tried to explain how a single Inuit root can give rise to an indefinitely large number of derived words, given the polysynthetic nature of the language. But I certainly would not have said that "the Inuit actually have only one word for snow".

Frankly, I don't remember exactly how that part of the conversation went. But in the future, I've decided, I'm going to swear off the Eskimos completely when talking with representatives of the fourth estate. No good can possibly come of it.

Word counts

When scientists want to support a factual assertion in print, they either present some experimental evidence, or they add a footnote referencing some earlier publication of evidence. Journalists have an analogous pair of methods: one is to report what they themselves experienced, and the other is to quote an eye-witness, an official spokesperson, or an expert. But every once in a while, journalists act like scientists and do an experiment. In yesterday's Guardian, Stephen Moss gives an example: "Do women really talk more?" For this article, he wired up a man and a woman -- Tim Dowling and Hannah Pool, who (I think) are Guardian staffers -- and recorded and transcribed everything they said for a day.

I think this started because back in September, I wrote a piece in the Boston Globe, "Sex on the brain", about Louann Brizendine's claim that women use about 20,000 words a day and men only about 7,000. This in turn followed up on some Language Log posts during the previous month, which you can find listed here. I noted that none of Brizendine's end-notes provided any factual support for the words-per-day claim; that versions of this claim are common in psychological self-help books and even religious tracts; and that the relevant parts of the experimental literature show no meaningful sex difference in talkativeness, with several studies even showing men as slightly talkier.

The Guardian's experiment was consistent with the literature:

Hannah said 12,329* words
Tim said 11,279 words
*Hannah accidentally turned off her recorder for two hours, however, so her real total could be 14,000.

And Stephen Moss even reached Louann Brizendine by phone -- in a picturesque location! -- and she graciously conceded the point:

When I reach Brizendine, just as she is crossing the Golden Gate bridge, she tells me that she has accepted the criticism of the numbers quoted in the book - on both volume of words and rate of speech - and will be deleting them from future editions. Nor will they appear in the UK edition, to be published by Bantam in April. "I understand Mark Liberman's point and I am grateful to him," she says. "He felt I was passing on data that was not nailed down, and thus perpetuating a myth, so it will be taken out in future editions." She admits language is not her specialism, and she had been reliant on the advice of others.

This is excellent journalism. And it warms a linguist's heart to see how engaged Moss gets in the details of the project -- he's learning to do linguistic research, and he seems to have enjoyed it. But I'm afraid that this wasn't very good science, all the same.

In fact, Moss understands this (some of the following is apparently quoted from observations by "our linguist, Dr. Jane Sunderland"):

This is one man and one woman sampled on one, not necessarily, typical day. Moreover, our man admits that he is naturally reserved, while our woman is noted for her effervescence and says she always feels the need to act as a facilitator in conversations. They might almost have been chosen to act out the urban myth of taciturn man and talkative woman. [...]

Tim spent the first part of his recording at home, watching television, not talking to his family, and made two 40-minute tube journeys alone. He spent the day in the Guardian offices - which he doesn't usually - surrounded by people he did not know particularly well, and with his head down. (Hannah was also in the office, but she works there every day and is very relaxed in the environment.) Despite this (and despite at one point describing himself as "a man of few words"), Tim produced more than 11,000 words over 14 hours. [...]

In contrast to Tim, Hannah was with people most of the day (the exception being shopping in Sainsbury's). When you are with people you usually talk to them. (Incidentally, Hannah's figure suggests that for anyone to produce 20,000 words in a day would be difficult.)

We should add that the two subjects in this case knew what the point of the experiment was, and were able to adjust their behavior to influence the results. If you really wanted to draw conclusions about men and women in general, you'd need to record a demographically balanced sample of people in a balanced sample of contexts. With one woman and one man, you could get almost any result at all. It's nice that the Guardian's result was a plausible one, but it's a puzzle for philosophers, I think, why people are so ready to be influenced by the results of single-trial experiments on phenomena they know to be highly variable.
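To make the single-trial point concrete, here is a small simulation sketch in Python. It assumes, purely for illustration, that men and women draw their daily word counts from one and the same bell-shaped distribution; the mean of 12,000 words and the standard deviation of 5,000 are hypothetical numbers, not measurements from any study. The function names are likewise invented for this sketch. It then asks how often a single randomly chosen woman would out-talk a single randomly chosen man by a given margin.

```python
import random


def single_pair_gap(rng, mean=12_000, sd=5_000):
    """Word-count gap (woman minus man) for one randomly drawn pair.

    Both speakers are drawn from the SAME hypothetical distribution,
    so any gap that shows up is pure sampling noise.
    """
    woman = max(0.0, rng.gauss(mean, sd))  # clamp: no negative word counts
    man = max(0.0, rng.gauss(mean, sd))
    return woman - man


def share_of_trials_with_gap(threshold, trials=100_000, seed=0):
    """Fraction of one-woman-one-man trials where she leads by >= threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if single_pair_gap(rng) >= threshold)
    return hits / trials


if __name__ == "__main__":
    for threshold in (1_000, 5_000, 10_000):
        print(threshold, round(share_of_trials_with_gap(threshold), 3))
```

Under these made-up assumptions, a sizable share of one-man-one-woman trials shows the woman ahead by a thousand words or more, even though the simulated population has no sex difference at all. A different assumed spread would change the shares, but not the moral.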

By a curious coincidence, another study featuring the interpretation of word counts from a small sample recently played a prominent role in a major English-language publication. This study was not a one-day journalistic lark, but a serious, decade-long study that has played a major role in influencing a public-policy debate that is central to our society. And yet, it has some issues in common with what Stephen Moss did.

By age 3, children from privileged families have heard 30 million more words than children from underprivileged families. Longitudinal data on 42 families examined what accounted for enormous differences in rates of vocabulary growth. Children turned out to be like their parents in stature, activity level, vocabulary resources, and language and interaction styles. Follow-up data indicated that the 3-year-old measures of accomplishment predicted third grade school achievement.

This is obviously serious stuff. Here's some of Tough's discussion:

They found ... that vocabulary growth differed sharply by class and that the gap between the classes opened early. By age 3, children whose parents were professionals had vocabularies of about 1,100 words, and children whose parents were on welfare had vocabularies of about 525 words. The children’s I.Q.’s correlated closely to their vocabularies. The average I.Q. among the professional children was 117, and the welfare children had an average I.Q. of 79.

When Hart and Risley then addressed the question of just what caused those variations, the answer they arrived at was startling. By comparing the vocabulary scores with their observations of each child’s home life, they were able to conclude that the size of each child’s vocabulary correlated most closely to one simple factor: the number of words the parents spoke to the child. That varied greatly across the homes they visited, and again, it varied by class. In the professional homes, parents directed an average of 487 “utterances” — anything from a one-word command to a full soliloquy — to their children each hour. In welfare homes, the children heard 178 utterances per hour.

What’s more, the kinds of words and statements that children heard varied by class. The most basic difference was in the number of “discouragements” a child heard — prohibitions and words of disapproval — compared with the number of encouragements, or words of praise and approval. By age 3, the average child of a professional heard about 500,000 encouragements and 80,000 discouragements. For the welfare children, the situation was reversed: they heard, on average, about 75,000 encouragements and 200,000 discouragements. Hart and Risley found that as the number of words a child heard increased, the complexity of that language increased as well. As conversation moved beyond simple instructions, it blossomed into discussions of the past and future, of feelings, of abstractions, of the way one thing causes another — all of which stimulated intellectual development.

Hart and Risley showed that language exposure in early childhood correlated strongly with I.Q. and academic success later on in a child’s life. Hearing fewer words, and a lot of prohibitions and discouragements, had a negative effect on I.Q.; hearing lots of words, and more affirmations and complex sentences, had a positive effect on I.Q. The professional parents were giving their children an advantage with every word they spoke, and the advantage just kept building up.

This is certainly consistent with our expectations -- our stereotypes -- and unlike the 20,000-vs.-7,000 legend, it's based on experimental data. However, as Hart and Risley write:

All parent-child research is based on the assumption that the data (laboratory or field) reflect what people typically do. In most studies, there are as many reasons that the averages would be higher than reported as there are that they would be lower. But all researchers caution against extrapolating their findings to people and circumstances they did not include. Our data provide us, however, a first approximation to the absolute magnitude of children’s early experience, a basis sufficient for estimating the actual size of the intervention task needed to provide equal experience and, thus, equal opportunities to children living in poverty. We depend on future studies to refine this estimate.

They also tell us clearly that their sample was a small one:

Our final sample consisted of 42 families who remained in the study from beginning to end. From each of these families, we have almost 2 1/2 years or more of sequential monthly hour-long observations. On the basis of occupation, 13 of the families were upper socioeconomic status (SES), 10 were middle SES, 13 were lower SES, and six were on welfare.

Now, six is a bigger number than one, obviously, and it's big enough that it makes sense to do statistical significance tests on tables like this one, taken from Hart and Risley (2003):

Families' Language and Use Differ Across Income Groups

                                    13 Professional    23 Working-class   6 Welfare
Measures & Scores                   Parent   Child     Parent   Child     Parent   Child
Pretest score (a)                   41       -         31       -         14       -
Recorded vocabulary size            2,176    1,116     1,498    749       974      525
Average utterances per hour (b)     487      310       301      223       176      168
Average different words per hour    382      297       251      216       167      149

(a) When we began the longitudinal study, we asked the parents to complete a
vocabulary pretest. At the first observation each parent was asked to
complete a form abstracted from the Peabody Picture Vocabulary Test (PPVT).
We gave each parent a list of 46 vocabulary words and a series of
pictures (four options per vocabulary word) and asked the parent to
write beside each word the number of the picture that corresponded to
the written word. Parent performance on the test was highly correlated
with years of education (r = .57).

(b) Parent utterances and different words were averaged over 13-36 months of
child age. Child utterances and different words were averaged for the
four observations when the children were 33-36 months old.

But six should not be a big enough number to lay our concerns to rest. We wouldn't try to predict the results of a national election based on an in-depth survey of six people in one city. Should we make national educational policy based on a similarly small sample, even if the data comes from 2 1/2 years of monthly visits? Does a sample of children from six poor families in 1980's St. Louis, as observed in a monthly visit from researchers with recording equipment, give a meaningful picture of the experience of the millions of people that Tough's article takes them to represent? In particular, it's not clear how to reconcile this picture of monetary poverty engendering linguistic poverty with the central role that "lower SES" people have always played in American linguistic creativity.

This is not a criticism of Hart and Risley, who did a marvelous piece of research. But I think that it amounts to a criticism of several related scientific disciplines, including my own. More than a decade after Hart and Risley's first publication, the "future studies" that they "depend on to refine [their] estimate" are mostly still just as hypothetical as ever.

I'm not sure to what extent she has misrepresented the sample size though:

"Meaningful Differences in the Everyday Experience of Young American Children is one of the most thorough studies ever conducted."

"There is no room here to do justice to this epic analysis, but no one could fail to be convinced by it."

She didn't mention it was based on just "six poor families in 1980's St. Louis" :-)

Well, it was indeed an "epic analysis", especially for its time -- they collected 42*2.5*12 = 1,260 hours of recordings, transcribed and coded them in many ways, and then followed the same kids through their subsequent career in school, and cross-correlated everything with everything, more or less. That doesn't change the fact that the "welfare" group, whose kids are at greatest risk of low achievement in school, was N=6.]
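The arithmetic behind that figure can be spelled out in a few lines of Python. This is only a sketch: the family counts come from the Hart and Risley passage quoted above, while the 2.5-year duration is a rounding of their "almost 2 1/2 years or more", and the variable and function names are invented here.

```python
# Family counts as given in the quoted passage from Hart and Risley.
FAMILIES = {"upper SES": 13, "middle SES": 10, "lower SES": 13, "welfare": 6}

YEARS_OBSERVED = 2.5   # rounded from "almost 2 1/2 years or more"
VISITS_PER_YEAR = 12   # monthly visits
HOURS_PER_VISIT = 1    # hour-long observations


def recording_hours(n_families):
    """Approximate hours of recordings for a given number of families."""
    return n_families * YEARS_OBSERVED * VISITS_PER_YEAR * HOURS_PER_VISIT


total_hours = recording_hours(sum(FAMILIES.values()))
welfare_hours = recording_hours(FAMILIES["welfare"])
welfare_share = welfare_hours / total_hours

print(total_hours, welfare_hours, round(welfare_share, 3))
```

On this rough accounting, the welfare group contributes one seventh of the recordings, all of them from the same six households.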

Very very good pope

This morning I heard the voice of a Turkish woman, with a thick
accent, being interviewed in an Istanbul street by an NPR reporter.
The woman said:

Jampol very, very good.

Jampol is John Paul; she was speaking about the last Pope
(who was popular and well received when he visited Turkey), and she
was saying that she thought him a very, very good man.

And it made me realize that there was something
amazing about what she said: her English was bad (note
the lack of is in her utterance quoted above), but in one way it
went beyond anything that had been described in English
grammar textbooks by the end of the 20th century.

At least, I can say this much (and the part that follows has been
slightly revised since I first posted it, to make it more accurate):
I have never been able to find a
published grammar which gives a full description of the following
fact: pre-head modifiers
of both adjective and adverb categories, in noun phrases and
adjective phrases and adverb phrases, can be repeated to express
intensification of the expressed quality, the number of repetitions
being a signal of the degree of
intensification (so that very, very good is better than very good; big, big, big problems are bigger problems than big, big problems, and so on).

Now, I am assuming that this is one generalization, not two.
That is, I am taking very, very good and a good, good
man to
be instances of the same phenomenon. This could be wrong. But
if it's right, I don't know of any prior grammar that points it out in full
generality. The closest approach is in the best of the earlier large
grammars, Randolph Quirk et al.'s A Comprehensive Grammar of
the English Language (Longman, 1985), page 473:
"Some intensifiers can be repeated for emphasis."
(It's not really emphasis, it's
intensification; I used the word "emphasis" in the first version of
this post, but I shouldn't have.)
An additional observation is made there: "the repetition is permissible
only if the repeated items come first or follow so". That is,
so very very nice is possible, and much much too kind,
and very much too kind, but not *very much much too
kind. Nice point.

What I don't find anywhere in Quirk et al.'s big book
is the observation that
attributive adjectives can repeat for intensificatory effect as well, as
in a good, good man.

When I realized in 1999 that intensificatory reduplication
(of both adjective modifiers in the
noun phrase and adverb premodifiers in adjective phrases and adverb
phrases) needed to be described in
the Adjectives and Adverbs chapter of The Cambridge Grammar of the
English Language, I rummaged around in all the earlier reference
grammars I could find to see what they had said about it, and the answer
was that the exact facts had apparently never been recorded.
What Rodney Huddleston and I
wrote for Chapter 6 of The Cambridge Grammar
(pages 561-562) was apparently the
first description that dealt with both adjectives and adverbs.

Yet the Turkish woman apparently knew how to do this kind of
intensificatory repetition. At least, she knew
the part of it that applies to adverbs like very.
She has almost certainly
never seen either the Quirk volume or The Cambridge
Grammar (the latter has only been out four
years and costs about $150). So how did she know you could
repeat such words for intensificatory effect, when it is almost inconceivable
that she could have learned it from a book?

You might say this is no big deal: it's not a difficult thing
to learn, and hardly needs a description. I don't know about the
first part of that, but the second part is not true. Here's why
you need a description: not all adjectives can reduplicate, and
which ones can does not follow from any basic principle known to me.
This is not relevant for very, which really only occurs as
a pre-head modifier; but there's an interesting general point about
this not being a matter of mere basic common sense. Notice that
in the following examples, the starred ones are not grammatical:

Just then a huge huge huge spider appeared.
*The spider looked huge huge huge in comparison to the fly.
Whether Airbus can overcome its major, major problems
is not clear.
*Whether Airbus's problems are major, major is not clear.
We need a good, good man to do this job.
*We need a man who is good good to do this job.

The generalization is simple: you can reduplicate an adjective
for emphasis if it's in what The Cambridge Grammar calls
attributive function, but not if the adjective is in
predicative function. I don't see how you could have guessed
that if you hadn't looked at the data and I hadn't told you what
the answer was.

There are similar facts regarding adverbs. As pre-head modifiers
they can reduplicate, but as (for example) verb phrase adjuncts they
cannot:

They did a really nice thing for me on my birthday.
They didn't need to do anything for my birthday, really.
They did a really, really nice thing for me on my birthday.
*They didn't need to do anything for my birthday, really really.
He is totally awesome.
The program wasn't eliminated totally.
He is just totally, totally awesome.
*The program wasn't eliminated totally totally.

Well, the obvious answer to how the Turkish woman learned to
reduplicate the modifier very is that she had heard people who
spoke English saying very very, and she knew enough to imitate
them in this regard.

Another possibility would be that reduplication of modifiers for
emphasis happens to be a property of Turkish (I have been told that
this statement is indeed true), and the woman tacitly
knew just enough about English (namely that very was a pre-head
modifier in an adjective phrase) that she was able to unthinkingly transfer
it from Turkish to English, and by good luck she was right, because it is a
feature of English too.

Somewhat less plausible, in my view, would be a Chomskyan
line: that reduplication of modifiers for emphasis is a
linguistic universal, held in common by all natural languages
and built into human brains at birth or conception, so no one
ever has to learn it.

I don't know which is right. But I have sometimes seen statements,
by philosophers and other people who haven't done much close-up study of
the language acquisition process, suggesting that foreign adults learn the
rules of the target language out of books, or are told what the rules are
by their teachers. In the case at hand, such statements seem
extraordinarily implausible. If it's right that no
grammarian had written any account of this simple feature of English
before 2002, we can be sure at least that any foreign speaker of English
who has learned emphatic reduplication of pre-head adjective and
adverb modifiers learned it in some other way than by reading about
it in a grammar textbook, and any teacher who has ever given a
lesson on it deserves to be congratulated for having done some
original research.

[Update:
I know I'm going to be flooded with mail (I won't be able to answer
it all) from people who insist
you can reduplicate predicative adjectives, and they'll
send me examples like You were wrong, wrong, wrong!.
Briefly, let me point out that you have to distinguish
among different but superficially similar phenomena. Certainly,
you can pause at the end of a sentence after an adjectival
predication, and simply repeat the adjective, like this:
It is disgraceful. Disgraceful.
But in fact it's the whole adjective phrase that's repeated here:
It is sad to see. Sad to see.

And you can do the same with any kind of phrase; it doesn't
need to be an adjective phrase, it could be a preposition phrase:
It is beyond belief. Simply beyond belief.
You can even interrupt yourself and do a repetition in the
middle of a sentence, as in the famous line from Casablanca: "I am shocked, shocked, to find that
gambling is going on here." But these examples involve the
intonation breaks associated with parenthetical additions or
interruptions (and they are often written with dashes rather
than commas). They do not have the quality of
the smoothly integrated arbitrary repetition for degrees of
intensification that you find with attributive adjectives modifying
nouns and adverbs modifying adjectives.

My examples above were
carefully designed to be unsuitable for the sort of parenthetical
restatement I'm referring to here. It's not that no one can find
a predicative adjective being repeated; it's that close attention
to the details reveals that attributive adjectives and pre-head
adverb modifiers are being reduplicatively emphasized in ways
that predicative adjectives and post-head adverb phrase
adjuncts are not.

One other point: another undescribed feature of English
I discovered in 1999, also described in The Cambridge
Grammar (page 562) was tautologous use of synonymous
but distinct adjectives for intensification, as in tiny little bird
(or little tiny bird) or great big hole. And
again, this is restricted to attributives: no one says
*The bird was little tiny, or
*The hole was great big.]

November 27, 2006

Slurry accent II

OK, there's new data, and so I've got a new hypothesis about how Lawrence Henry came to refer to "a kind of commercial London speech known as 'slurry.'" It wasn't a slip of the ear, heard in place of "Estuary". It wasn't a malapropism, merging "estuary", "Surrey" and "slang". No, it was a textual misreading.

When I posted about this a few days ago ("Slurry", 11/24/2006), I also wrote a letter of inquiry to the editor of The American Spectator, where Henry's column was published. They put my note into their online reader mail (including my own slip of the fingers, "slip of the error" for "slip of the ear"!), and Lawrence Henry responded:

I learned of the accent I call "slurry" from none other than Dick Francis, can't now remember which novel. He described it in some detail as a kind of commercial affection based on suburban London, and gave extensive examples in a character's speech.

Dick Francis has written many books, I'm afraid -- too many for me to search, even if I owned them all. If anyone knows or can find where he used the word slurry to describe an accent of the right kind, please let me know. While waiting for the true citation, though, I have a guess about what has happened. If we search amazon.com books on a9 for {"slurry accent"}, we get seven results. None of them are to Dick Francis books. But they're like the sample of four below:

Dorothy Garlock, The Listening Sky, p. 17: "If she's got her eye on the boss man, it'll do her no good:" This voice had the slurry accent of the South. "Who ain't got a eye on him? Lordy mercy."

A.J. Zerries, The Lost Van Gogh, p. 72: "Hello, Ryder, and not so tense -- it's your old friend, Aaron," said the man on the stoop in a slurry accent.

Kavita Daswani, The Village Bride of Beverly Hills, p. 38: I kept my eyes lowered as I heard these people in their relaxed, slurry accents talking about what had happened that morning or debating between Chinese and a sandwich.

Paul Garrison, Red Sky at Morning, p. 129: "We will speak English to spare your sailors unnecessary distress." "Admiral," the well-dressed Wong responded with a slurry accent. His English came less easily than Admiral Tang's.

It would be easy to misread a phrase like a slurry accent as if it were analogous to "a brummy accent" rather than to "a fussy accent". So my hypothesis is that Dick Francis described a character's affectation of Estuary English, using a phrase like "a slurry accent" or "his slurry accent", and Lawrence Henry misread slurry as a name rather than a description.

[Update -- Ben Zimmer writes:

I haven't found any references to a "slurry" accent in a Dick Francis novel, but he frequently uses "sloppy" to describe a disfavored southeastern (UK) speech pattern:

To The Hilt, p. 19
I'm not good at voices and accents, but I'd say his was sloppy southeast England.

Trial Run, p. 29
They had a rough, sloppy way of speaking, swallowing all the consonants. Southern England. London or the Southeast, I should think, or Berkshire.

Twice Shy, p. 41
I listened to the utterly English sloppy accent and thought that it couldn't have less matched the body it came from.

And it's not like Francis is unfamiliar with "Estuary" as a dialectal descriptor:

Shattered, p. 22
Her accent was Estuary, Essex or Thames: take your pick.]

Map of South Asia

I came across this terrific map on Wikipedia.
It shows South Asia, with the names of the various countries, the states and union territories of India, and the provinces of Pakistan in the local language and writing system. The caption भारत
in the lower right says "India" in Hindi, but the map actually includes Sri Lanka, Pakistan, China, Tibet, Nepal, Bangladesh and Burma as well. [Click on the map for a larger version.]

In addition to being pretty, it's a nice reminder of some of the writing systems you (or at least I) have still to learn. Sinhala, in particular, continues to intimidate me. All those curlicues give me a headache.

Addendum 2006-11-27

Reader Vajra Chandrasekara points out that Sri Lanka is mis-spelled. It is written
ශ්රී ලංකාව,
which makes sense but is not the way the word is actually written, which is
ශ්‍රීලංකාව.
Stephen Carlson points out that the Chinese for "People's Republic of China" is in traditional characters, not the simplified characters in use in China today. For example, the last character, "country", would be 国 in simplified characters. I'm not sure why that is - the little bit of information on the creator's Wikimedia user page indicates that he is "from China". Anyhow, it's okay with me - I prefer traditional characters.

Greetings, comrade

This explains a lot of public discourse about language, at least in America:

You could start with Mark Twain's famous linguistic takedowns of Fenimore Cooper and Mary Baker Eddy, and work forward. The traditional British equivalent is mocking social inferiors who presume above their station -- or is that an unfair stereotype? Anyhow, both impulses together support the Bushisms business, whereby elitists can mock the powerful for being low-class.

November 26, 2006

Doublespeak and the War on Terror

A briefing paper entitled Doublespeak and the War on Terrorism
by Timothy Lynch of the Cato Institute seems to be getting
belated attention. It appeared in September, but this AP report
by Calvin Woodward came out today. The briefing paper addresses the
attempt of the Bush Administration
to make more palatable its violations of civil liberties by using doublespeak,
e.g. dubbing "warrants" "national security letters" in the hope
that the courts will be fooled into thinking that judicial oversight is not required,
or describing the suicide attempts of prisoners at Guantanamo Bay
(referred to by the Bush Administration as "detainees", as if they were witnesses to a
traffic accident asked by the police to remain until they could be interviewed)
as "self-injurious behavior incidents".

The AP article adds a few examples from other areas, such as the use of
"food insecurity" rather than "hunger", and "redeployment" rather than "retreat"
in reference to Iraq. It also suggests that advocates of abortion rights
speak of "choice" because "abortion" sounds unpleasant. It may be true
that abortion rights advocates prefer to avoid the term "abortion", but
I think there's more to it than that. Describing one's movement
as "pro-abortion" suggests that one actually favors
abortion, that is, considers that abortions are a fine thing. Few if any advocates
of abortion rights take such a position. Their position is, rather, that women
should have the right to have an abortion if they consider it the best choice:
"pro-choice" really is a more accurate description than "pro-abortion". In the
abortion debate, if one wants an example of the propagandistic use
of language, it is the use of the self-designation "pro-life" by opponents of
abortion rights. Opponents of abortion rights are not in general advocates
of a "pro-life" stance: many of them are quite sympathetic to military
activity and favor the death penalty, both of which are considered by many
others to be "anti-life" stances. And those who oppose abortion under any
circumstances, even when the life of the mother is threatened,
are not "pro-life" even in this narrow context. Rather, they take a position
that values the life of the foetus over that of the mother. So
"anti-abortion" is a much more accurate term than "pro-life".

If it bothers you that the doublespeak addressed in the Cato Institute
paper is all on the part of the right wing (that is, the authoritarian
branch - the Cato Institute is itself considered right wing, but it
represents the libertarian branch), a better example of left wing doublespeak
would be "diversity", a replacement for "affirmative action" meant to persuade
people that something different and better is intended.

Descriptivism in literature

While Mark was reporting on prescriptivism in Pynchon, I found a nice example of level-headed descriptivism in Philip Roth's I Married a Communist (not Roth's newest, but I've been catching up). It's on the first page of chapter 4:

It was like penetrating a foreign language and discovering that, despite the alienating exoticism of its sounds, the foreigners fluently speaking it are saying no more than what you've been hearing in English all your life.

Obviously, this is being used as a metaphor for something else; it's not as direct as Mark's example. Still, it's very clearly descriptive as opposed to prescriptive.

Please feel free to add more examples of this kind in the comments area.

Prescriptivism in literature

I've been reading Thomas Pynchon's new novel, "Against the Day", and found a dramatization of linguistic prescriptivism on the very first page:

"Oh, boy!" cried Darby Suckling, as he leaned over the lifelines to watch the national heartland deeply swung in a whirling blur of green far below, his tow-colored locks streaming in the wind past the gondola like a banner to leeward. [...] "I can't hardly wait!" he exclaimed.

"For which you have just earned five more demerits!", advised a stern voice close to his ear, as he was abruptly seized from behind and lifted clear of the lifelines. "Or shall we say ten? How many times," continued Lindsay Noseworth, second-in-command here and known for his impatience with all manifestations of the slack, "have you been warned, Suckling, against informality of speech?" With the deftness of long habit, he flipped Darby upside down, and held the flyweight lad dangling by the ankles out into empty space -- "terra firma" by now being easily half a mile below -- proceeding to lecture him on the many evils of looseness in one's expression, not least among them being the ease with which it may lead to profanity, and worse. As all the while, however, Darby was screaming in terror, it is doubtful how many of the useful sentiments actually found their mark.

Though I'm sure that there are many literary examples of prescriptivism -- several in Tom Sawyer and Huckleberry Finn alone, no doubt -- I can't actually call any to mind at the moment. Wait, I take that back, here's (a marginal) one from P.G. Wodehouse. Anyhow, send me your favorites, and I'll add them to this post.

[I thought that I remembered some prescription in Tom Sawyer or Huckleberry Finn, but I haven't turned it up. However, I did find this interesting passage from A Tramp Abroad, which presents the view that "bad grammar" is a natural cause for shame:

Animals talk to each other, of course. There can be no question about that; but I suppose there are very few people who can understand them. I never knew but one man who could. I knew he could, however, because he told me so himself. He was a middle-aged, simple-hearted miner who had lived in a lonely corner of California, among the woods and mountains, a good many years, and had studied the ways of his only neighbors, the beasts and the birds, until he believed he could accurately translate any remark which they made. This was Jim Baker. According to Jim Baker, some animals have only a limited education, and some use only simple words, and scarcely ever a comparison or a flowery figure; whereas, certain other animals have a large vocabulary, a fine command of language and a ready and fluent delivery; consequently these latter talk a great deal; they like it; they are so conscious of their talent, and they enjoy "showing off." Baker said, that after long and careful observation, he had come to the conclusion that the bluejays were the best talkers he had found among birds and beasts. Said he:

"There's more TO a bluejay than any other creature. He has got more moods, and more different kinds of feelings than other creatures; and, mind you, whatever a bluejay feels, he can put into language. And no mere commonplace language, either, but rattling, out-and-out book-talk--and bristling with metaphor, too--just bristling! And as for command of language--why YOU never see a bluejay get stuck for a word. No man ever did. They just boil out of him! And another thing: I've noticed a good deal, and there's no bird, or cow, or anything that uses as good grammar as a bluejay. You may say a cat uses good grammar. Well, a cat does--but you let a cat get excited once; you let a cat get to pulling fur with another cat on a shed, nights, and you'll hear grammar that will give you the lockjaw. Ignorant people think it's the NOISE which fighting cats make that is so aggravating, but it ain't so; it's the sickening grammar they use. Now I've never heard a jay use bad grammar but very seldom; and when they do, they are as ashamed as a human; they shut right down and leave.

On first reading this, I thought that "bad grammar" might be a euphemism for cussing, but I don't think it is, since Jim Baker goes on to explain that

Now, on top of all this, there's another thing; a jay can out-swear any gentleman in the mines. You think a cat can swear. Well, a cat can; but you give a bluejay a subject that calls for his reserve-powers, and where is your cat? Don't talk to ME--I know too much about this thing; in the one little particular of scolding--just good, clean, out-and-out scolding--a bluejay can lay over anything, human or divine.

]

[From Luke Gibbs:

Not necessarily literature, but my all-time favorite example of grammatical authoritarianism comes from the film "Life of Brian."

Of course, the centurion in Life of Brian is a great example, because he
messes up the rules he's enforcing. The locative is "domi" and it's used
for locations, not motion towards. The fact that "domus" takes the
locative is relevant, but it just means that accusative appears without a
preposition (much like "go home" in English, in fact).

And from Simon Cauchi:

You asked us to send you our favourites. Here are two of mine, the first relating to speech and the second to writing.

The poem "The Schoolmaster" (subtitle "abroad with his son"), by C. S. Calverley (1831-84), consists of eight stanzas. The fourth stanza goes like this:

The noise of those sheep-bells, how faint it
Sounds here -- (on account of our height)!
And this hillock itself -- who could paint it,
With its changes of shadows and light?
Is it not -- (never, Eddy, say "ain't it") --
A marvellous sight?

And there's also Calverley's "Forever", which I will copy out in full because all the online texts seem to be corrupt:

Forever; 'tis a single word!
Our rude forefathers deem'd it two:
Can you imagine so absurd
A view?

Forever! What abysms of woe
The word reveals, what frenzy, what
Despair! For ever (printed so)
Did not.

It looks, ah me! how trite and tame!
It fails to sadden or appal
Or solace -- it is not the same
At all.

O thou to whom it first occurr'd
To solder the disjoin'd, and dower
Thy native language with a word
Of power:

I had had good teachers. At prep school an English master called
Chris Coley had awoken my first love of poetry with lessons on Ted
Hughes, Thom Gunn, Charles Causley and Seamus Heaney. His predecessor,
Burchall, was more a Kipling-and-none-of-this-damned-poofery sort of
chap, indeed he actually straight-facedly taught U and Non-U
pronunciation and usage as part of lessons: 'A gentleman does not
pronounce Monday as Monday, but as Mundy. Yesterday is
yesterdi. The first 'e' of interesting is not sounded,' and so
on.

I remember boys would get terrible tongue lashings if he ever
overheard them using words like 'toilet' or 'serviette'. Even 'radio'
and 'mirror' were not to be borne. It had to be 'wireless' and 'glass'
or 'looking-glass'. Similarly we learned to say formidable, not
formidable, primarily not primarily and
circumstance not circumstance and never, for a second
would such horrors as circumstahntial or substahntial be
countenanced. I remember the monumentally amusing games that would go
on when a temporary matron called Mrs Amos kept trying to tell boys to
say 'pardon' or 'pardon me' after they had burped. The same spin
upper-middle-class families get into to this very day when Nanny
teaches the children words that Mummy doesn't think are quite the
thing.

'Manners! Say "pardon me".'

'But we're not allowed to, Matron.'

'Stuff and nonsense!'

It came to a head one breakfast. Naturally it was I who engineered
the moment. Burchall was sitting at the head of our table, Mrs Amos
just happened to be passing.

'Bre-e-eughk!' I belched.

'Say "pardon me", Fry.'

'You dare to use that disgusting phrase, Fry and I'll thrash you to
within an inch of your life,' said Burchall, not even looking up from
his Telegraph — pronounced, naturally, Tellygraff.

'I beg your pardon, Mr Burchall?'

'You can beg what you like, woman.'

'I am trying to instil,' said Mrs Amos, (and if you're an
Archers listener you will be able to use Linda Snell's voice
here for the proper effect, it saves me having to write 'A am traying
to instil' and all that), 'some manners into these boys. Manners
maketh man, you know.'

Burchall, who looked just like the 30s and 40s actor Roland Young —
same moustache, same eyes — put down his Tellygraff, glared at
Mrs Amos and then addressed the room in a booming voice. 'If any boy
here is ever told to say "Pardon me", "I beg your pardon", or heaven forfend, "I beg pardon", they are to say to the idiot who told
them to say it, "I refuse to lower myself to such depths, madam." Is
that understood?'

We nodded vigorously. Matron flounced out with a 'Well,
reelly!' and Burchall resumed his study of the racing column.

Some readers may need a refresher course in the mysteries of the whole U vs. non-U thing, which most Americans find roughly as familiar as the interpretation of West African scarification patterns. ]

[And how could I forget this previously posted passage from Wodehouse's Jeeves in the Offing:

Normally as genial a soul as ever broke biscuit, this aunt, when stirred, can become the haughtiest of grandes dames before whose wrath the stoutest quail, and she doesn't, like some, have to use a lorgnette to reduce the citizenry to pulp, she does it all with the naked eye. "Oh?" she said, "so you have decided to revise my guest list for me? You have the nerve, the--- the---"

I saw she needed helping out.

"Audacity," I said, throwing her the line.

"The audacity to dictate to me who I shall have in my house."

It should have been "whom," but I let it go.

"You have the---"

"Crust."

"---the immortal rind," she amended, and I had to admit it was stronger, "to tell me whom"---she got it right that time---"I may entertain at Brinkley Court and who"---wrong again---"I may not. Very well, if you feel unable to breathe the same air as my friends, you must please yourself. I believe the 'Bull and Bush' in Market Snodsbury is quite comfortable."

Cyber Monday vs. eDay

As countless media reports are informing us, tomorrow is "Cyber
Monday," the day that supposedly kicks off the online holiday shopping
season. The brazenly cynical coinage of "Cyber Monday" was recounted here
last year, when the masterminds at Shop.org saw "an opportunity to
create some consumer excitement" by anointing the Monday after
Thanksgiving with a new title modeled on "Black Friday." The idea was
to make "Cyber Monday" a kind of self-fulfilling prophecy, boosting
online sales on a day that had previously ranked as only the twelfth
busiest on the shopping calendar.

So how much did last year's "Cyber Monday" hype pay off? Depends who
you ask. According to press accounts
relying on statistics from Shop.org, the Monday after Thanksgiving was
the second biggest day for online retail sales in 2005. But as far as I
can tell by Shop.org's holiday
shopping report, all they can actually claim is that Cyber Monday
received the second-most votes in a survey asking retailers, "What day
during the 2005 holiday season represented the largest amount of
revenue from sales?" Market research from comScore suggests
that Cyber Monday was in fact the ninth
busiest online shopping day last year, with $485 million in
transactions. That paled in comparison to the real peak two weeks
later: Dec. 12, 2005 saw $556 million spent online.

Just to confuse matters further, a company called Coremetrics says that
the zenith of the online shopping season occurs not two weeks after
Cyber Monday but one week after, or December 4 on this year's calendar.
A Nov. 6 press
release from Coremetrics seeks to debunk the "marketing myth" of
Cyber Monday and introduces yet another neologism for what the company
believes will be the busiest online shopping day: "eDay." In this
battle of marketing coinages, "eDay" has certain advantages: the snappy
"e-" prefix is a bit more au courant
than "cyber-", William Gibson fans notwithstanding. (Really, when was
the last time you heard anyone refer to "cyberspace" unironically? It
sounds so Matrix-y and
Y2K-ish.) Plus, "eDay" has triumphal resonances with "V-Day" and
"D-Day."

But I wouldn't count on "eDay" gaining the neologistic upper hand over
"Cyber Monday." Media commentators have firmly latched on to the "Cyber
Monday" concept, even as they acknowledge that it isn't really the busiest online shopping
day of the season. Perhaps writing about Cyber Monday helps fill the
post-Thanksgiving lull in the news cycle, and it's an easy followup to
the boilerplate "Black Friday" shopping stories. I would also expect
online retailers to continue transforming Cyber Monday into a
legitimate shopping event by offering all sorts of sales and promotions
for the Monday after Thanksgiving. It could take another year or two,
but the self-fulfilling marketing prophecy of Cyber Monday might still
come to pass.

November 25, 2006

Dialect representation, resented

A couple of days ago, I commented
on the use of "unusual spelling intended to represent dialectal or
colloquial idiosyncrasies of speech" (from the OED's definition of "eye
dialect"), noting that this is likely to be understood as expressing
contempt. A case in point, from a letter in the NYT Book Review, 5/8/05, from Butch
Trucks (of the Allman Brothers Band), about a Rolling Stone story about the band
written by Grover Lewis:

In Lewis's article, all the dialogue
among members of our group seemed to be taken directly from
Faulkner. We are from the South. We did and still do have
Southern accents. We are not stupid. The people in the
article were creations of Grover Lewis. They did not exist in
reality.

(The letter went to the Times
because a review, by Roy Blount Jr., of Splendor in the Short Grass: The Grover
Lewis Reader, which reprints the RS story about the band on tour,
had appeared in the Book Review
on 4/3/05.)

The reference to Faulkner is surprising. If you go back and look
at your Faulkner, you'll see that he is sparing in his use of special
spellings of all types, including those representing ordinary casual
speech (goin' for going, wanna for want to) and those representing
dialect features (ma for my, brotha for brother). I suspect
that he NEVER uses unusual spellings for perfectly
ordinary pronunciations (enuff
for enough and the
like). He does indicate non-standard and dialectal features of
morphology, syntax, and the lexicon, though, as in this dialogue from
the black maid Dilsey early on in The
Sound and the Fury:

Aint you got no better sense than
that. What you want to listen to Roskus for, anyway.

That's quite enough to let us "hear" the characters in our head
and supply some version of the phonetics. For the classier white
characters, like the Compsons, we're pretty much on our own, though we
can be sure that their speech had regional features.

As for Grover Lewis, I'll have to get hold of the book to see just how
he represented the speech of the Allman Brothers and their
crew. What Trucks tells us in his letter isn't about
pronunciation specifically:

We had a road manager that was a
graduate of Georgia Tech and before coming with us had been a bank
auditor. He was an educated and sophisticated man. Mr. Lewis quotes him
as calling the desert as we flew over Arizona "a right smart of sand".
I worked with this man for many years and never did I hear him use a
phrase that even resembled this.

Lewis, by the way, was a Texan, complete with cowboy boots and a Texas
accent.

Morphemedar

Just a while ago I came across the word sarcasmdar somewhere on the net. It gave 75 Google hits at the time of the writing, which indicate its meaning is something along the lines of a device for detecting sarcasm. Most of the time people are using it when claiming theirs or someone else's is broken.

Perhaps there is even a snowclone in the making - my X-dar is broken. The relatively low number of ghits might indicate it is quite new, or maybe just unpopular. I couldn't think of any other -dars to look for yet. Oh wait, of course:

grammardar : 5 hits (duplicates)
"My grammardar just imploded"

Pekka's morphemedar is clearly in working order. There are plenty of other instances: jewdar, blackdar, sexdar, fishdar, etc. My guess is that there has been a low-frequency process of spontaneous neologism-formation going on here for some time. The fact that radar -- though originally coined as an acronym for "radio detection and ranging" -- can be re-analysed as ra(dio)+dar means that the new morpheme -dar probably sprung into fitful existence soon after radar came into general use. A few of the -dar neologisms -- notably gaydar so far -- have caught on and spread, which presumably somewhat increases the productivity of the background process.

[Barbara Zimmer writes:

I found a few references to humordar, including this one:
It be strongly advised that ye turn on your 'humordar' t'distinguish between reality an' fiction or fantasy. Me opinions are simply that-me opinions
at http://lesbianpiratequeen.wordpress.com/. ]

November 24, 2006

Mapuche is ours, not yours

Back in 2004, prompted by Bill Poser's
report of a lawsuit in which a relative of the person who coined the term googol was suing Google over a property claim on the word Google, I satirically
claimed personal ownership of the nouns crump, ether, parsley, helicopter, oligarchy, and rhodium, the preposition of, and all derivatives of the verb snuggle. I took it to be
self-evidently hilarious that anyone could claim ownership of some ordinary non-trademarked dictionary word, especially
on grounds of a family connection (and never mind the fact that Google and googol are not the same word). Now the
Mapuches
seek to claim ownership of their entire language, on the basis of a tribal
connection, and they regard
Microsoft's
localization of its software by translating
messages into Mapuche as theft of the Mapuche people's stuff. It really
is very hard for a satirist to keep out ahead of real life, isn't it?

A couple of correspondents have suggested to me and Mark
that the press reports
are crazier than reality; they claim
Spanish-language accounts of what is going
on reveal that the Mapuche people are objecting to pre-emption of decisions about which alphabetic writing system to adopt for their
language. Quite a few have been on offer, and the Microsoft decision
to go for one of them, known as Azmfuche, was taken without general agreement by the
Mapuche but will now be definitive. Published reports of the
situation include this one and this one, and one page that does discuss the
orthographic issue is here.
But my general impression is that even the Spanish sources
reveal plenty of fundamentally misguided political ranting on the
part of the Mapuches. A language is not something that could be
or should be controlled by a people or its political leadership, and
making software available in a certain writing system or language
is not a threat to, or a theft of, cultural patrimony. Not even if
it does encourage a tendency toward standardization in some
particular direction. So far I still see this story as having a tinge
of the ridiculous about it.
For some serious and informed discussion of the issue that takes
a more sympathetic view, see
this
post by Jane Simpson.

Thanks to Henry Heller and Luis Casillas for
informed correspondence about the Spanish sources.

Language as property?

It's hard to know what is really going on in Chile, where Reuters tells us that "Mapuche tribal leaders have accused [Microsoft] of violating their cultural and collective heritage by translating the software into Mapuzugun without their permission" ("Chilean Mapuches in language row with Microsoft", 11/23/2006). In particular, it's not clear from the article what the basis for the suit is, or what relief is being sought. The theory may be that a language is a piece of property belonging to (some representative body of) the people who speak it. If this idea were really to be accepted into the system governing the usual laws of property, I suspect that the consequences would surprise and displease many of those who start out supporting it. For some discussion, see "The Algonquian morpheme auction" (3/3/2004).

I haven't seen much about these issues within the "free culture" movement -- but there are some links here, for example this (more details here).
Here's a question: if the use of a language has to be licensed by the tribal elders, can they withhold this permission from someone who wants to criticize them, or to say something else that they don't approve of?

Slurry

... and what is this "slurry" accent Mr Henry claims some Londoners have? Does he mean Estuary?

I wondered about that myself. What Mr. Henry wrote (in "To Accent or No", The American Spectator, 11/22/2006) was this:

To the Pygmalion audience, a glottal "t" indicated a yob. Today's Brits have adopted it as part of a kind of commercial London speech known as "slurry."

The context and the quotation marks show that he thinks of slurry as a name for a type of speech, not a description. I've never heard any such term for a London-area dialect -- or any other variety of English -- though I certainly have heard of "Estuary English", and have even blogged about it ("Estuary English", 8/28/2004). The UCL Department of Phonetics and Linguistics has a web site devoted to Estuary English, which they define as "a name given to the form(s) of English widely spoken in and around London and, more generally, in the southeast of England — along the river Thames and its estuary".

Overall, what Mr. Henry has to say about "slurry" fits "Estuary English" pretty well, so I suspect that we're looking at a slip of the ear: Mr. Henry heard someone talk about "Estuary", and heard (or remembered) it as "slurry". I'm no kind of expert in the modern sociolinguistics of the British Isles, so I admit that I might be missing something here (though Google seems to be missing it too, as far as I can tell). If anyone can provide evidence that slurry has wider use as a term for London-area speech, please let me know.

The reason for Henry to bring up "slurry" in the first place was his unhappiness with the speech of a TV sports personality:

MY BUGABOO, HOWEVER, IS THE GLOTTAL "T." Around here, you hear it especially in the phrase "at home," which becomes "a' home." A certain class of English speaker, heard especially on the BBC, employs glottal "t's" in a self-conscious way, as a cultural signal of knowingness or savvy or in-crowdism. Listen to a BBC reporter. He will not always use the glottal "t," but will suddenly begin to employ it the more insinuating becomes his tone.

Newly anointed CBS golf anchor Nick Faldo uses more glottals the more clever he becomes, a shame, because he is in fact clever, but the glottals render him almost incomprehensible to an American audience. You're a broadcaster now, Nick. Time for some speech lessons.

To the Pygmalion audience, a glottal "t" indicated a yob. Today's Brits have adopted it as part of a kind of commercial London speech known as "slurry."

Henry's sociolinguistic observations about "a certain class of English speaker" may well be correct -- in an earlier post, I quoted Kate Joester's opinion that

I think there's probably another dimension in prejudice against Estuary English in particular. It's associated with "youth culture" and with being a fake accent acquired by speakers who are "really" something else in order to be youthful and cool.

However, Martyn Cornell goes on to argue that Henry has misconstrued Nick Faldo's accent:

Nick Faldo, who comes in for a kicking from Mr Henry over his alleged glottal stops, grew up in Welwyn Garden City, Hertfordshire, about 10 miles from where I grew up and where I once worked as a reporter on the local paper - I interviewed Nick just after his first victory in a major, and to me, naturally, he has a perfectly fine lower middle class Northern Home Counties accent not that different from my own.

There's something interestingly typical going on here. Henry's article displays an intense interest in matters of pronunciation and an obviously acute faculty of observation; but it also displays an almost complete ignorance of the concepts, skills and background knowledge that are relevant to the kinds of linguistic description that interest him. This combination of intense interest and spectacular ignorance is, I think, unique to the area of speech and language. You don't find birders obsessively compiling a life list that includes insects and bats under the misapprehension that these are also members of the taxonomic class Aves. You don't find photography enthusiasts who think that f-number is a measure of film speed, or that nikon is a noble gas used to prevent condensation inside lens assemblies.

If you're familiar with its history, you might argue that The American Spectator is a special case. (If you're not familiar with its history, read the Wikipedia entry or Byron York's Atlantic article from November, 2001.) But I don't think so. There's no particular political connection here.

[Update -- Stephen Jones wrote that "Where the guy got the word [slurry] from is beyond me. Slurry and Estuary don't even sound alike". I agree -- though some pronunciations of "estuary" do have all but one of the phonetic segments in "slurry" -- but what else could it be? The Cupertino effect seems even less likely than a mis-hearing, unless there is a possible typo that hasn't occurred to me.]

[John Cowan has another idea:

I shouldn't be surprised if the confusion is semantic. Googling
for "slurry estuary" (no quotes) shows that what's at the bottom
of most estuaries (probably including the Thames) is in fact
a slurry.

I hadn't encountered the term, but suspected an eye-pun (as 'twere) on the slurring of sounds, plus a play on the assonance of slurry (the manure-derived fertilizer) and the county Surrey. Searching for Slurrey as an alternative spelling consistent with this hypothesis initially threw up lots of derogative references to Surrey, BC, but some narrowing of the search terms threw up some more appropriate references. Inevitably, there were several other references to the English Surrey as Slurrey, such as one to 'Slutton in Surrey' (a reference to the Surrey town of Sutton). On a talkboard I saw something suggesting a cultural reference, too, which may be indirectly relevant:

everyone knows about the slurrey sluts. ... that was the mid-19th century, when Surrey had different prostitution laws than London proper. Nowadays it's a pretty nice place.

There's also a couple called the Slurreys on the British TV comedy 'Stella Street'; it wouldn't take much to find out if they have an appropriate accent, but I can't play YouTube videos on this computer so I can't check myself. Most illuminating of all, though, was this note on the schedule for a pub crawl on http://www.scienceninjateam.co.uk:

Green Dragon (near the top of Surrey Street market, opposite something called "The Ship"), Croydon, South London/Surrey (or "Slurrey").

Though I can't claim to have found much to suggest the term is exactly dominant, there's enough to say that it was probably more than merely a mishearing of 'Estuary' (a somewhat unlikely explanation anyway). The link to Surrey is more likely still if 'he has a perfectly fine lower middle class Northern Home Counties accent', of course -- this hypothesis would suggest the 'Slurrey accent' is not Estuary English at all.

When it comes to Martyn Cornell's objection to Lawrence Henry's analysis, it's worth bearing in mind that snobbishness is where you find it -- a 'lower middle class' accent is quite enough to qualify for derision among many, especially in the context of Middle England stereotypes: this is precisely the demographic stereotyped, for example, in the Dursley family in Harry Potter.

This is helpful background. But I don't think it's likely that Mr. Henry meant slurry to describe a traditional lower-middle-class accent different from Estuary English, since he describes it as "a kind of commercial London speech" that "today's Brits have adopted", as an artificial "cultural signal of knowingness or savvy or in-crowdism".

And John Wells, who ought to know if anyone does, wrote that he has "never heard of an accent called 'slurry'".

My conclusion? Mr. Henry might be referring to a clear and well-thought-through concept, precisely named by a term that the rest of us haven't learned yet; and then again, he might just be venting a few incoherent and ill-informed prejudices, and "slurry" might just be a careless blend of half-remembered words like estuary, Surrey and slang. We report, you decide.]

Tensor v. Takemoto

You should head on over to TstT, to read the Tensor's comments on Timothy Takemoto's suggestion that Japanese would make a good international language. Takemoto's arguments are really much funnier than anything that pundits in the Anglosphere have come up with recently.

November 23, 2006

Nevertheless

N.C. State fans don't take well to
losing to the hated Tar Heels, nevertheless 23-9 to a 1-9 UNC team
playing for a lame-duck coach.

This is nevertheless used as
a negative-form additive connector, like not to mention, never mind, or to say nothing of. John
Schaefer, who found the example, speculates (reasonably enough) that
it's a feature of speech that has found its way onto the net (though
it's not a use of nevertheless
I recall hearing before), and he wonders how you'd do a Google search
to find more occurrences. Good question.

[Addendum 11/24: Doug Wilson on ADS-L and Marilyn Martin by e-mail suggest that the target was probably "much less", a connective that follows negative clauses -- and ends in "less". Bingo.]

So, first, a query: has anyone noticed similar examples? Mail me
if you have actual cites.

[Addendum 11/24: John Schaefer's found another one: "Miami can continue to have both and help kids that may never have seen the campus of college, nevertheless a private school." (from "dg", 7/21/06, here).]

Second, some thoughts on searching (on Google or elsewhere) for what is
probably a very infrequent use of a very frequent word. One way
that might get the task down to something manageable would be to do it
in two steps: first, determine what material follows not to mention and to say nothing of with some
frequency (never mind might
not work, because it so often stands alone); and then, search for nevertheless followed by this
material.
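
For anyone who wants to try the two-step idea on a text collection of their own, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tiny corpus is invented (apart from two sentences adapted from the examples above), the function names are hypothetical, and a real search over web-scale text would need very different machinery.

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration only.
CORPUS = [
    "He can't boil an egg, not to mention a three-course dinner.",
    "She has no time for lunch, to say nothing of a vacation.",
    "They won't lend him a dollar, not to mention a car.",
    "Fans hate losing to rivals, nevertheless a 1-9 team.",
    "It rained all day; nevertheless, we went hiking.",
    "Kids who have never seen a college, nevertheless a private school.",
]

ANCHORS = ["not to mention", "to say nothing of"]


def frequent_followers(corpus, anchors, top_n=5):
    """Step 1: count the word that immediately follows each anchor phrase."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(a) for a in anchors) + r")\s+(\w+)",
        re.IGNORECASE,
    )
    counts = Counter()
    for sentence in corpus:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return [word for word, _ in counts.most_common(top_n)]


def candidate_hits(corpus, followers):
    """Step 2: keep sentences where 'nevertheless' directly precedes a frequent follower."""
    if not followers:
        return []
    pattern = re.compile(
        r"\bnevertheless\s+(?:" + "|".join(re.escape(f) for f in followers) + r")\b",
        re.IGNORECASE,
    )
    return [s for s in corpus if pattern.search(s)]


followers = frequent_followers(CORPUS, ANCHORS)
hits = candidate_hits(CORPUS, followers)
for sentence in hits:
    print(sentence)
```

The point of the second step is that the ordinary concessive use ("nevertheless, we went hiking") is usually followed by a comma rather than by the sort of material that follows not to mention, so it drops out of the candidate list. The candidates would still need to be checked by hand.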

This is the sort of strategy that the Stanford ALL Project used
recently in searching for instances of quotative all ("And she was all 'Were you in
the church?'") in a gigantic database of postings on newsgroups that
Google has accumulated: we refined the search by first determining the
most frequent words immediately following an initial quotation mark,
then used the top 40 words as part of a regular expression searching
for a personal pronoun plus contracted copula, followed by all, all like, like, say, or go. (The results of this
research were reported on at the recent NWAV conference at Ohio
State.) Even so, the project took a lot of time and depended on
considerable cooperation from colleagues at Google and on research
assistants supported by Stanford. Not a quick and easy task.
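
To give a rough idea of what such a refined search looks like, here is a small Python sketch. It is not the project's actual code: the postings are invented, the function names are hypothetical, and for simplicity the pattern covers only the markers all, all like, and like (not say or go).

```python
import re
from collections import Counter

# Hypothetical postings, invented for illustration only.
POSTS = [
    'He said "Well, that is odd" and left.',
    'She asked "What time is it?"',
    'Then he yelled "What a mess!"',
    "And she's all what are you doing here",
    "I'm like well that's just great",
    "They're all like no way",
    "She's all right now, thanks.",
]


def top_quote_openers(posts, top_n=40):
    """Step 1: most frequent words immediately after an opening quotation mark."""
    opener = re.compile(r'(?:^|\s)"(\w+)')
    counts = Counter(
        m.group(1).lower() for post in posts for m in opener.finditer(post)
    )
    return [word for word, _ in counts.most_common(top_n)]


def quotative_pattern(openers):
    """Step 2: pronoun + contracted copula + quotative marker + likely quote opener."""
    if not openers:
        raise ValueError("need at least one quote-opening word")
    pronoun_copula = r"(?:I'm|(?:you|we|they)'re|(?:he|she|it)'s)"
    marker = r"(?:all like|all|like)"
    words = "|".join(re.escape(w) for w in openers)
    return re.compile(
        r"\b" + pronoun_copula + r"\s+" + marker + r'\s*,?\s*"?(?:' + words + r")\b",
        re.IGNORECASE,
    )


openers = top_quote_openers(POSTS)
pattern = quotative_pattern(openers)
matches = [post for post in POSTS if pattern.search(post)]
for post in matches:
    print(post)
```

Requiring a likely quote-opening word after the marker is what screens out things like "She's all right now". The cost is that a quotation beginning with a rarer word ("They're all like no way") is missed, which is one reason the list of opening words has to be fairly long.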

A friend of mine's pet bear

Ben Zimmer, following up on the first of my recent postings
on possessives in English, writes about the phrase in the header,
which comes from the 11/22/06 "Mark Trail" cartoon, as critiqued on the
Comics Curmudgeon site.
A poster on that site refers to "the semantic nightmare of a sentence
coming out Mark's mouth in panel three": "You stole a friend of mine's
pet bear!"

Actually, I can't see anything to object to in this sentence, unless
you object in general to possessives of NPs (like a friend of mine) that don't end in
their head nouns, though these have been around for centuries and are
not hard to find in real life.

To see how we get to a friend of
mine's pet bear I'll have to describe with some care how the
determinative possessives of English NPs work. The details are
important.

There are several other items usually labeled as possessives which go
by the big generalization, rather than having their possessives
stipulated, for instance:

generic one: One should never count one's chickens before they're hatched.

anaphoric indefinite one: The big cat's tail is shorter than the small one's.

compound indefinite pronouns: We're collecting everybody's opinions.

Special case 2: NPs without a
possessive.

There are several classes of NPs that simply lack a possessive.
(Geoff Pullum and I talked about a number of these in a 1996 LSA paper,
the
handout for which is available on my website.) Some are
single words:

expletive there: *There's
being no food in the refrigerator upsets me.

infinitival clauses: For you to walk
me home really pleases me. *For you to
walk me home's really pleasing me is a surprise to everyone.

(These lists are merely illustrative, not exhaustive.)

The big generalization: Z.
Otherwise, the possessive form of a NP x has a Z suffix on the last word w
of x.

I've said this with some care. In particular, I did NOT
say that the possessive form of x
uses the possessive form of its last word w, since that wouldn't provide
possessives for NPs that end in words that are not nouns (like the friend I was telling you about or everyone I know), since such words
of course do not have possessive forms.

And now we make a prediction about the determinative possessive
corresponding to the pet bear of a
friend of mine (using an alternative expression of possession
which has the preposition of):
it should just follow the big generalization: a friend
of mine's pet bear, with Z suffixed to the last word, mine, of the possessor phrase a friend of mine. That's
where we started.

Just to wrap up this description, here's how Z is realized:

Realization of Z. The
possessive Z is suppressed if w
itself ends in a Z suffix (the birds'
wings). Otherwise, possessive Z has the same phonology as
plural Z and 3sg present Z:

the basic variant is z (bird's, Chicago's);

for a word ending in a sibilant, epenthesize schwa between it and the z
(Max's, judge's);

otherwise, for a word ending in a voiceless consonant, devoice the z (cat's, Rick's).
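For readers who think better in code, the realization rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the simplified phoneme sets, and the representation of a word as a list of phoneme symbols plus a flag saying whether it already ends in a Z suffix are all hypothetical, introduced here for the example.

```python
# Hypothetical sketch of the "Realization of Z" rules described above.
# The phoneme inventories are simplified assumptions, not a full analysis.

SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def realize_possessive_z(phonemes, ends_in_z_suffix=False):
    """Return the pronounced shape of possessive Z on the word w.

    phonemes: list of phoneme symbols for w, the last word of the NP.
    ends_in_z_suffix: True if w itself already ends in a Z suffix
        (for example the plural "birds").
    """
    if ends_in_z_suffix:
        return ""    # suppressed: the birds' wings
    final = phonemes[-1]
    if final in SIBILANTS:
        return "əz"  # schwa epenthesized: Max's, judge's
    if final in VOICELESS:
        return "s"   # devoiced: cat's, Rick's
    return "z"       # basic variant: bird's, Chicago's


# Rough transcriptions, for illustration only.
print(realize_possessive_z(["b", "ɝ", "d"]))         # bird's
print(realize_possessive_z(["m", "æ", "k", "s"]))    # Max's
print(realize_possessive_z(["k", "æ", "t"]))         # cat's
print(realize_possessive_z(["b", "ɝ", "d", "z"], ends_in_z_suffix=True))  # birds'
```

The sibilant check comes before the voiceless check on purpose, mirroring the "otherwise" ordering of the rules: a word like Max ends in a consonant that is both sibilant and voiceless, and the sibilant rule wins.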

The careful reader will have noticed that I haven't said anything about
the personal pronoun it.
I'm saving that for a future posting.

Eye dialect

In posting
about Lawrence Henry's American
Spectator column on accents, Mark Liberman refers in passing to
Henry's "eye dialect", is challenged on this by Daniel Ezra Johnson,
and defends his use of the term by saying:

Well, the OED glosses "eye dialect" as
"unusual spelling intended to represent dialectal or colloquial
idiosyncrasies of speech", which seems close enough in this case.

The problem here is that there are two distinct but related concepts,
and we have only one widely used term to label them.

One concept is the OED's: a representation of dialect (or colloquial)
pronunciations via unusual spellings. It would certainly be
useful to have a term for this, and "eye dialect" is a nearly
transparent candidate for the purpose.

But there's another tradition, in which the term is used for unusual
spellings for perfectly ordinary pronunciations, functioning to suggest that
the speaker is uneducated or crude -- the sort of person who would
spell the words that way. AHD4's definition links the two (but
gives examples only of the second):

The use of nonstandard spellings, such
as enuff for enough or wuz for was, to indicate that the speaker
is uneducated or using colloquial, dialectal, or nonstandard speech.

Using eye dialect (in the first sense) is a tricky business; no matter
what the writer's intent (which might be just to provide local color),
it's likely to be understood as expressing contempt, and in any case
readers often find it tiresome. Writers would be well advised to
use it sparingly.

Using eye dialect (in the second sense) is pretty much by definition a
put-down.

I've always used "eye dialect" in the second sense, so I'd suggest
"dialect spelling" for the first sense. But then who's going to
listen to ME?

Obligatory adjectives and optional articles?

The sharp-eyed Language Log reader will have noticed that Arnold
Zwicky's latest post begins with the phrase the
sharp-eyed Éamonn McManus. Now, it is well known that
proper names of people usually don't take definite articles, allowing for some quite
rare exceptions (the Donald for Donald Trump; the Bill Clinton
of 1992 for a temporal stage of Bill Clinton's life history; etc.).
Arnold certainly could not have begun his post by saying *The Éamonn McManus noticed a gap in the
list. Yet sharp-eyed appears to be just an ordinary
adjective in attributive modifier function, as in simple Simon,
poor Aunt Beth, lucky Pierre, good old John,
fearless Evel Knievel, sweet Georgia Brown,
Calvin Trillin's locution
the wily
and parsimonious Victor S. Navasky, and so on;
and these are always optional: drop an attributive adjective and what's
left is always a grammatical noun phrase. Yet dropping the adjective
from the sharp-eyed Éamonn
McManus does not leave behind a grammatical noun phrase. It
produces something utterly unacceptable. So are attributive adjectives
optional or not? How do we give an accurate description of what's going
on here?

Don't stand there looking at me. I don't know. Syntax is hard, and
people who think everything about English syntax is known already have no
idea of the actual ignorance-riddled state of the art.

It's worse than I said, actually. Dropping the definite article from
a singular noun phrase is normally impossible unless the noun can be
construed as denoting some uncountable substance or stuff: in normal
conversational English (I ignore newspaper headlines) we cannot drop
the definite article in something like Yesterday the vice president
flew to Iraq to get *Yesterday vice president flew to Iraq.
I couldn't have begun this post by saying *Sharp-eyed Language Log reader will have noticed... Yet in the
sharp-eyed Éamonn McManus we can drop the definite
article: Arnold could have begun by saying Sharp-eyed
Éamonn McManus noticed a gap in the list. So is the
definite article obligatory with singular non-mass nouns
or not?

I repeat: don't look at me. I don't know.
I have only a few short decades of experience with this
extremely difficult subject.

Thanks to Paul Postal for pointing out I needed to
distinguish proper names of people from other proper names.
Many proper names (like "the Mississippi") not only permit the
definite article, they require it.

Let's meet at mine

The sharp-eyed Éamonn McManus noticed a gap in the list of
independent possessive constructions in my recent "Overpossessive" posting:
I illustrated the anaphoric zero, predicative, and double genitive
constructions with both pronominal and non-pronominal possessives (mine, Sandy's), but gave only a
non-pronominal illustration for the locative construction: Let's meet at Sandy's. This
was not an oversight -- I find Let's
meet at mine unacceptable in a context where there's no
antecedent for the missing head, and most other English speakers make
the same judgment -- but now McManus provides some attestations of
pronominal locative possessives (from the U.K. and Ireland), which
suggests that some speakers are beginning to simplify their grammars by
eliminating an odd constraint on one specific construction.

Which reminded me of Baker's Paradox: although learners
generalize ("project") from the language they hear, producing many
utterances that are not directly modeled for them, in some cases they
resist obvious generalizations and seem to conclude that things they
haven't heard just aren't grammatical; they learn lexical exceptions
and very specific constraints.

The paradox gets its name from C. L. Baker, author of the 1979 Linguistic Inquiry paper "Syntactic
theory and the projection problem", in which the issue (for lexical
exceptions) was clearly presented. (Baker was the first student
to write a Ph.D. dissertation under my direction, back in the
Pleistocene Epoch, so it pleases me to refer to his work here.)
More recently, Peter Culicover's book Syntactic
Nuts (1999) examined a series of puzzling constructions in
English, using (apparently) arbitrary differences in syntactic behavior
between lexical items to conclude that learning must be, among other
things, "conservative". And now, for her dissertation, Stanford
student Liz Coppock is looking at the cases from this literature, plus
some others, so Baker's Paradox has been very much on my mind.

The question is how people learn things like the following:

You can give $100 to the library, give the library $100, or donate $100 to the library, but not
*donate the library $100.

You can be the likely winner,
be likely to win, or be the probable winner, but not *be probable to win.

You can be happy, be a happy person, or be glad, but not *be a glad person.

(Searching on the net will get you small numbers of examples like the
asterisked ones above. While most people are conservative
learners, a few are more adventurous.)

On to syntactic constructions, a world in which construction-specific
(though systematic) constraints are rife. A couple of well-known
examples from English:

In main-clause wh-interrogatives,
prepositions can be stranded or (in a rather formal style) fronted, but
in wh-interrogative complement clauses, fronted prepositions are much
less acceptable:

Which city did they fly from?
[main, stranded]
From which city did they fly? [main, fronted]
I wonder which city they flew from. [embedded, stranded]
?? I wonder from which city they flew. [embedded, fronted]

In two serial-verb-like constructions
-- which I'll call GoV and TryAndV -- for most speakers the verbs must
obey the Inflection Condition of Pullum 1990 ("Constraints on
intransitive quasi-serial verb constructions in modern colloquial
English", in OSU WPL 39.218-39), which requires that they be in a form
identical to their base form (either the base form itself, or the
non-3sg present); in other similar constructions, in particular GoAndV,
there is no constraint:

I'll go and see what I can do. [GoAndV,
base]
I'll go see what I can do. [GoV, base]
I'll try and see what I can do. [TryAndV, base]
I always go and see what I can do. [GoAndV, 1sg pres]
I always go see what I can do. [GoV, 1sg pres]
I always try and see what I can do. [TryAndV, 1sg pres]
He always goes and sees what he can do. [GoAndV, 3sg pres]
*He always goes see(s) what he can do. [GoV, 3sg pres]
*He always tries and see(s) what he can do. [TryAndV, 3sg pres]
I went and saw what I could do. [GoAndV, 1sg past]
*I went see/saw what I could do. [GoV, 1sg past]
*I tried and see/saw what I could do. [TryAndV, 1sg past]
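The pattern in these judgments can be summarized in a small sketch. The function name and the encoding of each example as a list of (form, base form) pairs are hypothetical, chosen just for illustration; the only substantive claim is the one already made above, that GoV and TryAndV require each verb to appear in a form identical to its base form, while GoAndV has no such requirement. The sketch checks the Inflection Condition only and says nothing about other requirements such as agreement.

```python
# Constructions subject to the Inflection Condition, per the text above.
CONSTRAINED = {"GoV", "TryAndV"}


def obeys_inflection_condition(construction, verbs):
    """Check the Inflection Condition for one example.

    construction: "GoV", "TryAndV", or "GoAndV".
    verbs: list of (form_used, base_form) pairs for the verbs involved.
    """
    if construction not in CONSTRAINED:
        return True  # GoAndV: no constraint of this kind
    # Every verb must be in a form identical to its base form.
    return all(form == base for form, base in verbs)


# I always go see what I can do. [GoV, 1sg pres]
print(obeys_inflection_condition("GoV", [("go", "go"), ("see", "see")]))
# *He always goes see what he can do. [GoV, 3sg pres]
print(obeys_inflection_condition("GoV", [("goes", "go"), ("see", "see")]))
# I went and saw what I could do. [GoAndV, 1sg past]
print(obeys_inflection_condition("GoAndV", [("went", "go"), ("saw", "see")]))
```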

In all such cases, the puzzle is why so few people generalize, why so
few eliminate the wrinkles in the grammar. Locative possessives
provide yet another instance of the puzzle: the other three types of
independent possessives are unconstrained, but the locative
construction maintains its constraint against personal pronouns.

Until recently, that is. In McManus's words:

Up until recently I would have assumed
that nobody would ever say Let's
meet at mine/yours/hers/ours/theirs but in fact it seems to be
current usage in England and spreading to Ireland. I'm pretty sure
nobody ever used that construction when I was growing up in Dublin, but
my slightly-younger brother now uses it all the time. Google finds a
few hits for Let's meet at mine,
including [here],
where it's the title of the page and the product it's selling. There
are very few hits, though (none at all for ours, theirs, hers and only one possibly
non-native one for yours),
for what should be a very common phrase in chat forums so it may be
that it's not yet all that widespread. But I'd bet on it spreading
further because it's handy, immediately comprehensible, and logical.

Getting rid of the let's
pulls in a modest number of examples, most of which seem to be from
British sources, for example:

Looking forward to seeing the others I
agreed to meet at theirs at 8pm. (link)

(this from a blog full of British English features). Searches
varying other parts of the search string will no doubt yield many more
examples. The independent possessive pronouns are on the march!

[Addendum: two others have written to confirm that this usage is widespread in speech in the U.K. and also in Australia and New Zealand. Both were under the impression that it spread fairly recently.]

Why Americans can't learn foreign languages

What struck me most of all about
Lawrence Henry's piece on accents
was something Mark
didn't even mention.
Mr Henry notes that in American English
a totally unstressed vowel is reduced to a sound usually written
down as "uh" (the sound linguists call schwa); and he
goes on:

An accurate enough phonetic observation:
the first syllable in these words
is pronounced with a schwa, whereas many other languages
have no schwas at all, in any words. My horse laugh at the
quoted remark comes not from this phonetic fact but from
the astoundingly dopey idea that it is a "fault" that provides
the key to the riddle of why Americans don't do so well at
learning foreign languages.

Steve Jones points out to me, for example, that western
varieties of Catalan do not have schwa, but in Central Catalan
(of Barcelona) there is reduction that makes schwa the
most frequent vowel in actual speech; yet this doesn't
correlate with any perceptible difference in
language-learning ability between
Catalan speakers from different regions of eastern Spain.
Henry's remark about
how vowel reduction to schwa "accounts for our relatively poor
performance" really is astoundingly dumb.

Why we Americans, with our staggering wealth of resources
and (for example) the most highly ranked graduate schools
in the world, do so poorly by any measure on our command of
foreign tongues is a complex question with a mainly
sociological, political, historical, educational, and
social-psychological answer. (Never forget that
John Kerry is said to have had to attempt
concealment of his fluent French to avoid bad press during his
Presidential run, and
Nebraska in the early 1920s had a law making foreign
language instruction illegal, and in that very same state as recently
as 2003 a father was
threatened
by a judge with
loss of the right to visit his child
if he didn't speak English during his visits... This
country could not exactly be said to be uniformly friendly toward
polyglotism. Nor does it always honor the
accomplishment of those immigrants and Native Americans
who speak a heritage language
at home and English elsewhere — in fact punishment of
Native American children for speaking their Amerindian
language while in school used to be commonplace.)
It's certainly quite a bit more complex than anything traceable to
the reduction of unstressed vowels to
schwa. Don't give up on taking foreign language lessons simply on
the grounds that as an American you are doomed to failure by
your learned vowel reduction habits.

This morning, as I counted my blessings, public and private, I thought about how many of them are transformed curses, and gave special thanks for all that blogging has done for me in this respect. For me as an individual linguist, it can only be frustrating and depressing to observe the conjunction of intense public interest and unprecedented public ignorance with respect to matters of speech and language. But as a writer for Language Log, I can join H. L. Mencken in viewing this as a "daily panorama ... of private and communal folly ...so inordinately gross and preposterous, so perfectly brought up to the highest conceivable amperage, so steadily enriched with an almost fabulous daring and originality, that only the man who was born with a petrified diaphragm can fail to laugh himself to sleep every night, and to awake every morning with all the eager, unflagging expectation of a Sunday-school superintendent touring the Paris peep-shows".

In other words, it's not just another sad example of our educational system failing to provide an intellectual with the tools needed for the job -- no, it's a topic for a Language Log post!

Today's example is provided by Lawrence Henry ("To Accent or No", The American Spectator, 11/22/2006). Christopher S. Mackay brought this article to Geoff Pullum's attention, and Geoff mentioned it yesterday in the break room at Language Log Plaza, observing that it's "a feast of layperson's efforts to talk about phonetics without having the phonetics", and that it "comes out with some very strange claims about accents and languages and sociolinguistics". One of our younger staffers, who has not yet entirely mastered Mencken's technique, remarked that "crap like that just makes my head hurt". But I agree with Pullum: it's a virtual Thanksgiving feast.

The first dish is Mr. Henry's version of the common opinion that an "accent" is what everyone else has:

Cursed with acute hearing, I have bequeathed my boy Bud unaccented speech. Bud talks…well, like Brian Williams. How did I do that? By making fun of local locutions and teaching Bud to hear.

If you look up accent in the dictionary, ignoring the stuff about stresses and diacritics, you'll find glosses like "a characteristic pronunciation, especially one determined by the regional or social background of the speaker"; "a way of speaking typical of a particular group of people and especially of the natives or residents of a region"; "a way of pronouncing words that indicates the place of origin or social background of the speaker"; "the mode of utterance peculiar to an individual, locality, or nation".

In that sense, Brian Williams has an accent, just like Tom and Ray Magliozzi do. At least, that would certainly be the opinion of a resident of London, Melbourne or Cape Town. But Mr. Henry feels that Eastern Massachusetts pronunciation is a deviation from a neutral norm:

This has cost Bud in the court of peer opinion. His confreres at school seem to regard him as a snob for correct speech.

In fact, I bet they say that poor Bud has a "snooty accent". Or maybe they use some other adjective -- but I bet they don't say "ain't it odd how Bud has no accent?" Bud's dad continues:

Massachusetts is like that. If we lived in Texas, would I have equally mocked the local tendency to say "awl" for "oil"? Something in New England speech grates me wrong, and has made me a stickler for diction.

What kind of speech grates Mr. Henry right? Well, he tells us in his last paragraph:

I would rather my boys talked like Bobby Jones than Archie Bunker. If I could choose an accent for my own, which I no longer can, I would talk like golf announcer and former Amateur champ Steve Melnyk, like Jones, a Georgian. But I strongly suspect that, like me, over time, my boys will end up talking without any real accent at all. My son Bud has noticed that his classmates' accents are less pronounced than their parents'. Absent some temporary fad, like slurry or Valley Girl, that is the established trend. I am really not sure if that is to be mourned or rejoiced.

In fact, there's some controversy about what the "established trend" is. Perhaps some social strata are becoming more homogenized -- the youth of (say) Andover MA and Alpharetta GA may be more similar in their speech than their parents are, I guess -- but in other cases, there's evidence that some regional and social dialects in America are diverging. In any case, even if all Americans ended up speaking in exactly the same way, this would not be "speaking without any real accent at all", no matter how plain and flat the participants in this unlikely confluence felt the results to be. It would still be the characteristic pronunciation of a particular class, place and time, even if the class, place and time were "all native speakers of American English", "all of the United States", and "the middle of the 21st century".

The second dish in this feast is Mr. Henry's presentation of the Law of Least Effort, prepared in a delicately-flavored reduction of the notion that standard speech is also the most highly optimized, and garnished with sprigs of eye-dialect:

Many of the characteristics of regional accents are very labor-intensive. Speech usually elides toward the easy. It is much easier to say "and" than the tortured New England "ee-und," much easier to say "ahn" than "oh-wahn" ("on"). Why do these pronunciations persist?

Now, we know that Mr. Henry knows that eastern New England speech is r-less, because he mentions it in the context of an interesting discussion of dialect ideology:

I overheard a girl from Charlestown, who was taking a speech class, say that she had a hard time saying the terminal "r" in "brother" or "sister," instead of her accustomed "brothuh" or "sistuh." "It sounds unfriendly," she objected.

To my ears, au contraire, Eastern accents sound thuggish, threatening, and aggressive. TV and radio commercial producers use those accents to suggest savvy, but usually in a working class character, like a plumber. My wife finds Southern accents threatening, in a macho sort of way. In commercials, those cultural markers, Southern accents signify much the same thing as the working class Easterner: savvy about something nitty-gritty, like motor oil.

But curiously, it doesn't occur to him to wonder why Brian Williams doesn't drop all those complicated final-r-related lingual contortions, in favor of the New Englanders' simpler and much less labor-intensive schwa. And why do "accentless" Americans insist on all that back-to-front and low-to-high tongue motion in words like "hi" and "bye", instead of the restful, open monophthongs of Southern States English?

For dessert, you won't be able to resist at least a taste of Mr. Henry's verbs. His accent may be American standard, but his use of verbs is distinctly innovative. For instance, he'll take a verb that usually comes with a prepositional complement, and use it as a plain transitive. His last sentence, for example -- "I am really not sure if that [trend] is to be mourned or rejoiced" -- implies that it's possible to rejoice a trend -- in this case, the alleged trend towards phonetic homogenization -- rather than to rejoice at a trend, or rejoice because of a trend. And as we noted earlier, he says that New England speech "grates me wrong". This seems to be a blend of "grates on me" and "rubs me wrong", but whatever the source, it creates a distinctly non-standard relationship between the grater and the writer.

And now, it's time to turn from these linguistic delicacies to preparations for the physical feast.

[Daniel Ezra Johnson writes:

i know you've moved on to gustatory pursuits today, but i thought i'd note that the 'phonetic' spellings in henry's piece were a) surprisingly on-the-money, as i hear eastern massachusetts speech, and b) not at all fairly called 'eye-dialect', as i understand that term.

Well, the OED glosses "eye dialect" as "unusual spelling intended to represent dialectal or colloquial idiosyncrasies of speech", which seems close enough in this case. And I agree that Mr. Henry does a creditable job of representing pronunciations, whatever you call the method he uses.]

[From Peter Howard:

Your recent post reminded me of a conversation I had with an audience member after a Joy of Six poetry performance in New York. At the time, one of our number was a San Francisco native; the rest of us were from various parts of England. I was asked, "Is Wayne an American?" and I confirmed that he was. "I thought he must be." came the reply. "He's the only one of you who doesn't have an accent."

]

[And another amusing anecdote from Jay Cummings:

This article reminds me of the time I was in Brookhaven, NY, along
with 3 of my colleagues. Two were a German Jew and a middle class
Englishman, both of whom had lived long in the US, but strongly
maintained (to my ears at least) their native accents. The other was
a Texan, similarly unchanged in accent despite having lived in
southern California for many years. And then there was me, a
Minnesotan descendant of Swedes, Norwegians, Germans and English.

We were at a restaurant in town that featured a number of Greek dishes
on the menu, and an obviously native Long Island staff. The waitress
came to take our orders, and after I chose my entree, she asked me if
I would like a Greek salad or a tourist salad with the meal. I did not
know what a tourist salad was, but I didn't really like Greek olives
and feta cheese, so I asked for the tourist salad, and she wrote this
on her pad without comment. She left for the kitchen.

We looked at each other, and the Texan asked me what a tourist salad
was. I replied I didn't know, and none of the rest of us had any idea
either. Then a short time later, it dawned on me, and I laughed aloud,
"Oh, she meant a _tossed_ salad!" We all chuckled a bit, and the
waitress returned with our beverages.

To explain our laughter, I mentioned that we had not understood her
accent, and had just figured it out. With great amazement she stared
at us and said, with perfect justification I think, "Youse gennelmin
think _Oi_ have an accint?"

]

A Lawsuit over a Dictionary Entry

Back in October, on a radio show, pundit Norman Spector, former chief of staff to Conservative Prime Minister Brian Mulroney, referred to Liberal Member of Parliament Belinda Stronach, a former Conservative who crossed the floor a year ago, as a bitch (audio here; text here).
This caused a bit of a furor. Spector defended his use of the word by saying that he had used the term correctly in the sense of "a treacherous or malicious woman", for which he relied on the Oxford English Dictionary.

In her column of November 18th, Vancouver Sun columnist Daphne Bramham wrote:

the former adviser and confidante to both a prime minister and a premier sanctimoniously tried to bluster his way out of it, claiming that he was using an arcane definition from the Oxford Dictionary meaning treacherous behaviour. I've not been able to find it in any of the versions of Oxford I've consulted.

Spector is now suing Bramham along with the owner, publisher, and editor-in-chief of the Sun for libel. Whether he will win is unclear for a number of reasons, including the fact that Bramham never explicitly said that Spector made up the definition but only suggested it by innuendo, but the facts at least are on Spector's side. In the online version of the OED after sense 1a "female of the dog" and 1b "female of the fox, wolf and occasionally of other beasts", we read:

2. a. Applied opprobriously to a woman; strictly, a lewd or sensual woman. Not now in decent use; but formerly common in literature. In mod. use, esp. a malicious or treacherous woman; of things: something outstandingly difficult or unpleasant.

Nor do I see that there is anything arcane about this sense. I'd say that it describes pretty well what the word means to me. The sense that I would call arcane is 2c "A primitive form of lamp used in Alaska and Canada", with which I was unfamiliar.

The really funny thing here is that they are fighting about whether Spector's use of the term was semantically correct when you'd think that the issue would be whether it was chivalrous.

November 22, 2006

Final-vowel thankfulness

On the feminist blog Echidne
of the Snakes (via Wonkette),
a guest-blogger using the
handle "olvlzl" suggests an ethnopoliticolinguistic "reason to
be thankful" this Thanksgiving:

In our pride at having Democrats name the first
woman as Speaker of the House we have forgotten two interesting and
telling facts, Nancy Pelosi is the first person with a name ending in a vowel to be
Speaker of the House. She has also risen higher in power than anyone
else with a name ending in a vowel in the history of the country.

Commenters
tried to figure out exactly what counts as a vowel to "olvlzl" (who
could use an extra vowel or two herself). Her definition of "vowel"
isn't strictly
orthographic, since she discounts surnames ending in silent "-e" like those of
Presidents Coolidge, Fillmore, and Pierce. (Don't even mention Monroe!) And it's not strictly phonological, since "-y" doesn't seem to count when it's pronounced as /i/ (President John Kennedy, Speaker Tom Foley) or as part of a
diphthong like /eɪ/ (Speaker Henry Clay, Chief Justice John Jay). But
the blogger's point isn't really about vowels per se, but about
Pelosi's Italian descent:

While we are looking at the facts of her gender and her party
affiliation to
explain her utter rejection by the Washington DC Establishment and the
Republican media we shouldn't forget this fact could count for a lot of
the snooty snark. We shouldn't forget that for people with a heritage
from the Mediterranean basin, and elsewhere, she also represents a
great leap forward.

This isn't the first time we've seen "person whose name ends in a
vowel" used as code for "person of Italian (or southern European)
descent." It came up a
year ago when Samuel Alito was nominated for the Supreme Court. At
the time, Matthew Continetti of the Weekly
Standard took it as "a point of ethnic pride" that he had a vowel
at the end of his name, just like Scalia and Alito (whose names were
being fused into the derogatory nickname "Scalito").
As Eric Bakovic noted on phonoloblog,
"Continetti's point seems to be that having a vowel at the end of your
(last) name more or less identifies you(r name) as being of Italian (or
at least 'ethnic') descent." As with the comment on Pelosi, "final vowels" are really ethnic markers masquerading as (folk-)phonological units. Linguists needn't concern themselves with definitional niceties in such cases... and for that we can be thankful.

[Update #1: Seth Finkelstein points out that "name ending in a vowel" as shorthand for Italianness is an old trope in American discourse on ethnicity, with Google News Archive turning up examples from the mid-'80s relating to such figures as Mario Cuomo and Geraldine Ferraro. Here's the earliest example I've found on the Proquest archive:

New York Times, Apr. 14, 1967, p. 23
Judge Di Lorenzo explained that his organization [sc. the American Italian Anti-Defamation League] was attempting to stop the press and television from using the word "Mafia" in crime stories and to abolish the stereotype criminal in movies: "He is always dark-complexioned, and his last name always ends in a vowel," the judge said.
]

[Update #2: John Kroll writes:

I can't argue with the online cites that clearly limit "name ends with a vowel" to Italians or at least southern Europeans. But I've used it and heard others use it much more generally.
Although my name doesn't -- my Polish ancestors even tacked on an extra consonant when they came to America -- I've got plenty of ends-with-a-vowel cousins and I've used "ends with a vowel" to distinguish between people of
English/Irish/Scotch/German descent and, well, pretty much everyone else -- or, at the least, almost all other European nationalities.
For me and the Poles and Italians I grew up with, "ends with a vowel" distinguished those of us whose ancestors largely arrived in the mass
immigration of the late 1800s and early 1900s, by which time the early birds had locked up the power, good jobs and good neighborhoods; and from blacks, the only group we were aware of that was clearly far worse off.
Of late, I think my sense of it has even expanded to include all Asians and Latinos, in a broader sense of being those people who fall somewhere on the spectrum of American tolerance between the Mayflower offspring and the
descendants of slaves. I can find at least one online backup for that: "As for Henry Bonilla, who will be introducing himself around Dallas next week, the GOP may be attracted by the fact that his last name ends with a vowel." (link) ]

Bird elevated

Hot news from the Association for Computational Linguistics: Steven
Bird (of this parish) has been elected vice-president/president-elect (a pair of positions the association telescopes as "vice-president-elect")
of the ACL, effective January 1, 2007. There will be the usual
manic elevation ceremony at Language Log Plaza, date and time to be
announced.

Mixing idioms

A behind-the-envelope calculation illustrates why it makes sense for
Microsoft to risk irking techies with its piracy battle.

Not quite clear whether Larry meant behind the veil, behind the
curve, back of the envelope, pushing the envelope, back of the curve,
behind the woodshed, back of the veil, pushing the veil, back of the
woodshed, pushing the woodshed, or pushing the curve, is it?

A Google check suggests that Larry may be the only person ever to
have used the phrase "behind the envelope calculation" in the history of
the world. If I had found even one other occurrence, then under the
OICTIQ
principle I might have considered the possibility that we have a new
idiom emerging here; but I think not. I'd say we're simply looking at a
one-off mistake due to a confusion between two idioms. A sort of phrasal
malapropism. And if you are surprised that a linguist would think a native
speaker can make a mistake about the use of his own language,
you
shouldn't be.

Although, of course, we shouldn't forget that this is the kind of error
that can sometimes act as a little seed from which a legitimate linguistic
change might one day grow.

[Update: Dave Errington has made the very sensible suggestion
that Dignan might have taken "back of the envelope" to relate to the
phrase "in back of the envelope", meaning "behind the envelope"
(as opposed to "on the back of the envelope"), and thus replaced
the former by the latter either as a confusion or because he saw
the two as synonymous. And John Cowan suggests that there might
even have been an editorial intrusion here — a general
substitution of behind for (in)
back of, carelessly over-applied to a case that
meant "on the back of"; there
were a few (misguided) 20th-century
usage handbooks that followed the opinionated
grouchiness of Ambrose Bierce (Write It Right, 1909) and
called (in)
back of an illiteratism, for seventy or eighty years.
(There is in fact nothing wrong with the phrase, though it is
distinctively American rather than British.)]

Mehrabianian matters

KQED radio's "Forum" show continues to offer interviews on
language-related subjects. As reported here
yesterday, on Monday it was Kitty Burns Florey on sentence
diagramming. Yesterday it was Anne Karpf talking about her
recently published book The Human
Voice. Along the way she savaged the literature on the
relative contributions of words, voice, and body language to
communication -- both the original Mehrabian research and the "7 - 38 -
55" version that spread into folk knowledge (recently discussed here)
-- and disputed claims that women talk a lot more than men (a topic
that Mark Liberman has been returning to on Language Log again and
again after his first postings on the subject this summer). I
haven't seen her book yet, but she sounds generally level-headed.
Meanwhile, you can listen to the Florey and Karpf interviews via the
"Forum" homepage.

November 21, 2006

Freedom for data

Good news today for bloggers like us here at Language Log Plaza: the
California Supreme Court has struck another blow against common law liability for republication. You can't be sued for libel (they claim) simply
for reporting on a blog what another source has said
(see this
report). Ilena Rosenthal had been sued for publishing, on a web site
she did not control, certain statements taken from an email by Tim Bolen
about a couple of medical doctors, Stephen Barrett and Terry Polevoy,
who run a web site that attacks alternative medicine (Rosenthal is a
defender of alternative medicine). Among other things, she alleged that
Barrett is "arrogant, bizarre, closed-minded; emotionally disturbed,
professionally incompetent, intellectually dishonest, a dishonest
journalist, sleazy, unethical, a quack, a thug, a bully, a Nazi, a
hired gun for vested interests, the leader of a subversive organization,
and engaged in criminal activity (conspiracy, extortion, filing a false
police report, and other unspecified acts)".

What we learn from the
Supreme Court's judgment is not just that she can't be sued for libel for
reporting those judgments about Barrett in another forum, but also
that I can't be sued for letting you know what she said.
I suppose it could conceivably still be actionable for me to tell you that
Strunk and White
are dishonest, closed-minded, emotionally disturbed, professionally
incompetent, unethical, fanatical, ignorant, linguistic charlatans and
puppy torturers, because
I'd be the primary utterer of that claim, not just a reporter of it.
But hey, they're dead.

What's most important is that I would definitely be
free to cite the phrase emotionally disturbed, professionally incompetent,
intellectually dishonest, a dishonest journalist, sleazy, unethical,
a quack, a thug, a bully, a Nazi, a hired gun
for vested interests, the leader of a subversive organization, and engaged
in criminal activity as an attested example of a 13-part
coordination
in which the coordinates are not all of the same grammatical category
(the 13 coordinates are adjective phrase, adjective phrase, adjective phrase,
noun phrase, adjective, adjective, noun phrase, noun phrase, noun phrase,
noun phrase, noun phrase, noun phrase, and past participial verb phrase,
respectively). And I could link to the source. The court has made the
world safer for data, which is what we care about here at Language Log.

W and Vietnam: together again, linguistically

"President in Vietnam. I bet you never thought you'd hear those words in the same sentence. It's like saying Bill Clinton and celibacy in the same sentence." --Jay Leno

The interesting thing about this one is that the joke depends precisely on the fact that W and Vietnam have often been together linguistically, in numerous sentences discussing his efforts to stay out of Vietnam physically. During the previous two presidential campaigns, I'll bet that there were hundreds if not thousands of such sentences in the media, and probably quite a few in comedians' monologues as well. One of the issues was the way that W avoided service in Vietnam by enlisting in the National Guard, an opportunity that he was alleged to have gotten as a result of his family's political string-pulling. Another issue was W's allegedly casual attitude towards the requirements, such as they were, of his National Guard duty. (This was a time when the draft was used to supply manpower for the war, instead of the call-ups of Guard and Reserve units that are now normal.)

In a story by George Lardner Jr. and Lois Romano, published in the Washington Post on July 28, 1999, under the headline "At Height of Vietnam, Bush Picks Guard", we have these five sentences containing the words Bush and Vietnam:

1. Later, when Bush was commissioned a second lieutenant by another subordinate, Staudt again staged a special ceremony for the cameras, this time with Bush's father the congressman – a supporter of the Vietnam War – standing proudly in the background.
2. Vietnam was clearly a crucible for Bush, as it was for Bill Clinton, Al Gore and most other men who left college in the late 1960s.
3. Bush maintains that he joined the National Guard not to avoid service in Vietnam but because he wanted to be a fighter pilot.
4. As he drifted, Bush struggled with his own feelings about Vietnam and the turmoil he saw around him in America.
5. Bush says that toward the end of his training in 1970, he tried to volunteer for overseas duty, asking a commander to put his name on the list for a "Palace Alert" program, which dispatched qualified F-102 pilots in the Guard to the Europe and the Far East, occasionally to Vietnam, on three- to six-month assignments.

If we include sentences with pronouns or full noun phrases referring to W, we get three more sentences from the same article:

6. He didn't dodge the military. But he didn't volunteer to go to Vietnam and get killed, either.
7. By enlisting in the Guard, his son not only avoided Vietnam but was able to spend much of his time on active duty in his home town of Houston, flying F-102 fighter interceptors out of Ellington Air Force Base.
8. "I'm saying to myself, 'What do I want to do?' I think I don't want to be an infantry guy as a private in Vietnam. What I do decide to want to do is learn to fly."

It's easy to find more like this -- the next election brought the whole CBS memogate business -- but this is enough to make the point. Leno's linguification is paradoxical: he can make a joke about how you don't expect to find Bush and Vietnam in the same sentence, precisely because Bush's efforts to avoid Vietnam have been so extensively and memorably discussed.

[Russell Borogove suggests that Leno's joke depended on the word "in" being part of the sentence we never expected to hear. Maybe so -- but I was relying on the parallelism implied by Leno's next observation, "it's like saying Bill Clinton and celibacy in the same sentence", which seems to set up the analogy Bush:Vietnam::Clinton:celibacy. Russell's idea, I think, is that the analogy should be Bush:in Vietnam::Clinton:celibacy, which seems forced to me. I construed the joke as relying on a sort of metaphorical connection between being linguistically close and being geographically close. I might be wrong -- but as we've seen many times in the past, people are not shy about making metaphorically-intended assertions about what words do (or don't) occur together that are obviously false, if taken literally.

I guess another option might be that Leno meant a generic (U.S.) president, and not George W. Bush in particular. I considered and rejected that idea, since Clinton visited Vietnam in November of 2000, when he was still president.]

[Jim Lewis writes:

Some years ago, in the 80s, I would guess, the New Yorker ran a humor piece by Veronica Geng -- I can't find the text itself on the web, but you can find it in one of her books -- called 'Love Trouble is My Business'. The piece begins with an epigraph quoting a Village Voice article, which itself quotes a Sunday Times story. The Times story said something about Ronald Reagan and Proust; the Voice writer suggested that it would be the only time the words "Mr. Reagan" and "read Proust" would ever appear in the same sentence. In Geng's piece, the words "Mr. Reagan" and "read Proust" occur in every sentence.

A quick search on amazon.com turns up "Love Trouble: New and Collected Work", for which the "search inside" feature is available. This shows that the piece entitled "Love Trouble Is My Business" starts on page 149. The opening quote is from Geoffrey Stokes (in the Village Voice, August 14, 1984). Stokes in turn quotes Francis X. Clines, writing in the Sunday Times, to the effect that "subjects such as the Soviet Union seem to haunt Mr. Reagan the way vows to read Proust dog other Americans at leisure", and comments that "This may be the only time in history in which the words "Mr. Reagan" and "read Proust" will appear in the same sentence".

Since Stokes was unwise enough to use the future tense, the door is open for Geng to write a story that begins:

I glanced over at the dame sleeping next to me, and all of a sudden I wanted some other dame, the way you see Mr. Reagan on TV and all of a sudden get a yen to read Proust. Not that she wasn't attractive, with rumpled blond curls and a complexion so transparent you could read Proust through it -- that is, as long as her cute habit of claiming a tax deduction for salon facials didn't turn up in some IRS stool pigeon's memo to Mr. Reagan.

And so on. ]

[Jay Cummings writes:

I think what is interesting about Leno's joke is that it would
not be particularly funny if it was not a linguification. "I bet
you never expected to see the President in Vietnam." Of course,
a professional comedian manages to make some amazing things funny,
but still, the linguified version seems funny even on the page.

Maybe this is some recognition of the absurdity of the formulation?

]

[And Jim Lewis adds:

Now that I think about it, I should point out that Geng cheats a little bit. The original Times story, and the Voice article that refers to it, uses "read Proust" in a way which indicates that the "read" is present tense. Geng's piece shifts between present tense "read" and the past tense "read", which I assume counts as two different words, no? Maybe not, but I'd love to have heard the discussions between her and the New Yorker's celebrated fact checkers. (I doubt that humorous pieces are granted a pass on such matters: they fact-check poems over there.) ]

Snowclones in the New Scientist

When Saddam Hussein claimed the first Gulf war would be "the mother of all battles", he coined an endlessly reusable formula that has given us the mother of all plagues, stink bombs, waves, firework displays and brain cells (all, alas, taken from the pages of the mother of all science magazines).

Overpossessive

I can see how this happened, but the result looks odd indeed:

Then there
are families like R.’s and
his partner’s’ that from the outset seek to create a sort of extended
nuclear family... ("Gay Donor or Gay Dad", by John Bowe, New York Times Magazine 11/20/06. p. 69)

Let's take this step by step. First, we want families like X, where X is an
independent possessive (one lacking a nominal head). For personal
pronouns, there are special forms for the independent possessive -- mine in families like mine -- while for
other NPs the independent possessive is identical to the determinative
possessive (which is in construction with a following nominal head),
for which the default form is pronounced with a final Z (with three
variants, according to phonetic context), spelled with final ’s;
that gives us things like families
like George’s, families like
my best friend’s, families
like my friend from Chicago’s.

Ok, now we want the X in families
like X to refer to the family comprising R. and his partner, so
we need the possessive of R. and his
partner, and that would be, following what I just said, R. and his partner’s: there are families like R. and his
partner's that... This is fine, but it doesn't sound quite
right to some people, because it seems to coordinate R. (non-possessive) with his partner's (possessive), which
looks like a failure of parallelism. How to fix that? Make
the first conjunct possessive as well.

(Notice that warnings against non-parallel coordination might have
played a role in the development of these "distributed"
possessives. Proscriptions and prescriptions can have all sorts
of side effects.)

Now we have families like R.’s and
his partner’s, with possessiveness distributed across the two
conjuncts. This is also fine, though it might be understood as
meaning "families like R.’s family and families like his partner’s
family", referring to two families rather than one. That is, for
people who can distribute possessives, the resulting expressions are
systematically ambiguous between reference to one thing (the
distributed possessive) and two (coordination of ordinary
possessives). This is not the end of the world; as listeners and
readers, we use context, background information, and reasoning about
what is plausible to discern intended meanings, and we do this all the
time, with enormous speed and (usually) considerable accuracy. (I
believe that I am not inclined to distribute possessives, but I'm not
about to try to stop other people from doing it, and I have no trouble
figuring out what they mean when they do it.)

So far we have two versions of the independent possessive: families like R. and his partner’s
and families like R.’s and his
partner’s. This would be a good moment to quit hassling
the possessive and go on with the rest of the sentence, but, alas, Bowe
— or an editor — chose to think some more about families like R.’s and his partner’s.
Here's the problem: R.’s and his
partner’s looks like a simple coordination of two
possessives. But we want to mark possessiveness on an entire
expression referring to R. and his partner as a pair. So we need
a mark of possessiveness at the end of the whole expression R.’s and his partner’s. This
is where the reasoning runs off the tracks -- possessiveness is already
adequately, perhaps more than adequately, marked -- but let's press on.

[Addendum later: well, maybe we shouldn't. Daniel Ezra Johnson notes that the final apostrophe has disappeared in the on-line version of the story (I just checked, and he's right), which suggests that the whole thing might have been a cut'n'paste error. I still have some useful things to say, but the original point is somewhat blunted.]

How would we indicate possessiveness at the end of R.’s and his partner’s? Up
above, I gave the default scheme, involving Z or 's, but there's a special case, for
expressions in which the last word already has a Z suffix. This
happens most frequently when the last word is a regular plural of a
noun, as in the NPs the birds
and my friends: the birds’ wings, my friends’ advice (cf. my children’s advice). This
word does not have to be the head of the NP: The advice of my friends’ [not friends’s] being so helpful, I decided to...
In any case, the possessive suffix is suppressed in speech, its
presence indicated in spelling by a final apostrophe.

The possessive suffix is suppressed not only by a plural Z suffix, but
by other Z suffixes as well. In particular, it's suppressed by
another POSSESSIVE suffix. It takes a little
work, but you can devise examples in which two possessive suffixes
would be expected but only one surfaces. (By the way, none of the
observations about English I'm making here are novel; they've been
around for some time.)

Background: independent possessives occur in at least four
constructions:

Anaphoric zero: Kim’s
essay was long, but mine/Sandy’s was even longer.

Now, the handbooks don't even contemplate such examples, so they don't
tell you how to punctuate them. I've chosen to minimize the
number of punctuation marks, using ’s
to stand for two possessive suffixes. You could make a case for ’s’, extending the orthographic
marking of a suppressed Z from the paradigm examples: Let's meet at that friend of Sandy’s’
place. It looks ugly to me, but at least it's
consistent. This is in fact the spelling in the Times example we started
with. The spelling would be defensible, but the problem with the families of R.’s and his partner’s’
is not the orthography, but the signalling of an entirely spurious
possessive suffix at the end of the independent possessive.

While we're on the subject of Astounding Possessives, let me mention
two problematic cases that John Singler and I and our students at NYU
and Stanford, respectively, have been looking at over the years: the
Coordinated Pronoun Problem and the You Guys Problem.

The Coordinated Pronoun Problem.
Suppose you are a married man, and you want to talk about the problems
that you and your wife have been having; you want to talk about X problems, where X is a possessive
expression referring to your wife and you as a couple. What you
get off the shelf (see discussion above) is: my wife and I’s problems. A
lot of people recoil from this (and similar examples with other
personal pronouns as a second conjunct); the I’s sounds just wrong. The
easy solution is to distribute the possessive (again, see discussion
above): my wife’s and my problems.
This risks losing the sense of your wife and you as a unit, a
couple. So you might be moved to combine the virtues of the
ordinary possessive and the distributed possessive.

A number of people have stretched English grammar in search of a
solution. (Sightings of these non-standard variants go back at
least to a 10/16/91 posting to the Linguist List by Steve
Harlow.) Such a solution will have a possessive 's at the end of X, as in the
ordinary possessive, but it will avoid the ugly I’s, in favor of something less
ugly — for instance, my’s,
using the my from the
distributed possessive: my wife and
my’s problems. (The parallel for the Times example would be families like R. and his partner’s’.)
Singler and I have collected examples, and you can google some up — 21
webhits for my wife and my’s
-- though people have tried a variety of other solutions, covering all
the morphological possibilities: my
wife and me’s (2 hits), my
wife and myself’s (35), my
wife’s and mine’s (47). (In contrast, I get 11,000 hits
for my wife and I’s and
29,300 for my wife’s and my,
though maybe half of the latter are irrelevant.)

Yet another solution is the exact parallel to the Times example: distributed
possessives plus final ’s,
that is, my wife’s and my’s problems.
Again, Singler and I have some examples, but this time Google is not
our friend: no webhits for my wife’s
and my’s or my wife’s and me’s,
two for my wife’s and mine’s,
13 for my wife’s and myself’s.

[Addendum: Aaron Dinkin points out yet another resolution: my wife and my problems (for 'the problems of my wife and me'). It's hard to tell how common this one is, since you can really search for examples only with a head noun supplied. But there are at least a few examples out there.]

The You Guys Problem. The
combination of a plural personal pronoun (you, we, or us) with a plural noun presents a
puzzle in syntactic analysis: is the pronoun a determiner modifying the
noun as head; or is the pronoun the head, with the following noun in
apposition to it; or are they co-heads, in a kind of copulative
compound? Might different speakers have different analyses?
Might some speakers have more than one analysis? Syntacticians
have puzzled over these questions for years. For the first person
plural pronouns, the topic is especially vexed, since prescriptions
about pronoun case interfere with attempts to collect judgments.

For one particular instance of this combination, the very frequent
informal you guys, speakers
exhibit much more variation in their choice of possessive forms than
for others, in ways that suggest that they see the combination as
having two equal parts AND that they treat the whole
thing as an expression that doesn't necessarily involve an ordinary
plural noun guys.

First, the off-the-shelf possessive would be you guys’, as in you guys’ ideas. A lot of
people shrink back from that; I myself am not particularly comfortable
with it. One pretty common alternative distributes the
possessive: your guys’, as in
your guys’ ideas = "the ideas
that you guys have". I collected my first examples at the 2005
Berkeley Linguistics Society meeting, where one commenter on a paper
referred repeatedly to your guys’
analysis. A little while later I heard Barry Bonds use
this possessive (referring to the reporters at a press conference),
then found piles of examples on the net, and collected some more
examples from the speech of graduate students and colleagues.

An alternative is to treat you guys
as an expression that just happens to end in /z/. Then the
off-the-shelf possessive would be you
guys’s, and Singler and his students have plenty of
instances. About 11,700 Google webhits, which certainly isn't
chopped liver.

Finally, you can do both at once: your
guys’s. Some informants report preferring this to the
singly marked you guys’s, and
it gets a lot of webhits (about 26,500), though many of these are
probably references to the line "Could I use your guys’s phone for a
sec?" in the 2004 film Napoleon
Dynamite.

In more formal speech and writing, of course, you don't use you guys at all, just you, an alternative that is also
available in informal speech and writing, but at the risk of ambiguity
between singular and plural. In many cases, this ambiguity is
actually troublesome, so you guys
is a good thing to have, especially if you speak a dialect that lacks a
distinguished plural like y’all.
Once you have it, though, you're stuck with finding a possessive form
for it.

One of those people that care(s)

The second hour of KQED's "Forum" radio program this morning had as its
guest Kitty Burns Florey, author of Sister
Bernadette's Barking Dog: The Quirky History and Lost Art of
Diagramming Sentences (Melville House, 2006), a charming and
decidedly non-technical account of Reed-Kellogg sentence diagramming
and those who have loved it. She kept reminding her listeners
that she was neither a linguist nor an English teacher, she carefully
made no claims about the pedagogical values of sentence diagramming,
and she was realistic about change in language (while struggling to
recognize what was "technically" or "traditionally" correct). But
of course most of the phone calls were from people retailing
their pet peeves about English grammar and usage, complaints that will
be familiar to readers of Language Log.

Early on in the calls came one beginning firmly:

I'm one of those people that cares...
that care.

(meaning that the caller cared about prescriptive correctness).
The caller laughed and then went on with her complaints, and nobody
remarked on either of the usage points in her first sentence.

[Correction: now that I can access the recording of the show, I see that I got my transcription backwards: "Susan from Berkeley" says: "I was going to say that uh I'm one of those people that care [laugh]... that cares." This is a bit more delicious than what I thought I heard the first time, as we'll see below. (Thanks to Jonathan Lundell.)]

It's been five whole months since we wrote
about the choice of singular or plural verb in a restrictive
relative clause following one of
+ plural NP: singular to go with one,
or plural to go with the plural NP? (The plural variant is
considerably older, but the singular has been around at least since
Shakespeare and people have been complaining about it since around
1770, after it began appearing with some frequency in the works of
respected writers; MWDEU suggests that there's a
subtle difference in meaning or discourse function between the
alternatives, so that both should be accepted as standard.) The
caller went for the singular first and then altered it to the plural,
possibly recognizing the "correction" with her laugh. Maybe she
cares too much. [Addendum: now we see that she started with the prescriptive standard (plural) and revised it to the sometimes-proscribed version (singular).]

People who maintain that they CARE about grammar very
often care about that as a
restrictive relativizer with human-denoting heads, maintaining that
only who is acceptable in
formal writing (or even acceptable, period). MWDEU tells the convoluted story of
relativizer that with
reference to human beings: it came first, then fell out of favor, but
was revived in the 18th century, though with a bad taste left over from
its years in exile among the common people; John Simon and William
Safire have deplored it.

In searching the Language Log archives for the link to my earlier
posting "One of those who", I pulled up the postings in which this
expression and some of its variants were used (rather than mentioned)
by the bloggers. Our usage on the singular/plural issue is
divided: two to two for "one of those who" (singular in Mark Liberman's
postings #2459 and #2466, plural in Geoff Pullum's #937 and Mark's
#1347), an edge for the singular for "one of those people who" (Mark's
#1209, #2381, and #3044, versus plural in Geoff Pullum's #1461 and
someone I quoted in #3555). In any case, we are not unhappy with
the singular.

On the that/who issue, we seem not to have used
that at all for reference to
human beings in the contexts "one of those people..." or "one of
those..." So we're inclined to be who users. But we wouldn't
deride the "Forum" caller for her choice of that.
That's ok with us.

Wah piang eh! Si beh farnee!

Victor Mair writes:

I fell in love with Singaporean English when I heard it spoken in the delightful movie entitled "I not Stupid."
It's an amazing mix of English, Hokkien and other Sinitic languages, Malaysian, Indian languages,
and probably some other elements as well. One of the things that is most peculiar about Singlish,
as it is fondly called by the natives, is the extensive use of Sinitic particles that add all
sorts of nuances to an expression or sentence.

Translation: That's not possible. The professor implied otherwise. Therefore, a failure to do so would result in an unfavourable outcome for me.

"Aiyah my essay cheem meh? Where got cheem?"

Translation: Is my essay really difficult to understand? That can't be the case.

Be sure to read some of the entries in the extensive "Talking Cock" lexicon, and as background, the Wikipedia entry for Singlish.

I have an ex-Singapore army man in my Classical Chinese course. He's smart and very funny; I really like him. He told me about a movie called "Army Daze" that depicts how all young men in Singapore have to serve in the army, regardless of their background and character. To get a good taste of Singlish and the life of a Singapore army recruit, here's the whole film in nine parts:

English as a Quasi-Official Language of China

When I got back from Manchester, two guest posts from Victor Mair were waiting in my inbox. Here's Victor's first note:

In an earlier post, I observed the ubiquity of English-language teaching in Chinese schools, in most cases starting from the elementary grades. More evidence for the growing importance of English is its usage instead of Chinese at international conferences, meetings, and diplomatic events, and even more prominently in business with other countries.

Attached hereto is a photograph that accompanies an article about the signing of an agreement between China and Cambodia concerning cultural preservation. Wen Jiabao, the Premier of the PRC, is seated just to the left of center. The signing took place in Phnom Penh on April 8, 2006. I'm only coming across this now as I go through some of the back issues of Zhongguo Wenwu Bao (China Cultural Relics News) that I brought back from a recent trip to China. This report appeared on the front page of the April 12th issue.

Note that the Kampuchean hosts permitted the use of the older English spelling of the name of their country as "Cambodia."

One comment: since the event took place in Phnom Penh, wouldn't the banner have been prepared by the Cambodian hosts? And in that case, isn't this a question of Chinese diplomats permitting the use of English, and not insisting on Chinese being displayed as well, rather than a case in which the Chinese government itself prepared an English-and-Khmer sign for a bilateral meeting? But it's certainly striking to see the Chinese premier signing a bilateral agreement with his Kampuchean counterpart under a banner in English and Khmer.

[Update 11/27/2006 -- Josh Jensen writes:

I'm catching up on the recent LL posts, and I just read this one. It reminded me of a conversation I had with a Chinese young woman last year, a guide for our adoption agency group in Guangzhou, China. She complained that when she'd visited South Korea, there weren't enough English-language signs. ]

No Dragon in that Sausage?

Trading standards officers have ordered the Black Mountains Smokery in Powys, Wales to change the name of its Welsh Dragon sausages on the grounds that they are made with pork, not dragon meat.
I'm all for truth in advertising and proper labeling, but it is hard to believe that many consumers have been misled. Even the dullest consumer presumably knows that dragon meat is extraordinarily rare, and, at least in Wales, it seems reasonable to expect consumers to know that the dragon is a symbol of Wales. I'm not surprised that the manufacturer reports that it has received no complaints about the absence of dragon meat from its products.

Beyond the lack of common sense, what I find peculiar is the assumption that the mention of an animal in a brand name implies that the product contains that animal. Do people assume that Koala™ hose connectors are made from koalas, that Grizzly™ salmon oil is made from grizzly bears, or that Deer™ red chilis contain venison? I just hope that the trading standards people don't start regulating the Chinese names of foods. One of my favorite Chinese words, and foods, is 龍蝦 "lobster" (Cantonese luŋ4 ha1, Mandarin lóng xiā), literally "dragon shrimp".

November 20, 2006

Typographical bleeping antedated to 1591

It's common to disguise scatological or blasphemous language by replacing some letters with asterisks, hyphens, blanks or other typographical maskers. This avoids violating the letter of an explicit or implicit prohibition against printing certain words. Some might also see this as an instance of magical thinking, where it's safe to cause people to think of certain words, but saying them out loud, or writing them directly and completely, would invoke a sort of incantational power to harm. In any case, I've been curious about the history of this practice, and in some earlier posts ("The history of typographical bleeping", 6/10/2006; "The earliest typographically-bleeped F-word", 6/15/2006) we tracked English-language examples back to a poem by John Oldham published in 1680.

In response to my call for earlier cases, Simon Cauchi takes a form of this practice back almost a hundred years, to 1591. His note is beyond the jump.

I can offer three examples from the works of Sir John Harington (1560-1612), but note the use of parentheses rather than dashes or hyphens.

In fine, he made to him the like request
As Sodomits made for the guests of Lot.
The Judge him and his motion doth detest
Who though five times repulst yet ceaseth not,
But him with so large offers still he prest
That in conclusion like a beastly sot,
So as it might be done in hugger-mugger
The Judge agreed the Negro him should ( )

Secondly, in Harington's epigram "Of a faire woman; translated out of Casineus his Catalogus gloriae mundi", the sixth couplet reads (in the printed edition of 1618):

A narrow mouth, small waste, streight ( )
Her finger, hayre, and lips, but thin and slender:

but there is no bleeping in the manuscript prepared for presentation to Prince Henry (Folger MS V. a. 249), where the text reads:

A narrow mouth, small waste, strayght privy member,
her fingers, hayr and lips, but thin and slender

(The spellings "strayght" and "streight" are of course to be understood as "strait".)

Thirdly, the epigram "Of Garlick. To my Ladie Rogers" is short enough to be quoted in its entirety. The printed edition reads:

If Leeks you like, and doe the smell disleeke,
Eate Onions, and you shall not smell the Leeke.
If you of Onions would the sent expell,
Eate Garlicke, that will drowne the Onyons smell.
But sure, gainst Garlicks sauour, at one word,
I know but one receit, what's that? (go looke.)

In the Folger MS the last line is also bleeped, but by other means:

I know but one receipt, what's that? Tobacco.

The Folger MS was intended for the eyes not only of Prince Henry but also of his father King James I, whose dislike of tobacco was well known.

Note that in these cases, the hint that allows the reader to infer the writer's intention is the (meter and) rhyme, rather than the initial letter.

The last example has a special twist: the reader is led to expect "a turd", and then sees "tobacco", a substitution that conveys an additional message. This is a familiar technique, which I know that I've seen several times in humorous songs -- but at the moment, all that I can remember are a couple of fragments of tune, without the words. If your memory is better than mine, let me know.

[OK, here's one -- Antoine Hervier writes

In the movie Shrek, when the Ogre and Donkey arrive in Duloc, they are greeted with a cute little song, with these lyrics :

I do remember that one now, but it's not the one that was on the tip of my tongue. Nor am I trying to remember Sweet Violets, sent in by George Kesteven. However, this note from Daniel R. Tobias nails it:

Regarding your request in the Language Log, one famous example of a song where a "bad word" is substituted with something that doesn't even rhyme is "Shaving Cream", originally written in 1946 by Benny Bell, sung by Paul Wynn, and redone by Dr. Demento in 1975.

A schoolyard rhyme that I remember as "Miss Lucy", but Wikipedia's entry calls "Miss Susie", consists of a series of stanzas each of which seems to be leading to a slightly naughty word, which is then made part of an innocent starting word beginning the next verse, like:

"...Miss Lucy went to heaven
and the steamboat went to

Hello operator
Get me number nine..."

It's so old that perhaps it actually dates to a time when some people had single-digit phone numbers.

Yes, "Miss Susie" is the one that I was remembering. How could I forget?

Anyhow, there are dozens of these songs out there. In some cases, the taboo word at the end of the stanza is replaced by a completely separate word which also starts the next stanza; in other cases, a homonym is used in the same way.]

[Jacob Coughlan contributed this set of variant verse, all the way from Melbourne, Australia -- but I recall hearing several of them in Mansfield Center, Connecticut:

In response to your article "Typographical bleeping antedated to 1591", in which you asked readers to give examples of humourous songs with ribald lyrics substituted for something more innocuous, here are the lyrics to one such song from my childhood, as I knew it. There are, of course, many variations.

Anyway, in this particular song, the lines run into each other, so that the beginning of the suggested dirty word morphs into the innocuous beginning of the next verse. The joke is rammed home by the shocking inclusion of an (unexpected) actual dirty word in the final verse:

Aunty Mary had a canary,
thought it was a duck,
took it round the corner
taught it how to...

Fried eggs for dinner,
fried eggs for tea,
the more you eat,
the more you want,
the more you gotta...

Peter had a boat,
the boat began to rock,
up came Jaws
and bit off his...

Cocktails,
ginger ales,
forty cents a glass,
if you don't like it
you can shove it up your...

Ask no questions,
tell no lies,
I saw a Chinaman
doing up his...

Flies are bad,
mosquitoes are worse
that is the end
of my fuckin' naughty verse.

Hope you enjoyed that as much as I did.

]

[Eric's contribution:

I hope you're not getting flooded with rude, half-remembered schoolyard verse, but probably you are. I was struck by how much your quoted version resembled what I remember from childhood half a world away (Bronx, NY), and also how much it differed. So the real question is -- is someone tracking versions of Miss Susie, by time and place? Because with a little data, the next step is a cladogram!

The version I know has no "Susie" in it at all. And, in retrospect, it's almost certainly two independent pieces welded together at about "Engine engine number nine":

Ungowa! Shipowa!
Your mother don't take no shower!
I said it, I meant it,
I'm here to represent it.

Engine engine number nine
Sock it to me one more time!

Ikey and Dikey
Were playing in the ditch,
Ikey called Dikey a
Dirty son of a . . .

Bring along the children
And let them play with sticks
So when they get older
They'll know how to play with . . .

Dixie had a baby,
She named him Tiny Tim
She put him in the pisspot
To see if he could swim.

He swam to the bottom
He swam to the top
Along came a bumblebee
And stung him up his . . .

Cocktails, ginger ale
Five cents a glass
And if you don't like it
You can shove it up your . . .

Ask me no more questions
I'll tell you no more lies
A kid got hit with a bag of shit
Right between the eyes!

I like the idea of Miss Susie cladistics -- but a trendier name would be "memetic phylogeny". ]

[ Chris Conroy wrote:

In response to your request for songs with an expected dirty rhyme, here are two that came immediately to my mind. The first is an unreleased Weird Al Yankovic parody, "It's Still Billy Joel to Me" (parody of "It's Still Rock n' Roll to Me" by, naturally, Billy Joel).

The relevant verse (the 'B' verse, if you're familiar with the song):

Now everybody thinks the new wave is super
Just ask Linda Ronstadt or even Alice Cooper
It's a big hit, isn't it
Even if it's a piece of junk

It's still Billy Joel to me

It's a fun parody, at least if you're a Billy Joel fan with a sense of humor. It's a shame Weird Al couldn't get the rights to release it. (Apparently he always seeks permission from the songwriter, even though, as parody, he's not required to.)

The second example is more obscure. It's a parody of an old hymn called Dies Irae,
written about the "culture wars" going on in the Catholic Church between proponents of
traditional hymnody and pop/folk-style contemporary music. The writer is one of the former.
Most of the references won't mean much to someone outside of Catholic music circles,
but I think you'll find this one verse is particularly clever nonetheless. The three names mentioned
in the last line are the three most prolific composers of contemporary music used in Catholic Masses
today. The final name does not actually rhyme with the other two line endings,
though from the author's point of view, I suppose it might as well:

Smite them, Lord, yet of thy pity
Take their songsters to thy city:
Even Haugen, Haas, and Schutte.

I bet that one has them ROSL (rolling in the sacristy laughing).]

[Daniel R. Tobias provides the version of Miss Susie from upstate New York in the 1970s:

All the variants of the Miss Susie / Lucy rhyme are interesting... does anybody know when and where it originated anyway? Occasionally, the versions have bits in them that seem to date them, like the "Jaws" reference in one of the quoted versions (that seems to refer to a mid-1970s movie). I note that the drink is five cents a glass in one version and forty in another; can that be charted against the Consumer Price Index in an attempt to date the variants? At least in the era before the Internet and such pop-cultural stuff as The Simpsons and South Park which like to make use of things like these rhymes, they spread entirely by kids teaching them to other kids without any assistance of the mass media.

Anyway, the version that I remember, from upstate New York in the mid 1970s, goes like this (similar but not identical to the Wikipedia-quoted version):

Miss Lucy had a steamboat
The steamboat had a bell
Miss Lucy went to heaven
and the steamboat went to

Hello operator
Get me number nine
If you disconnect me
I will kick your fat

Behind the refrigerator
There is a piece of glass
Miss Lucy sat upon it and
it broke her little [or: it went straight up her]

Ask me no more questions
Tell me no more lies
Boys are in the bathroom
pulling down their

Flies are in the pantry
Bees are in the park
Boys and girls are kissing in the
D-A-R-K, D-A-R-K...

Well, some people have pointed out that one piece of the song must have originated at the time when human operators were involved in making telephone connections, among a set of local numbers small enough that a single digit like "nine" would have been a normal selection among them (with connections made by human operators, not all numbers need to have the same number of digits...). Folklorists study this sort of thing, but I'm not sure whether there is a literature on Miss Susie.]

[Michael Mann offers a German example:

Your post "Typographical bleeping antedated to 1591" reminded me of a song that was quite popular here in Germany in 1978 (I wasn't yet born then, but I read that it was quite popular), sung by Rudi Carrell: "Goethe war gut". You can find the lyrics here:

Like the "turd"-example you gave, Carrell similarly played with the expectations of the audience, "rhyming":

Bitterest battles in the war on error

A peculiar feature of linguistic prescriptivism is that the most
passionate assertions of rightness and wrongness often occur in
precisely those areas of the language where there is the most
ambivalence among native speakers. Several months ago we saw just such
a case of manic overcodification when a newspaper reporter told us
about an editor who preposterously
insisted that the comparative form of strict should be more strict and never ever stricter. Now comes another
pronouncement of rigid exactitude in the highly inexact arena of
comparative and superlative inflection.

It started with this sentence in the Guardian,
appearing in an article earlier this month about a poll that ranked
President Bush as a greater threat to world peace in the eyes of the
British public than either Kim Jong-Il or Mahmoud Ahmadinejad:

As a result, Mr Bush is ranked with some of his
bitterest enemies as a cause of global anxiety.

"Surely," the reader
asked, "your correspondent knows that the correct English form is 'most
bitter'." I can sympathise to some extent with a writer who in this
context felt driven to a new extremity. But what we are involved in
here is the war on error and, following Mr Bush's example, we shall
seek out errorists and bring them to justice.

But Mayes' bon mot about
"the war on error" is, sadly, followed by this odd statement on the
acceptability of bitterest:

One of the weapons in my arsenal is the
wonderful Oxford English
Dictionary on line, but it is at a total loss to find any recorded use
of "bitterest".

Huh? If the esteemed reader's editor of the Guardian doesn't know
how to use
a damn dictionary, then all I can say is: the errorists have
already won.

It's true that the online OED doesn't specifically mention bitterest as an inflected form of bitter. But guess what? It rarely
specifies any comparative or superlative forms with -er/-est, unless
there's something noteworthy involved. The OED is similarly mum about
how to make a comparative or superlative out of dumb, but that doesn't rule dumber and dumbest out of the lexicon. (Dumberer
is another matter.) Some dictionaries do in fact explicitly list
comparatives formed with -er and
superlatives formed with -est,
and those that do, such as American Heritage
and Random House,
show bitterer and bitterest without comment. (Webster's Third New International
hedges just a little bit, saying the inflected forms of bitter are "usually" -er/-est.)

With only a modicum of know-how in using the online OED's full-text
search feature, Mayes could have quickly found no fewer than 41
citations throughout the dictionary featuring the word bitterest. The earliest of these is
from Layamon's Brut,
dating to the turn of the 13th century: "Her heo sculeð ibiden bitterest
alre baluwen." (In a more modern rendering,
that would be: "He shall therefore abide bitterest of all bales.") That
citation even appears in the entry for bitter,
so it's hard to miss. Elsewhere in the OED's text one can find bitterest used throughout the
course of modern English, right up to the present day. Some notable
examples from English literature:

To these we could add many hundreds of attestations in English
poetry, drama, and prose from Chadwyck-Healey's Literature Online
database. Even a more modest online literary collection like Mastertexts.com
will give us a wealth of examples from Jane Austen, two Bronte sisters
(Anne and Emily), Wilkie Collins,
Charles Dickens, Thomas Hardy, Jack London, Mark Twain, and more
Thackeray. (Dickens, Hardy, London, and Twain also use bitterer,
by the way.)

Additionally, the OED has recorded usage of bitterest
in a wide range of modern periodicals, from the Daily Chronicle to the
Catholic World to Time to, whaddayaknow, the Guardian. (Under hep, one can find this 1960 comment
from a Guardian writer: "Not even its bitterest critics could accuse
the Labour party of being 'hep'.") In fact, bitterest shows up a whopping 304
times in the online
archive of the Guardian, averaging about 40 appearances a year
since 2000.

So where does the reader's insistence on the unacceptability of bitterest come from, and why is
Mayes, who should really know better, so ready to accept this "new extremity"?
This is not one of the typical prescriptivist bugaboos, as it does not
occur in any of the prim grammar guides that I've checked. In fact,
several guides from the late 19th century explicitly give the opposite
advice, though some, such as Wm. Smith and T.D.
Hall's A
school manual of English grammar (1887), do note that "many of
those compared by er and est take also more and most." Eduard Mätzner's
magisterial Englische Grammatik (1860-65)
also embraces bitterest
and similar forms in this passage (from a later English
translation):

Various other adjectives of two syllables are
also compared by er
and est, according to no very definite rule: bitter, bitterer,
bitterest; clever, cleverer, cleverest; cruel,
crueller, cruellest; handsome, handsomer, handsomest; tender, tenderer, tenderest. The correct usage in
such words can be learned only
by careful study of the dictionary and of the best authors.

So there's clearly no proscription against bitterest even among those who care
deeply about such things. As with stricter vs. more strict, we are faced with a choice
between two perfectly acceptable alternatives, and that choice may be
dictated by a range of phonological, prosodic, stylistic, and pragmatic concerns
rather than an overt grammatical rule. CGEL is, as
always, enlightening on this point, as is Britta Mondorf's article
"Support for more-support" in Determinants of Grammatical Variation in
English (2003). Mondorf points out that "extensive amount of
more-support for adjectives in
<-r, re> can be attributed to the avoidance of phonological
identity effects." That would explain a preference for more bitter over bitterer, but it does nothing to
weaken the case of bitterest
as an acceptable alternative to most
bitter. (Interestingly, Mondorf also provides evidence that
certain semantic criteria can override the phonological disposition
against bitterer and similar
forms, theorizing an affinity between concrete meanings and -er forms. She gives two examples
of comparative bitter from
the Daily Telegraph, contrasting concrete and abstract usage: "the beer
is bitterer" versus "the more bitter takeover battles of the
past.")

If you feel strongly one way or the other on this bitter debate, register your voice
in this poll
hosted by UsingEnglish.com. Last I checked, 16% have voted for bitterer/bitterest, 38% for more
bitter/bitterest, and
46% for more bitter/most bitter. There is, of course, no way to
vote for "all of the above" or "depends on the context," since such wishy-washy acceptability judgments are hardly ever considered in the polarized discourse of verbal hygiene.
In the war on error, you're either with us or against us.

[Update: Languagehat and his commenters continue the discussion. I particularly like LH's turn of phrase, "cockamamie ukase."]

What, that lynching stuff?

I meant to tell you a few days ago, when Trent Lott was chosen to
return to the political spotlight as Senate minority whip in the next
Congress, about how it reminded me of a Language Log post. On NPR
this week I heard an interview with one of the Republican Senators
who participated in the meeting where the decision on Lott was made.
Was there any discussion, the NPR interviewer wanted to know, about the
incident of 2002? (You'll recall that in 2002, at the 100th birthday
party for Strom Thurmond, Trent Lott said that if the country had
elected Thurmond to the presidency in 1948, "we wouldn't have had
all these problems over all these years." Thurmond in 1948 was not
only militantly in favor of segregation but also dead set against Federal
anti-lynching laws. The inescapable implication of Lott's remark was that
a Thurmond presidency would have prevented the rise of the civil rights
movement; activist negroes would have been hung from trees rather than
getting to demonstrate and vote and gain access to white schools the way
they finally and disruptively did in the 1960s.) What caught my ear was
that in response the Republican Senator being interviewed simply said:
"It
didn't come up". The exact words of the punchline from the hilariously
spot-on Dilbert cartoon that Mark discussed (in connection with
stereotypes of male empathy deficit) just two months before:
the strip where Dilbert gets the numbers
from Yvonne without asking about how she is coping with her sextuplets
now that her house has burned down and she's had shoulder surgery.
Sometimes linguistic life imitates linguistic art so beautifully.

November 19, 2006

Find that mystery mumbling phonetician

A further case of a
linguist
raising serious suspicions and almost getting arrested can be found
in the biography entitled The real Professor Higgins by Beverley
Collins and Inger Mees (Mouton, 1999; page 352).
It was pointed out to me by John Wells of
University College London (UCL).
The story concerns Professor Daniel Jones,
the distinguished founder of the Department of Phonetics at UCL (later
the Department of Phonetics and Linguistics, where I taught for
several years early in my career). Jones,
one of the most important figures in the
entire history of phonetics, was once taken for a spy — in
wartime, so this could have meant the firing squad.

During the Second World War (1939-1945), the operations of
the Department of Phonetics
were evacuated to Aberystwyth in Wales, because London was under constant
German bombing. Professor Jones did not move his domicile
to Wales, but had to go
there sometimes, especially to conduct examinations.
(The tradition at UCL
has always been to examine phonetics in part through a process of live
dictation of invented nonsense words which the examinees have to write
down in the International Phonetic Alphabet, and this requires the
simultaneous co-presence of the candidates with an expert practical
phonetician who can pronounce arbitrary syllable strings perfectly
and recognize accurate transcriptions of them.) Here's the way Collins
and Mees tell the tale of Jones's curious incident:

On one of his rare trips to Wales, Jones was busily checking his
phonetic transcriptions for the examinations, noting snatches of the
Welsh conversation in the carriage, and practising "nonsense words"
to himself. He was quite unaware that some perceptive passengers had
been distressed by the strange activities of an elderly gentleman who
was not only apparently muttering odd noises in a strange language
which was neither English nor Welsh, but also writing down peculiar
signs and symbols in his notebook. On his arrival at his destination,
Jones was alarmed to find the local constabulary waiting to arrest
him on suspicion of being a spy.

(By the way, you are probably so young and modern and
part of the cell phone
generation that it may not have occurred to you that the passengers
couldn't have alerted the Aberystwyth police by cell phone calls from
the train, because cell phones were still science fiction, some fifty years
into the future. But the technology for
radio communication
from trains was known as early as 1914, and there were also
techniques involving inductive coupling to telegraph
lines running alongside the train; it would have been possible
in principle for a passenger to go down the corridor of the train and
alert railway personnel, and for them to pass on
the alarm in some way while the train was in motion.)

Strange mutterings, notations in the International Phonetic
Alphabet, attentive listening to fellow passengers talking in Welsh...
Well, it doesn't exactly sound like
Casino
Royale, does it? But I guess in wartime people get
really nervous. Perhaps (as George Kesteven points out to me)
Jones was practicing his bilabials and the other passengers took the
wartime slogan "loose lips sink ships" somewhat too
literally.

Jones did not, however, cause the Aberystwyth post office to be closed.
Barbara Citko, that
Polish
bioweaponry Mata Hari,
still seems to be
the only linguist ever to have seemed dangerous enough that a
whole post office had to be shut down in the battle to stop her
evil schemes. In case anyone is keeping score on such matters.

Thanks to Bill Poser and
Barbara Zimmer for research on trains and
radio.

Expletive inserted

No word taboo at The New Yorker, it would seem.
Bill
Buford casually drops the occasionally attested colloquialism
lo and fucking behold (184 Google hits)
into a description of his thoughts as he hides behind a bush and watches
a male turkey appear in response to a slate-scratching device that makes
an imitation of a female turkey call:

... I heard a deep slow trilling. A gobble. Lo and fucking behold.
I peeked, ever so slowly, through the leaves of my bush and saw
him. Whoa! A gobbler, puffed and tail spread, looking like the
NBC logo. Wow! I'd called him in! I'd done it!

The New Yorker arrives in American homes just like any other periodical,
and has all sorts of cartoons and ads that might encourage kids to look at it.
It's puzzling to me why, when The New Yorker can risk dropping the prime
obscene expletive of the English language in mid fucking idiom in a feature
article about turkeys, so many newspapers are so astonishingly coy that they
can't mention shit without at least a couple of asterisks.
(I guess I mean that last clause in
both its literal and idiomatic senses.)

Fomite: panacea or backformation?

An article by Martin Veitch ("Are dirty keyboards truth or fiction?", Inquirer, 11/17/2006) taught me a new word, and so I'll offer to teach him a research technique in return. Veitch wrote:

Being ignorant about the truth or otherwise of the dangers of dirty keyboards, I asked for expert readers to mail in. Hospital physician Dtaylor took up the filthy gauntlet, replying to suggest that:

“Any ‘fomite’ (medical term for an inanimate object that can transfer contagious disease) can be a problem. Keyboards are certainly one such, and they are hard to clean. (Not as hard to clean as stuffed animals and toys in a hospital playroom, but I digress.) ‘Hospital-grade’ devices that can be cleaned more easily might conceivably help, but are certainly no panacea.”

This is a rare opportunity to correct the OED, which says that fomite is a "rare" variant of fomes (from Latin fōmes, fōmites "touchwood, tinder")

The first mistake here is in the gloss, which reflects an old-fashioned medical misconception. As I understand it, infectious agents are sometimes better transmitted by non-porous surfaces than by porous ones; in any case, restricting the concept to porous substances is wrong. The second mistake is in retaining the original Latin singular fomes, and suggesting that fomite is a rare variant. The OED gives one citation in its entry for fomite:

1859 R. F. BURTON Centr. Afr. in Jrnl. Geog. Soc. XXIX. 134 This must be an efficacious fomite of cutaneous and pectoral disease.

and four in its entry for fomes -- but three of the four are the plural form "fomites":

1773 Gentl. Mag. XLIII. 554 If this putrid ferment could be more immediately corrected, a stop would probably be put to the flux, and the fomes of the disease likewise removed. 1803 Med. Jrnl. X. 213, I cannot say that I have known it spread from fomites. 1851-9 A. BRYSON in Man. Sc. Enq. 248 Either simply through the medium of the atmosphere or by means of fomites. 1882 Quain's Dict. Med. s.v., The most important fomites are bed-clothes, bedding, woollen garments, carpets, curtains, letters, &c.

Presumably fomite is a backformation from the plural fomites. But the current situation seems to be that the back-formation fomite is in wider use than the original fomes.

MEDLINE has 75 hits for {fomite} and 217 hits for {fomites}. There are 54 hits for {fomes}, but 53 of them are instances of fungus species (such as Fomes cajanderi, Fomes fomentarius, etc.). Ironically, in the one article where fomes is used to mean "inanimate object that can transfer contagious disease" (KL Autio, S Rosen, NJ Reynolds, JS Bright, "Studies on cross-contamination in the dental clinic", J Am Dent Assoc 100(3) 1980), it's construed as a plural:

Use of 5% iodophor in 70% isopropyl alcohol was effective in sterilizing certain fomes in the dental operatory.

The Wikipedia entry for fomite also implies that this back-formation is now the standard medical term. Encarta has no entry for fomite, giving only fomites as a "plural noun" glossed as "inanimate objects capable of carrying germs from an infected person to another person". AHD gets it right, glossing fomite as "An inanimate object or substance that is capable of transmitting infectious organisms from one individual to another", and indicating that it's a back-formation from the plural fomites of fomes. Merriam-Webster gets it right as well -- the online version gives fomite as "an object (as a dish or an article of clothing) that may be contaminated with infectious organisms and serve in their transmission", also noting its source as a back-formation.

I didn't know any of this -- or rather, I remembered the fungus species name from an old hobby of mushroom hunting, but I had never heard of the medical term until I looked it up after reading Martin Veitch's article. After quoting Dtaylor's conclusion about fomites -- "'Hospital-grade' devices that can be cleaned more easily might conceivably help, but are certainly no panacea." -- Veitch continues:

Btw, have you ever heard of anybody using the word ‘panacea’ without ‘no’ or ‘not a’ before? But now it’s me that’s digressing.

Having learned a new word from Veitch's article, I'm going to offer to return the favor by pointing out that questions about the distribution of adjacent words can now be explored by using web search. Before I looked, my own memory certainly agreed with Veitch in reckoning that panacea usually gets a preceding negative. But a quick web search turns up plenty of uses like these:

No matter what problems you face in your life, meditation is really a panacea.
Locust-bean bark seems to be a panacea for anything from toothache to impotence.
This book is a panacea for all the misinformation being disseminated about exercise, diet, weight control, and training for sports.
Determination is a true panacea, and cancer cannot win without concession.

Come, bumpers – aye, ever so many –
And then, if you will, many more!
This wine doesn't cost us a penny,
Tho' it's Pomméry seventy-four!
Old wine is a true panacea
For ev'ry conceivable ill,
When you cherish the soothing idea
That somebody else pays the bill!

More interesting, though, web search brings up a kind of phrasal template, often used in headlines, that hadn't occurred to me but is instantly familiar:

X: panacea or Y?

Many examples feature alliteration in the Y position, and sometimes in X as well:

There are lots more -- Google claims 215,000 pages containing the string "panacea or", and another 38,200 for "or panacea". And there's the usual penumbra of variant structures.

These snowclonish headlines may not be counter-examples, however -- it's characteristic of "negative polarity items" that they work in question contexts as well as negative ones, presumably because questions and negations share a property of "non-veridicality".

As the positive examples that I gave earlier showed, panacea is by no means a true negative polarity item. But it's got a sort of preference for non-veridical contexts, all the same.

Statisticians and their conjunctions

If you use adjectives in your prose, do not use nouns. If you use nouns, you must not use verbs. If you use verbs, try to avoid verbs that specify a particular city.

When specifying particular cities in fiction, do not use cities that have been specified in poems. Poems have so few things left of their own anymore that we should let them have their own cities. [...]

If you write about the weather, use as many adjectives as you can, or else your nouns will wilt and become adverbs.

Some coaches insist adverbs are stronger than nouns, but an independent panel of statisticians has proved otherwise. Despite appearances, though, statisticians don't like nouns so much as they adore conjunctions.

November 18, 2006

Find that mystery linguist woman

Linguists often arouse suspicion. They make field trips to unusual
regions;
they sit down to have lengthy conversations with members of
minority populations who speak strange languages that the security forces
do not know, and they make copious notes involving strange phonetic symbols.
Who knows what they are really up to. They have frequently aroused the
interest of police and intelligence services in various countries of the
world. African linguist Jack Mapanje spent time in jail in Malawi (he
still doesn't know why); Harvard linguistics student
Victor Manfredi was arrested as a spy and spent three days in jail
before a
judge
set him free (Victor spoke Igbo, and that turned out to be the
judge's native language; piece of luck!). MIT-trained syntax and semantics specialist Tanya
Reinhart has been jailed in Israel more than once for pro-Palestinian
activism. But,
to my knowledge, only one member of the theoretical linguistics profession
in the USA has been taken for a domestic terrorist to an extent so serious
that a major US post office had to be shut down in direct response to the
threat posed. I'll tell you the story, if you wish.

The scene (picture it) is a post office in Salt Lake City three or
four years ago. A dark-haired woman approaches the post office counter,
and says in a slight foreign accent (ah! an alien!) that she wishes to
mail a number of slightly bulky letter envelopes, all to American
universities (just like the Unabomber!). But something else was noticed
about the envelopes, and about the woman. Something that chilled their
blood.

The mystery woman paid for the postage, and remembers being slightly
surprised to see as she left the post office that the envelopes had been
left on the counter untouched, rather than tossed into the outgoing mail
bin. The staff seemed not even to be going anywhere near them, in fact.

Once mystery woman had exited, the whole place went into panic mode.
The postmaster was called; customers were hustled out; the entire post
office was closed and sealed. Specialist teams in protective suits were
called in to pick up those envelopes. Police were contacted, and set off
to track down the foreign-accented dark woman of mystery.

For what the post office counter clerk had noticed was that the
envelopes bore light traces of a white powder, and the garments of the
mystery woman had shown traces of it too. Obviously it was weapons-grade
anthrax.

Only it wasn't anthrax. The mystery woman was theoretical syntactician
Barbara Citko.
In her first year of teaching, she had neglected
a cardinal rule of our profession, which she never forgets today:
don't wear black
when teaching a class using white chalk.

The white chalk dust got not
only on her clothes, but also onto the envelopes she had rushed to the post
office straight after class. The contents were not lethal doses of
lung-destructive anthrax spores, but job applications.

The police soon got to the bottom of things, and she was not charged
with terrorism. Just one more day in the exciting life of a linguist.
And the letters were not destroyed in an effort to kill the putative spores;
they were ultimately delivered unharmed. Through one of those letters she
got a job teaching at Brandeis, and after a year of that she secured a
permanent tenure-track position at the great linguistics department of the
University of Washington in Seattle, where I had the pleasure of seeing her
again when I visited there to give a talk a few days ago. She told this
tale over dinner. With its happy ending.

Linguistics in the service of Plane English

No, it's not a typo. This is about the English used by pilots and
control towers -- "Plane English." Geoff Pullum's recent post Linguistics
in the service of astrophysics prompted me to describe one way
that linguistic service also extends to the field of product liability.
Illustrating this is a 1987 Chicago case about a private Lear Jet that
crashed seven years earlier in New Orleans, killing the pilot and his
four passengers. It's a story about Plane English.

The lengths to which insurance companies will go in order to avoid payment seem
to have no end. In this case, the company wasn't satisfied that pilot
error caused the crash. Instead, it tried to put the blame on the
Garrett Corporation, manufacturer of the airplane's engine. If it
wasn't the pilot's fault, what else could have caused him to veer off
course as he tried to land? The insurance company came up with the
idea that it was a toxic gas called trimethylol propane phosphate
(TMPP) that leaked from the engine into the cockpit, disorienting the
pilot and impairing his sense of judgment.

One problem with this theory was that TOXLINE and MEDLINE searches
showed that little was known about the effects of TMPP on any large,
living being, including humans. The only research available at that
time was on small animals, such as rabbits, mice,
and rats. There was nothing showing that it affects the cerebellum,
motor pathways to the brain stem, basal ganglia, or the descending
pyramid of the cerebral cortex.

Another problem was that no evidence of TMPP was found at the crash
scene so the insurance company had only a theory to work with (sound
familiar?). But it reasoned that TMPP belongs to a class of bicyclophosphates
that are GABA inhibitors and since GABA inhibition affects speech in
diseases like Huntington's Disease, the insurance company theorized
that it had the same effect on the pilot's behavior. If they could
prove their case, it could shift the blame for the accident from human
error to a product liability case against a successful and presumably
prosperous company.

Garrett's defense attorneys were then faced with the unusual task of
combating a theory rather than physical evidence, which is the usual
basis for such lawsuits. Enter the service of linguistics. The only accident
evidence available was the air-to-ground communications of the pilot
between Milwaukee and New Orleans. The pilot's voice sounded okay to
the defense lawyers but to verify their suspicions, they called me to
analyze the intermittent tape recorded communications from the time the
plane taxied down the runway in Milwaukee, as it passed over the
control
towers in Chicago, Kansas City, and Memphis, and as it approached its
final destination. The defense had to combat one theory -- that there
was TMPP in the cabin -- with another theory -- that if such happened,
the pilot's speech would give evidence of it.

I analyzed the pilot's syntax, word frequency, speech acts, pause
fillers, and other evidence of cooperative conversation, theorizing
that if he was being overcome with TMPP, these features would be likely
to show it. There is pretty good evidence that this happens when people
ingest large amounts of alcohol or drugs. But who knows whether the
same thing obtains when TMPP enters the system? Neither side had real
proof to back up the theories proposed.

In order to determine if there were aberrations in the pilot's syntax,
I first needed to study a number of other pilot-to-tower communications
to find out what the normal syntax patterns of Plane English were, including
variability within the optional and obligatory syntactic slots. I found the
following: first comes an optional acknowledgment ("okay," "Roger,"
etc.), followed by an optional self-identification ("Mitsubishi seven
two seven," "Mike Alfa," "Six Golf Hotel," etc.), followed by an
optional early closing ("out," "okay," etc.), followed by an obligatory
subject ("we," "five thousand feet," etc.), followed by an obligatory
predicate ("climbing to five thousand," "ready to go," etc.), and,
finally, a second closing slot (used when a subject and predicate occur). I
found no aberrations from this formula in any of the pilot's
communications over the three tower-contact segments of the flight
beginning in Milwaukee, going over the Midwest, and ending in New
Orleans. His syntax didn't appear to be confused.

The pilot didn't show any loss of language ability in his use of
compound sentences either. They remained constant from the beginning of
the flight to the end. Nor did he start using shorter
sentences. His words per utterance remained fairly constant
throughout the three segments of the flight, averaging 9, 8.27, and
7.75, respectively.

I also examined the pilot's speech
acts. He reported facts, such as his location, altitude, and flight
course without confusion or hesitation. Failure to do so could have
been interpreted by ground control as erratic behavior but no such
complaints occurred. He replied to all of ground control's questions,
acknowledged all instructions, repeated all information accurately, and
even thanked the control tower once. His most telling speech act, which
came as the pilot was trying to land, was to correct the tower's error
when it misidentified him (I'll come back to this later).

And what about pause fillers, those "uh," "um," "er" features
that most of us use in daily conversation? Did the pilot use them
excessively, perhaps suggesting that he was getting sluggish or
beginning to be overcome by toxic fumes? There are two types of pause
fillers. One is used when speakers are trying to get the attention of a
listener or holding onto their turns of talk. The other type occurs
when speakers are uncertain or forgetful about how to say something.
The latter type might indicate a decrease in cognitive ability. The
pilot used five of these but, interestingly, they all came when he was
still on the ground in Milwaukee as he prepared to taxi down the
runway. His "hey, listen to me" attention grabbing pause fillers all
came at the end of the flight, as the pilot was struggling to set the
plane down in a torrential rainstorm.

Now for the most obvious feature. Did the pilot begin to slur his
speech, especially his fricative, affricate, and interdental sounds, at
some time in the flight? If alcohol or drugs can cause this, couldn't
TMPP do the same? If it could, it didn't. The pilot had to
pronounce words like "Mitsubishi," "seven," "that's," "Kansas,"
"thousand," "the," "taxi," "its," and "Memphis" throughout his flight.
No diminution of speech ability is noticeable here.

Finally, I looked at the pilot's conversational
cooperation (relevance, informativeness, sincerity, and clarity) to
see if it got worse as the flight progressed. Perhaps it would take an
air-traffic specialist to testify about whether any of the pilot's
communications were less informative than they should have been, but
the taped evidence shows no discomfort or complaints from ground
control about any of the pilot's statements during any segment of the
flight. Throughout, ground personnel treated the pilot's reports of his
readiness, destination, movement, and flight level as though they were
relevant, informative, sincere, and clear.

Okay, so why did the plane crash? Oddly enough, this wasn't the focus
of the trial. It was only to show that TMPP did or did not affect the
pilot's judgment. Other aspects were not considered relevant but I'll
go there anyway, because this is Language Log, not a trial, and I can
do whatever I want here.

At the very end of the flight, the pilot radioed the local control
tower in New Orleans that he was on his approach to land. Just
before this, the radio picked up another aircraft, Six Golf Hotel,
whose pilot requested permission to abort his landing because of the
heavy rainstorm. After this the transmission went:

Perhaps realizing the mistake, the tower then gave Mike Alfa a weather
report and asked him to report Alger when he passed it. Alger is a
specified checkpoint in the landing approach. It's difficult to know
what happened next but Mike Alfa's response was, "Okay, Mike Alfa," and
we never heard from him again. He had already passed Alger by that time
and the tape gives no indication that he had ever reported it.

Fighting the elements, being misidentified by the tower, and having
already passed the Alger checkpoint, the pilot was pretty busy trying
to figure out what to do. As it turns out, he was far off course and
crashed on the shore of Lake Pontchartrain. It would appear that the
pilot was, indeed, confused and disoriented, but there seems to be no
language evidence suggesting that this condition was caused by ingesting
TMPP. And if the pilot was confused, ground control seemed to be just
as confused. Note that it was the pilot who corrected the tower's
misidentification, hardly evidence of cognitive impairment.

Dueling stereotypes

People seem remarkably comfortable with inconsistent stereotypes. Not long ago, we commented on a Zits cartoon whose point was that when teen girls hang out, they all talk at once all the time, whereas teen boys hang out together in near-silent isolation. In contrast, the strip for 11/16/2006 is based on the idea that teen boys talk enthusiastically among themselves, but have nothing to say to their parents:

If you think of all this as a collection of hypotheses about how teen boys (or girls) behave, it's pretty incoherent. If you think of it as a collection of hypotheses about how (certain) adults react to teen behavior, though, you can find a pattern: whatever the kids do, it's inappropriate.

Of course, no one wants to read about people (of any age) acting and reacting appropriately. And an easy way to get an audience is to appeal to the greatest peeves of the greatest number.

[Update -- Language Hat writes:

I read that strip too, but didn't at all have your reaction -- to me
the point was not that "teen boys talk enthusiastically among
themselves, but have nothing to say to their parents," but rather that
teen boys refuse to share their exciting adventures with their
parents. In other words, it was not about language use but
secret-keeping: what happens in dudeland stays in dudeland.

In fact, I interpreted the intended message of the cartoon in exactly the same way that Hat did. But for the purposes of this post, I was focusing not on the behavior's attributed motivation, but on the depicted behavior itself. What's shown here is a 15-year-old boy who's chatty with a male friend, in contrast to the group of silent boys in the earlier cartoon.]

Mrs. Olsen gets a D

John Vann sent in this Frazz strip, which ran 11/17/2006:

John's comment: "One might append '... or the Language Log'."

In this case, there's a Wikipedia article, which discusses the situation from many angles -- and provides ammunition for smart-mouthed elementary-schoolers everywhere by giving a long list of exceptions, including "oneiromancies", which breaks the rule twice, once in each direction.

And in fact, this rule (whatever its pedagogical value) performs badly as a predictor of English letter sequences, because of the high frequency of words like "their", "science" and Germanic names like "Einstein" and "Bruckheimer". In two random stories from today's NYT, I count:

      [^c]__   c__
ie    29       5
ei    11       0

This is a total of 29 right vs. 16 wrong, for a grade of 64 on a scale of 100, or a D.

If we evaluate the performance in the terms usually used in modern AI, machine learning and similar disciplines, we'll get an F-measure of .78. I calculate this by defining the problem as predicting cases in which 'i' precedes 'e'. Then we can re-label the table in terms of predicted and observed positive ('i' before 'e') and negative ('e' before 'i') instances:

                   yes (observed)   no (observed)
yes (predicted)    29               11
no (predicted)     5                0

Then the "precision" of the test (otherwise known as "positive predictive value") can be calculated as the number of true positives divided by the sum of true positives and false positives, which here is 29/(29+11) = 0.725. This is the proportion of the time that the rule is correct when it predicts a positive outcome, i.e. that 'i' precedes 'e'.

And the "recall" of the test (otherwise known as "sensitivity") is the number of true positives divided by the sum of true positives and false negatives, here 29/(29+5) ≅ 0.85. This is the proportion of the observed positive outcomes (i.e. where 'i' precedes 'e') that is predicted by the rule.

We usually take the harmonic mean of these two figures in order to get a combined score known as the "F-measure", which here is 2*.85*.725/(.85+.725) ≅ 0.78.

This might look a bit better than the elementary-school grade of D -- after all, you'll find plenty of machine-learning papers, in the best journals and conferences, with F-measures in the upper 70s. However, the referees don't let these papers get by without comparing their performance to the obvious trivial baselines, such as predicting the commonest outcome all the time. In this case, that amounts to the rule 'i' before 'e' no matter what -- and this rule actually works quite a bit better:

                   yes (observed)   no (observed)
yes (predicted)    34               11
no (predicted)     0                0

Now we get precision of 34/(34+11) ≅ 0.76, and recall of 34/(34+0) = 1.0, for an F-measure of 2*.76*1.0/(.76+1.0) ≅ 0.86.

In terms of elementary-school grading, that would be 100*34/(34+11) ≅ 76 -- a solid C.
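
As a check on the arithmetic, here is a minimal Python sketch that recomputes precision, recall, F-measure and the percent-correct "grade" from the counts in the two tables above. It is not the author's code, and the function name `scores` is made up for illustration.

```python
def scores(tp, fp, fn, tn=0):
    """Return (precision, recall, F-measure, percent-correct grade).

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives in a 2x2 table.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # The F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    grade = 100 * (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, grade


# Conventional rule: 29 true positives, 11 false positives, 5 false negatives.
conventional = scores(tp=29, fp=11, fn=5)
# Baseline "always 'i' before 'e'": 34 true positives, 11 false positives.
baseline = scores(tp=34, fp=11, fn=0)

for name, (p, r, f, g) in [("conventional", conventional), ("baseline", baseline)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} F={f:.3f} grade={g:.0f}")
```

Computed without intermediate rounding, the F-measures come out at roughly 0.78 and 0.86, and the grades at 64 and 76, in line with the figures in the text.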

[Update: a reader wrote to complain that two random news stories is not a very big sample. So I wrote a little program to calculate the numbers for a random month of NYT newswire, from 2001 (a total of about 8.7 million words):

Now we get precision of about 0.67, and recall of 1.0, for an F-measure of 0.81. Not as good as before, but still better than the conventional rule. The "grade" of 117,835 right, 56,881 wrong, or about 67% correct, is also a bit better than the grade of the conventional rule.

Of course, any bright fourth-grader ought to be able to work out a simple rule that works a lot better than either of these. (Hint: supplement the default order with a list of the N commonest exceptions...)]
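
The counting program itself is not shown in the post. A minimal Python sketch of how such a count might be done follows. The function names and the tokenization (lowercased runs of a-z, every occurrence counted) are assumptions of this sketch, not a description of the original program, so its totals on real newswire need not match the figures above.

```python
import re
from collections import Counter


def count_ie_ei(text):
    """Count 'ie' and 'ei' sequences, keyed by (sequence, preceded_by_c)."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for m in re.finditer(r"ie|ei", word):
            after_c = m.start() > 0 and word[m.start() - 1] == "c"
            counts[(m.group(), after_c)] += 1
    return counts


def conventional_rule_score(counts):
    """(right, wrong) for "'i' before 'e' except after 'c'"."""
    right = counts[("ie", False)] + counts[("ei", True)]
    wrong = counts[("ei", False)] + counts[("ie", True)]
    return right, wrong


def always_ie_score(counts):
    """(right, wrong) for the baseline "'i' before 'e' no matter what"."""
    right = counts[("ie", False)] + counts[("ie", True)]
    wrong = counts[("ei", False)] + counts[("ei", True)]
    return right, wrong


# A made-up sample sentence, used only to exercise the functions.
sample = "Their science friend will receive oneiromancies"
counts = count_ie_ei(sample)
print(conventional_rule_score(counts), always_ie_score(counts))
```

A list of the N commonest exceptions, as the hint suggests, could be layered on top by looking each word up in a set before falling back to the default order.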

[Mark Baker raises a different point:

I've always heard the rule as "I before E except after C, when the sound's E"; I didn't think anyone had ever suggested that the rule might apply to things that don't have an E sound until I saw people discussing it online. There are still lots of exceptions, but many less: does this now outperform your alternative "I before E always" rule?

The Wikipedia article (which I linked to above) offers two augmented versions, identifying Mark's as "British":

An augmented American version is:

i before e

except after c

or when sounding like a

as in neighbor and weigh

which excludes many of the exceptions but still fails to correctly handle many others.

A lesser known addendum in America is: Neither financier seized either species of weird leisure.

A British version is:

when the sound is ee

it's i before e

except after c

which excludes most exceptions, as well as excluding some words (e.g. friend) which are correctly handled by the American version. The most frequent everyday failures of the British form of the rule are seize, caffeine, protein and, for those who pronounce the initial vowel sound ee, either and neither.

Obviously the expanded versions are going to work better. However, since they're dependent on the alignment between spelling and pronunciation, it's going to be harder to score them. And since they make predictions about different sets of cases, using different numbers of clauses of differing complexity and generality, it's not easy to compare their scores.

In any case, the point about the Mrs. Olsens of the world is that they promulgate such rules not because they accurately describe the facts, but because of some long-ago assertion felt to have been authoritative.]

It's a rule that is simple, concise and efficeint.
For all speceis of spelling it's more than sufficeint.
Against words wild and wierd, it's one law that shines bright
Blazing out like a beacon upon a great hieght,

It gives guidance impartial, sceintific and fair
In this language, this tongue to which we are all hier.
'Gainst the glaceirs of ignorance that icily frown,
This great precept gives warmth, like a thick iederdown.

Now, a few in soceity choose to deride,
To cast DOUBT on this anceint and venerable guide;
They unwittingly follow a foriegn agenda,
A plot hatched, I am sure, in some vile haceinda.

In our work and our liesure, our homes and our schools,
Let us follow our consceince, sieze proudly our rules!
Will I dilute my standards, make them vaguer and blither?
I say NO, I will not! I trust you will not iether.

]

[And this from Stephen Jones:

The wikipedia article is American, and thus biased against the British rule:
'i' before 'e'
except after 'c'
when the sound is 'ee'.
which actually works for all but a handful of words ('seize' and 'protein' and 'Sheila' are the only ones I can find doing a search of the SOED).

Now, Wikipedia suggests that British pronunciation of 'sheikh' and 'either' is the result of applying the spelling rule to pronunciation. I am most dubious of this. It seems much more likely to me that the rule was imported to the USA from the UK, altered because of differences in American pronunciation, but, like most prescriptive rules never discarded.

Linguistics in the service of astrophysics

I happened to recall today, while socializing with some
astronomer friends, that some time around fifteen years ago
a couple of UC Santa Cruz astrophysicists came to
me in my capacity as a linguist and asked me if I thought I could
coin for them a one-word term to denote the ratio between rate of rotation of a cloud of objects rotating in 3-dimensional space
such as an accretion disk (this quantity is known as vorticity)
and the average number of objects visible in a unit area of
the 2-dimensional outer surface of the 3-dimensional cloud
(known as the surface density).
I thought about it overnight, and soon got back to them with my
suggestion: vortensity. The term was promptly
used in a scientific paper, with a footnote credit for linguistic
assistance, and it caught on. I was pleased to note
just now that my term gets over 230 Google hits. Not bad for
a technical word used in such a rarefied discipline.

I did that piece of
terminology coinage pro bono. There are companies
(Lexis Branding,
for example) that charge fees in the thousands or tens of
thousands of dollars when doing similar work for corporations.
I work cheap for my friends and colleagues at UCSC's Astronomy
and Astrophysics department. The permanent place in the
scientific literature is its own reward. But in addition it's a nice
feeling to know that I have a cast-iron response ready for
any arrogant physical scientist
(in the unlikely event that an
arrogant physical scientist should ever come along; I know
it's implausible)
who might dare suggest that linguistics has no role to play
in serious science. It'll be one of those finger-in-the-chest
"oh-yeah-lemme-tell-you-something-pal" moments, won't it?

Yet Another Fieldwork Sci Fi Book

Going through my books I came across another work of science fiction that involves linguistic fieldwork that might be added to our previous lists. It is The Color of Distance by Amy Thomson (ISBN 0-441-00632-9). It is also one of the best discussions of contact with a radically different intelligent species that I have read.

November 17, 2006

Quintuple quote embedding

More on
recursive quotation embedding:
David J. Swift of Jackson, Wyoming,
writes to point out a quintuply embedded quote in a story
by Garrison Keillor. The sentence ends in a wonderful (and
perfectly grammatical) string of seven successive punctuation marks
(! ” ’ ” ? ’ ”).
Here's the whole paragraph:

“TEEN LEADERS VOW ANTI-ROCK DRIVE, AIM SMUT BAN IN AREA,” the
Gazette reported the following morning. “Longtime youth worker Diane
Goodrich enjoys having as much fun as the next person [the story went on],
but Monday night, watching a local rock band rip into a live chicken with
their teeth at the 4-H Poultry Show dance, she decided it was time to
call ‘foul.’ Evidently, more than a few people agree with her.
Last night, at a meeting in the high-school auditorium attended on a
word-of-mouth basis by literally dozens of parents, not to mention
civic leaders and youth advisers, she spoke for the conscience of the
community when she said, ‘Have we become so tolerant of deviant
behavior, so sympathetic toward the sick in our society, that, in the
words of Bertram Follette, “we have lost the capacity to say,
‘this is not “far out.” You have simply gone too far.
Now we say “No!” ’ ”?’ ”

Keillor may have constructed this with malice aforethought, but it's
really quite natural-sounding, and basically understandable.

[Update:
In the first version of this post the paragraph above was presented
with a mistake: the left quotation mark before Have we become
was double. The first version of this update wrongly said the mistake
was there in the original book. Not so.
Rechecking the paragraph from a scan provided by David Swift
reveals that the book was correct, and has the quote marks
exactly as above. Thanks to Sridhar
Ramesh in Berkeley for the first email pointing out that the quotes
didn't match up before (because the left ones didn't alternate in type).
This forced me to do a recount and get it right. Gratitude and apologies
as necessary.]

A number of people have written to bring to my attention a story that
doesn't quite amount to natural use of the language. It's called "Menelaiad", and
it appears in John Barth's Lost in the Funhouse, which goes
into seven nestings of quotation marks. As Jeff Binder explains in
an email:

It starts out with a frame-tale, in which the main character is
telling a story to some of his friends, but the story he tells turns out to
be another frame tale, and so forth, until we get monstrosities such as
this:

This story is obviously very self-conscious about what it's doing, so I
don't know if you would consider this use to be "in the wild," but it
does show recursion taken to an extreme.

It does indeed look deliberate and artificial to me --- a literary
experiment, to be classed with experiments like Italo Calvino's If on
a Winter's Night a Traveler rather than an ordinary piece of
literature; and that lessens its interest a bit. But it must technically
be regarded as entirely grammatical, given the assumption that written
English allows recursive use of quotation marks to arbitrary depth, and
the rule that you alternate quotation-mark types.
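
The alternation rule is mechanical enough to check by program. Here is a minimal Python sketch (the function name is made up for illustration), assuming the text uses curly quotation marks and contains no apostrophes, since an apostrophe is the same character as a closing single quote and would confuse so simple a check.

```python
OPEN_TO_CLOSE = {"“": "”", "‘": "’"}
CLOSERS = set(OPEN_TO_CLOSE.values())


def max_quote_depth(text):
    """Return the deepest level of quotation nesting in text.

    Raises ValueError if a nested quotation fails to alternate mark
    types, if a closing mark does not match the open quotation, or if
    a quotation is left unclosed.
    """
    stack = []
    deepest = 0
    for ch in text:
        if ch in OPEN_TO_CLOSE:
            if stack and stack[-1] == ch:
                raise ValueError("nested quotation does not alternate mark type")
            stack.append(ch)
            deepest = max(deepest, len(stack))
        elif ch in CLOSERS:
            if not stack or OPEN_TO_CLOSE[stack[-1]] != ch:
                raise ValueError("closing mark does not match the open quotation")
            stack.pop()
    if stack:
        raise ValueError("unclosed quotation")
    return deepest


# A shortened stand-in with the same nesting as the end of the Keillor paragraph.
keillor_like = "“she said, ‘Have we, “we have lost, ‘this is not “far out.” Now we say “No!” ’ ”?’ ”"
print(max_quote_depth(keillor_like))
```

Run on the shortened example, the function reports a depth of five, matching the "quintuple" count.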

I can't even spell "linguification"

Over at Watch Me Sleep, my good friend Ed Keer decided to make public our little disagreement this past weekend over whether the snowclone "can't even spell X" is an example of linguification. Ed's characterization of the argument makes it seem as if my position was just "Because Geoff Pullum said so!", and so I feel I must clarify.

(In fact, I had had enough to drink by that point that I couldn't remember anything that Geoff ever wrote except "Turkish exhibits a classic counterbleeding relationship between epenthesis and deletion" -- which, of course, is just plain wrong. But that's neither here nor there.)

Also, I want to add some more examples to the linguification mix.

Ed's position, in his own words, is that "the 'Can't even spell X' snowclone is a comment on the extent of a person's knowledge about a particular subject, so it does not qualify as a linguification". My position is that Ed's conclusion doesn't follow from his (inarguably correct) premise. In fact, most if not all linguifications that Geoff has discussed here on Language Log are (meant to be) comments on the extent of something; Geoff's point is that linguifications make rather curious comments of that sort, in particular because they fail as good examples of metaphor or hyperbole.

Now that I'm sober, I know better than to try to reconstruct Geoff's arguments from scratch; I'll start by quoting what Geoff writes about the distinction between linguification on the one hand and metaphor or hyperbole on the other hand.

To linguify a claim about things in the world is to take that claim and construct from it an entirely different claim that makes reference to the words or other linguistic items used to talk about those things, and then use the latter claim in a context where the former would be appropriate.

[E]ven if some linguifications can indeed be said to fall within the domain of metaphorical usage, that misses my point. In general, for most kinds of metaphors, it is easy to understand why people use them. They get the point across briefly and vividly. To say that the new office manager is a pussycat establishes instantly — just as a good caricature might — that the man's general demeanor and behavior suggests a cute, cuddly, playful, non-serious, easy-to-deal-with, tractable, non-fearsome nature that otherwise might take a considerable amount of time-wasting careful description to get across.

But I simply do not understand why people use linguification. If it gets the point across at all, it does it only indirectly and clumsily: we have to infer from [a] statement about [e.g.] word distributions, usually one that is false, some underlying statement that is only very imperfectly connected to it.

Hyperbole takes a claim and exaggerates it, so that if the hyperbolic version were true, the original claim would be true a fortiori. I do know about humorous uses of hyperbole. I believe I used it in the first line of this post ["About a million people have written to me ..."]; wouldn't you say so? Take that as an example. My underlying claim is that lots of people wrote to me. If my exaggeration in calling it millions were really true, the underlying claim would be all the truer. [...]

As I have tried to explain, patiently but fruitlessly, [...] Gilbert's figure of speech is very different. Take his underlying (broadly true) claim that parents don't get to go to the theater much after the kids are born. If he had said that parents forget what going to the theater is like, that would be hyperbole (they don't forget, it just becomes a tiny bit unfamiliar as far as recent experience is concerned). And if the hyperbolic claim were true — if they completely forgot what happens in theaters — then the underlying claim would be all the truer. However, he says instead that parents actually forget how to pronounce words like "theater". If they did, that would not make the underlying claim true. The loss of this snippet of pronunciation information would not mean that they had forgotten their experiences of theater. Nor would it mean they couldn't go: they get in a taxi, take it to Broadway, and point; or they could tell people they wanted to go to the big building downtown with the lights and the curtain and the actors.

Now take one of Ed's examples: "Accountability? They Can't Even Spell it!" The idea is clear: the "they" referred to here are not (held) accountable for something. But it's not hyperbole: if it were true that "they" could not spell "accountability", it wouldn't be all the truer that they aren't (held) accountable for something. It's also not a good example of a metaphor, unless you make the very improbable assumption that the first (or somehow primary) step in being (held) accountable is to be able to spell "accountability". (This could also be the first step in not being (held) accountable, though, so the metaphor fails anyway.)

Ed closes his post with "You see what I'm sayin? Or do I have to spell it out for you?" Now in this case, I think "spell it out for X" is a great metaphor. Typically, you spell a word out for someone when you know or suspect that they don't know how to spell it; likewise, you painstakingly explain something to someone when you know or suspect that they don't understand it. It's a good metaphor, then, to say that painstaking explanation of something is like spelling out a word. Nothing linguificational about it, really.

Another example came up yesterday while I was talking to some friends. One of them was noting that he didn't have any cash, and I remembered that he had just spent some money at Guitar Center (a.k.a. "the Evil Empire" -- no link for you!), so I added: "... because you've just been to Guitar Center." Another friend said: "Those two statements often follow each other", by which I assume he meant that the Guitar-Center statement often follows the lack-of-cash statement. That's another example of linguification, similar to the kind Geoff describes here.

Finally, I'd like to suggest that my puzzlement over the "that's why they call it X" snowclone (follow-up here) is because this is also an example of linguification, though perhaps of a different kind than those that Geoff has identified. Certainly the number of comments and e-mail responses I've gotten to those posts (see also here), with completely unhelpful analyses of and excuses for this puzzling snowclone, are very much like the millions of unhelpful responses that Geoff has gotten to his various attempts to explain the difference between linguification and metaphor/hyperbole.

"Their their," I needed to hear them say

I'm grateful for your Language Log posts on the spelling nonsense in Scotland and New Zealand, not least because they've spurred me to read once again the George Starbuck poem appended below. It's harder to type than you might think, but I hope you'll agree it's worth the trouble.

My favorite student lately is the one who wrote about feeling clumbsy.
I mean if he wanted to say how it feels to be all thumbs he
Certainly picked the write language to right in in the first place.
I mean better to clutter a word up like the old Hearst place
Than to just walk off the job and not give a dam.

Another student gave me a diagragm.
"The Diagragm of the Plot in Henry the VIIIth."

Those, though, were instances of the sublime.
The wonder is in the wonders they can come up with every time.

Why do they all say heighth, but never weighth?
If chrystal can look like English to them, how come chryptic can't?
I guess cwm, chthonic, qanat, or quattrocento
Always gets looked up. But never momento.
Momento they know. Like wierd. Like differant.
It is a part of their deep deep-structure vocabulary:
Their stone axe, their dark bent-offering to the gods:
Their protoCro-Magnon pre-pre-sapient survival-against-cultural-odds.

You won't get me deputized in some Spelling Constabulary.
I'd sooner abandon the bag-toke-whiff system and go decimal.
I'm on their side. I better be, after my brush with "infinitessimal."

There it was, right where I put it, in my brand-new book.

And my friend Peter Davison read it, and he gave me this look,
And he held the look for a little while and said, "George..."

I needed my students at that moment. I, their Scourge.
I needed them. Needed their sympathy. Needed their care.
"Their their," I needed to hear them say, "their their."

You see, there are Spellers in this world, I mean mean ones too.
They shadow us around like a posse of Joe Btfsplks
Waiting for us to sit down at our study-desks and go shrdlu
So they can pop in at the windows saying "tsk tsk."

I know they're there. I know where the beggars are,
With their flash cards looking like prescriptions for the catarrh
And their mnemnmonics, blast 'em. They go too farrh.
I do not stoop to impugn, indict, or condemn;
But I know how to get back at the likes of thegm.

For a long time, I keep mumb.
I let 'em wait, while a preternatural calmn
Rises to me from the depths of my upwardly opened palmb.
Then I raise my eyes like some wizened-and-wisened gnolmbn,
Stranger to scissors, stranger to razor and coslmbn,
And I fix those birds with my gaze till my gaze strikes hoslgmbn,
And I say one word, and the word that I say is "Oslgmbnh."

"Om?" they inquire. "No, not exactly. Oslgmbnh.
Watch me carefully while I pronounce it because you've only got two more guesses
And you only get one more hint: there's an odd number of esses,
And you only get ten more seconds no nine more seconds no eight
And a wrong answer bumps you out of the losers' bracket
And disqualifies you for the National Spellathon Contestant jacket
And that's all the time extension you're going to gebt
So go pick up your consolation prizes from the usherebt
And don't be surprised if it's the bowdlerized regularized paperback
abridgment of Pepys
Because around here, gentlemen, we play for kepys."

Then I drive off in my chauffeured Cadillac Fleetwood Brougham
Like something out of the last days of Fellini's Rougham
And leave them smiting their brows and exclaiming to each other "Ougham!
O-U-G-H-A-M Ougham!" and tearing their hair.

Intricate are the compoundments of despair.

Well, brevity must be the soul of something-or-other.

Not, certainly, of spelling, in the good old mother
Tongue of Shakespeare, Raleigh, Marvell, and Vaughan.
But something. One finds out as one goes aughan.

November 16, 2006

Ear accidents

I'm in Manchester for a couple of days, serving as one of the external moderators for an "externally moderated reflective self-examination" at the National Center for Text Mining (NaCTeM). Although the formal process doesn't start until tomorrow morning, I spent this afternoon learning about some of the technical work in and around the center. One stop on the tour was Bill Black's office, where he and Jock McNaught told me about the CAFETIERE information-extraction system. Jock was describing his experience in supervising undergraduate students to adapt this system to new topic areas, and one of the examples involved creating a database of information drawn from news stories about ear accidents.

As I tried to assimilate his description into my gradually-developing image of how the system works and what it would be like to adapt it to a new domain, it came to me that I had stumbled on an interesting cultural difference between the U.S. and the U.K. I don't believe that I've ever seen a news story in a U.S. publication about an ear accident. No doubt such accidents do occur -- piercings gone wrong, and the like -- but in my experience, they don't make the news. Could this be an aspect of pub culture that has so far escaped my notice; or an unanticipated side effect of playing cricket? I looked puzzled, I guess, because Jock repeated the phrase, "news reports about ear accidents". "Ear accidents", I echoed meditatively, tugging on my earlobe.

Well, of course, Jock was talking about "air accidents".

[Update -- Simon Cauchi writes:

I wonder if Jock McNaught is by any chance a New Zealander.

See Margaret Betterham's article, "The apparent merger of the front centring diphthongs — EAR and AIR — in New Zealand English", in New Zealand English, edited by Allan Bell & Konraad Kuiper (Victoria University Press, 2000), pp. 111-145.

No, Jock is a Scot.

And I don't think he merges ear and air (though I didn't check). Rather, I believe that he pronounces air as [er], with a vowel perhaps on the high side of [e] and a trilled [r], whereas I'm used to hearing something closer to [ɛɹ] or perhaps [ɛeɹ], with a much lower nucleus.]

When stereotypes hang out

Here's a striking example of popular ideas about laconic guys and gabby girls, sent to me by Arnold Zwicky:

I don't know any evidence, one way or another, about the relative talkativeness of American male and female high-school students in hanging-out situations like those in the strip. But as discussed in a number of earlier posts, whenever people have measured sex differences in amount of talking in a variety of other settings, the result has generally been no difference, or a small difference in favor of more talk from males:

It's fair to observe that all the cited measurements have been made in contexts where talk is expected. There might be a sex difference, at least for some groups, in whether or not talk is expected when hanging out. And then again, there might not be.

Given all the quantitative social science out there, you'd think that someone would have measured this. It wouldn't be a very hard kind of experiment to do, except for the generic problem of the Observer's Paradox. If you know of any relevant research, please tell me.

If there isn't really a sex difference in talkativeness, why would so many people think that one exists? Well, once a stereotype is well established, confirmation bias kicks in. And maybe some people think that women are more talkative because they wish that certain women would say less; and maybe some people think that men are less talkative because sometimes they wish that certain men would say more. Or something like that. In any case, it's suspicious that there are apparently no actual word or talk-time counts that confirm the stereotype.

[Update -- Jim Roberts provides an anecdote, which is consistent with my own observations of groups of young men playing video games together:

I am a white American (well, Canadian, but I’ve lived in the States for eleven years) male and, consequently, have had occasion to gather together with a group of like-minded men for many an evening of video gaming. And, good God, are we loud. Several times the guys have been in one room playing games while the women are gathered in another room . . . doing whatever it is they do. Talking, I think, and perhaps baking or knitting. Too busy gaming to really notice, frankly. And those times when we’ve been separated by gender, at least once in the course of festivities a representative from the women is sent forth to the men to tell us to shut our collective traps before the neighbours call the cops. It’s been this way since my early teens. Men are possibly at their loudest and most vocal when playing video games, in my estimation only outstripping themselves when watching a sporting event.

And Theo Vosse writes:

In a political debate (there are elections next week) between what the British would call back-benchers, the speech rate for men was quoted as being 7,000 versus 23,000 for women. The reporter in question considered his statement well researched and hence this piece of knowledge could be used to discuss the role of women in politics. The consequence seemed to be that women were more factual and placed a larger weight on arguments than men. Thus the country would be better off with more women in politics.

A little knowledge is a dangerous thing...

"A little misinformation" would be closer to the mark, in this case. (And I presume that Theo's source means "speech rate" to be denominated in words per day, no doubt estimated by the standard Eskimo snow-words technique, otherwise known as "making up numbers".)

Anyhow, I've gotten no pointers to any systematic factual comparisons of talkativeness in "hanging out" situations, as yet. ]

[Eleanor Wroblewski writes:

Well, I'm certainly a gabby girl (and just a year older than Jeremy
Bucket), and with some of my female friends there's definitely a
replication of the whole "everyone-talk-at-once" thing. And there are
male friends who wish I would shut up sometimes, and male friends I
wish would talk more sometimes. However, in my social group, there are
definitely quieter people and louder people of both genders, and
although admittedly I have not witnessed male-only social interactions
often, I have reason to believe that the differences in discourse
between groups of different gender compositions, if you ignore
specific individuals, probably has a lot more to do with the "texture"
of conversation, rather than the wordcount.

Well, if "texture" means how much overlapping talk there is, then that should come out in the collective wordcount as well, though I agree that it would be better to measure it more directly, in terms of the distribution of number of simultaneous talkers at regularly-sampled (or randomly-sampled) time points. Eleanor has raised the hypothesis (if I can put words in her mouth) that a group of her female friends might show higher counts of two (or three or four) people talking at once, compared to a group of their male counterparts. The measurement is easy enough to make, though testing the hypothesis is made harder by the fact that different groups and different circumstances doubtless vary widely.

]
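The overlap measurement suggested in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of hypothetical `(speaker, start, end)` talk segments in seconds), the function name `overlap_distribution`, and the half-second sampling step are all invented here, and real recordings would first need to be segmented by speaker.

```python
from collections import Counter

def overlap_distribution(segments, step=0.5):
    """Count how many people are talking at regularly sampled time points.

    segments: list of (speaker, start, end) tuples, times in seconds
              (a hypothetical format chosen for this sketch).
    step:     sampling interval in seconds.
    Returns a Counter mapping number-of-simultaneous-talkers -> number of samples.
    """
    if not segments:
        return Counter()
    end_time = max(end for _, _, end in segments)
    counts = Counter()
    # Use an integer sample index to avoid accumulating floating-point drift.
    for i in range(int(end_time / step) + 1):
        t = i * step
        talkers = {spk for spk, start, end in segments if start <= t < end}
        counts[len(talkers)] += 1
    return counts

# Three invented speakers with partly overlapping turns.
segments = [("A", 0.0, 2.0), ("B", 1.0, 3.0), ("C", 1.5, 2.5)]
dist = overlap_distribution(segments, step=0.5)
for n_talkers in sorted(dist):
    print(n_talkers, "talker(s):", dist[n_talkers], "samples")
```

Comparing two groups would then mean comparing these distributions across many recordings, which is where the variation among groups and circumstances makes the hypothesis hard to test.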

[But Eleanor replies:

Well, you did sort of put words in my mouth; I'm not sure exactly how
to describe "texture", but I think it's a combination of who says what
when, and how it's discussed, and the same things being talked about
in different ways. The classic is the fixation on emotions for girls,
but really I think that's just me being stereotypical . . . Also
conversational topics to a certain extent. But really a lot of it
varies on individuals and I just happen to fit the girl stereotype
pretty neatly except I talk about boys with physics metaphors (e.g.
"delta boy since August until now is perilously close to zero") and
say things like, "Well, I bet you don't know what the capital of
Azerbaijan is . . . Oh burn," and sometimes will make an absolutely
opaque grammar joke. But, you know, I talk all the time and about
things like food and romance and makeup, so I'm like a stereotype
personified.

OK, point taken. I guess you could interpret the Zits strip as an imaginative (i.e. false) reconstruction of such "texture" differences in terms of other stereotypes. Meanwhile, Rita Rouvalis Chapman wrote in to describe in verse her recent impressions of the reality, from a different perspective:

I'm a high school (English) teacher.

Do 16-year-old boys ever shut up?
Oh no, they do not.

They will talk if I pout.
They will talk if I shout.

Shall I give them a detention?
They do not think this even worth a mention.

They will quote the latest movie.
They will bust the latest Jay-Z.

Should I put them in the hall?
No, out there they have a ball!

They don't care if it's a test.
That's the time they like to whisper best.

November 15, 2006

No longer the subject of that mighty verb

Douglas Davidson submits to the linguification desk a sentence from Simon Jenkins in The Guardian, an utterly baffling piece of linguifying that ends by saying of President Bush and Prime Minister Blair that

They are no longer the subject of that mighty verb, only its painful object.

I not only have no idea what he means, I am inclined to doubt that even he has any idea what he means. Here it is again with a paragraph of context:

Bush and Blair are men in a hurry, and such men lose wars. If there is a game plan in Tehran it will be to play Iraq long. Why stop the Great Satan when he is driving himself to hell in a handcart? If London and Washington really want help in this part of the world they must start from diplomatic ground zero. They will have to stop the holier-than-thou name-calling and the pretence that they hold any cards. They will have to realise that this war has lost them all leverage in the region. They can insult and sanction and threaten. But there is nothing left for them to "do" but leave. They are no longer the subject of that mighty verb, only its painful object.

To start with, which "mighty verb", and why is it mighty? Does he mean the mighty verb leave? Because making Bush and Blair the object of that verb would give us sentences like this:

The journalists decided to leave Bush and Blair, and went off to look for Britney Spears.

I can't make anything sensible out of the idea that Bush and Blair have switched from being in a position to be described by true sentences in which Bush and Blair would be the subject (as in "If Bush and Blair leave Iraq, lots of people will be happy") to being in a position to be described by true sentences in which Bush and Blair serves as direct object.

Could Jenkins have meant the verb do, which he oddly puts in greengrocer quotation marks (WE HAVE "FRESH" TOMATOES!), as if they constituted a way of showing emphasis? Same problem. Doing Bush and Blair (in any sense) just doesn't seem to be what he's talking about.

And why "its painful object"? In what sense can a grammatical object be painful? What exactly is painful here? [Don't tell me you think that "do" is intended in the sexual sense, and that's what's painful. Jenkins doesn't mean that; does he? At least one person who has already emailed me thinks that he does: he means Iraq is going to "do" Bush and Blair in the sense of bugger them. I guess what this reminds me of most is the crude remark by John McClane, the cop played by Bruce Willis in the 1988 movie Die Hard, when he points out to the recently humiliated police chief, "You're the one who just got butt-fucked on nationwide TV." Could Simon Jenkins really have that in mind? It doesn't seem plausible to me.]

Simon Cauchi has pointed out to me what is probably the right answer (I am modifying this from the earlier version of the post). He points to an earlier sentence (with another linguification I hadn't noticed!). I missed it because it is an astonishing nine paragraphs back from what I quoted above; but it is the key. Jenkins says that in Iraq:

It is total anarchy. All sentences beginning, "What we should now do in Iraq ... " are devoid of meaning. We are in no position to do anything. We have no potency; that is the definition of anarchy.

So here is what is going on. The "mighty verb" is do. And Jenkins is equating "subject" with "person who does things and is in control of actions", and "object" with "person or thing that gets things done to it and is not in control of actions", and "verb" with "action". He means Bush and Blair (surrogates for the whole of the West) are not in control, and will soon not be able to decide to take the action of doing something to get out of Iraq; rather, they will have something done to them -- they will be pushed out of Iraq by the actions of others.

In grammar, "subject" doesn't mean "actor", and "object" doesn't mean "undergoer", and "verb" doesn't mean "action". Those confusions, much beloved of traditional pedagogical grammar, bedevil serious attempts to teach syntax. Jenkins' double effort at linguifying here has stuck him with a sentence that is hard to make any sense of at first. It's one of the most ill-judged attempts at effective writing I've seen in quite a while.

It all makes me think that perhaps Simon Jenkins needs a little rest from writing columns to deadline. (Jenkins was also responsible for some of the "entertaining foolishness" concerning the recent great spelling brouhaha.)

If it was good enough for King Alfred the Great...

Do you own a copy of Merriam-Webster's Concise Dictionary of English Usage? If not, go immediately to your favorite bookseller and buy one. Believe me, it'll be the best $13.22 (or even $16.95, if you pay list price) that you've spent in a while. Geoff Pullum recommended it last year ("Don't put up with usage abuse", 1/15/2005), in response to a reader's question about what references or authorities to trust with respect to style and usage. Geoff used blurb-worthy phrases like "the best usage book I know of" and "this book ... is utterly wonderful", and I agree with him.

Why am I plugging this book today? Because it provides a perfect answer to a note from a reader about the use of less and fewer.

Matt Cockerill sent in a link to an article in the Guardian (John Mullan, "M&S: the pedant's store", 10/6/2006). Apparently a customer complained about apostrophe placement ("I do not care to dress my child in a top containing a glaring grammatical giraffe gaffe"), and after an appeal to their "childrenswear technologist", M&S withdrew the offending item from their stores, apologized, and sent a refund. Matt focused on the article's passing mention of an earlier M&S capitulation to customers' grammatical prejudices:

M&S, of course, likes to project a classy image and this confession of grievous fault rather neatly confirms it as the favoured shop of those with high standards, in grammar as in everything else. A few years ago it changed its "6 items or less" checkout signs for replacement signs declaring, more correctly, "6 items or fewer", reportedly after customers had grumbled.

Matt observed that this "crops up as a standard example of the shoddy grammar of our modern age in newspaper articles here all the time", and registered a counter-grumble:

... the weird thing is, I've never once seen anyone point out that there's nothing grammatically wrong with '5 items or less', and in fact it's much more natural and less stilted sounding than '5 items or fewer'.

The key, as far as I'm concerned, is to realize that it's quite valid to think of '5 items or less' to imply an ellipsis:
"5 items or less... [than that amount of shopping]"

and in that it's no different from any number of standard grammatical usages which make use of ellipsis.

I'm also always tempted to ask whether they would replace the sign outside a kids playground to indicate that it may be used only by children who are "5 years old or fewer"...

Matt's grammatical instincts are exactly right. He's also correct in observing that with ages -- and in certain other cases of countables as well, which MWCDEU summarizes as "distances, sums of money, units of time, and statistical enumerations" -- less is generally preferred to fewer. And Matt's observation about a possible construal of "5 items or less" also seems valid to me, although I think it's a secondary point. The primary point is that the now-standard pedantry about less/fewer is in fact one of the many false "rules" that have recently precipitated out of the over-saturated solution of linguistic ignorance where most usage advice is brewed.

But not the usage advice at MWCDEU. This is the start of its entry on less/fewer:

Here is the rule as it is usually encountered: fewer refers to number among things that are counted, and less refers to quantity or amount among things that are measured. This rule is simple enough and easy enough to follow. It has only one fault -- it is not accurate for all usage. If we were to write the rule from the observation of actual usage, it would be the same for fewer: fewer does refer to number among things that are counted. However, it would be different for less: less refers to quantity or amount among things that are measured and to number among things that are counted. Our amended rule describes the actual usage of the past thousand years or so.

As far as we have been able to discover, the received rule originated in 1770 as a comment on less:

This Word is most commonly used in speaking of a Number; where I should think Fewer would do better. No Fewer than a Hundred appears to me not only more elegant than No less than a Hundred, but strictly proper. --Baker 1770

Baker's remarks about fewer express clearly and modestly -- "I should think," "appears to me" -- his own taste and preference. [...]

How Baker's opinion came to be an inviolable rule, we do not know. But we do know that many people believe it is such. Simon 1980, for instance, calls the "less than 50,000 words" he found in a book about Joseph Conrad a "whopping" error.

The OED shows that less has been used of countables since the time of King Alfred the Great -- he used it that way in one of his own translations from Latin -- more than a thousand years ago (in about 888). So essentially less has been used of countables in English for just about as long as there has been a written English language. After about 900 years Robert Baker opined that fewer might be more elegant and proper. Almost every usage writer since Baker has followed Baker's lead, and generations of English teachers have swelled the chorus. The result seems to be a fairly large number of people who now believe less used of countables to be wrong, though its standardness is easily demonstrated.

MWCDEU then gives a couple of pages of illustrative examples in both directions, dealing especially with the "common constructions" with countables where less continues to be used more often than fewer "in present-day written usage". The concluding advice:

If you are a native speaker, your use of less and fewer can reliably be guided by your ear. If you are not a native speaker, you will find that the simple rule with which we started is a safe guide, except for the constructions for which we have shown less to be preferred.

I've scanned the whole less/fewer entry, and made it available here. Now validate my stretching the boundaries of fair use, and go buy the book!

I'm going to add a couple of observations based on web searches. First, Google News validates MWCDEU's observation about the difference between countables in general -- where journalists and their editors prefer fewer about 2-to-1 -- and things like units of time and amounts of money, where less is preferred by a whopping margin. Here are counts for a few different countables, in the "less/fewer than N items" construction:

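| | votes | people | players | pages | hours | minutes | seconds | dollars |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| less than N __ | 547 | 263 | 10 | 21 | 5,060 | 14,000 | 1,830 | 127 |
| fewer than N __ | 1,100 | 451 | 43 | 7 | 156 | 364 | 34 | 1 |
| less/fewer | 0.50 | 0.58 | 0.23 | 3.0 | 32.4 | 89.7 | 53.8 | 127 |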

And the same tendencies can be seen on the web in general, except that the ratios are generally shifted in the direction of less. No doubt this is due to the effects of copy-editing on the Google News sample.

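| | votes | people | players | pages | hours | minutes | seconds | dollars |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| less than N __ | 272,000 | 2,570,000 | 183,000 | 696,000 | 9,110,000 | 13,400,000 | 3,840,000 | 2,220,000 |
| fewer than N __ | 182,000 | 837,000 | 61,500 | 112,000 | 552,000 | 92,200 | 22,300 | 921 |
| less/fewer | 1.50 | 3.07 | 2.98 | 6.2 | 16.5 | 145.3 | 172 | 2,410 |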

But interestingly, in that "N items or less/fewer" construction, the less/fewer ratios generally shift away from fewer and towards less. At least, that's clearly true for the cases of countables where fewer is reasonably common to start with. Here are the counts from Google News:

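| | votes | people | players | pages | hours | minutes | seconds | dollars |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| N __ or less | 128 | 15 | 1 | 1 | 125 | 548 | 106 | 19 |
| N __ or fewer | 3 | 2 | 1 | 0 | 2 | 3 | 0 | 0 |
| less/fewer | 42.7 | 7.5 | 1.0 | NA | 62.5 | 182.7 | NA | NA |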
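For readers who want to check the arithmetic, here is a minimal Python sketch of how the less/fewer row can be recomputed from the two rows of counts, using the Google News figures for "N __ or less" and "N __ or fewer" given above. The function name `less_fewer_ratio` and the dictionary layout are just choices made for this illustration; the one substantive assumption is that "NA" marks the cases where the fewer count is zero, so that the ratio is undefined.

```python
def less_fewer_ratio(less_count, fewer_count):
    """Return less/fewer, or None (shown as NA) when there are no 'fewer' hits."""
    if fewer_count == 0:
        return None
    return less_count / fewer_count

# Google News counts for "N __ or less" and "N __ or fewer", copied from the text.
news_counts = {
    "votes":   (128, 3),
    "people":  (15, 2),
    "players": (1, 1),
    "pages":   (1, 0),
    "hours":   (125, 2),
    "minutes": (548, 3),
    "seconds": (106, 0),
    "dollars": (19, 0),
}

ratios = {noun: less_fewer_ratio(less, fewer)
          for noun, (less, fewer) in news_counts.items()}

for noun, ratio in ratios.items():
    shown = "NA" if ratio is None else f"{ratio:.1f}"
    print(f"{noun:8s} {shown}")
```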

And here are counts from the web at large:

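| | votes | people | players | pages | hours | minutes | seconds | dollars |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| N __ or less | 14,200 | 134,000 | 13,200 | 215,000 | 1,890,000 | 2,100,000 | 944,000 | 248,000 |
| N __ or fewer | 495 | 30,600 | 1,570 | 19,500 | 24,200 | 757 | 6,760 | 125 |
| less/fewer | 28.7 | 4.38 | 8.41 | 11.0 | 78.1 | 2,774 | 140 | 1,984 |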

I'm not sure whether Matt's ellipsis theory is the reason for this shift, though it's a reasonable possibility: when someone writes or says "1,000 votes or less", they may well mean "1,000 votes or less of a margin than that", rather than "1,000 votes or less votes than that". But King Alfred says that they'd be OK either way, and so do most other English writers in the millennium between his time and ours.

[A small pedantic confession: King Alfred had the genitive case at his disposal, and so his use of less with a count noun is actually a partitive construction of a type that we can't copy idiomatically in modern English -- "less of words":

But still. And I'll make up for doubting Alfred's modern relevance by filling in one data point from the intervening centuries -- a footnote by Alexander Pope, to book XIV, verse 291 of his translation of the Iliad:

But whoever considers his Circumstances will judge after another manner. Priam, after having been the most wealthy, most powerful and formidable Monarch of Asia, becomes all at once the most miserable of Men; He loses in less than eight Days the best of his Army, and a great Number of virtuous Sons; he loses the bravest of 'em all, his Glory and his Defence, the gallant Hector.

The use of less with a count of time-units has always been preferred, as MWCDEU observes. But I was surprised by that "bravest of 'em all", in a footnote no less. ]

November 14, 2006

Wanna: neither slang nor language murder

The Guardian (11/1/2006) obediently repeats from a press release that the non-standard features that will be permitted in examination answers (as Mark recently noted) range "from the slang ‘wot’ and ‘wanna’, to the short cut ‘CU L8R’". Can't Guardian journalists even look up the meaning of the word "slang" in a dictionary that's free on the workstation on the desk in front of them?

The clearest point here concerns wanna, a standardized spelling (constantly used, for example, in representing dialog in novels) for a kind of derived amalgam of want and to about which linguists have written reams over the last thirty years (since David Lightfoot suggested it provided crucial evidence for a certain theoretical point in syntax [Linguistic Inquiry 1976] and Paul Postal and I took out after this false claim in a whole series of papers [Linguistic Inquiry 1978, 1979, 1982, 1986], there have been dozens of papers on the topic; two have been in Language, one by me in 1997 and one in the latest 2006 issue by Dick Hudson). This isn't a slang form by any conceivable definition of slang: it isn't peculiar to a particular group, it's familiar to every American speaker (and I think most British-derived speakers too). It isn't non-standard at all; it's part of informal style in Standard English (which is why it's treated in The Cambridge Grammar of the English Language). It isn't a recent coinage; it isn't "arbitrarily changed"; it isn't "extravagant, forced, or facetious". Nobody who knows what slang is could think wanna is slang.

But then I'm not sure that any testing authorities actually gave any of the examples, or characterized them as slang. The Guardian says: "The Scottish Qualifications Authority (SQA) said the use of phrases like "2b r nt 2b" or "i luv u" in exam papers would be allowed as long as candidates showed that they understood the subject." Maybe you truly believe that a staff member from the Scottish Qualifications Authority solemnly told education journalists that "2b r nt 2b" is now an acceptable spelling of the first line of Hamlet's famous soliloquy as far as the testing of knowledge of Shakespeare is concerned, but I don't. I think the journalist tossed that in as a gratuitous illustration. My guess is the SQA did little more than announce (or remind people of the prior existence of) a policy that says an unconventional spelling will not automatically lead to a student who sees what the answer is being graded the same as one who didn't know. (This seems sensible. Preserving the distinction between students who do know the answer and students who don't is surely rather important educationally. But I suppose that makes me a dangerous linguistic libertine.)

Katie Grant of The Sunday Times (11/5/2006) uses the texting-is-OK story as part of her evidence that "Our language is being murdered." It's not all of her evidence; her rant touches on a variety of different subjects relating to the supposed disastrous slippage of linguistic standards in Britain's schools. She is aware that language changes, but she dismisses this briskly by saying, "what the "anything goes" brigade refuse to acknowledge is that there is a difference between developing language and abandoning it."

Our language is being murdered and/or abandoned because your is occasionally spelled ur (as opposed to the common older abbreviation yr) by a teenager writing in a hurry? You know, we do try to exaggerate the dumbness of newspaper stories about language here at Language Log, for a little humorous leavening; but it isn't very easy, because what the journalists say is often far out beyond where satire can reach.

When life is funnier than the funnies

The world's media recently saw a flurry of stories about a non-story: exam-grading authorities in Scotland and New Zealand explained, when asked, that they give partial credit for correct answers that are wrongly spelled -- a long-standing policy that also covers occasional intrusions of abbreviated spellings from the culture of text messaging. This led to blasts and counter-blasts of end-times rhetoric from pundits and politicians. There was an especially funny exchange between Bill English, the spokesman for education of the National Party in New Zealand, and Steve Maharey, New Zealand's minister of education:

English: This kind of pigeon English is fine for young people organising their social lives, but it is not an acceptable way of expressing an academic argument or idea.

Maharey: The statement is understandable, despite pidgin being spelt p-i-g-e-o-n, as in a bird from the dove family, rather than p-i-d-g-i-n, as in simplified language used between persons of a different nationality. But we will give him credit.

Although I'm a big fan of User Friendly, I'm afraid that in this case the politicians are funnier:

It's normal, if unfortunate, that politicians and the mass media get this kind of thing wrong. But I expect better from the cartoonists. [Tip of the hat to Robin Shannon]

Satirical cartoon uptalk is not HRT either

There's a widespread false belief that "uptalk" -- the phenomenon of final rising intonation used on phrases that aren't yes/no questions -- involves terminal pitch contours that start high and rise. As a result, some people use the unfortunate technical term "High rising terminal", abbreviated HRT, for this way of talking. But as I've argued earlier ("Uptalk is not HRT", 3/28/2006), the informal term uptalk is a better choice, since it avoids the often-false claim about the shape of this contour that's implicit in the term HRT.

Some additional evidence emerged yesterday on Fox's Family Guy (Episode FG-435 "Whistle While Your Wife Works", Air Date: Sunday, November 12, 2006), when Stewie tries to persuade Brian to break up with Jillian, who is described in the episode's press release as "very attractive but intellectually challenged". Stewie tries to make his point by satirically imitating Jillian's (alleged) uptalk, adding some annoying little nods and grimaces:

There are five rise-ending phrases in this short clip. In each case, I've given a transcribed pitch track of the end of the phrase, and also some numbers showing the pitch value (in Hz.) in the middle of some selected syllables -- or in the case of final rises on final accented syllables, in the middle of the initial lower-pitched region, and then at the location of the peak.

The first four examples are in Stewie's little "dump her" speech.

Alright, Brian, you can do this.
You can dump her.
Because once it's done,
never again will you have to listen to her talk like thi......s?
199 202 182 162..334

In this first case, the low part of "this" is fully 40 Hz. lower than the value of the previous accented syllable, "talk". (The even lower pitch in the low-amplitude region at the start of "this" is the consequence of the restricted air-flow during the voiced fricative [ð].)

You know, where everything has a question mark at the end of it?
213 210 169 321

Again, the low pitch value on the accented syllable of "question" is 44 Hz. lower than the value of the accented syllable of "everything".

With an upward inflection?
205 152 335

And in this case, the low value on the accented syllable of "inflection" is 53 Hz. lower than the previous accent on "upward".

at the end of every sentence?
198 159 347

This time there's a 39 Hz. difference -- the low point on the stressed syllable starting the final rise is still the lowest accented syllable by almost 20%.
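
The arithmetic behind these four comparisons is easy to check. The sketch below is in Python; the tuple layout, labels and function name are illustrative choices of mine, and the only data are the Hz values quoted above (the previous accented syllable, the low point at the start of the final rise, and the final peak).

```python
# Pitch values in Hz, read off the four transcriptions above:
# (label, previous accented syllable, low point of final accent, final peak)
stewie_rises = [
    ("example 1 (talk like this)", 202, 162, 334),
    ("example 2 (question mark)", 213, 169, 321),
    ("example 3 (upward inflection)", 205, 152, 335),
    ("example 4 (every sentence)", 198, 159, 347),
]

def describe(prev_accent, low, peak):
    """Return the drop into the final accent (Hz and percent)
    and the size of the final rise (Hz)."""
    drop_hz = prev_accent - low
    drop_pct = 100.0 * drop_hz / prev_accent
    rise_hz = peak - low
    return drop_hz, drop_pct, rise_hz

for label, prev_accent, low, peak in stewie_rises:
    drop_hz, drop_pct, rise_hz = describe(prev_accent, low, peak)
    print(f"{label:<32} drop {drop_hz} Hz ({drop_pct:.1f}%), then rise {rise_hz} Hz")
```

In all four cases the final rise starts from a point well below the preceding accent, which is the opposite of what the term "high rise" suggests.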

Brian's uptalk is more ambiguous. The stressed syllable of "thinking" is indeed about 19 Hz. lower than the previous accented syllable "what" -- but the pre-stress dip on "was" seems to represent a genuine low-pitched target (rather than simply the pitch-depressing effect of the obstruent), and "thinking" starts up fairly rapidly. So you might believe that this is a type of accent whose low point is aligned before the "beat" of the accent, rather than on or after it -- and that's one of the patterns that might plausibly be described as a "high rise", since the strong syllable of the accent is at a mid pitch value rather than at a local minimum.

And Brian's attitude towards Jillian is more ambiguous as well -- the episode's press release says that he "can't close the deal because she is so hot". But the funny thing is, though Jillian is certainly depicted as less than brilliant, she doesn't actually use uptalk very often. In the segment below, there are a couple of examples around 1:10, and a couple more around 6:00, but most of Jillian's talk is not uptalk at all. (And there's nothing like Stewie's accompanying head and face gestures.) Apparently even a cartoon stereotype of a female airhead is not as intonationally stereotyped as the other cartoon characters' stereotyped image of her is.

November 13, 2006

Plain English creeps into police radio transmissions

An Arlington Virginia policeman uses his high-tech radio to call for
help, shouting "ten thirteen," meaning "officer down." In nearby
Bethesda Maryland, other officers ignore his message because to them it
means "request wrecker." Hmm. This could be a problem, to say the least.

The Washington
Post reports that the Virginia State Police have had enough of such
confusion and are instituting a radically new policy that calls for
abandoning "10 codes" used in daily transmissions in favor of -- you
guessed it -- plain English.

There is a history behind the language planning
currently used on police radios. It started back in the 1920s, when
police had only one channel to work with. But over the years, in a
Tower of Babel fashion, separate police departments began to develop
their own meanings for their "10 code" numbers. In the densely
populated area around the District of Columbia the separate law
enforcement agencies of the states and counties, along with the
Pentagon, ATF and FBI, gradually created their own "10 code" meanings.
Fine and dandy -- except when they communicate with each other across
jurisdictions, which turns out to be very frequently.

Sociolinguists describe three types of language planning: corpus
planning, status planning, and acquisition planning. Corpus planning is
what the Virginia State Police seem to be trying to carry out here.
This involves creating new forms of language, modifying old ones, or
selecting from among alternative existing forms. This is not the same
as status planning, which involves such things as deciding on an
official or national language, thereby assigning status to that choice.
Many Language Log posts (here),
(here),
(here),
and (here),
for example, have dealt with America's ongoing efforts to make English
the official language, an excursion into status planning. It doesn't
seem likely that the Virginia State Police are aiming at status here.
As for acquisition planning, it remains to be seen whether this
effort in language planning will succeed in teaching the new
forms effectively or create an incentive for officers to learn how to
use plain English instead of their more familiar "10 codes."

Language planning changes aren't easy to accomplish. On May 19, 2006,
the US Senate, influenced by a group called U.S. English,
voted to
make English the "national" (interestingly, not "official") language of
the country. But 27 states have elevated "national language" to their
state's "official language." This is a clear example of status planning
(proclaiming English to have more status than any other language) but
acquisition planning may prove to be a bit more difficult.
Already there is resistance to this venture, as the Post reports. Some
cops say that they're more comfortable with the old "10 code" system
and they think this in-group jargon is nifty because it marks their
status as police. They reason that if doctors and lawyers can have
their language codes, why can't police? Other officers express some
difficulty in even remembering what the plain English is for their
codes. Still others worry that their transmissions will now become
understandable to the general public (as if they weren't already
available on the internet).

We'll have to see what happens on this one.

Update: Grant Barrett writes that one side-aspect of the dropping of "10 codes" is that the trunk radio systems now so prevalent in policing make the masking of police intent and action less necessary, since it's more difficult to monitor trunking than it is in the old analog systems.

So maybe this Virginia State Police language planning is just a practical matter after all.

7 - 38 - 55!

I'm not calling a football play; those are the famous Mehrabian
numbers, giving -- in the usual citations of this research -- the
percentages that verbal content, paralinguistic features (vocal
quality, prosody, etc.), and kinesic features ("body language", broadly
construed) contribute, respectively, to the total impact of a
message. When I last
mentioned this research, I noted that the great avalanche of
bizlore -- the lore of corporate trainers, motivational speakers,
advertising advisers, and the like -- using the Mehrabian numbers went
drastically far beyond Mehrabian's own claims, which were that these
figures applied only to the communication of attitudes and
emotions. As it turns out, the actual results of Mehrabian's 1967
studies are much more modest than even this, as Ed Keer noted in
his blog back in February (building on a
longer discussion by Richard Sproat on Linguist List in 2001).

Check out Keer and Sproat for details. The fact is that the 1967
studies weren't about the communication of attitudes and emotions in
general, but about the communication of one specific set of attitudes
and emotions, liking and disliking. Ok, you say, I had no earthly
idea how anyone could study the relative contributions of verbal
content, paralinguistics, and kinesics to the total impact of a message
(whatever that means), but I still don't see how this much more modest
question could be investigated experimentally: what do you measure, and
how?

Good question. What Mehrabian did was pit features (expressing
liking/disliking) in the three modes against one another to see which
mode prevailed, and how often, when they were in conflict. Even
if we accept his results at face value -- and there are many details of
the experimental design and the interpretation of the data that a
reasonable person could fret about -- all that Mehrabian did in 1967
was, in Keer's words, to discover sarcasm, in this case the conveying
by extralinguistic devices of a meaning opposite to the plain meaning
of the words.

That's stage one. In stage two, these results morph into a global
generalization about language use, which then spreads into all sorts of
places outside the academic world. The details of this
transformation and diffusion would be worth looking at. (With
luck, Mehrabian himself has relevant materials from the 60's and
70's.) No, no, don't look to me to do this research; I'm the guy
with over a hundred postings in his queue for Language Log, and I'm not
a cultural historian.

In any case, I'd imagine that science writers for the general press,
and their editors, had a hand in the spread of the Mehrabian numbers to
a wider world. (I mention editors, because many a science writer
has had an original text altered, in small or large ways, to make it
conform to the beliefs of editors -- whatever the content of the
original. And then, famously, headlines are often attached that
seriously distort that content.)

That's stage two. In stage three, the Mehrabian numbers become
part of bizlore, indeed part of a larger set of folk beliefs.
Most people are no longer aware of the source of the numbers, and most
people who cite Mehrabian haven't looked at the original studies or any
careful summary of them; it's "common knowledge" now.

Bizlore is just one part of an enormous enterprise of popular advice
literature -- on education, child-rearing, exercise, diet,
relationships, gardening, and more, including grammar, usage, and
style. Bizlore focuses on persuasion, power, and the fostering of
positive emotions, with the aim of helping people achieve success in
business dealings of all sorts.

All sorts of popular advice literature, not just bizlore, appeal to
"common sense" and folk beliefs; rely heavily on personal opinions and
impressions (of the advisers and their audiences); and get points
across largely via particular examples, often by telling exemplary
stories of personal experience (we all love stories). Notice that
this is not at all the way scientific inquiry proceeds -- but it IS
the way ordinary people reason about their world and their lives.
"Science" appears in popular advice literature mostly for its value as
dressing: there are numbers, real numbers; and actual researchers or
institutions, of some prominence (or apparent prominence), can be
appealed to, however spuriously. "Science" is just one more
element in the rhetoric of popular advice literature, rarely an actual
contributor to it. (There are some honorable exceptions, of
course.)

In any case, you can see why bizlore loves the Mehrabian numbers.
They're wonderfully impressive. So exact, and from a real
scientist!

The Mehrabian numbers also plug into a powerful folk belief about how
human interaction works -- that we are "communicating" (passing back
and forth) "messages" to one another. Ordinary people (and some
social scientists) conceptualize interaction in terms of the "conduit
metaphor" discussed in several places by Michael J. Reddy (most
recently, I think, in the 2nd edition (1979) of Metaphor and Thought, edited by
Andrew Ortony) and made famous by George Lakoff in many of his
writings. Now, everyone, including social scientists as a group,
recognizes that paralinguistics and kinesics contribute a lot to the
texture of interaction, so it's natural for ordinary people to think
that linguistic expressions, paralinguistic features, and kinesic
features are just three different modes of communicating the same
messages, and it then makes sense to ask what their relative
contributions are.

Two problems, one of substance, one of method.

The first is that there's no reason to think that the three modes are
ways of conveying the SAME "meanings", or even that CONVEYING
meanings is what's going on. I would maintain, with many others,
that there are many different kinds of "meaning" at issue here, and
that it would be more accurate to say that the features of behavior in
question (depending on the occasion) express, reflect, perform, or
construct these meanings than to say that they simply convey them.

The second problem is that the question being posed -- what are the
relative contributions of (strictly) linguistic content,
paralinguistics, and kinesics? -- is, as I suggested above, one of
those impossibly over-global questions that almost surely can't be
answered. The methodological difficulty here is that what happens
in each of the three modes is exquisitely context-dependent. I
can't see any way to sample behavior, from the whole world of human
interactions, while controlling for these differences in context;
without such controls, what we see might well just follow from
differences in the frequencies of the various contexts, rather than
from some intrinsic difference between the modes. (There's also
the problem of individuating contexts. Where do we get an
inventory of the relevant types of context, even in one culture?)

What I'm saying here is that there are some questions about language
and behavior that are easy to formulate but so global that they are
probably unanswerable in principle. (At least some of the
questions about differences between the sexes, such as Louann
Brizendine's claim that women use many more words per day than men --
now discussed here in a long series of postings by Mark Liberman -- are
almost surely unanswerably over-global. I hope to post on that
eventually.)

A semi-final remark: linguists will probably be struck by what counts
as (strictly) linguistic (vs. paralinguistic or kinesic) in Mehrabian's
research and everything that cites it or alludes to it: apparently,
only aspects of utterances that contribute to literal meaning. To
a linguist, this is a desperately impoverished view of language use,
excluding most of the subject matter of entire subfields of
linguistics. Almost all variation in the linguistic system is
ignored, thus neglecting the ways in which, through their use of
particular linguistic variants, people express, reflect, perform, and
construct social group affiliations and personas; the ways they express
or reflect attitudes and opinions towards their audiences (including
liking/disliking!), about the nature of the interaction, etc.; and the
ways in which they use linguistic choices to structure their discourses
(via discourse particles, for example). These are "social
meanings" and "discourse meanings", if you want to put everything under
the umbrella of "meaning".

Also missing is everything to do with non-literal meaning: for
instance, implicatures of all sorts, fresh figures (especially
metaphors and metonyms), and other rhetorical devices. Emotions and
attitudes can be expressed or revealed through all these means, too.

On a more constructive note, I can remind you that linguists,
psychologists, sociologists, and anthropologists have long concerned
themselves with the ways in which linguistic content, choices of
variants, discourse organization, paralinguistic features, and kinesic
features are coordinated with one another and can combine into suites
of behaviors associated with "meanings" of all sorts. For a
beautiful recent example of research along (some of) these lines I
recommend Rob Podesva's Stanford Ph.D. dissertation, Phonetic Detail in Sociolinguistic
Variation: Its Linguistic Significance and Role in the Construction of
Social Meaning, completed this summer (it will be available
eventually, in chapter-sized chunks, on his website). Podesva
looks at the way three speakers' uses of one segmental variable
(realization of word-final coronal stops) and two paralinguistic
variables (prosody and voice quality, in particular falsetto) are
associated with different personas in different contexts. These
associations are very much local, in that they are tied to particular
social groups and to particular contexts, as well as to individual
speakers (the three speakers -- all friends -- use the variables in
different ways).

Punctuational linguificatory hyperbolicity

Charles Belov points out to me a punctuational linguification that does fall under the heading of
hyperbole. It's in a piece of writing about dance by Eva Yaa Asantewaa:

Some enormously gifted people contributed to Francesca Harper's Modo
Fusion Lounge showcase up at Makor/Steinhardt Centers intimate café space on West 67th Street. For starters, there was the stunning Harper herself — the kind of artist and performer whose pile-up of talents quickly exhausts a keyboard's hyphen or comma keys.

Did you instantly parse that connection between talent pile-up and
key exhaustion?

The idea seems to be that the comma and hyphen keys on your keyboard
will get worn out when you try to write about Harper's talents. Asantewaa
says Harper is "‘a conceptual pop artist,’ film director,
lyricist, dancer, singer, and actor currently understudying two roles in
The Color Purple", and the Modo Fusion Lounge features "music, ...
dancing ..., film, poetry, humor, and a whole lot of fun." Ten commas
there. My comma key survived the pounding. I suspect Asantewaa's did
too, refuting her literal claim. She exaggerates. But at least this
linguification is intelligible once you see that: if Ms Harper did
billions of different things (the actual array of projects can be seen
at www.francescaharper.com; it doesn't really run into the billions),
and you tried to list them all in a multiple coordination, then at an
average word length of about 6 characters (roughly the right figure for
English), for every n keystrokes on letter keys you would need about
n/6 comma keystrokes, which might cause the comma key to wear out
before the letter keys. It's a fairly silly piece of over-writing
(perhaps not the first in enthusiastic arts reviewing), but at least it
does make perfect sense as an exaggeration. As I have previously
pointed out, many linguifications don't.
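
The back-of-envelope reasoning can be made concrete. The sketch below is in Python and rests on assumptions that are mine, not Asantewaa's: each listed talent is a single word of about 6 letters, each item but the last is followed by one comma (so roughly one comma per six letter keystrokes), and the letter "e" accounts for roughly 12.7% of English letters, a commonly cited figure used here only for illustration. The function and parameter names are invented for the example.

```python
def keystrokes_per_key(n_items, avg_word_len=6, e_share=0.127):
    """Estimate keystrokes on the comma key vs. the busiest letter key
    for a comma-separated list of n_items one-word items.

    Assumptions (illustrative, not measured): each item is one word of
    avg_word_len letters, each item but the last is followed by a comma,
    and the letter 'e' makes up e_share of all letters typed.
    """
    letter_strokes = n_items * avg_word_len
    comma_strokes = n_items - 1
    busiest_letter_strokes = letter_strokes * e_share
    return letter_strokes, comma_strokes, busiest_letter_strokes

# "Billions of different things": try one billion list items.
letters, commas, e_key = keystrokes_per_key(1_000_000_000)
print(f"letter keystrokes (all keys): {letters:,}")
print(f"comma keystrokes:             {commas:,}")
print(f"'e' keystrokes (approx.):     {e_key:,.0f}")
print("comma key is the busiest:", commas > e_key)
```

On these assumptions the comma key takes more keystrokes than any single letter key, even though the letter keys taken together take six times as many.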

November 12, 2006

prancing about with jack mcconnells pants on your head does not a news story make

... but the Tensor's Snowclonatron will find it for you anyway. Over at Tenser, said the Tensor, the Tensor has just released a Perl script that tabulates instances of snowclones. So I downloaded his program snowclone.pl, and executed

snowclone.pl 'X does not a Y make' >DoesNotA

and hey presto (well, after a couple of minutes), I had a list of 640 variant instantiations, sorted according to Google count. In this case, the top ten (with their counts) are

one game does not a season make 916
one election does not a democracy make 482
prancing about with jack mcconnells pants on your head does not a news story make 404
a bottle of water does not a rider make 402
benchmarks does not a majority make 195
one month does not a trend make 158
an os does not a platform make 143
one year does not a trend make 141
one win does not a season make 131
one day does not a trend make 85

The business about Jack McConnell's pants is a self-refuting line from a much-copied recent news story: Murdo MacLeod and Eddie Barnes, "McConnell under pressure over Bute House video fiasco", The Scotsman, 11/5/2006. Well, actually, it's from a reader's comment on the news story, so it's not so much self-refuting as reader-refuted.

The Tensor's algorithm for finding the boundaries of the phrasal template in particular cases is simple but generally effective. There's just one mistake in the top ten list: the phrase "benchmarks does not a majority make" omits an initial "several" in quotation marks in the source.

In this list, the Xs such that X "does not a trend make", in frequency-sorted order, are one month,
one year, one day, one data point, one quarter, one week, one number, two months, one point, three weeks, one person, one season, one example, five people, one quarter, four days, an exception, one movie and one failure.

The Xs such that X "does not a democracy make" are one election, back room politics, voting, one poll, voting alone, a single election no matter how successful, majority rule, holding elections alone, the right to vote alone, and white boys in power.

The Xs such that X "does not a season make" are one game, one win, one bad loss, and April.
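
A tabulation like the ones above is easy to reproduce once the instances are in hand. The following Python sketch is not the Tensor's Perl script and makes no claim about how it finds phrase boundaries in running text. It simply splits already-extracted instances of 'X does not a Y make' with a regular expression (a hypothetical pattern written for this one template) and groups the Xs by Y, using the top-ten list and counts quoted above as input.

```python
import re
from collections import defaultdict

# Top-ten instances and Google counts, copied from the list above.
instances = [
    ("one game does not a season make", 916),
    ("one election does not a democracy make", 482),
    ("prancing about with jack mcconnells pants on your head "
     "does not a news story make", 404),
    ("a bottle of water does not a rider make", 402),
    ("benchmarks does not a majority make", 195),
    ("one month does not a trend make", 158),
    ("an os does not a platform make", 143),
    ("one year does not a trend make", 141),
    ("one win does not a season make", 131),
    ("one day does not a trend make", 85),
]

# Hypothetical pattern for this one template: X and Y are whatever
# surrounds the fixed words "does not a ... make".
TEMPLATE = re.compile(r"^(?P<x>.+?) does not a (?P<y>.+) make$")

def group_by_y(rows):
    """Map each Y to its (X, count) pairs, most frequent X first."""
    groups = defaultdict(list)
    for phrase, count in rows:
        match = TEMPLATE.match(phrase)
        if match:  # skip anything that does not fit the template
            groups[match.group("y")].append((match.group("x"), count))
    for pairs in groups.values():
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return dict(groups)

groups = group_by_y(instances)
for y, pairs in groups.items():
    print(f"{y}: " + ", ".join(f"{x} ({n})" for x, n in pairs))
```

Because the input phrases are already delimited, the sketch sidesteps the hard part, which is deciding where X begins in the source text; that is exactly where the missing "several" in the "benchmarks" example went astray.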

The line "... and let me be the first to welcome our new Centaurian
overlords" appeared in a three-frame comic strip in Utne Reader some time
between 1990 and 1994 (when my parents got a subscription to Utne and when
I left Australia and stopped reading it). The scene is the same as the
Simpsons episode: a TV announcer covering an alien invasion switching
sides mid-sentence. It's highly unlikely that one was not inspired by the
other. Unfortunately I don't have the magazines to hand, so can't be as
precise as I was over "butt-crack of dawn".

If you can track down this reference, please send me the citation -- and if possible a scan of the cartoon.

Alarming decline in literacy among publicists and journalists

In a couple of earlier posts, I commented on the fuss in Britain and elsewhere created by the revelation that some exam-grading authorities give partial credit for correct answers that are wrongly spelled. This post gives some background, including a psycholinguistic backstory that also brings out some interesting things about the ecology of science journalism.

Now, the fact seems to be that no exam-grading policies have actually changed. To the extent that there was any news here at all, it was just that some new kinds of misspellings, following the new conventions of text messaging, are now found in what a spokesman for the Scottish Qualifications Authority called a "very small" percentage of exam papers. These new types of misspelling are treated just as the old ones were: "pupils would still be given [partial] credit if expressing a valid idea".

Those marking exams are no longer presented with neat, comprehensible scripts, but with pages and pages of C U l8r, heavily illustrated with emoticons, those smiley or gloomy faces so beloved of teenagers, who probably have no idea that emoticons were originally made up of punctuation marks. In Scotland today, children presenting such scripts go unpenalised.

Thank you, Scotland. First John Knox, then the Enlightenment and now the Scottish Qualifications Authority. In a direct challenge to the English at their most reactionary, the authority has declared that it will accept text-messaging short forms in school examinations. The dark riders of archaism will protest and the backwoods will howl. No spell is cast as dire as spellcheck. But the champions of reason are massing north of the border and need our support.

(By the way, is it only in the Anglosphere that discussions of orthography so readily tap a vein of apocalyptic imagery? Is this connected to the phenomenon of word rage, discussed here, here and here?)

Exam chiefs in Scotland were branded "ridiculous" today after admitting that answers written in text message language will be acceptable in English tests as long as they are correct.

The Scottish Qualifications Authority (SQA) said the use of phrases like "2b r nt 2b" or "i luv u" in exam papers would be allowed as long as candidates showed that they understood the subject.

The admission follows research from Coventry University, released in September, which suggested that sending text messages - from the slang "wot" and "wanna", to the short cut "CU L8R"- may actually be improving, not damaging, young children's spelling skills.

But it turns out that the research was not exactly "released in September", in the sense that in September there was a paper to read, or even a preprint. The news reports came about because of a press release.

On Friday, September 8th, at the annual conference of the British Psychological Society's Developmental Section, held at the University of London, there was a poster-format presentation with the title "Cognitive Factors in Text Messaging and Literacy Links". The authors were Beverly Plester and Clare Wood, of Coventry University.

Unfortunately, the BPS does not put papers or even abstracts on line for its many conferences ("around 100 conferences and events each year"), but it has an active publicity department, who distribute press releases for particularly juicy items from these events, and the flacks chose the Plester and Wood poster as one of 16 things to tell the world's press about in the month of September. I believe that this was the only item that they chose from the program of 102 posters, 108 individual presentations, 16 symposium presentations and 4 keynote presentations at the Developmental Section's September meeting.

For obvious reasons, PR departments and scientific program committees have different criteria for ranking research. In this case, the program committee put the Plester/Wood paper among the poster presentations, which is the lowest rank of acceptance at such a conference. The BPS's PR department chose it as the only one of the roughly 230 presentations at the conference to tell the world about. I'm not suggesting that either the program committee or the PR department made a mistake, nor that one set of criteria is intrinsically better than the other. I'm just observing that the program committee and the PR department clearly value different things.

For a sense of what the BPS PR department values, we can list some of its other titles from September: "Identity key to race relations"; "Do terrorist threats increase Islamophobia in Britain?"; "Sacked Rover workers can only find harmful 'bad' jobs"; "Larger mobs carry out more violence"; "People don't deal directly with threats to their way of life"; "Is sex at work the kiss of death for your career?"; "Young children think TV is real"; "Mobile phones: addictive, causes of stress"; "Exercise beats nicotine cravings"; "Young people reveal role of alcohol in their lives"; "Keep fitness fun to lose weight"; "Health risks of smoking ignored by women"; "Texts strengthen exercise plans"; "Hand tied and tongue tied"; etc. In other words, the usual things: sex, violence, race, fitness, smoking, drinking, and so on. Drugs, global warming and celebrities were left out due to sampling error, I guess.

I'm sure that the BPS PR department is doing its job well, in the sense that its operatives understand what the press is looking for, and act as an effective filter in picking out the items that will sell. Let's note in passing, though, that as a result, some fascinating-looking stuff from that same BPS Developmental conference went completely unreported, even in the world's most intellectual media. For example, there was an invited symposium with the title "State of the Art in Theory of Mind: old problems, new data", with Josef Perner, Paul Harris and Michael Tomasello; and another one with the title "Workshop on methods for analysing children's interaction", convened by Margaret Harris.

A second problem with this system is that the PR operatives who write the press releases, focused as they are on getting the attention of reporters and editors, aren't always very careful to present the facts clearly. The BPS press release for the Plester and Wood poster came out under the title "Do U no wot Im Sayin?". Here it is -- do you know what it's saying?

Contrary to popular assumptions, the use of text messaging abbreviations is linked positively with literacy attainment, a study conducted with eleven-year old children has found.

Mrs Beverly Plester and Dr Clare Wood of Coventry University presented their research on Friday 8 September 2006, at the British Psychological Society’s Developmental Section Annual Conference being held at the Royal Holloway, University of London.

The study was designed to explore how the use of text abbreviations might be related to the skills children need in reading and writing, in response to concern from parents and teachers about whether texting might damage children’s ability to use standard English. The children were quizzed about their use of mobile phones and asked to translate messages between standard English and text language, as well as complete tasks to reveal their English writing, reading and spelling abilities.

It was found that children use their mobile phones more for sending text messages than for talking, the majority of which are sent to friends. Most text abbreviations were phonetically based, such as ‘wot’ for ‘what’ and rebus types, such as ‘C U L8r’. Many also used what the researchers describe as ‘youth code’, casual language such as ‘dat fing’, ‘gonna’ or ‘wanna’. Surprisingly, the children who were better at spelling and writing used the most ‘textisms’.

Mrs Plester said; "So far, our research has suggested that there is no evidence to link text messaging among children to a poorer ability in standard English and those children who were the best at using ‘textisms’ were also found to be the better spellers and writers."

"Texting could be used positively to increase phonetic awareness in less able children, and perhaps increase their language skills, in a fun yet educational way."

A couple of initial problems:

1. When I google the authors, I find that Coventry University lists Dr. Beverly Plester as "Senior Lecturer in Psychology", with a Ph.D. in psychology from Sheffield University -- exactly the same job title and level of academic qualifications as Dr. Clare Wood. So why does the press release describe the authors as "Mrs Beverly Plester and Dr Clare Wood"? Simple carelessness?

2. Were the children asked to compose and send text messages? The description of the study doesn't say so -- we're told that they "were quizzed about their use of mobile phones and asked to translate messages between standard English and text language, as well as complete tasks to reveal their English writing, reading and spelling abilities". But then how could they tell that "the children who were better at spelling and writing used the most ‘textisms’"? Is this just a misleading way of restating Dr. Plester's assertion that "those children who were the best at using ‘textisms’ were also found to be the better spellers and writers"? I suspect so, though I can't tell from this description.

A web search didn't turn up any published version of this study, nor any preprints, but it did turn up the abstract for a talk given two months earlier, at the thirteenth annual meeting of the Society for the Scientific Study of Reading, in Vancouver, July 6-8:

Beverly Plester (Coventry University); Bell, Victoria; Wood, Clare - Exploring the Relationship between Text Messaging and Literacy Attainment
A pilot study revealed that although high levels of texting on mobile phones was linked to lower levels of literacy attainment in a sample of 12 year old children, their use of text abbreviations when messaging was positively associated with their literacy attainment at school. Ongoing research is attempting to understand the nature of the positive association between textism use and literacy attainment. In particular, the question of whether phonological awareness may be implicated in the apparent ability to use text abbreviations will be considered.

This is probably a report of the same research -- it seems unlikely that an additional study on the same topic could have been completed between July 8 and September 8. However, there are some worrying differences. The biggest one is that this abstract says that "high levels of texting on mobile phones was linked to lower levels of literacy attainment". I interpret this to mean that kids who reported that they did more texting when "quizzed about their use of mobile phones" scored lower on the tests given "to reveal their English writing, reading and spelling abilities" (using the phrases from the 9/8/2006 press release). But this correlation was not mentioned in the BPS press release, nor in the popular-press articles based on it.

Instead, what was featured was Dr. Plester's observation that "those children who were the best at using ‘textisms’ were also found to be the better spellers and writers". I interpret this to mean that kids who performed better when "asked to translate messages between standard English and text language" also scored higher on the tests given "to reveal their English writing, reading and spelling abilities".

Now, there's no contradiction between these two results. It could be that kids who do more texting (or at least report doing more of it) score lower on tests of spelling and writing; and at the same time, kids who are more skillful at translating between texting and standard orthography also score higher on tests of spelling and writing. (In fact, it would be hard to measure skill at translating between texting and standard orthography in a way that did not automatically guarantee that kids who score higher are also better at using standard orthography...) Then again, it could also be true that the researchers did collect samples of the kids' texting, and found that kids who used more abbreviations when texting were also better spellers in tests of standard orthography.

The July 8 abstract says that the children in the study were 12 years old. The 9/8/2006 press release doesn't mention the subjects' ages, but the 9/8/2006 BBC story, whose author interviewed Dr. Plester, talks about "[a] Coventry University study of 35 11-year-olds". So maybe there were two studies? Unfortunately, zero studies have been published, as far as I can tell, or even described clearly in an informal document. So all we can say, as usual, is that without knowing what the researchers actually did, it's hard to tell what the results actually mean.

My own image of a more perfect society is agnostic about the level of spelling skills. Instead of dreaming about a world of perfect spellers, I like to imagine a world where stories about scientific research provide (or link to) a clear and simple account of the researchers' methods and results, and where reporters and editors have the skills to make this happen. Since this fantasy of mine is pretty implausible, I admit, here's a goal that we might actually reach: how about a world where all organizers of conferences in science and engineering routinely require, and publish on the web, the sorts of four-page "extended abstracts" that many conferences already require? This would improve refereeing as well as communication within our disciplines. It would also make it possible for the interested public to go to the source, by-passing the (usually misleading) presentation in the mass media.

Despite this post's jokey title, I don't think that scientific literacy -- by which I really just mean common sense and clear thinking -- is any lower among publicists and journalists than in earlier times. But it's pretty low, and I think we'd all be better off if it were higher.

November 11, 2006

Unblogged snowclones

On returning to the world of snowclones with my discussion of The
New Y, I was dismayed to see how many figures or formulas had piled
up in my files of unblogged snowclones; the first came in in
2000! Here's the inventory, with my sources, and very minimal
commentary.

Note: some of these are without question snowclones, but others might
be patterns of playful allusions, idioms, playful morphology, or
clichés; several of them deserve a discussion of some length,
which I'm not now able to provide. Nor do I have the time now to
trace the histories beyond what I say below. This is the best I
can do at the moment.

1. "Now if you will excuse me I have a X to Y", e.g. "... I have
a plane to catch" (Aaron Dinkin, mail of 9/24/05)

2. "I'm from X and I'm here to help (you)", e.g. Ronald Reagan's
mockery of "I'm from the government and I'm here to help (you)" (Ben
Zimmer on ADS-L, 7/13/05, citing a query of the March before from Geoff
Nunberg)

3. "not the Xest Y in the Z", e.g. "not the sharpest tack in the
box" (me to Language Loggers, 8/30/06, with a response from Ben Zimmer
pointing to on-line lists of "what to call dumb people", for instance this one)

4. "Don't X me because I'm Y", e.g. "Don't hate me because I'm
beautiful", from an 80's shampoo commercial that is possibly the source
for the snowcloning (mail from Tim Shock, 10/13/05)

6. "Hardly/Not a X goes by without Y", e.g. "Hardly a week goes
by without a Nunberg citing in the New York Times" (mail from Benita
Bendon Campbell, 10/13/05; possibly an idiom?)

7. "We don't need no stinking/stinkin'/steenkin' Xs", e.g.
"We don't need no steenkin' snowclones" (mail from Chad Sanders,
10/17/06; discussion on the Subjunctivitis
blog, and a whole web site
devoted to the figure and its history)

8. "If that's X, every Y should be so lucky", e.g. "If
that's being discriminated against, we all should be so lucky" (mail
from Marilyn Martin, 10/23/05)

9. "Yes, Virginia, [mildly improbable statement is true]",
e.g., "Yes, Virginia, the moon isn't made of green cheese" (mail from
Vishy Venugopalan, 10/26/05; the source of this one -- an 1897
editorial in the New York Sun
-- is well known)

10. "X does not a Y make" and assorted variants, e.g. "One
chapter does not a dissertation make" (mail from Brendan McGuigan,
11/2/05; this one turns out to go all the way back to Aristotle, on
swallows and summers)

14. "There's a lot we don't know about X", e.g. "There's a lot we
don't know about the unconscious" (Lee Rudolph on soc.motss, 1/22/04,
citing my use of the figure and suggesting that the original had
"mirrors")

15. "As a X, N is a great Y", e.g. "As a baseball player, he's a
great linebacker" (Mark Mandel on ADS-L, 8/22/00)

16. "busier than a X [someplace]", e.g. "busier than a one-armed
man in an ass-kicking contest" (Barry Popik on ADS-L, 5/1/04, citing
some "busier than a cranberry merchant" examples going back to the 19th
century)

17. "That's not an X; this
is an X", e.g., "That's not a screw-up; this is a screw-up" (Jason
Parker-Burlingham in conversation, 12/15/05)

18. "N is the M of X", e.g. "Eric Raymond is the Margaret Mead of
the Open Source movement" (e-mail from John Cowan, 9/13/05, and previously blogged here)

19. "There's no rest for the X" and variants, e.g. "There's no
rest for the Clinton-obsessed" (me on ADS-L, 5/21/06; this goes back to
Isaiah and "no rest for the wicked", with later variants with "peace"
for "rest" and/or "weary" for "wicked")

Fully awesome!

Today's Zits cartoon takes on
the march of intensifiers (beyond GenX so, beyond intensifier all, beyond totally), and also works in an
instance of the "X is the new Y" snowclone (last discussed in the halls
of Language Log Plaza here):

Since this last posting about The New Y (as I'm now labeling this
snowclone), citing Chocolate is the
new black, I've been collecting instances in the wild, with the
following finds:

Blue is the new red. (a variety of
meanings, some of them opaque to me)

Gray is the New Blonde. (hair color for women)

Nudity was clearly the new black. (model Kate Moss naked at a photo
shoot)

How long will... an article about how taupe is the new black. (fashion
colors)

Folk is the New Black (Janis
Ian album released earlier this year)

After all, 60 is the new 50. (porn actor Peter Berlin at 60)

And as the proper accessory for the well-dressed man of a certain age,
a bulging crotch is the new bifocals. (ditto)

... rugby is the new polo. (shirts)

Pink: the New Black. (anal bleaching -- would I make something like
that up?)

Small is the new big. (economic developments in the energy world)

I hope you're eating organic!
Because organic is th' new
"Fifty" and th' new black.
(Zippy on food)

Fat is the new black. (designer Isaac Mizrahi on men's fashions)

Forty is the new 30 (price per dish at some upscale restaurants)

Sicily is the new Tuscany.
(vacation destinations)

Here's to 50! ... The new 40! (women's ages)

College is the new high school. (preparation for careers)

Chefs are the new rock 'n' roll stars, cookbooks are the new
pornography. (food and sex)

How long can this go on, before the attractions of The New Y wane and
it crashes, the way Color Me ("Color me surprised" 'I am surprised')
eventually did? Or will it live on as a durable but no longer
ubiquitously fashionable formula, the way One Man's X ("One man's
terrorist is another man's freedom fighter") seems to have done?
Is The New Y going to be the new Color Me or the new One Man's X?

[Addenda: Ben Zimmer supplies a link to a site with a pile of The New Y examples and a pointer to the Wikipedia page, where the figure is taken back to Gloria Vanderbilt asserting, in the 60's, that "Pink is the new black." And Jim Lewis pulls up around 46,000 hits for, omigod, "Black is the new black". Shannon Casey notes the popular gossip blog Pink is the New Blog. And Martyn Cornell tells me that the British satirical magazine Private Eye has been running a column called "The Neophiliacs" for several years that reprints "ever-more ridiculous examples" of The New Y, without, apparently, having any effect on its popularity.]

The Historian

Having mentioned Elizabeth Kostova's novel The Historian (as I
just
did), let me add that as a true aficionado of Bram Stoker's wonderful
1897 epistolary novel Dracula, with which Kostova's is intimately
interwoven, I would not be one to think well of any cheap ripoff of it;
but Kostova has crafted a serious, well-written, and ingenious homage to
Stoker's imaginary world and the historical truth about Vlad Drakul that
partly inspired it.

In addition to being talented, Kostova is, as her
picture
shows, very beautiful. If there is any Language Log reader who happens
to know how to get in touch with her, I would be grateful if they would
kindly convey to her the following private message.

Elizabeth (if I may): I wish to take you to dinner at some time and
place in the future that is mutually convenient. On the eve of St
George's Day, perhaps. I thought we might have paprika hendl,
as did Jonathan Harker when he broke his journey at the Hotel Royale
in Klausenburgh (Cluj as it is now called). We might also enjoy some
impletata (he must mean patlagele impulute, the stuffed
eggplant dish he spoke of as having been offered to him at breakfast;
it should do well as part of the dinner). Some cheese, and a salad,
and a bottle of old Tokay, the wine Dracula himself served for Harker
with his first dinner at the castle. Then, after dinner, a taste
of slivovitz, the plum brandy offered to Harker by Dracula's
mysterious carriage driver on that cold May 4th night when he was
driven to Castle Dracula.

At the end of the evening I would like, if it is acceptable, to bite
you very gently on the neck. (You will find that when a grammarian kisses
you, you stay kissed.)

Perhaps you might even do me the honor of visiting my own humble home.

Welcome to my house! Enter freely and of your own will! Come freely.
Go safely; and leave something of the happiness you bring!

Grateful acknowledgments to Leonard Wolf, whose beautiful
book The Annotated Dracula (New York: Clarkson N. Potter, 1975)
is indispensable for an appreciation of Stoker's world. Spellings of place names in Transylvania used above are those found in Stoker's novel,
not the modern Hungarian or Romanian spellings.

November 10, 2006

More Field Linguistics Books

Here are some additions to Arnold's and Mark's suggested readings on linguistic fieldwork.

Bob Dixon's 1983 Searching for Aboriginal Languages: Memoirs of a Field Worker
(New York: University of Queensland Press. Reprinted 1989 by the University of Chicago Press.) is a fascinating memoir of his work in Australia.

Also about Australia is Forty Years On: Ken Hale and Australian languages, a collection of essays about the work of Ken Hale and related topics, including a piece by Ken's widow Sally and one by Geoff O'Grady, another of the greats of that generation of Australianists.
Further information is available here.

Science Fiction/Fantasy writer Sheri Tepper has written two books that deal with linguistic fieldwork.
The more recent of the two, The Companions (New York: Harper Collins, 2003) is explicitly about linguistic fieldwork on an alien planet. Her earlier book After Long Silence (New York: Bantam Books, 1987), which is one of my favorites, is not explicitly about fieldwork but turns out to be. I can't reveal more without spoiling it. If you think your tastes are like mine, read it.

Some textbooks of field methods contain interesting anecdotes or philosophical discussions. You might enjoy Bert Vaux and Justin Cooper's Introduction to Linguistic Field Methods.
The first edition was published by LINCOM Europa. There's a new edition of whose status I am uncertain. Anvita Abbi's A Manual of Linguistic Field Work and Structures of Indian Languages (Munich: LINCOM EUROPA) is interesting for its focus on fieldwork in India.

Recursively nested quotation marks

The theory of quotation marks in printed Standard English says that
whether you use single quotation marks (‘ ’) or double
quotation marks (“ ”) as the default, if you have to enclose one
quotation within another you switch to the other kind, and so on
recursively, alternating quotation mark types: either
‘...“...‘...’...”...’, and so on, or
“...‘...“...”...’...”, and so on, to any depth of embedding of
quotes. But of course, even just a quoted string that is within a quoted
string that is itself within a quoted string is very rare. If you would
like to see one in the wild, I can tell you one place to look.

In Elizabeth Kostova's best-selling vampire novel The Historian
(Time Warner Books, 2005), starting four lines from the bottom on
page 365 of the paperback edition (8th reprint) that I purchased in
England this summer, you can see a sequence like this (I omit a long
portion just before “The epitaph”):

“‘“A Traveller” had visited the monastery in Snagov in 1605. He had
talked a good deal with the monks there [. . .]
The epitaph, which I copied
down with care — out of what instinct I didn't know — was in
Latin.’ Hugh dropped his voice, glanced
behind him, and stubbed out his cigarette in the ashtray on our table.

“‘After I'd written it down and struggled with it a while,
I read my translation aloud: “Reader, unbury him with
a —”You know how it
goes [. . .]

What is going on here is that the character named Hugh James is telling a long story (shown here in green)
which is embedded within a longer story being told by the father of the narrator
of the whole novel (shown in red). The father's words are signalled by opening double
quotation marks (in red), which, in a style familiar to those who
know 18th and 19th-century epistolary novels, are repeated at the beginning of
each paragraph but only closed once, at the end of the whole section in that
person's voice (so the closing red quotation marks are not shown above; they
occur on page 376, at the end of the chapter). Hugh James's words are shown
in single quotation marks (green), also not closed at the end
of each paragraph but only at the end of a complete section of direct quotation
in his voice (as, for example, just before “Hugh dropped his voice”; the second green left single quotation mark above is
actually not closed until a couple of paragraphs later, on the lower half of
page 366, after the words “My father looked very
upset”).
The first double quotation marks in blue are scare quotes; there is a character in
a manuscript identified only as “A Traveller”, and the novelist uses scare quotes in the written form of Hugh James's spoken
utterance to make it clear that, in the part shown here in blue, Hugh
James is not using an indefinite noun phrase in his own voice but rather
using a repetition of the manuscript's
way of identifying a certain definite individual. Later the green type
is interrupted by another blue section, in double quotation marks,
where Hugh quotes the epitaph.

Also embedded in the narrator's father's double-quoted sections of the
novel are letters from another character, the father's mentor Bartolomeo
Rossi, and these are in italics. Had they been shown in quotes instead,
those would have been single quotation marks.

This is really a novel of very complex narrative structure. One has to
keep one's eye on the ball, and one's recursion-depth counter on the level
of quotations in which one is currently embedded. This kind of complexity
will not be found very often in any kind of literature. But at least I
have been able to show you a paragraph that opens with the sequence
<Left Double Quotation Mark> <Left Single Quotation Mark>
<Left Double Quotation Mark>. And I could have put things
differently by saying that the paragraph begins with ‘“‘“’, giving you a four-quotation-mark sequence to
read (the outermost single quotation marks would be mine, to indicate that
I am quoting a string composed of the other multicolored ones). And if
someone else quoted me, they would need to add yet another set of
quotation marks (those should be double quotation marks, since I used
single). In principle, there is no limit.

Nerd note: I leave it as an exercise for those readers who are
acquainted with the methods
of formal language theory to turn what I have just explained
into a rigorous argument that the set of all possible properly punctuated
English texts cannot be accepted by any finite-state automaton.
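For readers who would rather see the bookkeeping than the proof, here is a minimal sketch of the "recursion-depth counter" in Python. It is an illustration under simplifying assumptions, not a full punctuation checker: the function name quote_depth is made up for this post, the text is assumed to contain no apostrophes (which share a character with the right single quotation mark), and the novel's habit of re-opening quotes at the start of each paragraph is ignored. The point is that the checker needs a stack that can grow without bound, which is exactly what a finite-state automaton lacks.

```python
# Map each opening curly quote to its matching closer.
OPEN_TO_CLOSE = {"\u2018": "\u2019", "\u201c": "\u201d"}
CLOSERS = set(OPEN_TO_CLOSE.values())


def quote_depth(text):
    """Return the deepest level of quote nesting in text, or None if the
    quotes are unbalanced or fail to alternate between single and double.

    Assumption: text contains no apostrophes, so every right single
    quotation mark really is a closing quote.
    """
    stack = []      # opening quotes not yet closed, outermost first
    deepest = 0
    for ch in text:
        if ch in OPEN_TO_CLOSE:
            if stack and stack[-1] == ch:
                return None          # same kind nested directly in itself
            stack.append(ch)
            deepest = max(deepest, len(stack))
        elif ch in CLOSERS:
            if not stack or OPEN_TO_CLOSE[stack[-1]] != ch:
                return None          # closer with no matching opener
            stack.pop()
    return deepest if not stack else None


# Three levels, alternating double / single / double.
print(quote_depth("\u201ca \u2018b \u201cc\u201d d\u2019 e\u201d"))
# Double directly inside double breaks the alternation rule.
print(quote_depth("\u201ca \u201cb\u201d c\u201d"))
```

The first call reports a depth of 3 and the second reports None. Any fixed bound on the size of the stack would reject some properly punctuated text that nests one level deeper.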

Partial credit for "pigeon English": not new in New Zealand

A few days ago, there was a small bubble of news reports to the effect that the New Zealand Qualifications Authority was planning to follow the example of the Scottish Qualifications Authority and allow free-form spelling and "texting" abbreviations on the NCEA examinations. But according to a story by Claire Trevett in today's New Zealand Herald, this was all a mistake -- yet another example of the danger of trusting what you read in the mass media, a creative but undisciplined arena that has yet to work out how to impose the checks and balances that we in the blogosphere take for granted.

Bali Haque, deputy chief executive of the authority, said there had been no change to guidelines and there was no specific policy about text language.

However, he warned: "If people are expecting they can come up with an exam script full of text and pass, then they're dreaming.

"Examiners will be expecting the use of the English language in full. I think students are intelligent enough to understand that. Most would know the difference between using formal language in an exam and informal with friends on the weekend."

(If this is really what Mr. Haque said, by the way, it's an interesting example of the word text apparently being used to mean "writing of the kind found in cell-phone text messages".)

The best part of this story, in my opinion, was the statement issued by Bill English, who is described in the story as "National education spokesman Bill English", which apparently means that he is the spokesman for educational matters of the National Party. Mr. English's statement read in part:

This kind of pigeon English is fine for young people organising their social lives, but it is not an acceptable way of expressing an academic argument or idea.

The Education minister, Steve Maharey, used this to explain what the story calls "NZQA's policy of forgiving minor mistakes that were understandable in an otherwise strong answer":

The statement is understandable, despite pidgin being spelt p-i-g-e-o-n, as in a bird from the dove family, rather than p-i-d-g-i-n, as in simplified language used between persons of a different nationality. But we will give him credit.

This is a sensible implementation of a sensible policy, familiar to anyone who has graded a set of college essays. I wonder if the Scottish Qualifications Authority case was a similar non-story, blown up by the Guardian and other British papers to fill a hole on a slow day. As we've often observed, the traditional media will never be able to fulfill their undoubted promise as an information source until they can find a way to impose some elementary standards of accuracy and accountability.

[Update -- Ben Zimmer points out that:

Through the late 19th century, "pigeon" was a common variant for
"pidgin" -- in fact, "pigeon (English)" predates "pidgin (English)".
An early citation for "pigeon English" from 1857 (which I contributed
to the OED's latest draft entry) can be found here:

On every side of you, Pigeon English - that horrible
jargon of multilated baby talk which custom has made
law - meets you. From boatwomen to shopmen - house boy
to compradore - you hear nothing else. I endeavored to
get a copy of Hamlet's soliloquy, which was translated
into Pigeon English, but I have failed to do it. I can
only remember its commencement.
"To be or not to be" reads: "Can - no can."

(p. 539)
The "pidgin English" which followed, was too much
for our untutored intellects to comprehend.

(p. 543)
We asked Ah Lum to translate one of the songs for us;
but the effort to put the words of one of his native
poets into "pidgin English" was too much.

Somehow I doubt that Mr. English was attempting to spell following pre-1869 norms. But this is one more reason
to be tolerant of spelling mistakes -- as someone who often makes such mistakes, I certainly have a personal interest in the availability of forgiveness. Though I think that we are still allowed to enjoy the display of self-refuting hypocrisy
on the part of the intolerant.]

Field linguists at work

Now that I've gotten around to posting
suggestions of books that show linguists at work, correspondents
are writing to fill in gaps in my list. My list had a couple of
items depicting field linguistics, but I missed several good books
about field work.

Curtis Booth writes to nominate Leanne Hinton's wonderful collection of
essays on California Indian languages, Flutes of Fire. To which
Peter Austin adds Paul Newman and Martha Ratliff's Linguistic Fieldwork, a collection
of essays by a number of accomplished field linguists about all aspects
of the field experience, and Mark Abley's Spoken Here: Travels Among Threatened
Languages, specifically on working with endangered languages,
including revitalizing them.

For sociolinguistic fieldwork, I know of nothing quite like these
volumes about traditional field linguistics. My current best
recommendation is Penny Eckert's 1989 volume Jocks and Burnouts, because it
treats both quantitative research and ethnographic description, and
because it's engagingly written.

The film that dare not speak its name

I went this evening to a Berkeley screening of a new documentary called Fuck. Or make that F*ck, or F**k, or ****, depending on which newspaper the film is being advertised in (when it opens at the Nuart in LA tomorrow, the marquee will read simply "FOUR-LETTER MOVIE"). It's a funny and fascinating farrago of four-letter fact and fable -- and you can quote me on that. Though in the interest of full disclosure I should add that I appear in the movie, along with Jesse Sheidlower of the OED and a supporting cast of talking heads that includes Bill Maher, Drew Carey, Sam Donaldson, Billy Connolly, Ice T, Alan Keyes, Alanis Morissette, Chuck D, Hunter S. Thompson, and Pat Boone (whose presence entitles me to claim a Kevin Bacon number of 3). The film will open in LA and New York tomorrow, and over the following weeks in selected cities, including San Francisco, Berkeley, Minneapolis, Chicago, Boston, San Diego, Portland, Seattle, Atlanta, Santa Cruz, Washington DC, and St Louis. See you at the movies!

November 09, 2006

Ill-judged word choice lost Congress for GOP?

When Senator George Allen (R, Virginia) announced today that he had
given up his attempt at re-election to the US Senate and conceded to his
Democratic opponent, it became clear that the Democratic party will
control the Senate as well as the House in the next Congress. The margin
by which Allen lost was only about 7,000 votes (roughly half a percent
of the voters). For many voters, the decision was influenced by Allen's
foot-in-mouth problem, and particularly the incident in which he twice
referred to S. R. Sidarth, a brown-skinned Democrat campaign tracker
whose parents came from India, by the derisive nickname Macaca
(previous Language Log epithet-watch coverage
here
and
here
and
here).
The word was almost certainly intended as a racial epithet. It is
familiar among French colonists in Africa in that capacity (Wikipedia has
the basic facts about the word
here), and Allen's
mother is a French-speaking Tunisian Jew who would have been quite likely
to use the word that way in referring to North African Arabs.
(Allen seemed to think Sidarth was an immigrant. He is not;
he is Virginia-born and raised.) The subsequent
brouhaha ultimately necessitated a public apology from Allen. All in all,
it seems unlikely that the number of voters swayed by the Macacagate affair
was less than the 7,000 margin. And if that is right, then the control of
the US Senate and thus the entire legislature may have been turned over to
a different party because of one thoughtless nickname choice by a tired
and irritated candidate. (That's not an exculpation, by the way. Tired
and irritable he was, but reprehensible nonetheless.) It was surely
one of the biggest consequences of an
on-the-fly nickname choice in all of history. Watch your mouth,
politicians. It's a linguistic jungle out there.

You know, putting this incident alongside various others
(like the birthday babbling that cost Trent Lott his job), it
sometimes seems to me that politicians get insufficient training
in choice of words and idioms. It is as if they have not yet
grasped the nature of the huge change that has
taken place with respect to racism in the United States
over the past forty years. Those
who want to get their language use in line with
current standards should understand it very clearly.
It is not that racism has gone away (good heavens, surely nobody thinks that will ever happen). And it's not that racist talk has
been made illegal, or ever could be: the
First Amendment is simply not going to allow that.
You can speak your
opinions in this country, and express anything you want about the racial
inferiority or utter subhuman vileness of any racial group you may want
to take out after. No, it's not illegal to say racist things, it's not
even a misdemeanour; it is something much worse, for racists,
that has happened.
Racism has become not just unfashionable (itself almost a kiss
of death for those in public life) but
unacceptably disgusting to
most thinking people. And that's much more serious.

If you're a political candidate, then for you
to say something on camera that suggests
racist attitudes or beliefs is comparable to, oh, something like putting
your hand down the back of your pants to scratch your asshole and then
sniffing your finger. Nothing illegal there. But your campaign will
take a downswing from the moment that video clip hits YouTube.

This is not about the mythical political-correctness
"word police" of which the right-wingers
disingenuously complain. This is about thinking people simply seeing
what you do and turning away in disgust. If it were just illegal to say
"nigger" or "spic", a politician could perhaps survive it (politicians
do survive drunk driving arrests, and surely drunk driving is enormously
more serious and dangerous than having negative opinions about some racial
group). But it's worse than illegal. It picks you out as someone
to stay away from. It identifies you as disgusting and fit only
to be shunned. A person who would never be invited to dinner.
And you won't survive that in modern American politics.

Two upticks in a classical allusion

Only 17 words for snow

I'm not sure what the current record is for Eskimo N, the number of
words the Eskimos are claimed to have for snow, but this Sunday's New York Times Book Review yielded
an unusually modest Eskimo N, 17.

From Christopher Buckley's review (p. 18) of Chris Miller's The Real Animal House: The Awesomely
Depraved Saga of the Fraternity That Inspired the Movie, on the
vocabulary of the members of Alpha Delta Phi at Dartmouth in the
early 1960's:

There are... a few relatively innocent
terms, like the synonyms for breasts: "jehoshaphats," "baboos,"
"wazookies," "ka-hogas" and of course "gabongas." The Inuit
language contains -- what? -- 17 different words for "snow"? The
AD's must have twice that many for "vomit."

Buckley has obviously pulled the number 17 out of his, um, hat.
This number is what you're likely to come up with when you're asked to
pick a random number: it's the smallest prime number without any
special cultural significance. The numbers 2, 3, 5, 7, and 13 are
clearly special; 11 is not quite so special, though it is the number of
players on a football team (American or Association), and then you're up to 17.

Progress in malignancy tagging

There's some new news from BIOIE, an NSF-sponsored research project on information extraction from biomedical text, which is one of my day jobs. I share faculty responsibility with Fernando Pereira and Aravind Joshi at Penn, and Pete White at Children's Hospital; but as usual in such projects, most of the real research is done by graduate students. One of those students, Yang Jin, has just had a paper accepted by BMC Bioinformatics: "Automated recognition of malignancy mentions in biomedical literature".

Last month, I posted about "Fable 2.0", an on-line system that automatically tags articles with mentions of genes, normalizes the mentions so that various different ways of referring to the same gene are connected, and lets you search millions of articles to find genes associated with arbitrary boolean combinations of keywords. Yang's new paper applies the same named-entity tagger to finding clinical descriptions of malignancies, as part of a larger strategy to link molecular and phenotypic observations, both in reports of laboratory research and in clinical records.

Yang used the "same tagger" as Fable does, in the sense that he used a general-purpose program that will attempt to learn how to "tag" any sort of text regions at all, generalizing from a body of hand-tagged training material. To make a gene tagger, the program was trained on text hand-annotated for genes. To make a malignancy tagger, it was trained on text hand-annotated for malignancies. (This tagger was developed by Ryan McDonald while he was a grad student at Penn -- Ryan is now at Google -- based on the Mallet machine learning toolkit.)

Yang's malignancy tagger works pretty well: 0.84 precision, 0.83 recall, 0.84 F-measure. ("Precision" is the proportion of hits that are valid; "recall" is the proportion of valid mentions that are found; the "F-measure" is the harmonic mean of precision and recall. These days, across various entity types and document collections, such taggers generally have F-measures in the 0.7-0.9 range.)
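For anyone who wants the arithmetic spelled out, here is a minimal Python sketch of the F-measure as the harmonic mean of precision and recall. The function name f_measure is just a label chosen for this illustration, and the inputs are the rounded figures quoted above rather than the exact values from the paper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-measure)."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when nothing was found
    return 2 * precision * recall / (precision + recall)


# Rounded figures from the text; the result comes out near 0.835.
print(f_measure(0.84, 0.83))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a tagger cannot earn a high F-measure by trading away nearly all of its recall for precision, or the reverse.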

Yang's tagger also worked notably better than the obvious baseline of string-matching against a term list. Yang took the National Cancer Institute's neoplasm ontology, a term list of 5,555 malignancies, and tested it (on a random subset of abstracts from the larger test set) using case-insensitive string matching. Of the 202 malignancy mentions in this subset, the term-list method found only 85, for a recall of 0.42, while his tagger found 190, for a recall of 0.94. The mentions missed by term-list matching but found by the tagger included some variations in form for items already on the NCI list (e.g. "leukaemia" vs. "leukemia" or AML vs. "acute myeloid leukemia"), but also quite a few that simply weren't on the list in any form, such as "temporal lobe benign capillary haemangioblastoma" and "parietal lobe ganglioglioma".
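The baseline and the two recall figures can be sketched in a few lines of Python. This is only a toy illustration of case-insensitive term-list matching, not the code used in the paper: the two-item term list and the function names are invented for the example, and the counts are the ones quoted above.

```python
def term_list_mentions(text, terms):
    """Return the terms that occur somewhere in text, ignoring case."""
    lowered = text.lower()
    return [term for term in terms if term.lower() in lowered]


def recall(found, total):
    """Proportion of the valid mentions that were actually found."""
    return found / total


# Hypothetical two-entry term list standing in for the NCI list.
toy_terms = ["leukemia", "acute myeloid leukemia"]

# Case differences are handled by the matcher.
print(term_list_mentions("A case of Acute Myeloid Leukemia", toy_terms))
# A spelling variant that is not on the list is missed entirely.
print(term_list_mentions("A case of leukaemia", toy_terms))

# Recall for the term-list method and for the tagger, from the counts above.
print(round(recall(85, 202), 2), round(recall(190, 202), 2))
```

The second call comes back empty, which is the failure mode described above: exact string matching has no way to generalize from "leukemia" to "leukaemia", let alone to a malignancy that is not on the list at all.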

One of the most interesting and promising results was an essentially negative one. Yang trained the tagger in one trial with a completely generic set of features (words, character n-grams, and so on), that could be used for any entity tagging task at all, and in another trial with additional cancer-specific feature sets, in particular the NCI term list and a list of indicative suffixes. The generic tagger scored an F-measure of 0.834, while the addition of the cancer-specific feature sets only improved its performance to 0.838. This suggests that for some biomedical tagging tasks, domain-specific lexicons and other task-specific feature sets may not be needed.

But the single most important part of this story, in my opinion, is who Yang is. He's a graduate student in neuroscience, not in computer science or even in bioinformatics. We're beginning to enter an era when text-mining techniques are just another scientific tool, like a centrifuge or perhaps more analogously a package of software for fMRI analysis, available for use by researchers whose goals have no intrinsic connection to the analysis of language.

November 08, 2006

Double disingenuousness

Stephen Rowland has made me very happy. He has found a
linguification phrased as an embedded rhetorical question.
An astonishing blend of two of my recent topics of interest.
And in fact it actually co-occurs with a third trope, irony.
It's from an
article
by Michael Gove
in the Times (London) on November 8, 2006:

[A]ll true fans of The Sound of Music know that the most
important role in the production, the moral centre of the show,
is the Captain.

Before I go any farther, I know that I have to pause while some,
mildly perturbed, readers wonder how one can shoehorn the words
"moral centre" into a sentence which also contains the phrase
"Sound of Music".

Isn't that wonderful? Two disingenuousnesses compounded. He does not
really think his readers will be wondering about how a musical can be described as having
a moral center, he is being ironical, and pretending he thinks it is
uncontroversial shared knowledge
that musicals don't have moral centers (the mutual knowledge of
what the answer is supposed to be is what makes it a rhetorical
interrogative); and he doesn't really mean to raise the question
of whether the word sequence "moral center" might occur in a
properly formed sentence where "Sound of Music" also
occurs (that's the linguification).
I know that I have to pause while some, mildly amused, readers wonder
how one can shoehorn three tropes into one short but
rhetoric-heavy sentence.

Try and stop me, FCC

The Language Log news department has learned
that the FCC has just decided to reverse itself on certain cases
where they had ruled against uses of obscene language on
broadcast media. The CBS Early Show will not be fined for
broadcasting
an occurrence of the word "bullshitter".
(The New
York Daily News is remarkably coy, though, printing it as
"bulls-er"; comment later from Arnold Zwicky of the Language
Taboo Desk). It seems that the FCC is going to allow
more freedom for use of filthy language on
news programs than is allowed for similar cursing on other programs.

Well, Language Log is, of course, part of the news media,
so we have even more freedom now to say whatever we want, and broadcast it if we want to.
Which gives me the right to tell you about something that just arrived
in the mail that I thought I would share with you. Something to
convince you that we professors, even Kantian moral philosophy
professors, are not all prim and fusty and severe; we are
open, red-blooded, and always ready for a laugh.

Professor Jeffrie Murphy
of Arizona State University has just published the presidential
address that he gave in consequence of his receiving the honor of
selection as President of the Pacific Division of the American
Philosophical Association ("Legal moralism and retribution
revisited," Proceedings and Addresses of the American Philosophical
Association, 80.2, November 2006, 45-62). And about being
awarded the presidency by his peers, he says (p. 45):

The very day after I received notification of my selection as
president, my wife and I went to see the François Ozon film
"The Swimming Pool." Early in that film, the Charlotte Rampling
character — commenting on literary and academic awards —
makes this remark: "Awards are like hemorrhoids — eventually
every asshole gets one."

And we can report filthy talk like that, you see, without fear of
prosecution, because we are Language Log, and we are part of the
media in a free nation.

Linguists at work

Reading Anatoly Liberman's Word
Origins ...and How We Know Them: Etymology for Everyone (Oxford
University Press, 2005) has brought me back to some unfinished Language
Log business from long ago, a 2004
query about books that "could give a potential linguist some sense
of what it's like to be a linguist, to do linguistics".

Back then I said:

I found this a surprisingly difficult
question. Not-bad introductions to linguistics aren't hard to come by,
and there are some pretty good surveys of what has (or, actually, had)
been done in the field: some of the chapters in Shopen's set Language Typology and Syntactic Description
and in Newmeyer's Cambridge Survey
of Linguistics, for example. But such works present the product
of doing linguistics, not the activity.

For a feel for what it's like to do syntax, maybe Green & Morgan's Practical Guide to Syntactic Analysis.

For a sense of what it's like to do fieldwork and to discover something
about the structure of a language, the two Shopen volumes Languages and Their Speakers and Languages and Their Status.

And for thought-provoking reasonably brief essays, the two books that I
most often give to non-linguist friends who are interested in language:
Bauer & Trudgill's Language Myths
and, especially, Pullum's Great
Eskimo Vocabulary Hoax.

Then some weeks ago a correspondent who was working his way through the
Language Log archives from the very beginning wrote to ask if I had
ever answered the question (alas, no), and right after that my copy of
Liberman's etymology book arrived.

So let's start with the Liberman book, which I think is wonderful at
showing, in detail, how word histories are uncovered (or, as is often
the case, not). Along the way you get a lot of fascinating
etymologies, plus accounts of sound symbolism, borrowing, sound change,
semantic change, comparative reconstruction, and much more. You
should carry away an appreciation of just how HARD
etymology is, what an immense store of background knowledge is required
to do it well, how provisional many of the histories are, and how much
of history is probably not recoverable at all.

The whole book might be a bit much for some readers, but chapter 13 ("A
Retrospect: The Methods of Etymology") gives a nice summary, and the
two chapters that follow, on sound change and semantic change,
illustrate well the etymologist in action. The enormously
entertaining chapter 16 ("The Origin of the Earliest Words and Ancient
Roots") could, I think, be read on its own.

Now back to the 2004 question and replies to it. Several people
seconded my nomination of The Great
Eskimo Vocabulary Hoax. But two suggestions dominated the
responses I got: Steven Pinker's The
Language Instinct (which won Pinker the first Linguistics,
Language, and the Public Award from the LSA in 1997) -- I can't imagine
how I could have left this book off my list -- and Language Log
itself. (Remember: unlike Steve, we offer a full money-back
policy to anyone who's dissatisfied with the services we provide.)
And now some of our stuff has been published in Far from the Madding Gerund (see ad
on front page, and buy the book!).

Adam Parrish nominated Thomas Payne's Describing
Morphosyntax, saying: "It provides an outline for a
morphosyntactic description of a language and instructions on how to
fill in the details. It's billed as a guide for fieldworkers, but
I just like to read it and marvel at how languages are simultaneously
diverse and similar."

And Matt Post suggested, for computational approaches to language, the
first few chapters of Daniel Jurafsky & James Martin, Speech and Language Processing: An
Introduction to Natural Language Processing, Computational Linguistics,
and Speech Recognition.

Some further suggestions that have occurred to me: Pinker's 1999 book Words and Rules: The Ingredients of
Language, which shows an experimental psycholinguist grappling
with a lot of messy details about language, in particular about
inflectional morphology; George Miller's 1977 Spontaneous Apprentices: Children and
Language, in which you get to watch Miller and Phil
Johnson-Laird struggle to do research on child language acquisition;
and Miller's 1991 The Science of
Words, about all things having to do with words, with much
discussion of experimental work. Miller is an especially engaging
writer, by the way.

Finally, Mark Liberman recommended a very different sort of writing,
fiction with linguist characters:

None of these are by linguists. All of
them involve sympathetic central characters who turn out to be better
at analyzing the structure and content of exotic languages than the
structure and content of their own lives.

Vote for the woman whose mother uses this verb

Claire McCaskill, who ran successfully in Missouri for a US Senate
seat and beat out incumbent Jim Talent yesterday,
said to
Renée Montagne on NPR this morning
that part of the explanation for her success among the typically conservative people in the rural areas of the state
was that her mother was a native of the Ozarks,
and is so rural that
"she's the kind of woman that says ‘hornswoggle’
as part of her ordinary vocabulary."
The inference from having the verb lexeme
hornswoggle
to being appealing to Missouri farmers was apparently supposed to be
completely obvious, and for a moment I thought that was completely nuts.
Why, I thought, would anyone imagine that a person was worth voting
for because her mother knew a certain lexical item? Let's say
I have the word
psephologist in my active vocabulary; does that help you in
deciding whether you would cast a vote for my son Calvin?

But I guess the reasoning goes like this: "McCaskill's mother uses
hornswoggle; I use hornswoggle; hornswoggle is rare
or unknown in standard dialects, but familiar in dialects of rural people
like me; so the mother is probably a rural person like me; so the mother
probably has similar values to mine; and mothers teach their values to
their daughters; so the daughter probably has them too; so a vote for her
will probably be a vote for someone with values like my own." Far from
being a foolproof reasoning chain, but not entirely as irrational as at
first one might think. Voting is so often a matter of looking at a brief
resumé of a person you don't know, plus some repellent negative
allegations about them in an opponent's TV ads, crossing your fingers,
and hoping the electee won't turn out to be just another rascal. Using
a statistically unusual lexical item as a possible indicator of membership
in a social group with values you like might be one small way to make the
process less irrational.

Of course, the sociolinguistic judgment about the item in
question may not be right; as Ben
Zimmer remarked to me at the water cooler in Language Log
Plaza this morning:

I don't know if "hornswoggle" is such a reliable sociolinguistic
index. Ann Coulter, who hails from New Canaan, CT,
once said
that President Bush has shown "how easy it is to hornswoggle liberals."
Somehow I don't think those rural McCaskill voters would feel much
social or political kinship with Coulter.

How would a farm housewife in Wright County, MO,
react to a quintessentially urban
blonde bombshell who makes her living as rabid liberal-baiter,
hostile TV personality, fire-breathing columnist, self-parodist, and
ultraconservative performance artist?
I don't know. Some psephologist is probably working on it.

Update: We put an intern onto
checking Coulter's biography, and it turns out her mother was
born in Paducah, Kentucky! Maybe you can take the girl out of
the country but you can't take the country out of the girl.
We are now trying to lure Coulter to Philadelphia so we can get
her into our basement sociolinguistics lab, where we have...
umm... equipment suitable for robust and forceful interrogation.
We use it for eliciting information about speakers' dialect backgrounds. More news as we manage to extract it.

Swear it

To celebrate the publication of Keith Allan and Kate Burridge's Forbidden Words: Taboo and the Censoring
of Language (Cambridge University Press, 2006), I reproduce today's
poem on the Writer's Almanac, "Swear It" by Marge Piercy (from The Crooked Inheritance, 2006):

Swear It

My mother swore ripely, inventively
a flashing storm of American and Yiddish
thundering onto my head and shoulders.
My father swore briefly, like an ax
descending on the nape of a sinner.

But all the relatives on my father's
side, gosh, they said, goldarnit.
What happened to those purveyors
of soft putty cussing, go to heck,
they would mutter, you son of a gun.

They had limbs instead of legs.
Privates encompassed everything
from bow to stern. They did
number one and number two
and eventually, perhaps, it.

It has always amazed me there are
words too potent to say to those
whose ears are tender as baby
lettuces--often those who label
us into narrow jars with salt and

vinegar, saying, People like them,
meaning me and mine. Never say
the K or N word, just quietly shut
and bolt the door. Just politely
insert your foot in the Other's face.

The new Allan & Burridge follows their 1991 Euphemism and Dysphemism: Language Used as
Shield and Weapon, the two volumes together making a thorough
survey of the field of taboo language, its uses, and its regulation,
discussed "with deep erudition and a light touch" (as Steve Pinker puts
it in his blurb on the back cover). Full disclosure: I am one of
the helpful friends and colleagues thanked in the
Acknowledgements. (I was surprised to see my name so early in
this list -- third, right before Bill Bright -- but then I realized it
was alphabetized by first name. Bill Leap comes last, my usual
place in such lists, because he appears under the name William Leap.)

Giant space ants win control of Congress?

Let Me Just Say...
I for one welcome our new Democratic overlords. I'd like to remind them that as a trusted rightwing personality, I can be helpful in rounding up others to toil in their underground sugar caves.

The "explainer" that he links to is here, but it offers only the picture. Except for the announcer's simpsonian (simpsonical? simpsoniacal? simpsonistic? simpsonish?) yellow skin, there's no indication of the actual source of the joke.

That's because Goldberg has used this allusion before, to frame another electoral outcome that he didn't like, and he expects his readers to remember the explanation that he gave on that occasion. The date was 2/03/00, and John McCain had just defeated George W. Bush in the New Hampshire primary.

Recall, if you will, the episode of the Simpsons when Homer is selected to be a space shuttle astronaut. News anchor Kent Brockman is scheduled to interview the shuttle crew while they are in orbit.

But just before they "switch live" to the crew of the corvair craft, there's a mishap on board. Homer, unaccustomed to weightlessness, is veering, out of control, straight toward the ant farm the crew brought along for study. [...]

When news anchor Kent Brockman cuts to the live feed from the shuttle, the ants float by the camera lens — momentarily appearing gigantic. Then they lose the picture. Brockman instantaneously reports:

"Ladies and gentlemen, er, we've just lost the picture, but, uh, what we've seen speaks for itself. The Corvair spacecraft has been taken over — 'conquered', if you will — by a master race of giant space ants. It's difficult to tell from this vantage point whether they will consume the captive earth men or merely enslave them. One thing is for certain, there is no stopping them; the ants will soon be here. And I, for one, welcome our new insect overlords. I'd like to remind them that as a trusted TV personality, I can be helpful in rounding up others to...toil in their underground sugar caves."

When it becomes clear that the bugs are in fact not a "master race of giant space ants", Brockman quickly removes his "Hail Ants" sign hanging just behind him, covering the station logo. [...]

The moral of the story is that journalists (and party hacks) love power. Whether it's a new insect overlord or a candidate suddenly surging at the polls, the chattering class works under the assumption that whoever has power now will have it for a long time.

And some people doubt the benefits of a classical education!

Goldberg continued:

If that's not highbrow enough for you, consider George Orwell's 1946 observation that "Power-worship blurs political judgement because it leads, almost unavoidably, to the belief that present trends will continue."

As far as a few minutes' web research allows me to determine, Jonah has reserved the "overlords" allusion specifically for apparent defeats of W, whose opponents are all thus classified as temporarily magnified insects. Rumor has it that researchers at the Rockridge Institute have been toiling through the night to develop an effective counter-allusion. So far the leading candidates are "Nyah, Nyah: We're Back" and "Attach the Stone of Triumph!" More on this as it develops.

November 07, 2006

Attested subordinate rhetorical interrogatives

Almost as soon as I
mentioned
that it would be interesting to find actual examples confirming
Ivano Caponigro's suggestion that
interrogative subordinate clauses could
have rhetorical-question interpretations, Mark Liberman
noticed one
in something he was reading.
Well, further overwhelmingly convincing evidence has been coming in of actual
examples showing that interrogative content clauses can indeed express
rhetorical questions.

Bruce Rusk, of Cornell University's Department of Asian Studies,
contributed one from the early 18th century:

The new prophesying Sect, I made mention of above, pretend, it seems,
among many other Miracles, to have had a most signal one, acted
premeditately, and with warning, before many hundreds of People, who
actually give Testimony to the Truth of it. But I wou'd only ask,
Whether there were present, among those hundreds, any one Person, who
having never been of their Sect, or addicted to their Way, will give
the same Testimony with them?

He also points out that rhetorical questions can be embedded in
rhetorical questions. One structure that often shows this is the
rhetorical "Dare I ask...":

And, dare I ask who could not benefit from reading certain experienced
Illinois trial lawyer's tips and tactics. ;^>

http://www.legalunderground.com/2004/11/wwkly_report_i_.html

Says Bruce: "The writer knows that everyone would benefit from reading these tips (and
of course dares to ask). And doesn't even bother with the question mark."

Bruce offers another similar expression favoring rhetorical
interpretation: "Should I even ask...":

Should I even ASK how the field trip was? :)

http://kiwords.blogs.com/kiwords/2006/04/im_just_sharing.html

Bruce says: "The commenter assumes it was bad and that he/she
should not ask. I think (tone can be hard to judge)."

A similar structure that Bruce points out is "Do I have to ask...":

LMAO at the fools that take up politics. Do I have to ask what party
is doing this??? and like the previous post said, the union mafia
(dimwit votes) is probably going to get P.O.'d about this.
[Comment at http://wuzzadem.typepad.com/wuz/2006/09/how_to_lose_ele.html]

He notes that the continuation with "and" heightens the rhetorical
force of the question, and adds:

Actually, in the last case I think the rhetorical nature of the question
is only apparent if it's so framed. In the first case ("who could not
benefit..."), the rhetorical force is apparent even without the framing.
In the second, it's less clear, though it could be read into "And how was
the field trip? :)" In this last case, however, I don't think it would
be apparent to a reader (though it might be, to a speaker, from tone)
that the question was rhetorical: "What party is doing this???" just
sounds (again, in writing) like a "real question."

Can a question be rhetorical precisely because it's embedded in another?

Ora Matushansky wrote from Paris with another pair of attested examples,
found by looking for "makes me wonder" + "could possibly" on Google:

which kind of makes me wonder how you could possibly find less
funny videos

which is definitely an improvement, but makes me wonder what
exactly was the point in adding the RFID chip in the first place?

Let us agree, then, that Ivano Caponigro's intuition is correct:
the device of the rhetorical question should perhaps be referred to
more broadly as the rhetorical interrogative, since interrogative
clauses that do not directly ask questions can indeed have the
flavor that independent-clause rhetorical questions have.

The Duff curriculum

Arnold Zwicky recently wrote that "I recalled with pleasure [C.C.] Fries's careful development of a system of parts of speech via distributional analysis, using as raw data some fifty hours of (covertly) recorded conversations". Tom Duff emailed an interesting suggestion in response:

I wonder if there's a primary education hook here (and a way to promote general Linguistics awareness.) Unless the math is too heavyweight, it sounds like a research program that schoolkids could replicate: taking down each other's speech, analyzing the data, discovering the grammar of the language as used by their peers. I would have been so stoked by this when I was 9 or 10.

It sounds like a complete primary education program -- English, science & math all rolled together. And talking in class!

I think this is an absolutely terrific idea. There are many difficulties, some of which I'll sketch below, but the opportunities are even greater. And much of the needed computational infrastructure could be shared with other projects, pedagogical and otherwise.

First, let's generalize the idea. Although distributional analysis of word classes would be a fine thing to do, you wouldn't want either to start there or to stop there. Students could learn some acoustic physics with their math, while looking at pitch contours or measuring formant frequencies and segment durations. They could learn some simple statistics, especially if data is available to them from multiple classes and schools, by looking at the effects of age, sex, region and so on. They could analyze the rhetoric and the performance of speeches, or the dynamics of conversation, looking at how gestures and facial expressions are aligned with words and phrases. They could compare vernacular and formal speech. They could look at different languages, for example to see how differently words with similar meanings are used.
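
To give a flavor of what the distributional part might look like on a computer, here is a toy sketch that groups words by the immediate contexts they share. It is emphatically not Fries's procedure (he worked with test frames over many hours of recorded conversation); the function names and the four sample sentences are invented for illustration.

```python
from collections import defaultdict


def context_frames(sentences):
    """Map each word to the set of (left, right) neighbor pairs it occurs in."""
    frames = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so first and last words also get a frame.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            frames[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return frames


def shared_frame_groups(frames):
    """Return frame -> sorted words, keeping frames filled by 2+ words."""
    by_frame = defaultdict(set)
    for word, contexts in frames.items():
        for ctx in contexts:
            by_frame[ctx].add(word)
    return {ctx: sorted(ws) for ctx, ws in by_frame.items() if len(ws) > 1}


# Invented sample "transcript"; real data would be far larger and messier.
sentences = [
    "the dog barked",
    "the cat barked",
    "the dog slept",
    "a cat slept",
]
groups = shared_frame_groups(context_frames(sentences))
for frame, words in sorted(groups.items()):
    print(frame, words)
```

Even on four sentences, "dog" and "cat" end up together, as do "barked" and "slept", and "a" and "the". With so little data the groupings are fragile, which is one reason pooled data from many classes would matter.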

I can say from personal (though informal) experience that bright nine- or ten-year-olds are interested in this kind of thing, at least at the level of looking at waveforms, spectrograms and pitch tracks of their own speaking, singing and assorted weird noises, or using web search to try to figure out what the right way to say something in Spanish is.

And as a technical matter, it would be fairly easy to make such analyses available to kids. Most of the needed infrastructure is already available, as free software on generic personal computers -- though you'd need to create more kid-friendly (or teacher-friendly) versions in some cases. There's one thing that's still missing, however: support for sharing data and for conveniently accessing shared data. The main motivation for this is that many interesting things, including distributional analysis of word classes, require more data than one class could collect; but even sharing data within a group of 30 or so students could be challenging without an appropriate system.

Here's one idea about what you'd want: a server where anyone can upload audio (and video too) with appropriate metadata; an Ajax-based tool for creating, editing and viewing transcriptions (and other time-aligned annotations), also saved and accessed on the net; a mechanism for defining virtual corpora out of sets of these annotated audio/video files; and a user interface (and an API) for searching such virtual corpora.
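
As a very rough sketch of the data model this implies, here is one way the pieces could fit together in memory. Every name below (Recording, Annotation, virtual_corpus, search) is hypothetical, and a real system would sit behind a server and a database rather than in Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    start: float   # seconds from the start of the recording
    end: float
    text: str      # transcription of this stretch of audio


@dataclass
class Recording:
    recording_id: str
    metadata: dict                      # e.g. region, age of speaker
    annotations: list = field(default_factory=list)


def virtual_corpus(recordings, **criteria):
    """Select the recordings whose metadata matches every given criterion."""
    return [r for r in recordings
            if all(r.metadata.get(k) == v for k, v in criteria.items())]


def search(corpus, word):
    """Yield (recording_id, annotation) where the annotation contains the word."""
    target = word.lower()
    for rec in corpus:
        for ann in rec.annotations:
            if target in ann.text.lower().split():
                yield rec.recording_id, ann


# Invented sample data, for illustration only.
recordings = [
    Recording("r1", {"region": "midwest", "age": 9},
              [Annotation(0.0, 1.5, "well I dunno"),
               Annotation(1.5, 3.0, "look at that")]),
    Recording("r2", {"region": "northeast", "age": 10},
              [Annotation(0.0, 2.0, "Well that was fun")]),
]
midwest = virtual_corpus(recordings, region="midwest")
hits = list(search(midwest, "well"))
```

The point of defining a virtual corpus as a metadata query rather than a fixed copy of files is that the same uploaded recordings can serve many different classes and projects.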

This would be useful for education through the graduate-school level, and for many scientific and engineering projects as well. I think that anyone who's ever taught or done research in this general area can see how it might be used.

OK, enough enthusiasm. Now for some of the (very serious) problems with the idea.

1. Most elementary-school and high-school teachers don't have the background needed to understand and teach such stuff, much less to create course materials based on it.

2. There are ethical and legal problems, in the general area covered by "human subjects" regulations, that are more acute in dealing with kids. You'd have to worry about how to prevent students from releasing information about personal identity, or inappropriate information about themselves and their families, or slander about their classmates, or whatever. This is related to the problems that MySpace and Facebook have, except that in this case, (some of) the material would be created or used under the authority of schools, which need to be much more cautious.

3. Even if problems (1) and (2) were dealt with, my guess is that the hardest problem here is how to create "lab exercises" that would work for students of different ages, backgrounds and interests, as presented by a similarly diverse set of teachers.

All the same, the general idea is a wonderful one. The (additional) infrastructure is worth implementing for other reasons -- more on this later. And I guess the way to make progress on the pedagogical problems would be to try it out with some kids in pilot projects, which could be in schools or in other contexts, like a summer camp or a museum program.

[Update -- Mike Maxwell and Bill Poser remind me that Ken Hale had the idea, more than 30 years ago, of using study of the Navajo language to teach the scientific method to Navajo students. Bill mentioned this work in an earlier LL post ("Reintroducing diagramming", 11/7/2004). Ken wrote a (still unpublished) textbook in support of this idea. Mike also cites Josie White Eagle, "Teaching Scientific Inquiry and the Winnebago Language", International Journal of American Linguistics, 48 306-319, and a paper by Michael Barkey, "Linguistics and Scientific Inquiry" (ms. dated 9/4/2006), which includes a brief review of "what others have done" (pp. 26-28), including Nigel Fabb's "Linguistics for ten year olds" (MIT Working Papers in Linguistics, 6, 45-61, 1985). Josie White Eagle's paper in turn refers us to a 1970 paper by Samuel Jay Keyser, "The role of linguistics in the elementary school curriculum", Elementary English, January 1970, 39-45. A bit of internet searching also turned up a brief review by Wayne O'Neil, "Linguistics in the Science Classroom: Progress and Prospects".

Since the general concept has been around for more than a generation, we need to ask why it has never been adopted to any significant extent. My speculation would be that it's because curricular innovation is hard; because most teachers lack the knowledge and skills needed to teach such material; and because the cultural trend has been strongly against teaching any analytic skills at all, at least in the area of language and communication.

Are things any different now? Well, the anti-analytic tide may have turned; the internet's "long tail" effect makes it easier for enterprising teachers to find and use curricular materials; and it may be possible to design interactive web-based materials and tools that can help teachers (and students) develop the concepts and skills that they need to make such ideas work. Also, we might get some added traction from the use of corpus-based rather than intuition-based methods, especially for kids who are already used to internet search.]

Leanne Hinton Wins Lannan Award

Yesterday's New York Times carried a full-page ad announcing the 2006 winners of Lannan Awards for Cultural Freedom. One
recipient is Leanne Hinton of the University of California at Berkeley, arguably the world's most effective and influential advocate for language preservation and revitalization. Leanne has long worked with California Indian tribes who are on the point of losing, or have lost, their heritage languages. Her famous Master-Apprentice program has been adopted by communities in which a few elders still speak the tribal language fluently; her regular Breath of Life workshops at Berkeley are an important resource for communities whose languages are no longer spoken but are sufficiently well documented that they can (with hard work and some luck) be revived. Shortly before Ken Hale died, he and Leanne co-edited the influential sourcebook The Green Book of Language Revitalization in Practice. Everyone who works with Native American tribes, and with other communities around the world whose heritage languages are endangered or moribund, is greatly indebted to Leanne for her work and her inspiration. And with the most optimistic estimates predicting the death of 50% of the world's 6,000 or so languages by the end of this century (the most pessimistic estimates range up to a 90% extinction tally by 2100), all linguists ought to respect Leanne's work and to congratulate her on her Lannan Award.

November 06, 2006

Charles Carpenter Fries

I had occasion recently to refer a graduate student to Charles
Carpenter Fries's 1952 book The
Structure of English. She's working on a cluster of issues
having to do with syntactic categories and subcategories, and I
recalled with pleasure Fries's careful development of a system of parts
of speech via distributional analysis, using as raw data some fifty
hours of (covertly) recorded conversations. Though many linguists
are now looking at syntactic categories and subcategories through the
lens of the constructions words can and cannot occur in, and though a
great many linguists now draw their data from corpora, Fries's work is
scarcely known. He has no Wikipedia page, except for a place-filler
("Diese Seite existiert noch nicht") on the German Wikipedia site.

Well, I think it's time for people to pay some attention to C. C. Fries.

I never met Fries, or Paul Roberts, whose 1956 textbook Patterns of English is a
presentation of Fries's system for classroom use. But the two
books are an important part of my intellectual history: one of my high
school English teachers used Patterns
as a text in English grammar -- quite a remarkable step, then as now --
and so gave me my first taste of linguistics. It was
delicious. A couple of years later, at Princeton, I took intro
linguistics (with the Gleason text) first chance I got, even though I
was a math major. I was hooked. On to the intro to
historical linguistics (with the Hockett text) and reading Sapir,
Bloomfield, Fries's Structure
book, Harris's Methods in Structural
Linguistics (1951), and, yes, Syntactic
Structures.

The Fries system has four major syntactic categories, called "parts of
speech", in "classes" numbered 1 through 4 (Roberts maintains Fries's
notation, but is willing to label the four classes Noun, Verb,
Adjective, and Adverb), plus fifteen minor categories of "function
words", in "groups" lettered A through O. Some of the groups have
only one member (Group C, not;
Group H, expletive there),
and several gather together words that are largely ignored in
traditional English grammar (Group K, comprising utterance-initial well, oh, now, and exclamatory why; Group M, comprising the
discourse markers look, say, and listen). There are extended
treatments of sentence patterns, immediate constituents, the syntactic
functions "Subject" and "Object", and much else.

Well worth looking at now.

But what happened? Why did Fries pretty much disappear from sight?

Look at the dates. While Fries was getting his book to press,
Chomsky was writing The Logical
Structure of Linguistic Theory; he finished the manuscript in
1955, the year before the Roberts book was published, and the year
after that Syntactic Structures
appeared. By the time Fries died, in 1967, generative grammar was
flourishing and American structuralism was increasingly
marginalized. Fries's careful procedures and concepts defined
from (real-life) data had no place in the world of Universal
Grammar. Well, they're back, and it's time to say some good words
for Charles Carpenter Fries.

zwicky at-sign csli period stanford period edu

[Update from Mark Liberman -- Dan Everett writes:

Ken Pike told me many stories about Fries that support Arnold's statements. Ken's first presentation in linguistics was on tone languages, to the plenary session of the LSA in 1936. There were only 12 people in attendance, but on the front row were Bloomfield, Sapir, Trager, Bloch and Fries. In the second row was the new PhD, Charles Hockett. After his presentation, Pike said that Sapir wanted him to do his PhD with him at Yale. Bloomfield offered him a spot at Chicago. And Fries talked to him about coming to Michigan. Pike said that he chose to work with Fries over Bloomfield and Sapir because Fries' work was more concerned with helping people learn to do linguistics and apply it.]

Taboo avoidance in Dilbertland

Scott Adams has turned his attention to taboo avoidance, and he doesn't
like what he sees. In his blog
of 11/4/06, he declares that "the most obscene letter in the
alphabet is the asterisk." But to balance that judgment, he
explains how the asterisk protects us:

Naked naughty words can destroy your
brain and also society as a whole. However -- and one would think this
is obvious -- It's completely safe to THINK naughty words. And it's
safe to cause other people to think naughty words. But if you spell
those naughty words without the asterisk loin cloth to protect your
victims, you're a danger to society. I know this to be true because I
heard it from lots of people who have sh*t-for-brains.

From one non-proscriptivist to another

I am a linguist by training. Long before I delved into free software and was snagged by the quagmire of marketing, I pondered the marvels of morphology, the grimness of grammar and the splendor of semantics. It is only natural then that my wrangling criticism of industry-speak, in both technical and literary modes, is informed by ingrained linguistic sensibilities, descriptive and proscriptive. Given my background, I find it vexing when open source is used as a verb.

In my travels with OSDL, I frequently hear our eponymous Open Source employed as a transitive verb. As in "My company open sourced our product." Now, I am no petty proscriptivist.

Me neither. But I'm happy to offer some free editorial advice to a fellow linguist.

First, I think you mean "prescriptivist", not "proscriptivist". The OED tells us that a prescriptivist is "An adherent or advocate of prescriptivism", and that prescriptivism is "The practice or advocacy of prescriptive grammar; the belief that the grammar of a language should lay down rules to which usage must conform". There's no OED entry for "proscriptivist", but we could regard it as a regular derivation from proscriptive, which is glossed as "Characterized by proscribing; tending to proscribe; of the nature or character of proscription". The verb proscribe has the glosses

I. 1. trans. To write in front; to prefix in writing. Obs. rare. Perhaps a scribal error for prescribe.
II. 2. To write up or publish the name of (a person) as condemned to death and confiscation of property; to put out of the protection of the law, to outlaw; to banish, exile. Also fig.
b. To ostracize, to ‘send to Coventry’.
3. To reject, condemn, denounce (a thing) as useless or dangerous; to prohibit, interdict; to proclaim (a district or practice).

And the noun proscription is glossed as

1. The action of proscribing; the condition or fact of being proscribed; decree of condemnation to death or banishment; outlawry. Also fig.
2. Denunciation, interdiction, prohibition by authority; exclusion or rejection by public order.

"Word rage" may be common among English-language prescriptivists, but ostracizing, death, banishment and confiscation of property are not really on the agenda here. Rejecting, condemning, denouncing, prohibiting and interdicting might be, so it's not nonsensical to use "proscriptivist" to mean "someone who condemns or denounces others for misuse of words". But it's confusing to coin a new word, when there's an old one that's almost identical in sound and essentially equivalent in meaning. And you're likely to make a bad impression on your readers, who may suspect you of having committed a malapropism or created an eggcorn. So my advice is to proscribe the use of "proscriptivist", and to prescribe "prescriptivist" instead.

Weinberg's commentary continues:

English is a dynamic, productive language in which nouns can become verbs, and verbs can return the favor. Consider the word source (n. from Middle English sours, from Anglo-French surse spring, source, from past participle of surdre to rise, spring forth, from Latin surgere). Today, source is as often uttered as a verb as it is a noun, as in the dreaded labor term, outsource.

Why do people ... who propose to offer authoritative advice to educated people not use standard sources of information? ("You could look it up", as Casey Stengel is reported to have said, with reference to his claim that most people his age were dead.)

One way to "look it up" in this case would be to check examples of the word source on the web. So I read through the first ten pages of Google's returns for a search on source, without finding any examples of source as a verb. This suggests to me that it's unlikely to be true that source the noun and source the verb are equally common these days.

But Weinberg wrote "uttered", so maybe we need to check conversational use. Well, in the 26 million words of English-language conversations indexed at LDC Online, there are 392 instances of the word source, of which 391 are nouns, and one is a verb:

now everybody that can out source to a cheaper you know find a cheaper worker somewhere
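To make the lopsidedness concrete, here is a minimal sketch of the arithmetic, using only the counts quoted above (392 instances, 391 nouns, one verb). The variable names are my own, chosen for illustration, and nothing here queries LDC Online.

```python
# Counts reported above for "source" in the LDC Online conversational transcripts.
total_instances = 392
noun_instances = 391
verb_instances = total_instances - noun_instances

# Share of all instances that are verbs, and how many noun uses there are per verb use.
verb_share = verb_instances / total_instances
noun_to_verb_ratio = noun_instances / verb_instances

print(f"verb uses: {verb_instances} of {total_instances} ({verb_share:.2%})")
print(f"noun uses per verb use: {noun_to_verb_ratio:.0f}")
```

If the noun and the verb really were equally common, the share would be somewhere near 50%, not a fraction of one percent.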

(By the way, Weinberg's sentence offers another classic example of the McKean/Skitt/Hartman Law of Prescriptive Retaliation. When he wrote that "source is as often uttered as a verb as it is a noun", I think he left out an "as". I could be wrong about this, because both wordings give me a sort of unpleasant headachy feeling, but I believe that the clause ought to read "source is as often uttered as a verb as it is as a noun." In any case, it's probably not a good idea to create a 14-word clause involving either three or four copies of the word as. A better option might be something like "source the verb is now as common as source the noun". It's still false, but it reads better.)

We haven't quite gotten to Weinberg's real point yet, but we're getting closer. He continues:

What I find nettling is the presumption of what syntacticians call agency. In pragmatic grammar (as opposed to case grammar), the subject of a transitive verb is the agent that performs some act upon the patient or direct object of the verb. Dog [agent] bites [verb] man [patient]. The dog bites the man because it wants to, because it can. (Maybe a better example for software is "Cat throws up hairball"). But is it meaningful to say that the owner of a piece of code can open source that code, by fiat?

This is confusing. Is he saying that all transitive verbs have agents as subjects? In all flavors of grammar that I'm familiar with, some transitive verbs have agentive subjects and some don't. Here are a few examples where transitive verbs have subjects that are causes or themes or experiencers, not agents:

A fallen tree blocked the road.
The noise bothered her.
The bullet entered his chest and lodged near the spine.
Everyone in the room heard the explosion.

In those sentences, "blocked", "bothered", "entered" and "heard" are perfectly good transitive verbs, although none of their subjects are agents by the usual linguistic definition. Certainly none of them "performs some act upon the patient ... because it wants to". But in any case, someone who makes software available under an open-source license is both legally and linguistically a sentient agent who intends the result that is achieved.

Well, let's put all the grammar aside, because we're about to get to the real point:

There are actually four distinct stages for source code, only one of which I consider open source. ... The first is source code as documentation ... The second is source code as bait ... The third is source code under an OSI license ... The fourth and canonical scenario that embodies the true meaning of open source is a community of developers and users cooperatively building, deploying and maintaining project code.

OK, fair enough. Weinberg's idea seems to be that we shouldn't say "(person or company) A open-sourced (software system) X" just to mean "A made X available under an open source license", because X won't really be true "open source software" until and unless it comes to have an active community of developers and users.

I agree with Weinberg that "Without community, the source code behind open source is just a dusty tome, lifeless, static and unread". But lifeless, static and unread open source software is still open source software.

And therefore it's not only meaningful, but also true, to say that the owner who released some code under an OSI license "open sourced" it. You can object to this usage on aesthetic grounds, if you want to, but the business about subjects and agency is beside the point. English syntax and semantics are neutral on this one.

[Hat tip to Tiego Tresoldi.]

[Update -- Bill Poser writes:

Bill Weinberg seems to be confusing a canonical association
between transitive subjecthood and agency with a rigid implication.
There are languages that have something stronger than what
English does. In Japanese this association is sufficiently strong
that, as Susumu Kuno pointed out years ago, it is generally not
possible for a transitive verb to have an inanimate subject.
To say "History repeats itself", for example, is bizarre in
Japanese.

If he were writing about Japanese, I would just have pointed out that sentences like "IBM open-sourced UIMA" in fact have perfectly good agentive (and even quasi-animate) subjects.

What he's really saying is that the real agent of the open-sourcing process is the developer and user community, not the software owner. The trouble is, that's a moral or political judgment, not a linguistic one.

Yahoo! made this library available under an open source license.
The Redmond company made WiX available under an open-source license on SourceForge.net...
...we made it available under an Open Source license available at [10].
The CMU Sphinx project ... has made Sphinx2 available on SourceForge under an Open Source license.

There's nothing wrong with these sentences, in terms of syntax, semantics or word usage. And it's also sanctioned by the norms of English to refer to software that is available under an open-source license as "open source". As a result, it would be perfectly natural, from the point of view of English morphology and syntax, to rephrase each of those sentences using the causative neologism to open-source <something>, meaning to make <something> open source. Thus "The CMU Sphinx project has open-sourced Sphinx2 on SourceForge".

You might decide against this rephrasing because you don't like to make a new causative verb out of a complex nominal of the form adjective+noun -- that would be a reasonable stylistic preference. But to reject such usage on the grounds that "it takes a village to open-source a program" (as we might paraphrase Weinberg's argument) confuses morphosyntax with politics.]

November 05, 2006

White Horses

Language Hat's discussion
of the Chinese spelling of "Africa", in which the character 非 is used
phonetically, triggered a few thoughts about Chinese philosophy. The word
非 is usually translated as "not", or in compounds, as a negative prefix
such as "un-" or "i(n)-", as in 非法 fēi fǎ "illegal".
This has led some people, on encountering the statement 白馬非馬
in the writings of philosophers of the 名家 míng jiā "Logicist"
school, as, for example, in the
title of a famous work by 公孫龍子
Gōngsūn Lóngzǐ
known in English as
the White Horse Dialogue,
to interpret it as "A white horse is not a horse.", which appears to be a contradiction.
A key point is that in Classical Chinese 非 can mean "different", so the statement
can be read as "A white horse is different from a horse".
The White Horse Dialogue
plays on the ambiguity created by the two meanings of 非.

The Logicist tradition in Chinese philosophy, which bears a much closer relationship
to the Western scientific tradition than does the Confucian tradition, was largely
submerged by Confucianism and was for a long time poorly known. It was brought to
prominence by Hu Shih 胡適,
who, after a traditional education in Chinese philosophy, became a student of John
Dewey, in his doctoral dissertation The Development of the Logical Method in Ancient China.
To my astonishment, it is available from Amazon.com.
He is probably better known as one of the leaders of the shift
to the use of the vernacular in written Chinese and as the ambassador
of the Republic of China to the United States from 1938 to 1941.

Newroz Píroz be!

Although Turkey has taken some steps toward reducing its oppression of the Kurds
in hope of being admitted to the European Union, it keeps on backsliding.
It is reported
that Osman Baydemir, a prominent human rights activist and now the mayor of Diyarbakır,
is being prosecuted for sending out cards containing New Year's greetings in
Turkish, Kurdish, and English. "Happy New Year" in Kurdish is Newroz Píroz be!,
the publication of which violates Act 1353 of November 1, 1928 on
Adoption and Application of Turkish Letters, which forbids the use of any
letters not found in the Turkish alphabet. Turkish does not use the letters q, w,
or x.

The Constitution of 1982, incidentally, is unusual in that, while it contains provisions guaranteeing freedom of expression, it explicitly empowers the government to prohibit the use of languages not once but twice (Article 26, par. 3, Article 28, par. 2) and declares (Article 174, par. 6) the Act of November 1, 1928 to be constitutional. As amended in 2001 [the amendments are marked in this Turkish text], the Constitution no longer explicitly empowers the government to ban languages but continues to enshrine the Act of November 1, 1928. As I pointed out in my
comment
on a previous incident of this type, Turkey enforces this law selectively, using it only
for the purpose of suppressing Kurdish. Meanwhile, one of the few good things that
can be said to have come out of the invasion of Iraq is the improvement
in the status of the Kurds and Kurdish.

George Lakoff, Card-Carrying Chomskian

George Lakoff has taken a lot of knocks for his theories about political language and the cognitive roots of political orientations, which is not unexpected for someone who has courted controversy throughout his career. Some of the criticism is justified, I think -- in fact I've gotten in my own two cents in my book Talking Right and in a recent post on Open University, the New Republic's academic blog. But some of the charges are really loopy, and none is so weirdly off-the-wall as the description of Lakoff offered in a piece by the Bloomberg columnist Andrew Ferguson claiming that Lakoff's influence is fading, which also ran in the New York Sun and was picked up by various conservative bloggers:

A disciple of the notoriously anti-American Massachusetts Institute of Technology professor Noam Chomsky, Lakoff first earned a wide public audience -- inadvertently -- with his essay "Metaphors of Terror," published a few days after 9/11.

Now call George Lakoff what you will -- and a lot of people have done just that -- "a disciple of Noam Chomsky" he ain't, not in his politics and certainly not in his linguistics.

True, you could argue that Lakoff's work bears the methodological traces of his Chomskian upbringing; virtually no modern linguist has escaped Chomsky's influence, after all. But Lakoff and Chomsky have been at theoretical and personal loggerheads since the 1960's, and to call Lakoff a Chomsky disciple is like calling Dave Winfield a disciple of George Steinbrenner.

As nonsensical as it is, though, this little factoid has been making its way around the influential conservative blogs. From Little Green Footballs, for example:

I'm sure it will come as no surprise to learn that Lakoff is an admirer of Noam Chomsky.

George Lakoff, the Chomsky protege who's a big fave with Democrats these days, has a new book out urging leftish politicians to spin more.

It isn't hard to see what's going on here. Many people assume that there's some connection between Chomsky's politics and his linguistics, and a lot of them go on to conclude that linguistics itself is constitutively a leftish discipline. So when Lakoff emerged as an influential political figure, it seemed natural to blur both his politics and his linguistics with Chomsky's, particularly for those who didn't know jack about linguistics. Whatever your political views, it's a depressing reminder of how widespread the ignorance about the field of linguistics is (not that we exactly needed another one). But then it's probably asking too much to expect people who find it expedient to conflate Lakoff's garden-variety liberalism with Chomsky's anarcho-syndicalism to take the trouble to learn the difference between Chomsky's minimalism and Lakoff's cognitive linguistics. Oh well, they have the sense they were born with.

Clifford Geertz, 1926-2006

Jeff Weintraub has a nice appreciation of Clifford Geertz, who died last week, along with links to other appreciations of this great scholar. Geertz's work displayed a rare combination of depth and subtlety, and a resolute refusal to accept easy answers or facile generalizations about culture. It's a testimony to his influence that "thick description" became a term of art widely used not just in anthropology but in the social sciences and humanities, including linguistics (it gets 188,000 Google hits), even if he sometimes raised an eyebrow at the way people deployed the phrase. I knew Cliff a bit, and fondly recall both his personal warmth and intellectual generosity. He will be missed.

Aut blog aut mori

The motto of NaBloPoMo ("National Blog Posting Month") is a nice example of dog latin. (Though in former times "blog or die" might have been dog-latined as "aut blogare aut mori", with some approximation to an infinitive form of the verb blog, more closely echoing the original "aut vincere aut mori" = "conquer or die".) No matter what the pseudo-Latin morphology might be, though, I don't agree with the sentiment, since I make it a principle not to blog unless it feels like fun. Life has enough necessities without adding another.

The alternative motto "Blog today -- tomorrow you may be eaten" is funnier but even less attractive. I'd go along with "Blog longa, vita brevis", especially if you add the rest of the quote.

November 04, 2006

Grammar on the gay beat

Genre, a lifestyle magazine
for gay men, has an advertising section every month with interviews of "Genre men", one from each of
several cities. Questions about favorites: gym, bar, restaurant,
retailer, place to hang. (Picture of gay male life: we work out,
go to gay bars, like to eat and shop and hang with other gay men,
cruising them.) Other more specific questions, some silly ("If
you were a cocktail, what would you be?" and "Who is the chick you'd
switch for?"), some serious ("When/How did you come out?" and "What are
you afraid of?").

New York City is represented in the November issue (p. 71) by Arnie
Plotnick, a 46-year-old cat veterinarian who's inclined to smart-aleck
answers:

First thing you do in the morning?

My boyfriend.

And then there's the question:

If you could go back in time to any
year, what year would you go to?

Plotnick snaps back:

I'd go back to a time when people
didn't end sentences with a preposition.

Ah, that prescriptive fiction Dryden's Rule, a.k.a. No Stranded
Prepositions. Particularly ridiculous here, since the fronted
version of the interviewer's question is stunningly awkward:

If you could go back in time to any
year, to what year would you go?

Whether by intention or accident, the Genre
editors get their revenge by following the stranded-preposition
exchange with this one:

How important are politics to
you? Be honest.

Very important. We have to do everything we can to stop the
radical right-wing bigots from destroying our country and everything it
stands for.

There you have it: stranded prepositions AND a
prejudice against them. They are everywhere.

(In case you're curious, the answers to the other questions above are,
in order: New York Sports Club, Gym Bar, RUB (Righteous Urban
Barbecue), Whole Foods, and "on the grass by the pier on Christopher
Street"; Absolut Peach and Tonic; Bjork "or maybe J.K. Rowling"; "I
didn't really come out of the closet" because "the entire house kind of
fell down around me instead"; and gay Republicans.)

November 03, 2006

Madonna in Malawi: distinguished white lady?

It's a fair bet that most Americans were unaware of the existence of
the poverty-stricken African nation of Malawi before Madonna
decided to fund an orphanage there and adopt a Malawian child. Now that
Madonna is making the media rounds to smooth over criticism of her
adoption efforts, she's also bringing some unexpected attention to Chichewa,
Malawi's national language (along with English). The Associated
Press reports:

"People started to say my name and they had
never heard of Madonna,"
the 48-year-old singer, talking about her recent visit to Malawi, told
AP Television in an interview Tuesday.
"And, in Chichewa, the word 'madonna' means 'distinguished white lady,'
so I think they got very confused."

At the very least Madonna's mention of Chichewa is an improvement
over earlier reports
that she was planning to learn "Bantu" so that her adopted
son could remain in touch with his Malawian roots. "Bantu" of course
refers to a language
family rather than a particular language, encompassing Chichewa and
hundreds of other distinct languages throughout the southern half of
Africa. So if Madonna has now figured out the name of the language of her new son's homeland, how'd she do with the gloss of madonna as 'distinguished white
lady'?

Turns out she wasn't too far off, though she should probably keep
working on those Chichewa lessons. The word
to which she refers is madona,
which consists of dona 'lady'
plus the plural prefix ma-.
(Compare makaku, the plural
of kaku 'mangabey' in Bantu
languages of Gabon and Congo, which is the etymon for macaque and, possibly, George
Allen's notorious epithet Macaca.)
The word appears in the Chichewa equivalent of "ladies and gentlemen," mabwana ndi madona — as in this
song in honor of Malawi's first president Dr. Hastings Banda (or
more recently, this
post on a Malawian message board). So for starters, it's 'ladies'
rather than 'lady'. But what about the 'distinguished white' business?

The honorific dona has
indeed been used to refer to white women in Malawi since colonial
times. Presumably the word itself is a vestige of Portugal's early
colonial ties to Africa, derived from Portuguese dona — cognate with Spanish don/doña and ultimately from Latin dominus/domina. (Note also the Old Italian cognate donna, which in the form ma donna 'my lady' provides the etymological source for Madonna Ciccone's first name.)
According to an article on the Dutch Reformed Church Mission in Malawi (History of Education Quarterly,
Autumn 1984), the DRCM's female missionaries were known as madona, and the girls' homes that
they supervised in the late 19th century were called Ku Madona.
The presence of European women in Malawi also gave rise to a style of
ceremonial mask known as Dona
(described here
and here),
emulating the women's foreign features. So (ma)dona
has survived as a more generalized honorific for distinguished
ladies as well as a term specifically used for women of European
descent.

The racialized sense of (ma)dona
did apparently have some resonance for Yohane Banda, the father
of Madonna's adopted son David. At least that's what a Malawian blogger named Steve Sharra
wrote, most likely based on local news reports:

For Mr. Yohane Banda, who had never heard of
the pop diva Madonna
until she visited Malawi last week to adopt his 13 month-old son David,
the closest he could relate with the material girl was the word Dona,
meaning rich white woman, in Malawian parlance. In a matter of days, he
now knows her, and the rich guy Guy Ritchie, as the new parents of his
son.

Perhaps that rich white lady Madonna will do for Chichewa what Mel
Gibson (before his fall from grace) hoped to do for Yucatec Maya.
Shouldn't every indigenous language have its own celebrity spokesperson?

Elliott Bay in view! Oh! The Joy!

Elliott Bay is an inlet of Puget Sound that forms Seattle's harbor, and so the Elliott Bay Book Company is a terrific Seattle bookstore (150,000 titles) and cafe. If you live in the Seattle area, you probably already know that. But you might not know that Geoff Pullum will be there for a book reading and signing, at 2:00 p.m. on Sunday, November 5.

Geoff won't be handing out money, as he did back in June in order to encourage people to come to his reading at the MIT Coop. However, I hear that Tom Sumner will be there, giving out ("a limited supply of") free stuff. Anyhow, Geoff's readings are legendary, so you shouldn't need to be bribed. And as Tom points out on his blog, the Seahawks' game is Monday night, so what else are you going to do on a rainy Seattle Sunday afternoon?

The title of this post refers to the entry in William Clark's journal that recorded his first view of the Pacific Ocean, in November of 1805, just about 201 years ago. ("Ocian in View! Oh! The Joy!") To avoid geographical confusion, I should hasten to tell you that Lewis and Clark were following the Columbia River, and therefore reached the Pacific 100 miles or so to the south of Elliott Bay, which in any case wasn't named until 1841. Of course, William Clark was also a pioneer of plain spelling, so it's unlikely that he would have gotten the two l's, the two t's, the use of o for the reduced final vowel, etc., even if he had reached Elliott Bay, and known it by that name. He might have referred to "Eliot", "Elliot", "Eliott", "Elliatt" or some other kind of bay. I might have, too, except that I used the internet to check the spelling, and to find you a map link.

>>>Elliott Bay, 2:00pm. It's inevitable that someone would come along to rip Strunk & White a new one. It's the good fortune of every user of a non-fossilized version of the English language that that someone is as eloquent as Geoffrey Pullum. Pullum's one of the prime movers behind the essential linguistics blog (seriously!) Language Log, and the co-author of Far from the Madding Gerund.

Plain spelling

Thank you, Scotland. First John Knox, then the Enlightenment and now the Scottish Qualifications Authority. In a direct challenge to the English at their most reactionary, the authority has declared that it will accept text-messaging short forms in school examinations. The dark riders of archaism will protest and the backwoods will howl. No spell is cast as dire as spellcheck. But the champions of reason are massing north of the border and need our support.

Sample quotes:

I have no quarrel with grammatical authoritarianism. Grammar is a vehicle that needs a highway code of human communication. To parse is to prosper. [...]

In contrast, spelling has become a no-go area, an intellectual tundra. While plain writing is considered a stylistic virtue, plain spelling is a vice. English orthography is an edifice of unreason. Word endings are the last gasp of the Anglo-Saxon and Norman invasions, embedded in the cultural DNA of literary Brahmins. Not to spell properly is a sign of being common, as once was ignorance of Latin. Knowing your "ie" from "ei" or -ible from -able does not affect a word's meaning one jot. It is a caste mark, its distinction deriving from its very obscurity.

Most linguists think that this is backwards: syntax and word usage can take care of themselves, pretty well, but spelling does need standardization. The basic argument is that writing is artificial in a way that speaking is not, and orthography is the most artificial part of writing, so that the normal human process for creating and maintaining cultural norms is good enough for grammar, but not for spelling, which therefore needs to be established as "made order" rather than a "grown order".

The form and content of this argument are certainly valid, but it does have a bit of the smell of a rationalization. At least, it's certainly true that the Elizabethans got on fine with what Jenkins calls "plain spelling" (i.e. chaotic spelling) -- though this would have made search engines harder to implement, if they'd had them.

TLSX: reverse engineering the language module

I'm in Austin for TLSX ("Texas Linguistic Society, Ten"), an annual conference run by grad students at UT. This year's theme is "the application of techniques from computational linguistics to descriptive linguistics and the analysis of less-studied languages".

I'll blog from the conference site as time permits. I'm going to start with an entry that does have to do with linguistics and with blogging, but not with TLSX (though maybe there'll be some sort of connection, who knows). The subject is Ken Macleod's new novel, "Learning the world: A scientific romance", and I've had it on my to-blog list for a month or so.

From the cover blurb:

Humanity has spread to every star within five hundred light-years of its half-forgotten origin, coloring the sky with a haze of habitats. Societies rise and fall. Incautious experiments burn fast and fade. On the fringes, less modified humans get on with the job of settling a universe that has, so far, been empty of intelligent life.

Being at a conference run by grad students reminds me: does the appeal of space-colonization stories come partly from their success as a metaphor for the experience of young people starting out to find a place in the world?

More from the cover blurb:

The ancient starship But the Sky, My Lady! The Sky! is entering orbit around a promising new system after a four-hundred-year journey. For its long-lived inhabitants, the centuries have been busy. Now a younger generation is eager to settle the system. The ship is a seed-pod ready to burst.

Graduate school does seem to last for centuries, for some people, and UT does have an unusually large linguistics department. And perhaps the UT department is entering intellectual orbit around some new topics. But I think this joke has been pushed far enough.

One of Learning the World's narrative threads comes from a young girl's blog. The book opens with her first entries:

13 364:05:12 16:24

The world is four thousand years old. I was eight years old when I found that out for myself. My name is Atomic Discourse Gale and this is the first time I have written something that anyone in the world can read. It is strange and makes me feel a little self-conscious, but I reassure myself that not many people will read it anyway.

14 364:05:13 18:30

That was a joke. I see I have a few readers. J---- wants to know how I found out the age of the world. It was six years ago now but I remember it quite well. I was very young then and didn't understand everything that happened, but looking back I can see that it was a significant event in my life. That is why I mentioned it. So this is what happened.

OK, let's get to the point. The second planet of the new star system is inhabited by sentient bat-creatures, who have reached a roughly Victorian-era level of technology. Their economy is based on slave labor provided by trudges, members of a semi-sentient related species -- roughly as if humans had succeeded in domesticating and enslaving chimpanzees.

In their spread across the galaxy, humans have never encountered any life much above the level of slime moulds. But they've got ethical principles in place, all the same, which say that they should leave the bat-people alone, except for some discreet observation mediated by bioengineered beetles and the like. But what does non-interference mean, in this case? Would colonizing the system's other planets, the asteroid belt, etc., be OK? Maybe not, since the bat people are on a track to invent space flight and do it themselves. Still, a dissident faction plans to force the issue by breaking off part of the ship (it's designed to work that way) and starting the colonization process anyhow.

And is non-interference really the right policy, given the bat people's warring societies and their cruel treatment of the trudges? Some crew members don't think so.

In this context, a bit of genetic hackery goes badly wrong -- or maybe exactly right, depending on which level of whose plans you attend to. From p. 281 of the TOR paperback:

She reviewed what he had told her, replaying the words and sentences her anger had whited out and shouted down the first time.

The problem, the intellectual problem, was this. No Rosetta stone existed for the bat people's language. No amount of observation, no iteration of linguistic heuristics, could decode an unknown language from recordings alone. For mutual understanding, there had to be mutual interaction. One had to know directly what one side of the conversation was trying to say, and that meant one side of it had to be you. Faced with this impasse, the crew's scientists had, in all too characteristic a fashion, worked around it. Their solution had all the grubby fingerprints of a brute-force kludge.

The neural structure of the human brain's language-processing module, named in deep antiquity Chomsky's Conceit, had been known since the Caves. The genetic code of the Destiny II biosphere was known from aerial microorganisms returned to the stealth orbiter. The amount of information and genetic instruction that could be packed in a nanoassembler was vaster by far than even the vast amount stored in natural genomes and machinery, cluttered as they were with redundancy and junk. The information-processing hardware capacity of the ship was beyond all human conception, and the amount of information its science software could extract from the slenderest and most fragile of evidence was limited only by the ingenuity of the human inquiry that initiated it.

So . . . they'd had the means to install Chomsky's Conceit on any big enough brain down below. They had the means to generate radio transmitters within host bodies, as they'd done with the dung-beetles. And faced with the crash-and-burn and banning of that project, they'd skipped blithely ahead to a bolder one. They couldn't install Chomsky's Conceit on the brain of bat people -- the aliens' brains already had a language module of their own. That would have given rise to wetware conflicts and deep grammar errors, and anyway, ethically, that would never have done. Oh no. That would have been wrong. That would have interfered. What they had done, bless their reckless little souls, was to set up the machinery to install the module on the brains of the slaves, who had (they'd figured) no language module (and who were, therefore, not slaves but beasts). And once they'd received and filtered and processed and quantum-handwaved the information coming back from brains learning the bat people's languages, the translation protocols had been ---

Reverse-engineered from the language module!

Holy rocking shit.

From a narrow technical point of view, this is just clever intelligence work: getting the information needed to be able to understand what the bat people are saying to one another. But from another point of view:

"You realise what you've done?" she demanded. "Do you have the faintest conception of the harm this will cause?"

Constantine nodded. "The disruption will be immense. It'll destroy the entire slave economy."

"But they're not slaves!" Synchronic said. "If they had been, I could see why we might want to interfere. But you've taken what are by your admission mute brutes, and given them language. Deep grammar. Self-awareness. Human consciousness. You've made them slaves."

Caliban writ large. (No, not Taliban -- that's a completely different problem.) For a later post: why MacLeod's future humans have misread Chomsky. If you're really curious about this, you could check out these earlier posts:

Bad Boy Science

In the current American Prospect, a nice piece by Jaana Goodrich called "Where the Boys Are" (subscription required, unfortunately) takes on the conservatives' enthusiasm for same-sex schooling and the psychological "evidence" for the vast cognitive differences between the sexes that are held to justify separate education, as presented by the likes of Michael Gurian and Leonard Sax:

Too bad that the scientific evidence underlying these recommendations is unclear at best and nonexistent at worst. Mark Liberman, on the Web site Language Log, takes apart some of the bad science Sax uses in his popular book Why Gender Matters. He also points out that any average sex differences in learning styles are small and swamped by the individual variations within each sex.

Goodrich goes on to dismantle the idea of a "boy crisis," which she lays to sloppy research and discomfort with the idea of girls doing better than boys. She concludes:

None of this probably bothers the Republican Party's socially conservative base. Social conservatives already view gender roles as innately determined and single-sex schools fit admirably into their sexual abstinence agenda. Neither are conservative anti-feminists likely to be upset over these developments: Anything that pokes a finger in the eye of second-wave feminists with their claims of equal treatment for girls and boys is fun for this group.

November 02, 2006

Social science on the playground

On the recent Linguistics 001 midterm, one of the questions that most people missed was "What is the Machiavellian Intelligence Hypothesis?" Obviously my lecture on hypotheses about language evolution failed to get this point (or at least this term) across. For the record, "Machiavellian intelligence" (popularized as a term by a 1988 book of that title) refers to the hypothesis that "the driving force in the evolution of human intellect was social expertise--a force which enabled the manipulation of others within the social group, who themselves are seen as posing the most challenging problems faced by primitive humans".

In the interests of making this idea more vivid, here's a little story that someone recently sent me, about an interaction among elementary-school children (in a galaxy far away):

There's this new kid, A, who, according to B, never smiles and is extremely hateful to other people. He's in C's group, with T for teacher.

At recess, C was complaining to B that T got mad at him and it wasn't fair. He said T turned to him just before recess -- because C was sitting closest to her -- and asked where A was. C said, "I don't know, and I bet nobody else does either, because 9 out of 10 people don't like him." That made T mad.

B told C the problem was that if you quote a statistic, most people will think you agree with it. So they took a survey among three kids on the playground -- X, Y, and Z -- asking, "If someone quotes a statistic to you, will you assume they agree with it?" X said yes, Y said yes, and Z said, "What's the right answer to that question?" indicating that she just wanted to be on the same side as B and C.

But B points out that while it really wasn't fair of T to get mad at C for merely quoting a statistic, in fact C does agree with it.

And of course, he did make up the statistic. Although B says he thinks C's estimate may be conservative, because they only know of one kid who likes A.

The cabinet of Dr. Birdwhistell

In the fictional 1880s, Sherlock Holmes closed the case of Silver Blaze by paying attention to the curious incident of the dog that did nothing in the night time. In the middle 1960s, Paul Ekman opened his study of the universality of facial expressions by paying attention to the curious incident of the file cabinet that didn't exist in an anthropologist's office.

Ekman came at the problem of facial expressions with a strange combination of Freudian and Skinnerian motivations:

As a freshly trained clinical psychologist, my therapeutic orientation was psychoanalytic, but as a researcher I had been trained as a behaviorist, a radical Skinnerian. Skinner said that psychology should examine only observable behavior; there were to be no inferences about what might be going on inside the head. ...

I was dissatisfied with the evidence for the effectiveness of psychoanalytic therapy, which rested on what the patient and a therapist said. I wanted to examine not words but real behavior (from a Skinnerian viewpoint) -- body movements and facial expressions. Examining the non-verbal behaviors of patient and therapist might reveal evidence of clinical improvement not shown in their words, and perhaps would suggest ways to improve therapeutic techniques.

Body movements and facial expressions are certainly real enough, but it seems odd to view them as "real behavior" in a sense that spoken words are not. In any case, Ekman spent "a few years studying hand and leg movements", and then moved on to the face.

I had not read Expression, but had heard about it and thought Darwin was probably wrong. As a Skinnerian, I thought it unlikely that expressions would be universal, and I was sure that inheritance could not play a role in emotional behavior. But it didn't really matter what I thought, as a Skinnerian, it was better not to have any forethought about what you were going to study. I would just get the facts.

As it turned out, Ekman's empirical methodology would come into conflict with his empiricist ideology. The first clue that there might be a problem was the curious incident of the missing data in Ray Birdwhistell's office:

Before starting my research on the face, I visited Birdwhistell. I expected to find file cabinets full of data, notebooks crammed with detailed observations, or racks of film documenting his position. Birdwhistell was surprised at my request to see his documentation, for what he had seen and observed was all in his head. We did not get along. He could not understand what I thought I might be able to prove by re-opening the question of whether facial expressions are universal, when he had found the answer was 'no'. He could not comprehend why I was dissatisfied with his conclusions with no documentation or data others could inspect or attempt to repeat.

Ekman also talked with Gregory Bateson, who was charming, and Margaret Mead, who was not.

I clearly remember that meeting, the appearance of her office, and her unfriendly, gruff manner. She had little patience for the quest I was about to begin. She knew that I had been to see Birdwhistell and that I disagreed with Birdwhistell's view that the question of universality was settled. I did not anticipate how angrily she would react later when my findings challenged Birdwhistell's claims.

Some of the animosity may have come from the fact that Birdwhistell was Mead's student. But much of it was a clash of cultures:

They believed in the value of the lone anthropologist and his or her fieldwork, trusting in his or her own intuitions and judgments. The idea of using multiple observers, of gathering quantitative data, of building in safeguards against the influence of the scientist's commitments, which are standard in experimental psychology, were foreign to them.

Next: people in 21 countries agree about pictures of facial expressions; Birdwhistell argues they've learned about facial expressions from John Wayne and Charlie Chaplin; Ekman studies the Fore people in the highlands of Papua New Guinea to escape the influence of Hollywood; Margaret Mead writes in the Journal of Communication that Ekman's work "is a continuing example of the appalling state of the human sciences".

November 01, 2006

Is there coded language in the House?

Ruth Marcus of the Washington Post (see here) believes that if or when the Democrats gain leadership of the House of Representatives, it would be a bad idea for the incoming Speaker, Nancy Pelosi, to appoint Rep. Alcee Hastings of Florida to head the House Intelligence Committee. Hastings has the seniority to take over that role but Marcus points out that he has a serious blot on his record, because in 1988 he was impeached and removed from his office as a Federal judge.

So what does this have to do with linguistics? And why comment on it on Language Log? Because linguistic analysis played a role in Hastings' impeachment process and Marcus, who covered this event for the Post, still recalls the salient factors, as follows:

"The evidence against Hastings is circumstantial, but it's too much to explain away: A suspicious pattern of telephone calls between Hastings and Borders at key moments in the case; Borders' apparent insider knowledge of developments in the criminal case; Hastings' appearance at a Miami hotel, as promised by Borders as a signal that the judge agreed to a payoff; a cryptic telephone conversation between the two men that appears to be a coded discussion of the bribe arrangement."

So what is this cryptic conversation that leads Marcus to say:

"I don't worry that as chairman he'd suddenly be for sale: If he could be entrusted with national security secrets as a committee member, why not as chairman? But this is no ordinary crime, and Intelligence is no ordinary committee."

It all began in 1981, when a DC lawyer, William Borders, and Judge Hastings were involved in a criminal extortion proceeding. Borders was convicted but Hastings was acquitted. Months later, the House Subcommittee on Criminal Justice was not convinced of Hastings' innocence and it started its own lengthy investigation of the matter. At issue were several intercepted telephone calls between Borders and Hastings. To the Subcommittee, the conversations looked like coded messages but they couldn't figure out how to prove this. In 1988 they called on me to analyze the conversations with the sole purpose of answering the question about whether or not they indeed contained some kind of code.

The main focus was on one very short conversation between the two men. The ostensible topic was the judge's plan to write support letters for Hemphill Pride, a South Carolina attorney who had run afoul of the law and was now trying to reverse his disbarment. The government believed that Hastings and Borders were involved in a plot to extort money from a man they believed to be Frank Romano but who was actually an undercover agent using the name Rico. Borders assured Rico that he could get a judge to provide a favorable sentencing report if Rico would ante up $50,000. The government further believed that part of this money was to go to Judge Hastings. No linguistic analysis of the conversations was made at the trial.

My task was to discover whether the conversation was actually in code. But first, let's look at what they actually said to each other in this short, 19-line conversation:

Phone ringing:

(1) B: Yes, my brother.

(2) H: Hey, my man.

(3) B: Uh-huh.

(4) H: I've drafted all those, uh, uh, letters, uh, for Hemp.

(5) B: Uh-huh.

(6) H: And everything's okay. The only thing I was concerned with was, did you hear if, uh, did you hear from him after we talked?

(7) B: Yeah.

(8) H: Oh, okay.

(9) B: Uh-huh.

(10) H: Alright then.

(11) B: See, I had, I talked to him and he, he wrote some things down for me.

(12) H: I understand.

(13) B: And then I was supposed to go back and get some more things.

(14) H: Alright. I understand. Well, then, there's no great big problem at all. I'll, I'll see to it that, uh, I communicate with him. I'll send the stuff off to Columbia in the morning.

(15) B: Okay.

(16) H: Okay.

(17) B: Right.

(18) H: Bye bye.

(19) B: Bye.

If this was a code, it certainly wasn't a total, obvious code. That is, it wasn't the type that cryptologists deal with where the intent of the coding is to be so unclear that the message can't be deciphered by outsiders to the code. Such codes look like codes and strive to be impossible for outsiders to understand.

Nor was it the usual partial and obvious code, where ludicrous nouns and verbs substitute for the intended meaning, similar to the ones Oliver North described in his 1989 book, Taking the Stand: "If these conditions are acceptable to the banana, then oranges are ready to proceed." (p. 143).

If Hastings and Borders were using a code here, it was a partial and disguised code, one in which words were carefully selected to make it appear to anyone who should happen to intercept it that the participants are talking about one thing while, in reality, they are talking about something very different. In his Encyclopedia of Language David Crystal cites such a code used in a murder case in India: "Go clean the bowl" was used to mean "Prepare the grave."

In partial and disguised codes both participants must understand the code, which must be relevant to real life situations, be plausible, be specific, and be consistent; such codes generally require more confirmation of mutual understanding than we find in everyday conversation. So the question was whether this type of code was being used here.

Lines 1 and 2 appear to be a standard greeting routine between two friends but in line 3 Borders gives a feedback marker, "uh-huh," that suggests a willingness to give up his turn of talk immediately. Oddly, there is no request to Hastings about why he is calling such as "What's up?" or "What's on your mind?" Nor did Borders seize the opportunity to assert his own agenda, such as "I'm glad you called because..."

In line 4, Hastings explains that he's drafted those letters for Hemp, to which Borders says only "uh-huh," accomplishing no more than giving up his turn again. Note that Hastings used the pause filler, "uh," three times in line 4. Pause fillers can accomplish at least three things: to prevent interruption, to provide assurance that more is coming, or to struggle to find the right word to use. In hastily constructed codes, one expects speakers to struggle to find the word that accomplishes the code. These pause fillers tend to occur in exactly those places where the potential code word is to follow:

(4) uh, uh letters

(4) uh, for Hemp

(6) uh, you hear from him

(14) uh, I communicate with him
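The positions listed above can be checked mechanically. What follows is only a minimal sketch, not the method used in the original analysis: it takes Hastings' turns exactly as given in the transcript and reports the words that follow each pause filler. The names hastings_turns and words_after_fillers, and the three-word window (span), are assumptions made up for this illustration.

```python
import re

# Hastings' turns, copied from the transcript above and keyed by line number.
# (Hypothetical data structure for this sketch.)
hastings_turns = {
    4: "I've drafted all those, uh, uh, letters, uh, for Hemp.",
    6: "And everything's okay. The only thing I was concerned with was, "
       "did you hear if, uh, did you hear from him after we talked?",
    14: "Alright. I understand. Well, then, there's no great big problem at all. "
        "I'll, I'll see to it that, uh, I communicate with him. "
        "I'll send the stuff off to Columbia in the morning.",
}

def words_after_fillers(text, filler="uh", span=3):
    """Return up to `span` words following each run of the pause filler."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    results = []
    i = 0
    while i < len(tokens):
        if tokens[i] == filler:
            # Treat consecutive fillers ("uh, uh") as a single hesitation.
            while i < len(tokens) and tokens[i] == filler:
                i += 1
            following = []
            for tok in tokens[i:i + span]:
                if tok == filler:  # stop at the next hesitation
                    break
                following.append(tok)
            results.append(" ".join(following))
        else:
            i += 1
    return results

for line_no, turn in hastings_turns.items():
    print(line_no, words_after_fillers(turn))
```

A listing like this only locates the hesitations; whether the words that follow them are code words remains a matter of interpretation.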

In lines 6 through 10, Hastings asks if Borders has heard from "him" after they last talked. Since "him" is not specified, we can assume that they both understand who this is. Borders says "Yeah," but oddly does not report what he heard from "him." Nor does Hastings pursue this, offering only "Oh, okay," to which Borders says "Uh-huh," and Hastings says, "Alright then." This odd exchange appears to signal that complete information has been given when it has not--unless "did you hear from" is code for "did you get X," for which the rest of the exchange would have been appropriate.

Borders makes his first substantive contribution to the conversation in line 11: "See I had, I talked to him and he, he wrote some things down for me." Note the care with which he constructs this sentence, including a false start and pronoun repetition at the points where a code word could be expected. This gives the appearance of Borders' struggle to not slip into uncoded talk along with his effort to remember the code consistently. Hastings' response, "I understand," serves as a confirmation of truth or the existence of facts presented by Borders. It's curious why Borders' sentence would require confirmation of understanding unless it is a code for something else. Hastings' "I understand" can relate only to the "he wrote things down" part of the sentence, since it had already been established that Borders had heard from "him." Even so, writing some things down is hardly monumental enough to require confirmation of understanding. On the other hand, it is appropriate as a confirmation of the potential coded meaning of something else.

Lines 13 to 15 continue the odd-sounding dialogue. Borders had already referred to "things" that someone had written down and now he elaborates a bit, saying that he was supposed to go back and get some more "things." If "more things" are things to say in the support letter for Hemp, one might expect it to be said differently, such as, "He couldn't think of everything you should say so he'll think about it more and get back to me." In any case, although it may be appropriate to write "things" down, it is odd to say that one will go back and get more things. Such wording can work nicely, however, for a different (coded) meaning of "things." Note also Hastings' false starts and pause filler in line 14, again in front of potential code words.

Other issues also point to the use of a partial, disguised and hastily constructed code here. For example, one would expect Borders to have said that Hemp wanted him to "come back" for more things rather than "go back." One also wonders why Hastings said that there was "no great problem" in a context where going back to get more things was allegedly benign. This suggests that there may have truly been a big problem about something else. And why, after Borders has said that he was supposed to go back and get more things (line 13), does Hastings change the plan so abruptly on line 14, saying that he will communicate with Hemp himself? Borders' response to this change in plan was only a mild, "Okay." Finally, Hastings' change in the procedure includes, "I'll send the stuff to Columbia in the morning," more consistent with sending something other than a support letter.

The crucial expressions used here appear to be "letters," "wrote some things down," "get some more things," "I'll communicate," and "send the stuff," all easily translatable to other meanings. Apparently "things" had morphed into "stuff" by the end of the conversation. This gives evidence of a hastily constructed, partially disguised code in which the participants intended their meaning to look like support letters for Hemphill Pride. Interestingly, Pride informed me personally that he never requested such letters and that as far as he knew, none existed. At his impeachment hearing, Hastings claimed that the style of speech he used in this conversation was his typical mode of talk. But comparison of this conversation with the others in evidence showed none of these features that look very much like code.

Rep. Hastings is a congenial man who is generally well liked and thought to be competent. He has served in the House of Representatives since 1992 and has enough seniority to head the House Intelligence Committee. Marcus doesn't think it's a good idea to appoint him to that post, largely because of this 19-line conversation. She may have a point.

However,...

My last
posting on Garner's Rule -- which proscribes sentence-initial
linking however -- ended with
an unresolved issue: I observed that college student writers seem to be
fond of the discourse connective however,
and to prefer to put it in initial position (rather than
sentence-internally), and wondered why. That was Act I of
"The Story of However".
Now, Act II, with some reasons. The zero-tolerance policy ZT-1,
"If they do it too much, they should be told not to do it at all", will
return to the stage and play a prominent role in this act.

(Acknowledgment: I'm reporting on joint work with Douglas Kenter.)

It's fairly easy to see why writers like discourse connectives (in
general, not just markers of contrast like however and but) in sentence-initial position:
if you use a marker C to connect a sentence S with preceding discourse
D according to the scheme

D C+S

the marker comes between the things whose contents it relates;
structure reflects function. In addition, sentence-initial
connectives are easy to produce and easy to process, while other
schemes of connection are more demanding. Sentence-internal
connectives are interruptions within their sentence:

The test is demanding.

Most students, however, will get all the answers right.
Most students will, however, get all the answers right.

and sentence-final connectives hold off information about discourse
connection until the last possible moment, where it may come as
something of a surprise:

Most students will get all the answers
right, however.

The other main option, expressing discourse connection via a
subordinating conjunction on the sentence S' preceding S --

C+S' S

Although/Though the test is demanding, most students will get all the
answers right.

involves the complexity that is associated with subordination in
general.

The point here is not that these other options are inferior -- there
are occasions when they would be excellent choices -- but that a
sentence-initial linker is the simplest way to connect a sentence to
preceding discourse, so it's no surprise that students are inclined to
go for that scheme a lot of the time.

Ok, a sentence-initial connective, but which one? For expressing
contrast, the main contenders are however and but. These items differ in
(at least) three relevant ways: in their prosodic properties, in their
stylistic levels, and in their syntactic category. (Actually,
Kenter and I maintain that they also differ subtly in meaning and/or
discourse function, and we aren't the first to make this claim.
But that's a matter for another day.)

First, prosody. However
has three syllables, has an accent of its own, and comes with a prosody
that separates it from the sentence it modifies. But has only one syllable, is
usually unaccented, and is prosodically integrated with what
follows. Overall, however
is a lot "weightier" prosodically than but. Things follow from that.

Some people report that they like however
just because it's more substantial, more prominent, than but. They see however as a more emphatic marker
of contrast, or at least a more noticeable one.

As I noted in my last posting, Bryan Garner sees things the other way;
he finds initial however
"unemphatic". De gustibus and all that. But there are
reasons for a sensible person to like but:
it's shorter and less ostentatious; however
holds off the sentence that follows for an appreciable amount of time,
and it shouts "Contrast!" (As usual when we're comparing
alternatives, the things that distinguish them cut both ways,
functioning as either advantages or disadvantages, depending on the
context and the writer's purposes. That's why we should want both
alternatives to be available to writers and speakers.)

On to stylistic level. Here, people generally agree that however is more formal than but -- however groups with
adverbials like moreover,
furthermore, consequently,
therefore, nevertheless,
and nonetheless -- with the
result that many people like it when they're doing formal
writing. College students seem to like it especially, probably
because one of the things they're working at is to get the proper level
of formality in their writing. (They often overshoot, of course.)

(A little digression: complaints that initial however is weak, monotonous, etc.
seem not to be extended to the other formal discourse adverbials in
initial position. The concentration on however puzzles me; furthermore is in competition with and, and consequently and therefore with so, in much the same way as however with but, yet however gets all the
attention. Maybe it's just intellectual fashion. Maybe it's
all Strunk's fault.)

Notice that I said that however
is more formal than but, not
that but is informal or
colloquial. My judgment here is that but is in fact stylistically
neutral, usable at all levels, and this seems to be Garner's judgment
as well. In choosing between a neutral and a more formal
alternative, Garner seems to aim for a "plain style" and recommends the
neutral item, and in fact that's my practice too. That's why I
use so little sentence-initial however.
(Garner's preference for neutral items over more formal alternatives
undoubtedly contributes to his enthusiasm for Fowler's Rule, insisting
on restrictive relative that
over the more formal which
when both are available.)

Finally, syntactic category. Here we approach the dramatic climax
of "The Story of However".
However is an adverbial, but a coordinating conjunction, and
this second fact introduces a conflict into our play's action.

A little story: whenever Kenter and I talk about our investigations
into but and however, a significant number of
people in our audiences are astounded to hear that there are
authorities actually RECOMMENDING sentence-initial but. Almost all of the
students in the audiences respond this way. (And now, after
yesterday's posting, my mailbox is filling up with similarly surprised
messages from all over the place.) But, but, they clamor, we were
taught NEVER to begin a sentence with but, or any other coordinating
conjunction (and and so are the other usual offenders).

Taught where? In grade school and high school. No Initial
Coordinators (NIC) is all over the place in those precincts. Some
Stanford undergraduates told us that their section instructors in PWR
(Program in Writing and Rhetoric, the successor to Freshman
Composition) insisted on NIC. I happen to know that the main
texts used in PWR do not advocate NIC, so these section instructors
were rolling their own advice (well, probably just handing on things
they themselves had been taught). Still, NIC had some college
presence. And at Stanford. I was appalled.

In any case, what were the kids taught in elementary and secondary
school? Don't use but
to start a sentence; USE HOWEVER
INSTEAD! So of course college students very frequently
opt for however; it's just
what they were taught to do. Now we see the dramatic conflict:
NIC vs. Garner's Rule. You can't obey them both.

I will soon speculate on the origins of NIC. But first, some
disavowals of NIC, beginning with Mark Liberman right here on Language
Log:

There is nothing in the grammar of the
English language to support a prescription against starting a sentence
with and or but --- nothing in the norms of
speaking and nothing in the usage of the best writers over the entire
history of the literary language. Like all languages, English is full
of mechanisms to promote coherence by linking a sentence with its
discourse context, and on any sensible evaluation, this is a Good
Thing. Whoever invented the rule against sentence-initial and and but,
with its preposterous justification in terms of an alleged defect in
sentential "completeness", must have had a tin ear and a dull mind.
Nevertheless, this stupid made-up rule has infected the culture so
thoroughly that 60% of the AHD's (sensible and well-educated) usage
panel accepts it to some degree.

(And, sadly, Microsoft's Grammar Checker tries to enforce NIC.)

Mark notes that the AHD note for and
rejects NIC out of hand, and he provides a smorgasbord of cites (and
statistics) from reputable authors. Similarly MWDEU. Paul
Brians, collector of common errors in English, labels sentence-initial
coordinators a "non-error". Bryan Garner denies, all over the
place, that NIC has any validity. Even the curmudgeonly Robert
Hartwell Fiske tells his readers that there's absolutely nothing wrong
with sentence-initial coordinators. A point of usage and style on
which Liberman and I and the AHD and the MWDEU stand together with
Brians and Garner and Fiske (and dozens of other advice writers) is,
truly, not a disputed point. NIC is crap.

But still it lives on, as what I've called a zombie
rule. It's been lurking in the grammatical shadows for some
time -- at
least a hundred years, to judge from MWDEU. Hardly any usage
manual subscribes to it, but it
is, apparently, widely taught in schools, at least in the U.S., with
the result that educated people tend to be nagged by a feeling that
there is something bad about sentence-initial and (and but and so). (It might well be that
this sense of unease rises with level of education. Someone
should look at this possibility.)

I speculate now about two questions: how did the proscription arise,
and why does it persist?

Grammatical
proscriptions that are at odds with elite usage can arise in three
ways, two of which were probably at work in the case of
sentence-initial and/but/so:
as an expression of individual taste; as a consequence of "theoretical"
claims about grammar; and as a by-product of well-intentioned efforts
to improve student writing and speech.

Most of the advice literature on English is the product of individual
people -- essayists, poets, editors, journalists, literary scholars,
lawyers, translators, and other people who deal in a practical way with
language -- who see themselves as serving as arbiters of style as well
as guardians of the formal standard written language. There's
plenty of room on matters of style for the arbiters to retail their
personal likes and dislikes as instructions to others. But as far
as I can tell, the impulse to impose personal taste has played no
significant role in the rise of the NIC
zombie.

But "theoretical" considerations surely have. There is a
widespread belief that sentences -- in both writing and speaking --
should be "complete", not fragmentary, in fact that complete SENTENCES
are signs of "complete", well-ordered THOUGHTS (and
that incomplete, fragmentary sentences are signs of incomplete,
disordered thoughts). The underpinning belief is that the
superficial syntactic form of sentences is a direct reflection of the
structure of the thoughts these sentences convey. This is a very
silly idea, and when it's combined with an almost exclusive attention
to single sentences, rather than organized discourses, it yields the
claim that fragmentary sentences are very bad things.

(Animosity towards fragmentary sentences has had occasionally
pernicious results -- perhaps, most famously, in claims by Bereiter and
Engelmann back in the '60s that kids, or at least impoverished black
kids, who answered wh-questions (Where
is the monkey?) with fragments rather than full sentences (In the tree rather than The monkey is in the tree) were
betraying an inability to think clearly. The recommended
treatment for their deficit in thinking was drilling on always
producing complete sentences in answers to questions.)

NIC can
be seen as just a special case of No Fragmentary Sentences. The
function of conjunctions like and,
but, and so -- the only function of such
conjunctions, it is claimed -- is the joining of phrases of like type,
so that a sentence that begins with one of these words is missing the
clause that is to be joined with the clause that follows the
conjunction, and that sentence is therefore only fragmentary.
(Yes, I know, the clause is right there in the previous sentence, but
we're supposed to be looking only at single sentences here.) If
you take all the beliefs and claims above literally, you are led to the
conclusion that NIC
is not only true, but necessarily true.

But few advice manuals are willing to go all the way with this
"theoretical" argumentation. For example, Diana Hacker's A Pocket Style Manual, 4th ed.
(Boston: Bedford/St. Martin's, 2004), p. 48, tells the student, "As a
rule, do not treat a piece of a sentence as if it were a sentence" and
goes on to classify fragments into two types: "Some fragments are
clauses that contain a subject and a verb but begin with a
subordinating word. Others are phrases that lack a subject, a
verb, or both."

Notice: "a SUBordinating word". Hacker's rule
applies only to subordinate clauses. And, indeed, in the text
that follows there's a list of sample words that begin subordinate
clauses. And, but, and so are not on this list; by
inference, they are allowed in sentence-initial position.

It's likely that the main justification for NIC comes instead from
well-intentioned
attempts to improve student writing and speaking. Initial and is the first sentence
connective acquired by most English-speaking children, and they use it
heavily in their speech; of course they do, since for a while it's all
they've got for indicating connection between sentences. Heavy
use of sentence-initial and
and (logical/temporal) so
continues through childhood and into adulthood, in both speaking and
writing, with then and and then as additional variants in
narratives. Observe the discourse organization of the Coasters'
rousing "Along Came Jones", from about 45 years ago:

And then he grabbed her (And then)
He tied her up (And then)
He turned on the bandsaw (And then, and then...!)

And then along came Jones
Tall thin Jones
Slow walkin' Jones
Slow talkin' Jones
Along came long, lean, lanky Jones

Teachers quite rightly view this system of sentence connection as
insufficiently elaborated, and they seek ways of getting students to
produce connectives that have more content than vague association or
sequence in time. At some point, I speculate, they applied ZT-1,
"If they do it too much, they should be told not to do it at all", and
NIC, a blanket proscription, was born. Probably in elementary
schools, from which it would have diffused to secondary schools and
beyond. And now the zombie lurches on, possibly inside your own
computer; it's inside mine, thanks to Microsoft Word for Mac OS X.

Once NIC is out there, it will persist. Any fool with a claim to
authority and either students or a publisher can get a rule ON
the books, but there is absolutely no mechanism for getting rules OFF.
People think that rules are important, and they are reluctant to
abandon things they were taught as children, especially when those
teachings were framed as matters of right and wrong. They
will pass those teachings on. They will interpret denials
of the validity of such rules -- even denials coming from people like
Garner and Fiske, who are not at all shy about slinging rules around --
as threats to the moral order and will tend to reject them. I've
had some success convincing some students and friends that some of the
rules they were taught are not good rules to live by -- but my success
depends on their willingness to listen to me and their willingness to
question their beliefs, two qualities that are not widespread in the
general population.

So our little play goes: ZT-1 contributes significantly to the rise of
NIC and then Garner's Rule, though these originally have different
audiences. Eventually, the two proscriptions clash, and, in my
telling of the story, NIC is mortally wounded, but continues to wander
the landscape as a zombie. Garner's Rule survives, in a community
of like-minded souls pugnaciously defending themselves against the
opinions of linguists and the practices of many of the neighbors.
Nothing is ever resolved.

How to defend yourself from bad advice about writing

Okay, I admit it. I am a shameless user of passive tense. I've been involved with a power-struggle with one of my writing profs on campus (creative nonfiction class) about the tense, and I think it's finally time for me to concede. However, she seems to think that I should just inherently know other ways to word things. And, of course, there's the issue that I don't think I quite understand passive tense, because the things she's been marking as "wrong," are not passive tense as I was taught. I guess I tend to say things like "I was doing this," "there were these things," etc.

A specific sentence I've been playing with recently:

"Thomas was relieved when the car finally pulled onto the highway."

So, any thoughts would be awesome.

Well, Elrina, here are a few thoughts.

First, there's nothing wrong with "Thomas was relieved when the car finally pulled onto the highway". At least, there's no grammatical or stylistic reason for you to reword it.

Second, that sentence is probably not really an example of the passive voice, unless you mean that Thomas was relieved in the sense that his replacement arrived for duty. If you mean that Thomas was relieved in the sense that he felt a lessening of anxiety, then the construction is an example of what the Cambridge Grammar of the English Language calls an "adjectival passive". CGEL observes that "adjectival passives are passive only in a derivative sense". (More on this in a later post, or go read pp. 1436-1440 of CGEL if you're curious and impatient. A clue: you can say "Thomas was very relieved when the car finally pulled onto the highway", but not "Thomas was very run over when the car finally pulled onto the highway".)

Third, your other two examples -- "I was doing this" and "there were these things" -- are definitely not passives in any sense at all. If your writing prof is really telling you that things like this are wrong because they're in the passive voice, then she's certainly ignorant and probably incompetent.

Fourth, there's nothing intrinsically wrong with using the passive voice. All the best writers do it, some of the time. See the list of posts at the bottom of the page for some examples -- if you're eager, go here to find some examples of passives in the writing of E.B. White, who was the White in Strunk & White, or here, to examine the passive practices of Professor Strunk himself.

Fifth, and least important, the traditional terminology is "passive voice", not "passive tense". The term tense deals with ways of expressing concepts of time, like present and past; the term voice deals with ways of arranging the arguments of a verb, as in "The ninja explained the concept of passive to the writing teacher" (which is an example of active voice), vs. "The concept of passive was explained to the writing teacher by the ninja" (which is an example of passive voice).

There's also aspect, which deals with ways of expressing action (or being) in respect of its inception, duration, or completion. Putting voice, tense and aspect together, we can create a little paradigm of some variations on one of your examples.

Speaking of ninjas and writing teachers, the ninja in this cartoon by Nic Bommarito seems to know her grammar:

Of course, we here at Language Log don't recommend or even condone the murder of English professors, though we do feel that some of them ought to sit in on a linguistics course or two, or maybe read a good student grammar. And if you want to be able to stand up to them, Elrina, you might invest in such a grammar yourself, and perhaps in a good usage guide while you're at it. What's in those books might even help your writing, but in any case it'll help you keep your writing teachers from wasting your time.

Following Elrina's question, the NaNoWriMo forum has three pages of interesting answers. These make it clear that most people believe that "passive" has something to do with whether or not the subject is an agent, and perhaps also something to do with overall dynamism, vividness or concreteness. For example, "Corvus" defends the use of passive in this way:

While I do not advocate a sudden embracing of the passive voice, I do advocate a less strident opposition to it. It's not always the wrong voice. For example:

"The horrific thought occured to me that I was on the wrong train, headed for Paris instead of Berlin."

I challenge the reader to reconstruct this idea in active voice and maintain the flavor.

That challenge is hard to meet, since the sentence is already in the active voice. The issue for Corvus seems to be that the subject of the main clause is "thought" rather than "I".

Thomas felt relieved when the car finally pulled onto the highway.
Thomas gave a sigh of relief when the car finally pulled onto the highway.

and "paintbyletters" responds that

The third is definitely the most active, because Thomas is acting. I, as the reader, sigh in relief right along with him. OTOH, to tell me that Thomas felt relief or was relieved distances me from Thomas. I don't care quite as much, and overuse of passive verbs will have a cumulative effect on your reader's interest in your characters.

I don't care at all, myself. I've given up, for the moment, on wishing that people would use grammatical terminology in a coherent way, and instead, I'm asking myself whether any of this writing advice makes any sense. Specifically, I wonder whether there's any evidence that a narrative is better if it has a higher proportion of verbs that denote actions, whose subjects are human agents.

Let's do a quick sanity check, by looking at the openings of a few successful novels, pulled (literally) off the shelf at random. Exercise for the reader: what proportion of the clauses have a verb denoting an action, with an agentive subject? Would these novels have been better if that proportion were higher?

It was August, and it shouldn't have been raining. Perhaps rain was too strong a word for the drizzle that blurred the landscape and kept my windshield wipers going. I was driving south, about halfway between Los Angeles and San Diego. [Ross Macdonald, The Far Side of the Dollar]

The Channel Club lay on a shelf of rock overlooking the sea, toward the southern end of the beach called Malibu. Above its long brown buildings, terraced gardens climbed like a richly carpeted stairway to the highway. The grounds were surrounded by a high wire fence topped with three barbed strands and masked with oleanders. [Ross Macdonald, The Barbarous Coast]

The law offices of Wellesley and Sable were over a savings bank on the main street of Santa Teresa. Their private elevator lifted you from a bare little lobby into an atmosphere of elegant simplicity. It created the impression that after years of struggle you were rising effortlessly to your natural level, one of the chosen. [Ross Macdonald, The Galton Case]

Moran's first impression of Nolen Tyner: He looked like a high risk, the kind of guy who falls asleep smoking in bed. No luggage except for a six-pack of beer on the counter and the Miami Herald folded under his arm. [Elmore Leonard, Cat Chaser]

A friend of Ryan's said to him one time, "Yeah, but at least you don't have to take any shit from anybody."
Ryan said to his friend, "I don't know, the way things've been going, maybe it's about time I started taking some." [Elmore Leonard, Unknown Man #89]

The marriage wasn't going well and I decided to leave my husband. I went to the bank to get cash for the trip. This was on a Wednesday, a rainy afternoon in March. The streets were nearly empty and the bank had just a few customers, none of them familiar to me. [Anne Tyler, Earthly Possessions]

He -- for there could be no doubt about his sex, though the fashion of the time did something to disguise it -- was in the act of slicing at the head of a Moor which swung from the rafters. It was the colour of an old football, and more or less the shape of one, save for the sunken cheeks and a strand or two of coarse, dry hair, like the hair of a cocoanut. [Virginia Woolf, Orlando]

I'll make my report as if I told a story, for I was taught as a child on my homeworld that Truth is a matter of the imagination. The soundest fact may fail or prevail in the style of its telling: like that singular organic jewel of our seas, which grows brighter as one woman wears it and, worn by another, dulls and goes to dust. Facts are no more solid, coherent, round, and real than pearls are. But both are sensitive. [Ursula K. Le Guin, The Left Hand of Darkness]

Early this morning, 1 January 2021, three minutes after midnight, the last human being to be born on earth was killed in a pub brawl in a suburb of Buenos Aires, aged twenty-five years, two months and twelve days. If the first reports are to be believed, Joseph Ricardo died as he had lived. [P.D. James, The Children of Men]

My quick count says that out of 39 tensed verbs, 7 (or about 18%) denote actions and have an agentive subject. Two of the seven are "said" -- hardly the most dynamic action around -- and if we discount those, we're down to 13%.
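
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The three counts come from the paragraph above; the variable names are purely illustrative, and the second figure assumes that the two instances of "said" are dropped from the numerator only, with the total left at 39, which is the reading that reproduces the 13% quoted.

```python
# Counts taken from the tally above; the variable names are illustrative only.
tensed_verbs = 39       # all tensed verbs in the sampled openings
agentive_actions = 7    # verbs denoting actions with an agentive subject
said_count = 2          # the two instances of "said" among those seven

# Share of tensed verbs that denote actions with agentive subjects.
share = agentive_actions / tensed_verbs

# Assumption: discounting "said" removes it from the numerator only,
# leaving the total number of tensed verbs unchanged.
share_without_said = (agentive_actions - said_count) / tensed_verbs

print(f"{share:.0%}")
print(f"{share_without_said:.0%}")
```

If the two instances of "said" were removed from the total as well, the second proportion would be 5 out of 37, which is slightly higher; either way the point stands.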

Do you see any places where the text would be improved by substituting some active-voice transitive verbs denoting actions, with human agentive subjects? I don't. The next time someone tells you to "avoid passive" -- apparently meaning that you should use verbs denoting actions with human agents as subjects -- why not ask them to define their terms, and to back up their advice with some evidence?