January 31, 2006

The vocabulary of toadying

A little while back, on his blog Daily
Dish, Andrew Sullivan took a quick swipe at Fred Barnes's
adulatory biography Rebel-in-Chief:
How George W. Bush Is Redefining the Conservative Movement and
Transforming America (Crown Forum, 2006), using a reference to
oral sex. Here's Sullivan
on 1/15/06:

BREMER'S BOMB-SHELL: And Fred Barnes'
fellatial biography of Bush. (He makes Powerline read like the Daily
Kos.) I try and make sense of each here. More on Fred's book soon.

The very next day, on the National
Review's blog, John Podhoretz sputtered
about "Andrew Sullivan's anti-gay invective", in a posting that seems
confused on several fronts: Podhoretz apparently thinks that words have
one single meaning in all contexts, and that gay men (like Sullivan),
who presumably have a positive attitude (once again, in all contexts)
towards performing fellatio, are being hypocritical when they
characterize toadying, negatively, by analogy to one man fellating
another. (It's not easy to unpack Podhoretz's unhinged,
shouting-in-capitals rhetoric, so my analysis here might be off the
mark.)

Later that day, James Wolcott (of Vanity
Fair), on his blog,
mocked Podhoretz's fuming but embraced the imputation of homoeroticism
in neocon gushing over GWB.

This was how things stood when I first learned about these exchanges,
from John Calendo on the entertaining gay blog Nightcharm (yes, I'll give you a
link, but that will come below the fold) on 1/19/06, in a piece titled
"A Blowjob By Any Other Name":

It's a wonder so useful a word [as fellatio] was never put in
adjectival form until last Sunday, when it was invented by conservative
gay pundit Andrew Sullivan, the man we love to hate.

He used it in a book review, to indicate the fawning, on-their-knees
way conservative blowhards write about this, the worst of all possible
presidents (save one: Nixon still wins that derby.)

Recency
Illusion alert! This was by no means the first time Sullivan
had used fellatial, and
plenty of other people have used it. In fact, fellatial is one of NINE
attested adjectives in the fellat- family;
it's not even the most frequent in Google webhits, coming in way behind
fellatory, which is the only
one to make it into the OED.

Now to go through these things systematically. First, the fellat- family of words in
English. Then, some reflections on the vocabulary of toadying,
the many uses of the sexual lexicon (or, why fellatial isn't necessarily a
homophobic slur), and the attitudes of gay men toward fellatio (or, why
it's not necessarily hypocritical for a gay man to use this word
disparagingly).

Two words of warning. First, the Nightcharm site offers, in its own
words, "gay porn, blog and naked men pictures". The blog part
brashly covers all sorts of things of interest to gay men, and
potentially to many people. But there's really no avoiding the
gay porn and the naked men, so if it makes you uncomfortable to be
close to this stuff, don't go there. If you're cool with that, or
positively disposed, the "Blowjob By Any Other Name" posting is here. The
accompanying photos are mostly naked French rugby hunks, plus a
Japanese sex doll (female) "looking very fellatial".

Second, at this point I'm going to abandon the elevated register of the
English sexual lexicon ("oral sex", "perform fellatio", "fellate"
above) in favor of the vernacular -- what I think of as "plain talk" --
because I dislike the distancing and shrinking-away effect of the more
technical vocabulary. You might well feel otherwise, but at least
I've warned you about what's to come.

Ok, on to a brief introduction to the fellat-
family. In three dictionaries I respect (and have extremely easy
access to while I'm sitting at my desk at the Stanford Humanities
Center outpost of Language Log Plaza) -- OED2, AHD4, and NOAD2 -- there
are listings for the nouns fellatio
(cocksucking, the act) and fellator
(cocksucker, the actor) and the verb fellate
(suck cock, perform the act). NOAD2 has only these three basic
items. AHD4 has the anglicized variant fellation for the act noun.
And OED2 also gives the feminine actor noun fellatrix (the variant fellatrice is attested, but not in
the OED) and the adjective fellatory,
noting that fellate is a
back-formation and that fellatory
is built on it. (There are also occurrences of fellatiate, built directly on fellatio, or possibly a blend of fellatio and fellate, as an alternative to fellate.)

(A digression. People have occasionally objected to the noun fellation to me on the ground that
the "correct" noun is the Latin fellatio
-- even though fellatio is
already anglicized in pronunciation, to rhyme with ratio, not patio or potty-o. It is true that fellatio came first, and was used
as an unassimilated Latin word in medical or other "scientific"
discussions of sex, and as a coded word in elegant pornography, where
all the racy bits were for some time in Latin or Greek. But some
time ago the word passed into ordinary language, though as part of an
elevated register; fellatio
is the word you use to talk about cocksucking in "polite" contexts that
allow this as a possible topic of discourse at all. At that
point, it's natural to anglicize it fully, like the zillions of other
English nouns in -ation that
trace back to Latin nouns with a nominative singular in -a:tio: and genitive singular in -a:tio:nis, some taken directly
from Latin, some indirectly via French. The noun fellation is, in fact, reasonably
common, though nowhere near as common as fellatio.)

(Another digression. A few people object to the verb fellate on the ground that it's a
back-formation; presumably, fellatory
would be objectionable as a result. Well, there are people who
object to any back-formation they perceive as recent, which is to say,
any back-formation they recognize as one -- but, eventually,
back-formed verbs cease to be seen as innovations, especially if
they're really useful. OED2's first citations for fellate are from 1968 and 1969 --
Updike's Couples and Legman's
Rationale of the Dirty Joke,
respectively -- but I have no doubt that some sleuthing will take the
dating back at least another decade or two, so the verb is not exactly
a recent thing. In raw Google webhits as of 1/30/06, fellate gets 74,900, which is
pretty respectable for an item from an elevated register. In any
case, if you want to talk about cocksucking in an elevated register,
it's hard to do without fellate
as the verb, since the alternatives -- perform fellatio on, perform oral sex on,
copulate orally with, etc. -- are wordy and clunky.)

Back to the rest of the fellat-
family. Attested as alternatives to the actor nouns fellator/fellatrix (fellator is sometimes used of
women, by the way, another sign that we're moving away from Latin) are fellatist and fellationist, though they're much
less frequent than fellator/fellatrix.

Here I point out that every single member of the fellat- family that I've mentioned
so far has both literal uses (referring to events in which actual dicks
are in actual mouths) and metaphorical, or figurative, uses, referring
to praising, admiring, pandering, fawning, sycophancy, obsequiousness,
and the like -- acts, relationships, and attitudes in what I'll call
"the toadying domain". Situations in the toadying domain involve
two participants, an ADULATOR and a RECIPIENT
of the adulation, and there are at least three relevant aspects of the
relationship between them: (1) REGARD: the adulator
appreciates, admires, possibly worships the recipient, regards the
recipient highly; (2) DEFERENCE: the adulator shows
deference, submission, or subservience to the recipient; and (3) EAGERNESS
TO PLEASE: the adulator is eager to please the recipient.
All three aspects can vary in degree. Some situations in the
toadying domain show a fourth component: (4) THE ICK FACTOR:
the adulator is willing to do things they find unpleasant or
humiliating in the service of the recipient. Figurative
cocksucking often has a pretty big ick factor.

Some examples:

fellatio:
Anyone who claims that artistic fellatio is not rampant in the arts in
general ... Unsurprisingly the writing is a veritable Johnny Wadd of
fellatio. (link)

fellation: Even I was getting
fed up with the non-stop fellation of Brady and Belichick by
Michaels... [in discussion of 2005 NFL playoff] (link)

fellatist: Wow. After reading
that, I once again have to wonder just what the hell Bush is thinking.
Then I remember Fox is renowned to be a consummate fellatist. (link)

fellate: There was never an
enemy of the US that Klintoon DIDN'T fellate. (link)

And as long as you continue to fellate at least some
of my favorites I'll keep coming back ... Sorry, I can't fellate
everyone's favorite band. Farewell. (link)

it seems that the art world is very insular and
artists merely metaphorically fellate one another while simultanaously
ripping off rich idiots who think ... (link)

fellatiate: Now we have a two
weeks of Packer Luv Orgy on all the networks. I can't wait to hear how
Madden will verbally fellatiate Favre this week. And Berman will go
into some sort of ecstatic wet dream on ESPN about Favre and what a
super human being he is. (link)

On to the adjectives in the fellat-
family. There are nine attested adjectives, four of them with 200
or more raw webhits on 1/25/06:
1. fellatory: 3,390 hits
2. fellatial: 853
3. fellative: 346
4. fellatic: 212
5. fellational: 23
6. fellationary: 20
6. fellationic: 20
8. fellatorial: 5
9. fellatiary: 4
(Not attested, on the web or in newsgroups: fellatoric, fellatorian, fellatoriary,
fellatistic, fellatonic, fellationistic, fellationical.)
People have certainly been creative with English morphology in order to
get an adjective related to fellatio.

The first seven adjectives are all attested in both literal and
figurative uses. The adjective fellatorial
(#8) is attested only in literal uses, fellatiary (#9) only in figurative
uses, but this is probably just a consequence of the small numbers
involved. Some metaphorical examples:

fellatory:
The interviews range all the way from obsequious to fawning to
fellatory. Two of the worst are those with Sylvia Benso and ... (link)

As for "barbaric and backward", well, that pretty
much sums up my attitude toward Europe's fellatory attitude toward
Arab-Muslim tyrants and terrorists. (link)

fellatial: WENNER TAKES ALL
... And we're no longer shocked to find that Wenner's indebtedness to
Clinton translates into fellatial coverage of the president in the
pages of Rolling Stone. And this toadying to a man who expanded the
drug war to new and invidious heights! (Andrew Sullivan's Daily Dish, 2/22/01)

THAT TONY BENN INTERVIEW: Like many former
apologists for Soviet terror, the British lefty, Anthony Wedgewood
Benn, has a soft spot for Saddam Hussein. His interview with the
monster will surely rank high up there in the annals of moral
obtuseness along with Jimmy Carter's fellatial interactions with
various mass murderers. (Andrew Sullivan's Daily Dish, 2/2/03)

In other chess news, World Champion Vladimir Kramnik
now has his own website, just in time for his title defense against
Peter Leko later this month. The site features fellatial sponsor
profiles of the "cosmopolitan" Russian champ and the "ascetic"
Hungarian challenger. (Archives
de Colby Cosh, 9/2/04)

fellative: I give you the
classic Washington mode of the fellative. self-consciously literary
interview instead. The kind Cox would and does ridicule online. (link)

The new Bob Woodward book, Plan of Attack, is out on
shelves, now, to fellative Hosanna by the Times' Pulitzer prize-winning
Michiko Kakutani... (link)

fellatic:
Now that the son of a bitch is dead, the media is, of course, back in a
full fellatic frenzy. How well they remember their beloved position, on
their knees, ... [note extended metaphor] (link)

fellational: Nowhere else in
Blogistan can we find such sensational, fellational Minaya-hyping AND
Boras-flacking posted with such impunity. (link)

fellationary: In addition to
discouraging fellationary interviews with the terrorists who raped
Russian schoolchildren, Putin may have also made a crude calculation. (link)

fellationic:
... thanks to their decades-long uncritical (nay, fellationic) regard
for any Republican whatever, regardless of his actual track record on
Second Amendment ... [about the NRA] (link)

fellatiary: ... the criminal
underuse of Chris Mortensen, and the fellatiary treatment of anything
even remotely connected to Southern California football. (link)

Note the Andrew Sullivan examples of metaphorical fellatial from 2001 and 2003, and
one non-Sullivan example. I don't know when Sullivan started
using this adjective in print (he seems to have some fondness for it),
nor do I know who used it first, and these questions don't much
interest me. All these adjectives seem quite likely to have been
created many times, by different writers; they're all possible English
words, built from the stem fellat-
or from the noun fellation,
using suffixes appropriate for material in the classical stratum of the
English vocabulary. And indeed there are numerous literal uses of
fellatial from before 2001 --
"my novice fellatial powers", "the fellatial arts", "fellatial talent",
"fellatial fanatics", "the sloppy fellatial act", "cunnilingual and
fellatial stimulation", "fellatial facial" (all from 2000) -- including
the expected Lewinsky references, as in this excerpt from a
Virginia Vitzthum piece on Salon,
9/25/98:

Monica and the president explored an
amazing span of fellatial landscape over the course of those nine
"encounters." Monica's immediate eagerness to suck presidential dick
offsets the encounters' one-sidedness and makes her seem less victim,
more vixen.

Now I'm ready to move on to the commentary on Sullivan's swipe at
Barnes. Here's John Podhoretz:

ANDREW SULLIVAN'S ANTI-GAY INVECTIVE

Andrew Sullivan calls my old friend Fred Barnes's admiring book about
President Bush "fellatial." Imagine if someone had used such a word
about an Andrew Sullivan blog item about, say, John McCain. Andrew
would have been OUTRAGED! He would have demanded an APOLOGY! Andrew,
you see, is gay. So any comparison of his rhetoric to homosexual
conduct would be UNACCEPTABLE. But Andrew, being gay, is free to use
slighting sexual references to homosexual conduct when discussing the
rhetoric and ideas of others. Why? Because, in Andrew's eyes, he is
beyond reproach solely because he shares a bed with other men. And Fred
Barnes? Married to a...(I know it's unimaginable) woman. How
contemptible of Fred. Doesn't he know marriage is only for gay people?
UPDATE: Yes, the act Andrew S. analogizes to Fred Barnes's treatment of
President Bush is not exclusively one performed by homosexuals. But
since Sullivan uses the word for a male writer's analysis of another
male, his use of the word "fellatial" therefore has an unmistakably gay
tinge.

(Note Podhoretz's modifier from the very mild edge of the toadying
domain, "admiring". The unsigned review in the 1/28/06 Economist (pp. 81-2) calls the book "gushing",
which is a bit more negative. In my introduction
to this posting I used the stronger "adulatory", taking things further into the
toadying domain. "Worshipful" would have gone a bit further
still. Sullivan goes all the way with "fellatial"; "suck-up"
would have been a bit less extreme. No doubt other writers have
characterized the book with other vocabulary choices from the toadying
domain. "Sycophantic" and "fawning" would not be bad choices from
the fairly negative region of this territory. "Boot-licking" has
a lot of ick factor going for it, and "ass-kissing", "ass-licking", and
"shit-licking" have, in turn, progressively more.)

Now if I understand Podhoretz's position here -- not at all a sure
thing -- he's saying that "fellatial" is a homophobic slur (a piece of
anti-gay invective), period. Presumably because it refers to cocksucking, and the act of
sucking cock is strongly associated with gay men and so picks up the
negative affect that attends homosexuality, especially male
homosexuality; after all, "cocksucker" is an insult, right?

Well no, not really. "Cocksucker" can be used literally, it can
be used metaphorically to mean 'toady', it can be used as an insult
directed at a gay man, it can be used as an all-purpose insult, it can
be used as a taboo-word filler noun, otherwise like "jobbie" ("I've got
to get all these cocksuckers washed and dried by 6" -- said of a pile
of dirty dishes), it can be used as an affectionate taboo-word sign of
solidarity ("Any of you cocksuckers got a beer?" -- said by one
straight guy to a bunch of his straight buddies), and probably in other
ways as well. It isn't just one thing; it's a lot of different
things, depending on context. That's the way language works.

Now, Sullivan surely meant to pour on the ick factor, but that doesn't
mean that he takes a generally negative view of sucking cock, or of
cocksuckers, as Podhoretz seems to think Sullivan's use of "fellatial"
commits him to. It would be sufficient for Sullivan to believe
that FRED BARNES would find it unpleasant or
humiliating to suck another man's dick -- and surely he would -- so
that comparing Barnes's writing about GWB to sucking GWB's dick
introduces the ick factor, suggesting that Barnes the adulator would go
even to such lengths to satisfy GWB the recipient.

But, of course, Sullivan's use of "fellatial" will be read --
correctly, I think -- as more generally disparaging, and Podhoretz
seems to take it this way. Sullivan not only shares his bed with
another man, but he undoubtedly sucks his boyfriend's cock (sucking
dick being the most ordinary of sex acts between two gay men, the meat
and potatoes of gay male sex, so to speak), with enthusiasm and
pleasure. But gay men (like Sullivan and me) don't suck cock to
show regard or deference, but because cocksucking pleases us (as well
as our partners); this is literal, not metaphorical, cocksucking.
In addition, cocksucking is not some unalloyed good thing, independent
of context. Gay men are not interested in dick, any dick, every
dick, any time or place; literal cocksucking can be accompanied by a
considerable ick factor. The idea of sucking off GWB is deeply
repellent to me, as I'm sure it is to Sullivan, and that repulsion
carries over from the literal sphere to the metaphorical one.

A tale from my sexual life... My first boyfriend found kissing
other men -- me, in particular -- enormously pleasurable, and I
reciprocated, passionately. Yet he once described an event he
found decidedly unpleasant as "like kissing Richard Nixon". (You
will see how long ago this was.) Instant ick. It wasn't kissing
men, period, that was the problem, but the details of the event.
(Sullivan could have characterized Barnes's book as lavishing kisses on
GWB, and that would have worked, but it wouldn't have been as powerful,
simply because, as people see such things, sucking cock is a much more
intimate act than kissing.)

So far: "fellatial" isn't necessarily a
homophobic slur, and
it's not necessarily hypocritical for a gay man to use this word
disparagingly. I turn now to James Wolcott's critique of
Podhoretz. Here's the bit I want to focus on:

"Gay tinge" is a rather prissy phrase
on Podhoretz's part, as if
Sullivan were trying to slip by a sly innuendo. There's no need to be
sly. I won't presume to speak for Sullivan, but it's clear that there's
a homoerotic ardor for Bush by neonconservatives that bypasses reason
and reduces them to hero-worshipping mush.

My problem here is with "homoerotic". We seem to have moved from
literal "fellatial" to figurative "fellatial" 'servile, etc.' back to a
more literal use, imputing homo-desire (though without actual
cocksucking). But this isn't really about language; it's about
relationships between people. Wolcott is connecting an adulatory
relationship to homo-desire, a connection that someone could make
regardless of what vocabulary is used to describe the adulation.
But why would anyone make that connection?

I can see two contributions towards making this connection. One
is very general in the modern world. Since Freud, we have come to
appreciate the significance of the erotic in our lives. But that
has led many people to see sexual desire in virtually every kind of
relationship between two people. For them, sex is always part of
the story. While not denying the importance of sexual feelings
(after all, I write sexually explicit memoirs of my life and
pornographic fiction and analysis of the fantasy world of gay male
desire, and I create pornographic collages), I resist the idea that
they're the mainspring of social life. There are many other,
equally important, factors that organize human relationships:
affiliation, physical contact, nurturance, power, play, mentoring,
respect, and more. These can, of course, co-occur with sexual
desire, but they need not. I respect many of my colleagues, but
(in general) I don't desire them sexually. (I feel reasonably
assured in saying this, since I'm exceptionally well in touch with my
inner sexpig.) Fred Barnes respects and admires GWB, but that's
no reason to think he has the hots for him.

The other contribution is a sense of bafflement that many of us -- I am
one -- have over the respect and admiration that some people (like Fred
Barnes) have for GWB. We wonder: how could anyone have such
regard for someone who is so transparently unworthy of it?
And so we cast about for explanations other than an appreciation of
GWB's merit. Stupidity and gullibility are two
possibilities. A desire for a strong authority figure is
another. The hope of advancement is yet another. No doubt
there are other possibilities. Meanwhile, especially if you see
sex in all relationships, desire is always available as an
explanation. So you end up discerning homoeroticism. I
think this is just silly. And annoying, because it trivializes
the enormous power of homoerotic desire, for those of us who experience
it. (Well, some gay men find that consequence attractive, since
trivializing homoerotic desire means normalizing it: look, ALL
guys desire other men, so there's nothing special about me! Spare
me.)

Fascinating as all this is, none of it's about language. So let's
return to language, with John Calendo's (tongue in, um, cheek) proposal
in Nightcharm for a
definition of the "new word" fellatial:

fellatial (fel-lay-shel) adj. 1. Of or suitable
for a blowjob. 2. Of the nature of blowjobs, servile,
fawning, with involvement of the mouth in a hoovering motion.
3. Ready to suck off those in authority, usually in exchange for
favors, prestige or political appointments. 4. The way
things work in Washington.

Ok, you knew it, I'm going to object to the claim that it is in the NATURE
of blowjobs to be servile and/or fawning. I'm not going to
lecture here on the complex and varied emotional pleasures of sucking
cock for a gay man (though I have written at some length on the topic
in the newsgroup soc.motss over the years), though I will note that for
a lot of gay men it makes a big difference whether the cock you're
sucking belongs to a gay guy or a straight guy (straight guys can be
problematic in a number of ways, including the strong possibility that
they will understand your blowjob as an act of servility, whatever you
might think about it; on the other hand, some gay men positively desire
straight cock, on the basis that straight guys are "more masculine"
than gay guys), and that in any case though serving another man (not
servility) can be one of those pleasures on some occasions, it's often
a minor component, and may be entirely absent. In fact, both in
gay porn and in real life, the man enthusiastically taking the dick may
understand the event as one in which the man providing the dick is
serving HIM, by providing a cock for him to enjoy; in
my experience, this is especially common for cocksuckers who generally
identify themselves as "tops", in two ways: they like to be in charge,
to run the show, and they fuck guys but don't get fucked
themselves. The world of sexual emotions and relations is
astonishingly rich.

Calendo's dictionary entry moves quickly from the neutral (definition
1) to the negative in tone (all that follows) and thus mirrors what has
long been a view of cocksucking -- the act -- as perverse, dirty, and
abnormal. I'm fighting that view by talking about it in positive
and joyous ways. Meanwhile, young Americans seem to be
increasingly configuring it as routine and not perverse, in fact not
really sex at all. The January/February 2006 issue of the Atlantic Monthly has a review (pp.
167-82), by Caitlin Flanagan, of one nonfiction book, two young-adult
novels, and a television show, all treating adolescent sex.
Flanagan notes "the genuine and perplexing rise of oral sex among
teenagers--specifically of oral sex performed by young girls on boys"
(p. 173). Their parents are horrified, of course.

Once again, we've moved from words to acts, and there's not a lot of
work for a linguist to do, qua linguist. As a final reward,
though, here's the delightful AHD4 account of the history of the word toady:

The modern sense... has to do with the
practice of certain quacks or charlatans who claimed they could draw
out poisons. Toads were thought to be poisonous, so these
charlatans would have an attendant eat or pretend to eat a toad and
then claim to extract the poison from the attendant. Since eating
a toad is an unpleasant job, these attendants came to epitomize the
type of person who would do anything for a superior, and toadeater (first recorded 1629)
became the name for a flattering, fawning parasite. Toadeater and the verb derived from
it, toadeat, influenced the
sense of the noun and verb toad
and the noun toady, so that
both nouns could mean "sycophant" and the verb toady could mean "to act like a
toady to someone."

Written communication plays an integral role in modern social, economic, and cultural life. Writing facilitates the transfer and preservation of information and ideas. However, without direct access to facial expression, body language, and voice inflection, the potential for misunderstanding written communications is considerable.

The application seems to focus on the means of entering the emoticons, rather than the emoticons themselves:

A method and system for generating a displayable icon or emoticon form that indicates the mood or emotion of a user of the mobile station. A user of a device, such as a mobile phone, is provided with a dedicated key or shared dedicated key option that the user may select to insert an emoticon onto a display or other medium. The selection of the key or shared dedicated key may result in the insertion of the emoticon, or may also result in the display of a collection of emoticons that the user may then select from using, for example, a key mapping or navigation technique.

Still, this is my nomination for the lamest patent application ever. Have these people no sense of decency?

The article in The Register links to

"... the most delightful commentary ever written about the practice of using emoticons, Geoffrey Nunberg's celebrated radio piece A Wink is as Good as a Nod - where Nunberg imagines the literary greats employing the technique."

[Update #2: Several readers have written to emphasize the point that I made above, though perhaps not strongly enough, that Cingular is not seeking a patent on emoticons as such -- contrary to the implication of the quote from The Register -- merely a patent on any method for entering emoticons via special keys, key sequences or menus. The lameness of the application, in my opinion, is due to the unprecedentedly high "duh factor" of this "invention", not because emoticons have been around for a quarter of a century or more. I expect that there is some prior art for keyboard shortcuts and the like for emoticon entry -- but even if there weren't, no one should be able to patent something as obvious as this. It's like asking for a patent on the idea of putting portable mp3 players in a shopping bag so that purchasers can carry them home more easily.

In Cingular's defense, they may feel driven to such silly gestures by the extraordinary pliability of the USPTO in the hands of "patent trolls", such as Acacia Media Technologies. That is, Cingular's aim may not be to derive revenue from licensing their simple-minded "invention" to others, but rather to protect themselves against having to pay ransom to the likes of Acacia for the right to do business in the obvious way.]

Jeopardy! strikes the wrong tone

The game show Jeopardy! has something of a mixed record when it comes to language-related clues. Last night's installment had a whole category on "Language," which was rather unremarkable except for the $400 clue:

In the Kootenai tongue, a word's pitch changes its meaning, as in this most widely spoken world language.

This clue follows a typical formula for Jeopardy!, where the wording may refer to something quite obscure even though the correct response is nice and obvious. In this case the requisite obscurity is Kootenai, also known as Kootenay, Kutenai, or Ktunaxa, a nearly extinct language isolate still in use by only a handful of speakers in southeastern British Columbia, northern Idaho, and northwestern Montana. I checked online and could find no references to Kootenai being a tonal language, and fellow Language Logger Sally Thomason verified that Kootenai is indeed not tonal.

So what's the deal? Did the usually meticulous clue-writers have some other tonal language in mind? My best guess is that they confused Kootenai with one of the many Athabaskan languages that are tonal. The most likely candidate for confusion is Gwich'in, also known as Kutchin or Kootchin. I can imagine the show's researchers consulting a list of North American languages, with Kutenai/Kootenai listed right after Kutchin/Kootchin. Of course, it didn't matter what exotic-sounding language was mentioned, as long as it served to spice up an otherwise uninteresting clue about Mandarin tonality.

January 29, 2006

Dubious quotation marks

Gratifying though it is to see myself quoted in print, I'm peeved to
see myself represented as using quotation marks for emphasis.
Like, 'for emphasis', meaning for
emphasis. But that's what happens in Leslie Savan's Slam Dunks and No-Brainers, chapter
2 ("Pop Talk is History"), in the section (pp. 33-4) titled "Who needs
Esperanto when you've got Coca-Cola?" I'm not entirely sure how this happened. More interestingly, this case illustrates an issue in
the mention (rather than use) of linguistic material, including
quotation: faithfulness vs. well-formedness (shades of OT!).

Here's the whole quotation (with two bits boldfaced that were not
boldfaced in the original):

Coca-Cola is so ubiquitous that it's not always considered
American. The Stanford University linguist Arnold Zwicky recalled
how, about thirty years ago, his wife, Ann Daingerfield Zwicky, "was
teaching an ESL [English as a Second Language] class at Ohio State and
used one of her ice-breaker topics: words borrowed into English from
your native language. Alas, this time the first word offered was
'Coca-Cola,' by (I believe) a speaker of Hindi. An Arabic speaker
... denied this with scorn; 'everybody'
knows 'Coca-Cola' is an Arabic word. Pandemonium ensued.
Even a female student from Japan, normally silent in class, was moved
to dispute the others' absurd claims. The only thing they were
agreed on was that the idea that the headquarters of the Coca-Cola
Company could possibly be in 'Atlanta'--or
anywhere in the U.S.--was preposterous (or evidence that America just
grabbed everything away from the rest of the world)."
[p. 278: from his post to the ADS and an e-mail
interview, April 2000.]

Now, there are several ways in which this version differs from what I
originally wrote. Point 1: I used double quotes on "Coca-Cola";
these have now been replaced by single quotes, because they're inside a
quotation from me that Savan has enclosed in double quotes. Point
2: Most of my rapid writing is all lowercase, but this has been altered
to conventional capitalization. Point 3: I originally typed
... was "coca-cola", by ...
with quot-punc order, and this has been altered to
... was 'Coca-Cola,' by ...
with punc-quot order. In all of these cases, Savan (or, more
likely, her editors) opted against faithfully reproducing what I wrote,
in favor of conforming to a style sheet different from the one I
prefer. Well-formedness trumps faithfulness.
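The three changes are mechanical enough to sketch as a tiny style-sheet filter. This is only an illustration of the kind of normalization involved, not a claim about how Savan or her editors actually worked; the function name `house_style` and the one-word capitalization rule are hypothetical.

```python
import re

def house_style(text):
    """Apply the three well-formedness changes to a quoted fragment.

    Hypothetical sketch: assumes the fragment will sit inside an outer
    pair of double quotes, and handles only the one brand name at issue.
    """
    # Point 1: double quotes nested inside a double-quoted passage
    # become single quotes.
    text = text.replace('"', "'")
    # Point 2: restore conventional capitalization (a one-entry lookup,
    # purely for illustration).
    text = re.sub(r"coca-cola", "Coca-Cola", text, flags=re.IGNORECASE)
    # Point 3: quot-punc order becomes punc-quot order, so a comma or
    # period moves inside the closing quotation mark.
    text = re.sub(r"'([,.])", r"\1'", text)
    return text

print(house_style('... was "coca-cola", by ...'))
# → ... was 'Coca-Cola,' by ...
```

Every step discards information about the original: nothing in the output records which quote marks, which capitalization, or which punctuation order the writer actually used.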

The boldfaced words were originally typed inside asterisks, to indicate
emphasis in text that sticks to ASCII characters:
... *everybody* knows "coca-cola" is ...
... could possibly be in *atlanta* -- or anywhere ...
The equivalent in handwriting would be underlining; in print, usually
italics, or possibly boldface or small caps, depending on your style
sheet. But not any kind of quotation marks (single or double,
smart or plain). Emphatic quotation marks are usually mocked as
an illiteratism; but in any case, they aren't standard. Yet I
have been represented as using them. I feel sullied, and frankly,
I'm puzzled as to how this happened; either Savan, or someone at Knopf,
apparently thinks this is an ok way to indicate emphasis.

The larger point -- the conflict between faithfulness and
well-formedness in linguistic mention -- is a gigantic one. I
originally started a Language Log posting on the topic back during the
discussion of taboo words in titles of books and movies, but it quickly
bloated up horribly. But for fun, here's an unsubtle example
(there are subtle and complex ones) that also provides a little
homework assignment for the more enthusiastic readers:

Of the two major political parties in Britain, one is known there as
the Labour Party; the -our
spelling is the British one. Here in the U.S., the party is
referred to in print (in political stories in the New York Times, for instance) as
the Labor Party, with the American -or
spelling. Once again, well-formedness, in this case conformity to
the local spelling conventions, trumps faithfulness: references to the
party are re-spelled.

Now, the homework assignment, in two parts: to find American writing
with Labor within a quotation
(referring to the political party, of course) that is then itself
quoted in a British publication, like the Economist or the Guardian; and to find British
writing with Labour within a
quotation that is then itself quoted in an American publication, like
the New Yorker or the New York Times. Are
references within quotations re-spelled? (The earlier examples
were outside of quotations, in the main text.)

For extra credit, look for occurrences of British Labour in book titles that are then
cited in American footnotes or bibliographies, and the reverse:
occurrences of American Labor
in book titles that are then cited in British footnotes or
bibliographies.

General observation: well-formedness tends to trump faithfulness, but
not always.

Language Log final exam

For those of you who are taking Language Log for credit in the fall
semester of 2005 and want to make progress toward your Diploma, you've
had your January reading period (we are on the Harvard-style semester
system, as you know), and it is now time for your final exam. No
cheating; independent work only; essay-style answers.
We will be judging
you on neatness, originality, coherence, clarity, knowledge of elementary
linguistic terminology and conceptual distinctions, and of course creative
ranting. Your exam follows below. Submit the answers to your favorite Language Log contributor as usual, enclosing a stamped addressed
envelope and a bottle of single malt scotch whisky.

LANGUAGE LOG FINAL EXAM, FALL SEMESTER 2005

Read this piece from The New York Times Real Estate section and answer
the questions that follow below.

The yearning for a smooth transition from the surging [real estate]
market is seen in the increasingly frequent use in the last six months
of the phrase "soft landing."

"Soft landing is everyone's big hope," said Paul JJ Payack, president
of the Global Language Monitor (languagemonitor.com), which analyzes
language trends and their impact on politics, culture and business.

Mr. Payack, who graduated from Harvard with a bachelor's degree in
comparative literature, calculated the popularity of some 36 buzzwords
chosen by a reporter. He used his Predictive Quantities Indicator, or
P.Q.I., an algorithm that tracks words and phrases in the media and on
the Internet in relation to frequency, contextual usage and appearance
in global media. It is a weighted index that takes into account
year-to-year increases and acceleration in the last several months.

Among the market buzzwords he ranked, "soft landing" and "pause" had
the highest P.Q.I.'s. They were ranked first and second respectively,
while the more ominous sounding "housing bubble" ranked seventh. "
'Pause' is another one of these hopeful things," Mr. Payack said.

(Mr. Payack can also verify that "O.K." is the most frequently spoken
word, that "outside the mainstream" was the top phrase of 2005 and
that as of Jan. 26 at 10:59 a.m. Eastern time, the number of words in
the English language was 986,120.)

Discuss some reason why the text frequency of buzzwords might
not tell us much about anything economic.

Give the definition of the word "algorithm", and consider what
might be meant by saying someone has an algorithm "that tracks words
and phrases in the media" and so on. Compare with other gee-whiz
locutions about computing, like "We ran him through the computer" in
police procedural shows on TV and the like.

Construct three or four interpretations of what Mr Payack might
have meant by saying "'Pause' is another one of these hopeful things",
and discuss them critically.

Explain in detail why it is patently stupid to try and say
exactly, down to a single word, how many words the English language has at a given instant, and why one would have to be a
moron to think that a figure of about a million was right anyway.

Rant a little about how silly this all is and how journalists
really need to develop a bit more skepticism and a bit more
knowledge about language than the typical 9-year-old has, and
so on and so forth.

The cran-morphing of -dango

On his blog Evolving
English II, Mike Pope (aka "WordzGuy") reflects on the name of a
new job-searching website for the Pacific Northwest: Jobdango, evidently
inspired by the movie ticketing service Fandango
(rather than the Spanish dance):

They've broken off -dango and used it
to mean, I am guessing,
something
like "place where you get something":

Fandango = place where you get movie tickets
Jobdango = place where you get jobs

The flaw in this theory is that Fan-
doesn't map cleanly to "movie
tickets." But who says that the -logy
part of etymology has to mean
"logic"? Not I.

Pope has previously blogged about similar breakaway segments, such
as -zilla
and -palooza.
This sort of reanalysis relies on what linguists sometimes call "cranberry
morphemes." The segment cran-
in cranberry is opaque,
though it looks like it's a modifier for the transparent morpheme -berry. Indeed, cranberry was only ever fully
transparent in the Low German dialects from which the term was
borrowed, where it was kraanbere
or 'crane-berry.' Since English underwent the Great Vowel Shift, the
semantic connection between the cognate forms cran- and crane has been lost. But the
opacity of cran- has allowed for
a reanalysis of the morpheme to "stand for" cranberry in new compounds like cran-grape and cran-raspberry. Such cran-morphing
has yielded many productive suffixes in the 20th century: -burger, -(o)holic, -(o)rama, -(a)thon, -(o)mat, -(o)nomics, -gate, etc. (In the case of -burger, the new morpheme quickly
became lexicalized as the standalone burger.)

Though Pope was unable to find many other examples of the
cran-morphing
of -dango (besides an
airbrush template with a flame pattern named "flame-dango"), he missed
one obvious predecessor: fundango.
Among the tens of thousands of Googlehits for "fundango" or "fun-dango"
are a company promoting online activities for
children, a circus
festival, and a juggling
convention. All three of these operations are based in the United
Kingdom, but fundango also
has a long history in American English as a fanciful name for a fun
activity. The earliest examples I've found in digital newspaper
databases date to 1961. On Feb. 26 of that year, the Los Angeles Times
ran a photo feature in its TV section with the headline "Fun-Dango,"
about a flamenco act appearing on NBC's "Galaxy of Music." That
obviously is
still connected to the terpsichorean sense of fandango, but the same can't be
said for this citation, from an article about Missouri osteopaths
attending the annual convention of the
American College of Osteopathic Surgeons in Denver, Colorado:

(Who says osteopaths don't know how to have fun?) A couple of years
later, Chicago Tribune entertainment writer Herb Lyon began using fundango, as in his "Tower Ticker"
column of Apr 29, 1963: "Bob Hope will take up most of Johnny Carson's
Thursday night TV fundango." These examples rely on the preexisting
sense of fandango as a bit of
tomfoolery, popular in American English since the 19th century (along
with the apparently related form fandangle).

The switch from fandango
to fundango only
requires changing one
vowel, but it may have opened the door to the reanalysis of -dango as a bound morpheme. (As Pope
suggests, the use of the name "Fandango"
as a ticketing source for movie fans
also contributes to the reanalysis, particularly as an analogical
basis for a service like Jobdango.) A similar phenomenon occurred with
the cran-morphing of -tastic,
which began with the easy shift from fantastic
to funtastic. I've found
examples of funtastic all the
way back to 1939 in Jimmie Fidler's
syndicated column, "Fidler in Hollywood," as in these three citations:

By the 1960s, X-tastic had
become a
productive formation for US advertisers. A quick scan of
the newspaper databases turns up shoe-tastic
(1966), carpet-tastic (1966),
fang-tastic (1968), shag-tastic (1969), swim-tastic (1970), and so forth.
(During the NFL players' strike of 1987, David Letterman had a Top Ten
list called "Top 10 Slogans of the Scab NFL" — the number one slogan
was "It's scab-tastic!") Since the '90s, the suffix -tacular
has begun to rival -tastic,
though they often attach to the same root (craptastic and craptacular each return hundreds of thousands of Googlehits).

One quick measure of the relative success of cran-morphemes in
contemporary online discourse is to see how often they get attached to
that promiscuous blend component, blog-
(as in blogosphere, blogorrhea, blognoscenti, etc.). Google
currently finds about 93,700 pages with blogtastic, 39,200 with blogalicious, 15,800 with blogeriffic, 984 with blogapalooza, but only 89 with blogdango. And most of those 89
don't count, since they refer to the Japanese website Blog Dango, which I'm assuming
is a blog about rice dumplings.
Filtering those out leaves just three examples: one about bloggers
"getting ready to trip the light blogdango" (playing on Procol Harum's reanalysis
of the old expression trip the light
fantastic), one
suggesting "Blogdango" as a blog-related project for Kevin Costner
(playing on his 1985 movie "Fandango"), and one
joking about calling "1-800-BLOGDANGO" (playing on the Fandango movie
ticket service). So clearly -dango
has a long way to go to catch up with its crantacular morphoriffic
colleagues.
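For the numerically inclined, the comparison can be tabulated in a few lines. The counts are the Google figures quoted above; the figure of three genuine blogdango hits is the result of the hand-filtering just described, and the variable names are only for illustration.

```python
# Approximate Google page counts for blog- plus cran-morpheme, as quoted above.
counts = {
    "blogtastic": 93700,
    "blogalicious": 39200,
    "blogeriffic": 15800,
    "blogapalooza": 984,
    "blogdango": 89,
}

# Only three of the 89 blogdango hits survive once the pages about the
# Japanese site Blog Dango are filtered out by hand.
genuine_blogdango = 3
adjusted = dict(counts, blogdango=genuine_blogdango)

# Rank from most to least frequent and show each count relative to the leader.
ranked = sorted(adjusted.items(), key=lambda kv: kv[1], reverse=True)
top = ranked[0][1]
for word, n in ranked:
    print(f"{word:<14}{n:>7}  {n / top:.5f}")
```

On these figures blogtastic outnumbers genuine blogdango by more than thirty thousand to one.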

[Update: Grant Barrett suggests that the reanalysis of fandango as fan + dango has been encouraged by the Fandango ticket service ever since they began screening promotional ads in Sony/Loews theatres featuring paper-bag puppets. In one early promo, one of the puppets says, "I know 'fan' means for the fans, but I don't know 'dango'...what that means." Grant speculates that this widely viewed advertisement "might be a reinforcer of any cranalytical neologizing of 'fandango' now going on."]

Rare in the same sentence

What I immediately noticed about Ben Zimmer's
post about hiphop research was that the
claim that "It's rare to use the words ‘hip hop’ and ‘serious
academic research’ in the same sentence" is
one more case of someone writing for the press (this time a
publicist rather than a journalist) deliberately and quite pointlessly
disguising a topic of discussion as linguistic when it isn't.
Journalists are drawn to irrelevant but checkable corpus-content statements
like moths to a flame. They'll write that some word is
always
accompanied by the qualifier such-and-such, or that some other word is
invariably
followed by the phrase thus-and-so, and they'll never try to substantiate these claims, which are in any
case nothing to do with what they want to discuss.

Sometimes the linguistic claims are screamingly, massively false.
But I'm not really interested in the question of the truth of the present rarity claim. It took about ten seconds to find the sentence
"I would also like to express my appreciation for Brother [Cornel] West,
who, with the recording of his hip-hop albums, showed me that, even in
the midst of academia, non-scholastic pursuits don't have to be put on
hold for serious academic research" at
Generally
Awesome, but that's not the interesting thing.
The interesting thing is that although it contains
the two phrases in question,
it very clearly implies by what it says that hiphop does
not have anything to do with academic research. The mere
presence of the two phrases has nothing to do with how common it is
to find academics studying hiphop.
Why would
it ever seem sensible to a journalist or publicist to take a claim like
"Serious academic research on hiphop is extremely rare" (ravingly false,
but that's not my point) and transmute it into a claim with
different subject matter, about frequency of co-occurrence
of phrases, which the journalist or publicist in question knows nothing
about and has no interest in, but we at Language Log can so
easily fact-check? It's very odd.
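Just how easy the fact-checking is can be seen from a rough sketch of a same-sentence test. The function names and the normalization (lowercasing, treating hyphens as spaces) are assumptions made for this illustration; the test string is an excerpt of the sentence quoted above.

```python
import re

def normalize(text):
    """Lowercase and treat hyphens as spaces, so 'hip-hop' matches 'hip hop'."""
    return re.sub(r"\s+", " ", text.lower().replace("-", " ")).strip()

def cooccur(sentence, phrases):
    """True if every phrase appears somewhere in the sentence."""
    s = normalize(sentence)
    return all(normalize(p) in s for p in phrases)

# Excerpt of the sentence found at Generally Awesome.
excerpt = ("who, with the recording of his hip-hop albums, showed me that, "
           "even in the midst of academia, non-scholastic pursuits don't have "
           "to be put on hold for serious academic research")

print(cooccur(excerpt, ["hip hop", "serious academic research"]))
# → True
```

The test comes back true for a sentence that treats hiphop as the opposite of academic research, which is exactly why a co-occurrence count tells us nothing about how much research on hiphop exists.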

January 28, 2006

The dissing of hiphop linguistics

On the anthroblog Savage
Minds, Kerim Friedman takes note of a recent press release
from the University of Calgary under the title "Hip hop and
linguistics: you ain't heard no research like it":

It's rare to use the words 'hip hop' and
'serious academic research' in
the same sentence, but a University of Calgary linguistics professor
has relied on rap music as source material for a study of African
American vernacular English.

Dr. Darin Howe recently contributed a book chapter that focuses on how
black Americans use the negative in informal speech, citing examples
from hip hop artists such as Phonte, Jay Z and Method Man. Howe is
believed to be the only academic in Canada and one of the few in the
world to take a scholarly look at the language of hip hop.

As Friedman remarks, a little basic fact-checking would have helped
here. There's been plenty of serious academic research on hiphop,
including linguistic research, for quite some time now. Friedman
quickly Googled up a bibliography
of hiphop scholarship compiled by John Ranck of Simmons College, to
which I'd add the even more extensive bibliography maintained at the Hiphop
Archive website.

Linguistic research on rap lyrics and hiphop culture more broadly is
certainly nothing new. The founder of the Hiphop Archive, Stanford
communications professor Marcyliena
Morgan, has been writing about hiphop from a sociolinguistic
perspective since at least 1993 (in a paper presented at the American
Anthropological Association annual meeting, "Hip Hop Hooray!: The
Linguistic Production of Identity"). Geneva Smitherman
of Michigan State, author of Talkin
and Testifyin: The Language of Black America (1977) and Black Talk: Words and Phrases from the
Hood to the Amen Corner
(1994) also has an extensive list of hiphop-related publications.
Dissertations on the language of hiphop have been produced since at
least 1997 (Jon
Abdullah
Yasin's thesis at Columbia University, "In yo
face! Rappin' beats comin' at you: A study of how language is mapped
onto musical beats in rap music"). Newer scholars include Samy Alim and Cecilia
Cutler, both of whom were involved in the recent PBS documentary "Do You Speak American?" (Alim on "Hip
Hop Nation Language" and Cutler
on the "crossing" performed by white suburban teenagers using hiphop
talk).

As for whether Howe is the "only academic in Canada" studying hiphop
language, I sincerely doubt that. He's certainly not the first. Though no longer affiliated with a
Canadian university, Awad
Ibrahim wrote his 1998 dissertation at the University of Toronto on
language-learning by Francophone African youths at a Toronto high
school. He found that a crucial aspect of their learning process was
the acquisition of Black English through hiphop, which assisted them in
"becoming black." (See also Ibrahim's contribution to Black Linguistics:
Language, Society and Politics in Africa and the Americas, based on
his doctoral work.)

The parts of speech

Luckily,
the late Peter Ladefoged had a really good sense of humor, and as he sits
in the faculty club at the University of Heaven and reads the AP story
about his death, I'm sure the club booms with his rich
laughter. The first sentence
of what the Associated Press put on the wires (which appears in, for
example, the San Jose Mercury News today) says:

LOS ANGELES - Peter Ladefoged, a UCLA linguistics professor emeritus
who made it his life's work to record the parts of speech used in
human languages, has died.

But (although it is easy to see how this howler might arise)
the parts of speech are not in fact the parts of your body
that you use for speech. More journalistic ignorance of even
the most absolutely basic notions of linguistic science. Sigh.

"Parts of speech" is an old-fashioned name for lexical categories
— classes to which words with similar grammatical properties belong,
e.g., noun, verb, adjective, adverb, preposition. Categories of this
sort are in fact the key element that lifts grammatical description up
to a level of abstraction where you are not talking about speech,
you are talking about higher-level units to which various grammatically
equivalent small stretches of speech can be treated as belonging. The
job of a phonetician is to describe in minute detail the speech sounds
themselves; so classifying into parts of speech (lexical categories) is
exactly what the phonetician must never do, in his capacity as
phonetician. And if you will pardon my being a little irritable (I'm
sorry, but a friend of mine recently died), I do think we have a right
to expect better from the AP than this. What they have done is like
writing up Einstein's demise as the passing of a chemist. It is rank
ignorance. Journalists just don't know anything about the language
sciences, but instead of asking they just write nonsense. Peter deserved
better than to have his passing commemorated with an embarrassing goof
that any of the students that he taught
in UCLA's excellent Department of
Linguistics could have fact-checked.

January 27, 2006

Peter Ladefoged

I'm confident that I speak for the entire profession when I say that we are all deeply saddened by Peter Ladefoged's passing this week at the age of 80. Here are some links to find out more about this extraordinary phonetician.

The late Peter Ladefoged

I'm just not ready to write an obituary for
Peter Ladefoged,
whose death I just learned of today. But I will just say a word here about my
own grief at the death of this man, a good friend and University of California colleague, who at his death was the most distinguished and
important phonetician in the world and the active holder (at the age of 80)
of one of the largest NSF grants ever given for pure linguistics research. He was
loved by everyone who knew him. His works
dominated the field — I have taught phonetics out of Ladefoged
texts since 1982, and I treasure my copy of The Sounds of the World's
Languages. We plan to simply steal the title of the latter book as
the name for a new freshman course at UC Santa Cruz in 2006-2007, and I
know Peter would have been very pleased. He was a fine raconteur, a
tireless investigator of languages, a pioneer in archiving and digital teaching aids, an original thinker, a pillar of the International Phonetic Association, a true gentleman, a wonderful human being. And he
had this deep, dark, rich British voice, part James Earl Jones, part Christopher Lee. It is a very sad thought that never again, when I call the UCLA Phonetics Laboratory that he founded, will I hear that voice saying, "Peter Ladefoged here." All those of us who knew him will miss him so much. Those who know nothing
much about phonetics but would like to could learn a great deal about it by consulting his relatively
popular book Vowels and Consonants (2000; ISBN 0631214127).

Bring the bling

After this brief exposure to linguistics, it seems to me that linguists are science-minded persons, who like words more than numbers, and are too nice to want to be lawyers.

After giving some thought to this hypothesis, I've concluded that it's roughly true, except that the sorting process is an imperfect one, and a certain number of people who should have been lawyers have ended up as linguists. Perhaps the errors are statistically symmetrical, and a certain number of linguists have also ended up as lawyers.

David's kind observation about the niceness of linguists, in any case, is a sort of rhetorical head-fake to set up a carefully-worded complaint. Linguists may be too nice to be lawyers, he concedes, but

Like lawyers, however, they apparently do tend to take liberties when describing the positions of others. Thus, where I said I was surprised, Benjamin says I am "shocked." Where I merely gave a prominent example, he says I am "troubled."

One thing for sure, I bet Benjamin and Mark would be quite annoyed, if someone wanted to permanently call their weblog a "bling", merely because weblogs by linguists are so unique.

We're certainly as unique as they come. But speaking for myself, I don't think I'd be annoyed by someone's desire "to permanently call" Language Log a "bling", though perhaps I don't adequately understand how that desire would impinge on me.

Meanwhile, Denise Howell at Bag and Baggage, who invented the term blawg, came up with two nifty titles for a comment on Ben's post -- "Pain In The Low Back" and "Better Than A Stick In The Eye-Dialect" -- before deciding on "I, Sandwich Dominatrix". All three titles are hilarious, in a quiet word-nerdy sort of way, but it'll spoil the jokes to explain them, so you'll have to read the sequence of posts.

This back-and-forth between law and linguistics reminds me, for some reason, of Walker Percy's proposed solution to the perceived problems of teaching poetry and biology in a way that allows "a student who has the desire to get at a dogfish or a Shakespeare sonnet [to salvage] the creature itself from the educational package in which it is presented". He describes two methods that he rejects as impractical -- catastrophe and apprenticeship -- and concludes that

since neither of these methods ... is pedagogically feasible ... I wish to propose the following educational technique which should prove equally effective for Harvard and Shreveport High School. I propose that English poetry and biology should be taught as usual, but that at irregular intervals, poetry students should find dogfishes on their desks and biology students should find Shakespeare sonnets on their dissecting boards ...

January 26, 2006

Surprising crocodile kin

It's great having a brother who's a noted science
writer,
especially one who's a fellow blogger.
Today Carl Zimmer's blog ("The Loom") has an entry
about his New
York Times article describing a fascinating new paleontological
discovery: the fossil remains of an ancient reptile related to modern
crocodiles and alligators, with a body much like a dinosaur's. What's surprising is that the fossil, named Effigia okeeffeae,
dates to 210 million years ago, or about 80 million years before
dinosaurs evolved similar bodily structures.

In the comments
section for the blog entry one can find knowledgeable discussion about
whether Effigia should be considered an "ostrich mimic mimic mimic." (This relies on the peculiar sense
of "mimic" used by paleontologists, which is evidently applied when a
newly discovered fossil resembles a previous discovery — thus
when early ostrich-like dinosaurs were found, they were dubbed
"ornithomimids," or 'ostrich
mimics.') But what caught my eye was a comment about the headline of
the Times article: "Fossil Yields Surprise Kin of Crocodiles." A
commenter known as "Clueless" wrote:

When I saw the headline, I was wondering how a
fossil yield could
surprise crocodiles (or their kin), and it took a few moments to figure
out what it was intended to mean. Does the author have any control over
the headline, or is it completely up to the editors at the newspaper?

To answer the commenter's question, journalists rarely if ever have
control over the headlines that are put on their articles, much to the
chagrin of writers who wake up to find their painstaking work undercut
by a misleading headline. In this case, the headline wasn't factually
misleading, only syntactically so. It's a great example of the kind of
ambiguous sentence that teachers of introductory syntax classes often present to their students (like the old standby, "I hate visiting
relatives"). If this were a diagramming exercise in Syntax 101, the
students would have to come up with phrase-structure trees to account
for the structural ambiguity:

The ambiguous reading hinges on whether "yields" is understood as a
noun or a verb. Once a reader decides to parse "yields" as a plural noun (with
"fossil" understood as an attributive modifier), then the garden path
has been established. The unusual headlinese of "surprise kin" further encourages the alternate parsing.
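The two competing structures can also be written out as labelled bracketings. What follows is a rough sketch rather than a careful analysis: the category labels (S, NP, VP, PP, N, V, P) and the flat treatment of the noun-noun sequences are simplifying assumptions, and the helper names are made up for the purpose.

```python
# Each tree is a nested tuple: (label, child, child, ...); leaves are strings.
of_crocodiles = ("PP", ("P", "of"), ("NP", ("N", "Crocodiles")))

# Intended reading: "yields" is the verb, "surprise kin" heads the object.
intended = ("S",
            ("NP", ("N", "Fossil")),
            ("VP", ("V", "Yields"),
             ("NP", ("N", "Surprise"), ("N", "Kin"), of_crocodiles)))

# Garden-path reading: "fossil yields" is the subject, "surprise" is the verb.
garden_path = ("S",
               ("NP", ("N", "Fossil"), ("N", "Yields")),
               ("VP", ("V", "Surprise"),
                ("NP", ("N", "Kin"), of_crocodiles)))

def brackets(tree):
    """Render a tree in labelled-bracket notation."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(brackets(c) for c in children) + "]"

def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]

def category_of(tree, word):
    """Return the label of the node directly dominating the given word."""
    if isinstance(tree, str):
        return None
    label, *children = tree
    for child in children:
        if child == word:
            return label
        found = category_of(child, word)
        if found:
            return found
    return None

print(brackets(intended))
print(brackets(garden_path))
```

Both trees cover the same six words in the same order; they differ only in where the subject ends, which comes down to whether "Yields" is labelled N or V.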

A similar ambiguous headline occasionally gets hauled out for the
amusement of linguistics classes: "British
Push Bottles Up German Rear." Again, the key to the battling
interpretations is whether a single word (in this case "push") is
parsed as a noun or a verb. I always figured that this headline was
apocryphal (one also sometimes sees "French" in place of "British").
But I've seen two references
online that say there was an actual headline from World War II
along these lines, evidently reproduced in Fritz Spiegl's What The
Papers Didn't Mean to Say (1965). The headline given in Spiegl's
book reads: "Eighth Army Push Bottles Up German Rear." For American readers
this isn't quite as elegant as using "British" or "French," since the
ambiguity of Spiegl's headline requires construing "Eighth Army" as
plural. That's not a problem for British readers, but in American usage
so-called "collective nouns" typically take singular verbs. The ethnonyms "British" and "French," much like "Chinese," can be construed as plural and thus lend themselves to ambiguous readings.

(Another variant on the headline offered by the author Terry Pratchett is "Russian Push Bottles Up German Rear." That doesn't work nearly as well, since the noun "Russian" can only be construed as singular and thus doesn't agree with the verb "push" — unless, of course, one reads "Russian" as a vocative and "Push Bottles Up German Rear" as an imperative. Ouch.)

In the State of Ohio in the United States, what do local residents call themselves? Ohioese? Wrong. Ohioan. In Toronto, Canada, the people there call themselves yes, you guessed it Torontonian. Never Torontonese.

Not enough to make you feel superior should you fall into Group I, or inferior if you unfortunately happen to be in Group II? Let's look at the Longman Dictionary of Contemporary English, 1978, for the definition of "-ese": suffix, 1. (the people or language) belonging to (a country); 2. (usually derogatory) literature written in the (stated) style. Examples: Johnsonese; journalese.

Or MSN Encarta Dictionary online: ... 3. The style of language of a particular group (disapproving). Example: officialese. [Via Old French -eis; Italian -ese]

He continues the argument:

The English-speaking founding fathers of Singapore were well aware of the subtle significance behind the "-ese" and "-an" distinction, and opted for Singaporean when the nation became independent in 1965.

India has a different story. The Indians stemmed from Europe. Europeans saw Indians as relatives. You wouldn't want to use harsh descriptions for your relatives, would you?

The same is true of Central and South Americans, who are cousins of North Americans and Mexicans.

You may ask: What about the Portuguese, also Europeans? Well, a few hundred years back, Portugal was a powerful nation warring fiercely with other major European countries for resources in overseas colonies, and was victimized by being hated and looked down upon by their European rivals.

and concludes:

In the 21st century, the world has evolved into an era when racial discrimination is not tolerated. It is time the names in Group II were abolished.

I don't know the history in detail, but I believe that the development of the derogatory suffix for writing or speaking styles followed, rather than preceded, the use of -ese for adjectival forms of toponyms. That's what the OED says:

A frequent mod. application of the suffix is to form words designating the diction of certain authors who are accused of writing in a dialect of their own invention; e.g. Johnsonese, Carlylese. On the model of derivatives from authors' names were formed Americanese, cablese, headlinese, journalese, newspaperese, novelese, officialese, etc.

The earliest citation for this development is from 1898:

1898 F. HARRISON in 19th Cent. June 941 As Mat Arnold said to me..‘Flee Carlylese as the very devil!’ Yes! flee Carlylese, Ruskinese, Meredithese, and every other ese. 1899 Golf Illustr. 14 July 134 American ‘golfese’. 1906 Daily Chron. 2 Aug. 3/2 Deplorable guide-bookese.

As for the story of the affix itself, the OED gives it this way:

forming adjs., is ad. OF. -eis (mod.F. -ois, -ais): -- Com. Romanic -ese (It. -ese, Pr., Sp. -es, Pg. -ez):-- L. ēnsem. The L. suffix had the sense ‘belonging to, originating in (a place)’, as in hortēnsis, prātēnsis, f. hortus garden, prātum meadow, and in many adjs. f. local names, as Carthāginiēnsis Carthaginian, Athēniēnsis Athenian. Its representatives in the Romanic langs. are still the ordinary means of forming adjs. upon names of countries or places. In Eng. -ese forms derivatives from names of countries (chiefly after Romanic prototypes), as Chinese, Portuguese, Japanese, and from some names of foreign (never English) towns, as Milanese, Viennese, Pekinese, Cantonese. These adjs. may usually be employed as ns., either as names of languages, or as designations of persons; in the latter use they formerly had plurals in -s, but the pl. has now the same form as the sing., the words being taken rather as adjs. used absol. than as proper ns. (From words in -ese used as pl. have arisen in illiterate speech such sing. forms as Chinee, Maltee, Portugee.)

There's clearly a story to be told about the concentration of -ese derivatives in East Asia, but I don't think that the story Liu tells is the right one, at least historically.

In sorting -ese and -ian, we need to note that English has other processes for forming adjectives from place names, including -ish (Irish, British, Flemish, Polish, Scottish, Spanish, Swedish), -i (Afghani, Iraqi, Israeli, Kuwaiti, Pakistani) and the motley collection of processes involved in cases like French and Greek.

In this context, we should note that -ish also has a disparaging or belittling tinge in nonce formations, as the OED observes:

In recent colloquial and journalistic use, -ish has become the favourite ending for forming adjs. for the nonce (esp. of a slighting or depreciatory nature) on proper names of persons, places, or things, and even on phrases, e.g. Disraelitish, Heine-ish, Mark Twainish, Micawberish, Miss Martineauish, Queen Annish, Spectator-ish, Tupperish, West Endish; all-over-ish, at-homeish, devil-may-care-ish, how-d'ye-doish, jolly-good-fellowish, merry-go-roundish, out-of-townish, and the like.

This can hardly be because the adjectival forms of toponyms with -ish are themselves generally deprecated.

Reforming English to regularize all adjectival forms of toponyms using -an or -ian would, ironically, align everyone with the usage attributed to George W. Bush in what were (among) the earliest reported "Bushisms": Grecians, East Timorians, Kosovians. On this line, I guess, you could pitch it as an educational reform to make it easier for schoolchildren to learn standard English, rather than as an exercise in political correctness designed to avoid negative connotations attached to anyone's morphemes.

But perhaps we'll see an alternative movement to rescue these morphemes from their historical degradation at the hands of elitist irony: "Say it loud: -ish and proud!"

January 25, 2006

Podzinger rejects Jesus

If you haven't already done so, go check out BBN's Podzinger service for searching podcasts. More exactly, according to its current banner, it's "searching 48797 podcasts" -- and growing. Podzinger applies automatic speech recognition to turn podcasts into text, and lets you search the stored texts for words or strings. You can sort results by date or by "Relevancy". Each hit is shown with the harvested title and abstract of the podcast, and 25 words or so of textual context around the matched search term -- more if there are several matches within the window shown -- and an indication of the time point in the podcast where the match occurred. In principle, Podzinger lets you access the audio of the podcast at the point of the match (although in my experience this often doesn't work due to server load or other issues), and it gives you links for the original source of the podcast (URL or RSS).

I should say right up front that I think Podzinger is terrific. I've been using it for several days with considerable satisfaction. And it's an excellent display of the strengths and weaknesses of state-of-the-art speech recognition technology.

Deputy secretary of state robert zoellick is in beijing where he began talks today with senior chinese officials the nuclear standoff with iran and north korea are high on the agenda The two sides also are expected to discuss bilateral relations and preparations for a strategic dialogue later this year China is the host of six party talks aimed at ending north korea's nuclear weapons ambitions -- visit to beijing follows a recent visit by north korean leader kim jong il i'm carl -- NPR news in washington ...

This has the ring of truth -- without even bothering to check, I'm confident that this transcript is mostly correct. Except for the lack of appropriate punctuation and capitalization, it's pretty readable. And all things considered, I think this is an extraordinary achievement. Before we get to some of the areas where today's speech recognition technology still needs improvement, we should pause and reflect on how good these programs have gotten to be.

Sometimes.

Speech-to-text (STT) programs are still heavily dependent on their "language model" -- their statistical appreciation of what words and word sequences are likely to occur -- and still find reverberant (and otherwise distorted) recordings difficult. I imagine that these are the factors that led Podzinger to return, as the first hit on my search this morning for Beijing, a passage at 0:28:42 of a sermon titled "Crown Him with Many Crowns", which it renders as:

... by the dot all eaten the curse and the -- in beijing this morning -- finest art so against -- has -- decade remote wooded delight When music scene it is my lord ...

Though I can't identify any particular theological error, this hardly seems like a suitable message to be delivered from the pulpit. When I listen to the appropriate section of the podcast, I hear it as:

(no one speaking) by the spirit of God calls Jesus a curse, and no one can say Jesus is Lord except by the spirit. So yes, you s- have said there came a moment in your life when you said Jesus is my Lord ...

The recording is a bit reverberant, and it's about topics that are not often featured in the newswire text that Podzinger's language model is apparently trained on, but it's not at all hard to follow for a human listener.

You can see what has happened, to some extent, if we line up the passages wordwise:

by the dot all eaten the curse
by the spirit of God calls Jesus a curse

Here the "spirit" is missing, probably because the phrase "spirit of God" is spoken very rapidly, and the word sequence "God calls Jesus" has been rendered as "dot all eaten".
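The wordwise line-up above can also be scored mechanically. What follows is only a minimal Python sketch, and nothing in it reflects how Podzinger or BBN actually evaluate their output: it computes the word-level edit distance (substitutions, deletions and insertions) between my transcription and the Podzinger rendering, and divides by the length of my transcription to get a word error rate. The function names are my own, and lowercasing both strings before comparison is an assumption made so that capitalization differences don't count as errors.

```python
def word_edit_distance(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference word sequence into the hypothesis."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the ref words seen so far
    # (before the current one) and the first j hyp words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # i ref words against an empty hypothesis: i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference words."""
    return word_edit_distance(reference, hypothesis) / len(reference.split())


heard = "by the spirit of God calls Jesus a curse"
podzinger = "by the dot all eaten the curse"
print(word_edit_distance(heard, podzinger))
print(round(word_error_rate(heard, podzinger), 2))
```

On this first pair the sketch counts 6 edits against a 9-word reference, a word error rate of about 67%: only "by", "the" and "curse" survive.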

and the -- in beijing this morning
and no one can say Jesus is Lord

This time "say Jesus is Lord" has been rendered as "beijing this morn(ing)".

The double hyphens in the Podzinger transcript represent unknown words, or rather (I presume) regions where none of the program's hypotheses reached its threshold of confidence. As this example indicates, the state of the art in assigning confidence ratings to recognition hypotheses is not very good.

decade remote wooded delight When music scene it is my lord
there came a moment in your life when you said Jesus is my Lord

Here "you said Jesus" has been rendered as "music scene it". I think we've seen enough to suspect that Podzinger is not yet ready to accept Jesus into its vocabulary, much less into its stony little silicon heart.

But no -- if we search for {Jesus}, we find 7,990 hits. Some are plausible, if not entirely correct. The 3rd hit I got, for instance, was at 0:04:05 in The Bible Podcast's reading of Genesis 38, which Podzinger rendered as:

... turned to prostitution and as a result she has become pregnant Jesus said Bring her out and let her be burned While they were bringing her route she sent word her father in ...

This is almost entirely correct, except that of course it's Judah, not Jesus, who is featured in the story of Tamar and Onan in Genesis 38.

Podzinger's first hit for {Jesus} this morning was at 0:09:50 of Rounders - The Poker Show for January 22, 2006, in a passage which it rendered as

... six names including daniel le grande do when jennifer harman and jesus ferguson and also -- -- had reaction doctor do we instead of tonight with the winner but at a -- that's ..

Not knowing much about poker, I figured this instance of "jesus" was another error, but in this case Podzinger had it right. At least the "jesus" part. My transcription of the corresponding stretch:

... six names including Daniel Negreanu and Jennifer Harman and uh Jesus Ferguson and also Robert Williams, and so had we actually talked to him two weeks ago instead of tonight, we wouldn't have uh chatted with that, so ...

So it seems that Podzinger is ready to accept Jesus after all, at least as the nickname of the poker player Chris "Jesus" Ferguson.

[I should make it clear that this post's focus on mistranscriptions of "Jesus" is just a humorous way to highlight some issues with STT technology. If you search Podzinger for "Jesus", you will certainly find plenty of examples where the word has been correctly recognized, and I certainly don't mean to suggest that Podzinger has any special problems with religious as opposed to secular words, or with Christian words as opposed to those associated with any other religion. Bill O'Reilly need not get indignant.

On the other hand, the examples cited above are exactly those that came up as I explored the Podzinger service this morning in writing this post. I first searched for "Beijing", and checked two of the top three hits, one of which looked good while the other looked bad; having observed some problems in recognizing the word "Jesus" in one of the podcasts, I tried a search for "Jesus", and again checked two of the top three hits. The one I left out (from 36:56 of the
PK & J Show) was transcribed by Podzinger as "embedded this can all learn sues outside community of (%EXPLETIVE) jesus was laying in finance -- awesome -- And The ..." Since this is a family weblog, you'll have to find out for yourself what the transcription should actually have been. Suffice it to say that "Jesus" is one of the few words that Podzinger got right. ]

Not cool enough to ignore the fact that the keynote speaker is none other than Richard Lederer -- nor, apparently, cool enough for Lederer to list this appearance in the Upcoming Speaking Appearances section of his Have Tongue, Will Travel page. Just reading Lederer's own bio is enough to make you wonder: what in the world were they thinking? As I and others have noted several times before, Lederer may be an award-winning punster and quick with a dictionary or two, but he's certainly no linguistic scholar. But perhaps the graduate student masses just want to be entertained.

January 24, 2006

A four-letter word beginning with F...

What's a four-letter word beginning with F that's
guaranteed to make everyone laugh?

Back in November, Barbara and I went to see The Capitol Steps
perform their satire show live in Harvard's Sanders Theater.
At the beginning of every event in the Sanders Theater there is a
calm-voiced announcement over the speaker system telling everyone
to turn off their cell phones and to look round and check the
location of the nearest exit to where they are sitting. But on
this occasion the voice continued: "In the event of an emergency,
do not leave the theater. Remain in your seats, and
wait for FEMA to arrive."

And of course the place erupted in laughter. Everyone roared.
That's what last fall's bumbling in New Orleans by Michael "Heckuva Job" Brown
has done to the reputation of a once important Federal agency.
The very word is a joke. Roughly like smut.

A four-letter word beginning with P...

Scott Adams revealed on
The Dilbert Blog
yesterday that his editor has objected to a panel in an upcoming
strip because a familiar four-letter word beginning with P
appeared in the dialog. Want to guess?

The word was porn. Believe it or not, the recommended
change was to another four-letter word, smut. A rather
old-fashioned word for a character in Dilbert to use, I would have
thought. Tom Lehrer wrote a
wonderful song under that title, and even
back then (in the sixties), the word was sort of jocular.

Blawgs, phonolawgically speaking

Mark Liberman commented last
week on some complaints lodged against the neologism blawg, meaning 'a law-related blog.'
David Giacalone of f/k/a
dismissed the term as "an insider pun by a popular lawyer-webdiva
(which should have been passed around and admired briefly as a witty
one-off)." (The lawyer-webdiva in question, by the way, is Denise
Howell of Bag and Baggage,
who began keeping a "blawg roll" in early March 2002.
An article in Legal Times
gives Howell sole credit for the coinage.)

Mark noted that blawg is
"an unusual sort of portmanteau word"
— unusual in that "the sound of one of the words (law) is completely contained within
the sound of the other word (blog)."
I'd agree that the blending of law
and blog into blawg is a peculiar formation (even
for a "witty one-off"), but not simply because one of the words is
phonologically contained within the other.

First, let's consider the structural possibilities for "blends" or
"portmanteaus" — words that combine two or more forms, with at least
one of the forms getting shortened in the process. In "Blends, a
Structural and Systemic View" (American
Speech 52:1/2, Spring 1977, pp. 47-64), John Algeo discerns three main categories of lexical blending:

For all three types of blending, the majority of items combine their
components sequentially: a segment of the first word is followed by a
segment of the second word, with possible overlapping between the two
segments. But Algeo notes that blending sometimes occurs through the
insertion of one form into another, again with possible overlapping of
segments. Following the
terminology of Harold Wentworth, Algeo dubs such inserted blends
"sandwich words." Note that sandwich words, like other blends, still
require that at least one form is shortened in the process of
combination; if there's no shortening then it's simply a case of
infixation, like fanfriggintastic
(expletive infixation) or scrumdiddlyumptious ("diddly" infixation with partial reduplication).

Here
are examples of sandwich words given by Algeo to fit each of his three
categories:

Though two of Lewis Carroll's classic portmanteaus — chortle and slithy — are represented among
Algeo's sandwich words, most are what Giacalone would call "witty
one-offs," or what linguists call nonce formations. Thus we have autobydography 'an autobiography
written by a dog,' in-sin-uation
'the insinuation of sin,' miscevarsitation
'marriage between attendants of different colleges,' and ambisextrous 'sexually
ambidextrous.' (Michael Quinion
notes that ambisextrous is
not so nonce, as it dates
from 1929 and "has achieved a modest continuing circulation.")

Every generation seems to create its own sandwich words, but we are
blessed (and cursed) to live in an era where every nonce formation is
likely to be recorded on some website somewhere, occasionally gathered
up in such repositories of fleeting usage as Urban Dictionary, Langmaker, or
most recently Merriam-Webster's
Open Dictionary. (Such collaborative enterprises tend to be utterly
chaotic, as opposed to the more methodical cataloguing of innovative
forms by Grant Barrett at Double-Tongued
Word Wrester or Mark Peters at Wordlustitude.) It's easy
enough to find latter-day sandwich words on these sites, e.g.: satiscraptory = satisfactory + crap, fantASStic = fantastic + ass, and specyackular
= spectacular + yack. Elsewhere one can find
sandwich words of a less profane nature, e.g.: specTECHular = spectacular + tech, fan-Kaz-tic = fantastic + Kaz (i.e., the baseball player Kaz
Matsui), and ter-RIF-fic
= terrific + RIF ("Reading is Fundamental").

Certain words seem to lend themselves to sandwich blending. Once ridonkulous
and other silly variants of ridiculous
began to spread several years ago, the word ridiculous
became a prime target for nonce sandwich blends. Urban
Dictionary is full of examples like redorkulous,
redrunkulous,
reboozulous,
and recrunkulous
(in these cases, the blending has led to a reanalysis of the first
syllable as re-). In
fact, ridonkulous itself has
been interpreted
as a blend of ridiculous and
donk(ey), though this strikes
me
as an ex post facto rationalization. Another popular target among
left-leaning Netizens is the word Republican,
which gets the sandwich treatment in such epithets as Rethuglican, Resmuglican, Repiglican, Redumblican, Rebooblican, Reporklican, Repooplican, Reputzlican, Repukelican, etc., etc.

The recipe for such sandwich words is pretty constant: take a
polysyllabic word and replace the primarily-stressed syllable with a
punchy monosyllabic word of your choice. It's clear, however, that blawg is a different beast,
morphophonologically speaking. Denise Howell took a monosyllabic word (blog) and inserted another
monosyllable (law), such that
the "bread" for the sandwich consists merely of one initial consonant (b-) and one final consonant (-g). I know of no other sandwich
word so dominated by its filling.
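The recipe can be written down almost literally. The following is a toy Python sketch under loud assumptions: the syllable division and the position of the primary stress are supplied by hand (nothing here detects syllables or stress automatically), and the function name sandwich is my own invention for illustration.

```python
def sandwich(syllables, stressed_index, filling):
    """Build a sandwich blend by swapping the primarily-stressed
    syllable of a word for a monosyllabic filling.

    syllables      -- the host word, already split into syllables by hand
    stressed_index -- position of the syllable carrying primary stress
    filling        -- the punchy monosyllable to insert
    """
    if not 0 <= stressed_index < len(syllables):
        raise IndexError("stressed_index must point at one of the syllables")
    parts = list(syllables)          # copy, so the caller's list is untouched
    parts[stressed_index] = filling  # replace the stressed syllable
    return "".join(parts)


print(sandwich(["Re", "pub", "li", "can"], 1, "thug"))  # Rethuglican
print(sandwich(["ri", "dic", "u", "lous"], 1, "donk"))  # ridonkulous
```

Applied to a monosyllabic host, the same recipe has no bread left at all: sandwich(["blog"], 0, "law") simply returns "law". Getting blawg requires splitting the host inside its only syllable, which is one way of seeing why it is a different beast.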

What's more, the two component words are maximally overlapping for
some
speakers and nearly so for others. For speakers with the cot-caught merger of low back
vowels (such as most residents of the western
U.S.), the vowel in blog
merges with the vowel in law,
with the result that blawg is
homonymous with blog.
Speakers
without the merger tend to use the cot
vowel for most words ending in -og,
with the exception of dog and
occasionally other common words. Blog
is not (yet!) common enough to be subject to this lexical diffusion and
thus remains distinct from blawg
for most speakers lacking the merger.

The low back merger is clearly a point of confusion in the blawg wars. The editor of Blawg
Review evidently has the merger and doesn't seem to be aware that
others might not:

Interestingly, the word blawg is
pronounced the same as the word blog,
so there is absolutely no confusion in oral communication. In the
written word, blawg is easily
intelligible and conveys additional
meaning to readers and to search engines.

Conversely, David
Giacalone doesn't have the merger and expressed shock that there
are those who do:

Frankly, I was surprised to read that you
pronounce "blog" and
"blawg" in the same way... That underscores the notion that the word is
just an insider gimmick, because the two words don't
need to be homophones. Merriam-Webster online, for example, does not
pronounce "blog" in a manner that makes it homophonic with "blawg." ...
I believe most "blawgers" pronounce the words blawg and blog
differently -- otherwise, making the distinction seems pointless. If
one has to pronounce them the same way for the uninitiated to
understand what you are talking about, you are making my confusion
argument for me.

Both sides of this argument seem odd to me. The Blawg Review editor
presents it as a virtue that blog
and blawg are pronounced
the same (for everyone, he thinks). I'd have guessed that this would be a
strike against blawg,
since the distinction with blog becomes
difficult to make in spoken interaction, potentially leading to more
confusion, not less. (Indeed, Giacalone links
to a post by Trevor Hill,
who also has the merger, but sees it as a drawback to blawg: "it's homophonous
with blog, making it
useless in actual English speech.")

On the other hand, I don't
think that the presence of the low back merger for some speakers
renders the blog-blawg distinction
"pointless," as Giacalone would have it. It would simply make blawg a sandwich blend with maximal
overlap, like in-sin-uation, fantASStic, ter-RIF-fic, or ri-dick-ulous. True, the punniness
of those polysyllabic blends can be driven home by exaggerating
the stress on the inserted segment, a prosodic device that isn't
available for blawg (unless a
peculiar contrastive pronunciation developed, like "buh-LAW-guh").
But blawg has been doing
just fine as a visual blend, regardless of whether readers think it's pronounced the same as blog
or not. Since the term has thus far existed primarily in online
interaction in the blawgosphere, complaining about its potential
pronunciation makes about as much sense as complaining about the
typographical conventions of l33t. But if blawg really does start taking
off in spoken discourse, it will be interesting to see if these
arguments over the word's pronunciation become intensified.

A companion to the phonological argument over blawg
is the aesthetic one. Hill thinks the word "looks ugly," and Giacalone
is troubled by the similarity to dawg
as eye dialect for dog. (I
say eye dialect
because, as I mentioned, even speakers lacking the low back merger tend
to use the caught vowel for dog. But dawg may also represent a pronunciation spelling if
it represents an exaggerated pronunciation of the vowel; cf., rock vs. rawk.) Giacalone writes:

Most members of the public are far more likely
to think its a
take-off on the incredibly overused "dawg" for dog, rather than a
reference to law-related weblogs. Insiders know what it is,
outsiders do not and are very likely to view it as adolescent jargon.

Personally, I think most "outsiders" are perceptive enough to
avoid seeing blawg as merely
"adolescent jargon." Surely context is key. I can't imagine many
readers would have difficulty distinguishing between, say,
"Blawgs can be used for practitioners to give information about what is
happening in his/her area" on the one hand, and "Kewl blawg, dood!" on
the other. And if there are any concerns about misconstrual, one can
always opt for the more orthographically distinct bLAWg. Aesthetically, though, that's pretty darn odd-looking.

[Update #1: On a side note, Karen Davis emails to comment on the awkwardness of the above quote from the Maryland Bar Bulletin: "Blawgs can be used for practitioners to give information about what is happening in his/her area." As Karen notes, the writer "puts 'practitioner' in the plural and then *still* uses the clunky "his/her" instead of the natural — and totally permissible — 'their.'" I suspect this is simply an editing error, since the previous sentence uses the singular "practitioner." Or perhaps it's a case of pronominal hypercorrection brought upon by an aversion to singular they.]

A data point -- I come from DeKalb, IL, just above the Northern/Midland
isogloss, and have distinguished between 'cot' and 'caught' all my life.
Indeed, my surname constitutes a test case, since to me it rhymes with
'caller', 'taller', and 'hauler' but *not* with
'collar', 'dollar', or 'holler'.

However, *I* pronounce *both* 'log' and 'blog'
(as well as 'dog', 'hog', 'fog', 'frog', 'smog', and 'bog')
with the same vowel as 'caught' (open O), and *never* with /a/.

By contrast, I *always* have /a/ in 'cog', and I'm ambivocalic
with 'slog', 'sog(gy)', 'tog(gle)', and 'trog(lodyte)'.

So 'blog' and 'blawg' do mean the same thing for me, and in
fact when I first saw 'blawg' I assumed it was just an
eye dialect spelling of 'blog', just as 'dawg' is of 'dog'.

I guess the moral is that Paper's Law [1] applies here.

[1] Named after my former colleague Herb Paper, the law is
succinctly stated as "It's not that simple". ]

Unlike dangling

Last November Bob Tess mailed me from Macomb County in Michigan to
bring to the attention of
the
Fellowship of the Predicative Adjunct
a nice dangling adjunct case that does not involve a
participle. I meant to comment on this at the time, but it got lost in the
shuffle. It's about a piece on NPR's Morning Edition.
Says Bob:

In a report on the lack of excitement generated by the Lewis and
Clark expedition's bicentennial, NPR's Kirk Ziegler reported:
"Unlike Lewis and Clark however, people do want to talk about the
budget deficit."

Bob adds that Messrs. Lewis and Clark might have been very pleased to
talk about a budget deficit of proportions that would have seemed like
science fiction to them, only they can't, on account of being dead. That
is, he found he was forced to understand Lewis and Clark as an
understood subject of want rather than as object of about.
He heard the sentence as saying that people want to talk about the budget
deficit but Lewis and Clark don't. We were supposed to hear it as saying
that people want to talk about the budget deficit but they don't want to
talk about Lewis and Clark.

Unlike is an interesting word. It hovers on the boundary
between adjectives and prepositions. When it was formed it must have
been an adjective, because un- doesn't really attach to anything
else. (You don't find *unbetween, *unover, *unwith.)
But like now acts a lot more like a preposition in a number of
syntactic ways, and unlike has been left in an odd position,
not knowing whether to follow the syntax of its root or the syntax
suggested by its derivational prefix (as it were; I'm anthropomorphizing
lexemes here, which is a bit ridiculous, but I hope you see what I mean).

The relevant difference is that adjective phrases are predicative and mustn't
be left to dangle with nothing to predicate about, while preposition phrases
are allowed not to be predicative. So you can contrast the behavior of
ahead (a preposition, albeit of the kind that does NOT take a noun
phrase complement) with that of asleep (an adjective). Ahead
doesn't need an understood subject, but asleep does:

Ahead, there was nothing but the open road.

*Asleep, there was nothing but the open road.

Which way does unlike go? You decide. It seems to me to be
on the cusp. The question is whether you found you read the NPR sentence
the way Bob Tess did, or the way that the NPR scriptwriter intended.
I guess I lean in the same direction Bob does, which supports the view
that unlike is a (highly anomalous) adjective.

January 23, 2006

Wordplay's big splash at Sundance

A couple of months ago we were pleased to bring you the news that Patrick Creadon's documentary Wordplay had been accepted into competition at the 2006 Sundance Film Festival. Creadon's film focuses on New York Times crossword guru Will Shortz and his cultish followers, as well as providing a glimpse into the world of competitive cruciverbalism. Now it's Sundance time, and the buzz from Park City is quite promising.

As I suspected, noted crossword nut Bill Clinton is among the celebrities to make an appearance in Wordplay. The AP reports that other "self-professed word nerds" appearing in the film are "Daily Show" host Jon Stewart, folk-rock duo The Indigo Girls, and brainy New York Yankees pitcher Mike Mussina. (Stewart sounds reliably manic: he can be seen "assaulting the Times crossword, shouting 'Come on, Shortz! Bring it!'")

But the film's real excitement derives not from celebrity cameos but from its depiction of the American Crossword Puzzle Tournament, which Shortz directs. According to the AP, "Sundance crowds were so caught up in the film's footage of last year's crossword tournament that viewers groaned over a bitter agony-of-defeat moment in the dramatic finale."

Prospects for a distribution deal are looking good. An article on indieWIRE says that representatives from four independent distributors (Picturehouse, Warner Independent, Fox Searchlight and Roadside Attractions) expressed interest at a Sunday brunch with the filmmakers. Though no deal has been announced yet, it's looking more and more likely that Wordplay will be coming to an art house (or at least a video store) near you.

A rejoinder from our president

I am grateful to President George W. Bush for pointing out to me that
there have been other occasions when he used impermissibly
antecedentless reflexive pronouns. One widely quoted remark of his was: "when I'm talking about myself and when he's talking about myself, all of us are talking about me." The second myself clearly has no antecedent, as he notes in his letter to myself.

However, this does not mean that we are dealing with anything other than sporadic slips. The president fully agrees with me that antecedentless reflexives are indeed ungrammatical in his dialect, and he lends no support at all to the contrary
opinions of Chris Culy. So I hope that clears up the matter of the reflexives.

President Bush does, however, take issue with the suggestion that
"Is our children learning?" is ungrammatical, and he makes a good point.
First, this sentence gets more than 62,000 Google hits now, and frequency
should count for something in linguistic inquiry. But second, as he himself pointed out in a
lecture given at the Radio-Television Correspondents Association 57th Annual Dinner, the example has
been misanalyzed:

Then there is my most famous statement: "Rarely is the question asked, is our children learning." Let us analyze that sentence
for a moment. If you're a stickler, you probably think the
singular verb "is" should have been the plural "are." But if
you read it closely, you'll see I'm using the intransitive plural
subjunctive tense. So the word "is" are correct.

As always, Language Log are happy to correct the record on those
rare occasions where some grammatical subtlety slips past ourselves.
Particularly when the wronged party is the leader of the free world
and could quite easily blow Language Log Plaza straight to hell with
a cruise missile.

Unheimisch op straat

Rita Verdonk, the Dutch Immigration Minister, has recently called for a national code of conduct forbidding the public use of languages other than Dutch. Apparently the city of Rotterdam already has such a code. According to the article in de Volkskrant,

"Speaking Dutch in the street is very important. I get email from many people who feel uneasy in the street", said the minister Saturday at a VVD meeting on immigration in Rotterdam.

What is at issue is not a law, but a set of rules of conduct (what Ms. Verdonk calls gedragsregels) that would include a commitment to use Dutch in all interactions in public places. I think the Rotterdam code might be here, though given my complete ignorance of Dutch I'm not certain that I've found the one that Ms. Verdonk favors rather than some alternative or opposing proposal. [Later: Dutch correspondents confirm that this is indeed the Rotterdam Code.] This burgerschapscode ("citizenship code") lists seven points:

We Rotterdammers
1. take responsibility for our city and for one another without discrimination;
2. use Dutch as our community's language;
3. accept no radicalism or extremism;
4. raise our children as full citizens;
5. treat women equally with men and with respect;
6. treat homosexuals equally with heterosexuals and with respect;
7. treat adherents of different religions and atheists equally and with respect.

The context of all this is the situation symbolized by the murder of Theo van Gogh.

The de Volkskrant article quotes Laetitia Griffith, a member of Amsterdam's College of Aldermen, and a native of Suriname:

"I think that goes too far. Amsterdam is a world city with foreign investors. If I speak Surinamese [Sranan?] with a friend in the street and we don't cause any trouble, there's nothing wrong with that."

The Amsterdam mayor, Job Cohen, has been quoted as calling Ms. Verdonk a "hot-head with a cold heart", though he later back-pedaled a bit.

I'm not clear whether the part about speaking Dutch on the street is being emphasized because Ms. Verdonk and her party are emphasizing it, or because it's the part of the proposed gedragsregels that her political opponents object to. I imagine (though I don't know) that points 5 through 7 of the cited code are goals that the VVD (a "liberal", i.e. right-wing, party) shares with the Dutch left, though not with all of the immigrant communities; while point 2 (and perhaps the interpretation of point 3) is where the "liberals" and the left part company.

[Thanks to Bruno van Wayenburg for a pointer to the de Volkskrant article. Apologies to the Dutch nation for my (mis)translations from their gemeenschappelijke taal. Bruno also contributed this observation:

The rules are attempts to take a tough stance on the, admittedly very
real, problems of integration and social exclusion of significant groups
of Moroccan, Antillian and Turkish descent, but I doubt if lapsing into
19th century style language suppression will do the trick. No response yet
from the Frisian language minority in the North by the way, although
columnist Remco Campert predicts that they might definitely declare
independence now.

]

[ Readers should be aware that in this post I'm writing about both political and linguistic matters where I have little or no personal knowledge. I invite you to read the comments below and form your own opinions.]

[Marrije Schaake writes:

Thank you for your article on Language Log about Minister Verdonk's plans.

The discussion about it is (today) focussing on the larger explanation of the rules of conduct in the Rotterdam code.

On page 4 of the code you link to (which is the code in question in the whole matter, you found the correct one!), there's more about 'our shared language', and particularly that everybody should speak this language at work, at school, in the street and at the community centre. That's where the rub is: should people be allowed to speak their own language in the street?

The minister of course denies now she would like to implement a 'language police', but she does keep repeating the bit about getting mails from people who feel unheimisch in the street. And I bet she doesn't mean people who are upset about American tourists speaking English: it's Surinamese speaking Surinamese, Turkish people who speak Turkish, and most of all bearded Moroccan men in djellaba who speak Arabic or Berber.

To me, it's quite funny that she uses the word 'unheimisch', since that's a loan from German. It's a correct and accepted word, but still somewhat funny in view of the troubled relationship we used to have with the Germans. Speaking German in the street would have (at least) earned you frowns twenty or thirty years ago.

Michel Vuijlsteke adds this information:

How very, very ironic that the very word Rita Verdonk uses to describe "uneasiness" is a German word.

I was delighted to see that unheimisch is borrowed from German; by my
count that makes a majority of loanwords among the content words in Ms
Verdonk's quote: straat, mailtjes, unheimisch, minister, zaterdag (I
guess you could just count the zater- if you were being strict),
congres, integratie. Amazing how unaware these linguistic
nationalists are.

Also, as far as I know there is no language called "Surinamese"; the
main languages of Surinam are Dutch and Sranan (oddly, an
English-based creole), and I suppose the latter is being referred to.

]

[Rob Malouf writes:

Interesting discussion of language attitudes in the Netherlands! One thing
that struck me too is the irony of Verdonk's use of a German loanword. It's
not just that it's a foreign word -- the Dutch have a very complicated relationship
with the German language. As a non-native I won't claim to understand it,
but I will note that when Dutch racists painted anti-foreigner slogans on
a Turkish-owned gas station in my neighborhood, they did so in German. The
use of a German word by Verdonk (or her correspondents) carries a lot of meaning.

Here unheimlisch gets scare quotes, but "autochtone" goes by without
notice. "Autochtoon" is the opposite of "allochtoon",
which usually gets translated as "foreigner" or "immigrant".
Technically, it's any first or second generation immigrant from a country
outside of Europe besides the US, Canada, Australia, Indonesia, or Japan.

]

[Stefan Tilkov wrote:

"Unheimisch" is not a German word, at least not one that I (as a native German speaker) have ever heard. There is "unheimlich", which means "sinister", "strange", or "uneasy", there is "heimlich", which means "secret", and there is "heimisch", which means "at home" (in the sense of "heimisch fühlen" -- "feel at home").

I asked him what he makes of the
Taalunie page on the topic, and he responded

A Google search for "unheimisch" in German pages only yields 231 results; in Dutch pages, it's 688. Not really significant, I guess ... the first few of the German results use "unheimisch" in the sense of "not at home".

My Dutch is almost non-existent, still:
this document - the first hit for "unheimisch" in German - mentions

Thanks, Mark, your story is quite accurate, as far as I can judge. Here
are some comments, for your information (unless of course you want to
change the log name into Dutch Language Policy Log):

Job Cohen didn't really back-pedal: he later declared that his 'hot-head'
qualification was used in a different context in the interview, -more
importantly- the journalist acknowledged this and apologized quickly and
publicly. (However, Verdonk is predictably backpedaling now, as Marije
Schaake mentions)

Although I noted it myself, I think Rob Malouf and (especially) Steve of
Language Hat make a bit too much of the German loan unheimisch. It's
recognizably German, a bit learned but quite an ordinary word to use
(though apparently not for Belgians), probably something like 'Deja vu' in
English.

Besides, as far as I can see, the issue is not so much linguistic
nationalism, purism or even language at all, but tolerance of foreign
cultures. Verdonk might as well have started the old discussion about
head-scarfs again, using English loans.

German and Germany are not by far as loaded as they used to be (even
leading magazines to declare Germany 'cool' again). Still, I think Dutch
racists use German for an obvious reason: the Nazi associations. (Although
I didn't ask them).

]

[Lane Greene wrote:

You've already had many e-mails on this, and this isn't really a correction so much as a perception, but the VVD wouldn't normally be considered "right-wing" in the European context, though it is "liberal" in Europe. Liberal parties tend to be something like what we call libertarian: small government, lower taxes, but also socially permissive. I'm sure you know a bit or more of this. In the context of language policy, though, surely it's the social permissiveness, and not economic policy, that's at issue, and in this context the VVD isn't particularly right-wing at all. (European countries, including the Netherlands, have Christian Democratic parties for their social conservatism.)

I think the real story here is how, since Pim Fortuyn and Theo van Gogh were murdered, even the traditionally socially liberal parties (also including the standard center-left Labor party) are starting to be tempted by the anti-immigration bandwagon, though the socially left parties (including VVD and Labor) tend to dress it up as a need for "integration" of immigrants, not hostility.

It's easier, I think, to translate between Dutch and English -- or even Berber and Dutch -- than to translate between European and American political parties. By putting "liberal" into scare quotes in writing about VVD, I meant to clarify that this word doesn't mean in Europe what it means in the U.S. I guess that "libertarian" would be a better translation than "right-wing", since European liberal parties are also not generally counted as being on the right. But their small-government, tax-cutting, deregulation-oriented outlook tends to make them look Republican, even if their laissez-faire social attitudes don't... Maybe the best American translation would be "South Park Republicans", without the overall contempt for government.]

January 22, 2006

Truthiness in journalism

I didn't go to the voting session for the ADS "word of the year" this time, but I sent a proxy with Erin McKean, and when she told me that truthiness had won, I was surprised. It seemed to me to be an overly-specific reference to a particular episode of a TV show, which probably wouldn't have gotten much circulation at all if the NYT hadn't mis-reported it. However, I'm starting to think that I was wrong: truthiness might have some staying power.

Frank Rich's most recent column, "Truthiness 101", is behind the Times Select wall, but as usual is available elsewhere on the web. It contains the best explanation of truthiness that I've seen, in the form of an unusual journalistic admission:

It’s the power of the story that always counts first, and the selling of it that comes second. Accuracy is optional.

Rich intends to describe a state of affairs that he dislikes and blames on politicians. But in fact he's describing the ethos of journalism, as it's generally practiced rather than as it's traditionally preached.

The only thing that prevents the complete fictionalization of journalism, it seems to me, is the adversarial process of complaints from powerful people and contradictory stories from alternative sources, with the implicit threat that these pose to journalistic reputations. The introduction of weblogs into this process is presumably an annoyance for traditional journalists. The task of weaving the raw fibers of truth into an attractive tapestry of truthiness is difficult enough, without millions of bloggers constantly picking at the fabric.

(And the reason that science journalism is so particularly bad, I think, is that scientists have never been especially powerful, and few of them have had easy access to public channels of information. )

From a blogger's point of view, the truthiness of the mainstream media is simply part of what H.L. Mencken called "the daily panorama of human existence", which

is so inordinately gross and preposterous, so perfectly brought up to the highest conceivable amperage, so steadily enriched with an almost fabulous daring and originality, that only the man who was born with a petrified diaphragm can fail to laugh himself to sleep every night, and to awake every morning with all the eager, unflagging expectation of a Sunday-school superintendent touring the Paris peep-shows.

Rich ends his column by raising the curtain on a particularly juicy scene:

Fittingly enough against this backdrop, last week brought the re-emergence of Clifford Irving, the author of the fake 1972 autobiography of Howard Hughes that bamboozled the world long before fraudulent autobiographies and biographies were cool. He announced that he was removing his name from “The Hoax,” a coming Hollywood movie recounting his exploits, because of what he judged its lack of fidelity to “the truth of what happened.” That Mr. Irving can return like Rip van Winkle after all these years to take the moral high ground in defense of truthfulness is a sign of just how low into truthiness we have sunk.

That's a robustly truthy peroration. It might even be true. Rich seems to be referring to a paragraph in Ben Sisario's "Arts, Briefly" column from 1/16/2006:

"The Hoax," a movie based on Clifford Irving's memoir about his infamous publishing scam - his 1972 "autobiography" of Howard Hughes - is not to be released until later this year, but Mr. Irving has already asked that his name be removed from the credits as its technical consultant. Mr. Irving made the request in a brief letter recently sent to Mark Gordon, one of the producers, and copied to others on the project, including the director, Lasse Hallstrom. In the letter he gave no reason for his decision, but in a recent telephone interview said: "My feeling, based on the script, is that there was more concern for the kind of cigarettes I smoked and the type of suitcase I carried than there was for the truth of what happened." The film's producers responded last week, saying in a statement issued by the studio, Miramax Films: "Clifford Irving's book, 'The Hoax,' contributed greatly to Bill Wheeler's screenplay. Throughout development and production, we reviewed Mr. Irving's notes and incorporated many of them into the script. We deeply regret that he feels this way in advance of seeing the finished movie." Mr. Irving, who spent 16 months in prison for his involvement in the fraudulent Hughes memoir, has also taken umbrage with the film's characters, including his own (portrayed by Richard Gere), as being largely unlikable. PAT H. BROESKE

The implication seems to be that Irving thinks the movie is truthful, in matters like cigarette and suitcase brands, but not truthy, in terms of his (un)likability. If so, then he's taking the moral high ground in defense of truthiness, not truthfulness. Though I admit it makes a better story the other way around.

Anyhow, you can see that the word truthiness is showing signs of liveliness, and maybe even of life.

Who let the 'n' in?

I think it was the Dutch, actually, though the influence is an indirect one. Victor Mair emailed to ask why the language and people of Shanghai are known in English as "Shanghainese" (210,000 Google hits vs. 25,500 for "Shanghaiese"). The response that first comes to mind for most linguists, as Victor knows, would reference the universal preference for consonant-vowel alternation, the resulting uneasiness about vowels in hiatus (i.e. vowel-vowel sequences across morpheme boundaries), and the status of coronal (i.e. tongue-tip) consonants as the least-marked option for consonant epenthesis in repairing cases of hiatus.

However, Victor also cites these other counts:

Form             Google hits
Shandongese              236
Shandongnese           1,760
Vietnamese        34,500,000
Vietnamnese            1,230

Now, many Chinese languages/dialects would pronounce Shandong with a final nasalized vowel rather than a velar nasal, but that's not the way it works in the English version of the place name, so why is "Shandongnese" with intrusive -n- preferred by 7 to 1?
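For anyone who wants to check the arithmetic, here is a minimal sketch in Python that turns the Google counts quoted above into ratios of the form with -n- to the form without it. The counts are the ones cited in this post; the dictionary layout and the function name `n_preference` are just illustrative assumptions, and raw Google hit counts are of course a very crude measure.

```python
# Google hit counts as quoted in the post (early 2006).
# The structure and names here are illustrative, not from any real API.
HITS = {
    "Shanghai": {"with_n": 210_000, "without_n": 25_500},    # Shanghainese vs. Shanghaiese
    "Shandong": {"with_n": 1_760, "without_n": 236},         # Shandongnese vs. Shandongese
    "Vietnam": {"with_n": 1_230, "without_n": 34_500_000},   # Vietnamnese vs. Vietnamese
}


def n_preference(place):
    """Return hits for the -nese form divided by hits for the -ese form."""
    counts = HITS[place]
    return counts["with_n"] / counts["without_n"]


if __name__ == "__main__":
    for place in HITS:
        print(f"{place}: {n_preference(place):.5f}")
```

On these numbers the Shandong ratio comes out a little above 7, which is the "7 to 1" mentioned above, while the Vietnam ratio is tiny, consistent with "Vietnamnese" being a rare variant of an established form.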

I haven't done a careful study of this -- nor have I checked carefully to find the existing careful studies that may well exist. But my guess is that this starts with the analogical shadow cast by the place names ending in 'n' -- Japan, Taiwan, Canton, Bhutan -- whose adjectival forms (and the corresponding language names and/or ethnonyms) add '-ese' -- Japanese, Taiwanese, Cantonese, Bhutanese. Then there are the cases where a final syllable is elided in the place names to get adjectival forms that happen to end up ending in '-nese': Chinese, Lebanese.

Finally -- and most relevantly -- there are some long-established cases where there is an intrusive 'n': Java → Javanese, Sunda → Sundanese, Bali → Balinese, etc. The oldest of these seems to be Javanese, which the OED traces back to 1704:

1704 CHURCHILL Collect. Voy. III. 724/1 The Javaneses and Mardykers.

and which may derive from an earlier Javan:

1606 SCOTT (title) An exact Discovrse..of the East Indians, as well Chyneses as Iauans.

The preference for -ese as the adjectival ending for places in the "East Indies" presumably reflects the influence of Dutch, which also (I think) regularly has intrusive -n- in such words: Javanees, Sundanees, Balinees, etc. I don't have access to a historical dictionary of Dutch -- is there one? -- but I assume that these words date back at least to the early 17th century, if not the 16th. I also don't know whether the use of intrusive -n- to repair hiatus is the general pattern in Dutch, or whether (as in English) it's just one of many quasi-regular local options.

Anyhow, Shanghainese follows this well-established pattern, though the OED's earliest citation is from 1964:

As for "Shandongnese" and "Vietnamnese", I guess that people have started to re-analyze the ending as -nese rather than -ese. In the case of Shandongnese, there is no established English term, so the coinage is a recent one, and the 7-to-1 preference for "Shandongnese" over "Shandongese" apparently is telling us about the state of the net's collective neural net in this connection, so to speak. In the case of "Vietnamnese", there has been a standard English form "Vietnamese" for some time, so that "Vietnamnese" has the status of a rare mistake -- though I was surprised to learn that the OED's earliest citation is from 1947:

1947 H. R. ISAACS New Cycle in Asia viii. 157 Matters came to a head in Hanoi on December 19, 1946, when clashes in that city resulted in generalized warfare... The French charged that the Vietnamese were the instigators of the outbreak.

It's interesting that we chose the Dutchlike form (compare Vietnamees) rather than the Frenchy one (compare Vietnamien). They gave us their war, but not their word.

[Update: Steve of Language Hat emails to point out an obvious fact that I'd forgotten about:

I imagine the reason the OED's earliest citation is from 1947 is that
until WWII and Ho's independence movement, there was no such thing as
"Vietnam" -- what we think of as Vietnam was three provinces of French
Indochina, and you'd use Tonkinese, Annamese/Annamite (interesting
that there was no settled form), or Cochin-Chinese as called for.

He goes on to observe

Interesting also that the OED has no entry for Cochin-Chinese; they do
have one for Cochin-China, which is defined as "Name of a country in
the Eastern Peninsula"! I had never heard or seen that phrase used in
that way, but a little googling turned up "Geography of the Eastern
Peninsula: comprising a descriptive outline of the whole territory,
and a geographical, commercial, social and political account of each
of its divisions, with a full and connective history of Burmah, Siam,
Anam, Cambodia, French Cochin-China, Yunan, and Malaya," by Henry
Croley (1878). Forgotten geography...

]

[Update #2: Rogier Blokland writes:

As an avid reader of Language Log I couldn't resist answering your question ('I don't have access to a historical dictionary of Dutch -- is there one?'). There is one indeed, and it might be larger than the OED, or at least that's what we like to think.

I don't have access to one at the moment, nor have I found one on the web (the German Grimm is!), so I cannot check when 'Javanees' or 'Balinees' was first recorded, but I can have a look one of these days and get back to you, if you're interested.

I'll look forward to learning from Prof. Blokland, or another reader, what the WNT says about the antiquity of Javanees and similar words. ]

[Ben Zimmer observes

The historically older forms in Dutch are "Javaan" (pl. "Javanen") to
refer to a Javanese person and "Javaansch" (now usually spelled
"Javaans") to refer to the Javanese language or ethnic group. I know
that both of these forms were in use at the time of the first Dutch
expedition to Java in the 1590s. I don't have sources at hand, but I
seem to remember that this and similar ethnonyms were borrowed from
the Portuguese, who may have based their forms on Latin ("javana,
javanensis"?).

He added in a later note

The historical Portuguese ethnonym is actually "Jãos" (= 'Javanese
people'), used by João de Barros in 1553 (mentioned in the Hobson
Jobson entry for "Java"). That appears to be based on Arabic "Jawi",
which was a broader term used to refer to inhabitants of island
Southeast Asia.

With regard to your post on intrusive "n", there's also of course the "l" in
some African placename adjectives, such as Congolese and Togolese. These
seem to follow "o" and I recall the story that the analogy is based on
Angolese (a word that exists but which I'm not used to hearing, given the
prevalence of Angolan). There may be other models as well.

]

[Update 1/23/2006: Marina Muilwijk writes:

In your post on "Who let the 'n' in" you wonder about the date of
Javanees and Balinees.
Since I'm sitting in a library, I could easily look them up in the
Woordenboek der Nederlandsche Taal.

"Javanees" can't be found anywhere in the WNT. The first example for the
form "Javaansch", that Ben Zimmer mentions, is from 1688, but that is in
an example for a not very relevant word.

The earliest example for "Balineesch" is from 1726 (as an example with
"vrouwentimmer (women's quarters)").

January 20, 2006

Guest post: Getting ourselves in trouble

Darrell Waltrip, successful NASCAR driver and sports commentator, had this to say about talking and getting into trouble:

"I had a reporter one time tell me, 'Waltrip, you're
a great interview, but you talk too much,"' Waltrip said. "He told me I talked
and talked and talked, and eventually I'd say something that would get myself
in trouble.

"I was always like that. Talk, talk, talk, talk."

So, what did you think when you read that? Did you choke on your morning double mocha soyaccino and sputter, as Geoff Pullum might, "That's totally ungrammatical, in all dialects." Or did you calmly sit back and think to yourself, as Ricky Rudd fan DDniteOwl21 does, "Ignore him/them. Don't even risk saying something back that might get yourself in trouble. It's just NOT worth it."?

Unlike Geoff Pullum and more like Mark Liberman, I wasn't shocked by the reflexive pronoun in President Bush's statement (cited on Reuters here):

And so long as the war on terror goes on, and
so long as there's a threat, we will inevitably need to hold people that would
do ourselves harm.

The issue, as Geoff points out, is that

Reflexive pronouns like ourselves must
(to put it roughly -- there are some codicils) have an antecedent earlier in
the same clause, agreeing with it in person, number, and gender.

The codicils include the idea that stressed reflexive pronouns are not subject to this constraint, and there are certain other constructions that allow a reflexive pronoun without a same-clause antecedent. For example, Geoff pointed out to me in e-mail that "between ourselves" in the sense of "confidentially" is one such construction. (More on that below.) And of course, other languages (e.g. Ewe, Japanese, Latin, and many more) do not necessarily have the same-clause antecedent constraint on particular pronouns. Those kinds of pronouns lacking the same-clause antecedent constraint have been referred to variously as "long distance reflexives," "non-clause bound reflexives," and "logophoric pronouns" -- but that's another topic.

But English does have (some form of) the same-clause antecedent condition on its reflexives, so Bush's statement, as well as those by Waltrip and DDniteOwl21, which are syntactically parallel to Bush's, I take to be one of those mysterious aspects of English that Geoff likes.

So, just for my own curiosity I tried to find more examples of unstressed ourselves used without a clausal antecedent. I did two kinds of searches. One type of search was to use Google to look for examples parallel to Bush's: ourselves as the direct object of a subject relative. The other type of search was to look for ourselves in a few (about two dozen) 18th-early 20th century works, fiction and non-fiction, that I happened to have scattered across my hard drive. Obviously, neither search was exhaustive of its kind. The 30 examples I found via Google are here, for brevity(?!) of this posting.

Here are some observations about the Google searches (examples parallel to Bush's utterance):

Looking for things that are/seem ungrammatical can lead to sites using fake
English to fool search engines
e.g. look for: "that did ourselves" -bush

The examples are rare, but they do exist, and they are probably (though
I haven't tried to check) no rarer than some examples from syntax articles.
Also, I didn't look for any other pronoun or any other position or function
of the pronoun.

Many of the examples are from religious-themed pages.

Many, though not all, of the examples have another first person plural pronoun
in the same sentence.

"X that GET ourselves Y" is easy to find, and seems among the most natural
to me.

Past tense of the main verb of the relative clause is rare (though less
so with GET).

I didn't find any examples with the perfect (present or past) in the relative
clause.

I excluded examples where the relative clause modifies a noun that is predicated
on a first person e.g. "One aspect of Human Selection says that we are a species
that can reinvent ourselves and what I am saying is that we have done so before
and it's a consequence of learning and surviving the lessons." (Source:
http://groups.yahoo.com/group/Fountain_Society/message/3448)

The four examples I found from the historical works are very different, and
none are of the object in a relative clause type. The numbers are too small
to make any general observations. The examples are at the
end of the post.

While all these people could have forgotten what the subject of the clause was, as Geoff suggested Bush did, it seems unlikely to me. Not that I have an explanation of what is going on, mind you. That would need more, and more systematic, data. And while I'm willing to admit that they could all be errors, it's worth taking another look.

Finally, a side benefit to me of looking for these examples was finding other things that piqued my curiosity. For example, "between us" meaning "confidentially" is pretty common in a cursory look on Google, but very uncommon in the few historical sources I looked at. Another example is that the ratio of among + pronoun to amongst + pronoun is significantly lower when the pronoun is a reflexive (ourselves, yourselves, themselves) than when it is a non-reflexive (us, you, them), and the ratios vary widely across persons. (Counts were the extremely crude measure of Google hits.) I didn't expect this disparity, given the synonymy of among and amongst — amongst doesn't even rate a separate entry in the collegiate dictionary I looked in.

So, whether or not Bush's utterance is ungrammatical in some or all dialects, taking it as a mystery to be explored leads to some (potentially) interesting results.

Here are the four examples from historical sources of unstressed ourselves
without a same-clause antecedent. None of them are in the pattern discussed
above (object of a verb in a subject relative clause). All the books were from
Project Gutenberg. Emphasis on ourselves
added throughout.

Source: Robinson Crusoe by Daniel Defoe

In this distress the mate of our vessel laid
hold of the boat, and with the help of the rest of the men got her slung over
the ship's side; and getting all into her, let go, and committed ourselves,
being eleven in number, to God's mercy and the wild sea; for though the storm
was abated considerably, yet the sea ran dreadfully high upon the shore, and
might be well called DEN WILD ZEE, as the Dutch call the sea in a storm.

Source: Emma by Jane Austen

"Oh!" she cried in evident embarrassment, "it
all meant nothing; a mere joke among ourselves."

Source: Thomas Jefferson's Autobiography (2 examples)

It was argued by Wilson, Robert R. Livingston,
E. Rutledge, Dickinson and others ... [a long list of propositions, all starting
with "that"] ... That it was prudent to fix among ourselves
the terms on which we should form alliance, before we declared we would form
one at all events: And that if these were agreed on, & our Declaration of Independance
ready by the time our Ambassador should be prepared to sail, it would be as
well as to go into that Declaration at this day.

... and the amendment against the reeligibility
of the President was not proposed by that body. My fears of that feature were
founded on the importance of the office, on the fierce contentions it might
excite among ourselves, if continuable for life, and the dangers
of interference either with money or arms, by foreign nations, to whom the choice
of an American President might become interesting.

Mr. Understanding-the-key-point Chen

The article repeatedly talks about this as if it were part of fengshui (fēngshuǐ / 風水 / 风水). Coming up with a lucky name, however, traditionally belongs to fortune-telling, an entirely different field, though I suppose it’s possible that the two have become combined in modern China, where the traditional ways were broken.

Here's one of the name-changing narratives from the WSJ piece:

Chen Mingjian changed his given name in 1998, after he was fired from an investment consultancy he co-founded. A feng shui master said Mr. Chen's original given name, Jian, or "healthy," attracted money and success but mainly for his employers, not for himself. The name Mingjian, which means "understanding the key point," would give him a better chance at earning a personal fortune, the master said.

Mr. Chen now owns HollyHigh International Capital Co., a mergers-and-acquisitions consultancy with offices in Beijing and Shanghai. "Name changing ... gives you psychological assurance in difficult times," says Mr. Chen. "My career didn't take off until I changed my name."

Mr. Chen says eight of 32 of his classmates from prestigious Tsinghua University, most of them bankers and investment bankers, have followed his example and changed their names in recent years.

"I may change my name again if there are dramatic changes in my life," says Mr. Chen.

Who let the blawgs out?

I've come to know you as an articulate lover of the English language.
As far as I know, you don't say "lawgic" or "lawnguage," drink "lawtte," bill clawents, or use Blawk's Dictionary. You don't call lazy associates "slawkers," and have yet to dub Jack Abramoff a "lawbbyist."

You're usually a skeptic and no fan of "cute." If linguists called their
weblogs "blings" (or argonauts called theirs "blargs"), you'd probably
smirk. But, note: no one else uses such verbal oddities in naming their
weblogs. So, Ed, why do you, and other otherwise-serious members of
the legal community, refer to law-oriented weblogs as "blawgs?" Why
take an insider pun by a popular lawyer-webdiva (which should have been
passed around and admired briefly as a witty one-off) and help perpetuate it?

The Blawg Review's editor responded under the heading "Who let the blawgs out?". I'll let you read the closely-argued brief for yourself, but (s)he ends the argument by playing the trump card of lexicographic democracy:

In an ex parte communication, Ed. wrote to me to ask for expert testimony on the matter:

There's probably a lot more that could be said about the portmanteau "blawg" from a linguist's point of view, and we'd be very interested in your thoughts.

I'll observe that "blawg" is an unusual sort of portmanteau word -- it is indeed "a word formed by merging the sounds and meanings of two different words, as chortle, from chuckle and snort". However, the sound of one of the words (law) is completely contained within the sound of the other word (blog). At the moment, I can't think of any other examples of that kind. (I'm sure there are some others, at least among what Giacalone calls "witty one-offs", but they don't come to mind at the moment. Send them to me and I'll add them here.)
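The containment property can be stated a bit more concretely. Here is a minimal sketch in Python, assuming rough phoneme lists of my own devising (they are hypothetical transcriptions, not taken from any dictionary, and they assume a pronunciation in which "blog" and "law" share a vowel). The helper `contains_sequence` simply checks whether one sound sequence appears as a contiguous run inside another.

```python
def contains_sequence(longer, shorter):
    """Return True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


# Rough, hypothetical transcriptions, for illustration only.
BLOG = ["b", "l", "ɔ", "g"]
LAW = ["l", "ɔ"]
SMOKE = ["s", "m", "oʊ", "k"]
FOG = ["f", "ɔ", "g"]

if __name__ == "__main__":
    # "blawg": the sound of "law" sits wholly inside the sound of "blog".
    print(contains_sequence(BLOG, LAW))
    # "smog": neither source word is contained in the other.
    print(contains_sequence(SMOKE, FOG), contains_sequence(FOG, SMOKE))
```

On these assumed transcriptions, the "blawg" pair passes the containment check and a more typical blend like "smog" does not, which is the sense in which "blawg" is unusual.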

Beyond that, I don't have much to contribute, except the standard quotation from Horace about norma loquendi:

But why should the Romans grant to Plautus and Caecilius a privilege denied to Virgil and Varius? Why should I be envied, if I have it in my power to acquire a few words, when the language of Cato and Ennius has enriched our native tongue, and produced new names of things? It has been, and ever will be, allowable to coin a word marked with the stamp in present request. As leaves in the woods are changed with the fleeting years; the earliest fall off first: in this manner words perish with old age, and those lately invented flourish and thrive, like men in the time of youth. ... Mortal works must perish: much less can the honor and elegance of language be long-lived. Many words shall revive, which now have fallen off; and many which are now in esteem shall fall off, if it be the will of custom, in whose power is the decision and right and standard of language. [translation from C. Smart]

The crucial phrase is "licuit semperque licebit signatum praesente nota producere nomen", meaning something like "it has always been allowed, and always will be allowed, to coin a word stamped with the current year". And as Horace observes, whether a new word is accepted as the coin of the realm, and for how long, is not determined by lawyers or linguists.

A guest rant: "All we want are the facts"

Or does it just seem that way? Does the apparent decline of respect for veracity in public discourse amount to just another case of "the country going to the dogs" -- as people seem to have been repeatedly rediscovering from time immemorial? In a penetrating essay on the significance of the Million Little Pieces dust-up (N.Y. Times, Jan. 17, 2006, "Bending the Truth in a Million Little Ways"), Michiko Kakutani (MK) makes a pretty convincing case that this time there may be a real wolf at the door.

MK highlights the no-big-deal attitude of author James Frey, his publisher Doubleday and his major promoter Oprah Winfrey to the fact that Frey's self-styled memoir contains some undeniable, and for that matter undenied, fiction, which makes him out to have been a significantly bigger loser and thug than he really was and thus fulsomely inflates the drama of his supposed redemption. (Oprah used the phrase "much ado about nothing" to appraise Frey's lies in the context of his emotional message.) MK proposes that this is not an isolated incident, not even an isolated case of an inflated memoir. She points to staged reality shows, phony biographies of both the gilding and tarring varieties, slanted opinion-mongering that masquerades as news, ... and specifically to a Bush aide's dismissive characterization of reporters "who live in the reality-based community ... we're an empire now and when we act we create our own reality." That declaration would be laughable if it wasn't terrifying.

MK also mentions several unfortunate turns of phrase that have become part of our everyday language; for example "virtual reality", "creative non-fiction" and the word "survivor" applied to those who have overcome bad credit or obesity. She could have added "war on terror(ism)", "They hate us for our freedom", and countless others, including Fox News's self-identification as "fair and balanced." One of the most alarming of the recent truth-obliterating usages, it seems to me, is "deniability." The first time I heard this expression was when Admiral John Poindexter, the Reagan administration's uber-point man on Iran-Contra, explained some administrative skullduggery as justified because it provided President Reagan with deniability. What Poindexter meant was that his own dishonesty was admirable because it enabled his boss to claim unassailably, albeit untruthfully, that he didn't know what was going on.

Lest the reader conclude too quickly that the fault for the decline of truth in public discourse lies exclusively with the political right, MK points to the culpability of the overwhelmingly left-leaning post-modernists of our humanities and social science departments. In "deconstructing" all historical texts and arguing that they merely express the power of the interests their authors represent, postmodernists apotheosize the obstacles to objectivity rather than combating them. She cites in this connection an elegant line of Stanley Fish's: "the death of objectivity 'relieves me of the obligation to be right'; ... it 'demands only that I be interesting.'" And the most visible language-based strategist of the current left, George Lakoff, urges liberals not simply to tell the unvarnished truth, but to "frame" issues in ways that will combat the propaganda of conservatives and so aid the achievement of liberal goals. Lakoff makes forcefully the familiar point that the facts never speak for themselves; they have to be recounted in human languages, which are fraught with connotation and presupposition. Fair enough. But if the truth is seldom plain and never simple, it is nonetheless the only truth we've got. I'm all for the achievement of liberal goals, but I worry a little that even some of my best friends seem to care less about plain facts than they used to.

Truthiness: a flash in the pan?

Has the golden era of truthiness
already passed? The above graph, generated by
BlogPulse,
suggests that inhabitants of the blogosphere are already losing
interest in Stephen Colbert's
term for faux
truth, less than two weeks after the American Dialect Society named
it Word
of the Year and Colbert launched his offensive
against those who would deny him credit for the coinage. The recent
controversies in the literary world over the pseudo-memoirs of James
Frey and J.T. Leroy may
have provided one final boost, as the word was featured prominently in
commentary on the scandals in USA
Today, the Chicago
Tribune, and the San
Francisco Chronicle. But even if truthiness
is already on
the wane, at least it was a fun ride.

Wordanistas are split on the viability of the term. Even before the
ADS vote, noted etymologist Anatoly Liberman speculated on Minnesota
Public Radio that truthiness might
enter new dictionaries in the next year or two (presumably with a sense
differing from the archaic
meaning of 'truthfulness' found in the Oxford English Dictionary and the Century Dictionary). But even
Liberman — clearly not a fan of "The Colbert Report" — disparaged the word as "rather ugly and rather useless." On the ADS mailing
list Allan Metcalf wrote, "Like astronomers witnessing the birth of
a nova, we are watching the nativity and infancy of a new word that has
the possibility of becoming a permanent addition to the vocabulary."
Ron Butters, on the other hand, countered that "truthiness
is not a lexicological nova, it is a cute, stunt-wordy flash in the lexicographical pan and will go the way of Bushlips, and about as
quickly."

Bushlips, defined as
'insincere political rhetoric,' is something of an albatross for the ADS.
It was named Word
of the Year for 1990, the first year that the organization made
such a selection, and it recalls the days of Bush the Elder's notorious
backtrack from his "Read my
lips, no new taxes" pledge. Needless to say, Bushlips quickly withered on the
vine. Some might argue that selection as the Word of the Year isn't
intended to be an indicator of future success — after all, the ADS has
the category "Most Likely to Succeed" for that (in 1990 it was notebook PC and rightsizing, while in 2005 it was sudoku). Rather, the Word of the
Year is meant to capture something of the annual zeitgeist, and both
Bushlips and truthiness accomplished that for their respective years.

If truthiness is indeed
headed for the neologistic scrapheap along with Bushlips and so many others, its
rise and fall will at least serve as a fascinating case study for media
observers. To that end, I'd like to add two more pieces to the puzzle.
First, AP reporter Heather Clark, who Stephen Colbert declared as "Dead
to Me" for neglecting to assign proper credit in the initial
coverage, finally had her say in asap, the AP's "new
multimedia service featuring original content designed to appeal to
under-35-year-old readers." Clark was apparently inundated with emails
(no doubt due to the call to arms issued by Adam Green on the Huffington
Post), but she feels she is being unfairly maligned:

Now, listen up. Many of you insist that Colbert
"coined" the word
"truthiness."

In
fact, Colbert himself is the epitome of the word — as in "truthy," not
"facty." Mr. "Truthy" — witty intellectual that he claims to be [Huh? I thought he was supposed to be anti-intellectual. —BZ] — did
not coin "truthiness," though he did popularize it. The Oxford English
Dictionary has a definition for truthy that dates back to the 1800s and
includes the derivation "truthiness."

And for the record, I did mention Colbert's show in the initial
article that was read far and wide — or at least across New Mexico
(though I did NOT credit him for inventing the word!). Seems, though,
that the reference to Colbert was edited out by our national desk,
which often tightens stories and drops information that they feel isn't
all that important.

I have no problem believing that the omission of Colbert can be blamed on the AP's national desk editors, since the full
version of Clark's story did emerge in some outlets at around the
same time as the shortened version. But regardless of who was
responsible for leaving Colbert's name out of the national desk story,
the oversight turned out to be a godsend for "The Colbert Report," according to
Stephen Colbert himself (the comedian, as opposed to the absurd on-air
character of "Stephen Colbert"). Colbert was recently interviewed by San
Francisco Chronicle TV critic Tim Goodman as part of the Chronicle's
City Arts & Lectures program, and you can listen to all four parts of
the interview here
in podcast form. About halfway through Part
3, Colbert talks about the honor of having truthiness named Word of the Year.
He goes on to say how ecstatic he was that the AP didn't mention him,
since his character was in need of a persecution complex à la
Bill O'Reilly.

And who knows? Maybe another successful neologism will emerge from all of this. I myself am quite fond of wordanista and will probably be using it for a long time to come.

Technology review: giving new media a bad name?

As an MIT grad, I've been getting the MIT alumni magazine Technology Review in the mail for many years, and I generally read it with interest and pleasure. But Tech Review is now undergoing some changes, to make it bloggier, or at least webbier: more "immediate", more "searchable" and more "interactive". Unfortunately, these changes are not all for the better, as I evaluate them so far, because the information seems also to be becoming less reliably true. At least, in a couple of recent cases, Tech Review presented false statements dealing with simple matters of fact, about which the truth could have been learned in a few seconds of Google searching.

Ironically, old-media pundits have been complaining for years that blogs and wikis and such, lacking editorial oversight, are not factually reliable. This was never true, in my experience -- bloggers who know their areas are more reliable, on average, than journalists are. But what seems to be happening now is that Tech Review is aiming for the immediacy of blogging and other new media, in a way that really does degrade factual reliability rather than improving it.

This is a shame, because the theory behind the changes seems otherwise to be a good one. The new editor, Jason Pontin, has smelled the same new-media coffee as everyone else in the industry, and writes:

Readers want information to be immediate, searchable, and easily customized, and advertisers are demanding accountability from the publishers who take their money. Put baldly, the era when publishers could rely on print magazines to satisfy their readers and build sustainable businesses is over.

In keeping with MIT's history of innovation and leadership, Technology Review has decided to invest more of its resources in interactive media.

Specifically, Pontin explains, they're going to:

• Decrease the frequency of the print magazine to bimonthly publication;

• Dramatically increase the number of stories we publish on technologyreview.com every day;

• Expand the range of media we employ online to include podcasts, blogs, RSS feeds, and a variety of new technologies;

• Focus all our editorial content on the impact of emerging technologies and discontinue our coverage of the business models and financing of new technologies.

My first indication that the dramatic increase in online content might be sacrificing accuracy was a story by Kate Greene about machine translation, datelined Wednesday, January 18, 2006, under the headline "Repetez, en anglais, s'il vous plait". It contains this paragraph:

In 2005, DARPA also announced the Linguistic Data Consortium (LDC), a project aimed at acquiring huge amounts of translated documents, for distribution to Global Autonomous Language Exploitation, another DARPA-funded project in which computers will process the data. The intention of both of these initiative is to speed up the progress in machine translation. LDC is currently in the first year and will be transcribing speech from broadcast news sources and talk shows in Arabic, Chinese, and English, and also cataloguing text newswire feeds, Web news discussion groups, and blogs in those languages. For now, the project is focused mainly on data collection from these genres, with researchers in the computer and engineering science department at the University of Pennsylvania doing much of the work.

Now, the truth of the matter is that the LDC was founded in 1992, not 2005, and has been publishing materials for speech and language research since 1993. And the LDC's goals are quite a bit broader than collecting translated documents for MT research. And only a few of the LDC's staff members are associated with Penn's CIS department. And many LDC publications are authored by researchers from other institutions around the world. I know all this because I was the P.I. on the initial DARPA grant (which ended in 1995), and continue to direct the organization. Greene could have learned the facts about the LDC by asking Google for information on {linguistic data consortium history}, or poking around on the LDC web site for a few minutes, or by contacting someone at the organization.

These are small points, which I wouldn't care much about if I didn't have a personal connection to the work. I mean, 1992, 2005, what's 13 years in the grand tapestry of human history? In some ways, Greene's story is a step up from the July 2003 NYT story on an earlier DARPA MT evaluation -- which didn't mention DARPA at all, or the LDC for that matter, though it did track statistical MT back to 1999 or so. And I'm impressed that Tech Review allows comments on its online articles, so that readers can offer corrections.

However, it bothers me to think that when I read an article in Tech Review, I have to allow for the possibility that its "facts" are plainly and simply false, in ways that anyone can discover in a few seconds of research on the web. I don't have the time to check all the facts in every article that I read, so I like to think that in a reputable and well-edited publication like Tech Review, someone will have done that for me, at least to a first order.

Is this an isolated case of an unchecked mistake of fact? Apparently not. When I took a look at the Technology Review front page this morning, one of the prominently displayed blog headlines was "Lifespan for CD-Rs Around Two Years". The blog post behind the headline, by Brad King, quotes as if it were fact a 1/10/2006 IDG News Service story, which in turn quotes Kurt Gerecke, identified as "a physicist and storage expert at IBM Deutschland":

"Unlike pressed original CDs, burned CDs have a relatively short life span of between two to five years, depending on the quality of the CD. There are a few things you can do to extend the life of a burned CD, like keeping the disc in a cool, dark space, but not a whole lot more."

That's scary stuff -- think of all the crucial stuff naively saved on CD-Rs! But is it really true?

I checked into it a bit, not to get on Tech Review's case, but because I was genuinely worried about all the crucial data that I have backed up on CD-Rs. And apparently, it ain't necessarily so. The Wikipedia article on CD-Rs says:

There are three basic formulations of dye used in CD-Rs:

Cyanine dyes were the earliest ones developed, and their formulation is patented by Taiyo Yuden. Cyanine dyes are mostly green or light blue in color, and are chemically unstable. This makes cyanine discs unsuitable for archival use; they can fade and become unreadable in a few years. Many manufacturers use proprietary chemical additives to make more stable cyanine discs.

Azo dye CD-Rs are dark blue in color, and their formulation is patented by Mitsubishi Chemicals. Unlike cyanine, azo dyes are chemically stable, and typically rated with a lifetime of decades.

Phthalocyanine dye CD-Rs are usually silver, gold or light green. The patents on phthalocyanine CD-Rs are held by Mitsui and Ciba Specialty Chemicals. These are also chemically stable, and often given a rated lifetime of hundreds of years.

The same article says that

With proper care it is thought that CD-Rs should be readable one thousand times or more and have a shelf life of several hundred years. Unfortunately, some common practices can reduce shelf life to only one or two years. Therefore, it is important to handle and store CD-Rs properly if you wish to read them more than a year or so later.

That model predicts (at the 95% confidence level) that 95% of properly recorded discs stored at the recommended dark storage condition (25°C, 40% RH) will have a lifetime of greater than 217 years.

It wasn't hard to find this information: these pages were the first and third hits on a Google search for {CD lifetime}.

I'm glad to be warned that low-quality CD-Rs may lose data after a couple of years, and from now on I'll check to see what dyes are used in the CDs I buy. (I checked the ones I've been using, and I think I'm OK.) The E-MELD "School of Best Practices in Digital Language Documentation" mentions this problem in the general context of hardware and software obsolescence, but doesn't make any specific recommendations (that I could find in a quick search, anyhow), except the suggestion to

Place archival copies in a stable online linguistic archive that will:

Maintain a constant URL.

Migrate data to new formats

Good idea -- the LDC, among other outfits, stands ready to publish significant and well prepared language documentation archives -- but E-MELD ought also to tell language documenters to use CD-Rs with phthalocyanine dyes. And Tech Review should have done so, too, rather than just repeating an apparently incorrect newswire story.

A prescriptivist rant? Get a clue

I was astonished to find my
musings about why people call common abbreviations acronyms
described at the excellent copy-editor's blog
Tongue Tied as a "thoroughly prescriptivist rant". "Pullum pounced",
it says, as if I was some wild carnivorous beast; "what's with the
complainin' 'n' prescribin' act?", it asked (in the original form of the post,
now slightly revised), as if I had howled for vengeance and laid down ukases. I mean, really! I said, very calmly, "It's funny that
people get this wrong" (about calling something like FTBSITTTD an acronym),
and went on to note what acronyms and abbreviations have in common: they
are both what The Cambridge Grammar calls initialisms —
words formed anomalously from lists of initial letters. And that is supposed to be a rant? Only to someone who has never seen me
rant.

I've noted before that with popular beliefs about language it's all
"Everything is correct" versus "nothing is relevant". Two extremes,
no sensible middle. Let me say it again, as clearly as I can, in boldface
this time: It is not inconsistent for a linguist to note that somebody
used a word with a meaning that it does not standardly have.
Even a "descriptivist" professor of linguistics like me is perfectly entitled to
the view that comprise means "comprise", i.e., "embrace"
or "include"; it doesn't mean "compose" or "jointly make up" (despite
a century of evidence of people confusing comprise and
compose — and again, it's psycholinguistically
interesting that these two words
are confused with each other whereas red and green
are not). And likewise, acronyms standardly denotes
the initialisms that can be pronounced like words rather than lists
of letter names (hence the
apology offered by
Slate magazine
for wrongly calling FTBSITTTD an acronym). Why would anyone think that because I'm a linguistic
scientist I have to pretend nobody ever misuses any words?

Tongue Tied
is quite right, though, to point out that
Webster's
actually lists "abbreviation" as a second meaning
of acronym (after an "also"). The Webster's
practice is helpful to dictionary users: it
enables a reader to figure out what some people mean when they say
"acronym". That's good.
And paying attention to me will enable you to understand why
"abbreviation" is only given as a secondary meaning, and why The Cambridge
Grammar uses its terms the way it does. You need to understand both
that "abbreviation" is not the original or standard meaning (the American Heritage Dictionary, Tongue Tied notes,
does not give that as a meaning for
acronym) and that lots of people
believe otherwise.

January 18, 2006

And still they come

The market for books of word and phrase origins seems to be
inexhaustible. Most of them have no visible scholarship
whatsoever, just bald assertion. And many of the sources they
propose are preposterous, or plausible-sounding but clearly
wrong. (There are books that are honorable exceptions, most
recently Michael Quinion's Ballyhoo,
Buckaroo, and Spuds.) Yet still they come.

The latest of these horrors to come to my attention is Albert Jack's Red Herrings and White Elephants
(HarperCollins, 2004). No references for its claims, and just
opening pages at random I found three appalling entries in as many
minutes.

Number one: mealy
mouthed, Jack claims, derives from Ancient Greek melimuthos 'honey speak'.
There is no mention of meal (as in the OED), instead this elaborate and
strained loan-word account. Well, they sound sort of alike.

Number two: fell swoop Jack
takes back to Shakespeare, claiming that once the bard used the word fell in this phrase in the Scottish
tragedy, it came to have the meaning 'evil'. Good grief, even
without checking the OED, I knew that fell
'evil' goes back to Old English. (I checked the OED anyway;
memory is a fickle thing.) I'm guessing that somewhere Jack heard
that Shakespeare was involved in the history of the phrase and then
just made up the rest of the story. Shakespeare is in fact
involved, as Quinion explains in his entry for one fell swoop: when Macduff learns
that his entire family has been murdered, he does indeed cry out, "O
hell-kite! All? What, all my pretty chickens and their dam
at one fell swoop?" And this would have been understood entirely
compositionally by everybody in the audience, as a metaphorical
allusion to the evil plummeting of a kite (the bird) as it seizes its
prey. Not a novel meaning of fell
at all, but a wonderfully effective image. And so one fell swoop became a memorable,
and quotable, expression. Unfortunately, fell 'evil' later pretty much
dropped out of use in English, leaving this expression marooned as an
idiom. It's a nice little story about language history, much
better than the story Jack invented.

Number three: the spill the beans
entry provides a charming tale about voting with beans in Ancient
Greece. (Again, the Ancient Greek thing!) Against this is
the OED's assertion that the expression is originally U.S. slang and
the fact that the dictionary has no cites for it earlier than 1919
(from an American source, of course).

Enough, enough. It's a terrible book, by someone who doesn't seem
to know how to use dictionaries. We need a new genre category for
publications like this: "etymological fantasy", "fantasy etymology", or
maybe "fantetymology".

The one customer review on amazon.com -- it's a counterweight to a positive snippet from an
editorial review in the Knutsford (Cheshire) Guardian -- is rather more
detailed, and even more negative, than mine. But then I spent
only 15 minutes on mine, mostly writing time; my reading time was
blessedly brief. Here is "Syntinen", writing from
southeast England:

This book should carry a label saying
"Warning - don't assume that any of this is true". In the foreword the
author portrays himself as being inspired to write it when sitting in
an olde English pub musing on the oddness of English phrases. It reads
as though it had been researched in a pub as well; many of the
"origins" given are exactly the kind of thing you'd be told by some
wiseacre leaning up against the bar. To disprove some of them, such as
"keeping danger at bay" and "on the fiddle", wouldn't even take a
reference library; you'd only need to look up the words in a good
dictionary. One or two of them - such as "dead ringer" - come directly
from a famous internet spoof, "Life in the 1500s".

The book is sloppy in every way. Regardless of whether the explanation
of a phrase's origin is broadly correct or not, many of the supporting
"facts" are wrong; such as the statements that a pig's ear "cannot be
eaten or used in any way" - an assertion that would startle peasant
cooks from all over Europe - and that pigs are "sacred to Hindus" (!)

It's very odd that some of the "explanations" of phrases in this book
don't actually explain them at all. The images evoked by phrases like
"flogging a dead horse" or "scratch my back and I'll scratch yours"
exactly match what we mean when we say them; the stories in "Red
Herrings and White Elephants" actually make much less sense. And yet
people seem to prefer the far-fetched stories. Strange.

We, We, Madame

... one thing united lawmakers on both sides: reverence for the first person. Republicans used the "I" word 1,180 times. Democrats used it 1,123 times. Combined, they used it well more than the nominee, who said "I" 1,907 times.

Milbank doesn't tell us whether the rate of uses of the first person singular was significantly more or less than we should have expected from the various parties in such confirmation hearings. But as I asked myself this question, it reminded me of a recently noteworthy first person (plural) pronoun: the errant reflexive "ourselves" that Geoff Pullum cited in President George W. Bush's 1/13/2006 press conference with German Chancellor Angela Merkel. Did an uncharacteristic focus on the diplomatic "we", due to the renewed emphasis on trans-Atlantic identity, lead W into over-reflexivization?

The problematic sentence came up in response to the first question from a reporter, and (according to the White House transcript) was:

The answer to your question is that Guantanamo is a necessary part of protecting the American people, and so long as the war on terror goes on, and so long as there's a threat, we will, inevitably need to hold people that would do ourselves harm in a system that -- in which people will be treated humanely, and in which, ultimately, there is going to be a end, which is a legal system. [emphasis added]

That ourselves should have been us, since the subject of its clause is "people", not "we". Why might the president have spoken as if an extra "we" had snuck into the subject slot? Well, according to the transcript, Bush's opening remarks were 576 words long, and included

26 we
10 our
2 us

for a total of 38 first person plural pronouns, or a remarkable 6.6% of his word count. In particular, it's notable that 4.5% of these 576 words were the subject form "we". If we compare his opening remarks at the most recent three visits of heads of state, we find "we" percentages between 0.4% and 2.3%, or roughly a tenth to a half of the rate in his remarks welcoming Chancellor Merkel.

Specifically: when President Bush welcomed President Saleh of Yemen on 11/05/2005, his 167 words included

1 we
3 our
3 us

for 4.2% 1st plurals, and 0.6% "we". When he welcomed Prime Minister Berlusconi to the White House on 10/31/2005, his 171 words included

4 we
3 our

for 4.1% 1st plural pronouns, and 2.3% "we".

And when he welcomed President Abbas of the Palestinian Authority on 10/20/2005, his 928 words included

4 we
4 our

for a mere 0.9% 1st plural pronouns and 0.4% "we".

President Bush was not alone in focusing on "we" in the session with Chancellor Merkel. The 954 words of Merkel's (translated) opening remarks included

47 we
1 ourselves

for fully 4.9% "we".
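
The percentages in this post are simple ratios of pronoun counts to word counts. Here is a minimal sketch in Python that recomputes them from the figures reported above. The data layout and the function name `pronoun_rates` are illustrative assumptions, and the counts are copied from the text rather than re-derived from the White House transcripts. Rounding to one decimal place may differ by a tenth of a point from a figure quoted in the text.

```python
# Word totals and pronoun counts exactly as reported in the post.
# Only the pronouns the post lists for each set of remarks are included.
REMARKS = {
    "Bush, welcoming Merkel": (576, {"we": 26, "our": 10, "us": 2}),
    "Bush, welcoming Saleh": (167, {"we": 1, "our": 3, "us": 3}),
    "Bush, welcoming Berlusconi": (171, {"we": 4, "our": 3}),
    "Bush, welcoming Abbas": (928, {"we": 4, "our": 4}),
    "Merkel (translated)": (954, {"we": 47, "ourselves": 1}),
}


def pronoun_rates(words, counts):
    """Return (% of words that are listed 1st-plural pronouns, % that are 'we')."""
    total = sum(counts.values())
    return 100 * total / words, 100 * counts.get("we", 0) / words


for speaker, (words, counts) in REMARKS.items():
    all_pct, we_pct = pronoun_rates(words, counts)
    print(f"{speaker}: {all_pct:.1f}% first person plural, {we_pct:.1f}% 'we'")
```

Keeping the counts and the word totals side by side makes it easy to add further press conferences to the comparison, though a real study would also need a consistent rule for what counts as a word in each transcript.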

It was a veritable festival of we-ity. It pegged the we-meter. So it's not surprising that a stray "we" crept into the empty subject slot of that relative clause.

Sigmund Freud, on the other hand, would have been more impressed by the literal, subversive interpretation: who indeed are those "that would do ourselves harm"? And these days, John McCain might agree with him.

The birth of truthiness?

Last week's great truthiness debate
is still raging in some corners,
despite the fact that both the American
Dialect Society and Comedy Central's "The
Colbert Report" have probably milked about as much publicity out of
the spurious squabble as can be expected. At the heart of the debate is
the question of what sort of ownership Stephen Colbert (or rather the
truculent on-air persona known as "Stephen Colbert") has over truthiness, the word
first popularized on his show and later selected as ADS Word of the
Year. Colbert was
appalled when the initial Associated Press story on the Word of the
Year selection didn't even mention him, instead turning to an ADS member, Michael Adams, for a quick gloss. (The AP's shoddy reporting has led,
bizarrely, to Colbert calling the AP the "No. 1 threat facing
America"... in an article
by the AP.)

Though Colbert vehemently declared
that he "pulled that word right out of where the sun don't shine,"
Adams defended his right
to define the word by pointing out (both to Colbert himself and to the
AP in its followup article)
that truthiness can already
be found in the Oxford English
Dictionary. Colbert's rejoinder — "you don't look up
truthiness in a book, you look it up in your gut" — is unassailably
truthy. Nonetheless, we would be failing in our mission as wordanistas
if we didn't try digging a little deeper into the roots of truthiness.

Since the OED's lone 1824 citation for truthiness was first noted right
here back in October, it's incumbent on us to investigate the
source of this earliest known usage. The citation is taken from a book that was
actually published in 1854 by Joseph Bevan Braithwaite, entitled Memoirs of Joseph John Gurney, with
selections from his journal and correspondence. Gurney
(1788-1847) was an English banker who gained renown as a charismatic
Quaker minister, traveling to the United States and other countries to
preach on behalf of world peace, the abolition of slavery and capital
punishment, and abstinence from alcohol. Braithwaite was a disciple of
Gurney's evangelism, and he sought to spread his mentor's teachings by
presenting Gurney's collected writings posthumously.

Fortunately, both
volumes of the memoirs are publicly available from the University
of Michigan's Making of
America digital library. And it turns out Gurney used truthiness at least twice in his writings. (I haven't found any other pre-Colbert uses of the
word in printed materials, though the Usenet
archive finds a number of mostly tongue-in-cheek examples in online newsgroups over the
past decade.)

The first of Gurney's uses, the one that made it into the OED,
describes Amelia
Opie (1769-1853), a family friend who, through Gurney's influence,
decided to become a Quaker herself:

The chronology here is a bit confusing. The date at the top of the
page
is 1824, which is what the OED used for its citation. But Gurney is
describing the difficulties Opie encountered
"when she found herself constrained to make an open profession of
Quakerism," which didn't happen until 1825. The chapter where the passage appears actually begins with letters to Opie in 1824, but then Braithwaite
injects other material that Gurney wrote about her and her
decision to become a Quaker. This particular passage is from "his
notice of his long valued friend," which on an earlier page
Braithwaite explains is from Gurney's autobiography, a manuscript
written in 1837
while he was on a voyage to America. So it looks like the OED got the
dating wrong — truthiness is
actually 13 years younger than we thought. (Maybe Colbert was right
about not trusting reference books!)

Regardless of the exact date of the usage, it's immediately striking to the reader due to its italicization in the text, which suggests that Gurney was
emphasizing the unusualness of the word, perhaps in recognition of its
nonce status. I don't find any uses of truthy (or other derived forms)
elsewhere in the text, so I doubt that this was a term in
common use by the Quakers of the era. But certainly the word truth had a particular resonance for
Gurney and his fellow Quakers. To this day, Quakers often call
themselves "Friends
of the Truth" and place great importance on truthful
testimony. So for Gurney to trumpet Opie's "truthiness" must have
been an innovative form of praise for a recent convert to Quakerism.

The second example of truthiness
that I found
in Gurney's writings, from a journal
entry written in 1844, relates not to a personal quality but to the
Scriptures themselves:

Again, the italicization of the word highlights its peculiarity. But
here the usage seems positively (dare I say it?) Colbert-esque. Late in
life, Gurney learned to take delight in the odd little contradictions
found in the Scriptures. But these contradictions only reinforced his
faith in the truth, or rather the truthiness,
of the biblical text. Without those minor inconsistencies, the
Scriptures would lack "genuineness and authenticity." So clearly Gurney
was reaching for a concept beyond mundane truth. The Bible is no mere
reference book, after all. As I'm sure Mr. Colbert would remind us, no
one ever accused the Good Book of being "all
fact, no heart."

January 15, 2006

What Whorf would have said

[This is a guest post by Paul Kay, responding to an earlier Language Log post.]

My colleagues and I would like to express our appreciation for the nice things Mark Liberman ("What would Whorf say?" Language Log, Jan 3, 2006) had to say about our study (Gilbert, Aubrey L., Terry Regier, Paul Kay and Richard B. Ivry. "Whorf hypothesis is supported in the right visual field but not the left." PNAS. 103, 489-494, 2006). In that paper, we presented evidence for the Whorf hypothesis operating in the right visual field (RVF) but not the left visual field (LVF). This pattern is suggested by the functional organization of the brain, since the RVF furnishes visual input to the left cerebral hemisphere (LH), and the LH is significantly more dedicated to language processing than the right hemisphere (RH). In studies involving visual search for colors, we found that reaction times to target colors in the RVF were faster when the target and distractor colors had different names than when they had the same name; in contrast, reaction times to targets in the LVF were not affected by the names of the target and distractor colors.

Mark’s post gave an excellent description of our experiments and findings, and sparked some very useful email discussion. However, at the risk of seeming ungracious, we wish to contest one part of his interpretation of our results. Our disagreement arises with the following passage of his post:

One [problem] is that the explanation might have worked just as well if the experiment had come out quite a bit differently. […] Other possible results -- basically anything except a situation in which color category makes no difference, or doesn't interact with visual field -- could similarly be given a Whorfian interpretation. [Italics ours]

We think the Whorf hypothesis makes a more specific prediction than this – a prediction that is confirmed by our findings. It has previously been established that (1) other things being equal, stimuli from distinct lexical (thus, linguistic) categories are discriminated faster than stimuli from the same lexical category – a Whorfian finding, since it apparently stems from language, and (2) as mentioned above, language function tends strongly to be biased to the left hemisphere, to which the RVF projects. Under the Whorfian hypothesis that language affects perceptual discrimination, a straightforward extrapolation from (1) and (2) is that cross-category stimulus pairs should be discriminated more readily than within-category pairs to a greater extent in the RVF than the LVF. Our results confirm this specific prediction. More complicated scenarios leading to different predictions are of course possible, but, we submit, less well motivated.

[This post has benefited from email exchanges between some of us and Mark Liberman.]

January 14, 2006

Forensic linguistics, the Unabomber, and the etymological fallacy

It's been noted here at Language Log that mass-media
reporting on linguistic topics very often turns out to be frustratingly
simplistic or misleading. But the truth is, it's difficult to get journalists
interested in writing about linguistics at all. Despite the success of Steven Pinker
in popularizing cognitive linguistics and Deborah Tannen
in doing the same for gender-based sociolinguistics, most research by linguists remains resolutely unsexy. (American dialectologists and
lexicographers find that the only sure-fire way to get mentioned in the
press is to anoint a Word
of the Year — and if that selection sparks a phony feud,
all the better!)

But one subdiscipline that seems tailor-made for media attention is forensic linguistics, the
application of linguistic analysis in legal settings, such as criminal casework. The Washington
Times, reporting on the field in its Jan. 12 edition, went with the
obvious headline, "CSI:
Language analysis unit." Hey, if forensic anthropology can get its
own network
TV show, why not forensic linguistics?

The article touches on forensic analysis by academic scholars
such as Roger Shuy, as well as work
done within the FBI. James R. Fitzgerald, the acting chief of the
FBI's Behavioral Analysis Unit-1 and a longtime Bureau analyst, spoke of
perhaps the most famous application of forensic linguistics in a U.S.
criminal case:

[Fitzgerald] recalls how a transposition of
verbs in the manifesto
written by the Unabomber helped lead to a closer identification of Ted
Kaczynski in April 1996.
The latter used the phrase "You can't eat your cake and have it, too," instead of the usual form, which is "You can't have your cake and eat it, too." Like most people, Mr. Fitzgerald thought Kaczynski had made a mistake. But examination of other letters by him contained a similar feature, which, Mr. Fitzgerald says, "is actually a
traditionally middle English way of using the term. He technically had
it right and the rest of us had it wrong. It was one of the big clues
that allowed us to make the rest of the comparison and submit a report
to the judge who signed off on a search warrant."

There are a few problems with this account. First, by focusing
strictly on forensic linguistics, the article glosses over the role of
David Kaczynski, the brother of the Unabomber. It was David who first
realized that the appearance of "you can't eat your cake and
have it too" in the Unabomber manifesto
might be an indication of the writer's true identity. [See Update #3 below.] Fitzgerald has
elsewhere discussed how David Kaczynski's call to the FBI set the
identification of the Unabomber in motion. Following David's hunch,
Fitzgerald's team of agents and analysts made a more systematic comparison
of the Manifesto with letters written by Ted Kaczynski to his brother
and mother. The idiosyncratic use of the "cake" expression, among other
stylistic evidence presented in the FBI's affidavit,
was enough to convince a judge to issue a search warrant for
Kaczynski's cabin in Montana. (See the abstract
from a paper presented by Fitzgerald at the 2001 conference of the
International Association of Forensic Linguistics.)

But what of Fitzgerald's assertion that Kaczynski's particular usage of the
"cake" phrase is "actually a traditionally middle English way of using
the term"? Well, the "eat your cake and have it" ordering is indeed
older than "have your cake and eat it," though its first dating
of 1562 (in John Heywood's A
Dialogue Conteynyng Prouerbes and Epigrammes) only makes it
Early Modern English, not Middle English. But beyond that nitpick,
Fitzgerald's claim that Kaczynski "technically had it right and the rest of us
had it wrong" is a clear variant of the etymological fallacy frequently observed by Arnold Zwicky and others (see here,
here, and here).
As with "could care less" developing from "couldn't care less," it's
often claimed that the historically later idiom is less "logical" and
therefore incorrect.

But does "you can't have your cake and eat it" really lack the
inherent logicality of "you can't eat your cake and have it"? Only if
you consider the ordering of the two conjoined verb phrases to imply
sequentiality: you can't eat your cake and then (still) have it, but you can
have your cake and then eat
it. On the other hand, if the and
conjoining the VPs implies simultaneity of action rather than
sequentiality, then neither version is more "logical" than the other:
cake-eating and cake-having are mutually exclusive activities,
regardless of the syntactic ordering.

Fitzgerald seems to suggest that Kaczynski's "correct" use of the
idiomatic phrase helped guide FBI profilers into looking for an
exacting academic type (rather than a mere raving crank), one who knows
how to use the "right" language that ordinary folks get "wrong." But of
course it's only "wrong" in the sequential-and (rather than simultaneous-and) way of thinking. According to this
article, "eat your cake and have it" was also the ordering that
Kaczynski's mother used (probably another reason why his brother spotted it in the Manifesto). If that's the case, then it's understandable
why he would have grown up scorning the "have your cake and eat it"
ordering, especially if his education at top universities (Harvard and
Michigan) reinforced an elitist view of language use. The FBI thought
they were looking for a paragon of linguistic propriety, when they were
actually just looking for a pedant.

Finally, I should note that the "wrong" version of the expression
has been around for 180 years or so, at least in American usage. A
search on the American Periodical Series and the Making of America
databases finds the have-eat ordering in use from 1827, and
firmly established by the mid-19th century:

North American Review, July 1827, p. 116
This may have its advantages, but how will he contrive to live below
the common standard and above it at the same time? He cannot both have
his cake and eat it.

Tennessee Farmer, Feb. 1837, p. 2
We beg of them to look about the River Towns for Farmers who will join
them in getting up a sugar refinery, and in that way falsify the old
proverb, which says "you cannot have your cake and eat your cake."

Cincinnati Weekly Herald, Nov. 13, 1844, p. 9
If your Jewish creed be right, you are wrong to deny its manifest
deduction. If your Jewish creed be wrong, you are right in wishing to
explain it away. But you cannot have your cake and eat it, too.

North American Review, Apr. 1848, p. 371
The reading public cannot have its cake and eat it too, still less can
it have the cake which it ate two thousand years ago.

Daguerreotype, May 20, 1848, p. 289
The experiment will end in the discovery that "you cannot have your
cake and eat your cake."

The earliest example, "He cannot both have his cake and eat it," is
helped along by the use of both
— which, as Michael
Quinion notes, assists the reader with the simultaneous-and reading. Later examples from
the 1840s onwards simply append too
at the end of the expression to imply simultaneity, and this remains an
overwhelmingly common phrasing.
But this still doesn't seem to satisfy those who consider the
sequential-and reading to be
somehow more "logical." In fact, Kaczynski said "you can't eat your cake and have it too" in his manifesto, so the presence of too is apparently not sufficient to establish the simultaneous sense of and for those who are committed to the sequential version.

(I certainly don't mean to tar all sequential-and types with the same brush as
Ted Kaczynski. But perhaps he's a cautionary tale for what can happen
when narrow-minded pedantry goes unchecked!)

[Update #1: Early American Newspapers supplies an earlier variant with keep-eat rather than have-eat in a verse entitled "Guillotina for 1797," first published on Jan. 1, 1797 in the Connecticut Courant and subsequently appearing in other papers (Chelsea Courier, Jan. 18, 1797; Providence Gazette, Feb. 4, 1797):

Thus greedy boys would gladly treat it,
Could they but keep their cake and eat it.

Here the exigencies of verse dictate the ordering, but this example still establishes that the simultaneous-and reading was already available by the late 18th century.]

[Update #2: Richard Mason takes issue with my assertion that "cake-eating and cake-having are mutually exclusive activities, regardless of the syntactic ordering," noting that one "has" cake during the process of eating it. Though this is technically correct, the "having" part of the idiom seems to me to imply possession over a long period of time, rather than the transient cake-having that occurs during cake-eating. (The 1797 example, interestingly enough, makes this sense more explicit by using keep instead of have.) Ultimately, however, such ruminations over logicality are irrelevant when it comes to the popular usage of crystallized idioms. Few people protest the expression head over heels to mean 'topsy-turvy,' despite the fact that its "literal" reading describes a normal, non-topsy-turvy bodily alignment.]

[Update #3: James R. Fitzgerald sent the following email:

I recently read your posting on "Language Log" regarding my interview with the Washington Times. I want to make a few clarifications.
Firstly, if David Kaczynski did know of his brother Ted's non-standardized usage of the proverb/idiom "you can't eat your cake and have it too," he never provided it to me or my colleagues on the Unabom Task Force in 1995 or 1996, or any other time. He was apparently aware of the term "cool-headed logicians," which was found in the Manifesto, and also known to have been used by Ted, as he told various investigators of its use. But, as valuable as he was to the FBI in providing his brother Ted's information to the Task Force, he never mentioned anything about the "cake" proverb/idiom. As I explained in chapter 14 of the book Profilers, I was the first one to recognize this unusual usage.
Secondly, years ago, upon doing some basic research re. this phrase, I dated the idiom to the Middle English period as, according to the Morris Dictionary of Words and Phrase Origins, it was first found in Heywood's "Proverbs" in 1546, but, "...it had been in circulation for centuries before that...." (1988: p 277). While the Modern English period is generally seen as beginning c. 1500, I felt it safe to say that its etymological roots are firmly planted in the Middle English period. ]

More freedom but not more right or more rule

I wasn't as knocked over by the out-of-place reflexive as
Geoff Pullum was, but I did notice something in President Bush's news conference yesterday with Angela Merkel. In two places, he used the phrase "rule of law" without a determiner:

We share common values based upon human rights and human decency and rule of law; freedom to worship and freedom to speak, freedom to write what you want to write.

I think the best way for the court system to proceed is through our military tribunals, which is now being adjudicated in our courts of law to determine whether or not this is appropriate path for a country that bases itself on rule of law, to adjudicate those held at Guantanamo.

This is not the first time he's made this choice. For example, on 11/04/2005, in remarks headlined "President Bush Meets with President Kirchner of Argentina", he said:

Argentina and the United States have a lot in common. We both believe in rule of law.

Sometimes, however, he (or his speechwriter) says "the rule of law" in similar contexts:

As two strong, diverse democracies, we share a commitment to the success of multi-ethnic democracy, individual liberty, and the rule of law.

My initial reaction was that the determiner is optional with freedom, but required with right and strongly preferred with rule:

I believe in freedom of speech.
??I believe in rule of law.
*I believe in right to bear arms.

Web search supports the notion that freedom is different from the other two:

                               MSN      Yahoo     Google
believe in the freedom of    8,651     63,300     39,500
believe in freedom of       27,398    154,000    104,000
the/0 ratio                   0.32       0.41       0.38

believe in the rule of       8,443     65,400     35,400
believe in rule of             252        343        323
the/0 ratio                     35        191        110

believe in the right to     18,417     72,600     45,500
believe in right to            235        179        810
the/0 ratio                     78        405         56

However, the specific phrase makes a difference: "right to life" occurs an order of magnitude more often without an article than "right to bear arms" does. In fact, it seems that the article is dropped from "the right to life" much more often, relatively speaking, than from "the rule of law".

                                       MSN      Yahoo     Google
believe in the rule of law           7,347     57,200     31,600
believe in rule of law                 251        323        285
the/0 ratio                             29        177        111

believe in the right to life           877        903      1,190
believe in right to life               101         61        273
the/0 ratio                              9         15          4

believe in the right to bear arms      972        973      1,670
believe in right to bear arms            1          3         58
the/0 ratio                            972        324         29
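
For anyone who wants to check the arithmetic, the "the/0 ratio" rows are just the hit count for the phrase with the article divided by the hit count for the bare phrase. Here is a minimal Python sketch of that calculation, using the Google figures for the three full phrases. The function name the_zero_ratio is a hypothetical label chosen for illustration; it is not part of any search engine's API, and the counts are simply copied from the figures reported here.

```python
def the_zero_ratio(with_the: int, without_the: int) -> float:
    """Hits for the phrase with "the" divided by hits for the bare phrase.

    Hypothetical helper for illustration only; the inputs are raw hit counts.
    """
    if without_the == 0:
        raise ValueError("no hits for the bare phrase; ratio is undefined")
    return with_the / without_the


# Google hit counts copied from the figures in this post:
# phrase -> (hits with "the", hits without "the")
google_counts = {
    "rule of law": (31_600, 285),
    "right to life": (1_190, 273),
    "right to bear arms": (1_670, 58),
}

for phrase, (with_the, bare) in google_counts.items():
    ratio = the_zero_ratio(with_the, bare)
    print(f"believe in (the) {phrase}: the/0 ratio = {ratio:.1f}")
```

A higher ratio means the article is dropped less often, so on these counts "right to life" loses its article far more readily than either "rule of law" or "right to bear arms".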

"You have right to remain silent" sounds like my Russian grandfather, and indeed the web instances of that phrase that I checked seemed either to be typos or associated with writers from places like Ukraine.

The media sometimes use "rule of law" without an article:

(Salem Statesman Journal) (link) Respect for rule of law and oversight of those in power distinguish our democracy from a dictatorship.

but the article seems to be commoner (except in headlines):

(New York Times) (link) Speaking calmly, if with a continued hint of nervousness, Judge Alito provided no substantive new insights into his judicial philosophy or background as he tried to cast himself as open-minded and dedicated to the proposition that the rule of law should trump personal views and public opinion.

If there were a categorical difference between freedom on one hand, as opposed to rule and right on the other, we could attribute it to the fact that freedom is often used as a mass noun:

I just want some freedom.
They want more freedom.

whereas rule and right are not:

??I just want some rule.
??They want more rule.

However, it's clear that some native speakers think that it's fine to use "rule of law" or "right to life" without an article. The more I read the examples, the more plausible they sound to me as well. I suppose that these phrases are on the way to becoming conventional names for belief systems -- intellectual brand names, so to speak -- so that their use without articles follows the pattern of phrases like "I believe in verificationism" or "we share a commitment to democracy".

One historical note on "rule of law": the OED suggests that this phrase began (and also continues in lawyers' use) as a way of referring to specific legal principles, rather than as a term for the general idea that "no person ..., no matter how high or powerful, is above the law, and no person ... is beneath the law" (as Judge Alito put it in his confirmation hearings).

1765 BLACKSTONE Comm. I. 70 Whenever a standing rule of law..hath been wantonly broke in upon by statutes or new resolutions.

1827 JARMAN Powell's Devises (ed. 3) II. 89 This case was considered to have fixed, beyond controversy, the rule of law upon this subject.

1861 MAINE Anc. Law ii. (1876) 26, I employ the expression ‘Legal Fiction’ to signify any assumption which conceals, or affects to conceal, the fact that a rule of law has undergone alteration.

1994 (title of Act) Sale of Goods (Amendment) Act: An Act to abolish the rule of law relating to the sale of goods in market overt.

[gloss for maxim] 2. a. Law. A proposition (ostensibly) expressing a general rule of law, or of equity

The earliest citation for "the rule of law" in the sense of "political supremacy of an independent legal system" is from 1981:

1981 Times 10 Feb. 9/3 Geriatric judges with 19th century social and political prejudices only bring the rule of law into disrepute.

1988 Representations Autumn 117 Pudd'nhead Wilson translates mob opinion into the rule of law at the conclusion of Twain's novel.

But I expect that Ben Zimmer will find that Horace Greeley, if not Benjamin Franklin, used this term in its contemporary sense.

[Update: Ben writes:

I can't take it back to Greeley or Franklin, but the generalized sense
of "rule of law" was definitely in common use by the fall of 1914,
shortly after the outbreak of World War I. Both American and British
scholars used the phrase in their justifications for war against the
Central Powers.

"The Meaning of the War", New York Times, Sep. 23, 1914, p. 8
By David Starr Jordan, Former President of Leland Stanford University.
The invasion of Belgium changed the whole face of affairs. As by a
lightning flash the issue was made plain: the issue of the sacredness
of law; the rule of the soldier or the rule of the citizen; the rule
of fear or the rule of law. ... However devious her diplomacy in the
past, Britain stands today for the rule of law.
Reprinted in The New York Times Current History of the European War, Vol. 1

"Oxford Historians Defend England," New York Times, Oct. 14, 1914, p. 4
"The war in which England is now engaged with Germany is fundamentally
a war between two different principles — that of raison d'état, and that of the rule of law."
Quoting Why We Are At War: Great Britain's Case

]

[Update: Margaret Marks points out, in a
post on her Transblawg, that the OED actually has a specific entry for "rule of law", which I carelessly missed, and which (uncharacteristically) Ben Zimmer didn't catch me on:

c. rule of law: (a) with a and pl. : a valid legal proposition; (b) with the : a doctrine, deriving from theories of natural law, that in order to control the exercise of arbitrary power, the latter must be subordinated to impartial and well-defined principles of law; (c) with the : spec. in English law, the concept that the day-to-day exercise of executive power must conform to general principles as administered by the ordinary courts.

What's at issue here is b and (especially) c; and for b the earliest citation is

1885 A. V. DICEY Law of Constitution v. 172 When we say that the supremacy or the rule of law is a characteristic of the English constitution, we generally include under one expression at least three distinct though kindred conceptions. We mean, in the first place, that no man is punishable or can be made to suffer in body or goods except for a distinct breach of law established in the ordinary legal manner before the ordinary courts of the land.

It's embarrassing to have missed this.

In any case, both of these uses are explicitly said to require the.

This leaves me with two questions:

First, since the concept was a prominent one in 18th-century political philosophy, why does the phrase not appear until the late 19th century? What phrases were used instead? Where was Whorf for that century and a half?

Second, when did the phrase start being used without the definite article?]

January 13, 2006

No tattooed acronym

The hugely newsmaking exposé of Oprah-touted memoir-faker James Frey on The
Smoking Gun reports that he "wears the tattooed acronym
FTBSITTTD (Fuck The Bullshit It's Time To Throw Down)." He
does not. No matter what the mendacious and vomit-bespattered self-deaggrandizer may have tattooed on his body, or on what
part, FTBSITTTD is not an acronym. It's an abbreviation. It's
funny that people get this wrong. What the two terms have in common
is that they are composed of the initial letters of a phrase.
The difference is whether you can read out the initial letters
as if they were a word (as with AIDS, but not TB). Try
pronouncing [ftbsitttd] as a word if you like, but if your
tongue gets tangled into a knot, don't come complaining to me. Just get
your boyfriend to untangle it.

People that would do ourselves harm

I jumped when I heard it ten minutes ago
on NPR's "All Things Considered",
and turned to Google immediately to double-check transcripts.
And sure enough, during
his press conference with German chancellor Angela Merkel,
President Bush used a reflexive pronoun with no permissible
antecedent — the noun phrase it was co-referring with
was not in a position that the grammar allows.
I didn't mishear.
Reuters
has already quoted it:

"Guantanamo is a necessary part of protecting the American people. And so long as the war on terror goes on, and so long as there's a threat, we will inevitably need to hold people that would do ourselves harm."

That's totally ungrammatical, in all dialects.
Reflexive pronouns like ourselves
must (to put it roughly — there are some codicils)
have an antecedent earlier in the same clause, agreeing with it in person,
number, and gender. This isn't a subtle usage or style point. It
isn't a matter of dialect variation. Bush really does have a problem
with spontaneously uttering sentences that
respect the syntax of Standard English. He balks even on short ones.
I'm not generally one of
the picky-picky "Bushism" collectors, but even I sometimes have to wonder,
are our president learning?

Of course, the wires are burning up with people emailing me examples of
emphatic reflexives and telling me Bush might have been using one of those.
He wasn't. The example cited above isn't a possible context for an emphatic
reflexive, and he spoke it with no stress. He just forgot what the subject of
his clause was. So did the people who wrote these examples, sent to me by
Chris Culy:

So we put our collective heads together and came up with a moniker that does
ourselves proud. (http://www.cufsnorth.org/newslett1.htm)

I think labeling them as "evil" is a demonization which does ourselves a
disservice by adding a religious bias to our relationship with that country.
(http://68.166.163.242/cgi-bin/readart.cgi?ArtNum=7777)

I'm not buying it, Chris: sometimes people write down sentences that just
don't cut it grammatically. These are painfully ungrammatical.

What I will grant you, though, is that ourselves does sometimes
occur (stressed) with the meaning "us ourselves", as in the example you
sent me:

'I said come here, cell-sword! This concerns you as much as it does
ourselves', hissed the rogue, his rat-like face pulling itself into a
grimace of agitation.
(http://www.pulpanddagger.com/pulpmag/dark/cobra1.html)

Colbert immortalized again

Wordanista Michael Adams may no longer be "On Notice" over at the Colbert Report, though AP reporter Heather Clark is still shunned; but now another journalist has taken lexicographical notice of Stephen Colbert, and this time it's not for truthiness. Lane Greene, from economist.com, has nominated Colbert to be cited in the Eggcorn Database for copywrite, copywritten and copywrote.

Here's Lane's letter to Language Log, dated 1/11/2006:

Yesterday's post on the Colbert Report made me watch it last night [this refers to the show of Tuesday, 1/10/2006 - ed.] Not only did he return to "truthiness", but another linguistic item popped up. He noted to Carl Bernstein that "-gate" had become a common scandal suffix, and asked him something like "have you copywritten that?" He referred back to a word he'd used earlier, "sexageterrorists", and then said "I copywrote that." Obviously he meant "copyrighted" in both cases.

IP law aside (you copyright a work, not a single word; you can try to trademark a word), it's an interesting eggcorn candidate. Perhaps Colbert-the-character was engaging in another language joke, but instead it seemed to me that Colbert-the-actor, speaking quickly and without a script, might have actually made the mistake.

Some forms of "copywritten" might be an inflection of a phrasal verb like "to write copy", but most seem to be inflections of the eggcornic "copywrite". (Looking for "copywritten material", "copywritten music", etc. confirms this.) It's also an interesting candidate because most people wouldn't use "copywrite" in its noun form. I imagine that a lot of doubletakes would accompany a CD that carried the label "Copywrite 1998", but when people go for the participle, "copyrighted" seems wrong and they go for "copywritten".

There's already a citation in the Eggcorn Database for copywrite, entered by Chris Waigl on 2/25/2005, and the nominal form is reasonably common even on respectable journalistic sites -- searching Google News this morning for copywrite turns up 12 hits like this one:

CNN/money 1/7/2006: Sony (Research) CEO Howard Stringer brought out Hanks, the star of its new film the "Da Vinci Code," ... to talk about the importance of copywrite protection.

But Colbert surely deserves special mention, if only for using so many of the principal parts of the verb "copywrite", whether in jest or in earnest. And he shouldn't feel in any way disrespected, since here at Language Log we consider eggcorns to be a poetic form even more compact than the haiku.

The truthiness wars rage on

It was round two of Colbert vs. Adams Thursday night.

In our last installment, Comedy Central's mock-newsman Stephen Colbert put Michael Adams of North Carolina State University "on notice" for his quote in an AP article about the selection of truthiness as the American Dialect Society's 2005 Word of the Year. Colbert excoriated the Associated Press for its failure to recognize him as the source for the word, and Adams, who provided a Colbert-free definition for the article, ended up being one of the targets of his righteous indignation.

But Adams had the opportunity to fight back on Thursday's "Colbert Report," in a debate via telephone about the ownership rights to truthiness. While Colbert claimed to have invented the word, Adams pointed out that it already appears in the Oxford English Dictionary (first noted here, with the OED's 1824 citation, back in October — though to be totally truthy, the truthiness of 1824 simply meant 'truthfulness').

You know, by now everyone's aware of the conspiracy against me by the Associated Press. The American Dialect Society named truthiness as the Word of the Year. So far, so good.

But then the AP picked up the story and didn't even call me for a definition. They asked one "Dr." (in air quotes) Michael Adams, visiting associated professor at North Carolina State — which I think may be a made-up school. This "professor" told the AP that truthiness means, quote, "truthy, not facty," earning him a place on my "Notice" board.

Anyway, this Adams guy now claims he can explain himself. And since I am nothing if not generous to those I have crushed, I talked to Dr. Adams by phone earlier today. Here's how it went.

(Beginning of taped interview.)

COLBERT: Hello.

ADAMS: Hello.

COLBERT: Hello, is this Dr. Michael Adams?

ADAMS: Yes it is.

COLBERT: Uh, Dr. Adams, this is Stephen Colbert from "The Colbert Report." Are you familiar with the show?

ADAMS: Ummm, no.

COLBERT: OK, are you the same Dr. Adams who took it upon himself to define truthiness to the Associated Press last week?

ADAMS: Uh, yes I am.

COLBERT: Um, sir, where do you get off defining a word that I made up?

ADAMS: You didn't make it up. It's in the dictionary.

COLBERT: What dictionary?

ADAMS: It's in the Oxford English Dictionary.

COLBERT: OK, stop right there. I pulled that word right out of where the sun don't shine on October 17th.

ADAMS: Umm...

COLBERT: All right, you are aware that you are "on notice" and this phone call's not helping. Do you understand the implications of that?

ADAMS: Um, no, I don't understand the implications.

COLBERT: Well, they're many.

ADAMS: How do I get off notice?

COLBERT: You could apologize.

ADAMS: Apologize for...?

COLBERT: I accept! Thank you! It takes a big man to admit he was wrong.

ADAMS: I didn't apologize.

COLBERT: Too late, I forgive you. Good day!

ADAMS: But...

COLBERT: I said good day, sir!

(End of taped interview.)

You hear that, Associated Press? I am standing by for your formal apology. And that means engraved.
Good night, citizens. We'll see you tomorrow.

(Just in case anyone was wondering, that was indeed the voice of Michael Adams, though the nerdish "visual approximation" they used bears no resemblance.)

[Update: Colbert and Adams also face off in a new Associated Press article, this time by entertainment writer Jake Coyle. The OED entry for truthy and its derived form truthiness comes up again, but Colbert counters this lexicographical reproach in expected fashion:

"The fact that they looked it up in a book just shows that they don't get the idea of truthiness at all," Colbert said Thursday. "You don't look up truthiness in a book, you look it up in your gut." ]

The evolution of "birdflu"

(For now the mutation may be limited to isolated cases, such as the Reuters headline writers who have been trying it out since at least July. But look for the compact single-word form to catch on as the virus — or at least hysteria about the virus — continues to spread.)

The [sic]ing of the President

In November,
when the White House Press Office sought to change transcripts of a briefing by Scott McClellan (who thought that it was either "accurate" or "not accurate" that Karl Rove and Scooter Libby were known
to have had conversations about Valerie Plame), liberal bloggers were
quick to invoke the usual dystopic Orwellian imagery. Though that
suspicious incident has still not been fully explained, I continue to
give the White House transcribers the benefit of the doubt, since the official transcripts rarely give
the appearance of being "cleaned up," even to correct trivial (but
potentially embarrassing) slips of the tongue. Two examples of
transcript problems involving President Bush this week put this idea to the
test.

One possible case of transcript-cleansing occurred on Monday, on the
occasion of Bush's appearance with Judge Samuel Alito before
his confirmation hearing. Eric Pfeiffer on Wonkette
reported that this sentence appeared in an early transcript emailed to
the White House press pool:

Sam Alito is imminently qualified to be a
member of the bench.

It took about fifteen minutes for the White House Press Office to
catch
this and email out a correction informing the press corps that Bush
actually said:

Sam Alito is eminently qualified to be a member
of the bench.

This version is also what went into the official White House transcript.
But the damage had been done, as Pfeiffer and several other bloggers took
the opportunity to ridicule Bush's supposed implication that Alito is
not quite qualified but should be soon. (News organizations were
roughly split on the matter: a Google News search currently finds 31 appearances
of "eminently"
and 39 appearances of "imminently."
This includes two
comments on the transcript correction itself, one from Wonkette and
one from Townhall.com.)

In the video
accompanying the official transcript, one can clearly hear Bush say
['ɪmɪnəntli] rather than ['ɛmɪnəntli]. But how do we know that this was a malapropistic gaffe as the bloggers imply, rather than simply an example of the pin-pen
merger? The merger of /ɪ/ and /ɛ/ before nasals, typically with /ɛ/
raising to the position of [ɪ], is a dialectal
feature encompassing most of
Texas, and Bush identifies himself as a native Texan. (His family moved to
Texas when he was two, though it's often claimed that his
Texan accent is a relatively new phenomenon and lacks authenticity.) I
haven't made an exhaustive study, but I believe Bush frequently
exhibits the
pin-pen merger (especially
when he's in a folksy Texan mode), though the feature is not always evident. When he called Harriet Miers "eminently qualified" (whoops!) on Oct.
4, the audio suggests that he raised the initial vowel to [ɪ]. But
when he said it
again about Miers on Oct.
12, the word sounded more like ['ɛmɪnəntli]. And on Nov.
6, when he wanted to make it "eminently clear that the United
States is a friend of Brazil" the initial vowel again seemed to be in the neighborhood of [ɪ] (though the audio is not entirely clear).

A transcriber lacking the pin-pen
merger might misconstrue Bush's pronunciation of "eminently" as
['ɪmɪnəntli], and this appears to be what happened when the initial
transcript of Monday's comments was released with "imminently." The
later correction by the White House Press Office might not have been
remarked upon in another presidency, but since it has become such a
sport to lampoon Bush's disfluencies,
this simply added more fuel to the fire. Despite the clumsy handling of
the correction, I don't think Bush's use of ['ɪmɪnəntli] is necessarily
proof of a lexical confusion between "imminently" and "eminently." Of
course, such a confusion could be encouraged by the existence of the pin-pen merger, since the two words
would become homonymous. But this homonymy also means that we have no
way of knowing if a speaker with the merger has the lexical confusion
solely based on spoken evidence. (The confusion would be easier to spot
in written form, but in that case we'll just have to wait until Bush's
presidential papers are released for verification.)
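
The homonymy point can be made concrete with a toy sketch. The Python below is only an illustration under stated assumptions: the function name `pin_pen_merge`, the simplified segment lists (stress marks left out), and the three-member set of nasals are all invented for this example and are not taken from any phonological toolkit or from Bush's actual speech data. It applies the merger as a single rewrite rule, raising /ɛ/ to [ɪ] when a nasal follows, and checks which word pairs come out identical.

```python
# Toy model of the pin-pen merger: /ɛ/ raises to [ɪ] before a nasal.
# Segment lists are simplified, hypothetical transcriptions (no stress marks).

NASALS = {"m", "n", "ŋ"}


def pin_pen_merge(segments):
    """Return a copy of `segments` with /ɛ/ raised to [ɪ] before a nasal."""
    merged = []
    for i, seg in enumerate(segments):
        next_seg = segments[i + 1] if i + 1 < len(segments) else None
        if seg == "ɛ" and next_seg in NASALS:
            merged.append("ɪ")
        else:
            merged.append(seg)
    return merged


EMINENTLY = ["ɛ", "m", "ɪ", "n", "ə", "n", "t", "l", "i"]
IMMINENTLY = ["ɪ", "m", "ɪ", "n", "ə", "n", "t", "l", "i"]
PEN = ["p", "ɛ", "n"]
PIN = ["p", "ɪ", "n"]
PET = ["p", "ɛ", "t"]
PIT = ["p", "ɪ", "t"]

if __name__ == "__main__":
    for name, a, b in [
        ("eminently/imminently", EMINENTLY, IMMINENTLY),
        ("pen/pin", PEN, PIN),
        ("pet/pit", PET, PIT),
    ]:
        same = pin_pen_merge(a) == pin_pen_merge(b)
        print(name, "homophonous after merger:", same)
```

For a speaker modeled this way, "eminently" and "imminently" collapse into the same string of segments, just as "pen" and "pin" do, while "pet" and "pit" stay distinct because no nasal follows the vowel. That is why the audio alone cannot tell us which word such a speaker intended.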

Let's see how the White House transcribers deal with a more obvious
slip of the tongue from Bush. On Tuesday, in his address to the
Veterans of Foreign Wars, Bush uttered this unfortunate remark:

You took an oath to defend our flag and our
freedom, and you kept that
oath underseas and under fire.

The sentence as constructed by Bush's speechwriters has multiple
parallel structures, such as "took an oath...kept an
oath" and "our flag and our freedom." But Bush flubbed the final
parallel structure of "overseas and under fire" by overextending the
parallelism to "underseas and under fire," thus committing a kind of anticipatory
assimilation. Again, Eric Pfeiffer on Wonkette
and other bloggers were quick to snicker. (Jacob Weisberg must have
been busy,
since neither of these has appeared on his list of Bushisms yet.) Fair enough, a
clear gaffe. But how does the official transcript
read?

You took an oath to defend our flag and our
freedom, and you kept that
oath underseas [sic] and under fire.

No coverup here. Indeed, the White House transcribers have no
problem deploying a well-placed [sic], as in these recent examples from
President Bush:

12/4/05:
In his capacity to grow and to excel as an artist, Robert Redford has
shown very few limitations. In 1980, he decided to try working behind
the camera. The result was "Ordinary People," and it won him the Oscar
for best actor [sic].

12/7/05:
An Iraqi battalion has consumed [sic] control of the former American
military base, and our forces are now about 40 minutes outside the
city.

1/6/06:
I can't imagine a tax code that penalizes marriage. It seems like to
me we ought to be encouraging marriage to [sic] our tax code.

In the first example, Bush commits a factual error: Kennedy Center
honoree Robert Redford won the Oscar for best director, not best actor,
in 1980. The next example is an assimilatory slip like "underseas and
under fire"; in this case, "assumed control" is transformed into
"consumed control." The third [sic] flags a clumsy prepositional usage,
since we can assume Bush is in favor of encouraging marriage through, not to, our tax code.

In fact, Bush got [sic]ed twice in the same speech on Monday at a
Maryland elementary school,
commemorating the fourth anniversary of the signing of the No Child
Left Behind Act. From the transcript:

And as I mentioned, there was a lot of
non-partisan cooperation --
kind
of a rare thing in Washington. But it made sense when it come [sic] to
public schools.

Laura and I's [sic] spirits are uplifted any time we go to a school
that's working, because we understand the importance of public
education in the future of our country.

First, "it come..." is given the [sic] treatment, evidently Bush's
latest stumble over agreement
in number. (Perhaps Bush was wavering between "came" and "come(s)"
since the main clause uses a past-tense construction, "it made
sense.") It would have been easy enough to change the transcript
to read "it comes" without anyone noticing, but the transcriber
remained meticulous. In the second case, Bush takes a common route for
dealing with a coordinate possessive structure in which the last item
is a pronoun. English is notoriously vexing when
it comes to such structures, and Bush's solution of "Laura and I's
spirits" may actually be a slight improvement for some speakers over
the putatively standard but no less awkward "Laura's and my spirits."
Nonetheless, it too gets the [sic].

Could it be that the transcriber is making a point of correcting
Bush, particularly in a speech about education? (During the speech Bush
made one of his usual self-effacing remarks about his own disfluencies: "I can remember [Laura] reading to our little girls all the time. Occasionally, I did, too, but stumbled over a few of the words and might have confused
them.") One blogger seemed to think so, commenting, "Hell, even the transcript
guy is marking Bush down."

But President Bush isn't the only one who gets [sic]ed by the White
House transcription team. For the month of December, I found two [sic]s
for Vice President Cheney and three for Scott McClellan:

Cheney, 12/6/05:
One unit of the 40 (sic) I.D., the "Fighting 69th" from New York City,
showed its toughness in confronting insurgents around Baghdad.

Cheney, 12/20/05:
I don't believe for a minute that the vast majority of Americans are
prepared to accept defeat, to retreat in the face of terror, to turn
over Iran (sic) or Afghanistan to the likes of Osama bin Laden.

McClellan, 12/12/05:
And it's important that all of us, not only in the coalition, but the
entire international community and the Arab world, stand behind the
Iraqi people during this time of transition to a peaceful and
democratic future, because the Iraqi people have shown through their
courage and determination that they want a freedom of future [sic].

McClellan, 12/14/05:
Then the members were able to hear from Ambassador Khalilzad, who was
on with General Casey from Baghdad, video conference. And General
Khalilzad [sic] gave an update on the elections and talked about how
there are more than 300 political parties that are participating in the
elections.

McClellan, 12/16/05:
I think these are difficult issues that you have to address in a
post-September 11th world. Some people go back to a post-9/11 [sic]
mind-set now that we're four years after the attacks of September 11th.

Most of these misspeaks appear to be factual errors (besides
McClellan's odd invocation of "a freedom of future") and thus are obvious candidates for [sic]ing. By contrast, Bush gets [sic]ed not just on errors of fact but also on seemingly minor grammatical lapses. All in all, the transcribers at the White House seem at pains to demonstrate that they are not, in fact, sanitizing any potential embarrassments in the public comments of Bush and other officials. This is good to know, especially for those of us in the reality-based community.

I've been sporadically noting the whitehouse.gov [sic] phenomenon, which I personally attribute to the very understandable annoyance of someone assigned to transcribe the speeches of George W. Bush. They're payback sics. More:

"Our journey from national independence to equal injustice [sic] included the enslavement of millions, and a four-year civil war." (blog link)

January 11, 2006

Trying to talk alike and not succeeding

There were a lot of great talks at the LSA annual meeting in Albuquerque, and I wish I had the time to tell you about them. But for now, I'll dash off a note on one presentation, because it included a quote that caught my eye and my inward ear.

Alexandra Jaffe spoke on the topic "Transcription in Sociolinguistics: Nonstandard Orthography, Variation and Discourse". She started with her own work on the "polynomic" orthography of Corsican, where
"variation in spelling is understood to be a systematic representation of coherent linguistic systems (regional dialects of Corsican)". In contrast, she observed, we Americans most often use respelling to index stigmatized dialects. This effect is especially striking when the respelling represents ubiquitous, pan-dialectal pronunciations, like "wuz" for was, "hist'ry" for history, or "subjecks" for subjects.

Jaffe described an experiment by Jennifer Nguyen that brought this out clearly ("Transcription as Methodology: Using Transcription Tasks to Assess Language Attitudes", NWAV 32). Jaffe's summary:

Novice transcribers in Michigan listened to two speakers with accents that were different from their own: one stigmatized (Appalachian English) and one not (British English). They were given instructions to transcribe them in such a way that anyone reading their transcriptions would "get the same impression of the speaker that the participants got listening to the samples" and were told that they could represent speakers in any way they wanted, that dictionary spellings were not required. Nguyen found that the percentage of respellings in these novice transcripts was significantly higher for Appalachian vs. British English...

The quote that caught my attention in Jaffe's handout was a passage written by a Glaswegian poet, Tom Leonard:

Yi write doon a wurd, nyi sayti yirsell, that's no thi way a say it. Nif yi tryti write it doon thi way yi say it, yi end up wit hi page covered in letters stuck thigither, nwee dots above hof thi letters, in fact yi end up wi wanna they thingz yi needti huv took a course in phonetics ti be able ti read. But that's no thi way a think, as if ad took a course in phonetics. A doan't mean that emdy that's done phonetics canny think right—it's no a questiona right or wrong. But ifyi write down "doon" wan minute, nwrite doon "down" thi nixt, people say yir beein inconsistent. But ifyi sayti sumdy, "Whaira yi afti?" nthey say, "Whut?" nyou say "Where are you off to?" they don't say, "That's no whutyi said thi furst time." They'll probably say sumhm like, "Doon thi road!" anif you say, "What?" they usually say "Down the road!" the second time—though no always. Course, they never really say, "Doon thi road" or "Down the road!" at all. Least, they never say it the way it's spelt. Coz it izny spelt, when they say it, is it?

[quoted in Ronald Macaulay, "Coz it izny spelt when they say it: Displaying dialect in writing". American Speech 66(3): 280-291.]

In fact, I think there's an important sense in which Leonard's last point is wrong. Because human speech has what Hockett called "duality of patterning", it's fair to say that it is spelt, when they say it. Maybe not spelt spelt, but still, in some sense, spelt...

Jaffe missed the chance to cite Mark Twain's well-known "Explanatory" from the start of Huckleberry Finn, where he takes a much more positive and self-confident line on respellings.

In this book a number of dialects are used, to wit: the Missouri negro dialect; the extremest form of the backwoods Southwestern dialect; the ordinary "Pike County" dialect; and four modified varieties of this last. The shadings have not been done in a haphazard fashion, or by guesswork; but painstakingly, and with the trustworthy guidance and support of personal familiarity with these several forms of speech.

I make this explanation for the reason that without it many readers would suppose that all these characters were trying to talk alike and not succeeding.

It's interesting to see how Twain uses such normal and invariable respellings as "wuz", which appear to do nothing more than represent the ubiquitous pronunciation of words whose spelling is phonetically irregular. He mostly reserves "wuz" for Jim, representing the "Missouri negro dialect" -- though sometimes he has Jim say "'uz" for was, even in the same sentence as "wuz":

Twain distinguishes Jim's rendition of and as "en" from Mrs. Hotchkiss' rendition of the same word as "'n'". I wonder whether (and especially how) this really reflects the sociolinguistic facts of the time? Anyhow, his usage supports the idea that this sort of respelling is used to index stigmatized dialects of various sorts. However, it also underlines the fact that this connection is by now a highly conventionalized one, not something that is invented anew by each transcriber.

Time was that alternative spellings in English meant -- as far as I can tell -- absolutely nothing at all. According to the Textbase of Early Tudor English, John Skelton (1460-1529), "poete laureate in the unyversite of Oxenforde" and also poet-laureate to Henry VIII, spelled should in his poems as "shold", "sholde", "should", "shoulde", "shuld", "shulde", and "xuld". In the first of his poems in the LION database, "An Elegy on Henry Fourth Earl of Northumberland", Skelton uses two of these spellings in one line:

41 What shuld I flatter? what shulde I glose or paynt?

and at least one other spelling a few lines later:

67 To the right of his prince which shold not be withstand;

He manages to spell one three-word phrase in two completely different ways, within the space of 48 lines:

130 Of this lordis dethe and of his murdrynge.

178 Thys lords death, whose pere is hard to fynd

Proper names are not spared:

43 In Englande and Fraunce, which gretly was redouted;

179 Allgyf Englond and Fraunce were thorow saught.

I wonder, in which contemporary orthographies is this sort of catch-as-catch-can spelling used? One that I've encountered personally is Somali; but in that case, the orthography is only a few decades old, and the educational system that promulgated it has been defunct for much of that time.

[Update: Gene Buckley points out that the spelling "boyz" has become a conventional orthographic index of AAVE, although the voicing of plural /s/ after vowels has been normal in most variants of English for centuries.

I've often wondered whether Twain's "wuz" is properly understood as
eye-dialect (i.e., a mere respelling indexical of the quoted speaker's
low status, education, etc.) or as a pronunciation spelling indicating
a real dialectal difference. It's possible it could have been the
latter when used by Twain or other keen-eared 19th c. writers if, for
instance, "was" had a standard pronunciation with an open back rounded
vowel (IPA turned script-a, as in the British pronunciation given by
the OED), while "wuz" represented a once-nonstandard (now standard)
Amer. pronunciation with an open mid back unrounded vowel (IPA wedge).
I don't have any evidence for this shift in the pronunciation of
"was", but it's something to consider.

There's one small and indirect piece of support for this view in the quotes that I gave. At least judging from contemporary BBC pronunciations, the vowel in was will in any case be reduced to a schwa/wedge sort of quality except where the word is emphasized ("she *was* there") or phrase-final ("so it was"). In the quote from Mrs. Hotchkiss, the first "wuz" is given emphasis with italics, as well as by the sense of the passage. And in the quote from Jim that I happened to pick, the was spelled "wuz" is arguably emphasized, while the one spelled "'uz" is reduced. However, there are plenty of cases where "wuz" is used to render Jim's speech with no basis for assuming any emphasis, e.g.

Well, I wuz dah all night. Dey wuz somebody roun' all de time.

And Huck's narrative voice never uses "wuz", although he shows other non-standard features ("There was things which he stretched, but mainly he told the truth") and other eye dialect spellings like "di'monds" and "s'pose". Nor is it used in the quoted speech of Tom Sawyer ("Well, Ben Rogers, if I was as ignorant as you I wouldn't let on"), though again Tom is rendered with some spellings like "A-rabs". Likewise Huck's father is given plenty of indices of non-standard speech, like "afeard", "ain't" and double negatives, but all of his examples of was are spelled "was" ("There's a hand that was the hand of a hog; but it ain't so no more; it's the hand of a man that's started in on a new life, and'll die before he'll go back.")

Anyhow, I guess it's possible that in Twain's youth, "the ordinary Pike County dialect" and its "four modified varieties" all had [wɒz] or [wʊz], while the "Missouri negro dialect" had [wʌz] or [wəz]. I'll ask someone who knows about the history of American speech patterns.

Stupid machine-generated spiritual blather

The site established by the Devi Press (find them if you want at
http://www.devipress.com/, but I am damned if I am going to
give them a link) for the purpose of advertising its books on Christian,
gnostic, and mystical topics has a set of pages containing 1,185 (one thousand,
one hundred eighty-five) articles
on religious topics, each with an accompanying link to a page advertising
a book called The Mystic Christ. The article titles, indexed in
ASCII order, run from "1 John God Is Love" and "A Love Sent From God
Above" down to "Youth Group Devotions" and "Zohar Kabbalah". And a
typical piece of prose from one of them looks like this:

Abounding opposites present a few pages but he was carried out of members
who went to our conversation ends where it really fulminating on
ecumenically united states submitted as well i daniel had died of morality
and a final analysis be fighting with which are by subject browse for an
unbearable. The fourth year of heaven on a thousand strong and he had the
christians about. We will of your policy who are his entrance into
sticking it was called the only your sects so that will take the whole
world or unpopular they do you?

That's right. Every single article was generated by an extremely
crude random text-generation algorithm. (Fantastically crude. Computational
linguists can do a lot better than this. Heck, a trained trunk
monkey could do better.) The articles even bear a notice saying
(lest the program actually write something intelligible)
"DISCLAIMER: The text for this article was generated automatically by a
computer. As such, nothing in this article should be construed as a
statement of fact or as the opinion of the maintainers of this site."
And each article has ten links to others on the list. The entire fraudulent
assemblage is just an exercise in Google-bombing: Devi Press is trying to raise
its Google ranking by having more than a thousand pages that link to ads for
its crappy books, each of those pages being the target of links by multiple other articles.

Would you like me to begin my rant now?

<RANT>What I'm objecting to is not that this crap is
religious drivel. It's that it's dishonest drivel. It's an illicit attempt
to get advertising space (in the form of appearances on Google search
results lists) that other people ultimately pay for. It's like having thousands
of huge styrofoam cubes with your company's name on them delivered to a
public landfill so that others will see them (only that doesn't happen because
neither styrofoam nor landfill space is free).

The poor Google
corporation is buying new CPUs and disk drives every day as it tries to
keep up with the growth of the web, and every byte of this asyntactic
Christian-gnostic-mystical garbage, this useless verbal waste,
has to be stored in some huge refrigerated data barn somewhere and
indexed and searched every single day. Every legitimate site and every
genuine shopper (and — declaration of interest — every
honest syntactician trying to explore language using
Google as a corpus) is being slowed down (at least a tiny bit), and
sometimes baffled
and misled, by the totally fake pseudo-text these venal morons are
stashing on their server for the sole purpose of masking the fact that
nobody is very interested in their boring useless crappy books,
and it makes me mad, OK? As
Stephen Colbert would say, Devi Press, you're dead to me.
You're on my
Dead To Me
board. All right, I'm done.</RANT>

January 10, 2006

Colbert fights for truthiness

On Friday
the American Dialect Society chose as its 2005 Word
of the Year Stephen Colbert's sublimely silly neologism truthiness. In a post submitted that night from the ADS/LSA meetings in Albuquerque, I surmised that the
initial Associated
Press coverage of the voting, which didn't even mention Colbert,
would "serve as more fodder for Colbert's put-upon persona of perpetual
outrage."

Well, "The
Colbert Report" returned to Comedy Central from an extended break
on Monday night, and sure enough Colbert was in high (faux) dudgeon. At the end of the show he
called out not only AP reporter Heather Clark, but also wordanista
Michael Adams
(author of the excellent Slayer
Slang), who happened to be the ADS member that Clark buttonholed
for a quick definition of truthiness. Colbert even dug up Adams' academic title and course information at North Carolina State University, in homage to the over-the-top ad hominem attacks perfected by the likes of Bill O'Reilly. At Language Log Plaza, our hearts go out to Adams, the blameless victim of a pseudo-anchor's pseudo-wrath.

A transcript of the segment follows. [Update: A video clip is now available from Comedy Central. It can also be viewed here.]

Before we go, I want to say something about the
first "Word" from the
first ever broadcast of this show. Jimmy, roll the tape.

(Video from first show: "Truthiness.
Now I'm sure some of the Word Police, the wordanistas over at
Webster's, are gonna say, 'Hey, that's not a word.'")

Turns out I underestimated those wordanistas. On Friday the American
Dialect Society chose truthiness
as the 2005 Word of the Year (applause),
beating words like podcast
and Katrinagate. We kicked
their asses. And I've never been so honored and insulted at the same
time.

You see the Associated Press article announcing this prestigious award,
written by one Heather Clark, had a glaring omission: me. I'm not
mentioned, despite the fact that truthiness
is a word I pulled right out of my keister. Instead of coming to me,
here's where Ms. Clark got the definition.

Quote: Michael Adams, a professor at North Carolina State University
who specializes in lexicology, said (subquote) "truthiness" means
"truthy, not facty."

First of all, I looked him up. He's not a professor, he's a visiting
associate professor. And second, it means a lot more than that,
Michael. I don't know what you're getting taught over there in English
201 and 324 over at Tompkins Hall, Wolfpack. But it isn't truthiness.

You know what? Bring out the board, bring out the board. (Stagehand brings out the "On Notice"
board, with entries including "Black hole at center of galaxy," "E
Street Band," "grizzly
bears," "Bob Woodruff," "the Toronto Raptors," "The British Empire,"
"business
casual," and "Barbra Streisand.")

But the real culprit here is so-called reporter Heather Clark. This is
her sleaziest piece of yellow journalism since "New Mexico Poll
Watchers See Smooth Election Day." Now I already tore her a new one for
that. Heather Clark, you are dead to me.

Get ready, Heather. Get ready, brace yourself. (Colbert adds card for "Heather Clark" to the board.)
How does that feel? Does
that sting? Now that you're dead to me, you're gonna wish you were
never born.

I'm sorry you had to see that, nation. But in the interest of
truthiness, it had to be done. Good night.

[Update #1: Adams has been enshrined on the Wikipedia page for "The Colbert Report," in a section now moved to the rapidly expanding entry for truthiness.]

[Update #2: Adam Green of the Huffington Post suggests that defenders of truthiness should ask Heather Clark to correct the record, even supplying her email address. To be fair, she did file a later wire story that credited Colbert, albeit indirectly. (Yet another iteration of Clark's story gives Colbert direct credit.)]

[Update #3: Steve Kleinedler recommends this column for anyone who is still puzzled by the concept of truthiness.]

January 09, 2006

Nias, Komodo, and "Kong"

I have yet to find three hours to devote to Peter Jackson's remake
of King Kong, but I did catch
the
original 100-minute version on Turner Classic Movies over the holidays.
I hadn't seen it in its entirety since I was a kid, but now I can
see why Jackson has said it was the movie that inspired him to become a
filmmaker. It's an extremely appealing adventure tale, despite the
now-quaint
special effects, occasionally clunky storytelling, and typical
Hollywood exoticization of "primitive" lands.

Since one of my areas of research is Indonesia, my ears perked up
when Carl Denham, the leader of the expedition, shows Captain
Englehorn their destination on a chart, saying it is "way west of
Sumatra." Englehorn then tells Denham, "I know the East Indies like my
own hand, but I was never here." My interest was further piqued by the captain's early suspicion that "Kong" was "some Malay superstition, a
god or a spirit or something." When they finally arrive at Skull
Island, Englehorn says the speech of the natives "sounds something like
the language the Nias Islanders speak."

Nias is an island
off the west coast of northern Sumatra, most recently in the news for
the heartbreaking devastation wrought by the one-two punch of the Dec. 2004 tsunami
and the less-reported earthquake
of Mar. 2005. The first language of most of the island's estimated
600,000 inhabitants is also called Nias
(known locally as "Li Niha") and is related to the Batak
languages of northern Sumatra and more distantly to Malay and other
languages on the Sundic
branch of the Austronesian family tree.

The film's depiction of the Skull Islanders is notoriously racist,
with mostly African-American actors enlisted to prance around like generic
savages, but I thought the specific references to Sumatra and Nias
could mean that their linguistic interaction with Captain Englehorn
might carry a shred of verisimilitude. From what I could catch,
there was only the tiniest shred. When the chief makes an offer to trade six
of his women for Ann Darrow (as a "gift for Kong"), Englehorn declines
by saying "Tida, tida!" That seems to be modeled on Malay-Indonesian tidak /tidaʔ/,
meaning 'no, not.' Also, when Englehorn buys time by telling the chief
that they'll
come back tomorrow, he says "dulu," which in Malay can mean 'for the
time being' (as in tunggu dulu
/tuŋgu dulu/ 'wait for now'). Other than that, nothing in the
exchange between the chief and Englehorn sounds much like Malay or related
languages.

But should we expect the dialogue to be anything but gibberish? A
recent article
by Kenneth Turan in the Los Angeles Times looking back on the original Kong suggests otherwise:

To understand the 1933 version's success, you
have to start with how
close two of its key characters, director Denham (the irresistibly
intense Robert Armstrong) and cameraman Jack Driscoll (Bruce Cabot),
were to producer-directors [Merian C.] Cooper and [Ernest B.]
Schoedsack. In fact, as related
in Orville Goldner and George E. Turner's "The Making of King Kong,"
when Cooper hired his wife, tyro writer Ruth Rose, to do the final
polish on the "Kong" script, he told her flatly, "Put us in it ... Give
it the spirit of a real Cooper-Schoedsack expedition."

For with Cooper as the driving visionary and Schoedsack as the
unflappable director-cameraman, these two were adventurers before they
were filmmakers. As related in a new Cooper biography, "Living
Dangerously" by Mark Cotta Vaz, the two had made a pair of successful
ethnographic documentaries in faraway places — "Grass" in what was then
Persia, "Chang" in Siam — that fully lived up to Cooper's celebrated
determination to keep his films "distant, difficult and dangerous."

In fact, when Denham complains that critics are always bemoaning
the lack of a love interest in his films, he's echoing what was
actually said about the Cooper-Schoedsack films. And the language Rose
created for the natives of Skull Island was based on the idiom of the
Nias Islanders, near Sumatra, whom she and Cooper had visited. Fearful
that disguised indecent language might sneak on-screen, the Production
Code Administration reportedly insisted on a translation of all Skull
Island dialogue before giving the film its approval.

I thought I'd look for this supposedly Nias-based dialogue online,
and I
found what purports to be a draft of the screenplay on Val
Lewton's Whiskey
Loose Tongue website. Sure enough, the Skullese dialogue is
provided with "translations," presumably for the skittish Production
Code
Administration. A sampling:

Chief:

Bado! Maka mini
tau ansaro.

(Wait! Two
warriors come with me.)

Watu! Tama di?
Tama di?

(Stop! Who are
you? Who are you?)

Englehorn:

Tabe! Bala kum
nono hi. Bala! Bala!

(Greeting! We
are your friends. Friends! Friends!)

Chief:

Bala reri!
Tasko! Tasko!

(We don't want
friends, Go! Get out!)

Englehorn:

Vana di humya?
Malem ani humya vana?

(What are you
doing? What is that woman doing?)

Chief:

Ani saba Kong!

(She is the
bride of Kong!)

Sita! Malem!
Malem ma pakeno!

(Look! The
woman! The woman of gold!)

Malem ma
pakeno! Kong wa bisa! Kow bisa para Kong!

(The woman of
gold! Kong's gift! A gift for Kong.)

Dama, tebo
malem na hi?

(Strangers,
sell woman to us?)

Sani sita malem
ati - kow dia malem ma pakeno.

(I will give
six women like this for your woman of gold.)

Englehorn:

Tida, tida!
Malem ati rota na hi.

(No, no! Our
woman stays with us.)

Dulu hi tego.
Bala. Dulu.

(Tomorrow we
come. Friends. Tomorrow.)

Most of the dialogue and "translation" accords with the 1932 novelization
of the script adapted by Delos Lovelace (searchable on Amazon).

The reader is welcome to search for any vague correspondences
between
the screenplay and this Nias word
list prepared by Robert Blust for the Austronesian Basic
Vocabulary Database. (Those familiar with Indonesian can also
consult the Nias-Indonesian dictionary maintained here.)
Suffice it to say, whatever Ruth Rose Schoedsack used as the basis for
Skullese, it surely wasn't Nias or any other related language. The word
for "woman" in Nias is a-lawe,
not malem; "who" is ha, not tama; "you (plural)" is yaʔami, not di; "we (exclusive)" is yaʔaga, not hi; "come" is möi, not tego. The Nias-Indonesian
dictionary supplies some more examples: "six" is önö,
not dia; "gold" is anaʔa, not pakeno; "tomorrow" is mahemolu, not dulu (hey, at least the final
syllable for that one is right!).

I picked up the new biography of Merian C. Cooper, Living Dangerously by Mark
Cotta Vaz, to see if there was any mention of Cooper or the Schoedsacks
going to Nias Island. There is a brief account of Cooper passing through the Toba Batak region of Sumatra on a round-the-world expedition with explorer Edward Salisbury, and another part describes the Schoedsacks' trip to Aceh on Sumatra's northernmost tip to shoot the orangutan movie Rango. But the only time Nias comes up is later in the book when stop-motion
animator and hardcore Kong
fan Ray Harryhausen
recalls going to the island with his wife in search of the model for Skull
Island:

Well, Nias Island actually exists, although
they're not black but Asian
people, and we thought we'd go there. We arrived early in the morning,
in the fog, but there was no skull, no ancient wall. I stepped out on
the pier and there was this native guy and I thought I'd try out Ruth
Rose's language and I said, "Bala,
bala Kong nna hee." And the native put his hands on his hips and
said, "What are you talking
about?" (Vaz, p. 407)

It turns out another Indonesian island probably had more of an
influence on the making of Kong:
Komodo, one of the Lesser Sunda Islands (which also include Flores,
Sumba, and Timor). When Cooper was first formulating Kong in 1929-30, he contacted
another adventurer named W. Douglas Burden. In
1926 Burden had led an expedition sponsored by the American Museum of
Natural History to bring the first live Komodo dragons to the West.
The following year Burden wrote a book about his expedition, Dragon Lizards of Komodo,
describing how he and herpetologist F.J. Defosse were enchanted by the "lost world" of Komodo. Before leaving, Defosse
told Burden, "I would like to bring my whole family
and settle here, and be King of Komodo."

Cooper was inspired by Burden's story of "primeval monsters"
on a faraway island and his description of how the creatures' spirits
were broken once they were taken back to New York in captivity.
(The two live Komodo dragons were brought to the Bronx Zoo and quickly
died there.) Here is an excerpt from a 1964 letter from Burden to
Cooper reminiscing about their conversations:

I remember, for example, that you were quite
intrigued by my
description of prehistoric Komodo Island and the dragon lizards that
inhabited it. ... You especially liked the strength of words beginning
with 'K,' such as Kodak, Kodiak Island, and Komodo. It was then, I
believe, that you came up with the idea of Kong as a possible title for
a gorilla picture. I told you that I liked very much the ring of the
word...and I believe that it was a combination of the King of Komodo
phrase in my book and your invention of the name Kong that led to the
title you used much later on, King
Kong. (Vaz, p. 193)

In response to Burden's letter, Cooper wrote, "Everything you say is
right on the nose." He did add that he conceived of a "Giant
Gorilla" story before reading Dragon
Lizards of Komodo, which reminded
Cooper of his own expedition to the Andaman Islands and the giant
lizards he saw there. But at least we know where the K in Kong came from!

January 08, 2006

New swords for old

"It's a double-edged sword, being known as DeLay Inc.," said one Republican lobbyist. "They are on the sharp edge of the sword now."

Swords are not part of Americans' everyday experience these days, and so the sword-related metaphors that we've inherited from earlier times are open for creative re-interpretation. A double-edged blade is traditionally one that has two cutting edges, and being sharp on both sides, can "cut both ways". This can make a double-edged sword or knife dangerous to its user. Instead, the anonymous lobbyist has taken the expression to refer to the now-standard kind of blade, with one sharp edge and one dull one, and rekeyed the metaphor to the contrast between two different sides, not the symmetry of two similar ones. The new interpretation is like the familiar use of "two-sided coin", where the whole point is that the two sides are different -- see this article headlined "the two-sided coin of PA credentialing" for an example.

We've previously noted that this sort of thing has been happening to terminology associated with the harnessing of animals: "reigns of power", "unbrided fury", "yolked to the coloniser". In those cases, though, the result was an eggcorn; here it's just a new interpretation of an old expression. And unlike the various new interpretations of "beg the question", this new interpretation happens to mean essentially the same thing as the old one.

January 06, 2006

The wordanistas have spoken

Back in October, when Comedy Central's Stephen Colbert kicked off his faux-news show The Colbert Report, he promoted a new word that nailed the malleability of "truth" in today's mediascape. His word was truthiness, and he used his blustery O'Reillyesque persona to launch a preemptive strike against naysaying "wordanistas":

Now I'm sure some of the Word Police, the wordanistas over at Webster's, are gonna say, "Hey, that's not a word." Well, anybody who
knows me knows that I'm no fan of dictionaries or reference books.
They're elitist. Constantly telling us what is or isn't true, or what
did or didn't happen. Who's Britannica to tell me the Panama Canal was
finished in 1914? If I wanna say it happened in 1941, that's my right. I
don't trust books. They're all fact, no heart.

Well, the wordanistas have heeded his call. Earlier today, the American Dialect Society selected truthiness as its 2005 Word of the Year.

Truthiness edged out Katrina (and Katrina-related words) in the annual voting, with other nominees such as podcast, intelligent design, and refugee trailing behind. (The full results, with the voting in other categories, are available in PDF form here.) For the ADS voters, Colbert's creation just seemed to capture a certain ineffable zeitgeistiness.

Fittingly, truthiness has circulated in the media with only a tenuous connection to those pesky "facts." As we noted here, the New York Times rendered the word as trustiness, apparently due to an errant spellchecker. (The redfaced Times not only issued a correction but also elevated truthiness to a place on its list of year-defining buzzwords.) Now that the ADS has coronated it as Word of the Year, the media coverage has once again come up short. The Associated Press article, published around the nation and indeed the world (in the Washington Post, the Los Angeles Times, the UK's Guardian, Australia's Age, etc.), doesn't even mention the genesis of the word on Colbert's show.

Ah well. Perhaps this will serve as more fodder for Colbert's put-upon persona of perpetual outrage.

[Update, 1/6/06: A later and longer version of the Associated Press wire story gives the background on Colbert (though it implies that his show is still part of "The Daily Show with Jon Stewart") and adds some other new details. But I suspect the incomplete article that first hit the wires will be the one picked up by most papers.]

[Update, 1/10/06: On his Jan. 9 show, Colbert responded with all the phony indignation he could muster. Details here.]

January 05, 2006

Shakespeare used they with singular antecedents so there

Not happy that I cite
Sean
Lennon as a source of evidence concerning the way they
can be used in modern English? Feeling that only something 400 years
older would really convince you that it's OK? Has Coby Lubliner got news
for you! Coby writes from Berkeley to point out the following lines from Shakespeare's The Comedy of Errors, Act IV, Scene 3:

There's not a man I meet but doth salute me
As if I were their well-acquainted friend

It's not just a case of they with singular antecedent;
like Lennon's example, it uses they despite the fact that
the sex of the antecedent's referent (male) is known! And there's more.

Marilyn Martin writes from Cornell to say that she's O.K. with
singular they normally, but this example was a bit more than she could take
("somehow bothers me", she wrote):

UK scientists have identified the part of the brain that determines
whether a person perceives themselves as fat.
(BBC News, Tuesday, 29 November 2005, 11:52 GMT)

What she doesn't like, I'm quite sure, is that the reflexive form
themselves is morphologically marked as plural (self /
selves), yet still it is used with singular antecedent. Don't
flinch, Marilyn! Look at this example of Shakespeare's (from the
poem The Rape of Lucrece):

Now leaden slumber with life's strength doth fight;
And every one to rest themselves betake,
Save thieves, and cares, and troubled minds, that wake.

So even the reflexive form of the pronoun lexeme
they is used in Shakespeare with a singular antecedent
(every one, spelled everyone in modern English).

[Added later:
I would have to agree with you if you said that the above example
is quite difficult to parse. It is indeed. Having direct objects before subjects
is never helpful for those of us speaking SVO languages, but that's what Tudor
English poetry is like. After some discussion with Marilyn
Martin and Mark Liberman, I think I am satisfied that leaden slumber
is understood as the subject of betake, and every one
is its object. The reason betake doesn't have a final
-s is not that it's agreeing with
a plural subject (its subject is leaden slumber, singular),
but rather that
it is understood as doth betake with the doth omitted:
it is not a present-tense verb, finite; it's in
what The Cambridge Grammar calls the "plain form", as required
by doth. So, in other words, the sense of the passage is roughly
as follows
(I change "doth fight"
to "fights" in accord with contemporary English syntax, and simply
murder the poeticality):
"Now leaden slumber fights with life's strength;
and takes everyone off to rest themselves,
except for thieves, and worries, and troubled minds, which remain awake."
The relevant point is unchanged by this clarification:
the antecedent of themselves
is the singular noun phrase every one.
That's the current thinking in the halls of 1 Language Log Plaza,
anyway. I did warn you that it was difficult.]

By all means, avoid using they with singular antecedents
in your own writing and speaking if you feel you cannot bear it. Language
Log is not here to tell you how to write or speak. But don't try
to tell us that it's grammatically incorrect. Because when a construction
is clearly present several times in Shakespeare's rightly admired plays
and poems, and occurs in the carefully prepared published work of just
about all major writers down the centuries, and is systematically present
in the unreflecting conversational usage of just about everyone including
Sean Lennon, then the claim that it is ungrammatical begins to look
utterly unsustainable to us here at Language Log Plaza. This use of
they isn't ungrammatical, it isn't a mistake, it's a feature
of ordinary English syntax that for some reason attracts the ire of
particularly puristic pusillanimous pontificators, and we don't buy what
they're selling.

January 04, 2006

Maybe Globalization Isn't As Advanced as We Think

I was watching Commander in Chief and, not having watched it regularly enough or with sufficient attention, was unclear as to one character's role, so I googled the show in hope of clarification. I ended up at this site.
It proved satisfactory - it had the information I wanted - but one thing was peculiar.
It described Nathan Templeton as el portavoz de la Casa Blanca, that is,
"White House spokesman". Inexpert as I was, I knew this was wrong. Actually, Nathan Templeton is Speaker of the House of Representatives or portavoz de la Casa de Representantes. Evidently, the translator (it's an American series and further googling suggests that the Spanish blurb is a translation of one provided in English by the network) confused the White House and the House of Representatives.
Maybe we should find it heartening that the details of American government are not so universally familiar as to render such mistakes impossible.

Update 2006-01-04: some people have commented that the House of Representatives should be Cámara de Representantes. Both Casa and
Cámara are in use as you can easily establish by googling for the two terms. It's possible that there is some sort of dialectal basis for the choice, but if so, I don't know what it is.

January 03, 2006

Happy Abramoffukkah!

Another legal brouhaha, another celebratory blend. Last year we had Fitzmas and Kitzmas. This year kicks off with Abramoffuk(k)ah, commemorating Republican lobbyist Jack Abramoff's guilty plea earlier today.

As with Fitzmas, it looks like there were multiple discoverers of this felicitous blend. Maximus Clarke (aka "Artifice Eternity") used it in a comment on Metafilter on Dec. 21 ("First comes Fitzmas, then comes Abramoffukah!"). It showed up on Ed's Daily Rant the same day with the "Abramoffukkah" spelling. (Both were reacting to the news that Abramoff was looking for a plea deal that could implicate Tom DeLay and other top Republican legislators.) The day after that, it was used by "DCeiver" guest-blogging on Wonkette. The expression was further popularized in a Dec. 30 post on Daily Kos by "Sherlock Google," who credited Clarke with the coinage — though if time stamps are to be trusted, it looks like Ed's Daily Rant beat out Clarke by several hours.

It's not too surprising that several online wags should independently hit upon Abramoffukkah. First, the wild success of the Fitzmas blend was an easy model for liberal bloggers to follow, with the new coinage conjuring up the same schadenfreude at the legal follies of top Republicans. Secondly, Abramoff's orthodox Judaism makes a blend with the seasonally appropriate Hanukkah a natural fit. And finally, the blendability of -ukkah has already been established over the last couple of years by the jocular pseudoholidays of Chrismukkah (celebrated by the fictional inhabitants of "The O.C.") and Chrismahanukwanzakah (featured in tongue-in-cheek advertising from Virgin Mobile), along with several other variants. If there's such a thing as an overdetermined neologism, this is certainly an example of one.

[A reader writes: You made a lot of valid points about why this
blending might occur to multiple people at once. However, I do think
that you missed one, being the orthographic and phonological
similarities between either spelling of the neologism and many pseudo-dialectal alternative spellings of the common pejorative (fucker).
The coincidental presence of the word 'off' in Mr Abramoff's name
makes it even more evocative.]

No problem

I couldn't resist the opportunity provided by Mark's mention
of the latest MS Windows security crisis to point out that those of us running GNU/Linux or other Unix variants
are blissfully unaffected by this, and most other, security problems.

The Chemical Composition of Words

Mark Nandor, a math teacher at Wellington School in Columbus, Ohio,
has posted a list
of all of the English words that can be spelled using the symbols for the first 111 elements, as well as lists of magic squares made up of chemical symbols.
His definition of English word is "listed in the ENABLE word list",
which is used by Scrabble players. You can get your own copy here if you like:
enable.zip.

Nandor says that he computed the list using Mathematica in about 25 hours including programming time. Mathematica is a wonderful tool for doing mathematics,
but it isn't ideal for this sort of problem. I solved the same problem by matching
this regular expression
case-insensitively against the ENABLE word list,
using the GNU version of the standard Unix utility grep (specifically, its egrep avatar). It took ten minutes
or so to locate and download the ENABLE list and construct the regular expression.
The computation time? Less than one second on my 1.6GHz P4 with 512MB of RAM,
not exactly a supercomputer.
Moreover, I think that I got the correct result. Nandor's program somehow missed the valid
words berg and urges, but included the non-words cryosurg ical,
urg es, and v irgins.

Personally, I don't find this sort of exercise all that fascinating though I know
some people do. It does, however, provide a nice illustration of the utility of
regular expression matching for linguistic searching.
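
For readers who want to try this themselves, here is a minimal sketch of the same idea in Python rather than egrep. It is not the regular expression used above, just one way such a pattern could be built: an alternation of the element symbols, repeated one or more times, matched case-insensitively against the whole word. The symbol list was typed in from a periodic table (elements 1 through 111), and the function names are made up for this illustration.

```python
import re

# Symbols for elements 1-111 (hydrogen through roentgenium).
SYMBOLS = """
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni
Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I
Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt
Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg
""".split()

# One or more element symbols in a row, ignoring case.
# The regex engine backtracks, so it tries every way of splitting the word.
ELEMENT_WORD = re.compile("(?:" + "|".join(SYMBOLS) + ")+", re.IGNORECASE)


def spellable(word):
    """Return True if the whole word is a sequence of element symbols."""
    return ELEMENT_WORD.fullmatch(word) is not None


def spellable_words(words):
    """Filter an iterable of words (e.g. lines of a word list file)."""
    stripped = (w.strip() for w in words)
    return [w for w in stripped if w and spellable(w)]
```

Assuming the unzipped word list is a plain text file with one word per line, something like spellable_words(open(path)) would produce the full list, where path is wherever you saved it. On this definition berg (Be + Rg) and urges (U + Rg + Es) both qualify.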

What would Whorf say?

Something about most anything, it seems. I've recently come across two papers about the influence of language on thought and action. Both papers strike me as suggestive (in roughly the same way) and also not entirely convincing (in roughly the same way). Otherwise, the two papers are just about as different as they could possibly be.

The first paper is Heesook Kim, "What would Sapir and Whorf talk about the social conflicts in the South Korean Society" [sic], in 언어학 [Eoneohag -- Journal of the Linguistic Society of Korea], No. 40, December 2004. Actually, it's just the translated title and abstract of a paper published in Korean, which I haven't read. (The English is somewhat imperfect, though infinitely better than my Korean.) Here's the abstract:

Like Taiwan, South Korea is a somewhat lately democratized society. However, comparing two countries, South Korea has been featured with more social conflicts, we find. In modern times, both have shared a similar experience in social, political and economic aspect. They have been neighbors even geographically. Then, how could one society reveal more confrontations among the members than the other? With the help of Sapir-Whorf hypothesis, we tried to show that honorifics in Korean, which is believed the most complex in the world, is reponsible for the distinction. We proved that honorifics, which was born and developed in the pre-modern social structure, tends to prevent equality, which is necessary for people to face one another on equal footing, from being established among the individuals and make people seek collective actions to make their voices heard and resolve difference in their interests in equal terms.

The idea, I guess, is that if your language forces you to use honorifics all the time, drawing your attention to your place in society relative to others, you will be more conscious of distinctions in social status; and therefore you will be more likely to interpret issues in terms of social status, and/or to feel a greater sense of solidarity with your social peers. That makes some sense.

But if the facts about social conflict had been different -- more recent conflict in Taiwan than in Korea -- you might have taken Whorf the other way. Maybe the use of honorifics helps keep people grounded in their traditional roles, and less likely to challenge the traditional division of power. And in fact, if we take a slightly larger geographical and historical frame, it's not obvious to me that China/Taiwan has had less social conflict than Korea.

Any way you slice it, it seems to me that it's hard to make a strong case based on two data points. If you had a survey of 10 or 20 dominant national languages, convincingly quantifying the degree to which they reflect the relative social status of conversational participants; and an independent survey of the degree of social conflict in the countries in which these languages are spoken; and a clear correlation between the two measures...

(My evaluation of this paper is obviously limited by the fact that I've only read the abstract -- perhaps the body of the paper presents other arguments or addresses these points in some way.)

The second paper is Aubrey Gilbert, Terry Regier, Paul Kay, and Richard Ivry, "Whorf hypothesis is supported in the right visual field but not the left", PNAS, 2006. Here's the abstract:

The question of whether language affects perception has been debated largely on the basis of cross-language data, without considering the functional organization of the brain. The nature of this neural organization predicts that, if language affects perception, it should do so more in the right visual field than in the left visual field, an idea unexamined in the debate. Here, we find support for this proposal in lateralized color discrimination tasks. Reaction times to targets in the right visual field were faster when the target and distractor colors had different names; in contrast, reaction times to targets in the left visual field were not affected by the names of the target and distractor colors. Moreover, this pattern was disrupted when participants performed a secondary task that engaged verbal working memory but not a task making comparable demands on spatial working memory. It appears that people view the right (but not the left) half of their visual world through the lens of their native language, providing an unexpected resolution to the language-and-thought debate.

There were 13 subjects, all Berkeley students. The stimuli were made up of colored squares drawn from a set of four, spanning a series from "green" to "blue":

On each trial, the subject was shown a ring of 12 squares surrounding the fixation point. Eleven of the twelve squares were the same color, and one (in a random location) was a different color.

The subject's task was to indicate as rapidly as possible whether the different-colored square was in the left half or the right half of the array. The oddball square could be of the "same" basic color-name category (in English!) or of a different category.

The crucial thing about the experimental design is that the left side of the visual field projects to the right (non-dominant) side of the cerebral cortex, while the right side projects to the left (language-dominant) side of the cortex.

And here's some of the results:

In the "no interference" condition, the subjects were significantly faster when making between-color-category judgments (i.e. the oddball was green and the rest were blue, or vice versa) in the right visual field. (Eyeballing the figure, the difference was apparently only about 15-30 msec. out of about 420 msec., or about 5% -- but that's what reaction time experiments are usually like.)

When subjects were distracted by having to silently rehearse an 8-digit number during a block of trials (and they had to recall it at the end of the block), the results were quite different:

In this case, all the reaction times were a bit longer, of course. But now in the right visual field, the within-category trials were significantly faster than the across-category trials! The effect of (English) color category was reversed. In fact, curiously, the RVF within-category trials were now also significantly faster than the LVF within-category trials, while the LVF between-category trials were faster than the RVF between-category trials.

In a second experiment, the authors compared a different verbal-interference task (remembering an irrelevant color word, like "red") with a non-verbal interference task (remembering the arrangement of a spatial grid of squares). They found that with the non-verbal interference, RVF between-category judgments were faster (similar to the no-interference condition), while with the verbal interference, RVF within-category judgments were again faster -- and again, the same curious inversion of effects obtained in the verbal interference case, with the RVF within-category trials being significantly faster than the LVF within-category trials, while the LVF between-category trials were faster than the RVF between-category trials.

Well, you can read the details for yourself, if you want. It's a great piece of work, and the authors' interpretation makes a lot of sense, and might well be true. But a couple of things about it worry me.

One is that the explanation might have worked just as well if the experiment had come out quite a bit differently. For example, if the LVF between-category reaction times had been slower instead of faster, you could say that it's because the color names are interfering with a faster non-linguistic process. Indeed, you have to say something like that to explain the (unpredicted) result that the verbal interference task actually reverses the effect, making between-category reaction times significantly slower than within-category reaction times.

Other possible results -- basically anything except a situation in which color category makes no difference, or doesn't interact with visual field -- could similarly be given a Whorfian interpretation.

A second cause of worry is that the experiments, though extensive, are somewhat limited. There are just four colors, and one basic color category distinction. The particular colors used have lots of other properties besides their relationship to English color-name boundaries. It would be unfortunate (for example) to learn that things are quite different if we use purple-to-red instead of green-to-blue, or if we use a green-to-blue sequence with different saturation or brightness.

We might also wonder what happens with subjects who have various other sorts of cerebral lateralization, for language and perhaps other knowledge and skills. Or what happens if you ask subjects to judge whether the oddball square is in the top half or the bottom half of the array, instead of left vs. right.

And finally, the cross-linguistic shoe hasn't dropped yet. The authors observe that "a majority of the world's languages" use a single word for the basic color categories of the green-to-blue sequence they used in this experiment. So the prediction is that the speakers of these languages will not show any effect of the boundary between stimuli B and C; and similarly for other such boundaries on other color sequences.

A crucial difference between the paper on honorifics and the paper on colors is that it's a straightforward job of work to do more color experiments of the same general sort, and no doubt we'll see some along these lines in the future. Furthermore, the same LVF-vs.-RVF RT paradigm could be used with things other than color, e.g. a range of similarly shaped unnamable pictures vs. pictures with a range of similar-sounding names vs. pictures with a range of semantic similarities. A whole psycholinguistic industry devoted to seeking Whorfian effects in cortical RT asymmetries may be ahead of us.

[More Language Log posts referencing Benjamin Lee Whorf are here. There are shockingly many of them.]

Singular they with known sex

Sean Lennon (the singer/songwriter son of John Lennon and Yoko Ono),
who is 30, would like to have a new girlfriend,
and for
some reason talked to the people at the Page Six department of the New
York Post about it and "playfully pleaded with The Post to find him a
girlfriend for the new year", his relationship with Bijou
Phillips having broken up a while ago (in 2003, I'm told, though
you wouldn't know that from the Post, which has a big picture of
them together as if this were recent news). The Post publishes some
self-descriptions sent in by young women who thought they might suit his
needs ("I lead a full life and would like to share it with someone," says Betsy
Head, 27). The linguistic angle (this is Language Log) is that what he is
reported as having said to the Post about those needs provides a
nice example of the way singular they is going in the speech
of younger people (and 30 counts as young in this context). Said Sean
(Thursday, December 29, 2005, page 9):

Any girl who is interested must
simply be born female and between the ages of 18 and 45. They must have
an IQ above 130 and they must be honest.

The antecedent of they, both occurrences, is any girl who is
interested, a singular noun phrase. Yet because of the head noun
girl we know that semantically the quantifier that noun
phrase denotes ranges only over female humans.
Thus the sex of the 18-to-45-year-old
honest person with the 130+ IQ that Sean hopes to find is
fixed by his stipulation. Yet he still says they.

Notice that the verb phrase must have an IQ above 130 clearly
needs a singular subject (each girl has one IQ; he could have said "They
must have IQs above 130" if he was talking about the whole group of
hopeful applicants). Clearly, in the speech of Sean Lennon,
they not only can have a morphosyntactically (and
semantically) singular antecedent, but it can do so even if the gender of
the referent is known, and syntactically overt, as with any
girl.

The context that most favors they with singular
antecedents, I think, is where it roughly corresponds to what
logicians call a bound variable. Lennon's two sentences above
convey a meaning something like:

"For any girl X such that X
is interested, X must be born female and X must be at least
18 years old and X must be not more than 45 years old and X
must have an IQ above 130 and X must be honest."
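
For readers who like their logic in symbols, the same paraphrase can be sketched as a single universally quantified formula. The predicate names (girl, interested, female, age, IQ, honest) are just labels chosen for this illustration; nothing hangs on them.

```latex
\[
\forall x\,\bigl[\,(\mathrm{girl}(x) \land \mathrm{interested}(x)) \rightarrow
(\mathrm{female}(x) \land 18 \le \mathrm{age}(x) \le 45 \land
\mathrm{IQ}(x) > 130 \land \mathrm{honest}(x))\,\bigr]
\]
```

Every occurrence of x inside the brackets is bound by the one quantifier at the front, and none of them picks out a particular individual. That is the job the two occurrences of they are doing in Lennon's sentences.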

The use of
they is, to put it informally,
a reminder that the pronoun is not referring to
any one person. Gender reflects the sex of the person referred to in
English, and it is evident that Sean Lennon feels that
with no definite reference for
the pronoun, they is more appropriate than
she.

It's tricky to talk about this with care. The traditional simplistic line is
just that the
pronoun they is plural and that's all there is to it;
but that won't do. The uses of they in the quote above
do not really
refer to any particular person or persons. Nor does the noun phrase
any girl. But any girl expresses a quantifier, and the
quantifier binds variables semantically, and the pronoun
they realizes the bound variables syntactically. The number
agreement facts show that they is morphosyntactically
plural. But the anaphora facts show that it can take an antecedent that
is morphosyntactically singular (as The Times style guide once agreed but
then it changed its mind back again to the old-fashioned view that says
there's something wrong with that).

In some uses they is semantically "plural"
in the sense that it refers to a group. But in the use illustrated here
it is not semantically "plural"; it corresponds to a bound variable, and
the semantic notions of singular and plural reference don't really apply
to it. The semantic notion of sex reference doesn't really apply either.
The fact that Lennon stipulates a range for the variables that includes
only humans born of the female sex apparently is not quite enough for him
to use she, though of course he would use that pronoun in a
context where a specific human such as Bijou Phillips was being referred
to. (The Post's picture has her in a tight red
dress, and she is very clearly female. Trust me. At Language Log we do
fact-checking on this sort of thing.)

This has nothing to do with linguistics, but it isn't as widely known as it ought to be, and it's important, so I'll post it here. If you or yours have any computers running Windows XP, you should run, not walk, to this story at the Internet Storm Center, and consider following the instructions found there (installing a patch and de-registering a particular .dll). This may protect you until Microsoft makes a more systematic solution available.

Because this exploit was publicized on Dec. 27, bad guys around the world have had a week to work on ways to use it while most people have been busy with other things. I believe you'll be hearing more about this.

[Let me add that something about this situation puzzles me a great deal. It was back in January of 2002, fully four years ago, that Bill Gates was
reported to be "kicking off an all-out effort to repair the company's reputation for poor security and reliability".
The simplest and most obvious security vulnerabilities are those that arise because a standard, commonly-used file format includes, by design, the capability to instruct the OS to execute some arbitrary piece of code. How can it possibly be true that after a few weeks (never mind four years) of "all-out effort", some MS software engineer didn't call attention to the fact that
Windows Meta Files -- a common graphics format on Windows machines -- contain such a vulnerability? If no one noticed this, then Redmond's engineers are incompetent. If someone did notice, and nevertheless up to four years went by during which no one did anything to patch the vulnerability, then Redmond's managers are incompetent. Either way, it's a bad omen for Microsoft's future.]

[Note -- I incorrectly glossed "wmf" as "windows media file" -- thanks to several alert readers for correcting the mistake. That's what I get for learning as little as possible about Windows internals...]

January 02, 2006

Everyone at The Times agrees... No they don't

I have a one-step-forward-one-step-back story. I noticed a while ago
that the second edition of The Times Guide to Grammar and Usage
(ed. by Simon Jenkins; London, 1992, and now long out of print, I think),
explicitly states that they with singular antecedent
(the example Everybody should bring their lunch is cited) is
"acceptable usage" and will often constitute a good solution to the
problem that one is otherwise forced to choose between he
(which says the referent is male) or she (which says the
referent is female) or he or she (often much too clumsy,
as anyone who thinks he or she might like to spend some of his or her
time convincing himself or herself will soon find out for himself or
herself).

Jenkins thus endorses the position that
The Cambridge Grammar of the
English Language was to take a decade later. And rightly
so. It's a position that couldn't really be doubted by anyone who
had devoted even a few minutes to looking at the facts of usage, be it
literary (over the past 600 years) or everyday
conversational. Excellent advice.

But if you read on, there is sadder news to come.

I wondered for a while if the old-fashioned handbooks that condemn
singular antecedents for they realized they were contradicting
The Times as well as CGEL. Even dyed-in-the-wool old-tyme
prescriptivists, who might regard CGEL's descriptive stance with
horror, I thought to myself, should surely be prepared to agree that
The Times of London knows whereof it speaks with regard to the
English language. If any newspaper in the world can be regarded as
virtually definitive of good written Standard English over the past two
hundred years it has to be The Times. But before telling you of
my discovery in the 1992 edition,
I had a look around to see what was the current
version of The Times's style guide. And what I found plunged me
back into depression.

Go to
The Times
Online Style Guide and take a look at what they have under
"they" now. It is the following piece of ill-considered stupidness:

they should always agree with the subject. Avoid sentences
such as "If someone loves animals, they should protect them". Say instead
"If people love animals, they should protect them"

Agree with the subject? Subjects have nothing to do with this,
as you can see from an example like
We told everyone they were free to leave, where
they has a direct object as its antecedent. Here no sense
can be made of the idea that they should "agree with the
subject". The editors — Richard Dixon, Mike Murphy, and
Denis O'Donoghue — don't know a subject from an antecedent, or
doubtless from an artichoke, and they should be ashamed of themselves.
(It is not clear what they would recommend. Probably some sort of
rewording like We told all of them they were free to leave.
This is no improvement.)

What Dixon, Murphy, and O'Donoghue are trying (ineptly) to say is clear
enough: they have returned to the dopey position that says they
must never have a morphosyntactically singular antecedent. They are back
in tune with American backwardness: with Strunk & White, and their
thousands of latter-day co-religionists such as Stanley Fish and the style
guides of the Modern Language Association and the American Psychological
Association. Why do these sources continue to damn singular antecedents
for they in defiance of all the evidence of its constant
use by respectable authors during at least the past six centuries?
I have no idea.

But you can look the matter up for yourself: the wonderful
Merriam-Webster's Dictionary of English Usage would be a very
good place to start. Look at their list of literary examples, and
then decide, freely and of your own will, uninfluenced by me, whether
to side with The Cambridge Grammar or the atavistic loonies.
(Hint: The atavistic loonies would not be a good choice.)

Dinner at the L.S. Cabal: the sequel

Claire Bowern of anggargoon.org has suggested another blogging-oriented dinner at the annual LSA meeting, which is in Albuquerque this year (preliminary program here). Last year's dinner was interesting and fun. Claire suggests either Friday 1/6/2006 or Saturday 1/7/2006. My preference would be for Saturday, leaving around 7:30 from the reception after the Presidential Address, but Friday is also possible for me. Let me know if you'd like to come so we can get a headcount for a reservation.

[Note: I've changed my (careless) wording from "bloggers' dinner" to "blogging-oriented dinner", to make it clear that friends, correspondents, readers, and any other interested parties are welcome. Ten blogging-oriented people have signed up so far, but there are a few who can't make it on Friday and a few who can't make it on Saturday, so the schedule is still up in the air.]