All the speech sounds in the world

The emails I receive about phonetics range from the very naïve to the very sophisticated. Sometimes it is difficult to know which is which. One correspondent writes:

I am particularly keen to know where I could find a single list that would include all the sounds found in all the languages. Does such a list exist?

Is my correspondent a beginner who should be advised to consult a textbook of general phonetics, e.g. Peter Ladefoged’s Vowels and Consonants (Blackwell), or, even better, to enrol for a course in general phonetics?

Or is this a serious question for which a serious answer might be sought in Ladefoged and Maddieson, The Sounds of the World’s Languages?

But I think a qualified phonetician would not have asked the question in this way unless putting on an act of being naïve. There is a difficulty not always appreciated by non-specialists: you can’t enumerate “sounds”. This is because there is no real way to say whether sounds are “the same” or “different” except with reference to a specific language or dialect. For example, English /t/ is by default a voiceless aspirated alveolar plosive, Dutch /t/ is ditto but unaspirated, Swedish /t/ is aspirated but dental, and French /t/ is both unaspirated and dental. So depending on language /t/ may by default be aspirated or unaspirated, alveolar or dental.

Once we start to add in allophonic variation within a language, the difficulty is multiplied. Sometimes English /t/ is not voiceless, sometimes it is not aspirated, sometimes it is not alveolar, and sometimes it is not even plosive.

So how many “sounds” are we dealing with here? For everyday purposes, they are all transcribed with the same t symbol, because we apply a principle of transcriptional simplicity. But in a sense they are all different.

The English /l/ phoneme comprises two sounds (two allophones), as we know. The clear l is clear (= has a palatal resonance, sort of), but not as clear as the German l and nothing like as clear as the Russian ‘soft’ lʲ; the place is alveolar, i.e. retracted from what we might think of as the cardinal dental position, but not as retracted as that of the Korean l. Factor in the various l-sounds of thousands of other languages: how many l-sounds are there altogether?

Even if you start with a list of the 109 alphabetic IPA symbols plus 33 stress, length and tone marks or thereabouts, there are still 32 diacritics that can be applied to the alphabetic characters to symbolize secondary articulations and various other attributes. Sometimes more than one diacritic can be applied to the same base. The combinatorial explosion makes it unrealistic to attempt to count the resulting “sounds”. And we haven’t even started on voiced or nasalized clicks, double articulations, affricates and diphthongs, all of which can be represented in IPA by combining two alphabetic symbols with a tie bar, but are arguably single “sounds”.
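To get a rough feel for that combinatorial explosion, here is a minimal sketch in Python. It takes the two figures quoted above (109 alphabetic symbols, 32 diacritics) and assumes, quite unrealistically, that any diacritic can be applied to any base symbol and that the order of diacritics does not matter. The function name upper_bound is hypothetical, chosen only for illustration. The result is a crude ceiling on symbol combinations, not a count of attested sounds, and it ignores tie-bar combinations altogether.

```python
from math import comb

BASE_SYMBOLS = 109  # alphabetic IPA symbols (figure quoted in the text)
DIACRITICS = 32     # diacritics (figure quoted in the text)


def upper_bound(max_diacritics: int) -> int:
    """Count base symbols carrying 0..max_diacritics distinct diacritics.

    Assumption: every diacritic may combine with every base symbol, and
    order is ignored. Real phonetic practice is far more restrictive.
    """
    per_base = sum(comb(DIACRITICS, k) for k in range(max_diacritics + 1))
    return BASE_SYMBOLS * per_base


for n in range(4):
    print(n, upper_bound(n))
# 0 109
# 1 3597
# 2 57661
# 3 598301
```

Even with at most two diacritics per symbol the ceiling is already in the tens of thousands, which is the point: the inventory of symbols is finite, but the set of things they can be made to denote is not usefully countable.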

Peter Ladefoged and me

Ian Maddieson

Tuesday 29 April 2008

Lolcats

Thanks to yesterday’s Guardian, I came across a charming timewaster of a website called icanhascheezburger.com. It consists of captioned pictures of cats (and occasionally other animals), like this.

The linguistic interest here is that the cats are shown as speaking a special dialect of English, LOLspeak, which the many thousands of contributors have managed to master (learn it here).

As well as various grammatical oddities (particularly extraneous s and z endings), LOLspeak is characterized by a kind of do-it-yourself spelling reform combined with txtspeak and typical computer-users’ mistypings (as teh above).

The respelling thot in the picture (= thought) would make sense to whoever captioned it, presumably someone from those extensive parts of North America where LOT and THOUGHT are merged, but not to us Brits who keep them distinct and for whom thawt would seem more appropriate.

Perhaps this will placate those who have complained about my removing the old animated cat gif from my homepage.

Talking of the web, I find that on Facebook there is a group called Phonetics is the new Rock n' Roll (sic). I quote:

Featuring:

J.C. Wells on Drums
Peter Trudgill on Lead Bass
Peter Ladefoged (deceased) on Lead Guitar
Ken Lodge on Lead Vocals

Hello?

I've had three e-mails in the last few days with this word spelt hello, hallo, and hullo.

I think I’ve heard /e,æ,ʌ/ and I also suspect I’ve heard the stress on the first syllable, though maybe this was in stress-shift positions.

I doubt searching any spoken corpus would help.

If I am not mistaken, Americans always spell it hello. (My Webster’s Collegiate also gives hollo, but perhaps this is a different word.)

We can agree on its usual pronunciation: həˈləʊ. But as Alan says, instead of a schwa I think it can sometimes in BrE have a strong vowel in the first syllable; and this strong vowel may, as Alan points out, correspond to any of the spellings hello, hallo, hullo.

Can these non-schwa variants be stressed? Do they have to be? I see that in LPD I am rather inconsistent. I mark an optional secondary stress for the non-schwa variants under the BrE spellings hallo and hullo, but not under the spelling we all agree on, hello.

The way to determine whether or not there is a secondary stress on the first syllable is to see whether stress shift is possible. And here there is a bit of a problem.

As far as I can tell, stress shift happens only if hello (however spelt) is followed by the name of the person being greeted, i.e. a vocative. To trigger stress shift, this vocative has to be accented.

The difficulty is that final vocatives are normally not accented.

Furthermore, as far as I can see the intonation pattern under stress shift has to be fall plus rise.

\Hello, | /Mary.

I don’t think you can say

*\Hello, Mary.

though of course you can say the neutral

Hel\lo, Mary.

(or the same thing with a rising or falling-rising nuclear tone).

This stress shift triggered by the anomalous accenting of a vocative is paralleled by the cry of despairing British tennis fans,

\Come on | /Tim!

where logic would lead you to expect the unremarkable

(')Come \on, Tim!

But in Hello, Mary the stress shift itself is anomalous, too, given that the usual first vowel in hello is ə. The only explanation seems to be that we’re dealing with intonational idioms.

Another joke for eight-year-olds (blog, Saturday), this one from Eric Armstrong. For it to work properly, the eight-year-olds have to be not only non-rhotic but also h-droppers and to have no distinction between -wʊd and -wəd. Well, that’s a lot of eight-year-olds in England, anyhow.

What do you call a man with a tree on his head? — Edward.

And what do you call a man with three trees on his head? — Edward Woodward.

Greg Porilo has a follow-up to Saturday’s pun.

What do you call a deer with no eyes and no legs?— Still no idea!

Alan Cruttenden

Hello Kitty speaks Japanese, not English— but Japanese beginners in English say haro!

Saturday 26 April 2008

Puns

Penwortham

Mike Mayor, editorial director of the Dictionaries section at Longman, tells me he comes from Penwortham, a village on the outskirts of Preston, Lancashire. Those who live there call it ˈpenwə(r)ðəm, with initial stress. (Preston is on the edge of the shrinking rhotic area of Lancashire, so there may still be a few people who pronounce it with -r-.)

On taking over responsibility for LPD, Mike congratulated me on having got the pronunciation of this placename right in the dictionary. Non-locals, particularly southerners, tend to assume that it’s penˈwɜːðəm (or -ˈwɜːθəm), with penultimate stress; they’re wrong.

(I don’t know how the Australians pronounce the place with the same spelling in South Australia.)

This is one of several such placenames in the north of England. In Lancashire just near where I lived as a boy is the mining village of Winstanley. We called it ˈwɪnstənli, but those unfamiliar with the place, and indeed bearers of the corresponding surname, almost all seem to say wɪnˈstænli. The BBC Pron. Dict. of British Names gives both. Etymologically, the name derives from the OE proper name Wynnstān, modern Winston, plus lēah ‘wood, clearing’, which explains its initial stress.

Just over the summit of the Pennines as you go east into Yorkshire lies Todmorden. I’ve always known this place as ˈtɒdmədən, though according to the BBC PDBN it can also be ˈtɒdmɔːdən. But not tɒdˈmɔːdən, which is what parachuted-in TV reporters tend to go for.

Conversely, we all know about Newcastle (upon Tyne), which locals stress on the -cas- but most other people on the New- (a stressing ‘firmly established in national usage’, in the words of the BBC PDBN). But Newcastle(-under-Lyme) in Staffordshire has initial stress, and so do various other Newcastles.

Just to keep us on our toes, however, near Todmorden there is Mytholmroyd. That’s actually ˌmaɪðəmˈrɔɪd, with final stress.

I console myself with the thought that all of this must be good for the sales of pronunciation dictionaries.

Penwortham: the bridge over the Ribble

Coal wagon, Winstanley Colliery

Thursday 24 April 2008

Data processing under stress

Masa Hirata, a senior manager with Pearson Education/Longman in Japan, passed on a query from a customer.

I understand that usually in compounds the first element will receive the primary stress, with a few exceptions. In [the Longman English-Japanese Dictionary], and also in your latest edition of LPD, data processing has the primary stress on the second element (processing) and secondary stress on the first one (data). I checked the earlier edition of LPD and there it followed the general rule. I was curious to know what motivated this change, whereas the compound like 'word processing' remains the same.

So here’s an edited version of what I said in reply.

It is my belief that the usual stress pattern for this compound is double stress: ˌdata ˈprocessing.

(I could be wrong about this, of course. But that’s how I say it myself, and as far as I can tell so do most people.)

Like many recently coined compounds, it has perhaps not yet really settled down lexically. Meanwhile we give it a pragmatic stressing.

?(i) We’re going to do some processing. | Today it’ll be 'data processing.

(ii) We've got some data. | Now we must move on to data 'processing.

Clearly, (ii) is the more usual implicit situation.

So this pragmatic effect seems to override the usual early-stress rule for compound nouns. But as the compound becomes more established it would be expected to fall into line and move towards 'data processing (and no doubt already has done for some speakers).

With word processing, on the other hand, we would never think

*(ii′) We've got some words. | Now we must move on to word 'processing.

(That is why the cartoon alongside is funny.)

So it is always 'word processing, in accordance with the compound rule.

Compare also the American hesitation over the stressing of Thanksgiving, or other compounds where we don’t all agree: ice cream, armchair.

Mr Hirata replied (I think, without irony):

I'm sure our customer will be grateful for your clear explanation.

Personally, I can’t help feeling that my explanation was a bit ad hoc. But it’s the best I could do.

Wednesday 23 April 2008

Respelling

Would-be dictionary users who are native speakers of English are often reluctant to get to grips with proper IPA transcription. The phonetic symbols are unfamiliar; you have to learn them. Anyhow, most dictionary users never read the front matter where such things are explained.

So, despite the widespread adoption of IPA in British-published monolingual dictionaries, others hold out for something perceived as more user-friendly. That means a respelling system.

In the US the IPA has made less headway than in Britain among dictionary publishers. In US-published dictionaries (as far as I am aware) there is no use of IPA, only of respelling.

In a respelling system we ideally use the familiar letters of English in their familiar meanings, as far as possible without diacritic marks.

This poses some difficult problems. How, for example, can we represent the ‘long i’ aɪ diphthong of price? We need to distinguish it clearly from the short ɪ of kit. Otherwise, how could we discuss the two competing pronunciations of dissect? Or distinguish the two words spelt wind? One traditional solution is to use a macron diacritic, as in ī, so that aisle aɪl is shown as īl. But people don’t like diacritics. Another is to write y, thus yl. But this is in danger of being misinterpreted, since traditional spelling uses y in this meaning only when word-final (cry, shy). A third possible solution is igh, thus ighl, which is unambiguous but clumsy. Ordinary iCe would do for ile but not for winde (wind around, not the wind that blows). There is no ideal solution.

There is also no really unambiguous way to respell the diphthong of cow: both ou and ow are open to misinterpretation (soul, mouth, soup; know, how).

And what about schwa? The best solution, ə, requires a special letter. The alternative, uh, involves an arbitrary meaning for a digraph not used at all in traditional orthography. Non-rhotic speakers might be happy with er: but spellings such as bernáhner (banana) will shock rhotic speakers.

There is also sometimes a problem distinguishing clearly between /s/ and /z/, since in non-initial position the letter s is commonly used for both: alongside base beɪs we have rise raɪz. In the respelling systems I designed first for the Reader’s Digest Great Illustrated Dictionary (1984) and then later for the Encarta World English Dictionary (1999), with their spin-offs (pictured), I introduced the idea of making use of doubled consonant letters. This is not only a familiar way to indicate that a preceding vowel is short (rívvər), but also means that ss will automatically be read as voiceless (bayss as against rīz).

PS: Graham Pointon protests, quite rightly, that the BBC has been using double consonant letters in its respelling system since 1928. But that was not for a general dictionary.

Tuesday 22 April 2008

Skeat transcribed

As I’ve mentioned (18, 21 April), this year sees the centenary of the founding of the Simplified Spelling Society (now the Spelling Society).

In the Maître Phonétique of May-June 1909 Daniel Jones wrote about “ði eimz əv ðis nju: səsaiəti” and published an account of the address given at the inaugural meeting by its President, the philologist Walter Skeat.

Like everything else in the m.f., it was written in transcription, in this case the old quantitative transcription of English in which the difference between reed and rid was shown by length marks alone. But this is presumably Jones’s transcription of Skeat’s text, not something done by Skeat himself. It is not clear whether or not it is an attempt to transcribe exactly how Skeat would have pronounced it. Since Jones was not present at the meeting, and since this was long before the days of tape recording, it cannot be an exact record of how Skeat said it.

I won’t bore you by giving the whole of the address. But the last paragraph is interesting on linguistic grounds.

I don’t think I have ever before come across the word irrefragable in use, only as a headword in old editions of EPD. Nor have I ever heard chemistry pronounced ˈkɪmɪstri. By the 1963 edition of EPD, Jones declared ˈkɪm- in this word “probably obsolete, or nearly so”. It is certainly obsolete now. But Skeat may well have pronounced it like that.

Monday 21 April 2008

Lexicographers and spelling

At the centenary AGM of the Spelling Society on Saturday there was an interesting talk by the lexicographer George Davidson, editor of the Chambers 21st Century Dictionary and of Roget’s Thesaurus.

His claim was that, contrary to what is often believed, English lexicographers over the past four centuries have not in general been innovators in spelling, but rather have felt constrained to follow contemporary practice.

Samuel Johnson declared that it was necessary to ‘sacrifice uniformity to custom’ and thus for example to accept the difficult inveigh alongside the etymologically parallel and easier convey, receipt alongside deceit, and phantom alongside fancy.

Furthermore, by no means all of the spellings in Johnson’s great dictionary (1755) met general acceptance. We do not nowadays write persue, raindeer, spunge, villany or musick.

By 1806 we were ready to agree with Noah Webster and drop the k from musick. Webster persuaded the Americans (but not the British) to write favor, honor, color, but failed to convince anyone that we should drop the final e from determine and examine and the final b from thumb. By 1828 he had retreated on these latter points.

Even some of the OED’s preferred spellings have not succeeded in displacing deprecated alternatives: ax and tire (tyre) are the norm in the States but not in Britain, while the OED’s preference for rime and connexion has not won out over rhyme and connection.

Nigel Greenwood sends me this advertisement from lastminute.com currently displayed in the London Underground.

George Davidson

Friday 18 April 2008

Definitely

Tomorrow, Saturday, sees the centenary AGM of the Spelling Society, founded in London in 1908 as the Simplified Spelling Society (see the minutes of that first meeting). So in contemplation of tomorrow’s meeting here is a rant of the type you would normally expect to come not from me but from a BBC R4 listener or Telegraph reader particularly uptight about spelling.

It has always surprised me how many people misspell definitely as definately.

A Google search throws up 24.5 million hits for definately as against 153 million for definitely. That means that web authors get it wrong about one time in seven.
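The arithmetic behind “one time in seven” can be checked in a few lines of Python. This is only a sketch using the two hit counts quoted above; the variable names are illustrative, and search-engine hit counts are of course a rough proxy for what web authors actually write.

```python
# Hit counts as quoted in the text (rough search-engine estimates).
definately_hits = 24.5e6
definitely_hits = 153e6

# Share of all occurrences that use the misspelling.
share_wrong = definately_hits / (definately_hits + definitely_hits)
one_in = 1 / share_wrong

print(f"{share_wrong:.1%} of hits, i.e. about one in {one_in:.1f}")
# 13.8% of hits, i.e. about one in 7.2
```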

The difficulty of course stems from the fact that adjectival weak -ate and -ite are pronounced identically.

It’s particularly surprising that students of phonetics and linguistics would get this wrong (which some of them do). You’d think they would be aware of orthographic and etymological relationships such as those between definite(ly) and definition, where the stressed ɪ vowel of -ition in the latter shows clearly how -ite in the former is to be spelt. Not to mention definitive, finish, infinite and the foreign but widely-known finis and finito.

As (in)considerate(ly) is to consideration, so (in)definite(ly) is to definition. Why don’t people get it?

Anyone who has learnt Latin should have no difficulty, since verbs belonging to the first conjugation give us -ate -ation -ative but those belonging to the fourth conjugation -ite -ition -itive. (OK, since you ask, the second conjugation is exemplified by complete completion expletive and the third conjugation, in which the stem is consonantal, by correct correction corrective.)

I try not to be annoyed by spelling mistakes. After all, I do think it would be a good idea to reform English spelling so that we didn’t obsess so much about such ultimately trivial matters.

Nevertheless, as things stand educated people are supposed to get it right. Until we have a spelling reform. Deffo.

Thursday 17 April 2008

A survivor of the Titanic

Charles Lightoller was the senior surviving crew member in the Titanic disaster of 1912.

The BBC has just made available a sound recording, dating from 1936, of his memories of the sinking.

Lightoller was born in Chorley, Lancs, not far from where I come from myself. Although a naval officer, he did not speak RP. Yet listening to his voice you would not immediately take him for a Lancastrian. His accent is rhotic, and his use of a long vowel in last and a southern-style ʌ vowel (rumbling, bulk, up) suggests, if anything, a mildly Wessex-southwest-of-England accent.

He has an old-fashioned fully back oʊ or oː in moment, rows, broke, slowly, and a remarkable sort of ɛɪ in sky. His LOT vowel seems to be unrounded ɑ, American-style (got, solemn, gone).

You can read details of his life in Wikipedia. From this it appears that he travelled widely. Perhaps he picked up his speech habits during his teenage years as an apprentice seaman.

Commander Charles Lightoller

Wednesday 16 April 2008

Lucida Grande

Not being a Mac user, I was until recently unfamiliar with the font Lucida Grande. But my colleagues who are Mac users tell me that it is the default font for their browser Safari, and that it comes with IPA symbols.

After a bit of hunting around on the web, I found a .ttf file of the font. But on installing it on my Windows computer I found it had no phonetic symbols. So I uninstalled it, and hunted further. I found that if one downloads and installs Safari for Windows it comes with a much more extensive Lucida Grande included. This is what the IPA Extensions range looks like.

As you can see, it appears to have all the official IPA symbols and diacritics.

The only symbol obviously missing from this and other Microsoft fonts is the new labiodental flap symbol, only recently added to Unicode at U+2C71. But as far as I am aware no published font yet has that symbol at that location.

Furthermore, the ɪ and ʊ in Lucida Grande are well shaped. (Compare the blog entries for 12 March and 4 April for unsatisfactory symbol shapes in other fonts.)

The only symbol I would find fault with is the “Latin small letter reversed open e with hook”, aka the AmE NURSE vowel, ɜ with a rhotacizing hook. Here it is on the right, compared with the properly shaped non-rhotacized symbol on the left. You can see that it is based upon a Cyrillic e (i.e. э) rather than on the proper Latin reversed open e (ɜ).

Anyhow, in the invisible style sheet at the head of this web page I shall continue to specify Lucida Sans Unicode as the first choice for the phonetic symbols you see in green, but will give Lucida Grande as the second choice. I hope this helps my Mac-using readers.

PS: Steffen Höder tells me that there is after all a published font with the labiodental flap symbol at U+2C71. The newest version of Code2000 has it. This shareware font can be downloaded here.
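For readers who want to check which characters are under discussion, here is a small Python sketch using the standard unicodedata module. It looks up the official Unicode names of the code points mentioned in this entry. Note the limits: it consults the Unicode character database bundled with your Python, not any font, so it says nothing about whether Lucida Grande or Code2000 actually contains a glyph at that location. The helper name describe is hypothetical.

```python
import unicodedata

# Code points mentioned in this entry.
CODE_POINTS = [
    0x025C,  # reversed open e (non-rhotacized)
    0x025D,  # reversed open e with hook (AmE NURSE vowel)
    0x044D,  # Cyrillic e, which the hooked glyph resembles
    0x2C71,  # labiodental flap symbol
]


def describe(cp: int) -> str:
    """Return 'U+XXXX <char> <official name>' for one code point."""
    ch = chr(cp)
    # Fall back to a placeholder if this Python's database predates the character.
    name = unicodedata.name(ch, "<not in this Unicode database>")
    return f"U+{cp:04X} {ch} {name}"


for cp in CODE_POINTS:
    print(describe(cp))
```

On a reasonably recent Python the second line of output should carry the name LATIN SMALL LETTER REVERSED OPEN E WITH HOOK, matching the name quoted above.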