In phonetics, a diphthong, pronounced /ˈdɪf.θɒŋ/ or /ˈdɪp.θɒŋ/ (also gliding vowel; from Greek δίφθογγος, diphthongos, literally "two sounds" or "two tones"), is a contour vowel, that is, a unitary vowel that changes quality during its pronunciation, or "glides", with a smooth movement of the tongue from one articulation to another, as in the English words eye, boy, and cow. This contrasts with "pure" vowels, or monophthongs, where the tongue is held still, as in the English word papa.[1]

Diphthongs often form when separate vowels are run together in rapid speech. However, there are also unitary diphthongs, as in the English examples above, which are heard by listeners as single-vowel sounds (phonemes).[2]

In the International Phonetic Alphabet, pure vowels are transcribed with one letter, as in English sun [sʌn]. Diphthongs are transcribed with two letters, as in English sign [saɪ̯n] or sane [seɪ̯n]. The two vowel symbols are chosen to represent the beginning and ending positions of the tongue, though this can be only approximate. The diacritic < ̯> is placed under the less prominent component to show that it is part of a diphthong rather than a separate vowel, though it is sometimes omitted in languages such as English, where there is not likely to be any confusion. (In precise transcription, [ai] represents two vowels in hiatus, found for example in Hawaiian and in the English word naïve, and does not represent the diphthong found, for instance, in the Finnish word laiva, "ship".)
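The role of the non-syllabic diacritic can be illustrated with a minimal Python sketch. It assumes the diacritic is encoded as the Unicode combining character U+032F (COMBINING INVERTED BREVE BELOW); the function names `nonsyllabic_letters` and `marks_diphthong` are hypothetical, chosen only for this illustration.

```python
import unicodedata

# U+032F COMBINING INVERTED BREVE BELOW, the IPA non-syllabic diacritic.
NON_SYLLABIC = "\u032f"


def nonsyllabic_letters(transcription: str) -> list[str]:
    """Return the base letters that carry the non-syllabic diacritic."""
    marked = []
    base = None
    # NFD keeps combining marks as separate code points after their base.
    for ch in unicodedata.normalize("NFD", transcription):
        if unicodedata.combining(ch):
            if ch == NON_SYLLABIC and base is not None:
                marked.append(base)
        else:
            base = ch
    return marked


def marks_diphthong(transcription: str) -> bool:
    """True if any letter in the transcription is marked non-syllabic."""
    return bool(nonsyllabic_letters(transcription))
```

Applied to the transcription of sign, the sketch reports that ɪ is marked as non-syllabic, while the transcription of sun contains no marked letter. A transcription that omits the diacritic, as is common for English, would not be detected by this check.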


Falling (or descending) diphthongs start with a vowel quality of higher prominence (higher pitch or louder) and end in a semivowel with less prominence, like [aɪ̯] in eye, while rising (or ascending) diphthongs begin with a less prominent semivowel and end with a more prominent full vowel, like [ɪ̯a] in yard. The less prominent component in the diphthong may also be transcribed as an approximant, thus [aj] in eye and [ja] in yard. However, when the diphthong is analysed as a single phoneme, both elements are often transcribed with vowel letters (/aɪ̯/, /ɪ̯a/). Note also that semivowels and approximants are not equivalent in all treatments, and in the English and Italian languages, among others, many phoneticians do not consider rising combinations to be diphthongs, but rather sequences of approximant and vowel. There are many languages (such as Romanian) that contrast one or more rising diphthongs with similar sequences of a glide and a vowel in their phonetic inventory.[3]
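The falling/rising distinction can be read off a transcription from the position of the non-syllabic diacritic. The following Python sketch assumes a two-letter transcription in which the less prominent element carries U+032F; the helper names are hypothetical, and approximant transcriptions such as [aj] are deliberately not handled.

```python
import unicodedata

NON_SYLLABIC = "\u032f"  # U+032F COMBINING INVERTED BREVE BELOW


def split_segments(ipa: str) -> list[str]:
    """Group each base letter with the combining marks that follow it."""
    segments = []
    for ch in unicodedata.normalize("NFD", ipa):
        if unicodedata.combining(ch) and segments:
            segments[-1] += ch
        else:
            segments.append(ch)
    return segments


def prominence_pattern(diphthong: str) -> str:
    """Classify a two-letter transcription as falling, rising, or unmarked."""
    segments = split_segments(diphthong)
    if len(segments) != 2:
        raise ValueError("expected exactly two vowel letters")
    first_weak = NON_SYLLABIC in segments[0]
    second_weak = NON_SYLLABIC in segments[1]
    if second_weak and not first_weak:
        return "falling"  # prominent vowel first, weaker element last
    if first_weak and not second_weak:
        return "rising"  # weaker element first, prominent vowel last
    return "unmarked"  # the diacritic gives no information
```

Under these assumptions, the transcription of eye with the diacritic on the second letter is classified as falling, and the transcription of the vowel in yard with the diacritic on the first letter is classified as rising.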

In closing diphthongs, the second element is more close than the first (e.g. [ai]); in opening diphthongs, the second element is more open (e.g. [ia]). Closing diphthongs tend to be falling ([ai̯]), and opening diphthongs are generally rising ([i̯a]), as open vowels are more sonorous and therefore tend to be more prominent. However, exceptions to this rule are not rare in the world's languages. In Finnish, for instance, the opening diphthongs /ie̯/ and /uo̯/ are true falling diphthongs, since they begin louder and with higher pitch and fall in prominence during the diphthong.

A centering diphthong is one that begins with a more peripheral vowel and ends with a more central one, such as [ɪə̯], [ɛə̯], and [ʊə̯] in Received Pronunciation or [iə̯] and [uə̯] in Irish. Many centering diphthongs are also opening diphthongs ([iə̯], [uə̯]).
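The closing, opening, and centering labels can be sketched as a comparison of vowel heights. The Python example below is illustrative only: the `OPENNESS` ranking and the `CENTRAL` set are simplified assumptions covering a handful of vowels, not a complete model of the vowel space, and the function name `describe` is hypothetical.

```python
import unicodedata

# Assumed openness ranking (0 = close, higher = more open); simplified.
OPENNESS = {
    "i": 0, "u": 0,
    "ɪ": 1, "ʊ": 1,
    "e": 2, "o": 2,
    "ə": 3,
    "ɛ": 4, "ɔ": 4,
    "a": 6, "ɑ": 6,
}
CENTRAL = {"ə"}  # assumed set of central vowels for this sketch


def base_letters(ipa: str) -> list[str]:
    """Drop combining marks such as the non-syllabic diacritic."""
    decomposed = unicodedata.normalize("NFD", ipa)
    return [ch for ch in decomposed if not unicodedata.combining(ch)]


def describe(diphthong: str) -> set[str]:
    """Return the labels that apply to a two-vowel transcription."""
    letters = base_letters(diphthong)
    if len(letters) != 2 or any(v not in OPENNESS for v in letters):
        raise ValueError("expected two vowels listed in OPENNESS")
    first, second = letters
    labels = set()
    if OPENNESS[second] < OPENNESS[first]:
        labels.add("closing")  # second element is closer
    elif OPENNESS[second] > OPENNESS[first]:
        labels.add("opening")  # second element is more open
    if second in CENTRAL and first not in CENTRAL:
        labels.add("centering")  # peripheral vowel moves to the centre
    return labels
```

With this table, [ai] is labelled closing, [ia] opening, and [iə̯] both opening and centering, in line with the examples given in the text.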

Some languages contrast short and long diphthongs, the latter usually being described as having a long first element (see vowel length). Languages that contrast three quantities in diphthongs are extremely rare, but not unheard of; Northern Sami is known to contrast long, short and finally stressed diphthongs, the last of which are distinguished by a long second element.

While there are a number of similarities, diphthongs are not the same as a combination of a vowel and a semivowel or glide. Most importantly, diphthongs are fully contained in the syllable nucleus,[4][5] while a semivowel or glide is restricted to the syllable boundaries (either the onset or the coda). This often manifests itself phonetically by a greater degree of constriction,[6] though this phonetic distinction is not always clear.[7] The English word yes, for example, consists of a palatal glide followed by a monophthong rather than a rising diphthong. In addition, the segmental elements must be different in diphthongs, so that [ii̯], when it occurs in a language, does not contrast with [iː], though it is possible to contrast [ij] and [iː].[8]

Catalan possesses a number of phonetic diphthongs, all of which begin or end in [j] or [w]. They include:[9]

[ej] as in rei 'king'

[ɛw] as in peu 'foot'

[uj] as in avui 'today'

[ow] as in pou 'well'

[ja] as in iaia 'grandma'

[wa] as in quatre 'four'

[jɛ] as in veiem 'we see'

[wə] as in aigua 'water'

In addition to these, Catalan also possesses two sets of diphthongs in variation: [wi] varies with [uj] (as in afluixar [aflujˈɕa ~ aflwiˈɕa] 'to loosen') and [iw] with [ju].[9]

There are also certain instances of compensatory diphthongization in the Majorcan dialect: /ˈtroncs/ ('logs') deletes the palatal plosive, develops a compensating palatal glide, and surfaces as [ˈtrojns] (contrasting with the unpluralized [ˈtronʲc]). Here diphthongization compensates for the loss of the palatal stop (part of Catalan's segment loss compensation). There are other cases where diphthongization compensates for the loss of point of articulation features (property loss compensation), as in [ˈaɲ] ('year') vs. [ˈajns] ('years').[10]

The dialectal distribution of compensatory diphthongization is almost entirely dependent on the dorsal plosive (whether it is velar or palatal) and the extent of consonant assimilation (whether or not it's extended to palatals).[11]

In Dutch, [eɪ̯], [øʏ̯], and [oʊ̯] are normally pronounced as closing diphthongs except before [ɾ] in the same word, in which case they are centering diphthongs: [eə̯], [øə̯], and [oə̯]. In many dialects, they are monophthongized.[16]

The dialect of Hamont (in Limburg) has five centering diphthongs and contrasts long and short forms of [ɛɪ̯], [œʏ̯], [ɔʊ̯], and [ɑʊ̯].[17]

Canadian English exhibits allophony of /aʊ̯/ and /aɪ̯/ called Canadian raising. General American (GA) and Received Pronunciation (RP) have raising to a lesser extent in /aɪ̯/.

In Received Pronunciation, the vowels in lair and lure may be monophthongized to [ɛː] and [oː] respectively.[18] Australian English speakers more readily monophthongize the former.

In rhotic dialects, words like pair, poor, and peer can be analyzed as containing diphthongs, although other descriptions analyze them as containing vowels with [ɹ] in the coda.

The erstwhile monophthongs /iː/ and /uː/ are diphthongized in many dialects. In many cases they might be better transcribed as [uu̯] and [ii̯], where the non-syllabic element is understood to be closer than the syllabic element. They are sometimes transcribed /uw/ and /ij/.

In French, /wa/, /wɛ̃/, and /ɥi/ may be considered diphthongs (that is, fully contained in the syllable nucleus), whereas other sequences of a glide and vowel are considered part of a glide formation process that turns a high vowel into a glide (and part of the syllable onset) when followed by another vowel.[19]

In Italian, unstressed /i e o u/ in hiatus can in general turn into glides in more rapid speech (e.g. biennale [bjenˈnaːle] 'biennial'; coalizione [ko̯aliˈtːsjoːne] 'coalition'), with the process occurring more readily in syllables further from stress.[21]

Rising diphthongs in Mandarin are usually regarded as a combination of a medial glide (i, u, or ü) and a final segment, while falling diphthongs are seen as one final segment. The tone marker is always placed on the more prominent vowel.

The diphthong system in Northern Sami varies considerably from one dialect to another. The Western Finnmark dialects distinguish four different qualities of opening diphthongs:

/eæ/ as in leat "to be"

/ie/ as in giella "language"

/oa/ as in boahtit "to come"

/uo/ as in vuodjat "to swim"

In terms of quantity, Northern Sami shows a three-way contrast between long, short and finally stressed diphthongs. The last are distinguished from long and short diphthongs by a markedly long and stressed second component. Diphthong quantity is not indicated in spelling.

European Portuguese has 14 phonemic diphthongs (10 oral and 4 nasal),[23] all of which are falling diphthongs formed by a vowel and a nonsyllabic high vowel. Brazilian Portuguese has roughly the same number, although the two dialects have slightly different pronunciations. A [w] onglide after /k/ or /ɡ/, as in quando [kʊ̯ɐ̃dʊ] ('when') or guarda [ˈɡʊ̯aɾdɐ] ('guard'), may also form rising diphthongs and triphthongs. Additionally, in casual speech, adjacent heterosyllabic vowels may combine into diphthongs and triphthongs or even sequences of them;[24] in more formal speech, these are realized as hiatus, e.g. férias [ˈfɛ.ɾi.ɐʃ] ~ [ˈfɛ.ɾjɐʃ].

In addition, phonetic diphthongs are formed in Brazilian Portuguese by the vocalization of /l/ in the syllable coda, with words like sol [sɔʊ̯] ('sun') and sul [suʊ̯] ('south'), as well as by yodization of vowels preceding /s/ in words like arroz [aʁoɪ̯s] ('rice') and mas [maɪ̯s] ('but').[25]

Romanian has two diphthongs: /e̯a/ and /o̯a/. As a result of their origin (diphthongization of mid vowels under stress), they appear only in stressed syllables[26] and alternate morphologically with the mid vowels /e/ and /o/. To native speakers, they sound very similar to /ja/ and /wa/ respectively.[27] There are no perfect minimal pairs to contrast /o̯a/ and /wa/,[28] and because /o̯a/ doesn't appear in the final syllable of a prosodic word, there are no monosyllabic words with /o̯a/; exceptions might include voal ('veil') and trotuar ('sidewalk'), though Ioana Chiţoran argues[29] that these are best treated as containing glide-vowel sequences rather than diphthongs. In addition to these, the semivowels /j/ and /w/ can be combined (either before, after, or both) with most vowels. While this arguably[30] forms additional diphthongs and triphthongs, only /e̯a/ and /o̯a/ can follow an obstruent-liquid cluster, as in broască ('frog') and dreagă ('to mend'),[31] implying that /j/ and /w/ are restricted to the syllable boundary and therefore, strictly speaking, do not form diphthongs.

Spanish has six falling diphthongs and eight rising diphthongs. In addition, during fast speech, sequences of vowels in hiatus become diphthongs wherein one becomes non-syllabic (unless they are the same vowel, in which case they fuse together), as in poeta [ˈpo̯eta] ('poet') and maestro [ˈmae̯stɾo] ('teacher'). The phonemic diphthongs are:[32]

The tongue will move at the boundaries even of monophthongs, because this is necessary for the pronunciation of adjacent consonants. However, the description given here is correct for the middle of the vowel, which is most prominent to the human ear. Monophthongs can be pronounced in isolation without any movement of the tongue, which is not possible for diphthongs. More technically, monophthongs are said to have one target tongue position, diphthongs two, and triphthongs three.

Bertinetto, Pier Marco; Loporcaro, Michele (2005), "The sound pattern of Standard Italian, as compared with the varieties spoken in Florence, Milan and Rome", Journal of the International Phonetic Association 35 (2): 131-151