Phonetics is the study of the
sounds of language. These sounds are called phonemes. There are literally
hundreds of them used in different languages. Even a single
language like English requires us to distinguish about 40! The
key word here is distinguish. We actually make much finer
discriminations among sounds, but English only requires 40. The
other discriminations are what let us detect the differences in
accents and dialects, identify individuals, and differentiate tiny
nuances of speech that indicate things beyond the obvious meanings of
the words.

The Vocal Tract

In order to study the sounds of language, we first need to study the vocal tract. Speech starts
with the lungs, which push air
out and pull it in. The original purpose was, of course, to get
oxygen and eliminate carbon dioxide. But it is also essential for
speech. There are phonemes that are little more than
breathing: the h for example.

Next, we have the larynx, or
voice box. It sits at the juncture of the trachea or windpipe coming up from
the lungs and the esophagus
coming up from the stomach. In the larynx, we have an opening
called the glottis, an epiglottis which covers the glottis
when we are swallowing, and the vocal
cords. The vocal cords consist of two flaps of mucous
membrane stretched across the glottis.

The vocal cords can be tightened and loosened and can vibrate when air
is forced past them, creating sound. Some phonemes use that
sound, and are called voiced.
Examples include the vowels (a, e, i, o, and u, for example) and some
of the consonants (m, l, and r, for example). Other phonemes do
not involve the vocal cords, such as the consonants h, t, or s, and so
are called unvoiced.

The area above the glottis is called the pharynx, or upper throat. It
can be tightened to make pharyngeal
consonants. English doesn’t have any of these, but they sound
like when you try to get a piece of food back up out of your throat.

At the top of the throat is the opening to the nasal passages (called
the nasopharynx, in case you
are interested). When we allow air to pass into the nose while
speaking, the sounds we make are called nasal. Examples include m, n,
and the ng sound of sing.

Much of the action during speech occurs in the mouth, of course,
especially involving the interaction of the tongue with the roof of the
mouth. The roof of the mouth has several specific areas: At
the very back, just before the nasal passage, is that little bag called
the uvula. Its major
function seems to be moisturizing the air and making certain sounds
called, obviously, uvular.
The best known is the kind of r pronounced in the back of the mouth by
some French and German speakers. Uvular, pharyngeal, and glottal
sounds are often referred to as gutturals.

Next, we have the soft palate, called the velum. If you turn your tongue
back as far as it will go and press up, you can feel how soft it
is. When you say k or g, you are using the velum, so they are
called velar consonants.

Further forward is the hard palate.
A few consonants are made at or near the hard palate, such as sh and y,
and are called palatals.
Just behind the teeth is the dental ridge or alveolus. Here is where many
of us make our t’s and d’s, as well as s, n, and l -- alveolar
consonants.

At the very outer edge of the mouth we have the teeth and the
lips. Dental consonants
are made by touching the tongue to the teeth. In English, we make
the two th sounds like this. Note that one of these is voiced
(the th in the) and one is unvoiced (the th in thin).

At the lips we can make several sounds as well. The simplest,
perhaps, are the bilabial
sounds, made by holding the lips together and then releasing the sound,
such as p and b, or by keeping them together and releasing the air
through the nose, making the bilabial nasal m. We can also use
the upper teeth with the lower lip, for labiodental sounds. This is
how we make an f, for example.

Incidentally, we also have two names for the parts of the tongue used
with these various parts of the mouth: The front edge is called
the corona, and the back is
called the dorsum.
Sounds like t, th, and s are made with the corona, while k, g, and ng
are made with the dorsum.

Consonants

Consonants are sounds which involve full or partial blocking of
airflow. In English, the consonants are p, b, t, d, ch, j, k, g,
f, v, th, dh, s, z, sh, zh, m, n, ng, l, r, w, and y. They are
classified in a number of different ways, depending on the vocal tract
details we just discussed.

1. Stops, also known as plosives. The air is blocked
for a moment, then released. In English, they are p, b, t, d, k,
and g.

In other languages, we find labiodental, palatal, uvular, pharyngeal,
and glottal plosives as well, and retroflex
plosives, which involve reaching back to the palate with the corona of
the tongue.

In many languages, plosives may be followed by aspiration, that is, by a breathy
sound like an h. In Chinese, for example, there is a distinction
between a p pronounced crisply and an aspirated p. We use both in
English (aspirated in pit, unaspirated in spit), but it isn’t a distinction that separates one
meaning from another.

2. Fricatives are sounds
made by forcing air through a narrow opening, which creates friction.
In English, they are f, v, th, dh, s, z, sh, zh, and h.

3. Affricates are sounds
that involve a plosive followed immediately by a fricative at the same
location. In English, we have ch (unvoiced) and j (voiced).
Many consider these as blends: t-sh and d-zh.

4. Nasals are sounds
made with air passing through the nose. In English, these are m,
n, and ng.

a. Bilabial nasal: m
b. Alveolar nasal: n
c. Velar nasal: ng

5. Liquids are sounds
with very little air resistance. In English, we have l and r,
which are both alveolar, but differ in the shape of the tongue.
For l, we touch the tip to the ridge of the teeth and let the air go
around both sides. For the r, we almost block the air on both
sides and let it through at the top. Note that there are many
variations of l and r in other languages and even within English itself!

6. Semivowels are sounds
that are, as the name implies, very nearly vowels. In English, we
have w and y, which you can see are a lot like vowels such as oo and
ee, but with the lips almost closed for w (a bilabial) and the tongue
almost touching the palate for y (a palatal). They are also
called glides, since they
normally “glide” into or out of vowel positions (as in woo, yeah, ow,
and oy).

In many languages, such as Russian, there is a whole set of palatalized consonants, which means
they are followed by a y before the vowel. This is also called an
on-glide.
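
The manner classes described in this section can be collected into a small lookup table. The following is only a sketch: the spellings (ch, dh, zh, and so on) are the informal ones used in the text, the names MANNER and manner_of are made up for illustration, and the fricative row is filled in from standard phonetics rather than quoted from the list above.

```python
# English consonants as listed at the start of this section, in the
# text's informal spelling. (The h mentioned earlier is not in that list.)
ENGLISH_CONSONANTS = [
    "p", "b", "t", "d", "ch", "j", "k", "g",
    "f", "v", "th", "dh", "s", "z", "sh", "zh",
    "m", "n", "ng", "l", "r", "w", "y",
]

# Manner of articulation for each consonant. The "fricative" row is an
# assumption based on standard phonetics; the other rows follow the text.
MANNER = {
    "stop": ["p", "b", "t", "d", "k", "g"],
    "fricative": ["f", "v", "th", "dh", "s", "z", "sh", "zh"],
    "affricate": ["ch", "j"],
    "nasal": ["m", "n", "ng"],
    "liquid": ["l", "r"],
    "semivowel": ["w", "y"],
}


def manner_of(consonant):
    """Return the manner class of a consonant, or raise KeyError."""
    for manner, members in MANNER.items():
        if consonant in members:
            return manner
    raise KeyError(consonant)


def unclassified():
    """List any consonants from the inventory that no class covers."""
    classified = {c for members in MANNER.values() for c in members}
    return [c for c in ENGLISH_CONSONANTS if c not in classified]


if __name__ == "__main__":
    for c in ("p", "ch", "ng", "y"):
        print(c, "->", manner_of(c))
    print("unclassified:", unclassified())
```

Calling unclassified() is a quick way to check that every consonant in the inventory falls into exactly one of the classes.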

Vowels

There are about 14 vowels in English. They are the ones found in
these words: beet, bit, bait, bet, bat, car, pot (in British
English), bought, boat, book, boot, bird, but, and the a in ago.
There are also three diphthongs
or double vowels: bite, cow, and boy. Diphthongs involve off-glides: you can hear the y in
bite and boy, and the w in cow. Actually, the sounds in bait and
boat are also diphthongs (with y and w off-glides, respectively), but
the first parts of the diphthongs are different from the nearby sounds
in bet and bought.

Vowels are classified in three dimensions:

1. The height of the tongue in the mouth -- low, mid, or high

high are beet, bit, boot, and book
mid are bait, bet, but, boat, bought, bird and a in ago
low are bat, car, and British pot

2. How far forward or backward in the mouth the tongue rises -- front, center, or back

front are beet, bit, bait, bet, and bat
center are but, bird, and a in ago
back are boot, book, boat, bought, and British pot

3. How rounded or unrounded
the lips are

the front vowels are unrounded
the center and back vowels are rounded

The rounding idea may seem unnecessary until you realize that many
languages have rounded front vowels -- such as the German ü and
ö and the French u and eu -- and many have unrounded back vowels
-- such as the Japanese u. If you took French in high school, you
may remember the teacher telling you to say tea with your lips rounded
for French tu. It isn’t the best way to teach the sound, but it
shows you where it fits in the scheme.
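
The first two dimensions, height and frontness, form a grid, and the example words above can be looked up in it. This is a minimal sketch using only the word lists given in this section. The names HEIGHT, BACKNESS, and classify are hypothetical, "ago" stands for the a in ago, and "pot" means the British pronunciation. The text does not say where car falls front to back, so the sketch returns None for it rather than guessing.

```python
# Tongue height, as listed in the text.
HEIGHT = {
    "high": ["beet", "bit", "boot", "book"],
    "mid": ["bait", "bet", "but", "boat", "bought", "bird", "ago"],
    "low": ["bat", "car", "pot"],
}

# Front-to-back position, as listed in the text ("car" is not listed).
BACKNESS = {
    "front": ["beet", "bit", "bait", "bet", "bat"],
    "center": ["but", "bird", "ago"],
    "back": ["boot", "book", "boat", "bought", "pot"],
}


def _lookup(table, word):
    """Return the label whose word list contains `word`, else None."""
    for label, words in table.items():
        if word in words:
            return label
    return None  # the text does not place this word on this dimension


def classify(word):
    """Return (height, backness) for one of the example words."""
    return (_lookup(HEIGHT, word), _lookup(BACKNESS, word))


if __name__ == "__main__":
    for w in ("beet", "boot", "but", "bat", "car"):
        print(w, classify(w))
```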

There is one more dimension that doesn’t have much to do with English,
but is essential in many languages, and that is vowel length. Vowels can be
short or long, and it is just a matter of how long you continue the
sound. The closest we get in English is that the vowel in beet is
longer (as well as higher) than the vowel in bit. The same goes
for boot and book, and for caught and the British pot.

In some languages, such as French, there is another quality to vowels,
and that is nasality.
Some vowels are pronounced with airflow through the nose as well as the
mouth. Originally, these were simply vowels followed by nasal
consonants. But over time, the French blended the vowels and the
nasals into one unit.

IPA

Over the years, linguists have developed a complex chart of phonemes
for transcribing the sounds of all languages around the world. It
is called the International Phonetic
Alphabet, and much of it is in the charts below. If you
get question marks or little squares, that means your computer isn't
equipped to display Unicode, in which
case you will have to look elsewhere for charts like this.

Consonants

                        bilabial  labio-    dental    alveolar  retroflex palato-   palatal   velar     uvular    glottal
                                  dental                                  alveolar
plosives           uv.  p                             t         ʈ                   c         k         q         ʔ
                   v.   b                             d         ɖ                   ɟ         g         ɢ
fricatives         uv.  ɸ         f         θ         s         ʂ         ʃ         ç         x         χ         h
                   v.   β         v         ð         z         ʐ         ʒ         ʝ         ɣ         ʁ         ɦ
nasals                  m         ɱ                   n         ɳ                   ɲ         ŋ         ɴ
semivowels         uv.  ʍ
                   v.   w         ʋ                   ɹ         ɻ                   j
rolled/trilled          ʙ                             r                                                 ʀ
tapped/flapped                                        ɾ         ɽ
laterals                                              l         ɭ                   ʎ         ʟ
lateral fricatives uv.                                ɬ
                   v.                                 ɮ

Vowels

          front     central   back
high      i  y      ɨ  ʉ      ɯ  u
          ɪ  ʏ                ʊ
middle    e  ø      ɜ  ə  ɵ   ɤ  o
          ɛ  œ      ɐ  ʌ      ɔ
low       æ  a                ɑ  ɒ

Vowel length is marked with a colon-like length mark after the vowel, e.g. iː (often typed with a plain colon, i:)

Nasal vowels are shown by placing a tilde over the vowel, e.g. ã
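
Since the charts depend on Unicode, it may help to see how these two marks are built from code points. The sketch below assumes the IPA length mark is U+02D0 and the tilde is the combining character U+0303. The function names are made up for illustration.

```python
import unicodedata

LENGTH_MARK = "\u02D0"      # MODIFIER LETTER TRIANGULAR COLON, the IPA length mark
COMBINING_TILDE = "\u0303"  # COMBINING TILDE, drawn over the preceding letter


def long_vowel(vowel):
    """Append the IPA length mark to a vowel symbol."""
    return vowel + LENGTH_MARK


def nasal_vowel(vowel):
    """Place a tilde over a vowel symbol.

    NFC normalization merges the pair into a single precomposed
    character where Unicode has one (as for a), and otherwise
    leaves the vowel followed by the combining tilde.
    """
    return unicodedata.normalize("NFC", vowel + COMBINING_TILDE)


if __name__ == "__main__":
    print(long_vowel("i"))
    print(nasal_vowel("a"))
    print(nasal_vowel("\u025B"))  # open-mid front vowel, no precomposed form
```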

There are dozens more phonemes beyond the ones in the preceding charts,
but one set is particularly interesting: clicks. Clicks are sounds
made by creating a vacuum with the tongue and then suddenly snapping the
tongue away. We use these ourselves, though not as parts of
words: when we “tsk tsk,” when we make clucking sounds, and when
we make a click in the side of our mouths when we tell a horse to get a
move on. Clicks are used in the Bushman languages and in the
Bantu languages that had prolonged contact with them. The best
known is the Bantu language Xhosa, because of the famous South
African singer Miriam Makeba.

Stress and Tones

In many languages around the world, including English, words are
differentiated by means of stress.
One syllable is usually given a higher pitch
("up" the musical scale) and sometimes a bit more force. This is
how we differentiate af-FECT (as in influence) and AF-fect (as
in emotion), for example; the stressed syllable is shown here in capitals.
In longer words, there may even be a second
semi-stressed syllable, as in math-e-MAT-ics: mat has the primary
stress, math has the secondary
stress. In IPA, primary stress is
indicated by preceding the syllable with a high vertical line,
secondary with a low vertical line.
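
As a rough illustration of the two IPA stress marks, the sketch below puts them in front of the stressed syllables of a word. It works on ordinary spelling rather than a real phonetic transcription, and the function name mark_stress and its parameters are hypothetical.

```python
PRIMARY = "\u02C8"    # MODIFIER LETTER VERTICAL LINE (the high line)
SECONDARY = "\u02CC"  # MODIFIER LETTER LOW VERTICAL LINE (the low line)


def mark_stress(syllables, primary, secondary=None):
    """Join syllables, prefixing the stressed ones with IPA stress marks.

    `primary` and `secondary` are zero-based syllable positions.
    """
    out = []
    for i, syllable in enumerate(syllables):
        if i == primary:
            out.append(PRIMARY + syllable)
        elif i == secondary:
            out.append(SECONDARY + syllable)
        else:
            out.append(syllable)
    return "".join(out)


if __name__ == "__main__":
    print(mark_stress(["af", "fect"], primary=1))   # as in influence
    print(mark_stress(["af", "fect"], primary=0))   # as in emotion
    print(mark_stress(["math", "e", "mat", "ics"], primary=2, secondary=0))
```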

Note that even when we do not need to use stress to differentiate
words, we use it anyway. Sometimes we can tell where a person is
from by how they use stress: insurance is usually stressed on the
sur; southerners stress it on the in. But many languages do not
use stress at all. To our ears, they sound rather monotone.

Some other languages use dynamic
stress or tones.
Swedish is an example. This means that there is actual change of
stress within syllables. In Swedish, there are two tones:

The single tone starts high and goes down. If a single tone word has a second
syllable, that syllable is unstressed. Single tone words don’t
sound very unusual to English speakers.

The double tone is only found in two syllable words. The first pitch starts in the middle range
of pitch and the second tone starts high and goes down. If there
is a third syllable, it is unstressed. The double tone gives the
word a sing-song quality to English speakers.

These tones differentiate many words in Swedish. In the single
tone, anden, tomten, biten, and slaget mean the duck, the building, the
bit, and the battle, respectively. In the double tone, they mean
the spirit, the elf, bitten, and beaten, respectively! English
uses dynamic stress or tones also, but only on whole phrases, such as
the rising pitch at the end of questions.

But many languages in Africa and Asia use far more complex tones, and
in fact are called tonal
languages. Chinese is the best known example. Although
words are often more than one syllable in length, each syllable has a
particular meaning. And Chinese uses a very limited number of
phonemes. It is the tones that prevent every syllable from having
hundreds of meanings. Mandarin has four of them:

Tone 1 -- high and level (as in hey!)
Tone 2 -- middle, then rising (as in was it you?)
Tone 3 -- middle, falling, then rising (as in mom!? spoken by a whining teenager)
Tone 4 -- high, then falling (as in Tom spoken by a disappointed mom)

For example, the simple syllable yi can mean many different
things. With tone 1 it means cloth, with tone 2 it means to
suspect, with tone 3 it means chair, and with tone 4 it means
meaning. The syllable wu means house, none, five, and fog,
respectively. And ma means mother, hemp, horse, and scold.
In the official transcription, the four tones are indicated by ¯,
´, ˇ, and `.
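
The syllable and tone pairs above amount to a two-level lookup, which the following sketch spells out. It also writes the tone mark over the vowel, using the combining forms of the four marks just listed. The glosses are the ones given in the text. The names MEANINGS, with_tone, and gloss are made up for illustration, and the sketch only handles syllables with a single vowel letter, since mark placement in longer syllables follows rules not covered here.

```python
import unicodedata

# Combining versions of the four tone marks: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}

# Glosses exactly as given in the text, keyed by syllable and tone number.
MEANINGS = {
    "yi": {1: "cloth", 2: "to suspect", 3: "chair", 4: "meaning"},
    "wu": {1: "house", 2: "none", 3: "five", 4: "fog"},
    "ma": {1: "mother", 2: "hemp", 3: "horse", 4: "scold"},
}

VOWELS = "aeiou"


def with_tone(syllable, tone):
    """Write a single-vowel syllable with its tone mark over the vowel."""
    positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if len(positions) != 1:
        raise ValueError("this sketch handles single-vowel syllables only")
    i = positions[0]
    marked = syllable[: i + 1] + TONE_MARKS[tone] + syllable[i + 1:]
    # NFC merges the vowel and the combining mark into one character.
    return unicodedata.normalize("NFC", marked)


def gloss(syllable, tone):
    """Look up the meaning of a syllable spoken with a given tone."""
    return MEANINGS[syllable][tone]


if __name__ == "__main__":
    for syllable in MEANINGS:
        for tone in (1, 2, 3, 4):
            print(with_tone(syllable, tone), "=", gloss(syllable, tone))
```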

We don't know how tonal languages arise. Many believe that it has
to do with phonemes or even whole syllables that have been lost, but
influenced the pronunciation anyway. But this makes it hard to
explain that Cantonese, which has kept many old consonant endings, has
nine tones, while its relative Mandarin Chinese, which has lost those
endings, only has four. Of course a linguist from China might ask
how non-tonal languages lost their tones!

One interesting tidbit is that tonality often crosses family
lines. In Asia, for example, tonality is found in Chinese, Thai,
and Vietnamese -- which are unrelated languages. On the other
hand, Tibetan and Burmese are related to Chinese, but are not tonal;
neither is Khmer, a relative of Vietnamese. Most African
languages are tonal, but Swahili is not. Hausa, spoken in
Nigeria, is tonal, but relatives like Arabic are not. It is
possible that one or another language family influenced others around
it, or was original to an area before being invaded by speakers of
another language.