In linguistics, intonation is variation of spoken pitch that is not used to distinguish words. Instead it serves a range of functions: indicating the attitudes and emotions of the speaker, signalling the difference between statements and questions and between different types of questions, focusing attention on important elements of the spoken message, and helping to regulate conversational interaction. It contrasts with tone, in which pitch variation in some languages distinguishes words, either lexically or grammatically. (Some British writers also use the term tone in their descriptions of intonation, but there it refers to the pitch movement found on the nucleus or tonic syllable in an intonation unit.)

Although intonation is primarily a matter of pitch variation, functions attributed to intonation, such as the expression of attitudes and emotions or the highlighting of aspects of grammatical structure, almost always involve concomitant variation in other prosodic features. David Crystal, for example, says that "intonation is not a single system of contours and levels, but the product of the interaction of features from different prosodic systems – tone, pitch-range, loudness, rhythmicality and tempo in particular."[1]

Most transcription conventions have been devised for describing one particular accent or language, and the specific conventions therefore need to be explained in the context of what is being described. For general purposes, however, the International Phonetic Alphabet offers two intonation marks. Global rising and falling intonation are marked with a diagonal arrow rising left-to-right [↗] and falling left-to-right [↘], respectively. These may be written as part of a syllable, or separated with a space when they have a broader scope:

He found it on the street?

[ hiː ˈfaʊnd ɪt | ɒn ðə ↗ˈˈstɹiːt ‖ ]

Here the rising pitch on street indicates that the question hinges on that word, on where he found it, not whether he found it.

Yes, he found it on the street.

[↘ˈjɛs ‖ hi ˈfaʊnd ɪt | ɒn ðə ↘ˈstɹiːt ‖ ]

How did you ever escape?

[↗ˈˈhaʊ dɪdjuː | ˈɛvɚ | ə↘ˈˈskeɪp ‖ ]

Here, as is common with wh- questions, there is a rising intonation on the question word, and a falling intonation at the end of the question.

In many descriptions of English, the following intonation patterns are distinguished:

Rising Intonation means the pitch of the voice rises over time [↗];

Falling Intonation means that the pitch falls with time [↘];

Dipping or Fall-rise Intonation falls and then rises [↘↗];

Peaking or Rise-fall Intonation rises and then falls [↗↘].
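The four patterns above can be told apart mechanically from the order in which a pitch track changes direction. The following is a minimal sketch under simplifying assumptions: the function name `classify_contour` is hypothetical, the input is an idealized list of pitch values (for example in hertz), and no smoothing is applied, so it would not cope with the small fluctuations of a real recording.

```python
from typing import List, Sequence


def classify_contour(pitches: Sequence[float]) -> str:
    """Label an idealized pitch track with one of the four patterns."""
    directions: List[str] = []
    for prev, curr in zip(pitches, pitches[1:]):
        if curr == prev:
            continue  # a plateau does not count as a change of direction
        step = "rise" if curr > prev else "fall"
        # Record a direction only when it differs from the previous one.
        if not directions or directions[-1] != step:
            directions.append(step)

    labels = {
        (): "level",
        ("rise",): "rising",
        ("fall",): "falling",
        ("fall", "rise"): "fall-rise",
        ("rise", "fall"): "rise-fall",
    }
    # Anything with more than two movements falls outside the four patterns.
    return labels.get(tuple(directions), "complex")


if __name__ == "__main__":
    print(classify_contour([150, 100, 140]))
```

A track such as 150, 100, 140 falls and then rises, so the sketch labels it as a fall-rise (dipping) pattern.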

It is also common to trace the pitch of a phrase with a line above the phrase, adjacent to the phrase, or even through (overstriking) the phrase. Such usage is not supported by Unicode as of 2015, but the symbols have been submitted.

All vocal languages use pitch pragmatically in intonation, for instance for emphasis, to convey surprise or irony, or to pose a question. Tonal languages such as Chinese and Hausa use intonation in addition to using pitch for distinguishing words. Many writers have attempted to produce a list of distinct functions of intonation. Perhaps the longest was that of W. R. Lee, who proposed ten; others, Wells among them, put forward six. Wells's list is given below; the examples are not his:

attitudinal function (for expressing emotions and attitudes)

example: a fall from a high pitch on the 'mor' syllable of "good morning" suggests more excitement than a fall from a low pitch

grammatical function (to identify grammatical structure)

example: it is claimed that in English a falling pitch movement is associated with statements, but a rising pitch turns a statement into a yes–no question, as in He's going ↗home? This use of intonation is more typical of American English than of British.

focusing (to show what information in the utterance is new and what is already known)

example: in English I saw a ↘man in the garden answers "Whom did you see?" or "What happened?", while I ↘saw a man in the garden answers "Did you hear a man in the garden?"

discourse function (to show how clauses and sentences go together in spoken discourse)

example: subordinate clauses often have lower pitch, faster tempo and narrower pitch range than their main clause, as in the case of the material in parentheses in "The Red Planet (as it's known) is fourth from the sun"

psychological function (to organize speech into units that are easy to perceive, memorize and perform)

example: the utterance "You can have it in red blue green yellow or ↘black" is more difficult to understand and remember than the same utterance divided into tone units as in "You can have it in ↗red | ↗blue | ↗green | ↗yellow | or ↘black"

indexical function (to act as a marker of personal or social identity)

example: group membership can be indicated by the use of intonation patterns adopted specifically by that group, such as street vendors or preachers. The so-called high rising terminal, where a statement ends with a high rising pitch movement, is said to be typical of younger speakers of English, and possibly to be more widely found among young female speakers.

It is not known whether such a list would apply to other languages without alteration.

The dominant framework used for American English from the 1940s to the 1990s was based on the idea of pitch phonemes, or tonemes. In the work of Trager and Smith[2] there are four contrastive levels of pitch: low (1), middle (2), high (3), and very high (4). (Unfortunately, the important work of Kenneth Pike on the same subject[3] had the four pitch levels labelled in the opposite way, with (1) being high and (4) being low.) In its final form, the Trager and Smith system was highly complex, each pitch phoneme having four pitch allophones (or allotones); there was also a Terminal Contour to end an intonation clause, as well as four stress phonemes.[4] Some generalizations using this formalism are given below. The American linguist Dwight Bolinger carried on a long campaign to argue that pitch contours were more important in the study of intonation than individual pitch levels.[5]

Normal conversation is usually at middle or high pitch; low pitch occurs at the end of utterances other than yes–no questions, while high pitch occurs at the end of yes–no questions. Very high pitch is for strong emotion or emphasis.[6] Pitch can indicate attitude: for example, Great uttered in isolation can indicate weak emotion (with pitch starting medium and dropping to low), enthusiasm (with pitch starting very high and ending low), or sarcasm (with pitch starting and remaining low).

Declarative sentences show a 2–3–1 pitch pattern. If the last syllable is prominent the final decline in pitch is a glide. For example, in This is fun, this is is at pitch 2, and fun starts at level 3 and glides down to level 1. But if the last prominent syllable is not the last syllable of the utterance, the pitch fall-off is a step. For example, in That can be frustrating, That can be has pitch 2, frus- has level 3, and both syllables of -trating have pitch 1.[7] Wh-questions work the same way, as in Who (2) will (2) help (3↘1)? and Who (2) did (3) it (1)? But if something is left unsaid, the final pitch level 1 is replaced by pitch 2. Thus in John's (2) sick (3↘2) ..., with the speaker indicating more to come, John's has pitch 2 while sick starts at pitch 3 and drops only to pitch 2.

Yes–no questions with a 2↗3 intonation pattern usually have subject-verb inversion, as in Have (2) you (2) got (2) a (2) minute (3, 3)? (Here a 2↗4 contour would show more emotion, while a 1↗2 contour would show uncertainty.) Another example is Has (2) the (2) plane (3) left (3) already (3, 3, 3)?, which, depending on the word to be emphasized, could move the location of the rise, as in Has (2) the (2) plane (2) left (3) already (3, 3, 3)? or Has (2) the (2) plane (2) left (2) already (2, 3, 3)? The latter question, for example, could also be framed without subject-verb inversion but with the same pitch contour: The (2) plane (2) has (2) left (2) already (2, 3, 3)?

Tag questions with declarative intent at the end of a declarative statement follow a 3↘1 contour rather than a rising contour, since they are not actually intended as yes–no questions, as in We (2) should (2) visit (3, 1) him (1), shouldn't (3, 1) we (1)? But tag questions exhibiting uncertainty, which are interrogatory in nature, have the usual 2↗3 contour, as in We (2) should (2) visit (3, 1) him (1), shouldn't (3, 3) we (3)?

Questions with or can be ambiguous in English writing with regard to whether they are either-or questions or yes–no questions. But intonation in speech eliminates the ambiguity. For example, Would (2) you (2) like (2) juice (3) or (2) soda (3, 1)? emphasizes juice and soda separately and equally, and ends with a decline in pitch, thus indicating that this is not a yes–no question but rather a choice question equivalent to Which would you like: juice or soda? In contrast, Would (2) you (2) like (2) juice (3) or (3) soda (3, 3)? has yes–no intonation and thus is equivalent to Would you like something to drink (such as juice or soda)?

Thus the two basic sentence pitch contours are rising-falling and rising. However, other within-sentence rises and falls result from the placement of prominence on the stressed syllables of certain words.

Note that for declaratives or wh-questions with a final decline, the decline is located as a step-down to the syllable after the last prominently stressed syllable, or as a down-glide on the last syllable itself if it is prominently stressed. But for final rising pitch on yes–no questions, the rise always occurs as an upward step to the last stressed syllable, and the high (3) pitch is retained through the rest of the sentence.
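The placement rules summarized above can be written out as a small procedure. This is only a sketch of the simplest case, assuming that every syllable before the last prominent one sits at level 2; the function name `final_contour` and its parameters are hypothetical and do not come from Trager and Smith.

```python
from typing import List


def final_contour(n_syllables: int, prominent: int, rising: bool = False) -> List[str]:
    """Assign pitch levels to each syllable of a simple utterance.

    n_syllables: total number of syllables in the utterance.
    prominent:   zero-based index of the last prominently stressed syllable.
    rising:      True for a yes-no question (2 to 3 rise), False for a
                 declarative or wh-question (2-3-1 pattern).
    """
    if not 0 <= prominent < n_syllables:
        raise ValueError("prominent must index a syllable of the utterance")

    levels = ["2"] * prominent  # everything before the prominent syllable
    if rising:
        # Upward step to the prominent syllable; level 3 is kept to the end.
        levels += ["3"] * (n_syllables - prominent)
    elif prominent == n_syllables - 1:
        # Prominent syllable is final: the decline is a glide on that syllable.
        levels.append("3↘1")
    else:
        # Otherwise the decline is a step down on the following syllables.
        levels.append("3")
        levels += ["1"] * (n_syllables - prominent - 1)
    return levels


if __name__ == "__main__":
    print(final_contour(3, 2))               # This is fun
    print(final_contour(6, 3))               # That can be frus-tra-ting
    print(final_contour(6, 4, rising=True))  # Have you got a min-ute?
```

The three calls correspond to the examples This is fun, That can be frustrating and Have you got a minute? given in the text.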

A more recent approach to the analysis of intonation grew out of the research of Janet Pierrehumbert[8] and developed into the system most widely known by the name of ToBI (short for "Tones and Break Indices"). The approach is sometimes referred to as Autosegmental. The most important points of this system are the following:

Only two tones, associated with pitch accents, are recognised, these being H (high) and L (low); all other tonal contours are made up of combinations of H, L and some other modifying elements.

In addition to the two tones mentioned above, the phonological system includes "break indices" used to mark the boundaries between prosodic elements. Breaks may be of different levels.

Tones are linked to stressed syllables: an asterisk is used to indicate a tone that must be aligned with a stressed syllable.

In addition, there are phrasal accents which signal the pitch at the end of an intermediate phrase (e.g. H− and L−), and boundary tones at full phrase boundaries (e.g. H% and L%).

A full ToBI transcription includes not only the above phonological elements, but also the acoustic signal on which the transcription is based. The ToBI system is intended to be used in computer-based transcription.

A simplified example of a ToBI transcription is given below. In this example, two phrases "we looked at the sky" and "and saw the clouds" are combined into one larger intonational phrase; there is a rise on "sky" and a fall on "clouds":

L*    L*H−    H*    H* L−L%

we looked at the sky and saw the clouds
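Because ToBI is intended for computer-based transcription, a tone tier like the one in this example can be split into its labels automatically. The sketch below is illustrative only: the function name `tokenize_tobi` is hypothetical, and it handles just the simple labels mentioned in the text (H or L followed by an asterisk, a minus sign or a percent sign), not the full ToBI inventory.

```python
import re
from typing import List, Tuple

# One label is a tone (H or L) plus a mark: * pitch accent,
# - or − phrase accent, % boundary tone.
TOKEN = re.compile(r"([HL])([*%\-−])")
KINDS = {
    "*": "pitch accent",
    "-": "phrase accent",
    "−": "phrase accent",
    "%": "boundary tone",
}


def tokenize_tobi(labels: str) -> List[Tuple[str, str]]:
    """Split a tone tier such as 'H* L−L%' into (label, kind) pairs."""
    stripped = re.sub(r"\s+", "", labels)  # spacing carries no meaning here
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(stripped):
        match = TOKEN.match(stripped, pos)
        if match is None:
            raise ValueError(f"unrecognized label at position {pos}: {stripped[pos:]!r}")
        tone, mark = match.groups()
        tokens.append((tone + mark, KINDS[mark]))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for label, kind in tokenize_tobi("L*L*H−H*H* L−L%"):
        print(label, kind)
```

Applied to the tier of the example, the sketch finds four pitch accents, two phrase accents and one final boundary tone.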

Because of its simplicity compared with previous analyses, the ToBI system has been very influential and has been adapted for describing several other languages.[9]

British descriptions of English intonation can be traced back to the 16th century.[10] Early in the 20th century the dominant approach in the description of English and French intonation was based on a small number of basic "tunes" associated with intonation units: in a typical description, Tune 1 is falling, with final fall, while Tune 2 has a final rise.[11] Phoneticians such as H.E. Palmer[12] broke up the intonation of such units into smaller components, the most important of which was the nucleus, which corresponds to the main accented syllable of the intonation unit, usually in the last lexical word of the intonation unit. Each nucleus carries one of a small number of nuclear tones, usually including fall, rise, fall-rise, rise-fall, and possibly others. The nucleus may be preceded by a head containing stressed syllables preceding the nucleus, and a tail consisting of syllables following the nucleus within the tone unit. Unstressed syllables preceding the head (if present) or nucleus (if there is no head) constitute a pre-head. This approach was further developed by Halliday[13] and by O'Connor and Arnold,[14] though with considerable variation in terminology. This "Standard British" treatment of intonation in its present-day form is explained in detail by Wells[15] and in a simplified version by Roach.[16] Halliday saw the functions of intonation as depending on choices in three main variables: Tonality (division of speech into intonation units), Tonicity (the placement of the tonic syllable or nucleus) and Tone (choice of nuclear tone);[17] these terms (sometimes referred to as "the three T's") have been used more recently.[15]

Research by Crystal[18][19] emphasized the importance of making generalizations about intonation based on authentic, unscripted speech, and the roles played by prosodic features such as tempo, pitch range, loudness and rhythmicality in communicative functions usually attributed to intonation.

The transcription of intonation in such approaches is normally incorporated into the line of text. A typical example would be:

We ˌlooked at the ↗sky | and ˈsaw the ↘clouds

In this example, the | mark indicates a division between intonation units.
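An in-line transcription of this kind is easy to process mechanically. The following sketch, with the hypothetical function name `parse_units`, splits a transcription at the | marks and reports the word carrying the nuclear tone in each intonation unit. It assumes that the arrow is written directly before the nuclear word, as in the example, and that a unit contains at most one arrow-marked nucleus.

```python
import re
from typing import Dict, List

TONE_NAMES = {"↘": "fall", "↗": "rise", "↘↗": "fall-rise", "↗↘": "rise-fall"}
# One or more arrows followed directly by the nuclear word.
NUCLEUS = re.compile(r"([↗↘]+)(\w+)")


def parse_units(transcription: str) -> List[Dict[str, str]]:
    """Return one record per intonation unit with its nucleus and tone."""
    units: List[Dict[str, str]] = []
    for chunk in transcription.split("|"):
        text = chunk.strip()
        if not text:
            continue  # skip empty stretches between boundary marks
        match = NUCLEUS.search(text)
        units.append({
            "text": text,
            "nucleus": match.group(2) if match else "",
            "tone": TONE_NAMES.get(match.group(1), "other") if match else "",
        })
    return units


if __name__ == "__main__":
    for unit in parse_units("We ˌlooked at the ↗sky | and ˈsaw the ↘clouds"):
        print(unit["nucleus"], unit["tone"])
```

For the example above, the sketch reports a rise on sky in the first unit and a fall on clouds in the second.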

An influential development in British studies of intonation has been Discourse Intonation, an offshoot of Discourse Analysis first put forward by David Brazil.[20][21] This approach lays great emphasis on the communicative and informational use of intonation, pointing out its use for distinguishing between presenting new information and referring to old, shared information, as well as signalling the relative status of participants in a conversation (e.g. teacher-pupil, or doctor-patient) and helping to regulate conversational turn-taking. The description of intonation in this approach owes much to Halliday. Intonation is analysed purely in terms of pitch movements and "key" and makes little reference to the other prosodic features usually thought to play a part in conversational interaction.

The most distinctive feature of French intonation is the continuation pattern. While many languages, such as English and Spanish, place stress on a particular syllable of each word, and while many speakers of languages such as English may accompany this stress with a rising intonation, French has neither stress nor distinctive intonation on a given syllable. Instead, on the final syllable of every "rhythm group" except the last one in a sentence, there is placed a rising pitch. For example[22]:p.35 (note that as before the pitch change arrows ↘ and ↗ apply to the syllable immediately following the arrow):

Hier ↗soir, il m'a off↗ert une ciga↘rette. (The English equivalent would be "Last eve↗ning, he offered ↗me a cigar↘ette.")

As can be seen in the example sentences above, a sharp fall in pitch is placed on the last syllable of a declarative statement. The preceding syllables of the final rhythm group are at a relatively high pitch.

Most commonly in informal speech, a yes/no question is indicated by a sharply rising pitch alone, without any change or rearrangement of words. For example:[22]:p.65

Il est ↗riche?

A form found in both spoken and written French is the Est-ce que ... ("Is it that ...") construction, in which the spoken question can end in either a rising or a falling pitch:

Est-ce qu'il est ↗riche? OR Est-ce qu'il est ↘riche?

The most formal form for a yes/no question, which is also found in both spoken and written French, inverts the order of the subject and verb. There too, the spoken question can end in either a rising or a falling pitch:

Est-il ↗riche? OR Est-il ↘riche?

Sometimes yes/no questions begin with a topic phrase, specifying the focus of the utterance. Then, the initial topic phrase follows the intonation pattern of a declarative sentence, and the rest of the question follows the usual yes/no question pattern:[22]:p.78

Information questions begin with a question word such as qui, pourquoi, combien, etc., referred to in linguistics as interrogatives. The question word may be followed in French by est-ce que (as in English "(where) is it that ...") or est-ce qui, or by inversion of the subject-verb order (as in "where goes he?"). The sentence starts at a relatively high pitch which falls away rapidly after the question word, or after its first syllable in the case of a polysyllabic question word. There may be a small increase in pitch on the final syllable of the question. For example:[22]:p.88

In both cases, the question both begins and ends at higher pitches than does a declarative sentence.

In informal speech, the question word is sometimes put at the end of the sentence. In this case, the question ends at a high pitch, often with a slight rise on the high final syllable. The question may also start at a slightly higher pitch:[22]:p.90

Mandarin Chinese is a tonal language so pitch contours within a word distinguish the word from other words with the same vowels and consonants. Nevertheless, Mandarin also has intonation patterns that indicate the nature of the sentence as a whole.

There are four basic sentence types having distinctive intonation: declarative sentences, unmarked interrogative questions, yes–no questions marked as such with the sentence-final particle ma, and A-not-A questions of the form "He go not go" (meaning "Does he go or not?"). In the Beijing dialect, they are intonationally distinguished for the average speaker as follows, using a pitch scale from 1 (lowest) to 9 (highest):[23][24]

Declarative sentences go from pitch level 3 to 5 and then down to 2 and 1.

A-not-A questions go from 6 to 9 to 2 to 1.

Yes–no ma questions go from 6 to 9 to 4 to 5.

Unmarked questions go from 6 to 9 to 4 to 6.

Thus, questions are begun with a higher pitch than are declarative sentences; pitch rises and then falls in all sentences; and in yes–no questions and unmarked questions pitch rises at the end of the sentence, while for declarative sentences and A-not-A questions the sentence ends at very low pitch.
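The generalizations in the preceding paragraph follow directly from the four pitch sequences, as a few lines of code can confirm. In this sketch the dictionary `CONTOURS` simply restates the figures given above on the 1 to 9 scale; the names `CONTOURS`, `ends_with_rise` and `rises_then_falls` are hypothetical.

```python
from typing import Dict, Tuple

Contour = Tuple[int, int, int, int]

# Pitch levels (1 = lowest, 9 = highest) as listed in the text.
CONTOURS: Dict[str, Contour] = {
    "declarative": (3, 5, 2, 1),
    "A-not-A question": (6, 9, 2, 1),
    "yes-no ma question": (6, 9, 4, 5),
    "unmarked question": (6, 9, 4, 6),
}


def ends_with_rise(contour: Contour) -> bool:
    """True if the last pitch level is higher than the one before it."""
    return contour[-1] > contour[-2]


def rises_then_falls(contour: Contour) -> bool:
    """True if pitch goes up to the second point and down to the third."""
    return contour[1] > contour[0] and contour[2] < contour[1]


if __name__ == "__main__":
    for name, contour in CONTOURS.items():
        ending = "final rise" if ends_with_rise(contour) else "low ending"
        print(f"{name}: starts at {contour[0]}, {ending}")
```

Only the ma questions and the unmarked questions end with a rise, while all four types rise and then fall earlier in the sentence.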

Because Mandarin distinguishes words on the basis of within-syllable tones, these tones create fluctuations of pitch around the sentence patterns indicated above. Thus, sentence patterns can be thought of as bands whose pitch varies over the course of the sentence, and changes of syllable pitch cause fluctuations within the band.

Furthermore, the details of Mandarin intonation are affected by various factors, such as the tone of the final syllable, the presence or absence of focus (centering of attention) on the final word, and the dialect of the speaker.[23]

Intonation in Punjabi has long been an area of discussion and experimentation. Several studies [Gill and Gleason (1969), Malik (1995), Kalra (1982), Bhatia (1993), Joshi (1972 & 1989)] explain intonation in the Punjabi language according to their respective theories and models.

Chander Shekhar Singh (2014) presented a description of the experimental phonetics and phonology of Punjabi intonation based on sentences read in isolation. His research design is based on the classification of two different levels of intonation (Horizontal level and Vertical level). The first experiment (at the Horizontal level) investigates three utterance types: declarative, imperative, and interrogative. The second experiment investigates the same three sentence types, but in the vertical sense. 'Vertical' here means a comparative analysis of the intonation of these three types of sentences, keeping the nuclear intonation constant. This experiment is reported to show significant results. The vertical level demonstrates the following types of accentuation in the Punjabi language:

1. Normal statement

2. Simple emphatic

3. Confirmation

4. Information

5. Doubtful/Exclamation

Experiment II shows a significant difference between the Horizontal level and the Vertical level.[25]

Cruttenden[26] points out the extreme difficulty of making meaningful comparisons among the intonation systems of different languages, the difficulty being compounded by the lack of an agreed descriptive framework.

An ESRC-funded project (E. Grabe, B. Post and F. Nolan) to study the intonation of nine urban accents of British English in five different speaking styles has resulted in the IViE Corpus and a purpose-built transcription system. The corpus and notation system can be downloaded from the project's website.[27] Following on this work is a paper explaining that the dialects of British and Irish English vary substantially.[28]

A project to bring together descriptions of the intonation of twenty different languages, ideally using a unified descriptive framework (INTSINT), resulted in a book published in 1998 by D. Hirst and A. Di Cristo.[29] The languages described are American English, British English, German, Dutch, Swedish, Danish, Spanish, European Portuguese, Brazilian Portuguese, French, Italian, Romanian, Russian, Bulgarian, Greek, Finnish, Hungarian, Western Arabic (Moroccan), Japanese, Thai, Vietnamese and Beijing Chinese. A number of contributing authors did not use the INTSINT system but preferred to use their own system.