Speech intelligibility and hemispheric asymmetry

Abstract

It is very rare, even in degraded listening environments, that we might confuse
speech with a dog bark or vice versa, despite the fact that both are complex acoustic
signals. Despite the well-founded assumption, based on clinical and anatomical
observations, that speech processing is left-lateralised, the results from brain imaging
studies have been inconsistent. One possible cause of this inconsistency is the use of
different imaging systems. The use of inadequate baselines, however, may pose a more
critical problem. In brain imaging studies, especially when cognitive subtraction is
used, images of cognitive processes are generally derived by subtracting a control
stimulus/task from an experimental counterpart. The two stimuli/tasks to be compared
are expected to differ only in one factor/process and the difference in brain activations
is thus attributed to that particular difference between the two. This requirement
makes it difficult to find baseline stimuli/tasks that activate all but the process of
interest. To date, spectrally rotated speech stands as the most satisfactory control for
intelligible speech, as it is as acoustically complex as speech but totally unintelligible.
However, spectral rotation has so far been applied to the whole signal, without
distinguishing the source and the filter of the sound, which are independent and
heterogeneous by nature. A series of behavioural studies performed in this thesis showed
that rotating the source did not significantly affect speech intelligibility, whilst
rotating the filter drastically decreased it. Another possible cause of the inconsistency
is the use of different brain imaging paradigms. With
carefully designed parameters using functional magnetic resonance imaging (fMRI),
we confirmed that intelligible speech recruited predominantly the left superior
temporal area, replicating the results from previous positron emission tomography (PET)
and fMRI studies. Since interference from scanner noise has been an issue in
auditory research using MRI systems, four imaging paradigms were compared, and it
was concluded that sparse sampling with a repetition time of 8 seconds had a clear
advantage over a longer repetition time of 16 seconds and over continuous sampling. This
paradigm was used in the study investigating the effects of channel number and the
presence or absence of tonal variation on speech intelligibility. Intelligibility increased
with the number of frequency channels, improving most sharply in the range of
2 to 6 channels. A subsequent brain imaging study, with mixed subtraction and
parametric designs, revealed that the right superior temporal gyrus responded most
strongly when pitch variation was present in speech, regardless of intelligibility, in
contrast to pitch variation in non-speech (here, spectrally rotated speech). As
intelligibility increased with spectral detail, the percent signal change of the
hemodynamic response in the left superior temporal gyrus increased linearly. The
current results support a streamed hierarchical model, in which
speech comprehension occurs predominantly in the left hemisphere.
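
The total spectral rotation discussed above can be illustrated with a minimal sketch. It assumes NumPy, a hypothetical helper name `spectrally_rotate`, and an illustrative 4 kHz band; none of these are taken from the thesis, and the FFT-based mirroring shown here is a stand-in, not the processing actually used to build the stimuli. The sketch mirrors the spectrum within the band, so that low-frequency energy moves to high frequencies and vice versa, and discards energy above the band.

```python
import numpy as np


def spectrally_rotate(signal, sample_rate, band_hz=4000.0):
    """Mirror the spectrum of `signal` about band_hz / 2.

    Hypothetical helper for illustration only; band_hz is an assumed value.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    spectrum = np.fft.rfft(signal)

    # Number of rfft bins covering 0 Hz up to and including band_hz.
    n_band = min(int(round(band_hz * n / sample_rate)) + 1, len(spectrum))

    # Reverse the in-band bins (with conjugation, as amplitude modulation
    # followed by low-pass filtering would); bins above the band stay zero.
    rotated = np.zeros_like(spectrum)
    rotated[:n_band] = np.conj(spectrum[:n_band][::-1])

    return np.fft.irfft(rotated, n=n)
```

Under these assumptions, a component at frequency f inside the band ends up at band_hz - f, so the rotated signal keeps a speech-like spectral complexity while its spectral shape is inverted.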