It has been established that each
modality influences the spatial localization of a source perceived through the
other [20]: Subjects who are instructed to point at a visual
source of information deviate slightly from it if a competing acoustic
source is heard from another spatial position, and conversely, subjects
deviate more from the original acoustic source if a competing optical
source interferes from another location. In speech, such a capture of the
source is well known and widely used by ventriloquists, as the audience is
much more attracted to the dummy, whose facial gestures are more coherent
with what they hear than those of its animator [354]! Even
four- to five-month-old infants, presented simultaneously with two screens
displaying video films of the same human face, are preferentially attracted
to the face pronouncing the sounds heard rather than to a face pronouncing
something else [165]. This demonstrates a very early human capacity
to identify coherence between facial gestures and their corresponding
acoustic production. Listeners frequently exploit this capacity to
improve the intelligibility of a single speaker in a group conversation,
as in the well-known ``cocktail party effect''.