Language acquisition: From sounds to meaning

Without understanding the 'referential function' of language (words as 'verbal labels' symbolizing other things), it is impossible to learn a language. Is this implicit knowledge already present in young infants?

The word "apple," as we pronounce it, is a sequence of sounds (phonemes) that we use whenever we want to refer to the object it indicates. If we did not know that a referential relationship exists between the sound and the object, it would be impossible for us to use, and learn, a language. Where does this implicit knowledge come from, and how early in human development does it manifest? This is the question Hanna Marno and her SISSA colleagues Marina Nespor and Jacques Mehler, in collaboration with Teresa Farroni of the University of Padova, attempted to answer in a study just published in Scientific Reports.

"A sensitivity to speech sounds is already present in newborns. These types of sounds are in fact perceived as special starting from the first days of life, and they are processed differently from other types of auditory stimuli. What makes this type of stimulus so special for the newborn?" asks Marno. "There's definitely a 'social' saliency: speech sounds signal interaction between conspecifics, which is important for the survival of the infant. But there is also another important aspect, i.e., referentiality: words are symbols that carry meanings and convey messages. If infants didn't know this, albeit implicitly, they wouldn't be able to acquire language."

"Try to imagine an infant who, on several occasions, sees his mother holding up a cup while uttering the word 'cup'," explains the researcher. "He could just think that this is something his mum would do whenever holding the cup, a strange habit of hers. But instead in a short while he will learn that the word refers to that object, as if he were 'programmed' to do so."

To test this hypothesis, Marno conducted experiments with 4-month-old infants. The babies watched a series of videos in which a person might (or might not) utter an invented name of an object, while directing (or not directing) their gaze towards the position on the screen where a picture of the object would appear. By monitoring the infants' gaze, Marno and colleagues observed that, in response to speech cues, the infants looked faster towards the visual object, indicating that they were ready to find a potential referent of the speech. This effect did not occur, however, if the person in the video remained silent or if the sound was a non-speech sound.