Tuesday, January 21, 2014

Tecumseh Fitch’s The
Evolution of Language was published in 2010, making a quick review of the
book a long time coming, and perhaps not as apropos as it might have been.
Still, I found the book to be a quite informative synthesis of many areas of
research in the speech and language sciences. A more appropriate title for the
book might be The Evolution of
Proto-language or perhaps The
Evolution of Speech, given the author’s heavy focus on the development of vocal
communication in humans, and much less discussion regarding the higher-level
components of language, particularly recursive syntax. I was surprised at this,
given the author’s contribution to the seminal paper The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?
(Hauser, Chomsky & Fitch, 2002). One of the major conjectures in this paper
is that the core, human-specific component of language is recursion, leading to
major questions about how it evolved and its instantiation in the brain. This
was the main reason I picked up the book in the first place, but the topic is mostly
avoided aside from some introductory discussion, perhaps for good reason, given
the uniqueness of its place in human language (and the accompanying difficulty
of applying the comparative method). The book’s discussion of syntax is nicely
summed up by the following quotation (p. 185):

In conclusion, animals which actually generate call
sequences that appear random seem to be exceptional, and in many species there
are rules (or constraints) upon vocal sequences that can reasonably be termed “animal
syntax.” However, the types of rules that govern these arrangements in primates
are very simple compared to human linguistic syntax: they typically can be
captured by trivial finite state grammars, and only the propositionally
meaningless “songs” of birds and whales require more complex grammars. Thus,
current data support the existence of a large gulf between animal “syntax” and
that employed in any human language.

Oh well – looks like I’ll have to wait to figure out the
answer to syntax. At any rate, the book is an excellent introduction to interesting
nuances of evolutionary theory, the comparative method and theories of
proto-language evolution, and has plenty of goodies that I think neurolinguists
should pay attention to. Here are some salient points that I was personally
unaware of before reading the book:

1. The descended
larynx is not uniquely human.

The “uniquely” descended larynx in humans is often touted as
indicative of the selective pressure of vocal communication in humans. It turns
out, though, that this trait is not uniquely human. In most non-human animals
without a permanently descended larynx, the larynx descends during vocalization.
Other mammals, such as deer, koalas, and the big cats (lions, tigers, etc.), do have a permanently descended larynx. What
function does a descended larynx serve? The lower the larynx, the lower the
pitch, which is a reliable cue to an animal’s body size. This is an important
signal both inter- and intra-species for potential conflict between animals.
Lowering the larynx may have been adaptive for humans as well, and studies show
that human subjects use pitch as a cue to body size, even though in humans pitch is not a reliable cue to
body size, indicating that perception of body size through pitch is a cognitive
relic from a previous ancestor. Also, the lack of a lowered larynx does not
appear to prevent nonhuman primates from producing speech, given that
(i) they can lower their larynx while
vocalizing, and (ii) they have plenty of control over their vocal apparatus for
other functions, like eating and biting.

2. A crucial
difference in vocal control between humans and other primates/mammals consists
of neuronal control of the larynx.

One crucial difference is the development of novel direct
connections between frontal motor cortical areas and brainstem motor neurons
involved in laryngeal control. While nonhuman primates lack these connections,
their presence in humans suggests that an important stage in the evolution of
speech involved developing neuronal control mechanisms of the larynx. This
hypothesis is supported by fossil evidence concerning the enlargement of the
thoracic canal in modern humans and Neanderthals relative to extant
primates and earlier fossil hominids. The thoracic spinal cord is critical in
the control of muscles for breathing. The thoracic canal is the channel
through the thoracic vertebrae that carries the thoracic spinal cord, and its
diameter is enlarged in late hominids compared to earlier hominids and
non-human primates.

At first, this argument may seem counterintuitive: doesn’t
the existence of sign language support
the evolution of language from a gestural system? However, the hard problem in
the evolution of language is why humans, uniquely among primates, use speech exclusively as the default signal of
communication, and how humans developed such intricate vocal control. If a
gestural system served as the proto-linguistic step on the way to complex
language, why not stay in the visual-manual modality? The viability of sign
language proves that there really is no reason to make the extensive
evolutionary jump to speech. Arguments in favor of speech, such as
communication at a distance, etc., face equally strong counterarguments in
favor of sign, such as silent communication during hunting, etc. Given a
community of speakers already using a gestural system, there appears to be no
selective pressure on the development of vocal communication. This is not to
say that gesture isn’t an important part of modern language use, or didn’t form
part of the proto-linguistic system of communication in humans, but that a
gestural proto-language doesn’t have much explanatory power as a bridge between
the communication of our last common ancestor and modern language. Arguments
against a gestural proto-language often come from experts in the study of sign
language (e.g., Karen Emmorey, 2005).

Think birdsong, in humans. This theory was first
proposed by Darwin, who supported it with a variety of comparative evidence,
though linguists at the time derided it. He noted the similarities between
birdsong and human speech, including critical facts such as (i) learned vocalizations,
(ii) babbling in young birds and humans, and (iii) local “dialects” in both
birdsong and human language. He appealed to sexual selection in birds as the
mechanism that drove the evolution of vocal imitation in humans, and the
emergence of meaning through imitation of natural sounds using both sounds and
gestures (onomatopoeia and imitation of innate human vocalizations like
laughter or crying). This last statement will probably prompt many linguists to
point out the fundamental, Saussurean notion of arbitrariness of the linguistic
sign against this theory, but the crucial point is to explain how vocalizations
began to acquire meaning, not to account for the development of the rest of the
lexicon. In fact, sign languages have many iconic gestures, which explains how
the signs were first developed, but signers treat iconic signs the same as any
arbitrary word once the full system of language is in place. So the
theory gets you to sound-meaning correspondences, and assumes that
arbitrariness took over from there.

The nice thing about musical proto-language theories is that
they explain music – as an ancestral relic of our proclivity for structured,
vocal utterances that do not depend on analytic meaning.

5. The importance of
the comparative method in studying cognition, even in traits that are not homologous with humans.

This is driven home throughout the entire book, and there
are many examples that embody the utility of the comparative method. The
descended larynx is one example – larynges descended in other mammals, but this
trait is not homologous to the human one. The repeated evolution of laryngeal descent
does provide insight into why such a
trait evolved, and we can apply this insight to humans. Likewise for birdsong –
birdsong isn’t homologous to human speech,
but there is much insight to be gleaned from studying it. In particular, animal
models of the infamous FOXP2 gene highlight this quite nicely – transgenic mice
with the FOXP2 mutation that the KE family has (speech motor control issues)
show approximately normal vocalizations, but FOXP2 expression in songbirds
increases during song learning in the bird homolog of the basal ganglia.
Birds in which FOXP2 expression is downregulated show incomplete and
inaccurate song learning. This suggests very conspicuous connections to
the function of this gene in humans, and to how vocal learning and vocal
control in humans operate.

The book reads exceptionally well – the points are laid out
very clearly, and the main goal of the discussion is always kept in sight. Fitch
also excels at exploring the theoretical foundation of many of our modern ideas
– his exposition of Darwin’s thoughts on language evolution (and the
observation about how rarely Darwin is cited about this issue) was interesting
and fun to read through. Fitch is very fair when discussing different
perspectives on the issues, almost too fair – he is quite charitable when
discussing Rizzolatti and Arbib’s mirror neuron hypothesis of language
evolution, even while pointing out important problems. He is also quite
thorough, exploring several important hypotheses while not rambling
pedantically through every possibility. If you’re looking for a book crammed
with witticisms and ways to get undergraduates interested in language à la Pinker, you won’t find it here;
instead, you’ll find a useful, interesting and exhaustive resource regarding
the evolution of language (minus recursion).

4 comments:

A good follow-up could be "The Unfolding of Language" by Guy Deutscher (a linguist), which I highly recommend, in turn! Not a lot of recursion there either, but it does very satisfyingly explain how complex linguistic systems might arise from simple proto-languages.

I believe you have misunderstood an important distinction in speech production and perception. The descended larynx does not affect pitch (a.k.a. fundamental frequency) but rather formant frequencies. These are the peaks in the filter function of the supralaryngeal vocal tract, i.e., the frequencies that the tract filters out least. I used to know Tecumseh Fitch (we had the same doctoral adviser, Philip Lieberman), and the association between body size, larynx length, and formant frequencies was central to his doctoral research. I'd be very surprised if he did not make the, er, fundamental difference between pitch and formants clear in his book.
