The Evolutionary History of Music

Human evolution has been relatively rapid since the first split from the apes 6 million years ago. Preparation for crossing the symbolic threshold went on for 5.9 million years, and once we fully acquired the symbolic mind, culture flourished across the world. One of the most significant outcomes of culture was music. Wherever culture existed, music existed alongside it. The evolution of music is directly related to the evolution of symbolic representation. This paper identifies the constraints needed for such evolution and puts them in chronological order.

6.1 million years ago, humans and apes were not yet separate. This was the time when a common ancestor of the two lineages existed. I am assuming that the characteristics of modern chimpanzees and monkeys are the same as those of our common ancestor. These characteristics will serve as the basis for tracing human evolution.

The communication of the apes was manipulative, holistic, limited, and gestural. A vervet monkey produces alarm calls that manipulate the behavior of its conspecifics. Whenever a predator comes into sight, it calls the others to run up a tree or hide. These sounds are holistic: they do not consist of compositional units. The vocal system of the apes is highly underdeveloped relative to that of humans, so the sounds they make are usually limited to grunts and barks. They tend to use gestural communication such as smiling and handshaking.

The apes possess a partial Theory of Mind. They have a sense of self and are able to recognize that others have some kind of intentions. They are not adept enough to imagine what others’ intentions may be, but they are able to follow the gaze of others. Apes communicate with each other through dyadic interaction. They are sexually dimorphic, which suggests that there is little pressure on communication for sexual selection. They are quadrupedal, usually using their knuckles to move around. They have a competitive nature. For example, when they go out for group hunting, each ape performs the same task.

Apes have a brain one third the size of that of modern humans. They possess a limited and non-recursive memory system. Studies have shown that they have mirror neurons, which fire whenever they observe the movements of others. This suggests that the apes are aware of others’ movements. They have basic emotions such as happiness, sadness, and anger. These are usually expressed biologically (involuntarily), for example by a strong grunt when a conspecific steals one’s food.

6.0 million years ago, Ardipithecus broke off from the apes. The first hominids were partially bipedal. Partial bipedalism resulted in enlargement of the brain to meet the demands of more varied movement. They usually lived in trees, but as they became more bipedal, they moved to more open landscapes. Exposure to more dangerous areas would have called for more cooperation and interaction.

4.5 million to 1.8 million years ago, Australopithecus evolved. All of them were partially bipedal, and more of them moved into the savannas, which made the Australopithecus group more cooperative. Their brains became bigger as their bodies became larger. The size ratio between males and females declined, but the difference was still large enough for them to count as sexually dimorphic. A decrease in the size ratio between males and females demands more vocal production for sexual selection. Crude stone tools were made, suggesting greater interaction among members of the species.

2.5 million to 1.8 million years ago, Homo habilis evolved. The group size became larger. They had good knowledge of the members of the group. The group started to become more cooperative. Judging from the differences in their dentition and jaws, they started eating more meat, which is possible evidence of a hunter-gatherer society. Therefore, vocal or gestural interaction among group members probably became more frequent in order to decide social roles and employ different methods. As group size increased, complex emotions such as shame would have started to develop, and the sense of self must have become stronger.

Tool making became more complex, possibly as a result of a much bigger brain. One important aspect of brain development during this period is the enlargement of Broca’s area, which is involved in producing speech. Their ears became more sensitive to higher frequencies, possibly suggesting the production of vocal sounds over a more diverse range of frequencies. Social interaction among Homo habilis was usually conducted through facial expressions, body language, actions, and vocalizations.

About 1.6 million years ago, Homo ergaster became fully bipedal, possibly as a result of fruit picking. The development of Turkana Boy’s thoracic vertebrae suggests that his neck posture was straighter. Free arms led to the development of sensorimotor control and a sense of rhythm. As they moved further into the savannah or open landscape, they lost their body hair to stay cool. With a narrower pelvis and the loss of hair that came with bipedalism, human infants were born early and helpless. Therefore, caretakers had to put their babies down, which exposed the infants to more vocal communication. They became less sexually dimorphic, and females started to choose their males. This process requires more communication in sexual selection.

A significant increase in brain size, mostly for bipedalism and sensorimotor control, ultimately led to more diversity in the use of body language, gesture, and vocalization. Development of the larynx allowed them a greater range of pitches. With the development of Broca’s area, I suspect that vocalizations became more natural. The symbolic threshold had not yet been crossed, so most of the vocalizations were probably involuntary.

The first form of infant-directed speech (IDS) was used. IDS during this period was holistic, emotional, and involuntary. The caretaker and the infant took turns in highly emotional dyadic interactions with primary content. Thus, musilanguage started to develop. Developments such as Broca’s area, the lowering of the larynx, ear sensitivity, and the enlargement of the brain allowed their vocalizations to sound more like those of modern humans. With these kinds of biological evolution, Homo ergaster was more motivated to use vocalizations in daily life.

About 1.5 million years ago, Homo erectus evolved. During this time, they dispersed out of Africa and moved into regions of Asia and Europe. This process requires a cooperative nature involving basic communication. An even lower degree of sexual dimorphism encouraged more vocal grooming.

They started creating symmetrical hand axes. Recognizing a bias in nature such as symmetry or identical colors leads to proto-iconicity. Creating something symmetrical requires knowledge of asymmetry. However, mimesis could not have started here, since the hand axes were not developed further for another million years. Then what explains the production of such a bias? At a time when they were not symbolic creatures, form and meaning were bundled together, since the symbolic hierarchies were not yet established. However, creating hand axes that have exactly the same form could not have happened without proto-iconicity. For the creator, the biased symmetry must have meant something unique to him, like a tinge of happiness. For the others, however, it was impossible to understand what it meant to the creator, because Theory of Mind had not yet been fully established. Only later, once humans cross the symbolic threshold, would the creator of a natural bias know that others might perceive the same meaning as he does. But since the stage of Homo ergaster is one of proto-iconicity (involuntary, partial Theory of Mind), similarity is constrained to oneself.

0.6 million years ago, Homo heidelbergensis evolved. Their site at Boxgrove contained spatially indexical spots. For example, the land next to the tree was for eating, and the portion of land next to that was for cooking. Specific ‘rules’ applied to each spot. This is evidence of the proto-index. A vast amount of proto-iconic data has been aggregated to form an icon of icons, which is an index. Again, the prefix proto- marks its involuntary, immediate, and primary-content character. Soon, a vast number of proto-indices will form a symbol, which would give early humans the symbolic mind.

0.2 million years ago, Neanderthals evolved. Right before their extinction, there is some evidence that they imitated Homo sapiens symbols. However, no uniquely Neanderthal symbol has ever been found, which suggests that the Neanderthals never crossed the symbolic threshold. Without the symbolic mind, neither music nor language could have evolved. A strong sense of hierarchy and categorization is crucial for symbolic creation.

0.25 million years ago, Homo sapiens evolved. 70,000 years ago, they created shell beads, which serve as indices of indices, that is, a symbol. By at least 70,000 years ago, they fully understood the power of symbolism. The symbolic mind led to the performance of mimesis. In this period, human actions became voluntary and mediated. Vocal mimesis allowed the full development of infant-directed speech, and groups now possessed shared intentionality. A fully developed Theory of Mind liberated proto- meanings from being constrained to oneself. Groups became even larger, interacting through triadic engagements.

The brain size of Homo sapiens was equal to that of modern humans, which allowed them to understand others’ intentions, entrain others, think recursively, and have cognitive fluidity. Once the symbolic threshold had been crossed, all of the requirements for music and language were in place.

Then what is the difference between music and language? Music is an advanced form of holistic communication. This does not mean that music is not compositional at all; the emphasis is on the holistic level. The reference of the communication is focused on emotion. Thus, when we hear music, we tend to perceive it without mediation. Language, on the other hand, is an advanced form of compositional communication. Again, this does not mean that language is not holistic at all; the emphasis is on the compositional level. Language tends to have concrete reference.

Both music and language are symbolic. However, music is more abstract while language is more indicative. Thus, with abstraction that carries a similar emotional reference, social bonding can form easily, as chains of indications form a holistic abstraction that is taken in immediately by the listeners. On the other hand, with indication that carries a concrete reference, individual bonding (dyadic engagement) can form easily, as concrete reference is given and taken.

From apes to culturally rich symbolic creatures, humans have gone through rapid evolution over the past 6 million years. Since there exists no fossilized evidence of music and language, the evolutionary puzzle remains complex and incomplete, like a treasure yet to be discovered.