February 16, 2012

Language is great for looking at the Holocene, but it changes too fast to be useful for more remote relationships. The point is not to question the origin of language in Africa, but to question the utility of modern linguistic data for providing answers to such questions.

From the paper:

we find that the correlation between phoneme levels and distance from putative origins is most negative when the origin is located in Eurasia, not Africa (figure 3d), implying that phonemic diversity has not been moulded at the global level by the same evolutionary processes that shaped neutral genetic diversity.

Also contrary to the predictions of SFE, the correlation between phoneme inventory size and geographical distance was most positive when the origin was located in Oceania rather than in the Americas. This is because there is a relative deficit of total and private phonemes in Oceania, and an excess in the Americas.

...

The non-tree-like pattern of phonemic variation is also inconsistent with the predictions of SFE. Because we were unable to construct a robust phoneme tree, we are unable to determine whether the observed correlations between phonemic difference and geographical distance are a by-product of the SFE process or the result of phonemic exchange between neighbouring languages. The fact that the correlations exist within regions independent of language family status, however, indicates that local exchange is responsible for at least some of the correlation.

...

Having diversified within the last 10 000 years, currently attested language families are young relative to the age of our species, and specialists have had success reconstructing the evolutionary process in many of them [3,24–28]. Only the IE correlation reached statistical significance, but the correlation was positive.
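The origin comparison in the excerpts above (correlating phoneme inventory size with distance from each putative origin, then seeing where the correlation is most negative) can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names, the plain great-circle distance (which ignores any constraints on travel routes), and the toy data are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = (math.radians(v) for v in (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def pearson_r(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def scan_origins(languages, candidate_origins):
    """Correlate inventory size with distance from each candidate origin.

    languages: list of ((lat, lon), inventory_size)
    candidate_origins: dict mapping a label to (lat, lon)
    """
    sizes = [size for _, size in languages]
    scores = {}
    for label, origin in candidate_origins.items():
        distances = [great_circle_km(origin, loc) for loc, _ in languages]
        scores[label] = pearson_r(distances, sizes)
    return scores


def most_negative_origin(languages, candidate_origins):
    """Label of the origin whose distance correlation is most negative."""
    scores = scan_origins(languages, candidate_origins)
    return min(scores, key=scores.get)


# Hypothetical toy data: five "languages" along the equator whose inventory
# size shrinks steadily from west to east. Not real phoneme counts.
TOY_LANGUAGES = [
    ((0.0, 0.0), 50),
    ((0.0, 10.0), 40),
    ((0.0, 20.0), 30),
    ((0.0, 30.0), 20),
    ((0.0, 40.0), 10),
]
TOY_ORIGINS = {"west": (0.0, 0.0), "east": (0.0, 40.0)}

if __name__ == "__main__":
    print(scan_origins(TOY_LANGUAGES, TOY_ORIGINS))
    print(most_negative_origin(TOY_LANGUAGES, TOY_ORIGINS))
```

Under SFE, the most negative correlation should fall on the true origin of the expansion. The paper's complaint is that, for phonemes, it falls in Eurasia instead.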

The paper also has free supplementary material here, from which the tree of Indo-European languages (left) is taken. The numbers attached to languages indicate the number of vowels/consonants in their phonemic inventory.

Rejection of a serial founder effects model of genetic and linguistic coevolution

Keith Hunley et al.

Recent genetic studies attribute the negative correlation between population genetic diversity and distance from Africa to a serial founder effects (SFE) evolutionary process. A recent linguistic study concluded that a similar decay in phoneme inventories in human languages was also the product of the SFE process. However, the SFE process makes additional predictions for patterns of neutral genetic diversity, both within and between groups, that have not yet been tested on phonemic data. In this study, we describe these predictions and test them on linguistic and genetic samples. The linguistic sample consists of 725 widespread languages, which together contain 908 distinct phonemes. The genetic sample consists of 614 autosomal microsatellite loci in 100 widespread populations. All aspects of the genetic pattern are consistent with the predictions of SFE. In contrast, most of the predictions of SFE are violated for the phonemic data. We show that phoneme inventories provide information about recent contacts between languages. However, because phonemes change rapidly, they cannot provide information about more ancient evolutionary processes.
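For readers unfamiliar with the SFE idea, here is a toy simulation of why repeated founder events reduce diversity with distance from the source. It is a deliberately crude sketch, not the model used in the paper: the function name and parameters are hypothetical, each individual carries a single variant, and there is no mutation or exchange between neighbours.

```python
import random


def serial_founder_chain(n_variants, pop_size, founders, n_steps, seed=0):
    """Count distinct variants in each population along a colonisation chain.

    Each new colony is founded by a small random sample of the previous
    population and then grows back to pop_size by copying founders at random
    (drift only; no mutation, no contact between neighbours).
    """
    rng = random.Random(seed)
    # Source population: every individual carries one randomly chosen variant.
    population = [rng.randrange(n_variants) for _ in range(pop_size)]
    diversity = [len(set(population))]
    for _ in range(n_steps):
        founder_group = rng.sample(population, founders)
        population = [rng.choice(founder_group) for _ in range(pop_size)]
        diversity.append(len(set(population)))
    return diversity


if __name__ == "__main__":
    # Diversity at the source, then at each successive colony.
    print(serial_founder_chain(n_variants=50, pop_size=500, founders=10, n_steps=8))
```

Because every colony can only carry variants present among its founders, the count never rises along the chain. The abstract's point is that phonemes change quickly and spread between neighbouring languages, which is exactly what this toy model leaves out.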

The Celtic languages (Welsh, Irish) are not "sandwiched between Albanian and Romanian" -- they are sister to the clade of all Italic languages. The Italo-Celtic languages are then sister to a big clade containing all other extant Indo-European languages.

"Wow, they are proposing an Albanian-Germanic language stage - interesting?!"

Yes. That is the bit that surprised me too. Possible, I suppose. But Albanian seems not really to fit anywhere. I've seen it placed at one time or another with almost all IE families. Perhaps it has become particularly admixed with the various languages it has come in contact with over the years.

Also, we see that Romanian is the oldest of the Romance languages. I wonder if this is because the people of ancient Thrace spoke a similar language and thus retained older forms into Romanian?

Earlier separation in the tree from all other extant Romance languages does not mean that Romanian is the oldest one of them. It just means that the ancestor of Romanian separated from those of all other extant Romance languages at an earlier date than they did among themselves. This is because Romanian is geographically remote and isolated from all other extant Romance languages.

Look at where Greek falls. It breaks away late, which means the other languages probably didn't travel through Greece. It means that Indo-European probably did not originate in Anatolia, with an early branch then pushing into Greece/southeast Europe.

The relatedness of Greek, Baltic, and Indic seems to fit with the idea that they shared an origin in the steppes.

Cladograms only rarely work for language families, especially where languages have separated but then remained in contact, or where there has been migration. Areal effects can be so strong that they can make it hard even to distinguish Sprachbunds from actual families.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
