Now that we can sequence entire genomes, not only of humans but also of Neanderthals and Denisovans, our picture of the last 500,000 years of the Homo phylogeny is rapidly changing. We now know that modern humans interbred with other similar species that were around at the time, and a recent article suggests that a fourth species also contributed to the interbreeding (http://www.sciencedaily.com/releases/2013/12/131218133658.htm).

Do you think evidence of this interbreeding is interesting for linguistics at all?

Interesting!

(What's a species anyway?? Is it as poorly defined as a language? I thought interbreeding was the definition of same species, y'know, mutual intelligibility.)

Quote

Do you think evidence of this interbreeding is interesting for linguistics at all?

Hm, I don't know. Certainly it is for a couple reasons:

1. It's possible that these other species also had linguistic abilities-- if so, that would put an end to this "humans are superior because language" nonsense. It's not a uniquely human ability, even if humans happen to be the only extant species to have it-- I don't see any reason why another species couldn't evolve to do it either.

2. To understand the genetic background for "UG" (or your preference of theories), it might be useful to know where humans came from, and why they're not like gorillas and chimpanzees.

(3. A very unlikely scenario is that this interbreeding somehow allowed humans to stumble upon the 'language gene' and be able to speak, specifically due to mixing DNA. I find that very unlikely, if somewhat intriguing.)

But for theory of modern languages, I don't know if this makes much of a difference. On the other hand, at the very least it should help to define how the theory of modern languages shouldn't make random guesses and call them theories. Clearly the domain of genetic ability for language is more related to this information than the study of Turkish or Maori syntax, at least in a "hardware" sense.

(It's a very good point - actually here is another similarity between biological evolution and language change that I hadn't thought of before you mentioned this! What constitutes a species has similar fuzzy lines and definition problems - generally, two populations count as separate species, like two mutually unintelligible dialects, when they cannot produce fertile offspring [note: fertile offspring; lions and tigers can produce a 'liger', but it is sterile. Same with mules, I think?]. But this produces problems with the species we think of as separate, because sometimes they can interbreed. Does this mean we should consider them the same species? There are sometimes situations where one group can interbreed with a neighbouring group, and that group can with its next neighbour, and so on, but the first one cannot with the farthest neighbour. So it's a gradual thing. Defining species in this way would of course make Neanderthals and humans the same species, or subspecies of the same species. But there are other ways of describing different species as well, such as 'separate breeding populations', but that also has its own definitional problems... I'm pleased you highlighted another similarity in the analogy between language evolution and biological evolution!)

I totally agree with your post, especially with point 1. I'm not going to comment because I'm interested in what others have to say...

(It is an interesting parallel, isn't it? And you extended it as well: sometimes there are chains where one group can interbreed/understand their neighbors, but the longer distance pairings are mutually unintelligible / can't interbreed. And that's right about mules. I'd forgotten about ligers not having offspring, but you're right. And clearly with Neanderthals and the other 'species' they would need to pass on the genetic information or we wouldn't exist today apparently!)
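To make the chain idea concrete, here is a minimal Python sketch. Everything in it is a made-up toy (the five populations, the `compatible` rule and its `max_gap` threshold are assumptions for illustration, not real biological or dialect data). Neighbours along the chain are compatible and the two ends are not, yet grouping by "linked through some chain of compatible pairs" still lumps everyone together.

```python
def compatible(a: int, b: int, max_gap: int = 1) -> bool:
    """Hypothetical rule: two populations on a chain can interbreed
    (or understand each other) only if they are at most max_gap apart."""
    return abs(a - b) <= max_gap


def groups_by_chain(populations: list[int]) -> list[set[int]]:
    """Group populations that are linked through any chain of compatible pairs."""
    groups: list[set[int]] = []
    for p in populations:
        # Find every existing group that p is directly compatible with.
        linked = [g for g in groups if any(compatible(p, q) for q in g)]
        # Merge p and all of those groups into a single group.
        merged = {p}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


populations = [0, 1, 2, 3, 4]  # positions along an imaginary chain
neighbours_ok = all(compatible(p, p + 1) for p in populations[:-1])
ends_ok = compatible(populations[0], populations[-1])
chain_groups = groups_by_chain(populations)

print(neighbours_ok, ends_ok, len(chain_groups))
```

So the pairwise test says the two ends are separate, while the chain test says there is only one group, which is exactly the definitional problem for both "species" and "language".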

Now if someone wanted to claim this has nothing whatsoever to do with language, this is probably the part to pay attention to:

Quote

The research team estimates that between 1.5 and 2.1 percent of the genomes of modern non-Africans can be traced to Neanderthals.

Note "non-Africans"-- and we know that Africans can use language; so apparently it's not that 1.5-2.1% that matters for language. Just some 'noise' in the DNA, perhaps.

The research team estimates that between 1.5 and 2.1 percent of the genomes of modern non-Africans can be traced to Neanderthals.

It would appear I am slightly more Neanderthal than the average bear! I have 2.6% Neanderthal DNA in my genome. But the good news is that it is just slightly under (~0.01%) the mean average of the other users of the DNA service I subscribed to.

Here's a random guess: your number and also the average of that service are higher than what Cory posted, but perhaps that is because the general figure was for "non-Africans", not specifically Europeans. I'd imagine, based on the distributions of the species, that Europeans would be more likely to have Neanderthal DNA than other groups (such as Asians).

Yeah, they're becoming much easier, though the FDA just recently placed some serious restrictions on the health information they used to provide, but for ancestry it's still up and running. I had to spit into a little tube and send it over to the US, and then a few weeks later the results were available online. I was going through a big phase of interest in population migrations and DNA evidence and I needed to find out what 'my story' was (hoping not to sound so cheesy). Anyway, the price has come down over 300% since I first used the service in May 2012. As more info is found out, the records are always updated. I can trace my genetic mutations back to Ireland on my father's side (no surprise there) and also see that his ancestors descend from the group of early humans that lived in the caves in France during the last Ice Age. My mother's side is even more interesting: it appears her ancestors were part of the first batch of people to go to Europe (the modern descendants are the Finnish Saami and Basques). That's also something I found interesting because it brought me to research the connections with these languages as well. What are the languages that don't trace back to PIE? Hungarian, Basque, Finnish and Estonian. Finnish, Estonian and Hungarian are part of the same early language family, and it just happens that these language isolates came down to be the languages of the first migration, while it seems the second migration (I think about 20,000 - 30,000 years later) brought PIE to Europe. A lot of it is conjecture and theorising, but it's just so interesting I had to know how I fitted into the whole thing.

Edit: the company I used is called 23andme, in case anyone wanted to know.

What are the languages that don't trace back to PIE? Hungarian, Basque, Finnish and Estonian.

Not quite. You're forgetting about all of the extinct languages like Iberian, Etruscan and Pictish, as well as probably hundreds or thousands of others we have no records of. I'd expect, mathematically, that we'd instead be unlikely to find some relationship between Basque and Finno-Ugric. Also, Uralic is large in general (http://en.wikipedia.org/wiki/File:Fenno-Ugrian_languages.png)-- I wasn't aware of quite how big until I just looked it up. So you'd be assuming it's not related to other families like Sino-Tibetan, Dravidian, etc.

More than that, I don't know how likely it would be to find a genetic and linguistic link-- genetic populations are often maintained despite linguistic replacement (via small groups of conquerors who become rulers).

As I understand it, all languages outside of Africa came through (well, near) Europe, so other families like those found in the Americas or Australia would also somehow be related.

But I don't mean to sound too discouraging-- all very interesting, and also possible, but we have to be very careful about making any interpretations of this data... it's just so hard to know.

For example, personally I wouldn't be surprised if the Basques really do represent a very old linguistic and genetic line in Europe from the first humans there. But it's hard to know that with any confidence. For one thing, we're highly biased by our current impression of the Basques, while that's a historical effect, not a genetic or linguistic one. It's just as possible that the Basques could have ended up ruling Russia, which would make us less likely to make the same genetic assumptions. It's an interesting kind of bias. The mystery and uniqueness of the Basques makes us think there's something interesting to explain about their history (I mean: relatively interesting, compared to other peoples-- obviously they have some story or other which would be interesting in itself regardless). But that may just not be true. They may be a genetically and linguistically relatively uninteresting population who happened to survive until today as a minority group, giving only the impression that they are unique.

(One interesting detail here is that I don't see Basque really as a "linguistic isolate"-- the dialectal diversity approaches multiple languages, and there's fairly clear evidence that there were related languages spoken farther north centuries/millennia ago.)

Anyway, none of this makes the Basques any less interesting linguistically, but it does bring into question our assumptions based on current distributions about which peoples have interesting stories or originated where.

The PU urheimat is hypothesised to have been just north or northeast of the PIE urheimat, as evidenced by the multitude of PIE/Proto-Indo-Iranian loanwords. Although PU may not be as large as the PIE family, I would hesitate to call it an isolate.

Also, the only language showing any evidence of an earlier Paleolithic(?) substrate would be Sami. Finnish and Estonian have a lot of Proto-Germanic and Proto-Balto-Slavic loans (as well as later Proto-Norse, Old Norse, and Swedish), whereas Hungarian has many (Proto-)Turkic loans, showing the approximate migration of the two families. Sami has PG and PS, just as Finnish and Estonian do, but also seemingly shows a large substrate vocabulary, probably stemming from sometime around the Paleolithic at the end of the last ice age.

PU is not really any older than PIE, estimates for both families seem to be fairly close to each other, which is also corroborated by PIE loans in PU.

So I would hesitate to put the Uralic family with Basque. I'm not saying that Basque is a Paleolithic remnant, just that we don't know. This is not the case with Uralic.

Isolate was probably the wrong word to use. What I meant was a language that sticks out as being vastly different and historically unrelated to the languages surrounding it. I also didn't mean to imply the connection between them all.

What I perceive to be pretty uncontroversial is that PIE came with the second migration of humans to Europe, and a lot of these - very different and historically unrelated to PIE languages - very likely have their roots in the migration of the first peoples. I know we, as linguists, instantly want to jump away as far as possible from any claim that ties genetics to languages, because of the persistent misinformed arguments that arise where people try to tie them together. However, sometimes it just makes perfect sense. I think a scientist/linguist/whatever who could see that there exists (albeit alongside a lot of extinct languages) a modern society with a few languages that notably stick out in their environments as being very, very, very unlike their neighbours, alongside a genetic picture that places these exact same regions as belonging to a different historically migratory period. Ha, my sentence got a little too long so I will rephrase it. I think someone looking at that data would not be doing their job properly if their preconceptions of tying connections together clouded their judgement to investigate such a very real possibility.

Freknu, are you saying Hungarian is not historically part of the Uralic family? I thought it was a pretty widely-accepted fact that it was, but I don't know a lot about the topic and I know being Finnish-speaking, you are in a much better position to clarify this point.

Quote

PU is not really any older than PIE, estimates for both families seem to be fairly close to each other, which is also corroborated by PIE loans in PU.

Corroborated in what sense? Existence of loanwords in one language family to another wouldn't really provide any evidence for timing them as being of equal age, right? Even if PIE came along 20-30,000 years later, loanwords can enter, but I don't see how that corroborates any evidence. I'm totally just spurring on a point of potential discussion here, please don't think I'm trying to argue a point against you guys.

So I would hesitate to put the Uralic family with Basque. I'm not saying that Basque is a Paleolithic remnant, just that we don't know. This is not the case with Uralic.

Right. It's just unknown.

Quote

PU is not really any older than PIE, estimates for both families seem to be fairly close to each other, which is also corroborated by PIE loans in PU.

Ah, that reminds me-- I also meant to say earlier that there's a fair amount of comparative work on Uralic and IE suggesting some genetic relationship-- none of it is widely substantiated or accepted, but certainly it is at least as likely that PU and PIE are (possibly distantly) related as it is that Basque is related to either.

Quote

Isolate was probably the wrong word to use.

It's certainly the standard word for Basque (which I questioned above, though not at you in particular!). And I see what you mean about Hungarian for example-- in any basic sense it's about as obviously different from the rest of Europe as Basque, except that wide historical connections are known. On the surface, I'd be willing to accept Hungarian (and the others perhaps) as "isolates", though I don't know how much weight the term would (or should) have anyway. It just seems to be a comment that we don't know much about a language, which isn't interesting except perhaps as a reason to study it more.

Quote

What I meant was a language that sticks out as being vastly different and historically unrelated to the languages surrounding it.

That's the problematic bias I was talking about (and you're not alone!). It sticks out based on modern happenstance. There's absolutely nothing inherent about Basque linguistically (or the Basques genetically) that should suggest that it's inherently an isolate or unique. I imagine they were ruling the region at some point before the Celts (and perhaps others) arrived. So Basque is no more special than Breton, or even English for that matter. It's an interesting scientific puzzle to try to figure out where Basque came from, certainly, but the answer is no more interesting than any other linguistic relationship and therefore shouldn't be biased toward a special explanation (like the first Europeans) without other evidence. In the sense of "why not?" I probably would agree that it seems like Basque may be special, but perhaps I'm just biased too.

Quote

What I perceive to be pretty uncontroversial is that PIE came with the second migration of humans to Europe, and a lot of these - very different and historically unrelated to PIE languages - very likely have their roots in the migration of the first peoples.

What's the time depth?

As an arbitrary example question, would something like Nostratic or Eurasiatic (some proto-language existed, whatever it was) have been spoken in Africa rather than in the regions for which they're named (after the modern descendants)?

Or is the time depth of the migrations perhaps irrelevant to PIE and PU, etc.? Has there been so much contact (consider the Americas!) that evidence of deeper history is hidden or gone?

Quote

I know we, as linguists, instantly want to jump away as far as possible from any claim that ties genetics to languages, because of the persistent misinformed arguments that arise where people try to tie them together.

I'm tentative but I find the questions interesting. The problem is that I just don't think the relationships are very reliable. Sometimes, perhaps often, linguistic and genetic backgrounds split.

Quote

However, sometimes it just makes perfect sense. I think a scientist/linguist/whatever who could see that there exists (albeit alongside a lot of extinct languages) a modern society with a few languages that notably stick out in their environments as being very, very, very unlike their neighbours, alongside a genetic picture that places these exact same regions as belonging to a different historically migratory period. Ha, my sentence got a little too long so I will rephrase it. I think someone looking at that data would not be doing their job properly if their preconceptions of tying connections together clouded their judgement to investigate such a very real possibility.

I agree. But don't get too stuck in the available evidence. There's no reason to assume it's representative. That's what I'm suggesting.

Quote

Freknu, are you saying Hungarian is not historically part of the Uralic family? I thought it was a pretty widely-accepted fact that it was, but I don't know a lot about the topic and I know being Finnish-speaking, you are in a much better position to clarify this point.

Not that I can speak for freknu, but I don't think he meant to suggest that:

Uralic is a large family with lots of subgroupings. Within that is Finno-Ugric containing Finnic (Finnish, Estonian, Sami, etc.) and Ugric (Hungarian). I'm not sure on all of the details, but I don't think that's disputed. But freknu was, I believe, discussing Finnic vs. Ugric within Uralic/Finno-Ugric, just like Germanic vs. Romance within IE.

Quote

Corroborated in what sense? Existence of loanwords in one language family to another wouldn't really provide any evidence for timing them as being of equal age, right? Even if PIE came along 20-30,000 years later, loanwords can enter, but I don't see how that corroborates any evidence.

There are two ways to use loanwords to date contact / development:

1. Assume (and defend) a historical relationship for the borrowing. A lot has been discussed for PIE about the word for "wheel" because it was presumably invented at a certain point. Same with words for horse, etc. But there can be other versions of this: if the word for some basic idea, like "fish", is borrowed and found throughout the family, then the borrowing is probably original. Independent borrowing of the same word throughout the family is unlikely, so it probably goes back to the original language.

2. Via comparative reconstruction it is possible to determine the relative time depth of borrowing-- if a word did or did not go through certain sound changes, we know whether it was borrowed in a daughter or mother language within the family. For example, Spanish and English share many words, but we can separate them by how old they are based on their sounds-- compare pie/foot and cafetería/cafeteria-- obviously (and for more technical reasons) we can date the second pair as a later borrowing. This method, however, assumes relevant changes that make this apparent.

All of that is tentative, but it's possible to make some relatively strong arguments. I don't know about the PIE/PU evidence specifically, though.
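The second method can be sketched in a few lines of Python. This is only a toy: it assumes a crude, spelling-based stand-in for one set of regular correspondences (Grimm's law, where p, t, k in other IE branches correspond to f, th, h in Germanic), and the `SHIFT` table and `classify` function are hypothetical names made up for illustration. Real comparative work uses reconstructed sounds, not first letters.

```python
# Toy stand-in for Grimm's law, keyed on Spanish spelling:
# initial p ~ f, t ~ th, c (= /k/) ~ h in an inherited English word.
SHIFT = {"p": "f", "t": "th", "c": "h"}


def classify(spanish: str, english: str) -> str:
    """Guess whether the English word is inherited or a later borrowing,
    judging only by whether the initial consonant shows the shift."""
    s, e = spanish.lower(), english.lower()
    for unshifted, shifted in SHIFT.items():
        if s.startswith(unshifted):
            # Check the shifted form first, since "th" also starts with "t".
            if e.startswith(shifted):
                return "inherited"  # went through the sound change
            if e.startswith(unshifted):
                return "borrowed"  # entered after the change stopped applying
    return "undetermined"  # this toy rule has nothing to say


for pair in [("pie", "foot"), ("cafetería", "cafeteria"), ("tres", "three")]:
    print(pair, classify(*pair))
```

With these inputs, pie/foot and tres/three come out as inherited and cafetería/cafeteria as a later borrowing, which matches the intuition above. A pair like mano/hand comes out as undetermined, because the method only works where a relevant sound change makes the difference visible.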

Quote

I'm totally just spurring on a point of potential discussion here, please don't think I'm trying to argue a point against you guys.

No, not at all, and I'm not in a position to tell you that you're wrong anyway. It's an interesting topic. Speculation tends to lead the field of ancient comparative linguistics!

Freknu, are you saying Hungarian is not historically part of the Uralic family? I thought it was a pretty widely-accepted fact that it was, but I don't know a lot about the topic and I know being Finnish-speaking, you are in a much better position to clarify this point.

Not at all, they are part of the Uralic family, but their developments differ in part due to contact with different nearby cultures.

PU is not really any older than PIE, estimates for both families seem to be fairly close to each other, which is also corroborated by PIE loans in PU.

Corroborated in what sense? Existence of loanwords in one language family to another wouldn't really provide any evidence for timing them as being of equal age, right? Even if PIE came along 20-30,000 years later, loanwords can enter, but I don't see how that corroborates any evidence. I'm totally just spurring on a point of potential discussion here, please don't think I'm trying to argue a point against you guys.

Hmm.. I don't get it. Are you saying PU would have remained stable for 20-30,000 years before ultimately coming in contact with PIE?

Using comparative linguistics, both families are estimated to have co-existed, and the loanwords, which are often remarkably well preserved in form, match up with the reconstructed sound rules and linguistic developments.

Evidence today shows that in these regions of the world, these people seem to have concentrated amounts of genetic markers in their DNA that place them earlier in Europe and are likely to have direct ancestors who were among the first humans to enter Europe.

Hey, there is a curious fact about these regions that were just mentioned: they are also home to the tiny percentage of languages that don't descend from Proto-Indo-European.

It'd be interesting to evaluate the hypothesis that these two facts are related somehow.

If anything further than this was interpreted in my posts, I did not intend for it.

Regarding the dating of loanwords, I imagined a scenario much, much further back than tests such as the First Consonant Shift or anything like that. Way before the age we can date these sort of late changes. The connection I wanted to imply was maybe led more by analysis of the DNA results than of the languages. I accept the trepidation about connecting the two, but an observable analysis we can make is that the Basques and the indigenous Lapp (Sami) people in northern Finland do both have very high instances of people who have the U5 genetic marker in the maternal line (which is what I also have). It's only found in 9% of all Europeans. Basques and Sami. There is a connection. I don't want you to think I was saying "therefore the languages are related" because there are no connections. Maybe that genetic marker belonged to so many more people in the past. But don't you think it's interesting that we seem to have what could potentially be a glimpse into a linguistically pre-PIE past, with good corroborative data from the genetic markers: a relatively rare genetic trait (the U5 genetic marker), which occurs in only 9% of Europeans tested today, is predominant in regions where we can see these different languages?

It's a tantalising thought - one that would require much more investigation to even have a good hypothesis formed. But isn't that what we're supposed to do? Look for clues and make reasonably sensible suggestions based on sparse data to try to get a good glimpse into the past? That's all I meant. Seems like a basis upon which an idea could be formed.

Maybe that genetic marker belonged to so many more people in the past.

This is the problem I was hinting at-- we're quantifying in a way that is completely illogical. "More" is meaningless when we have no idea what kinds of numbers are actually correct-- was it 1,000 languages or 100,000? And of those, what percentage (roughly) was part of this potential subgrouping?

That's the problem with intuitive statistical guesses with limited evidence.

Quote

It's a tantalising thought - one that would require much more investigation to even have a good hypothesis formed.

Your hypothesis is good, but there are many other equally good hypotheses based on available evidence.

Consider the Eurasiatic and Nostratic theories, for instance. Neither is widely accepted, but there's a chance one is right. The problem is that we can't even know, assuming one is actually correct, which one that would be. And there are many other potential groupings that could work out just as well.

I'm genuinely fascinated by these questions, but I can't imagine a career in historical linguistics, because of that: we can't ever be sure of anything, and even something as apparently simple as narrowing the hypothesis space down a little is an impossible task.