Trask on Mama and Papa.

Larry Trask is a longtime LH hero (search results), and I was delighted when Faldage at Wordorigins.org linked to his brilliant essay “Where do mama/papa words come from?” (pdf). A nugget I plucked out and shared in the Wordorigins.org thread: “As always, lookalikes are a waste of time, and words that genuinely share a common ancestry do not have to resemble one another.” The whole thing is worth your while; as I also wrote there, “he rejoiced in a rare combination of scholarly accuracy and compulsive readability.”

Comments

Now I am very proud of myself because this is exactly the explanation I have been giving to people for years. Parents and coworkers asked that same question many times at the infant-toddler centres where I worked, and it was obvious to me that all these mama/papa/tata words were assigned by parents (or grandparents, or whoever) to the babbling exercises of infants. “Listen, she knows her papa!”

It’s a great essay in several ways: debunking a linguistic myth, exploring comparative etymologies, and describing language change. I quoted gratefully from it in my post a few years ago summarising the origins of mama/papa words.

Since not everyone is going to make it to page 16 of a pdf, I hereby herewith and herewhenceuntonotwithstanding remark that it was in fact my personal favourite linguist, Roamin’ Roman Jakobson, who first gave the correct answer, in the 1950s.

Larry Trask was not one of Merritt Ruhlen’s most fervent admirers (to put it mildly), and all the way through the article I was expecting to see something rude about him, but probably he thought it best to ignore him completely. It was a sad day for sci.lang (and more generally, of course) when Larry died, more than a year younger than me.

Curiously, the first word Spanish parents “hear” their children “say” is “ajo” (garlic). This comes even before “mamá” and “papá”, and is greeted with tremendous joy and enthusiasm: he/she said his first ajo!

I wonder if the ajo that Spanish parents hear might be some sort of pre-babbling noise that other cultures don’t interpret as attempts at speech. The word does sound a bit similar to some kinds of non-linguistic vocalizations.

Even a non pro like me realized ages ago that the mama/papa/baba words were onomatopoetic. But I wonder to what extent that could be stated about other sounds heard worldwide by humans. I’m for instance thinking about whiz, whine, whistle, zing, hiss, swish, chirp, peep, beep and the like. Are they also over and over again re-invented? (The Swedish varieties are all built up around an /i/). Birds with pleasant (?) sounds often contain this /i/ like siskin (Sw. siska) or tit (Sw. tita). ‘Pippi’ is Swedish child language for ‘bird’ (the name of the strong girl with long stockings is secondary).

The last one, ’korp’ is a newcomer in Swedish. The old word is ‘ramn’ (c.f. Icelandic ‘hrafn’ and English ‘raven’). ‘Korp’ certainly looks like an onomatopoetic re-invention, as it is exactly how my ears hear the descendants of Huginn and Muninn. http://en.wikipedia.org/wiki/Huginn_and_Muninn (reminding of the gulls who cried Three quarks for Muster Mark and changed our conception of the universe).

Is it thus just a matter of time until the southern big brothers of mine will stop calling a butterfly a Schmetterling? (‘Smattra’ in Swedish means ‘patter’ or ‘clatter’ – also onomatopoeia – and that’s the last thing a butterfly does). :-)

A lot of interesting detail in this article; however, as to the main argument, I am quite sure that every person who ever had a baby has made this linguistic discovery quite independently. It was a bit like reading a long proof of why 2×2=4 that started by explaining why it is not 5 as some may believe.

(it also has a wrong Russian word for “girl”; it did not affect the argument, but I could not help noticing).

On a related note, when I lived in CA, I often observed that in Mexican Spanish people use the word mama for both mother and female baby and papa for both father and male baby. So it is more of a word for a male/female member of the parent/child relationship. I always wondered how common is it, and if it occurs in other cultures.

“‘Korp’ certainly looks like an onomatopoetic re-invention, as it is exactly how my ears hear the descendants of Huginn and Muninn. ”

Then it has to be a really old one, because the word looks cognate to “corvus”, “corbeau”, “cuervo”. There is also a Scots word “corbie”. I always assumed it was of Norse origin, but on second thought it looks Latin or maybe Celtic. I have looked and can’t find an Irish cognate, but it might have entered Scots Gaelic from Cumbric and then on into Norway and further.

“I’m for instance thinking about whiz, whine, whistle, zing, hiss, swish, chirp, peep, beep and the like. Are they also over and over again re-invented?”

‘Smattra’ in Swedish means ‘patter’ or ‘clatter’ – also onomatopoeia – and that’s the last thing a butterfly does

Oh, zerschmettern is the literary word for “smash”, and schmettern can be applied to loud, fast, more or less professional singing, implying it is impossible to ignore (even if it’s actually beautiful). The etymology of Schmetterling is actually quite similar to that of butterfly: various Slavic smetana, śmietana etc. mean “cream”, which is attested in some German dialects as Schmetten and perhaps Schmand.

There’s a children’s book about a very upset dragon (poetic: Lindwurm; lind “mild”) and an equally upset butterfly that eventually end up happy ever after by agreeing to remix their names into Lindling and Schmetterwurm.

The first language-like sound coming from the mouths of French babies is areu (with fricative r), which must be what Spanish babies also produce, but there is no such word in French and so areu is just a sound babies make – pre-language, perhaps.

Tatiana: in Mexican Spanish people use the word mama for both mother and female baby and papa for both father and male baby. So it is more of a word for a male/female member of the parent/child relationship. I always wondered how common is it, and if it occurs in other cultures.

It may be derived from some indigenous languages. In a number of North American languages it is not the parent-child but the grandparent-grandchild relationship which is involved, with a grandchild being called by the word for the grandparent of the same sex.

The first language-like sound coming from the mouths of French babies is areu (with fricative r), which must be what Spanish babies also produce, but there is no such word in French and so areu is just a sound babies make

In Russian the “babies’ first word” is either “gu” or “agu”, neither means anything but the baby’s first word. And the activity of making these sounds also has two names, either гуление or агуканье (“making gu sounds” vs. “making agu sounds” 🙂 )

Both words are used as equivalent translations of Eng. “baby’s crowing” in a 1989 Russian translation of Frederick Tracy’s “The Psychology of Childhood”, but only гулить ~~ лепетать is visible in Google word searches before the 1890s

Australian languages (or at least Pama-Nyungan?), according to a paper by Rachel Hendery, typically have mama ‘father’ in some areas, ‘elder brother’ in others. papa is ‘mother’ in many languages around Victoria.

It looks like ah-gu’s are seldom produced in doubles. E.g. here. BTW thinking some more of гуление or агуканье: the former is always a sound a baby makes, while the latter is, commonly, a parent’s way to talk to a baby.

Of course, Hans, David knows – hence my smiley. Lind in modern Swedish is lime-tree (Ger. Linden) and orm is snake so most Swedes believe that a lindorm (lindworm) is a snake for some mysterious reason living in a lime-tree. I myself in a previous posting used ‘Hänschen und Gretchen’ from a children’s book instead of Grimm’s original ‘Hänsel und Gretel’.

Google gave Eng. ‘lime-tree’ → Ger. ‘Linden’. That can’t be, I thought, it must be ‘Lind’ or ‘Linde’ in singular. The plural I of course knew from the famous Berlin street Unter den Linden, ‘Under the Lime-Trees’. So it’s not stupid Google in this case but stupid me.

There was a book, by F. Holiday, titled The Great Orm of Loch Ness (1968), claiming Nessie was an invertebrate. That was the only English occurrence of orm I’ve seen, and I think it’s Holiday’s innovation. It’s not in the OED.

I wonder if Holiday got it from Great Orme, formerly also known as Great Orm’s Head?

Y: orm is the Scandinavian cognate to worm. In front of ‘u’ and ‘o’ Scandinavian has dropped ‘w’ and ‘j’ (‘y’). Compare word-ord, wolf-ulv, wound-ond (painful), wool-ull, work-ork (physical ability), young-ung, yoke-ok. Even your-er (plural) are cognates, where the ‘o’ and ‘e’ are regular English and Scandinavian developments, respectively, of a proto-Germanic ‘ei’.

The semantic change from worm to snake in Swedish is of course due to the similar way of movement of these creatures. Snok in Swedish is the specific species ‘grass-snake’ (Natrix natrix) and the plural snokar is the family of snakes Colubridae (as opposed to the family huggormar, Viperidae or vipers). The other way around, the German and Danish snake words, Schlange/slange, in Swedish (slang) simply mean ‘hose’.

What a bizarre development! The Norse name orm is preserved by the English, but reinterpreted by the Welsh as the French word orme ‘elm’ and translated into Welsh as gogarth ‘id.’ See Yew-land > Boar-town > Horse Bay.

[N.b.: JC is referring to this ungrammatical, unsubstantiated, and incorrect claim in the WP article: “Gogarth mean elm in Welsh, as Orme in Norman.” I have deleted it. -LH]

gwenilian: do you know which Native American languages this occurs in?

I know there are several. One of them is Chinook(an), a language or small family formerly spoken along the Columbia River, but there are also others in that general area. There might be articles about this in anthropological works. You might look up “kin terms of address”.

I used to own a copy of Holiday’s The Great Orm of Loch Ness, but I seem to have lost it. However, I have a vague recollection that he claimed to have taken the word “orm” from an account in an old manuscript. Since this would have been an old Scottish manuscript, this would most likely have been the Norse word, since Scots has a lot of Norse influence.

The reason he picked up on the word was to argue for a different type of beast than a sea serpent or dragon, some kind of giant slug or something (I’m dredging this up from old memories). By claiming that old Scots had a special word for this beast, it would show that people thought it to be distinct from other kinds of beast. But since it’s just the Norse version of “worm” (also applied to dragons and such), that argument doesn’t hold up very well.

The question of mentions of Nessie in historical sources is a complicated one. Ronald Binns’s “The Loch Ness Mystery Solved” does a good job of untangling it, but basically most Nessie books cite each other rather than original sources. Some of the original sources are a bit murky too.

According to Binns, Holiday eventually gave up on Nessie and started writing about “Welsh humanoids”, whatever that may be.

Wikipedia sez “The most recent etymology of Lloegyr is that by noted linguist Eric Hamp, who suggested in 1982 that Lloeg(y)r could be derived from a Proto-Celtic compound *(p)les-okri-s, meaning ‘having a nearby border, being from near the border’.” Hamp had a lot of clever but not compelling ideas, and this is one of them.

Sorry for coming late to this discussion. I would stand up for a view radically different from that advocated by Trask (after Murdock and Jakobson sixty years ago, and various authors in the 19th century).

Me and my friend Alain Matthey de l’Etang have analyzed kinship terminologies from some 3,500 languages, and our conclusion is that papa/mama words _are_ inherited, certainly from Proto-Sapiens in most cases.

The arguments and lots of data are presented on our website, see under Publications.

The last publication on the list (Pierre J. Bancel & Alain Matthey de l’Etang. 2013. Brave new words. In Claire Lefebvre, Bernard Comrie and Henri Cohen eds. New Perspectives on the Origins of Language. Amsterdam: John Benjamins. 333–37) contains a discussion of Trask’s data and analysis, and a description of the mechanism by which the limited capacities of babies act as an exceptional preservation tool for these words.

I find it quite likely that “mama/papa” forms are “baby talk”, but the words/stems/roots for these basic kin terms can be much more varied over the world, especially those for “father”, since the biological link is not always culturally the most important.

Unfortunately, none of these will stand up. If Latin tata survived into Romanian, it would take the form *tată, not tata. The Latin word pap(p)as means not ‘father’ but ‘tutor, governor’, and is not recorded in the vocative (which is lost in the Romance languages anyway). If pappa survived into French, it would take the form *pape; if it were papas, it would be *pabe. And as for Greek pappas, of which Latin pap(p)as is a borrowing, there is no other instance of sound-change in Greek from /p-/ to /b-/.

our conclusion is that papa/mama words _are_ inherited, certainly from Proto-Sapiens in most cases

I’ve now read your two papers from 2005. It certainly is true that such words can be inherited and even borrowed. However, despite the very large number of examples you bring up, I think your database still isn’t large enough. Examples:

– How do you explain the rare cases like Georgian, where dede and mama have the opposite of the expected meanings?
– What about the fact that mam(m)a means “breast”, exactly as Jakobson expected, in Latin and apparently also in modern Spanish?
– You claim that kaka never means “father”. I’ve read it does in Turkmen.
– You claim that it’s completely unrealistic to assume that the father’s brothers consistently meet the child earlier than the mother’s brothers do. Consider a patriarchal, patrilocal society where the mother’s brothers aren’t even in the same village; while by no means universal, such cultures have been very common. This is actually a nice opportunity to test one of your hypotheses.
– There are languages where nursery words (phonologically simple, and distinct from the synonyms adults normally use) are not restricted to kinship terms, but extend for example to “water”.

I find etymological problems, too. You try to derive German Opa, Oma “grandpa, grandma” from Pokorny-PIE *au̯-; this, in turn, you derive from PIE *XAXA, and that in turn from *KAKA. Every link in this chain is wrong:
– Opa, Oma have longer versions: Opapa, Omama. These in turn are suspiciously similar to Großpapa, Großmama, which are also attested; clearly, [o(ː)] is baby for Groß-. Obviously, Großpapa, Großmama are just the formal terms Großvater, Großmutter “grandfather, grandmother” with the nursery terms calqued in.
– If you don’t accept my “clearly” and insist that this [o(ː)] must be related to the root of Latin avus, ava by something other than regular sound correspondences, there remains the fact that today’s IEists tend to wince when Pokorny’s ancient dictionary is brought up. Some say half the stuff in it can’t be taken seriously by modern standards. Pokorny consistently erred on the side of inclusion, listing everything that he thought might conceivably go back to some kind of PIE even if several irregular developments have to be assumed, and even if a supposed root is only attested in a single descendant family; he did not consider what was in his time called “laryngeal theory”, so three whole consonant phonemes are missing from his reconstructions, and the phonotactics are off.
– OK, OK, so maybe Pokorny’s *au̯- was a well-behaved root *h₂aw-, and the *XAXA you propose was some kind of **h₂ah₂a (PIE roots are not supposed to end in a vowel, but nursery terms may well have been an exception). Perfect! The first two phonemes match! But how would *h₂ have changed into *w?
– If we project PIE **h₂ah₂a straight back to Proto-Nostratic, we get **qVqV. Proto-Nostratic **kaka, on the other hand, would have given PIE **gege or perhaps **gaga!

By the way, the Hattic language is nowhere near Indo-European except geographically. It is no more similar to IE than Basque or Burushaski are.

David, my compliments, I suspect you are now officially overqualified for the job of teaching the history of the French language. Yes, a Latin form *papa would indeed have yielded *pève in Modern French (Compare the real French word fève, from Latin faba).

Romanian tata is a definite form, as if from a Latin feminine *tata(m) illa(m) (Romanians still use the older *tatam illum, with a Latin masculine demonstrative, in the Lord’s prayer: Tatăl nostru… ), so this example, unlike the French and Modern Greek, does go through.

From the same article JC read: “How could Trask miss the fact that this opposition already existed in Sanskrit, from which Bengali and Hindi obviously inherited both baba and pita?” But bābā is not in Sanskrit dictionaries, and meanwhile Hindi and Bengali pitā is not inherited but borrowed from Sanskrit.

JC: sound-changes that produce /v/ are not allowed to apply to inherited baby-French, whose words go by different rules

What has not been mentioned yet (unless I missed something) is that many of the baby words involve reduplication, so that for instance the ancestor of French papa has been treated not as a unit word with single penultimate stress but as two stressed syllables, pa’ pa’, so that each p behaved as word-initial, unchanged since Latin times. Reduplication is typical of French baby words, only some of which seem to be versions of adult words: among them are lolo ‘milk’ (standard lait), dodo ‘sleep’ (dor- apparent stem of the verbal forms), tété ‘boob’ (téter ‘(baby) to nurse’), néné ‘boob’, toutou ‘doggy’, and of course pipi ‘pee’ and caca ‘poo(p)’.

The last word shows the evolutionary difference between baby words and normal ones. Ca ca presumably existed in Latin baby vocabulary since there was a verb cacare. The reduplicated baby word has survived intact, like pa pa has and for the same reason (and similarly with pi pi), but the verb has gone through several perfectly regular Latin-to-French changes, ending up as chier ‘to shit’ (a word not normally used in respectable company).

I find it interesting that the Proto-Sapiens or Proto-World people consider the suitability of (K)AKA as relating to a male relative of the father’s or grandfather’s generation while ignoring the far more pervasive presence of excrement in the life of a baby.

George Gibbard, you make a good point about bābā not being attested in Sanskrit. I might add that even if it had existed it would have yielded a monosyllabic form, as intervocalic /b/ was lost in the transition from Sanskrit to Hindi and Bengali. So was intervocalic /t/, so that there can be no doubt that pitā in both modern Indo-Aryan languages is indeed a learned borrowing from Sanskrit.

David, since you asked…I have worked at two French departments where the history of French was taught by a faculty member. In both instances the entire distinction between inherited and borrowed Latin elements was not explained to the students, for the good reason that the faculty member (in both instances a literary specialist) not only didn’t know about the difference but couldn’t understand it. Really. Trust me. I tried.

But I would be doing those two an injustice if I didn’t tell you what I learned from them, as both taught the history of French through the lens of their own research interest, and thus (via the students we had in common) I learned quite a bit about class struggle on the one hand and representations of sexuality on the other. And I had no idea you could speak of such matters relating to Ancient Rome and the Middle Ages with such authority, all without knowing a word of Latin or, in the case of one of those two faculty members, knowing that Greek has its own alphabet.

Modern Greek μπαμπάς (Google translate: ‘dad’ but not ‘father’) is likely from Turkish baba, but I don’t think the word can be old in Turkish either: cf. Uzbek bor : Turkish var ‘there is’; and Turkish pazar from Persian bāzār would seem to point to a time when ba- was not allowed in Turkish. Colloquial Arabic has /bʶaːbʶaː/ where the /bʶ/ ought to also indicate a borrowing: Persian (possibly via Turkish) seems a likely source. (For what it’s worth Google Translate thinks the Persian for ‘dad’ is pedar, but ‘daddy’ is bābā.) Hindi bābā could be from Persian too, but Bengali bap could not, and would seem to go back to a form beginning *bapp-. As to why Bengali bap should resemble Persian bābā, it seems we have to go back to Trask and the baby-talk hypothesis.

Who knows if some of the papa ~ tata ~ ata words for ‘father’ in IE languages don’t go back to baby-speech attempts to approximate the “adult” word *pə₂ter with a reduplication (or something equally simple). It’s interesting, at any rate, that the stops /p/ and /t/ recur in them.

Very happy to read all your comments. Sorry that our 2013 paper was not accessible to readers, I’ll try to have it posted on our website (and will advertise it here).

A few words of answer to your many interesting reactions.

Papa/mama words do not comply (in most cases) with regular phonetic evolutions because of the phonetic limitations of babies, that remain the same when a change occurs in a given language.

For instance, when intervocalic -p- became voiced in Romance, “papa” did not become *paba because when babies utter their first words they are mostly limited to reduplication, so that when their parents started to say *paba they only could repeat “papa”. Parents then recognized the word they had heard since they were babies themselves (recall that we are at the start of the sound change) and reinforced it, so that papa remained the same in adult-talk-to-babies.

(Papa/mama words are adult-talk-to-babies, not baby talk; baby talk differs from child to child, e.g. my daughter at 19 months consistently said “manat” instead of “tomate”, for interesting phonetic reasons, but no other French child I know ever said “manat”).

As a result, French babies today learn to say “papa” to their father, just like Occitan, Catalan, and Spanish babies. No monolingual parents in these languages would pick “tata” or “dada” from the mouth of their child as a word meaning “daddy”.

So there is no real point in mentioning that Latin “pappa” should have become anything other than “papa” according to recognized phonetic rules. It has remained “papa” in most Romance languages because of babies. Romanian “tata” is either inherited from Latin, which had the word as well, or borrowed from the surrounding Slavic languages, which all have “tata”.

There are a lot of counter-examples, of course, e.g. Kurdish “bav” mentioned by a lister above. The very frequent exemption from regular evolution does not mean that no papa/mama word ever evolves.

But the fact remains that most Romance languages use “papa” and “mama”, while all Celtic languages use “tat/tad” and “mam”, all Slavic languages “tata” and “mama” (except Russian which uses “papa”, most probably borrowed from either German or French), etc.

They all are inherited within their language family. And most probably from much older times, judging by PIE *pa-ter and *ma-ter (themselves no baby talk but regular adult words), evidently derived from pre-existing papa/mama words, which was recognized by Indoeuropeanists since the 19th century. The same fidelity is found the world over in most language families.

On the topic of interesting phonetic reasons for baby words: I have a relative who, as a toddler, pronounced “squirrel” as “ferrol”. I think it’s really cool how she combined the fricative quality of [s] and the labial quality of [w] to produce [f].

You could just as well say that words for the cuckoo represent Proto-World inheritance because they tend (and have always tended) to sound like /kuku/ and often (seem to) disobey otherwise regular sound changes. But that’s precisely the reason why baby-talk, onomatopoeia and interjections should not be used as reliable evidence in historical reconstruction (unless they lose their expressive character and become “normal words”). They can be formed, re-formed and re-re-formed again at any time.

While the ‘mother’ word in PIE may well have been based originally on a “nursery root”, it is by no means agreed that the same is true of the formal ‘father’ word. One rather commonly encountered idea is that it’s a productively formed agent noun in *-ter-, derived from the verb root *peh₂- ‘protect’.

In Slavic, the formal word for ‘father’ is *otьcь, which consists of the nursery word *ata (possibly a distortion of *ph₂-ter-, otherwise lost in Balto-Slavic) plus the diminutive suffix *-iko-. As soon as this formation was lexicalised, it began to develop regularly. In Germanic, *attan- is a nasal stem, which immediately suggests that the geminate is not “expressive” but produced by Kluge’s Law: in the oblique form of the stem, pre-Gmc. *at-n- (also from the *ata baby word) became *att-, which spread analogically to the nominative and the acc. sg. The theory that such family terms are exceptions to regular sound change doesn’t hold water.

Of course it’s possible that nursery words have been transmitted over myriads of years, obeying their own rules, in an unbroken tradition back to the first time a baby babbled. Sort of like the way playground games and rhymes are transmitted out of sight and often out of reach of adult culture.

But it’s not really a falsifiable proposition, is it? Seems to me that the answer to Trask’s treatise boils down to ‘but the rules are different in this case’ — well, then it’s out of the realm of science.

“… that’s precisely the reason why baby-talk, onomatopoeia and interjections should not be used as reliable evidence in historical reconstruction (unless they lose their expressive character and become “normal words”). They can be formed, re-formed and re-re-formed again at any time.”

I entirely agree with you that papa/mama words should not be used in reconstruction, because they do not comply with phonetic evolution rules.

Nevertheless, they may offer another kind of reliable evidence in historical reconstruction. If all Slavic languages (a valid language group, warranted by innumerable common words, grammatical morphemes, and sound laws) have a word “tata”, while (nearly all) Romance languages have “papa”, all Sinitic languages have “pa”, and so on the world over, it makes clear that these words have been inherited and must descend from a common ancestral word in their respective family.

Otherwise, one should find a mix of all forms in most language families. This is not the case, and even some of the (not so rare) exceptions are in most cases traceable to likely or attested borrowings: Rumanian “tata” may have been borrowed from Bulgarian or another Slavic language, English “dad” may have been borrowed from a Brythonic language (which all have tat/tad forms), Greek “baba” was borrowed from Osmanli Turkish during the Ottoman rule over Greece, etc.

In theory, papa/mama words could be “formed, re-formed and re-re-formed again at any time”. But they are not.

“In Slavic, the formal word for ‘father’ is *otьcь, which consists of the nursery word *ata … plus the diminutive suffix *-iko-. As soon as this formation was lexicalised, it began to develop regularly. … The theory that such family terms are exceptions to regular sound change doesn’t hold water.”

You confuse lexicalisation with, so to speak, adultization. Papa/mama words as used in adult-to-babies-talk are words in their respective language. All French speakers know that “papa” and “maman” are the French words which are used by adults to speak of their parents to babies, and by babies to call their parents, and actually this use continues in most cases when children grow into adults. They refer to the same people as “père” and “mère” but differ from them in use.

The Slavic formal word *otьcь is not a papa/mama word. It is a father/mother word, a kind of word belonging to adult speech and as such submitted to regular phonetic evolution. Most “formal” father/mother terms (“formal” is somewhat misleading, in general the distinction is between appellatives and reference terms) are derived from papa/mama words, e.g. PIE *pa-ter and *ma-ter, which have evolved regularly in most of the dozens of IE languages having preserved them. But this specialization into “father/mother words” drove them out of the realm of adult-to-babies-talk, where papa/mama (appellative) words are preserved from most phonetic changes by the limited capacities of babies.

“The theory that such family terms are exceptions to regular sound change doesn’t hold water.” You’re absolutely right with father/mother words! But papa/mama words hold water and stand phonetic evolution over millennia.

Yes, it is. If papa/mama words were “formed, re-formed and re-re-formed” again and again, as Piotr said in the post just above yours, one should find a mix of “papa”, “tata”, “ada”, “tat”, “bab” forms within each language family.

Instead, what one finds is “tata” in all Slavic languages (save Russian “papa”, borrowed from French or German), “papa” in most Romance languages (save Rumanian “tata”, borrowed from Bulgarian or another neighboring Slavic language), “pa” in all Sinitic languages (including Archaic Chinese), etc.

This is the touchstone of their inheritance from an ancestral word in their respective family.

Oh, but you started out by saying ‘inherited from Proto-Sapiens’. If your claim is just that the forms are inherited within specific known language families, exempt from the usual phonetic developments, that seems falsifiable to me as well.

one should find a mix of “papa”, “tata”, “ada”, “tat”, “bab” forms within each language family.

And we do find something like this, even in individual languages. In English, we have father, dad, daddy, dada, pa, papa, pop, pops and maybe others. Some are borrowed, some are nursery-room distortions of the adult word. What about mama for ‘father’ in Kartvelian? Is it Proto-Sapiens too?

It’s very likely (if unprovable) that tens of thousands of years ago some languages then spoken had ma or mama for mother. This speculation can be supported by observations of recurrent human behaviour, but has nothing to do with reconstructing “Proto-Sapiens” (a construct I don’t accept anyway).

Not really (and it isn’t just Hindi). A few years ago, Ronald Kim connected it with related words in Iranian and argued in favour of reconstructing an Indo-Iranian paradigm like *ser-ih₂/*sr-jah₂- (levelled out in individual languages), hence Vedic (and Indo-Aryan) strī. There is also a related word (with the same feminine “motion” suffix) in Tocharian B (also discussed by Ron), and more distant relatives in Anatolian. The element *ser- (recognised as a femininity marker a long time ago) is fascinating in itself. It seems to represent an alternative PIE word for ‘woman, female’, mostly replaced by *gʷenh₂-, but reconstructible from scattered traces (mostly in compound words, like *swe-sor- ‘sister’, *t(r)i-sr-es (the feminine form of ‘3’ in Indo-Iranian and Celtic)). Ron reconstructs the free-standing word as *h₁os-r̥/*h₁es-r-, which is a bit on the speculative side; a plain *ser- would perhaps do just as well.

Not true. To take a couple of examples, in Sūzhōu and Shànghai, mom is “姆妈” /m.mɑ/ and dad is “爹爹” /tiɑ.tiɑ/; in Táng-period texts, the colloquial terms are “阿孃/娘娘” *ʌ.ɳiɔw̃ / ɳiɔw̃.ɳiɔw̃ for moms, and “阿爺/爺爺” *ʌ.ja / *ja.ja as well as “哥哥” *qʌ.qʌ for dads; one dialectal term which recently became (in)famous is “大大” dàda, because Xí was affectionately called 习大大 Xí Dàda “Dad Xí”, a moniker which infuriates many, who consider it 认贼作父 “taking a bandit as one’s own father”.

[one should find a mix of “papa”, “tata”, “ada”, “tat”, “bab” forms within each language family.]

“And we do find something like this, even in individual languages. In English, we have father, dad, daddy, dada, pa, papa, pop, pops and maybe others. Some are borrowed, some are nursery-room distortions of the adult word.”

“Father” does not count here. It is not a papa/mama word, even if it was forged on a pre-PIE root *pa- (itself certainly a papa/mama word) several millennia ago. It is a father/mother word, a word essentially used to refer to anyone’s parent (while papa/mama words are essentially used

“Daddy” and “dada” are derived from “dad” (likely of Celtic origin). “Pa”, “pop” and “(?)pops” are derived from “papa” (likely a Proto-Germanic word, judging by Dutch, German, Swedish, Danish and Norwegian papa).

But consider a child babbling “baba” or “atata” with English monolingual parents. What they will do is either ignore these sounds, or, if phonetically broad-minded enough, recognize “papa” and “dad”, respectively, and reinforce them with the correct English sounds. In all cases, the result will be the preservation of the English words “papa” and “dad” (and their diminutive variants).

This does not amount to an amorphous mix of forms, as should be expected if babies freely forged parental terms.

2. Regarding your (certainly rhetorical) question “What about mama for ‘father’ in Kartvelian? Is it Proto-Sapiens too?”, my answer is “Most probably”. You mention yourself that the word is Kartvelian, not Georgian or Mingrelian or any Kartvelian dialect. Actually similar forms are found in all Kartvelian languages. It means this word has lasted for some 3,000 years. In Proto-Kartvelian its meaning had changed (other cases are known), but this rare event did not happen again in Kartvelian languages in the last 3 millennia.

3. Regarding your last comment:

“This speculation can be supported by observations of recurrent human behaviour, but has nothing to do with reconstructing “Proto-Sapiens” (a construct I don’t accept anyway).”

I do not claim to “reconstruct Proto-Sapiens”, just to identify words that must have been parts of the vocabulary of our earliest ancestors. “Papa”, “mama” and “kaka” (not reconstructions!) are such words, given (i) their high inheritability in most language families and (ii) their presence in many language families worldwide.

You certainly are right that “observations of recurrent human behaviour”, e.g. all human children babble “papapa, bababa, tatata, dadada, mamama, nanana” when starting with articulate speech (and most of the time starting their first words with papa, mama, dad, baba, amma, according to their parents’ language) certainly are powerful arguments supporting the idea that Proto-Sapiens must have used papa/mama words.

It seems that everyone here is in agreement that all of the following are possible:
1) mama/papa words may be created de novo;
2) they may be transmitted genetically;
3) they may be transmitted genetically, but resisting some sound changes which would obscure their iconic value;
4) they may be transmitted through contact.

The question is which is typically more significant. Trask, I presume, would emphasize possibility (1). Bancel, I presume, would emphasize possibility (2). Sometimes, as shown in some examples in this thread, (2), (4) and perhaps (3) may be discerned through formal correspondences. But sometimes you really can’t tell. If two neighboring isolate languages both have the word [ˈmama], how can you tell if one borrowed the word from the other, if one or the other created the word afresh 1000 or 10000 years ago, or if they both go back to some common ancestral language?

Yep. English computer = Dutch computer = German Computer = Danish computer, so the word must be Proto-Germanic.

True, Icelandic has tölva and Swedish has dator. But they must be loans from an unknown pre-Germanic substrate, since computer is doubtless a PIE word — cf. Lithuanian kompjuteris, Polish komputer, Albanian kompjuteri, Italian computer, Irish coimpiutair, etc. Sound laws be damned. If they don’t have to apply to mamas and papas, maybe there are other semantic fields exempt from them.

You mention yourself that the word is Kartvelian, not Georgian or Mingrelian or any Kartvelian dialect. Actually similar forms are found in all Kartvelian languages. It means this word has lasted for some 3,000 years.

It has — in Georgian. It has become muma in Megrelian (mua- in compounds), mū in Svan, and has been replaced in Laz. Quite a lot of change in just 3000 years, considering that you claim its form did not change at all between “Proto-Sapiens” and Proto-Kartvelian. How long was that, in your opinion? How many times longer than the time-depth of Kartvelian?

In Proto-Kartvelian its meaning had changed…

How do you know it had a different meaning before? (Note: try to avoid circular reasoning.) And how does the meaning ‘mother’ change into ‘father’? Can you offer a plausible scenario?

I do not claim to “reconstruct Proto-Sapiens”, just to identify words that must have been parts of the vocabulary of our earliest ancestors.

Well, may have been, just because the inventory of infant babbling is limited and the same simple combinations of sounds are likely to be co-opted by adults as nursery terms. Plus, there are psycho-physiological reasons why e.g. mama is applied more frequently to mothers than to fathers (same reason why we call mammals mammals). There’s nothing obligatory about using any particular piece of onomatopoeia, baby talk or interjections like “ah” and “hey” in any individual language. They are common cross-linguistically — that’s all.

The same bird is called cuckoo in English and kukułka in Polish, but these words are not particularly ancient and do NOT go back to a common ancestor (other than the bird’s call). The Proto-Germanic ‘cuckoo’ word was *gaukaz, and the Proto-Slavic one was *žegъza.

By the way, what do you understand by Proto-Sapiens? Are you seriously claiming that there was a time when all humans spoke one language? (BTW, “our earliest ancestors” were some primitive life forms four billion years ago, but I don’t think you are talking about them.)

I don’t, of course, speak for Pierre in any way. But most people who use the terms Proto-World or Proto-Sapiens use it to refer to the most recent common ancestor of all living languages other than pidgin, creole, and sign languages (see Wikipedia). That there once existed such a language is not currently provable, but not disprovable either. Nor does the use of the term imply a former state of affairs in which all then-living humans spoke the same language.

Pierre, I’m afraid that your claim that Romanian TATĂ must/may be borrowed from a Slavic language is groundless. The Romanian word and Dalmatian /twota/ or /tuta/ are both unproblematic (i.e. phonologically regular, as can be seen in Dalmatian /vetruna/ or /vetrwona/ (meaning old, feminine singular) from Latin VETERANA, stressed on the antepenultimate) reflexes of a Latin word TATTA, itself attested.

A childhood friend of mine (now living in California) has a daughter who calls him Dadu. I don’t know if he or his wife might have encouraged the “error” but everyone finds it cute. I don’t know what it would take for such an innovation to spread more widely though.

But most people who use the terms Proto-World or Proto-Sapiens use it to refer to the most recent common ancestor of all living languages other than pidgin, creole, and sign languages (see Wikipedia).

In that case it definitely isn’t the language of “our earliest ancestors”. And whether Proto-Sapiens thus defined makes sense depends on the applicability of the tree model at great time depths (Proto-Sapiens could scarcely be younger than the main migration out of Africa). In fact, Proto-Sapiens is a taxonomic concept dependent on the reconstruction of language genealogies. It’s where all our reconstructions should coalesce. You can in principle reconstruct the histories of individual words rather than languages, and claim that there are Proto-Sapiens words even if the Proto-Sapiens language is not definable, but it’s a bit disappointing that we end up with rounding up a few candidates like *mama, *papa, *kaka and pipi plus a lot of ‘splainin’ to do (see above, re Kartvelian *mama: why did it change its meaning? why has it changed phonetically quite a lot in most Kartvelian languages but not the tiniest bit for tens of thousands of years before Proto-Kartvelian?). If Proto-Sapiens is the MRCA of the extant languages, it should have been a fully fledged and normally complex language, like any spoken today. Saying that it very possibly had some nursery words like mama is not terribly insightful.

Piotr (sorry for other listers, too many things to answer with him, but some of your points are indirectly answered here anyway),

NB: [my original comment, if any, appears between square brackets] and is followed by “Piotr’s objection between parentheses”, which is followed by > my answer introduced by a right-oriented angle.

1. [*pa- (itself certainly a papa/mama word)]

“Certainly — ‘coz you say so? There is no PIE *a in this word, so your certainty is based only on the initial consonant. How do you explain the rest of the word?”

> No, Piotr, not “‘coz I say so”, nor based “only on the initial consonant”. It is also based on the parallel with PIE *ma-ter ‘mother’. This was observed already in the 19th century by Indo-Europeanists (and reiterated by Jakobson in his 1960 paper on papa/mama words). The suffix *-ter is also found in two other kinship terms, namely *bhra-ter ‘brother’ and *dhuga-ter ‘daughter’.

Plus, the idea that “there is no PIE *a” in *pH2ter (with a laryngeal) is only a hypothesis, not a fact. In the face of the parallelism between *pa-ter and *ma-ter (plus the same suffix in *bhra-ter and *dhuga-ter), it is a more economic hypothesis to suppose that the laryngeal in *pH2ter.

“Yep. English computer = Dutch computer = German Computer = Danish computer, so the word must be Proto-Germanic.”

> Nope. Words like “computer” are not attested anywhere before the mid-20th century. Other loanwords may be older, like “kangaroo”, but there are clear extralinguistic reasons to exclude it from Proto-Germanic or Proto-Romance. Actually “papa” might be a loanword in all the Germanic languages it is found in, but do you really think it is the most economic hypothesis?

3. [Actually similar forms are found in all Kartvelian languages. It means this word has lasted for some 3,000 years.]

“It has — in Georgian. It has become muma in Megrelian (mua- in compounds), mū in Svan, and has been replaced in Laz. Quite a lot of change in just 3000 years, considering that you claim its form did not change at all between “Proto-Sapiens” and Proto-Kartvelian.”

> Likewise, Proto-Germanic *papa ‘dad’ changed to papi in Faeroese, with a hypocoristic suffix -i (cp. German Vati), then to pabbi in Icelandic. Papa/mama words in all language families may be treated as father/mother words, and thus evolve, change meaning and even disappear, but this is not the common rule.

And Germanic words papa do not result from incessant changes, otherwise one should find Faeroese **ata’, Swedish **bab, German **ted, Dutch **pa, Danish **dada – or a similar mix – all meaning ‘dad’.

4. “Sound laws be damned. If they don’t have to apply to mamas and papas, maybe there are other semantic fields exempt from them.”

> You wrote yourself in a preceding post that papa/mama words are irregular:
“… that’s precisely the reason why baby-talk, onomatopoeia and interjections should not be used as reliable evidence in historical reconstruction (unless they lose their expressive character and become “normal words”).”

Sound laws aren’t damned. Yes, papa/mama words very often escape regular sound changes. They sometimes undergo them. They may be suffixed, enter the plain adult lexicon and then evolve quite regularly, as did PIE *pater and *mater. But as long as they are used with babies, they remain protected from most changes by the babies’ limited phonetic ability.

5. “By the way, what do you understand by Proto-Sapiens? Are you seriously claiming that there was a time when all humans spoke one language? (BTW, “our earliest ancestors” were some primitive life forms four billion years ago, but I don’t think you are talking about them.)”

> By Proto-Sapiens, I mean the hypothetical language ancestral to all known languages, which may have been spoken around 100,000 years BP (±50,000 years), when Homo sapiens left their African home continent and began to spread over the world. Probably other languages were spoken at this time by other Sapiens groups, or even by other humans (e.g. Neandertals or Denisovans).

And, yes, you’re right, “our earliest ancestors” was a very wrong phrasing. I meant the Sapiens speaking the (hypothetical, just like PIE) “most recent common ancestor of all known languages”. Before them, language certainly had already a very long history. It happens to be the case that papa/mama words give us a key to understand how it may have begun.

Actually “papa” might be a loanword in all the Germanic languages it is found in, but do you really think it is the most economic hypothesis?

Borrowing from Romance is surely more economic, given the very late attestation of papa as a synonym of dad (as for the meaning ‘pope’ — Old English pāpa was obviously a loan from church Latin). English is an excellently documented language, yet the earliest attestation of papa is from the late 17th century (ditto for German, I believe), and it was used almost exclusively in upper-class families till about a hundred years later.

Plus, the idea that “there is no PIE *a” in *pH2ter (with a laryngeal) is only a hypothesis, not a fact. In the face of the parallelism between *pa-ter and *ma-ter (plus the same suffix in *bhra-ter and *dhuga-ter), it is a more economic hypothesis to suppose that the laryngeal in *pH2ter.

Every reconstruction is “only a hypothesis”. Your *pa- is also a hypothesis. So we have two competing hypotheses: *ph₂ter- vs. *pater-. They are both compatible with, say, Old English fæder-, Latin pater, Greek πατήρ, Old Irish athair, etc., but only one of them (guess which one) is compatible with Vedic pitar- and Older Avestan ptar-. Thus facts tell us that *ph₂- works where *pa- fails, and so the latter must be abandoned. Anyway, these kinship terms are nowhere near as parallel as you claim them to be. The historical reflexes of the ‘father’ word have a short root vowel (or even zero), while ‘mother’ and ‘brother’ consistently show a long vowel. In *dʰugh₂ter- we have additional evidence of a laryngeal (beside the branch-specific reflexes of *h₂ in this position): it causes the aspiration of the preceding stop in Indo-Aryan.

My views (which are not provable, but just ordinary scientific views):

1) Proto-World existed, in the sense that there was a common ancestor of all the living (non-creole, non-pidgin, non-sign) languages. It was not the only language of its day.

2) It is not possible to say anything about it by reconstruction alone, the tool being too weak to reach back to the remote period when it was spoken.

3) It might be possible to say something about it by contrasting it with known languages spoken in the past but not related to any living language, but the difficulty is to find such languages, since we have only negative criteria of unrelatedness.

4) Most estimates of the date of Proto-World are overestimates, because of a general tendency to underestimate the rate of language change, and therefore to overestimate the time elapsed between (say) English and its earliest reconstructible ancestor.

With your last post (October 14, 2015 at 3:12 am, answering to John Cowan) we are arriving on some relatively common ground, it seems to me.

“You can in principle reconstruct the histories of individual words rather than languages, and claim that there are Proto-Sapiens words even if the Proto-Sapiens language is not definable, but it’s a bit disappointing that we end up with rounding up a few candidates like *mama, *papa, *kaka and pipi …”

> It’s not unimportant to remark that a few words have survived over some 100,000 years. Actually there are more words than those you quote, and not only kinship appellatives or onomatopoeic words.

“… plus a lot of ‘splainin’ to do (see above, re Kartvelian *mama: why did it change its meaning? why has it changed phonetically quite a lot in most Kartvelian languages but not the tiniest bit for tens of thousands of years before Proto-Kartvelian?).”

“If Proto-Sapiens is the MRCA of the extant languages, it should have been a fully fledged and normally complex language, like any spoken today.”

> Why should the MRCA of extant languages “have been a fully fledged and normally complex language”? For instance, it may have been on the verge of acquiring the syntactic articulation (which obviously evolved after the phonetic one). Finding out how syntax was acquired is one of the reasons why it may be so important to fumble around in this remote past.

I remain agnostic as for the reality of Proto-World defined in terms of common ancestry. Let’s imagine it’s possible to trace back the genealogy of every morpheme, and that every word used today has a chain of ancestors extending indefinitely into the past. There’s so much horizontal diffusion caused by language contact that those genealogies will coalesce at different times and in different languages. I simply don’t know if at the time scales we are talking about there’s enough tree structure left (as opposed to a tangled network of genealogies) to make a sizeable proportion of those genealogies converge close to one another. Rather than a single ancestral language, there might have been a number of languages (not even spoken at the same time) which are collectively the source of all the modern linguistic material. A dispersed MRCA, shall I say.

When trying to date the putative MRCA, we have to take into account the history of human populations and their migrations. Australia has been almost completely cut off for quite a long time, and it somehow strains credibility to suggest that, for example, the descendants of an MRCA spoken somewhere in Asia were brought back to Africa and replaced all the languages spoken there previously, all across the continent. That would require ancient migrations on a massive scale, not supported by any genetic data I’ve heard of. Without assuming anything about rates of language change, it’s quite safe to estimate that if anything like Proto-World really existed, it was located somewhere in Africa and should be dated earlier than ca. 60 thousand years ago.

I have little doubt that spoken language is much older than that, and that any hypothetical MRCA was itself the product of a long evolutionary process (in both biological and cultural terms), not much different, typologically, from languages spoken today. Certainly not like infant babbling.

According to the Tower of Babel etymological database (run by the Russian Nostraticist George Starostin), the Chinese character 爸 is attested in Archaic Chinese and its pronunciation is reconstructed as baʔ.

Look at http://starling.rinet.ru/cgi-bin/response.cgi?root=config&morpho=0&basename=\data\china\bigchina&first=1&off=&text_character=%E7%88%B8&method_character=substring&ic_character=on&text_reading=&method_reading=substring&ic_reading=on&text_ochn=&method_ochn=substring&ic_ochn=on&text_cchn=&method_cchn=substring&ic_cchn=on&text_wchn=&method_wchn=substring&ic_wchn=on&text_echn=&method_echn=substring&ic_echn=on&text_epchn=&method_epchn=substring&ic_epchn=on&text_mpchn=&method_mpchn=substring&ic_mpchn=on&text_lpchn=&method_lpchn=substring&ic_lpchn=on&text_mchn=&method_mchn=substring&ic_mchn=on&text_fanqie=&method_fanqie=substring&ic_fanqie=on&text_rhyme=&method_rhyme=substring&ic_rhyme=on&text_meaning=&method_meaning=substring&ic_meaning=on&text_oshanin=&method_oshanin=substring&ic_oshanin=on&text_shuowen=&method_shuowen=substring&ic_shuowen=on&text_comment=&method_comment=substring&ic_comment=on&text_karlgren=&method_karlgren=substring&ic_karlgren=on&text_go=&method_go=substring&ic_go=on&text_kanon=&method_kanon=substring&ic_kanon=on&text_jap=&method_jap=substring&ic_jap=on&text_viet=&method_viet=substring&ic_viet=on&text_jianchuan=&method_jianchuan=substring&ic_jianchuan=on&text_dali=&method_dali=substring&ic_dali=on&text_bijiang=&method_bijiang=substring&ic_bijiang=on&text_shijing=&method_shijing=substring&ic_shijing=on&text_any=&method_any=substring&sort=character&ic_any=on

Sorry for this ultralong URL, but if you copy it into the navigation box of your Firefox window it should work.

Why should the MRCA of extant languages “have been a fully fledged and normally complex language”?

Because of the principle of uniformitarianism. There is no reason to suppose that the MRCA was in any way an unusual language or unlike those spoken today (other than pidgins and sign languages). Similarly, there is no reason to suppose that Mitochondrial Eve (the common ancestor of all living human beings through female lines exclusively) was in any way unusual. We can say that she had at least two children (not an unusual characteristic): if she had had no children, she could not have been ME, whereas if she had had exactly one child, that daughter would be ME.

I am not sure if you are saying that PW probably did not exist, or that it is not reconstructible. I believe that it did exist but is probably not reconstructible. I concede that, as you say, it’s possible that it might not have existed, but your evidence for that seems to consist of lack of reconstructibility, which is no evidence at all.

[I]t somehow strains credibility to suggest that, for example, the descendants of an MRCA spoken somewhere in Asia were brought back to Africa and replaced all the languages spoken there previously, all across the continent. That would require ancient migrations on a massive scale, not supported by any genetic data I’ve heard of.

There would be no such genetic data if the palaeo-Africans had died out along with their languages during a bottleneck event in the human species. In that case, resettlement in fairly recent times of Africa from Asia (or even conceivably Australia, though that seems unlikely) would be quite possible.

@Pierre: The mainstream word for “father” indeed started with b- in Old Chinese. (*baʔ regularly gives fù in Mandarin, the first morpheme in the polite word fùqin for fathers.)

It was then replaced by a j- form before Táng, then by a t- form somewhere around the time of the Mongols (The Old Khitan, though, still had yé 爺). Most Northern Chinese dialects today still use t- words, although the p- words, felt as more urban, are steadily gaining ground.

The new p- forms arose in Beijing in the late 18th century. Guō Xī (2004) counted occurrences of diē and bà in novels and plays:

Similarly, for English, I think someone did dig out some specific evidence for the Frenchy, courtly, snobbish nature of papa in the 19th century. Otherwise OED wouldn’t feel confident enough to state so.

I concede that, as you say, it’s possible that it might not have existed, but your evidence for that seems to consist of lack of reconstructibility, which is no evidence at all.

I mean something slightly different. Neither languages nor their historical lineages are well-bounded entities. Given enough time, a lineage may disperse enough to lose its identity. I tried to explain it here:

To give you a biological analogy, our mtDNA lineages coalesce in “mtEve”, and male Y chromosome DNA coalesces in “Y-Adam”. These two fragments of our total genome form two different family trees without a common root. There may be some geographical and chronological correlation between them (since the father and the mother often belong to roughly the same community), but the farther back you go, the less correlation you can expect.

There would be no such genetic data if the palaeo-Africans had died out along with their languages during a bottleneck event in the human species.

But the basalmost haplogroups for both mtDNA (L0) and the Y chromosome (A00 and several other “A” lineages branching off successively) are exclusively African, and the corresponding splits are dated well before 100,000 years ago. This is at odds with a scenario of extinction and recent resettlement. It seems the palaeo-Africans are still there.

These two fragments of our total genome form two different family trees without a common root.

That seems unlikely to me. These two persons mtEve and Y-Adam necessarily have a MRCA, like any other two persons in the Homo sapiens lineage. This person necessarily lived longer ago than either, since it is an ancestor.

That seems unlikely to me. These two persons mtEve and Y-Adam necessarily have a MRCA, like any other two persons in the Homo sapiens lineage. This person necessarily lived longer ago than either, since it is an ancestor

They have “a” MRCA, but only a tiny fraction of either genome comes from that particular ancestor. The rest comes from a multitude of other, not-so-recent common ancestors. I used mitochondrial and Y-chromosome DNA in my analogy, since they at least have rather neat family trees, in which the oldest branchings don’t predate the beginnings of Homo sapiens. But many human polymorphisms in the rest of the genome can be expected to predate the sapiens/neanderthalensis split; some lineages even coalesce before the human/chimp split. Of course only one among the host of common ancestors any two humans share is “most recent” but apart from this trivial and accidental property there’s nothing else to make him/her special.

One could also argue that ultimately our mitochondrial and nuclear genomes also have a common ancestor. But in this case the MRCA = the LUCA (the last universal common ancestor). Biologists can at least be reasonably sure that such a universal common ancestor existed. For language, there is no such certainty.

I agree with much of Piotr’s views expressed in his post dated October 14, 2015 at 11:05 am:

“When trying to date the putative MRCA, we have to take into account the history of human populations and their migrations. Australia has been almost completely cut off for quite a long time, and it somehow strains credibility to suggest that, for example, the descendants of an MRCA spoken somewhere in Asia were brought back to Africa and replaced all the languages spoken there previously, all across the continent. That would require ancient migrations on a massive scale, not supported by any genetic data I’ve heard of. Without assuming anything about rates of language change, it’s quite safe to estimate that if anything like Proto-World really existed, it was located somewhere in Africa and should be dated earlier than ca. 60 thousand years ago.”

> I agree. Note that horizontal diffusion may even have concerned other human (sub)species with which Sapiens have interbred, like Neandertals and Denisovans (several publications in Nature by a Harvard population genetics team). Successful interbreeding (not simple sexual intercourse) seems to me to imply that these other humans had some articulate speech abilities, otherwise the resulting hybrids would have been severely impaired in the competition for reproduction and their genes wouldn’t have survived until today. As a consequence, one cannot even rule out the possibility that some Neandertal or Denisovan words still continue to live in some languages or language families.

“I have little doubt that spoken language is much older than that, and that any hypothetical MRCA was itself the product of a long evolutionary process (in both biological and cultural terms) …, ”

> I entirely agree.

“… not much different, typologically, from languages spoken today. …”

> Well, the principle of uniformitarianism that John invoked earlier may or may not apply perfectly here. The long “biological and cultural evolutionary process” leading to articulate language was perhaps not entirely accomplished in MRCA’s time. For instance, MRCA may have been in the process of acquiring syntactic articulation (the evolution of the phonetic one having obviously begun long before – should I explain why I think so?), so that all extant languages have finished developing it in their own way. May this question possibly be(come) a subject of study? I have an idea about it.

“Certainly not like infant babbling.”

> Well, certainly not like infant babbling, although babies certainly already babbled, as it is a crucial phase of acquisition of articulate speech for them. And there were also certainly babbling-like appellatives, which also are a crucial tool for babies to acquire symbolic representation, and are not known to be lacking in any language.

All that needs a lot of thinking and discussing. But discovering how some already highly intelligent apes entered the universe of articulate speech may be worth the while, isn’t it?

Sorry for having neglected your posts until now, I had too much to do with Piotr’s.

“So how will you interpret the b -> j -> t -> p innovation chain for father words in Chinese?”

According to the etymological data in the Tower of Babel databases “Chinese Characters” and “Chinese dialects” (http://starling.rinet.ru/cgi-bin/main.cgi?flags=eygtnnl), there must have been a continuous use of ba/pa forms in most Chinese dialects since Archaic Chinese times. I am absolutely not a specialist of Chinese, but these data (both historical and modern) seem to me compelling.

Arguing for an innovation in such or such dialect may be interesting for Sinologists, of course. Most probably these innovations would appear as interdialectal loanwords. But such details do not change the overall picture of a general transmission of ba/pa words in most Chinese dialects over 3,000 years.

Starling’s “Chinese dialects” database is not a database of Chinese dialect words, but merely of Chinese dialect readings of Classical/Modern Written Chinese. You need other databases in order to argue for the existence of p- forms in Chinese dialects, let alone to claim something like “all Sinitic languages have ‘pa’”. (Isn’t Northern Mandarin a/some Sinitic language(s)?)

For the past 20 years or so I have been doing research in some languages of Western North America, especially the “Penutian” group. From vocabulary lists in this and other groups of the area I have extracted a number of words for MOTHER-MOM and FATHER-DAD which show that there are quite a few other words or stems besides m- and p- initial ones. Thanks to John Cowan the list can be seen here:

Oh yes, you are absolutely right, I never should have claimed that “all Sinitic languages have ‘pa’.” Just like what is found in other language groups, only part of the Sinitic languages may retain Archaic Chinese ‘baʔ’. But it seems to me that, as ideograms are not phoneticized, dialectal readings of ideograms do correspond to a word in the respective dialects?

Or is the picture of dialectal words for ‘dad’ completely chaotic, with roughly equal proportions of ‘pa’, ‘ba’, ‘ta’, ‘da’, plus some ‘ma’ and ‘na’? This would beat my position for Sinitic, I concede.

@Pierre: The history of all the Sinitic languages cannot be known for sure. However, there is something about which we do know something: Northern Chinese, something eminently documented across millennia. The b -> j -> t -> p cycle I have mentioned concerns this knowable portion of Chinese.

ideograms are not phoneticized, dialectal readings of ideograms do correspond to a word in the respective dialects?
Chinese characters don’t encode ideas but mostly morphemes or etyma, so different readings of a character in different dialects correspond to each other in a neogrammarian way.

Or is the picture of dialectal words for ‘dad’ completely chaotic, with roughly equal proportions of ‘pa’, ‘ba’, ‘ta’, ‘da’, plus some ‘ma’ and ‘na’? This would beat my position for Sinitic, I concede.
Well, in Mandarin dialects, you mostly have t-. The p- form is two hundred years old in Beijing and 20th century in other places.

In Southern China, you do have some more genuine p-’s, for example Southern Min pē, which may or may not directly come from Old Chinese *baʔ. But now your position is in somewhat worse form, among the 1.2 billion Sinophones, 900 million Mandarin and 100 million Wu speakers mostly have t- forms, leaving mostly 15% of the population retaining the ancient p-.

I don’t know what the phonetic value of j is here, but I can’t believe that this represents a phonological series. It does not make sense from a phonetic point of view. Perhaps the words attested from different periods also represent different dialects, or even fashions?

j here is yod, & these words are definitely not part of a phonological series (though the b & p words may or may not be related). The j & t words are etymologically unrelated words for “dad”. And minus273 is absolutely right that Chinese characters generally encode specific morphemes, definitely not ideas, such that a character that writes a b word would never be used for one starting with t, for example.

Thanks, Matt A! I was definitely not being clear. What I meant is that, in the general language of the North,

– The original *b- word was replaced by an unrelated j- word before the 7th century.
– The j- word was then replaced by t- forms around the 13th century. The t- forms are current in most Mandarin and Wú dialects.
– Recently (18th century), a new p- form, unlikely to be related to the original *b- one, appeared in Beijing. Due to the preeminence of Beijing as the standard language, from the 20th century on, it has made great advances in Mandarin dialects.

– Historical Southern Sinitic (a paraphyletic group, i.e. non-Northernish Sinitic) is not attested. In current Southern Sinitic, p- forms are eminently attested and show quite good correspondences to each other. They might be related to the original *baʔ.

However, this becomes almost a detail with regard to your earlier post dated October 19, 2015 at 5:17 pm

“@Pierre: …

[Or is the picture of dialectal words for ‘dad’ completely chaotic, with roughly equal proportions of ‘pa’, ‘ba’, ‘ta’, ‘da’, plus some ‘ma’ and ‘na’? This would beat my position for Sinitic, I concede.]

Well, in Mandarin dialects, you mostly have t-. The p- form is two hundred years old in Beijing and 20th century in other places.

In Southern China, you do have some more genuine p-’s, for example Southern Min pē, which may or may not directly come from Old Chinese *baʔ. But now your position is in somewhat worse form, among the 1.2 billion Sinophones, 900 million Mandarin and 100 million Wu speakers mostly have t- forms, leaving mostly 15% of the population retaining the ancient p-.”

> There must be one or several misunderstanding(s) of my position.

First of all, the number of speakers has nothing to do with historical linguistics. Three billion speakers of Indo-European languages do not make this language family the oldest one. Nor does a sparsely populated subgroup like Albanian (5.4 million speakers, according to Wikipedia) count less, in the linguistic history of IE, than the enormous Indo-Aryan subgroup (1.5 billion). And the extinct Anatolian group (0 speakers, save a half-dozen Hittitologists), if Sturtevant and others are right with the Indo-Hittite hypothesis, counts as much as all the 3 billion speakers of living IE languages.

More generally, I do not deny that papa/mama words may evolve. It all depends on whether they are treated as papa/mama words stricto sensu or as father/mother words (themselves not always ‘formal,’ as Trask says, but often referential, while papa/mama words stricto sensu are appellatives).

Actually, many papa/mama words (phonetically speaking) are ambivalent, i.e. they are or may be used as father/mother words (with a merely referential value). This double value explains why they may shift to use as purely referential terms, e.g. the descendants of PIE *pater and *mater, in this case through the addition of a suffix to earlier *pa and *ma (and, @Piotr, the irregular shift of Pre-PIE *pa to PIE *pH2, unless PIE indeed had *pater and Indo-Aryan later evolved irregularly, such irregularities in vowel evolution are sold by the dozen in all etymological dictionaries, generally by way of “alternate forms”, but well, we’re losing sight of our present point, dear). Having become purely referential (or nearly purely, e.g. “_Father_ MacKenzie, will you write the words of a sermon …?”), they undergo all sound changes quite normally. But there are always appellatives for fathers and mothers, in general your own, and in many societies these appellatives are used towards all respected elders.

At the same time, there are also always babies, and with their limited phonetic abilities they block most (not all) sound changes because these evolutions would put the words out of their phonetic reach. Babies continue to say baba even if it is in the process of becoming bava, and so on. Sometimes appellatives seem to change in such regular and uneasy-for-babies ways, but it’s not such a frequent event.

However, those same appellatives may and do change in other ways than regular sound change, mostly within the range of the phonetic abilities of babies, e.g. by suffixation, like English dadd-y, German Vat-i (Vater returning to the realm of appellatives, though probably not for babies), or Faeroese papí (@Piotr, are you sure that German, English, Danish, Swedish and Faeroese all borrowed independently papa from French, then Faeroese sold it with the suffix to Icelandic to make pabbi? Those Faeroese.) But very often the root word continues to be apparent, because of babies – or at least I would guess so, I wasn’t there myself.

As a result, when you take all the appellatives in a language group, you observe a range of variation of forms within the group, but there is always (or nearly always) one or two form(s) much more widely represented than others. Were it in only one or two groups out of the numerous language families out there, it could be considered statistical noise, but repeated again and again (like Germanic papa, I reasonably maintain, probably Sinitic ba, as well as many, many others), the only reasonable hypothesis is inheritance in the respective families. Of course, in the absence of the warranty of phonetic reconstruction for these forms (as they do not often comply with regular sound changes), it may be statistical noise in a handful of cases, so for each form in each family there remains a degree of uncertainty regarding its real antiquity, but taken collectively most of these family-dominant forms must be ancient.

You can pronounce a Chinese character in any Chinese language or dialect (and usually in Japanese, Korean, & Vietnamese as well), but that has no bearing on whether the word written with that character is used in a given variety.

That character (爸, pronounced bà in Standard Mandarin) didn’t exist at the time of Old Chinese and was created much later to write various non-standard dialect forms for “dad”. Eventually it was chosen to write the form that arose in Beijing speech that minus273 mentioned.

Any literate speaker of any Chinese language can read a text written in Mandarin, though they might pronounce it with the sounds of their own language, as given in that chart. This is true even if they speak northern varieties that lack a p- word for father, or if they speak a southern variety which uses a p- word, though one that might have a different final & be written with a different character.

The dialect words that 爸 was created to write may well have been related to the Old Chinese b- word, though this is not certain. The fact that the same character was later used to write the Beijing p- word in no way suggests any etymological connection with earlier words written with that character (though it doesn’t rule one out either). It was just chosen to write the Beijing word because it had the right meaning and approximately the right pronunciation (the tone of the word used in Beijing doesn’t match that of the earliest dialect words written with that character, but it does match that of at least one other form written with the character, and the phonological match was close enough).

Piotr, are you sure that German, English, Danish, Swedish and Faeroese all borrowed independently papa from French, then Faeroese sold it with the suffix to Icelandic to make pabbi? Those Faeroese.

I wouldn’t be at all surprised to find that papa was borrowed from French into this whole list of languages. In German, Papa with stress on the last syllable certainly used to be aristocratic (perhaps it still is); nowadays it’s stressed on the first syllable.

The suffix is the usual nickname suffix that is all over Germanic and Hungarian today. I’m sure it was applied independently again and again; Mami and Papi aren’t rare in German.

the irregular shift of Pre-PIE *pa to PIE *pH2,

So now you’re saying that, even though only *ph₂- can be reconstructed, **pa- must have been present before it based on no evidence?

unless PIE indeed had *pater and Indo-Aryan later evolved irregularly, such irregularities in vowel evolution are sold by the dozen in all etymological dictionaries, generally by way of “alternate forms”

Papa and mama may well have been borrowed from French after all in whatever Germanic language you may wish. My point is not that papa/mama words are never borrowed. Quite the contrary. Actually, in the papers I referred to in a previous post, I underlined that English dad and mom have probably been borrowed from Celtic tad and mam, Greek baba from Osmanli Turkish, itself probably from Persian, Romanian tata from Slavic (unless it is inherited from Latin tatta, or both – I mean, the apparent survival of Lat. tatta in Romanian, unique in the whole Romance family, may have been secured by the presence of many bilinguals who retained tata because it was Slavic as well).

What I contend is that, in other cases, papa/mama words are (mostly) directly inherited and not innovated by parents seizing any syllable sequence from their babies’ babbling to forge a new word.

There is no trace in French of ata, tata, dad, aba or baba for ‘father’ – papa is a French word inherited from Latin pappa, as appears from the (rather incomplete but already telling) Französisches etymologisches Wörterbuch, with dozens of papa words – actually, its author deals mostly with descendants of Latin pappus ‘grandad’, because the descendants of Latin pappa do not comply with sound laws.

There is no trace in English of ata, tata, aba or baba for ‘father’.

There isn’t any trace in Modern Greek of their old word pappa (Homeric), then papa (Hellenistic) since they borrowed baba from the Turks during the several centuries of Ottoman domination over their country. Oh, yes, there was one in Pontic Greek, the (vanishing) dialect of Greeks established around the Black Sea for two or three millennia (already the Trojan War was between Greek speakers). In the Pontic language, it has remained papa until today for the few remaining speakers – not baba, nor dada, nor ata, nor pabi.

This radical lack of variation (variations like dad ~ daddy or papa ~ popa ~ pops ~ pa are just hypocoristic variants, not a change in the root consonant) by itself implies preservation.

Yes, Russian papa most likely was borrowed from French and/or German. There are many such examples in many language families worldwide.

Of course some of the many likely borrowings may have been an innovation; no linguist was there to see the borrowed word spread around in a given community. But it is really striking that most of the papa/mama words that diverge from those usually found in the language family concerned have a good borrowing explanation, e.g. English dad (not tata, nor ata, nor baba, nor etc.), an isolated form in Germanic (save in a few northern Dutch dialects), which happens to emerge in the Germanic language having settled in an island inhabited by Brythonic Celtic speakers whose languages all have since their earliest known attestations tat or tad forms, sometimes even turned into dad by the Celtic initial consonant alternation, e.g. in a Breton dialect da dad ‘thy dad’.

Sorry for the delay, I thought I had sent this already. I put together a short list of mother/mama and father/papa pairs in a number of languages, especially “Penutian” ones. You will see that m- and p- initial words are present in some languages, but a number of other consonants also occur.

I had trouble with how to post it here without destroying the formatting, and with the help of LH and JC the list is now available here.

The words on this list are excerpted from one of my Penutian vocabulary files, which I try to update as I go. It comprises mostly Penutian items, with a few from neighbouring languages for comparison and checking for potential borrowings.
