Altaic Languages

Summary and Keywords

“Altaic” is a common term applied by linguists to a number of language families, spread across Central Asia and the Far East and sharing a large, most likely non-coincidental, number of structural and morphemic similarities. At the onset of Altaic studies, these similarities were ascribed to the one-time existence of an ancestral language, “Proto-Altaic,” from which all these families are descended; circumstantial evidence and glottochronological calculations tentatively date this language to some time around the 6th–7th millennium bc, and suggest Southern Siberia or adjacent territories (hence the name “Altaic”) as the original homeland of its speakers. However, since the mid-20th century the dominant view in historical linguistics has shifted to that of an “Altaic Sprachbund” (diffusion area), implying that the families in question have not sprung from a common source, but rather have acquired their similarities over a long period of mutual linguistic contact. The bulk of “Altaic” has traditionally included such uncontroversial families as Turkic, Mongolic, and Manchu-Tungusic; additionally, Japanese (Japonic) and Korean are also frequently seen as potential members of the larger Altaic family (all five branches together are sometimes referred to as “Macro-Altaic”).

The nature of the relationship between the various units that constitute “Altaic,” a question sometimes referred to as “the Altaic controversy,” has been one of the most hotly debated topics in 20th-century historical linguistics and a major focal point of studies dealing with the prehistory of Central and East Eurasia. Supporters of “Proto-Altaic,” commonly known as “(pro-)Altaicists,” claim that only divergence from an original common ancestor can account for the observed regular phonetic correspondences and other structural similarities, whereas “anti-Altaicists,” without denying the existence of such similarities, insist that they do not belong to the “core” layers of the respective languages and are therefore better explained as results of lexical borrowing and other forms of areal linguistic contact.

As a rule, “pro-Altaicists” claim that “Proto-Altaic” is as reconstructible by means of the classic comparative method as any uncontroversial linguistic family; in support of this view, they have produced several attempts to assemble large bodies of etymological evidence for the hypothesis, backed by systems of regular phonetic correspondences between compared languages. All of these, however, have been heavily criticized by “anti-Altaicists” for lack of methodological rigor, implausibility of proposed phonetic and/or semantic changes, and confusion of recent borrowings with items allegedly inherited from a common ancestor. Despite the validity of many of these objections, it remains unclear whether they are sufficient to completely discredit the hypothesis of a genetic connection between the various branches of “Altaic,” which continues to be actively supported by a small, but stable scholarly minority.

Although there is no single narrow definition of “Altaic” in linguistics, in a broad sense all specialists generally accept it as a blanket term, covering a certain number of language families that are spread across Central Asia and the Far East and share a large, hardly coincidental number of structural and morphemic similarities. Depending on their interpretation of these similarities, linguists speak of either an “Altaic language family” (or “macrofamily”), or of an “Altaic Sprachbund” (diffusion area). The first of these hypotheses surmises the existence of an ancestral language (“Proto-Altaic”), spoken in prehistoric times and eventually giving rise to much of the modern linguistic diversity of the Asian continent. The second, conversely, presumes that such an ancestral language never existed, and that the similarities between modern “Altaic” languages are due to extended periods of linguistic convergence, over which individual words and typological features were traded back and forth between formerly unrelated languages.

The linguistic term “Altaic” is generally assumed to have been introduced in the 1840s by Matthias Alexander Castrén (Georg, Michalove, Manaster Ramer, & Sidwell, 1999: 74), one of the first scholars to apply relatively strict linguistic criteria of comparison to large agglomerations of Siberian and Far Eastern languages, previously grouped together on superficial (sometimes just geographical) grounds into such vague taxa as “Tatar” (Johann von Strahlenberg in the 18th century), “Scythian” (Rask, 1834), or “Turanian” (Müller, 1855). In the understanding of Castrén and several other scholars, “Altaic” included such widespread families as Finno-Ugric, Samoyed, Turkic, Mongolian, and Manchu-Tungusic; this larger unit was later renamed “Ural-Altaic,” while “Altaic” was confined to the narrower “core” grouping of Turkic, Mongolian, and Manchu-Tungusic. The term itself implied the existence of an ancestral language, whose original homeland was provisionally identified by early scholars as the region of the Altai Mountains in Central Asia.

Despite the comparatively early appearance of the hypothesis itself, the field of “Altaic linguistics” as a subdiscipline of historical linguistics was not firmly established until the early 20th century, by which time the groundwork for it was laid down both methodologically (with the rise of the Neogrammarian school of historical linguistics) and substantively, with the accumulation of descriptive data on the relevant languages as well as significant research on the comparative grammars and vocabularies of Turkic and Mongolian. The original model of the Altaic theory came into being through the pioneering works of Gustav J. Ramstedt, Nicholas Poppe, Boris Vladimirtsov, Pentti Aalto, and several other scholars; their reconstruction of Proto-Altaic usually focused on establishing regular phonetic correspondences and lexical etymology, but could also include proposals on certain aspects of Altaic morphology, as well as speculations on further connections between Altaic and other language families, such as Uralic, Indo-European, and Dravidian.

The Altaic theory in its “Ramstedt-Poppe” version was seen as the dominant thread in Altaic linguistics until approximately the middle of the 20th century, when it came under heavy criticism from a new generation of scholars, wishing to apply more rigorous criteria to the verification of the hypothesis, particularly in the light of new advances in the fields of areal linguistics and Central Asian philology. Propelled by the works of Gerard Clauson, Gerhard Doerfer, and several other specialists, the opposite viewpoint—namely, that the so-called “Altaic” languages do not satisfy the conditions that are usually demanded from genetically related languages—gradually replaced the idea of “Proto-Altaic” as the mainstream stance on the issue, whereby the general research focus largely shifted from such issues as phonetic correspondences and lexical cognates between the various branches of “Altaic” to the issue of detecting lexical borrowings and typological structural parallels that may have arisen over several millennia of language contact.

Nevertheless, the idea of “Altaic” as a genetic unity whose ancestral state can still be partially reconstructed by properly applying the classic historical-comparative method was far from dead, and the late 20th and early 21st century have seen some major revisions of the old “Ramstedt-Poppe” model. Of these revisions, arguably the most well-known and the most thorough has been that of the so-called “Moscow School” of historical linguistics; taking into consideration the numerous criticisms of Clauson, Doerfer, and others, Russian scholars have offered more complex solutions to some of the old problems, which require carefully filtering between secondary (areal) similarities and true historical cognates. This newly improved methodology was employed to produce the single largest corpus of Altaic etymologies since the early days of Ramstedt and Poppe (Etymological Dictionary of the Altaic Languages [EDAL] by Sergei Starostin, Anna Dybo, and Oleg Mudrak). Despite this, general reception of the dictionary was largely skeptical, since, in the opinion of many specialists, the new model was beset with even more methodological problems and errors of judgment than the old one. A detailed attempt by Martine Robbeets to filter out the unconvincing etymologies of EDAL and reduce the superfluous evidence to a core set of several hundred phonetically and semantically robust etymologies (Robbeets, 2005) has been positively evaluated by some scholars (e.g., Bernard Comrie), but did little to shatter anti-Altaicist incredulity; today, the understanding of “Altaic” as a purely areal term continues to be more widespread than its genetic interpretation (although in 2010 Johanson & Robbeets, in order to avoid confusion, proposed a new term, “Transeurasian,” to refer to the same language families without surmising a necessary genetic relationship between any of them).

The “Altaic debate” between the relative minority of “pro-Altaicists” and the majority of “anti-Altaicists” has become one of the most controversial issues of modern historical linguistics, since its resolution has significant implications for the understanding of the prehistory of some of the world’s most widely spoken and popular languages. Different positions on the problem of Altaic may also reflect different views on the general methodology of comparative linguistics, especially when applied to language families whose genetic and/or areal relations go back so deep in time that conventional methods can no longer be employed without substantial reservations; it may be argued that the entire field of macro-comparative linguistics, to an extent, depends on future progress in the resolution of the Altaic controversy to provide the necessary theoretical and methodological pointers.

2. Internal Structure of Altaic

In the original understanding of M. A. Castrén and several other scholars, “Altaic” was a huge linguistic grouping, spread across most of the Asian continent and including at least five different branches: Finno-Ugric, Samoyed, Turkic, Mongolian, and Manchu-Tungusic. This grouping was soon renamed “Ural-Altaic,” implying that Finno-Ugric and Samoyed (Uralic) were much more closely related to each other than to any of the three other branches, and by the early 20th century, only the “Altaic” core of Turkic, Mongolian, and Manchu-Tungusic was seen as a proper working hypothesis, with scholars like Ramstedt and Poppe dedicating their efforts to establishing regular correspondences between these three families while remaining largely agnostic about the idea of a larger “Ural-Altaic.”

Already in the 19th century some scholars argued (largely on typological rather than truly historical grounds) that such language isolates (in reality, small language groups or “macro-languages”) as Korean and Japanese might also be related to the rest of “Altaic.” The Korean connection, in particular, was explored in depth by Gustav Ramstedt (1949–1953), who first tried to link Korean to Turkic, Mongolian, and Manchu-Tungusic through regular phonetic correspondences and etymological cognate sets. The possible connection with Japanese was first explored in detail in the works of Roy Andrew Miller (1971) and later supported by Sergei Starostin (1991) on the basis of lexicostatistical and etymological argumentation.

On the whole, there are relatively few points of general consensus between past and present supporters of the Altaic hypothesis on the constituency and especially on the internal classification of this family. Any model of Altaic must necessarily involve a “core” of Turkic, Mongolian, and Manchu-Tungusic (“Micro-Altaic,” “Narrow Altaic,” “core Altaic,” or simply “Altaic” as such). Expanded models that add Korean and, less frequently, Japanese are sometimes called “Macro-Altaic” to avoid confusion, although the Moscow school (Sergei Starostin, Anna Dybo, etc.) rarely uses this term, preferring to apply the original nomination of “Altaic” in all cases.

Attempts at including other potential constituents within “Altaic” have been rather few. The Uralic family is today regarded as either completely unrelated to Altaic or related to it on a higher level (“Ural-Altaic” or “Nostratic”). In an old classification model by John Street (1962), Japanese and Korean are seen as a separate “sister” family of Altaic proper, to which Street also assigned the isolated Ainu language; however, the relationship with Ainu was not further supported by any significant proponents of “Macro-Altaic.”

To some extent, Street’s model would later be adopted by Joseph Greenberg (2000), who, in his survey of deep-level genetic relations between the languages of Eurasia, declared “Micro-Altaic” (Turkic/Mongolian/Manchu-Tungusic) and Japanese-Korean-Ainu as two distantly related branches of a large “Eurasiatic” macrofamily. It should be noted that the Altaic hypothesis is frequently discussed and evaluated in larger contexts, such as Greenberg’s “Eurasiatic,” or the similar, but earlier and more substantial “Nostratic” hypothesis of Vladislav Illich-Svitych and Aharon Dolgopolsky (see, e.g., Dolgopolsky, 1998), according to which Altaic itself forms a single branch of a much larger taxon that also includes Indo-European, Uralic, Dravidian, Kartvelian, and Afro-Asiatic. Although “Nostratic”/“Eurasiatic” is an even more controversial hypothesis than “Altaic,” some linguists actually regard it as more promising, e.g. Sergei Yakhontov, who has argued, based on a uniform comparison of lexical and grammatical evidence, that the various branches of “Micro-Altaic” should all be included into “Nostratic” independently of each other.

Arguably the most thoroughly elaborated (though not necessarily the most popular) model of “Altaic” in existence today is that of the Moscow school of comparative linguistics, as put together by Sergei Starostin in the late 1980s and finalized in EDAL by Starostin, Anna Dybo, and Oleg Mudrak. This model includes all five potential branches (Turkic, Mongolian, Manchu-Tungusic, Korean, and Japanese) and, on the basis of lexicostatistical and glottochronological calculations, offers the following scenario:

(1) the original split of Proto-Altaic is dated to approximately the end of the 6th millennium bc, with an initial separation into a “Western” (Turco-Mongolian) and an “Eastern” branch (Koreo-Japanese). The intermediate position of Manchu-Tungusic remains unclear, since the numbers of distinctive isoglosses that it shares with both the Western and the Eastern branches are comparable;

(2) the bifurcation of both Turco-Mongolian and Koreo-Japanese must have taken place somewhere around the 4th millennium bc. This means that all five constituents of Altaic already formed separate linguistic entities as early as six thousand years ago, making the hypothetical protolanguage very difficult to reconstruct as compared to such younger (and commonly accepted) linguistic families of Eurasia as Indo-European, Uralic, or Dravidian.

Some linguists that generally reject the Altaic hypothesis remain sympathetic towards parts of the hypothesis; for instance, J. Marshall Unger (1990) has published on “Macro-Tungusic,” a hypothetical Far Eastern family consisting of Korean, Japanese, and Manchu-Tungusic, while at the same time refusing to admit a further connection with Turkic and Mongolian. However, such opinions are quite rare next to the vast majority of scholars who either support “Altaic” as a whole (or at least the “Micro-Altaic” tripartite model of Turkic, Mongolian, and Manchu-Tungusic) or completely reject the idea that a genetic link between any two of its potential branches has been proven or may be proven with definitive evidence.

3. General Characteristics of “Proto-Altaic”

Most proponents of the Altaic hypothesis generally agree that Proto-Altaic, if it ever existed, must have disintegrated at a relatively early date, i.e. that Proto-Altaic is significantly older than the majority of non-controversial protolanguages, such as Proto-Indo-European or Proto-Uralic; depending on methodology and/or personal intuition, this date of disintegration may be pushed back anywhere from 6000 bc to 8000 bc or even earlier. At the same time, rigorous reconstruction of Proto-Altaic is significantly hindered by lack of early documented sources (the earliest written monuments for “Macro-Altaic” are Turkic inscriptions from the 8th century and Old Japanese texts from around the same time) and by the fact that most of the branches of Altaic themselves began to disintegrate not earlier than 2,000 years ago—implying a temporal gap of no less than 6,000 years between the reconstructed Proto-Turkic, Proto-Mongolian, Proto-Manchu-Tungusic, etc., and their hypothetical common ancestor.

As a direct result of this situation, most Altaicists, past or present, significantly differ in their opinions on both the structural properties and the phonemic, morphemic, and lexical inventory of Proto-Altaic. Outside of a very small core of “fundamental” Altaic etymologies, most of them proposed already in the pioneering studies of Ramstedt and Poppe, and an equally small list of generally trivial phonetic correspondences between directly matching phonetic segments (such as Proto-Altaic *k = Proto-Turkic *k, Proto-Mongolian *k, Proto-Manchu-Tungusic *k, etc.), there are few elements of Proto-Altaic reconstruction over which a complete consensus would exist in “pro-Altaic” circles: evidence adduced by one researcher is easily criticized by another due to such methodological issues as insufficient rigor in establishing regular phonetic correspondences or questionable semantic matching.

Arguably the best-known, “classic” model of Altaic is still the one represented in the works of G. Ramstedt and N. Poppe: the early-to-mid-20th century idea of Altaic, as it was envisioned by these two scholars, was supported by their unquestionable authority as major specialists in the languages of Central and East Asia. Later, that model was most significantly revised in the works of Soviet/Russian scholars, beginning with V. M. Illich-Svitych in the 1960s and ending with a group of linguists representing the “Moscow school” of historical linguistics under the leadership of Sergei Starostin; the phonology and lexicology of Proto-Altaic as presented in their Etymological Dictionary of the Altaic Languages (2003) represents such a radical departure from the old Ramstedt/Poppe model that the two should be discussed back-to-back for a better comparative perspective.

4. Comparative Altaic Phonology

As a rule, the phonological systems of present-day Altaic languages are not very complicated: according to the World Atlas of Language Structures, their consonantal inventories range from “average” to “moderately small,” while vowel quality inventories tend to be either “average” or “large,” the latter being a direct consequence of complex rules of vowel harmony that operate in most continental Altaic languages and produce additional sets of vocalic sounds which frequently obtain phonemic status. The phonological proto-system that Ramstedt and Poppe had suggested for their common ancestor also reflected these typological features.

Regarding consonants, Poppe (1965), largely following Ramstedt, reconstructed a typologically plausible system of 18 original consonants that consisted of four voiceless stops (*p, *t, *č, *k), their voiced (*b, *d, *ǯ, *g) and nasal (*m, *n, *ɲ, *ŋ) correlates, one fricative (*s), one palatal glide (*j), and two pairs of resonants (*l1, *l2, *r1, *r2). Some of the most important sound shifts that were assumed to have taken place between the disintegration of Proto-Altaic and the beginning of disintegration of its three major branches included:

—the merger of Proto-Altaic *l1 and *l2 → *l, *r1, *r2 → *r in Mongolian, Manchu-Tungusic, and Chuvash (the earliest divergent branch of Common Turkic) as opposed to the preservation of their phonemic identity in Narrow Turkic (without Chuvash), where the first element of each pair is preserved as a resonant (*l, *r), while the second becomes a post-alveolar fricative (*š, *ž).

The last of these hypotheses has become a major point of controversy in itself, being directly related to the long-running dispute in comparative Turkology between supporters of the “rhotacism” hypothesis (which supports the reconstruction of *š, *ž on the Proto-Turkic level, with subsequent development to *l, *r in Chuvash) and the “zetacism” hypothesis (which seeks to reconstruct these phonemes as an additional pair of resonants, e.g. *ly, *ry, preserved as resonants in Chuvash, but shifting to fricative articulation in the rest of Turkic).

Analysis of the Turkic data in the light of the Altaic theory seems to favor the “zetacism” hypothesis: cf. the difference between such forms as Mongolian *doluɣa-, Manchu-Tungusic *dala-, Turkic *jālga- (Chuvash śula-) “to lick,” reflecting Proto-Altaic *l1, and Mongolian *dal-da “hidden,” Manchu-Tungusic *dali- “to cover, hide,” Turkic *jaly- (or *jaš-) “to close,” reflecting Proto-Altaic *l2. There are, however, some well-known examples of “zetacists” with a strong anti-Altaic stance, e.g. Gerhard Doerfer (1984), who points out that the resolution of this debate has no direct bearing on the Altaic hypothesis; among other things, the increase of phonetic similarity between the reconstructed Turkic and Mongolian forms can also be interpreted as an argument in favor of mutual borrowing between these two families.
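The resonant correspondences assumed in the Ramstedt-Poppe model can be restated as a small lookup table. The following Python sketch is only an illustration: the names `POPPE_CONSONANTS`, `MERGING_BRANCHES`, `NARROW_TURKIC`, and `reflex` are hypothetical, the subscripted resonants are written `l1`, `l2`, `r1`, `r2`, and the table encodes nothing beyond the inventory and the mergers described above.

```python
# Poppe's (1965) Proto-Altaic consonant inventory, grouped as in the text.
POPPE_CONSONANTS = {
    "voiceless stops": ["p", "t", "č", "k"],
    "voiced stops": ["b", "d", "ǯ", "g"],
    "nasals": ["m", "n", "ɲ", "ŋ"],
    "fricative": ["s"],
    "palatal glide": ["j"],
    "resonants": ["l1", "l2", "r1", "r2"],
}

# Branches in which each pair of resonants is said to merge into one phoneme.
MERGING_BRANCHES = {"Mongolian", "Manchu-Tungusic", "Chuvash"}

# Narrow Turkic (Turkic without Chuvash) keeps the members of each pair apart.
NARROW_TURKIC = {"l1": "l", "l2": "š", "r1": "r", "r2": "ž"}


def reflex(proto_resonant, branch):
    """Return the reflex of a Proto-Altaic resonant in the given branch."""
    if proto_resonant not in NARROW_TURKIC:
        raise ValueError(f"not a Proto-Altaic resonant: {proto_resonant}")
    if branch in MERGING_BRANCHES:
        # l1 and l2 both yield l; r1 and r2 both yield r.
        return proto_resonant[0]
    if branch == "Narrow Turkic":
        return NARROW_TURKIC[proto_resonant]
    raise ValueError(f"branch not covered by this sketch: {branch}")


# Total number of consonants in the inventory.
inventory_size = sum(len(group) for group in POPPE_CONSONANTS.values())

for branch in ("Mongolian", "Chuvash", "Narrow Turkic"):
    print(branch, {r: reflex(r, branch) for r in NARROW_TURKIC})
```

The table makes the asymmetry visible: only Narrow Turkic distinguishes the two members of each pair, which is why the interpretation of its fricatives is so important for the debate.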

In later works, this reconstruction has been altered in numerous details. The most significant change concerned the system of stops: following upon an original idea by V. M. Illich-Svitych, the “Moscow school” has argued in favor of a triple, rather than binary, opposition in the stop system, reconstructing Proto-Altaic *p, *ph, *b; *t, *th, *d, etc. This was largely due to a revision of the traditional reconstruction of Proto-Turkic: as a rule, researchers would not acknowledge the archaic nature of the phonological opposition between word-initial voiced and voiceless consonants, despite such an opposition actually existing in some of the attested Turkic languages (Oghuz branch and some others). Although Illich-Svitych’s proposal to reconstruct *d- and *g- as separate Proto-Turkic phonemes remains controversial among Turkologists, it has been generally endorsed by Altaicists, who claim that it is well supported by external (Mongolian and Manchu-Tungusic) data.

The vocalic system of Proto-Altaic was reconstructed by Poppe as consisting of nine short vowels and their long correlates, best preserved in Proto-Turkic and heavily simplified in Mongolian and Manchu-Tungusic. Assuming that progressive vowel harmony, by which vowels of the second syllable partially assimilate to those of the first one, operated already on the level of Proto-Altaic, Poppe did not specifically concern himself with reconstructing vocalic oppositions in the second syllable, even though many nominal and verbal stems in Mongolian and Manchu-Tungusic have a bisyllabic structure.

Poppe’s vocalic inventory was to some extent preserved in the “Moscow School” version of Altaic, although some of the vowels were reinterpreted as diphthongs: the resulting system consists of five basic vowels (*a, *e, *i, *o, *u) and three diphthongs (*i̭a, *i̭o, *i̭u). The most important difference is that most Altaic roots in this version of the hypothesis are reconstructed as bisyllabic CVCV-type stems where the second V can be any of the five basic vowels, but not a diphthong, thus *aga “rain,” *biju “to be,” *dilo “year,” etc. The complexity of vocalic correspondences between different branches is explained by the assumption that vocalism of the first syllable is regularly “colored” by the vocalism of the second one during the transition from Proto-Altaic to its daughter languages; however, in a radical departure from previous models of analysis, systematic vowel harmony as such is not projected onto Proto-Altaic itself. Rather, it is assumed that vowel harmony arose in Proto-Turkic, Proto-Mongolian, etc., independently as part of a global areal tendency that operated in the “Altaic region.”

An attempt to reconstruct certain prosodic features of Proto-Altaic was first undertaken by researchers of the “Moscow school,” primarily Sergei Starostin, who, in his investigation of the Macro-Altaic hypothesis, suggested a complex system of regular correspondences between vowel quality in Proto-Turkic and Proto-Manchu-Tungusic, on one hand, and pitch accent (tones) in Korean and Japanese, on the other hand. Based on this research, the authors of EDAL reconstruct Proto-Altaic as a language with pitch accent, where each syllable could be characterized by either a high or a low tone, and phonologically relevant vowel length. However, this is the only aspect of Proto-Altaic reconstruction which crucially depends on the acceptance of “Macro-Altaic” rather than the traditional “Micro-Altaic” hypothesis, and is therefore viewed with suspicion by the majority of Altaicists outside the “Moscow school.”

5. Comparative Altaic Grammar

The main emphasis in Altaic studies has usually been on comparative phonetics and lexicon rather than on grammar. Anti-Altaicists usually criticize this as a major flaw of the theory (see below), but from a pro-Altaicist perspective, the two most likely reasons for this are the comparatively old “age” assumed for Proto-Altaic, hindering the reconstruction of paradigmatic evidence; and the fact that Altaic languages are typically agglutinative, allowing for long strings of productive grammatical suffixes that rarely fuse with the root and are on the whole relatively unstable in historical terms—many grammatical morphemes in modern Altaic languages can be reliably traced back to formerly autonomous auxiliary words that must have undergone grammaticalization at relatively late dates.

Nevertheless, some research has been carried out over the years on possible connections between grammatical morphemes that are reconstructible for Proto-Turkic, Proto-Mongolian, etc. and are not easily traced back to independent words, suggesting the possibility of being inherited directly from a Common Altaic state. For instance, following up on elements of research by such Altaicists as Nicholas Poppe, Karl Menges, and Murayama Sichiro, the authors of EDAL summarize earlier proposals and add their own ones on several such connections in the area of nominal declension, e.g. a possible plural suffix *-th- (→ Turkic *-t, Mongolian -d, Manchu-Tungusic *-ta/n/~ *-te/n/, Japanese *ta-ti, Korean -tɨ-r), a possible genitive marker in *-ɲV (→ Old Turkic -ŋ, Mongolian *-n, Manchu-Tungusic *-ŋi, Old Japanese -no, Middle Korean -ɲ), a dative/locative marker in *-du ~ *-da (→ Old Turkic locative/ablative ‑ta ~ -da ~ -te ~ -de, Mongolian dative/locative -da, Manchu-Tungusic dative *-du, locative *-dā) and several other short morphemes that could have functioned as postpositions or case endings in Proto-Altaic.

For verbal affixes, no such list has been proposed in EDAL, but this situation has been largely remedied in a string of publications by Martine Robbeets, culminating in a summarizing monograph (Robbeets, 2015) in which the author claims to have identified no fewer than 19 common auxiliary verbal morphemes of Altaic (“Transeurasian”) heritage, relatively well preserved in daughter branches and showing signs of an original paradigmatic organization. Among these are the negative auxiliary verb *ana-, the denominal verb suffix *-lA-, the causative marker *-ti-, the reflexive-anticausative marker *-pU-, the fientive marker *-dA-, and several other morphemes denoting aspect, valency, and derivation.

One of the strongest arguments in favor of Altaic has been generally recognized as the extreme similarity of the pronominal systems in its main branches (although there have also been attempts on the part of anti-Altaicists to suggest alternate explanations through borrowing and diffusion): cf. Proto-Turkic *bẹ (genitive stem *me-n) “I” = Proto-Mongolian *bi (genitive *mi-n) “I” = Proto-Manchu-Tungusic *bi “I”; Proto-Turkic *sẹ (genitive stem *se-n) “you (sg.)” = Proto-Mongolian *či (genitive *či-n) “you” = Proto-Manchu-Tungusic *si “you.” In EDAL, the corresponding protoforms are reconstructed as Proto-Altaic *bi “I” and *si “you.” Other pronominal stems suggested for Proto-Altaic in the same source on the basis of evidence from at least three branches are the demonstrative stems *ko “this,” *tha “that,” *i and *o “this, that” and the interrogative stems *kha “who,” *ŋ(i̭)V “what.”

No special attempts at reconstructing any of the aspects of Proto-Altaic syntax have been undertaken so far, although from a purely typological perspective, it may be suggested that Proto-Altaic had Subject-Object-Verb word order, since this is the pattern that is most commonly featured in modern and historically attested Altaic languages.

6. Basic and Cultural Lexicon of Altaic

According to lexicostatistical calculations performed by “Moscow school” Altaicists, all subbranches of “Macro-Altaic,” as represented by their reconstructed proto-languages, share at least 20% common etymological matches on the so-called Swadesh list (100 items pertaining to the most stable sphere of the basic lexicon). Although the Altaic hypothesis has been sometimes criticized for failing to produce enough material that would be reconstructible for the basic lexicon layer of Proto-Altaic (see next section), already the “classic” works of Ramstedt, Poppe, and others contain enough comparanda to populate a representative list, and the etymological corpus has been significantly enlarged in EDAL.
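To give a rough sense of how a share of matches on a 100-item list translates into a time depth, the sketch below applies the classic Swadesh-Lees glottochronological formula, t = ln c / (2 ln r), where c is the proportion of shared items and r is the retention rate per millennium. This is an assumption made purely for illustration: the Moscow school relies on its own modified procedure, which is not reproduced here, so the result should not be expected to match the dates cited in this article. The retention rate of 0.86 is the conventional value for the 100-item list, and the function name is hypothetical.

```python
import math


def classic_divergence_time(shared, retention=0.86):
    """Estimate millennia since a split with the classic Swadesh-Lees formula.

    shared    -- proportion of list items still shared by both languages (0..1]
    retention -- assumed share of items one language keeps per millennium
    """
    if not 0 < shared <= 1:
        raise ValueError("shared must lie in the interval (0, 1]")
    if not 0 < retention < 1:
        raise ValueError("retention must lie strictly between 0 and 1")
    # Both languages lose vocabulary independently, hence the factor of 2.
    return math.log(shared) / (2 * math.log(retention))


# 20% shared matches, the lower bound quoted in the text.
t = classic_divergence_time(0.20)
print(f"{t:.1f} millennia")  # prints "5.3 millennia"
```

Because the estimate depends heavily on the assumed retention rate and on which matches are accepted as true cognates, even small changes in either input shift the resulting date by many centuries.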

Some examples of basic lexical items that the authors of EDAL reconstruct for Proto-Altaic with evidence from at least three out of five branches include: (a) body parts: *ni̭ā “eye,” *khi̭oŋa “nose,” *khila “hair,” *mōjno “neck,” *palga “foot,” *čhajǯV “breast,” *phejɲe “bone”; (b) natural objects: *phi̭olo “star,” *mi̭ūri “water,” *ti̭ōĺi “stone,” *khāpha “tree bark,” *li̭apha “leaf”; (c) adjectives, including color terms: *khi̭obarV “dry,” *zejna “new,” *si̭ājri “white,” *karu “black,” *puli “red”; (d) verbs: *ǯē “to eat,” *ŋēni “to go,” *ŋūju “to sleep,” *deka “to burn.” There is also a rather large number of binary basic isoglosses between Turkic and Mongolian, on one hand, and between Korean and Japanese, on the other, leading the authors to suggest that these pairs may have shared intermediate common ancestors (Proto-Turko-Mongolian and Proto-Japanese-Korean) that introduced lexical innovations of their own before disintegrating further.

The authors of EDAL go as far as to reconstruct a complete system of numerals from 1 to 10 for Proto-Altaic (*bi̭uri “1,” *ti̭ubu “2,” *ŋi̭u “3,” *tōj- “4,” *thu “5,” *ɲu “6,” *nadi “7,” *ǯa “8,” *khegVnV “9,” *či̭obe “10”), although the last three items are represented exclusively by matches between Manchu-Tungusic and Japanese and may be considered later areal isoglosses.

A huge number of “culture-specific” etymologies have also been amassed in the field of comparative Altaic lexicology. According to an analytical study by Anna Dybo (2013), the Proto-Altaic landscape was represented by numerous terms denoting various types of “hills,” “low mountains,” “slopes,” “rocks,” “ravines,” “valleys,” “rivers,” “sandbanks,” “currents,” “islets,” etc.; it is much more difficult to suggest credible matches for terms having to do with desert or seaside landscapes. Flora and fauna terminology are, along with various not particularly diagnostic terms, represented by such words as “cedar,” “pine tree,” “ash tree,” “juniper,” “deer,” “elk,” “badger,” “sable,” “hare,” “wolverine,” all of which could indicate a possible homeland somewhere in Southeast Siberia (in a taiga-like environment).

The lexical corpus reconstructed in EDAL contains a large number of terms that refer to hunting and fishing activities (“to hunt,” “to lie in ambush,” various terms for dogs, bows, arrows, etc.), which would be consistent with the supposed taiga habitat. More controversial are terms that could be semantically interpreted as referring to elements of agriculture and herding, such as “to cultivate/earth,” “grain,” several terms denoting cereals, “to graze,” “cow,” “small cattle,” “horse,” and even “harness” and “bridle”: they seem poorly compatible with the hypothesis of a Proto-Altaic unity around the 6th millennium bc. From a pro-Altaicist perspective, some of these matches could be explained either as traces of areal “Wanderwörter” that spread across the area at a later date and were subsequently confused with inherited terms, or as terms that independently acquired new “cultural” meanings in daughter languages after the disintegration of Proto-Altaic (for instance, names used for wild cereals could be carried over to their domesticated equivalents).

7. Criticism of the Altaic Hypothesis

Despite the best efforts of supporters of the Altaic hypothesis, ever since the first significant increase of “anti-Altaicist” publications in the middle of the 20th century it has steadily remained out of favor with “mainstream” comparative-historical linguistics. One important reason for this is that the methodology employed by Altaicists often finds itself at odds with methods favored by “narrow” specialists, working on the history of any particular hypothetical member of “Altaic” (Turkologists, Mongolists, Koreanists, etc.). For instance, a Turkologist working exclusively on Turkic languages will offer a solution that will seem best compatible with the corpus of Turkic data, whereas an Altaicist is likely to offer an alternate solution that may not be optimal in an exclusively Turkic context, but will seem more reasonable when viewing Turkic as a descendant of Proto-Altaic; a transparent example of such a conflict has been described above, in the section on Comparative Altaic Phonology (“zetacism” vs. “rhotacism”).

The principal objections against Altaic have been summarized in numerous publications, the most important of them voiced by such notable scholars as G. Clauson, G. Doerfer, D. Sinor, A. Vovin, S. Georg, and several others. However, since they apply in different degrees to different models of Altaic, it is usually necessary to indicate which particular model of “Altaic” (the original 18th–19th-century studies, the “classic” Ramstedt/Poppe version of the theory, the “Moscow School” model, or something else) suffers the most from these criticisms, which may themselves be grouped into six types:

(1) Criticism on typological grounds. Since languages from different branches of “Altaic” share numerous typological similarities on many levels (word order, agglutinative morphology, root and stem structure, vowel harmony, etc.), such similarities in themselves, especially in the early days of the “Ural-Altaic” hypothesis, seemed to suggest the possibility that these languages might have descended from a single common ancestor, from which they had inherited all these features. However, modern historical linguistics explicitly rejects hypotheses of genetic relationship based on typological grounds, since it has been established that, given even a relatively small time interval, typological features can easily spread across unrelated languages in a state of intense linguistic contact.

On the other hand, it should be noted that neither the Ramstedt/Poppe model nor any later variation on the Altaic theory relies on typological considerations as evidence in favor of Altaic (the “Moscow School” model even goes as far as to reject vowel harmony in Proto-Altaic altogether, claiming that it must have arisen independently after the original split), relying instead on comparisons of actual lexical and grammatical morphemes. The idea that “Altaic” is still being promoted and defended on typological grounds today is a relatively common misunderstanding, widespread among non-specialists rather than serious “anti-Altaicists.”

(2) Criticism on basic lexicon grounds. Going back to the works of G. Clauson, this argument against Altaic states that, despite the seeming wealth of etymological comparanda, very few convincing parallels are found in the lexical layer that is particularly important for proving genetic relationship, namely, the basic lexicon. As pointed out by Clauson, basic lexicon parallels found in the “Ramstedt/Poppe model” of Altaic are not only few in number on the whole, but also tend to be weakened by irregular phonetic correspondences. Numerous similarities between, e.g., Turkic and Mongolian languages in their “cultural” layers, contrasted with the relative paucity and poor quality of such comparanda as body part terms, basic landscape terms, and other items on the Swadesh list, should be interpreted as a result of areal convergence rather than of descent from a common ancestor.

This line of criticism was directly tackled by linguists of the “Moscow school,” where lexicostatistical quantification of evidence is viewed as exceptionally important for convincingly demonstrating genetic relationship. Once the system of phonetic correspondences was revised and a new etymological corpus was created, the authors of EDAL claimed to have shown that, contrary to the objections of Clauson and other “anti-Altaicists,” the percentage of basic lexicon matches between the five branches of “Macro-Altaic” was actually quite high (cf. some examples of reconstructions listed above). However, these new matches were, in their turn, criticized by a new generation of “anti-Altaicists” for much the same reasons as the old ones (primarily the uncertainty of phonetic correspondences used to justify the etymologies).
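The kind of lexicostatistical quantification referred to here can be illustrated with a minimal sketch. It assumes that, for each basic-lexicon concept, every branch has already been assigned a cognate-class label by the analyst; the concept identifiers, the labels, and the function names below are hypothetical placeholders, not data or procedures taken from EDAL.

```python
from itertools import combinations

# Invented cognate judgments for illustration only (NOT taken from EDAL).
# Each concept maps every branch to an arbitrary cognate-class label;
# None marks a concept for which no comparable item is available.
JUDGMENTS = {
    "c1": {"Turkic": "A", "Mongolian": "A", "Manchu-Tungusic": "A", "Korean": "B", "Japanese": "A"},
    "c2": {"Turkic": "A", "Mongolian": "A", "Manchu-Tungusic": "B", "Korean": "A", "Japanese": "C"},
    "c3": {"Turkic": "A", "Mongolian": "B", "Manchu-Tungusic": "B", "Korean": None, "Japanese": "B"},
    "c4": {"Turkic": "A", "Mongolian": "B", "Manchu-Tungusic": "C", "Korean": "C", "Japanese": "D"},
}


def shared_percentage(judgments, branch_a, branch_b):
    """Percentage of concepts attested in both branches whose labels match."""
    compared = 0
    matched = 0
    for classes in judgments.values():
        a = classes.get(branch_a)
        b = classes.get(branch_b)
        if a is None or b is None:
            continue  # skip concepts missing in either branch
        compared += 1
        if a == b:
            matched += 1
    return 100.0 * matched / compared if compared else 0.0


def pairwise_matrix(judgments):
    """Shared-cognate percentage for every unordered pair of branches."""
    branches = sorted({b for classes in judgments.values() for b in classes})
    return {
        (a, b): shared_percentage(judgments, a, b)
        for a, b in combinations(branches, 2)
    }


if __name__ == "__main__":
    for pair, pct in pairwise_matrix(JUDGMENTS).items():
        print(pair, round(pct, 1))
```

The arithmetic itself is trivial; as the surrounding discussion makes clear, the disputed step is the assignment of the cognate-class labels, which depends entirely on which phonetic correspondences are accepted as regular.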

(3) Criticism on phonetic grounds. As a rule, “pro-Altaicists” claim that they work in the Neogrammarian pattern, searching for rigorous sound correspondences between compared items to prove the non-accidental nature of the links between them; tables of such correspondences are always adduced in major monographs on the Altaic hypothesis. Nevertheless, most models of Proto-Altaic phonology have been regularly criticized for numerous flaws, such as incompleteness of the proposed systems, typologically dubious solutions, unconditioned splits of reflexation in daughter languages, and frequent violations of the suggested phonetic laws, overlooked by Altaicists in order to maximize the numbers of etymological cognates between the various branches of Altaic.

(4) Criticism on grammatical grounds. Defenders of the Altaic theory often hold differing views on the grammatical properties of Proto-Altaic, but nobody has succeeded so far in reconstructing a credible paradigm of nominal declension or verbal conjugation that would be comparable, for instance, with the success achieved in the reconstruction of such paradigms for uncontroversial families such as Indo-European, nor has a complete working model of the system of morphemes that were used for nominal and verbal derivation been proposed (EDAL has a large list of short derivational suffixes postulated for Proto-Altaic, but their alleged functions remain mostly unclear). This is seen as a particularly serious flaw by those linguists who regard grammatical evidence and grammatical reconstruction as evidence of the highest order for proving a hypothesis of genetic relationship.

(5) Criticism on areal grounds. Arguably the most commonly voiced objection against the Altaic theory is its inability to convincingly distinguish between inherited linguistic elements and similarities that may have had an areal origin. It is a well-known historical fact that many of the alleged daughter branches of Altaic were heavily influenced by one another over the past 2,000 years: cases of particularly intense linguistic contact involve numerous Turkic borrowings into Mongolian and vice versa (beginning in the 13th century), numerous Mongolian borrowings into Manchu (and, through Manchu, into the other Tungusic languages), and, in the context of “Macro-Altaic,” a significant influence of the Korean language on Japanese in the middle of the 1st millennium ad. Since the phonetic systems of the concerned languages are typologically similar, many of the “Altaic” etymologies proposed in the original “Ramstedt/Poppe model” have indeed been discredited as erroneous projections of late-period contacts onto a much deeper level of language relationship.

In more recent versions of the Altaic theory, such as the “Moscow school” version presented in EDAL, a special set of filters has been introduced to distinguish recent borrowings from ancient cognates, based primarily on setting up different systems of correspondences for borrowings and inherited lexicon (for instance, Proto-Mongolian *h- corresponds to Proto-Manchu-Tungusic *p- in inherited Proto-Altaic etyma, but to Manchu-Tungusic word-initial zero in etyma that were borrowed from Mongolian only once *h- had been lost in Northern Mongolian dialects, that is, after the 14th century, etc.). However, while such filters may indeed be useful to stratify chronologically different lexical layers in the compared languages, they do not disprove the possibility that the allegedly “inherited” terms may simply represent earlier cases of borrowing that took place between the ancestors of Turkic, Mongolian, Manchu-Tungusic, and other “Altaic” languages in early history or even in prehistoric times.
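The logic of such a correspondence-based filter can be sketched schematically, using only the single word-initial rule mentioned above (Proto-Mongolian *h- against Manchu-Tungusic *p- or zero). The function name, the returned labels, and the sample strings are all hypothetical: the strings are invented placeholders rather than attested or reconstructed forms, and the vowel inventory is deliberately simplified to plain ASCII letters.

```python
# Simplified vowel inventory (an assumption; real reconstructions use
# diacritics and additional symbols that this sketch does not handle).
VOWELS = set("aeiou")


def classify_initial(proto_mongolian: str, manchu_tungusic: str) -> str:
    """Label a Mongolian / Manchu-Tungusic pair by its word-initial correspondence."""
    mo = proto_mongolian.lstrip("*")
    tu = manchu_tungusic.lstrip("*")
    if not mo or not tu or not mo.startswith("h"):
        return "not covered by this rule"
    if tu.startswith("p"):
        # *h- : *p- is the pattern assigned to the older layer.
        return "older-layer correspondence (*h- : *p-)"
    if tu[0] in VOWELS:
        # *h- : zero is the pattern assigned to loans made after *h- was lost.
        return "late-loan correspondence (*h- : zero)"
    return "irregular"


if __name__ == "__main__":
    # Invented placeholder strings, NOT real forms.
    samples = [("*hatu", "*patu"), ("*hatu", "*atu"), ("*hatu", "*katu"), ("*matu", "*matu")]
    for mo, tu in samples:
        print(mo, tu, "->", classify_initial(mo, tu))
```

The first label is deliberately worded as "older-layer" rather than "inherited": a rule of this kind stratifies the lexicon chronologically, but it cannot by itself tell inheritance apart from very early borrowing.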

(6) Criticism on grounds of general methodology. Supporters of the Altaic theory are often criticized for being over-permissive and subjective in their approach to data, usually choosing a “pro-Altaic” solution in cases when the evidence is really ambiguous. In return, “pro-Altaicists” frequently accuse their opponents of a “splitter” approach, in which genetic relationship of two or more linguistic units is only considered acceptable once all other options (such as accidental similarities and/or areal contact) have been disproved beyond the smallest degree of doubt, something that is barely possible when dealing with time depths on a scale of more than five or six millennia.

On the whole, Altaic linguistics is rich with open debate, triggered by each new significant publication in the field. Despite the fact that more than half a century of detailed discussions on the issue have not resulted in any form of scientific consensus, such discussions are generally acknowledged to be of great theoretical and methodological importance, as they lead to new insights on the capacities and limitations of the comparative method in recovering elements of human linguistic prehistory that predate relatively “easily” reconstructible proto-languages (such as Proto-Indo-European or Proto-Semitic). In their turn, any new breakthroughs in the methodology of comparative linguistics, including such areas as quantitative methods, diachronic semantics, typology of phonetic change, etc., will be of the utmost importance to the resolution of the ongoing “Altaic controversy.”

8. Critical Analysis of Scholarship

Available literature on the Altaic problem is not only extremely vast (not to mention scholarly works on the various branches of Altaic, which are even more numerous), but also tends to be highly subjective, usually written from a strongly expressed “pro-Altaic” or “anti-Altaic” perspective, so that making specific recommendations for getting an objective picture and keeping up to date with the latest progress in Altaic linguistics is not an easy task. Here we list only a very small selection of available works, focusing on either “landmark” publications or general studies with large bibliographies, from which the reader may proceed to more specialized publications if necessary.

For a general overview of the various “Micro-Altaic” languages and their respective philological traditions, De Rachewiltz and Rybatzki (2010) may serve as a good entry point. A brief history and summary of the Altaic hypothesis may be found in two short but highly informative papers that, in a rare instance, try to maintain a neutral, objective stance: Vovin (1999) and Georg et al. (1999).

The classic “Poppe / Ramstedt model” of Altaic, which still remains more popular among certain Altaicists than the newer “Moscow school” model, is, of course, best described in the works of Nicholas Poppe and Gustav Ramstedt themselves, most accessibly in Poppe (1965). Less than a decade after the publication of that work, Altaic was integrated with Japanese evidence in Miller (1971), still arguably the most significant English-language study on the possible Altaic origins of that language. Various criticisms of this model are scattered through the works of Gerard Clauson, Gerhard Doerfer, and other scholars; among other works, the essays collected in Sinor (1990) provide some good insights into the nature of these criticisms.

Since the early 1990s, a series of new important studies on the Altaic problem have been written by scholars belonging to the so-called Moscow school of comparative linguistics, most importantly, Sergei Starostin, Anna Dybo, and Oleg Mudrak. Many of these studies are in Russian, but their culminating effort—“Etymological Dictionary of the Altaic Languages” (EDAL), written by all three authors—is easily available in English and may serve as a representative indicator of the current state of affairs in Altaic etymology. However, it should not be taken at face value, and is best consulted along with critical works written from both a “pro-Altaicist” perspective (e.g., Robbeets, 2005) and an “anti-Altaicist” one. Concerning the latter, a good summarizing example of contemporary thought on the Altaic problem is a comparison of Vovin (2005) (detailed, multi-faceted criticism of EDAL from all possible points of view) and Dybo and Starostin (2008) (an equally detailed reply to all of Vovin’s points).

Links to Digital Materials

The Tower of Babel: Includes a full searchable version of “Etymological Dictionary of the Altaic Languages,” as well as a collection of freely accessible publications on Altaic linguistics.