The Caucasian languages

The Caucasus mountain range (fig.1) is a linguistic
wonderland (though definitely not the kind of place I want to live, with
all that political trouble and economic problems; field linguists working
in this area require the courage and caution of a war reporter), with
about 50 different languages in an area the size of France. There are
almost as many languages in the Caucasus as there are in the rest of
Europe. Some of these languages belong to families found elsewhere. The
Indo-European family is represented by Armenian, Ossetian,
Talysh and Tati (all Iranian languages except Armenian),
and of course the modern lingua franca of the area,
Russian. The Azeri, Balkar, Karachay,
Kumyk and Nogay languages belong to the Turkic family,
and Aisor, a descendant of Aramaic, is a Semitic language.

Most languages of the Caucasus, however, belong to three families not
found elsewhere. These (about 40) languages are the ones meant when
linguists speak of Caucasian languages (as is done in this essay).
They are somtimes also called Palaeo-Caucasian, Old
Caucasian or, especially in eastern Europe, Ibero-Caucasian
(the 'Ibero-' in the latter name has nothing to do with the Iberian
peninsula, but refers to the ancient kingdom of Iberia in what is now
Georgia; the term is deprecated because it is so misleading).

Fig. 1. Ethnolinguistic groups in the Caucasus region.
(Source: CIA)

Classification of the Caucasian languages

This classification follows Catford (1977), numbers of speakers according
to the 1970 Soviet Union Census. Some languages will have more speakers
today, some less. Several languages also have speakers outside the
former USSR.

South Caucasian (SC, Kartvelian)

Relationships of Caucasian families?

One ought to keep in mind that these three families are distinct
families and not branches of a single "Caucasian" family. No
relationship between any two of the three families has been established
so far. Nor has any of them been proven to be related to any family
elsewhere in the world.
Many authors have raised such claims, but most of them (especially those
who claim that "Caucasian" was related to Basque or whatever, without
realizing that there are three different families) can be dismissed out
of hand. Any serious relationship hypothesis either involves only
one of the three families, or if it involves more than one, all
involved families have to be considered separately.

One of the first scholars to propose a relationship between a Caucasian
family and a non-Causasian family was Franz Bopp, one of the founding
fathers of Indo-European comparative linguistics, who assumed in 1847
that SC was related to Indo-European. The idea of such a relationship is
also entertained by the proponents of the 'Nostratic' hypothesis, who
group SC with Indo-European, Uralic, Altaic, Afro-Asiatic and Dravidian.
Some linguists also assume that the Hurrian and Urartean languages (two
related ancient languages of eastern Anatolia) are related to NEC. None
of these proposals have won much acceptance.

Common features of Caucasian languages

Even though the three Caucasian families are not known to be related to
each other, there are some similarities between them: they form a
Sprachbund or convergence area. However, they are also
very different from each other in many respects.

Phonology

Consonants

The Caucasian languages are famed for their rich consonant inventories.
All of them have labial, dental, velar and uvular stops as well as
alveolar, postalveolar and uvular fricatives. There are voiceless,
voiced and glottalized (ejective) stops.

The actual number of consonants varies from language to language.
Georgian has the smallest consonant inventory
(fig. 2); the other languages add more consonants to
it. The NWC languages have labialized and palatalized consonants as well
as lateral fricatives; the latter are also found in many NEC languages.
The most copious inventory is that of Ubykh
(fig. 3), with 80 consonants.

Labial

Dental

Alveolar

Postalveolar/Palatal

Velar

Uvular

Glottal

Aspirated stops

p

t

ts

tʃ

k

Glottalic stops

p'

t'

ts'

tʃ'

k'

q'

Voiced stops

b

d

dz

dʒ

g

Voiceless fricatives

s

ʃ

χ

h

Voiced fricatives

v

z

ʒ

ʁ

Nasals

m

n

Liquids

lr

Semivowels

w

j

Fig. 2. Consonant inventory of Georgian.

The SC languages are also famed for their formidable consonant clusters;
forms such as Georgian brts'q'inva 'to shine' are nothing unusual.
The NWC languages are also rich in consonant clusters, though not quite
as excessive ones as SC, and the NEC languages have few clusters.

Labial

Alveolar

Alveolo-palatal

Palato-alvolar

Retroflex

Palatal

Velar

Uvular

Glottal

Voiceless stops

ppʕ

ttw

[k] kjkw

qqʕqjqwqʕw

Voiced stops

bbʕ

ddw

[g] gjgw

Glottalized stops

p'pʕ'

t'tw'

[k'] kj'kw'

q'qʕ'qj'qw'qʕw'

Voiceless affricates

ts

tɕtɕw

tʃ

tʂ

Voiced affricates

dz

dʑ

dʒ

dʐ

Glottalized affricates

ts'

tɕ'tɕw'

tʃ'

tʂ'

Voiceless fricatives

f

s

ɕɕw

ʃʃw

ʂ

x

χχʕχjχwχʕw

h

Voiced fricatives

v

z

ʑʑw

ʒʒw

ʐ

γ

ʁʁʕʁjʁwʁʕw

Nasals

mmʕ

n

Laterals

lɬɬ'

Trill

r

Semivowels

wwʕ

j

Fig. 3. Consonant inventory of Ubykh.

Vowels

The Caucasian languages also have the reputation of being poor in
vowels. This, however, is undeserved. Georgian has an ordinary
five-vowel system /aeiou/; some
languages, such as Chechen, have even more. Only the NWC languages can
be said to have only two vowel phonemes, a high one and a low one.
However, these vowel phonemes have broad ranges of allophones, depending
on the neighbouring consonants, which are spread about the whole vowel
space.

Orthography

The only Caucasian language with a long-standing tradition of writing is
Georgian, which is written in its own beautiful script known as
Mχedruli (fig. 4; there are also other
variants of this alphabet that look quite different and are little used
today; the term Mχedruli refers specifically to the one
chiefly used in modern Georgia). It has been a literary
language since about 400 A.D.; the Georgian alphabet is specifically
designed for the Georgian language and the spelling is mostly phonemic.

The other Caucasian languages had no official status in Czarist Russia,
and were only occasionally written (using Georgian, Arabic or Cyrillic
letters) or, in most cases, not at all. This changed after the
revolution, when, at least nominally, all languages of the USSR were
granted equal rights (though in practice, Russian remained the primus
inter pares). Several Caucasian languages were developed as written
languages taught in schools, first using the Latin alphabet but later the
Cyrillic alphabet. Some languages gained official status in various
autonomous republics and provinces set up in the Caucasus area by the
Soviet government.

Fig. 4. The Georgian alphabet. (Source: Omniglot)

Morphology

All Caucasian languages are rich in inflection, but the inflectional
systems in the three families are very different from each other. The
NWC languages have only small numbers of noun cases (1 to 4),
while many NEC languages have very many (up to 40 and more, most of them
bimorphemic local cases). The SC languages are somewhere in the middle,
with similar case inventories as those found in the older Indo-European
languages. Almost all Caucasian are ergative at least to some
degree. Most NEC languages have a system of noun classes.

The verb is always inflected. The NWC languages have especially
complex verb morphology. The verb agrees with subject and object in
person and number; there are also many tenses, moods and other
inflections. The NEC and SC verbs also show complex morphology; in most
NEC languages, verbs show noun class agreement.

Syntax

Typical for Caucasian languages is, as already mentioned, the ergative
construction: the subject of an intransitive verb is treated like the
object of a transitive verb. The word order is SOV in most
languages; most Caucasian languages use postpositions
rather than prepositions.

Northwest Caucasian

The NWC family consists of five languages, one of which is now extinct.
Abkhaz is spoken in Abkhazia, which nominally forms the
northwestern tip of the Republic of Georgia, but is actually under
Russian control. Abaza is closely related to Abkhaz. It is
spoken in the Karachay-Cherkess Republic of Russia. Ubykh was
formerly spoken near Sochi, Russia. It went extinct in 1992, when its
last native speaker, Tevfik Esenç, died. Adyghe is spoken
in the Republic of Adyghea, Russia; its close relative Kabardian
in the Kabardino-Balkar and Karachay-Cherkess Republics, Russia. When
Russia conquered the Caucasus in 1864, the majority of NWC speakers
(including all Ubykhs) emigrated to Turkey, where some NWC-speaking
communities still exist today.

Phonology

The NWC languages are known among linguists for their large consonant
inventories, ranging from 47 in Kabardian
(fig. 5) to 80 in Ubykh (fig. 3).

Labial

Alveolar

Alveolo-palatal

Palato-alvolar

Alveolarlateral

Palatal

Velar

Uvular

Pharyngeal

Glottal

Voiceless stops

p

t

kw

qqw

Voiced stops

b

d

gw

Glottalized stops

p'

t'

kw'

q'qw'

ʔʔw

Voiceless affricates

ts

tʃ

Voiced affricates

dz

dʒ

Glottalized affricates

ts'

tʃ'

Voiceless fricatives

f

s

ɕ

ʃ

ɬ

xxw

χχw

ħ

Voiced fricatives

v

z

ʑ

ʒ

ɮ

γ

ʁʁw

Glottalized fricatives

f'

ɕ'

ɬ'

Nasals

m

n

Trill

r

Semivowels

w

j

Fig. 5. Consonant inventory of Kabardian.

The vowel inventories are small. The NWC languages are often described
as having only two vowel phonemes, a high one and a low one.
These two phonemes, however, have different allophones depending on the
adjacent consonants (fronted next to palatalized consonants, rounded next
to labialized consonants). Thus, the high vowel can be realized as [i],
[ɨ], [u]; the low vowel as [ɛ], [a], [ɔ]. This
means that on the phonetic level, the langugaes show rather normal
vowel inventories. It is likely that Proto-NWC had a normal vowel
inventory, but transferred the features of fronting and rounding from the
vowels to the neighbouring consonants.

Stress is distinctive in the NWC languages, e.g. Abkhaz
'aχwaʂa 'unfortunate',
a'χwaʂa 'Friday',
aχwa'ʂa 'piece'.
The languages have mostly open syllables and rather complex
morphophonemic alternations.

Morphology

Nouns

Nouns in NWC are inflected for case, number and definiteness. The case
systems are rather simple. Abkhaz has only an unmarked absolutive and an
adverbial case. Adyghe has a four-case system
(fig. 6). Possession is indicated by pronominal
prefixes.

Indefinite

Definite

Singular

Plural

Singular

Plural

Absolutive

-Ø

-xa(-r)

-r

-xa-r

Oblique

-Ø/-ə

-m-a/-xa-m(-a)

-m

-m-a/-xa-m(-a)

Instrumental

-tʃ'a

-xa-tʃ'a

-m-tʃ'a

-xa-m-tʃ'a

Adverbial

-(a)w

-xa-w

-(a)w

-xa-w

Fig. 6. Nominal inflection in Temirgoi Adyghe.

Personal pronouns distinguish gender in the 2nd and 3rd persons
(fig. 7). Only Abkhaz and Abaza have true 3rd person
pronouns; Ubykh and the Circassian languages use demonstratives instead.
Demonstratives in Ubykh distinguish 'this' and 'that'; in the other
languages, 'this (near me)', 'that (near you)' and 'that (yonder)'.

Singular

Plural

1st

sa

ħa

2nd masc.

wa

ʃwa

2nd fem.

ba

ʃwa

3rd masc.

ja

da

3rd fem.

la

da

3rd non-human

ja

da

Fig. 7. Personal pronouns in Abkhaz.

Verbs

Verbs show very complex morphology in the NWC languages. They are
conjugated for subject, direct object and indirect object. There are
three sets of conjugation prefixes: set I for intranstive subjects and
transitive objects (all NWC languages are ergative), set II for indirect
objects, set III for transitive subjects (fig. 8).

Syntax

The NWC languages have SOV word order and are postpositional. Genitives
precede the nouns, but most adjectives follow. Subordinate clauses
precede the main verb; the verb in the subordinate clause appears in an
infinite form called a converb (similar to Turkish or Japanese).

Because NWC languages are ergative, intransitive and transitive
clauses are constructed differently, and different verb
inflections are required whether an object is present or absent,
as in the following example from Abkhaz:

The NWC languages do not use finite subordinate clauses;
instead, non-finite verb forms (participles and converbs) are
used.

An example of an Ubykh relative clause:

(2)

za-'mʕa
Ø-t-χja-'wə-s-twə-nan-apple(-ABS) it-whom-for-you-I-give-DYN-NON.FIN.PRES
'the man for whom I give you an apple'

An example of an Abkhaz complement clause:

(3)

wə-s
Ø-a'χjə-j-ħwa-Ø-zthat-as it-that-he-say-PAST-NON.FIN
'It is a lie that he spoke thus.'

Northeast Caucasian

The NEC family is deeply divided into two branches: Nakh and
Daghestanian. Nakh and Daghestanian languages are different
enough in their vocabularies that some linguists doubt their
relationship, but that is a minority position; anyway, the languages are
similar enough to be treated together (there are some typological
differences, though). The Nakh group consists of three languages:
Chechen, spoken mainly in the Chechen Republic (Russia),
Ingush in the neighbouring Ingush Republic (Russia), and
Bats, spoken in the Pankishi valley in northeastern Georgia.
The Daghestanian group consists of 26 languages, most of which are spoken
in the Republic of Daghestan (Russia). The most important Daghestanian
languages are Avar and Lezgian. Some Daghestanian
languages are only spoken in a single village.

No external relationship of the NEC languages has been established so
far. Some linguists have proposed a relationship of NEC with two
interrelated extinct ancient languages of eastern Anatolia,
Hurrian and Urartean. Another relationship candidate is
Burushaski, an isolate in northeastern Pakistan, which is
typologically similar to the Daghestanian languages. However,
typological similarity is not sufficient to establish relationship.

Phonology

The NEC languages have rich consonant inventories, though less so than
the NWC languages. As in other Caucasian languages, there are ejective
and uvular consonants as well as two or more series of sibilants. Many
NEC languages have lateral fricatives as well as 'intensive' (tense or
geminate) consonants. The inventory of Avar
(fig. 10) is given below.

Labial

Alveolar

Palato-alveolar

Lateral

Palatal

Velar

Uvular

Pharyngeal

Glottal

Voiceless stops

p

t

kk:

q:

Glottalic stops

(p')

t'

k'k':

q':

ʔ

Voiced stops

b

d

g

Voiceless affricates

tsts:

tʃtʃ:

tɬ:

Glottalic affricates

ts'ts':

tʃ'tʃ':

tɬ':

Voiceless fricatives

ss:

ʃʃ:

ɬɬ:

x:

χχ:

ħ

h

Voiced fricatives

z

ʒ

ʁ

ʕ

Nasals

m

n

Approximants

w

r

l

j

Fig. 10. Consonant inventory of Avar.

Some languages also have pharyngealized, labialized and palatalized
consonants. The vowel systems vary much. Avar has aeiou; some languages have front rounded
vowels, back unrounded vowels or both. Khinalug and the Nakh languages
also have diphthongs. Some Daghestanian languages are tonal.

Morphology

Nouns

Nouns in NEC languages are grouped into noun classes. These are usually
not marked on the noun itself (with exceptions such as Avar w-ats:
'brother', j-ats: 'sister'), but adjectives and verbs agree to
them. The number of noun classes varies from language to language. Most
languages have three (male human, female human, non-human) or four
classes. The Nakh languages have more; Tabasaran has only two (human
vs. non-human); Lezgian, Agul and Udi have no noun class disctinction at
all. There is thus a cline running from the northwest (many classes) to
the southeast (few classes) within the family.

bəd

Ø-iʔeru

oʒe

'this small boy'

bodu

j-iʔeru

kid

'this small girl'

bəd

j-iʔeru

tselu

'this small drum'

bodu

b-iʔeru

wə

'this small dog'

bəd

r-iʔeru

tʃ'it'

'this small knife'

Fig. 11. Noun class agreement in Hunzib NPs.

Fig. 11a. Noun class inventory sizes in NEC.

The nouns themselves inflect for case and number. Usually, the base
stem is also the absolutive singular. From this stem, the oblique
singular stem is formed with a suffix, the absolutive plural with a
different suffix, and the oblique plural stem from the absolutive
plural. There are also different patterns.

The Nakh languages have case inventories comparable to those of SC and
the older Indo-European languages. Most Daghestanian languages, in
contrast, have very rich inventories of local cases. These cases are
formed by the combination of suffixes indicating orientation and
direction, and in some languages, also a non-distal/distal
opposition (fig. 12).

Non-distal

Distal

Essive

Allative

Ablative

Essive

Allative

Ablative

'in'

-a:

-a:-r

-a:j

-a:z

-a:z-a-r

-a:z-aj

'among'

-ɬ

-ɬ-er

-ɬ-a:j

-ɬ-a:zz

-ɬ-a:zz-a-r

-ɬ-a:z-aj

'on (horizontal)'

-tɬ'(o)

-tɬ'o-r

-tɬ'-a:j

-tɬ'-a:z

-tɬ'-a:z-a-r

-tɬ'-a:z-aj

'under'

-tɬ

-tɬ-er

-tɬ-a:j

-tɬ-a:z

-tɬ-a:z-a-r

-tɬ-a:z-aj

'at'

-x(o)

-xo-r

-x-a:j

-x-a:z

-x-a:z-a-r

-x-a:z-aj

'near'

-de

-de-r

-d-a:j

-d-a:z

-d-a:z-a-r

-d-a:z-aj

'on (vertical)'

-q(o)

-qo-r

-q-a:j

-q-a:z

-q-a:z-a-r

-q-a:z-aj

Fig. 12. Local cases in Tsez.

Verbs

Verb morphology in NEC languages is simpler than in the NWC languages.
Verbs usually inflect for tense, aspect and mood, and in many languages,
take class prefixes agreeing with the noun class of the absolutive
argument. Some of the more southerly languages have person-number
markers on the verb.

Syntax

The NEC languages allow for much freedom in word order, due to the
extensive case marking; the basic, unmarked order is SOV. Adjectives and
genitives precede the noun, postpositions follow it. Objects
are focused by OVS order, subjects by OSV order, as in the
following Archi examples:

South Caucasian

The South Caucasian (Kartvelian) family consists of four languages.
Georgian is the largest Caucasian language. It is the official
language of Georgia and the only Caucasian language with a long-standing
literary tradition, written in an alphabet of its own
(fig. 4). Mingrelian (also spelled
Megrelian), spoken in northwestern Georgia, is closely related to
Laz, which is spoken mainly along the Black Sea in Turkey near the
Georgian border, and also in a small area in the southwestern corner of
Georgia. Svan in northern Georgia is the most distantly related
member of the family.

Phonology

The Kartvelian languages are the least consonant-rich of the Caucasian
languages. The consonant inventory of Georgian is shown in
fig. 2; the other languages of the family have
similar inventories. These consonants form extensive clusters. Georgian
has the five vowels aeiou.
Mingrelian adds schwa, Laz front rounded vowels, Svan both. Vowel
gradations similar to Indo-European ablaut occur in the SC languages;
Swan also has i-umlaut.

Morphology

The South Caucasian languages, like all Caucasian languages, are rich in
inflectional morphology. They are intermediate between the NWC
languages with their rich verbal and less rich nominal morphology, and
the NEC languages with their simpler verbs and more richly inflected
nouns. Of the three Caucasian families, South Caucasian is the one most
similar to Indo-European.

Nouns

Nouns in SC languages are inflected for case and number. There are no
genders or noun classes. The numbers are singular and plural; the case
systems are more developed than in NWC, but not as luxuriant as in the
Daghestanian languages. They are indeed comparable to the systems of the
older Indo-European languages (fig. 13).

Singular

Plural (old)

Plural (modern)

Nominative

kali

kalni

kalebi

Vocative

kalo

kalno

kalebo

Ergative

kalma

kalta

kalebma

Genitive

kalis

kalta

kalebis

Dative

kalsa

kalta

kalebsa

Instrumental

kalit

kalta

kalebit

Adverbial

kalad

kalta

kalebad

Fig. 13. Nominal inflection in Georgian (kali 'woman').

The case alignment in Georgian is split. In the present tense, the case
marking is accusative, with the dative case used as an accusative. In
the aorist, it is split-S. Direct objects of transtitive verbs and
subjects of stative verbs are in the nominative, while subjects of
transitive and of active intransitive verbs are in the ergative.

Verbs

The Kartvelian verb is a complex affair, and this section will only touch
on the basics. The verb is inflected for tense, aspect and mood as well
as the person and number of subject and object. There are numerous
prefixes, suffixes and circumfixes.

Subject

Object

Singular

Plural

Singular

Plural

1st person

v-

v- -t

m-

gv-

2nd person

Ø/χ-

Ø/χ- -t

g-

g- -t

3rd person

-a/o/s

-n

Ø-

Ø- -t

Fig. 14. Personal affixes in Georgian.

The tense/aspect/mood forms ('screeves', from Georgian mts'k'rivi
'row') of Georgian are grouped in three series:

The three series show different morphosyntactic alignment. In the
present series, it is accusative, using the nominative case to mark
subjects and the dative case to mark direct objects. In the aorist
series, it is active/stative, using the ergative to mark agents and the
nominative case to mark patients. In the perfect series, it is also
active/stative, but agents appear in the dative case. An exception to
this are verbs of perception and emotion, which always have dative
subjects and nominative objects.

Syntax

Of all three Caucasian stocks, the Kartvelian languages are
syntactically most like the Indo-European languages. There are,
for instance, relative pronouns, a feature that is common in
Europe but rare anywhere else:

Modern Georgian is a mostly head-final SOV language with postpositions.
Old Georgian is less consistently head-final: modifiers often follow
their heads. Old Georgian genitives show suffixaufnahme,
i.e. they agree with the head noun in case and number. Example: