Language Evolution

06 November 2014

Here is a fascinating infographic presentation posted at Lovely Little Lexemes (hat tip to Mrs. B!). A curious (and probably unique) relationship can be observed between the United Kingdom and Poland: the most common second language in the UK is Polish, and the most common second language in Poland is English.

07 October 2014

Neuter nouns
with the suffix *-wr̥/*-w(e)n- are relatively rare in most branches of Indo-European.
The only group where they can be found in great numbers is Anatolian. In
Hittite, the suffix productively formed
verbal nouns (names of actions), but there are also examples of nouns that had become independent lexical units, no longer
bound to a particular verb paradigm. They had usually acquired a concrete
meaning (referring to a thing or substance rather than an abstraction). One
such noun is Hitt. pahhur/pahhuen- ‘fire’, evidently an ancient word, preserved
in many branches of the family and showing evidence of archaic vowel
alternations and mobile stress: nom/acc.sg. *páh₂wr̥, gen.sg. *ph₂wéns, etc. It
may be etymologically connected with the verb *pah₂- ‘guard, protect’, but it’s doubtful
whether even the speakers of Hittite were still aware of any such connection: the semantic
distance between the verb and its derivative was already too great.

Outside
Anatolian, the suffix does not play any major role. The nouns that contain it
are scattered remnants of a Proto-Indo-European pattern of word-formation.
Their attestation is very uneven. They are quite well represented in Sanskrit
and Greek, but only isolated examples are found elsewhere (the ‘fire’ word,
which became part of Indo-European basic vocabulary sufficiently early, is
exceptionally well attested). Here are a few typical *-wr̥/*-w(e)n- nouns
evidently connected with known verb roots:

Their
reflexes in the historically documented languages rarely display the whole range of vowel, consonant and stress variations, most of which were levelled out analogically in prehistoric times. Still, these alternations are reconstructible thanks to the fact that different fragments of the pattern have been preserved
in different languages. They can be reassembled into a complete picture like
the pieces of a jigsaw puzzle or the disarticulated skeleton of a fossil animal.

Got wheels?
A four-wheeled toy from the Cucuteni-Trypillian culture;
the early fourth millennium BC.

Neuters of
this kind formed collectives by inserting a lengthened *ō into the suffix. The
collective of a count noun denotes simply a set of objects (a collective
plural), while the collective of a mass noun like ‘fire’ denotes a particular quantity or sample of the thing in question (‘a fire, a burning mass’). This became one of the derivational mechanisms
by which Indo-European mass nouns could be transformed into count nouns. The accent was commonly shifted to the suffix in the process,
causing the reduction of the root vowel: *páh₂wōr (collective) > *ph₂wṓr > *pwṓr (a countable neuter with its own
case forms such as gen.sg. *p(h₂)un-és). Still later, the distinction between
the original mass noun and its collective could be blurred and abandoned, the younger form ousting
the older and serving in both functions (‘fire’ or ‘a fire’). The archaic Proto-Indo-European form *páh₂wr̥ is
unambiguously preserved only in Anatolian, while the remaining Indo-European
languages show reflexes of *pwṓr or its further modified descendants.

Now we can view the reconstruction *kʷét-wr̥ in this light. Supposing it was derived from our hypothetical verb root *kʷet- ‘group into pairs’, the original meaning of *kʷétwr̥ (as a nomen actionis) would be something like ‘pairing’, and its collective *kʷétwōr would mean ‘a particular result of pairing, a complete set organised into pairs’. In the Proto-Indo-European world, there were many “natural” sets of things conceptualised as consisting of two pairs: human hands and feet; fore and rear legs of animals; the wheels of a wagon; the four directions, whether cardinal (east and west, north and south) or relative (forward and backwards, left and right); paired organs of perception (two eyes and two ears). This could have provided sufficient motivation for treating ‘4’ as the prototypical case of an “even collective”. An interesting parallel can be seen in the “fraternal” numeral systems widespread in Amazonia. In the languages that employ them, the numeral ‘4’ is derived from an expression meaning ‘each has a brother/companion/spouse’. At a more primitive stage, preserved in the Dâw language, there are only three “exact” lexical numerals, ‘1’, ‘2’, and ‘3’. The values from 4 to 10 are described as ‘even’ (‘has a brother’) or ‘odd’ (‘has no brother’). The precise value can’t be expressed linguistically, but the words ‘even’ and ‘odd’ can be supplemented by clarifying hand gestures:

Dâw
speakers indicate ‘four’ by holding the fingers of one hand separated into two
blocks; for ‘five’, they add the thumb; for ‘six’, they place the second thumb
against the first to make a third pair; and so on until for ‘ten’ all fingers
are grouped into five pairs, the thumbs together.

[Epps
2006: 265]
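The even/odd tally described in the quotation can be restated as a small Python sketch. It is only an illustration of the arithmetic behind the gestures: the function name daw_tally and the English glosses are hypothetical labels, not Dâw forms, and the sketch assumes nothing beyond what the quoted passage says.

```python
def daw_tally(n):
    """Describe a value from 1 to 10 in the way the quoted passage suggests.

    Hypothetical helper: the glosses are English paraphrases, not Dâw words.
    """
    if not 1 <= n <= 10:
        raise ValueError("the described system covers 1 to 10 only")
    if n <= 3:
        # '1', '2' and '3' have exact lexical numerals of their own.
        return {"word": "exact numeral", "pairs": None, "unpaired": None}
    return {
        # Even values 'have a brother'; odd values 'have no brother'.
        "word": "has a brother" if n % 2 == 0 else "has no brother",
        # The clarifying gesture groups fingers into pairs ...
        "pairs": n // 2,
        # ... and leaves at most one finger without a partner.
        "unpaired": n % 2,
    }


for value in (4, 5, 6, 10):
    print(value, daw_tally(value))
```

On this model ‘4’ is simply the smallest value for which the spoken word is ‘even’, shown as two pairs of fingers, which is the kind of “even collective” reading suggested above for *kʷétwōr.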

Once
established as a concrete numeral (rather than part of an even-odd tally
system), *kʷétwōr (or *kʷətwṓr) was interpreted as an ordinary neuter plural, and –
like the numerals ‘1’, ‘2’, and ‘3’ – formally an adjective, inflected not only
for case but also for gender. This resulted in the analogical creation of the
animate plural in *-wor-es (and the periphrastic feminine ‘four females’, soon
univerbated and phonetically mutilated in the process). Note that if the
adjective had been formed directly from the verbal noun *kʷétwr̥/*kʷ(ə)twén-,
its animate plural would probably have ended up as *kʷet-won-es. In addition to the Greek
and Vedic words for ‘fat’, already discussed, compare Greek peîrar (gen. -atos)
‘boundary’ < *pér-wr̥/*pr̥-w(e)n- versus the Homeric adjective a-peírōn (animate) ‘boundless,
endless’ < *n̥-per-wōn.

All this
suggests that the word *kʷétwr̥ (coll. *kʷétwōr) was transparently derived from a verb root and
adopted as a cardinal numeral at a rather late date, perhaps in “Core Indo-European”
(the non-Anatolian part of the family) rather than in Proto-Indo-European
proper. It is a well-known fact that Anatolian has a different word for ‘4’,
*meju- (Hittite meu-/meyau-, Luwian māwa-). Since the jury is still out on
whether Hittite kutruwa(n)- ‘witness’ has anything to do with the numeral ‘4’*),
we should seriously consider the possibility that the familiar reconstruction *kʷetwores
is not Proto-Indo-European at all but represents a “dialectal” innovation which
replaced its older synonym in the common ancestor of Tocharian and the extant branches of the family.

If this
were a journal article rather than a blog post, I would now be obliged to account
for every puzzling irregularity in the branch-specific reflexes of *kʷetwores and
its variants. I will spare my visitors such excruciating details, but if anyone
is really interested in discussing them, welcome to the Comments section.

And now
back to other matters – next time.

*) A
witness in court could be denoted as ‘the fourth man’ (beside the
two contracting parties and the judge).

Reference

Epps, Patience. 2006. “Growing a numeral system: The historical development of numerals
in an Amazonian language family”. Diachronica 23(2): 259-288. [a
preprint version is available here]

02 October 2014

The Latin
adjective triquetrus ‘triangular’ (neuter -um, feminine -a) is baffling. It’s
obviously a compound, and it obviously contains the compositional form of the
numeral ‘three’, *tri-. What else it contains is anything but obvious.
Unfortunately, it’s the only specimen of its kind. The mysterious element
-quetrus does not occur in any other Latin compound. It looks as if it could
have something to do with quattuor ‘four’. When ‘four’ occurs as the first part
of a compound, it has the shape quadru/i-. This form must somehow go back to *kʷətwr̥-,
its metathetic variant *kʷətru-, or a hybrid combination of both, but the voicing of the *t
is odd, not to say perverse, because its exact opposite, *dr > tr, was a regular
change in the prehistory of Latin. The word ‘four’ is evidently such a fickle fellow that
it just can’t resist breaking some established rules. For greater
inconsistency, the adverbial numeral quater ‘four times’, which in other IE
languages (and presumably in Latin as well) derives from *kʷ(e)twr̥-s ~
*kʷ(e)tru-s, shows no voicing. We see a voiced stop again, though, in the denominal verb
quadrō ‘to square; put in order, arrange’ and a few related words such as quadra
‘square piece or slice, plinth, dining table, etc.’ and quadrātus ‘square (n. and adj.)’.

The second
part of triquetrus doesn’t simply reflect *kʷetru- (or *kʷatru- < *kʷətru-),
because the word is a second-declension o-stem, which means that its pre-form
ended in *-tro- rather than *-tru-. The form *kʷetro- (or possibly *kʷatro-,
since pre-Latin *a would have merged with *e in this position) does not
otherwise occur as a variant of ‘4’ in Latin, but since we are dealing with a
capricious word-family, it’s hard to rule out a connection. If it does mean
‘four’, however, why’s that? A triangle has three sides, it has three angles,
but has it got three “fours”? It would not be strange if a word for the right
angle had something to do with squares or rectangles, and therefore indirectly
with the numeral ‘4’, but a triangle can have at most one right angle,
certainly not as many as three (the Penrose tribar, shown on the right, would be an exception if it could exist in ordinary Euclidean space).

Can
external cognates help? It’s tempting to compare triquetrus with Old English
þrifeoþor (sometimes glossed as ‘triangular’ in reference books such as Bosworth and Toller’s Anglo-Saxon Dictionary). It has been suggested earlier by
one of the commenters on this blog [Douglas G. Kilday]
that the Old English word is a loan from (unattested) Gaulish *petros ‘corner’ (<
*kʷetros), which became Germanic *feþra- after the operation of Grimm’s Law. This
tantalising suggestion, however, can’t be correct. The word þrifeoþor appears
in Old English glossaries (Corpus, Erfurt, and Épinal) three times (spelt ðrifeoðor, trifoedur, ðrifedor),
and is translated into Latin as triquadrum. One might think that triquadrum is
a distortion of triquetrum caused by “folk etymology” (the mistaken
identification of the second part as the compositional form of ‘4’), but in
fact it’s no such thing. Old English authors took the adjective triquadrus from
Orosius, a Christian priest and scholar from the Roman province of Gallaecia (today’s Galicia, Spain). Orosius, active in the first decades of the 5th century, was the author
of several enormously influential works, including
Historiae Adversus Paganos, with a chapter on the geography of the
world. Here is the relevant passage (Book 1, Chapter 2; emphasis added):

[Our elders
made a threefold division of the world, which is surrounded on its periphery by
the Ocean. Its three parts they named Asia, Europe, and Africa.
Some authorities, however, have considered them to be two, that is, Asia,
and Africa and Europe, grouping the last two as one continent.]

The epithet
triquadrus refers to “the circle of all the earth” (orbis totius terrae = the
world). Orosius certainly doesn’t mean that the Earth is a triangular circle, or
that it has three corners. He means that the landmass of the world (as he knew
it) is tripartite, divided by most ancient geographers into three continents (in
this context, quadra means ‘part, division, area’, not literally a square). Anglo-Saxon translators coined a calque, mechanically replacing Latin
quadr- with feoþor- < *kʷetwr̥-, the compositional form of Old English fēower
‘four’. Þrifeoþor was never intended to mean ‘triangular’. Its second member is
the same feoþor- (= Late West Saxon fiþer-, fyþer-) that we find as the first
element in numerous Old English compounds, e.g. fiþerfēte ‘four-footed’ (=
Latin quadrupēs).

External
support for *kʷetro- thus evaporates, but triquetrus still has to be explained
somehow. I would suggest that its second element is a derivative of *kʷet- ‘join pairwise’ with the instrumental suffix *-tro-. When the suffix was added to a
root ending in a dental stop, the last segment of the root was dropped already in Proto-Indo-European (this process is known as “the metron
rule”). Thus we get *métrom (Greek métron ‘measure’) from *méd-trom (*med- ‘allot,
mete out’), and *h₁étrom (Vedic átra- ‘nourishment’) from *h₁éd-trom (*h₁éd- ‘eat’).
The noun *kʷétrom < *kʷet-trom would be ‘something that holds a pair of
things together’, hence ‘joint, connection’ or the like. There were several
Proto-Indo-European roots with similar meanings, and accordingly several nearly
synonymous nouns for things like woodworking joints; joint
itself comes (via French) from Latin iunctus ‘connected’ (the root here is
*jeug-, as in yoke). Tri-quetrus (< *tri-kʷetro-) is built exactly like tri-angulus (a noun is
used as the second member of a compound adjective without altering its stem
class), and its etymological meaning is ‘having three connections (between
pairs of sides)’.
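The “metron rule” is mechanical enough to be sketched in Python. This is a toy illustration, not a full account of Proto-Indo-European phonology: the function name add_tro and the list of dentals are assumptions made for the example, and roots are passed in as plain strings without the asterisk.

```python
# Root-final dental stops that are dropped before the suffix *-tro-.
# Longer symbols come first so that "dʰ" is not mistaken for plain "d".
DENTALS = ("dʰ", "t", "d")


def add_tro(root, ending="m"):
    """Attach *-tro- (plus a case ending) to a root, applying the metron rule."""
    for dental in DENTALS:
        if root.endswith(dental):
            # The last segment of the root is deleted before *-tro-.
            root = root[: -len(dental)]
            break
    return root + "tro" + ending


for root in ("méd", "h₁éd", "kʷét"):
    print("*" + root + "-", "->", "*" + add_tro(root))
```

Fed the three roots discussed above, the sketch returns the forms given in the text (*métrom, *h₁étrom, and the proposed *kʷétrom); a root that does not end in a dental is passed through unchanged.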

The next
post, in which I shall return to the numeral ‘four’ itself, will be the last in this series.

29 September 2014

What kind
of noun is čët? What is its relationship to our hypothetical verb root? One
cannot avoid asking such questions when proposing an etymology. A word is more
than a root; it has a derivational history. If you add an affix to a word, you
may alter both the lexical category and the meaning of the base. We already know a good
deal about morphological processes in the Indo-European languages, which means
that we can tell plausible relationships between possibly related words from
unlikely ones.

Let R be a
root morpheme. In Proto-Indo-European (and in many of the languages descended
from it), a root consists of a consonantal skeleton with a slot where a vowel can
be inserted. For example, the verb root *{w_rǵ} ‘make, work’ is normally quoted
in the form *werǵ-, called its e-grade, symbolised as R(e). Here, the slot is occupied
by the vowel *e. The same root also forms an o-grade, R(o), realised as *worǵ-,
and a zero grade, R(z), in which the vowel slot remains empty. In that case,
the liquid *r, sandwiched between two other consonants, has to play the role of a
syllable nucleus, and the root becomes phonetically *wr̥ǵ- (in the traditional
Indo-Europeanist notation, a tiny subscript ring marks a syllabic consonant).
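For readers who like to see such things spelt out, here is a minimal Python sketch of the three grades. It is a deliberate simplification: the skeleton notation with an underscore for the vowel slot follows the paragraph above, but the function name grades and the rule “a liquid or nasal right after the empty slot becomes syllabic” are assumptions of the sketch, which ignores what follows the root (a vowel-initial suffix would keep the sonorant non-syllabic).

```python
RING_BELOW = "\u0325"   # combining ring below, the mark of a syllabic consonant
SONORANTS = "rlmn"      # liquids and nasals that can serve as a syllable nucleus


def grades(skeleton):
    """Return the e-, o- and zero grade of a root skeleton such as 'w_rǵ'."""
    before, after = skeleton.split("_")
    zero = before + after
    if after and after[0] in SONORANTS:
        # With the vowel slot empty, the sonorant becomes the nucleus.
        zero = before + after[0] + RING_BELOW + after[1:]
    return {
        "R(e)": before + "e" + after,
        "R(o)": before + "o" + after,
        "R(z)": zero,
    }


print(grades("w_rǵ"))   # the 'make, work' root discussed above
print(grades("j_ug"))   # the 'yoke' root; its zero grade needs no ring
```

The first call reproduces *werǵ-, *worǵ- and *wr̥ǵ-; the second gives *jeug-, *joug- and the zero grade *jug-.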

One of the largest
and most productive classes of PIE nominals (nouns/adjectives) were the
so-called thematic nouns (also known as o-stems). Their stem ended in the vowel
*-o-, to which inflectional endings were attached. In the simplest case, the
vowel was added directly to the root; in more complex cases it was part of a
suffix (such as *-to-, *-no-, *-tero-, *-tlo-, etc.). Somewhat surprisingly, “simple
thematic” nouns of the shape R(e)-o-
were pretty rare in the protolanguage. The neuter action noun *wérǵ-o-m ‘work,
activity’ is well supported by the agreement between Germanic *werka- (Old
English weorc, German Werk) and Greek érgon; we also have Iranian (Avestan) varəza-, with
the same stem (and meaning) but with masculine inflections. Very few such nouns,
however, are truly old. More typically, the suffix *-o- was added to R(o), as
in *wóiḱ-o- ‘house, dwelling’ (root *weiḱ- ‘enter, occupy’) and sometimes to R(z),
as in *jug-ó- ‘yoke’ (root *jeug-, already mentioned in earlier posts).

Marc
Greenberg (2001) doesn’t define the morphological status of his reconstruction
*kʷet- (‘two’ > ‘pair, partner’). In some places in the article he treats it
as if it were a root noun (with no suffixes), but the simplest form we actually
find in Slavic is represented by Russ. čët (cf. dialectal Polish cot), which appears to reflect a thematic masculine noun
*kʷet-o-s ‘even number’. How could it have originated? If *kʷet- was once a
verb root (with the approximate meaning of ‘arrange in pairs, pair up’), *kʷet-o-
makes sense as a kind of action noun that has acquired a resultative interpretation:
by pairing objects together, you end up with an even number of them. (By the
way, the verb root is not entirely conjectural: we can see it in Russian četáť ‘form
pairs’.) The problem with *kʷet-o- is
that it represents a rare type of stem, at least in terms of PIE morphology. Is
it legitimate to posit it just like that?

On the
other hand, *kʷet-o- needn’t go all the way back to PIE. The deverbal formation R(e)-o- has enjoyed increased productivity in Slavic. We even
have doublets like R(o)-o- and R(e)-o-, where the o-grade variant is more
conservative (and has more external cognates), while the e-grade seems to be a younger innovation (with a more restricted distribution). Thus, the root *tekʷ- ‘run, flow’ has produced
Slavic *tekъ (as if from *tekʷ-o-s) ‘waterflow, leak, source’, which coexists
with *tokъ (< *tokʷ-o-s) ‘stream, current, flux; (figuratively) course, sequence
of events’. The former is an innovation directly connected with the Slavic verb
*tekti ‘leak, flow’ (3sg. *tečetь < *tékʷ-e-ti), whereas the latter is a relict form
which has drifted away from its etymological base, also semantically. Therefore,
if *četъ is a relatively recent derivative of a Proto-Slavic verb, it wouldn’t
be surprising if it had an o-grade cousin (possibly with a more “evolved”
meaning).

As a matter
of fact, Greenberg mentions *kotъ ‘offspring (of animals), litter’ and *kotiti
(sę) ‘have young’ as possible members of the same word-family. A connection
with the homophonous noun *kotъ ‘domestic cat’ (a European Wanderwort which spread with the introduction
of cats) is folk-etymological: the verb may be used of cats, but also of mice,
sheep, goats, roe deer, and a variety of other animals. It is used even in those
Slavic languages that have a different word for ‘cat’ (e.g. Serbo-Croatian mačka). The verb *kotiti could
be an “iterative/causative” built to the root *kʷet-. The structure of such secondary
verbs is R(o)-éje/o- (the final vowel of the stem alternates depending on which
conjugational ending is added). For example, the Slavic verb *gъnati (3sg. *ženetь)
‘drive on, drive away, rush’ has a corresponding o-grade iterative, *goniti (3sg. *gonitь) ‘chase,
run after’. These forms ultimately reflect PIE *gʷʰén-/*gʷʰn- ‘slay, kill with blows’ (a root verb, somewhat restructured in Slavic) and its PIE iterative *gʷʰon-éje/o-.
The verb *tekti (< *tékʷ-e/o-), mentioned above, forms a pair with the
causative *točiti ‘cause to flow, (cause to) roll’ (< *tokʷ-éje/o-). Note also such English
pairs as lie vs. lay, or sit vs. set, where the first member is a primary verb
and the second is its causative (e.g. ‘lay’ = ‘cause to lie’).
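The pattern R(o)-éje/o- can be written down as a one-step string operation. The Python below is only a sketch under simplifying assumptions: the function name iterative_causative is hypothetical, roots are entered in their unaccented e-grade, and the slot vowel is taken to be the first e in the string.

```python
def iterative_causative(root_e_grade):
    """Build the R(o)-éje/o- stem from an e-grade root such as 'tekʷ'."""
    if "e" not in root_e_grade:
        raise ValueError("expected an e-grade root such as 'tekʷ'")
    # Swap the slot vowel e for o, then add the suffix with its alternating
    # final vowel (e or o, depending on the conjugational ending).
    o_grade = root_e_grade.replace("e", "o", 1)
    return "*" + o_grade + "-éje/o-"


for root in ("tekʷ", "gʷʰen", "kʷet"):
    print("*" + root + "-", "->", iterative_causative(root))
```

The first two roots yield the stems cited above (*tokʷ-éje/o-, *gʷʰon-éje/o-); the third gives *kʷot-éje/o-, the pre-form that would underlie *kotiti if the verb belongs here.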

The stem *kʷot-éje/o-,
originally with middle-voice inflections (whose function was taken over by the
reflexive/reciprocal pronoun *sę in Slavic), would mean ‘form a couple (together)’, hence ‘mate,
have sex’, and eventually ‘reproduce, have young’. If so, *kotъ ‘litter’ is not
a senior synonym of *četъ (with a hard-to-explain change of meaning), but more
likely a separate verbal noun back-formed from *kotiti sę (the consequence of
mating), on the analogy of formally similar denominal verbs: *agniti sę ‘yean’,
*teliti sę ‘calve’, *žerbiti sę ‘foal’.

The
feminine *četa can hardly be a collective (at any rate in the meaning ‘pair’).
Not only because it refers to just two things, but also because collectives in
*-ah₂ to o-stem masculines are an archaic formation in Indo-European (as
opposed to neuter collectives, co-opted as ordinary plurals of neuter nouns and adjectives), and *četъ is
unlikely to be sufficiently ancient. But Indo-European *-(a)h₂ was not only a
collective suffix and a marker of femininity; it was also employed to coin (formally feminine) abstracts, including action nouns. Quite a few deverbal masculines in
Slavic (and more generally in Balto-Slavic) have feminine synonyms like *čarъ
~ *čara ‘sorcery, enchantment’ or *-tokъ ~ *-toka ‘flow, course’, *-sěkъ ~ *-sěka ‘cutting’ (in compounds). Note the familiar
morphological formations represented by Greek tómos ‘slice’ (result of cutting) versus
tomḗ ‘cut’ (an instance of cutting) – a nice parallel to *četъ (resultative)
vs. *četa (an individual instance of pairing).

In the
first post of this series I suggested that the stem *kʷet-w(o)r- was originally
a deverbal neuter of a familiar type. Before I develop this idea, let me briefly
suggest one other possible trace of the root *kʷet-: the second member of the Latin
compound triquetrus ‘triangular’. The next post will be about it.

26 September 2014

Jakobson’s remark about a possible connection between Russian čët and četýre is discussed in Blažek (1999: 212-213) and especially in Greenberg (2001). Both authors mention earlier, more sketchy treatments of the problem, and they both add more Slavic material to the Russian words originally listed by Jakobson (which were čët, čëtka ‘even number’, četá ‘pair, union’, and čeť ‘quarter’). Blažek also notes an interesting potential cognate in Ossetian, an Indo-European language spoken in the north-central Caucasus (Ossetian is the only living descendant of the Northeast Iranian languages once spoken by the Scytho-Sarmatian inhabitants of the Eurasian steppe belt). The word in question is cæd ‘pair of oxen yoked together’, as if from Proto-Iranian *čatā (the Digor dialect of Ossetian has preserved a more conservative disyllabic form of the word, cædæ).

Blažek does not follow up Jakobson’s suggestion (presumably because he favours a different etymology of ‘four’, proposed by Schmid 1989; see pp. 213, 215, 331 in Blažek’s book). Greenberg, however, regards it as convincing and develops it further. Like Blažek, he considers the predominantly South Slavic *četa ‘troop, military unit’ (hence Serbo-Croatian Četnici ‘Chetniks’) to be part of the word-family of čët, and tries to explain the accentual difference between the end-stressed word četá (< *četa̍) in Russian and the root-stressed South Slavic forms – Bulgarian čéta, Serbian/Croatian čȅta, Slovene čẹ́ta (< *čèta) – in order to defend their common origin.

According to Greenberg, the word ‘four’ is derived from the root *kʷet- meaning ‘two’ extended with a multiplicative suffix, so that *kʷet-wor- means ‘(two) groups of two, twice two’. Greenberg also speculates that Proto-Indo-European *kʷotero- ‘which (of two)?’ (Greek póteros, English whether) contains the same root. This is hardly a good idea, since there is no compelling reason to question the straightforward standard analysis of *kʷo-tero- as the interrogative pronoun *kʷo- plus *-tero-, the IE suffix of binary contrast. The semantic gap between ‘two’ and ‘military unit’ is bridged by Greenberg as follows: Slavic *četa originated as the collective (in *-ah₂) of a word meaning ‘two, pair’, and ‘multitude of pairs’ evolved into ‘troop, group, band (of soldiers)’.

There are serious problems with this derivation. First, (East/West) Slavic *četъ means ‘even number’, not ‘two’ or ‘pair’, while, on the contrary, the supposedly collective četá can mean ‘pair’ in Russian (beside some related meanings: ne četá, accompanied by a dative, means ‘not on a par with, superior to…’). What appears to be its exact cognate in Ossetian means ‘pair of oxen’, not, say, ‘herd of cattle’. Furthermore, while it’s true that the semantics of Russian četá covers not only ‘pair’ but also ‘troop’ (the latter attested already in Old Russian), we are probably dealing with a lexical merger between a native East Slavic word and a borrowing from Church Slavic (Czech četa ‘platoon’ is likewise a South Slavic loan, as are, ultimately, a number of similar “wandering words” in various neighbouring languages – Romanian, Hungarian, Albanian, and even Turkish). The non-attestation of intermediate meanings like ‘double column (of soldiers)’ makes it hard to justify the derivation of ‘troop’ from ‘pair’. Since the semantic difference is combined with a formal difference (conflicting accentuation), the etymology simply falls apart. It seems reasonable to conclude that the contrast between *četa̍ and *čèta is old and distinguishes two words of different origin (notwithstanding their merger in Russian). [See this comment, however.]

Jakobson’s final hypothetical relative of ‘four’, čeť ‘fourth part (of land), quarter’ (Old Russian četь ~ četъka), is in all likelihood a popular truncation of četverť (~ četvertka) < Proto-Slavic *četvьrtь ‘quarter’ < *kʷetwr̥-ti-, a noun corresponding to the widespread ordinal *kʷetwr̥-to- ‘fourth’. It is of course related to ‘four’, but in a rather trivial manner.

Etymological dictionaries often attempt to connect četa (in either sense) with the Slavic verb *čьtǫ (inf. *čisti) ‘count, reckon, read’, derived from PIE *kʷeit- ‘notice, recognise’. This verb has produced numerous derivatives in Slavic (e.g. *čislo ‘number’); some of them may be accidentally similar to members of the čët group both in form and in meaning, e.g. Old Czech čet ‘count, quantity’ (Modern Czech počet, with a prefix). Note, however, the gen.sg. čtu ~ čta. The disappearing root vowel reflects Proto-Slavic *ь (a reduced vowel continuing earlier short *i in the weak form of the root, *kʷit-). Despite their deceptive similarity, Russian četá (or čët) and Czech čet have different etymologies.

If we remove all the false or dubious cognates, we are left with just the initial material: *četъ ‘even number’, *četьnъ ‘even (of numbers)’ and *četa ‘pair’ – a word-family securely attested in East and West Slavic. We can safely add the Ossetian word (isolated in Iranian, as far as I know, but a perfect match for *četa, semantically and formally). There’s no evidence that the original meaning of the morpheme *čet- was ‘two’; nevertheless, it seems to have had something to do with arranging things in couples. Typologically, the Slavic “odd/even” terminology is parallel to what we have seen in Greek and Sanskrit, even if different lexical roots are involved. If so, one could expect *čet- to be semantically close to the familiar Indo-European roots *h₂ar- ‘fit together’ and *jeug- ‘yoke, connect’. I shall therefore tentatively assume that *čet- continues a verb root like *kʷet-, with the approximate meaning of ‘combine into pairs’. Let’s see if we can work from here – next time.

With the
vast and reliable etymological material put into circulation by Vasmer, a
number of new questions naturally arises. I should like to dwell on some
particulars.

Roman Jakobson (1955) *)

The Slavs
played at “even and odd” too. In Polish the game used to be called cetno licho
(or cetno i licho). The noun licho is still used as a mild euphemism for ‘devil’.
Czego chcesz, do licha? means “What the heck do you want?” Polish also has the
adjective lichy ‘poor, inferior, in bad shape’. Historically, licho is a neuter form
of lichy, substantivised centuries ago, when the adjective had a wider range of
meaning, including ‘mean, evil’; licho
was therefore ‘something wicked’. The phrase cetno i licho survives on the fringes of
literary Polish (people are at best vaguely aware that it refers to some old game
of chance), but cetno no longer occurs on its own, and has no obvious relatives
in the modern Polish lexicon.

A few
hundred years ago (most examples come from 16th-century texts) cetno and licho could mean, respectively, ‘even number’ and ‘odd
number’. Though often contrasted with each other, they were not yet harnessed
together into a fixed phrase. Cetnem (instr.sg.) or w cetnie (loc.sg.) meant ‘(occurring)
in even numbers’; likewise lichem and w lichu ‘in odd numbers’. This
usage has been completely forgotten.

Licho and
lichy go back to Proto-Slavic *lixъ ‘strange, irregular, rogue’. In the modern Slavic languages it
usually has pejorative connotations (‘bad, lacking, defective, lonely’, etc.); it can also mean ‘excessive, superfluous’. The meaning of Russian lixój, however, ranges – somewhat schizophrenically – from ‘bad, sinister, hard’ to ‘daring, valiant’ (the common ancestor was ‘extraordinary’,
whether in a positive or a negative sense). Like semantically similar words in
other languages (Greek perittós, English odd), *lixъ developed the arithmetical meaning
of ‘odd’, which survives here and there in the Slavic branch. For example, in Czech
liché číslo means ‘odd number’. As for its origin, *lixъ < *leikʷ-so-, from the
widespread Proto-Indo-European root *leikʷ- ‘leave, abandon’.

So much for licho. Where does cetno come from? The Russian
term for “even and odd” is čët i néčet. Čët means ‘even number’ (=
čëtnoe čisló); néčet is its antonym. The adjective čëtnyj ‘even’ (of a number) is
closely related to Polish cetno. Russian č normally corresponds to Polish cz, but
some regional varieties of Polish have
merged the affricate cz /tʂ/ with c /ts/ for centuries, and the standard language has borrowed
a number of dialectal pronunciations of this kind.

On the combined
evidence of Polish and East Slavic forms we can reconstruct Proto-Slavic
*četъ (n.) and *četьnъ (adj.). Russian also has the noun četá ‘pair,
couple’, which is formally and semantically close to them. There are several
other Slavic words that might or might not be related to *četъ, but it’s wiser at this stage to
exclude more difficult material so as to avoid the risk of contaminating a reliable set of cognates with spurious ones.

Back in the
1950s, as successive volumes of Max Vasmer’s monumental Russisches etymologisches Wörterbuch were published in Heidelberg, the great linguist Roman Jakobson (then at Harvard University) read the entire
dictionary (I mean, actually read it like a novel, page by page), jotting down comments on entries that attracted his attention. Those marginalia were published as a journal
article (see the reference below) and reprinted in Jakobson’s Selected Writings
(Volume II: Word and Language). With regard to čët and its relatives, Jakobson remarked that they “seem to be archaic relics of the same word family as četýre” (the
Russian reflex of the Indo-European numeral ‘four’). Having devoted one sentence to the
matter, he moved on to the next entry that had caught his eye, čex ‘Czech’. The idea that čët and četýre are somehow related has been picked up by several other authors,
but hitherto published attempts to analyse *kʷetwor- in this light have the usual flaws of “root etymologies”: too little attention to
morphological details, and too much imaginative semantics. Nevertheless, I
think Jakobson’s idea is worth salvaging, so I’ll review those previous
attempts and try to see if I can do any better.