Measures of Similarity for Accents, Dialects,
Related and Unrelated Languages

The Various Degrees of Similarity/Difference Between the Language Varieties
Covered in the Study

More specifically, we intend (among other things) to try to measure, to
quantify, just how similar or how different various language varieties are
relative to each other. The Quechua and Aymara families together present a
continuum of degrees of similarity/difference between language varieties, from
only minimally different regional ‘accents’ of Quechua, to entirely different
languages which may not even be related to each other. This study will produce
quantified comparisons over various spans of this continuum – that is,
comparisons between pairs of language varieties showing all the various
possible degrees of difference. Of course this is in reality a gradual scale of
degree of difference, though in the familiar terminology one might nonetheless
talk of the following four distinct levels, i.e. comparisons between:

•‘Accents’, or regional variation within the same ‘dialect’: for example
between the Quechua spoken in the Cochabamba region and those of Potosí and
Uyuni, all forms of the Bolivian, or indeed the wider Cuzco-Collao, ‘dialect’.

•Varieties which, while markedly different, are generally still considered
‘dialects’ belonging to the ‘same language’: for example Ecuadoran, Ayacucho
and Bolivian Quechua.

•Different languages, though clearly genealogically (‘genetically’) related
ones, i.e. from the same family (Quechua or Aymara): for example the Quechua
of Cuzco and that of Huancayo.

•Quite different languages, where it is not yet clear whether they share an
ultimate common origin within a single family or not, or show similarities
only due to prolonged and deep contact: for example Aymara and Quechua.

Indeed on this last and highest level, we also aim to look
into what quantified comparative data might be able to tell us that might help
elucidate the thorny issue of the nature of the relationship between the two
families – common origin or just prolonged contact.

Two Types of Comparison

This study therefore involves two different types of
comparison:

1. Between varieties for which we are sure that they share a common origin,
i.e.:

(a) on the one hand, a comparison of all the varieties of Quechua amongst
themselves;

(b) and on the other hand, a separate comparison of all the varieties of
Aymara amongst themselves (i.e. including Jaqaru and Kawki).

2. Between varieties for which we are not sure that they share a common
origin, i.e. a comparison of any variety of Quechua against any variety of
Aymara.
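
As a rough illustration of how these two types of comparison translate into
pairs of varieties, the following Python sketch enumerates the within-family
pairs (type 1) and the cross-family pairs (type 2). The short lists of
varieties are illustrative subsets taken from the names mentioned on this page,
and the function names are hypothetical, chosen only for this example; it is
not the procedure actually used in the study.

```python
from itertools import combinations, product

# Illustrative subsets only (assumption): the study covers many more varieties.
quechua = ["Cuzco", "Ayacucho", "Huancayo", "Ecuador"]
aymara = ["Southern Aymara", "Jaqaru", "Kawki"]


def within_family_pairs(varieties):
    """Type 1: every unordered pair of varieties from one family."""
    return list(combinations(varieties, 2))


def cross_family_pairs(family_a, family_b):
    """Type 2: every pairing of a variety from one family with one from the other."""
    return list(product(family_a, family_b))


type_1a = within_family_pairs(quechua)        # Quechua against Quechua
type_1b = within_family_pairs(aymara)         # Aymara against Aymara
type_2 = cross_family_pairs(quechua, aymara)  # Quechua against Aymara

for label, pairs in [("1(a)", type_1a), ("1(b)", type_1b), ("2", type_2)]:
    print(label, len(pairs), "pairs")
```

With n varieties in a family there are n(n-1)/2 within-family pairs, so the
number of comparisons grows quickly as varieties are added.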

For the full list of fieldwork locations for which data have
already been collected, including photos of each area and of some of the
speakers of these languages, click here for my index page of fieldwork
locations.

The map and the ‘family tree’ below currently present only the details of the
Quechua family. Further details on the Aymara family (much smaller, at least
in terms of the surviving varieties for which we have evidence), including
their specification on the map and ‘family tree’, will be added to this page
in due course.

For Quechua some twenty varieties will be studied, from Ecuador, Peru, Bolivia
and Argentina. The particular selection of varieties has been made on the
basis of three criteria, i.e. in order to offer:

•coverage of all the various degrees of difference between varieties within
the Quechua family (accents, dialects, closely related languages) – see above;

•coverage of all the main varieties within all the main branches of the
‘family tree’ of the language – or rather, family of closely related languages
– that is Quechua. For more details, see the map and ‘family tree’ structure
table below, and a brief note on how different the varieties are from each
other.

•most intensive coverage of the areas considered most significant for a better
understanding of the history, origins and development of the Quechua family
(and its early contact with Aymara), that is in particular the areas whose
varieties of Quechua are in some senses ‘intermediate’ between the two
principal branches of the family: Pacaraos, Yauyos, etc.

Applying the same principles to Aymara,
the study will cover:

•at least three forms of southern (or ‘Altiplano’) Aymara, one for each of its
principal varieties;

•for central (or ‘Tupino’) Aymara: Jaqaru, and – to the extent that it is
still possible to obtain reliable data for this all but extinct variety –
Kawki.

For more information on Jaqaru and Kawki, particularly an in-depth look at the
question of their endangerment and, for Jaqaru, the chances of long-term
survival (Kawki is sadly already doomed), click to read the following article,
in Spanish, by Dante Oliva León: Jacaru y Cauqui, al Borde del Silencio.

We also aim, if possible, to collect data for the Bolivian
Andean language Uru‑Chipaya, apparently
unrelated to either Quechua or Aymara.

The tree below is based on the one in the book Lingüística Quechua, i.e.
Cerrón-Palomino (2003), which appears in turn to have been based on the first
two main works on the Quechua family tree, namely Torero (1964) and Parker
(1963). These two linguists came to very similar conclusions, apparently
arriving at them independently and at around the same time.

However, it should be noted that this is not the only view of the
relationships between Quechua dialects. The Ethnologue classification puts
Pacaraos Quechua in the QII, not the QI group, for instance. Indeed, in his
doctoral thesis, Landerman (1991) fairly convincingly calls into question even
the fundamental distinction between the two main branches of the family tree,
QI and QII. Once we have our own results from this comparative study, we hope
to be able to contribute significantly to the debate ourselves.

The varieties that it is proposed to include in the lexical and phonetic
comparisons are shown underlined. Where more than one sub-variety is to be
covered, this is indicated by the number in square brackets, e.g. [3].

For where these varieties are spoken, see the dialect map
above.

Those varieties for which reliable descriptive grammars
exist are shown in italics.
These are the ones I would propose to cover in the morphosyntactic comparisons.

Data are being collected in order to make a detailed comparison of these
varieties in three aspects:

•In their basic lexicon, based on a list of some 300 word-meanings: click here
to see a preliminary version of our full meaning list. We have deliberately
selected the meanings in our list so that it is as fully compatible as
possible with similar lists already well known and used in studies on various
language families around the world, including the 100- and 200‑word lists
first drawn up by Swadesh (1952), and the modified 200‑meaning version drawn
up by Dyen, Kruskal and Black (1992). We are also in the process of adapting
these as appropriate to the cultural and linguistic context of the Andes (and
to a certain extent also Amazonia). The meanings also include many which have
been identified by Lohr (1999) and Yakhontov (as reported in Starostin
(1991:59-60)) as those that appear to be generally resistant to being borrowed
from one language to another, as well as many other meanings known to be more
susceptible. Particular attention will be focused on the issue of possible
cases of word borrowing, and how this might be identified by specific
techniques, including statistical ones developed initially in genetics, for
processing and analysing the comparative data.

•In their phonetics, based on the pronunciation of a sample list of some 100
‘pan-Quechua’ cognates (and a different 100 ‘pan-Aymara’ ones) from among the
200 in the lexical comparison. For details on the method being used to produce
quantifications of phonetic similarity, and examples of the results it
produces for Romance varieties and a set of Indo-European languages, see
Heggarty (2000).

•In certain aspects of their basic inflectional morphosyntax, for both nouns
and verbs – which in these highly agglutinating languages principally means
their morphology. Details on the method being used to produce quantifications
of similarity in basic morphosyntax are contained in Paul Heggarty’s Ph.D.
thesis (click on these links for either a brief abstract or a fuller
description), which will be made available on this website in 2004, and later
published with major revisions in 2004.
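
To give a concrete idea of what a quantified lexical comparison can look like,
here is a minimal Python sketch of the classic lexicostatistical measure: the
proportion of meanings for which two varieties use cognate words. This is a
generic textbook measure, not the method of Heggarty (2000). The meaning labels
("m1", "m2", ...), the cognate-class codes ("A", "B") and the function name are
all invented placeholders, not real Quechua or Aymara data.

```python
def lexical_similarity(variety_a, variety_b):
    """Share of comparable meanings whose words fall in the same cognate class.

    Each argument maps a meaning to a cognate-class code, or to None where
    no word was recorded. Meanings missing in either variety are skipped.
    """
    comparable = [
        meaning
        for meaning in variety_a
        if variety_a[meaning] is not None and variety_b.get(meaning) is not None
    ]
    if not comparable:
        raise ValueError("no meanings attested in both varieties")
    matches = sum(1 for m in comparable if variety_a[m] == variety_b[m])
    return matches / len(comparable)


# Invented toy data (assumption): codes mark which words count as cognate.
variety_x = {"m1": "A", "m2": "A", "m3": "B", "m4": None}
variety_y = {"m1": "A", "m2": "B", "m3": "B", "m4": "A"}

similarity_xy = lexical_similarity(variety_x, variety_y)
print(f"{similarity_xy:.3f}")
```

A measure of this kind treats cognacy as all-or-nothing for each meaning and
does not by itself distinguish inherited cognates from loanwords, which is one
reason the question of borrowing receives particular attention in this study.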

The comparisons in basic lexicon and phonetics will be made
for all the varieties in the study, using data collected on fieldwork trips to
villages where each of the varieties is spoken, and reference to the main
existing dictionaries and phonological descriptions for the varieties (where
available).

Our definitive data
and final results of our comparisons and quantifications will be posted on this
website around September 2004.

Once the data have been collected, during early 2004 we will be producing
quantifications of similarity between the language varieties covered in each
of the fields mentioned here, using the techniques set out in Heggarty (2000).
We shall then ‘process’ these data using various ‘family tree‑drawing’
algorithms initially devised for similar uses in biology, particularly
genetics, such as Network by Bandelt et al. (1995) and Phylip by Felsenstein
(2001). For more details on how we make use of these techniques and what they
can bring to analysis of linguistic data in problematic cases such as those of
the Andes, a full list of our research group’s articles, oral papers and their
abstracts can be found by clicking here.
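
As an illustration of how similarity figures might be handed on to such a
program, the Python sketch below turns pairwise similarities (on a 0 to 1
scale) into distances and writes them out in the square distance-matrix layout
read by Phylip's distance-based programs: a first line giving the number of
taxa, then one row per taxon beginning with a name padded to ten characters.
The conversion "distance = 1 - similarity", the variety names, the similarity
values and the function name are assumptions made for this example, not the
study's own procedure or results.

```python
def to_phylip_distance_matrix(names, similarity):
    """Return a square distance matrix as text in PHYLIP layout.

    `similarity` maps frozenset({name_1, name_2}) to a value between 0 and 1.
    Distances are taken as 1 - similarity (an assumption for this sketch).
    """
    lines = [f"{len(names):5d}"]
    for row_name in names:
        # PHYLIP expects a taxon name field ten characters wide.
        label = row_name[:10].ljust(10)
        cells = []
        for col_name in names:
            if row_name == col_name:
                distance = 0.0
            else:
                distance = 1.0 - similarity[frozenset((row_name, col_name))]
            cells.append(f"{distance:.4f}")
        lines.append(label + " " + " ".join(cells))
    return "\n".join(lines) + "\n"


# Invented placeholder values (assumption), not results from the study.
names = ["VarietyA", "VarietyB", "VarietyC"]
similarity = {
    frozenset(("VarietyA", "VarietyB")): 0.9,
    frozenset(("VarietyA", "VarietyC")): 0.6,
    frozenset(("VarietyB", "VarietyC")): 0.7,
}

matrix_text = to_phylip_distance_matrix(names, similarity)
print(matrix_text)
```

Names longer than ten characters are truncated here, so in practice they would
need to remain unique after truncation.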

In early March 2004 I returned from my main period of fieldwork in the Andes
to collect all my phonetic and lexical data. During March and April 2004 I
will be processing and analysing all those data, to see what can be learned
from them. I hope they will be able to contribute to the debate on these
significant questions in Andean comparative and historical linguistics:

•whether the Quechua and Aymara families are or are not ultimately related
language families

•the classification of Quechua dialects, and Aymara dialects

•the most plausible range of dates for the initial separation of the Quechua
language family, and of the Aymara language family

•the
most plausible location of the original Quechua ‘homeland’

During April and May 2004 I will then write up my results and conclusions in a
major article due to be published by September 2004. I will also be making
available as much of my data as possible on this website, as and when it is
processed and I have time to present it appropriately on webpages. This too
should be completed in time for publication of my article in September 2004.

Eventually further sections will be added to this webpage
including a discussion of previous estimates and quantifications of the degree
of diversity within the Quechua and Aymara families, and of the proposed
pan-Andean orthography which will be used as the common reference orthography
for the data to be posted on this site.

On these pages the principle is followed that the name of each language is
written in the form most appropriate to the orthography of the language of the
text in which it is mentioned. That is, on the Spanish version of this page
the spellings used are quechua, aimara, jacaru and cauqui, even though in
their respective languages the spelling proper to that language gives:
qhichwa, aymara, jaqaru, kawki. The somewhat anarchic orthography of English,
meanwhile, generally accepts spellings as in the original language, unless an
accepted form already exists, hence the spellings used here are: Quechua,
Aymara, Jaqaru, Kawki.

The family termed Aymara by Rodolfo Cerrón-Palomino (or in his Spanish
spelling aimara) is also known as Jaqi or Aru by other linguists. This family
includes not only the language best known by the name of Aymara or Aymará
(i.e. for Cerrón‑Palomino more specifically southern or Altiplano Aymara), but
also central or Tupino Aymara, that is, the language varieties Jaqaru and
Kawki, spoken in a few mountain villages in the district of Tupe, province of
Yauyos, in the Lima department, Peru. The terminology followed on these pages
is that of Cerrón‑Palomino.

Swadesh, Morris (1952). Lexico-statistical dating of prehistoric ethnic
contacts: With special reference to North American Indians and Eskimos. In:
Proceedings of the American Philosophical Society 96: 452-463.