Semantic Typology and Little Studied Languages

...the true difference between languages is not in what may
or may not be expressed but in what must or must not be conveyed by the
speakers.

Roman Jakobson, 1959

Plan:

Prelude: Quantifiers in the New Yorker?!

What is linguistic typology?

Little-studied languages and typology?

Semantic typology?

Exhibits from Some Languages

Haisla: Number and Deixis
Kiowa Number
Pirahã: typology of linguists
English: gaps in language and culture

Conclusions

Prelude: Quantifiers in the New Yorker!?

It is not often that issues of syntax and semantics make the headlines. A
notable exception in the recent past was provided by the flurry of excitement
about Dan Everett and his claims about the Amazonian language Pirahã.
Probably the most prominent of these journalistic reports was in the
April 16 2007
edition of The New Yorker. A bit of Googling will turn up a whole spate
of reactions and spin-offs from this and related stories.

The sometimes heated interchanges that followed Everett's article,
starting with the commentaries by a number of researchers published with the
paper in Current Anthropology and Everett's response (see References
below for Everett 2005 and Nevins et al. 2007), show that questions about
Universal Grammar, the uniqueness of languages,
Humboldt-Cassirer-Sapir-Whorfian ruminations, and the like find a vivid
resonance among specialists and laypeople alike.

In this presentation, I will take up a number of questions about semantics
across languages, semantic typology, and semantic universals. First, however,
a few general remarks.

Everett makes a number of claims about Pirahã. Among them, these, which
I will use to guide my discussion here (in part):

No quantification.

No counting

No number

No perfect tenses

No recursion

Cultural roots for the above and other characteristics

What is linguistic typology?

Languages do not differ randomly across the space of possible distinctions.
There is a long history of attempts to correlate various properties of
languages with each other. In modern times, the work of Joseph Greenberg has
been of fundamental importance. An older typology was based on morphological
characteristics of languages, using such terms as "analytic, synthetic -
polysynthetic," and "isolating, fusional, inflectional, agglutinative." Sapir
devoted major parts of his popular book Language (1921) to questions of
linguistic form and content along the lines of this kind of typology.

Sapir's schema is based on two kinds of criteria: the formal difference
between independent roots and three kinds of operations on them, with varying
degrees of fusion or internal modification, and the semantic differences among
root, pure relational, and mixed relational elements of content or meaning.
We will return to a classification inspired by Sapir's discussion below. But
let it be noted here that Sapir recognized that languages displayed these
characteristics in varying degrees so that the schema was more like a set of
ideal types than a ready-made and absolute set of categories. We will note
some of this below as well.

Sapir's typology revisited.

In Language (p. 138 in Harvest book edition), Sapir proposes this
general scheme:

I. Pure-relational Languages

A. Simple
B. Complex

II. Mixed-relational Languages

C. Simple
D. Complex

We can take "relational" here as referring to purely "abstract" grammatical
functions.

Sapir's schema, which is elaborated quite a bit beyond this global summary, is
based on formal as well as semantic criteria. The formal criteria turn on the
distinction between more or less free-standing items (word, stem, root) on the
one side and processes of affixation and internal modification on the other.

What can little-studied languages tell us about typology?

Before going into the special topics of my talk, I would like to make a
few remarks about the general enterprise of working on local, often endangered
languages in the context of investigating language typology, or indeed any
other topic of general theoretical interest.

Does the investigation of such languages have a special role or
importance for typology? Many would answer Yes. But why? Here are some
things to think about.

Just for extending the empirical base, any language will do. Its
importance will be a function of the given mass of knowledge that we have
about languages, and of the extent to which the little-studied language
exhibits new, hitherto unknown properties. So being endangered plays no special
role here, except in the sense that the language may be lost without making its
possibly unique contribution to the fund of knowledge we need for making
hypotheses about Language.

Sapir's advice

There is hardly a classificatory peculiarity which does not receive a
wealth of illumination from American Indian languages. It is safe to
say that no sound general treatment of language is possible without
constant recourse to these materials.

Edward Sapir, Collected Works, V: 145 (from an Encyclopaedia Britannica
article [14th edition, 1929, Vol. 5, 138-141])

Excursus: reasons for studying and documenting endangered languages are not
just scientific.

Universals and universals

Semantic typology?

There seem to be two common attitudes toward questions about semantics for
different languages:

The basic semantic building blocks for different languages are obviously
the same

The basic semantic building blocks for different languages are obviously
different

One of these views seems to be based on an impulse to attribute a universal
conceptual space to humans and the belief that semantics must connect up to
this universal conceptatorium.

The other view is fed by Whorfian and semi-Whorfian beliefs about the special
space of meanings that go with different languages and cultures. It is also
quite in line with the judgments of multilingual speakers that various words
"just don't translate" from one language to another.

In several recent papers I have tried to address the seeming contradiction
between these two stances by an appeal to a difference between
structure and texture, which is analogous to the differences we
attribute to those between grammar and style (Bach [2003], 2004).

Obviously, discussions like this cannot be carried on sensibly if they don't
take into account the widest variety of languages. And in the last decade the
number of encounters between "little-studied" languages and model-theoretic
semantics has been increasing.

Therefore, it would seem to be analytic that studying little-studied languages
cannot help but contribute to linguistic typology, indeed, to linguistic theory,
period.

Semantics and semantics

Basic aspects of model-theoretic semantics: one aspect of meaning is provided
by setting up a model structure to be understood as the space of denotations
that are associated with expressions of the language. Take these English
sentences for a start:

(1) Sam ate an orange.

(2) Horses are mammals.

(3) I am hungry.

To interpret sentences like these we need something like these ingredients, at
least: a set of individuals for expressions like Sam, and subsets of the
set of individuals such as the sets of oranges, horses, mammals, and things
that are hungry. We need something like worlds with respect to which we
evaluate the truth of sentences. Importantly, for examples like (3) and for the
temporal side of these sentences, we also need access to the context of
evaluation, so that we can understand I as the speaker and evaluate the truth
of (3) with the speaker as whoever says the sentence and the time as whatever
time the evaluating context supplies. So if I say (3) right now here, then the
sentence denotes the True in this world and at this time just in case Emmon is
hungry at time 9:35 (say) in Paris (say). There is much more to be said about
these sentences, but I won't say it yet.
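
The ingredients just listed can be made concrete in a toy model. The following Python sketch is only an illustration under invented assumptions: the individuals, the single world, the facts about who is hungry when, and the names Context, sam_ate_an_orange, horses_are_mammals and i_am_hungry are all hypothetical, and tense is ignored apart from the time coordinate supplied by the context.

```python
from dataclasses import dataclass

# Toy set of individuals (all names invented for the illustration).
E = {"sam", "emmon", "orange1", "horse1", "horse2"}

# One toy world: each predicate is paired with its extension, a subset of E
# (a set of pairs for "ate"). "hungry" is additionally indexed by time.
WORLD = {
    "orange": {"orange1"},
    "horse": {"horse1", "horse2"},
    "mammal": {"horse1", "horse2", "sam", "emmon"},
    "ate": {("sam", "orange1")},
    "hungry": {"9:35": {"emmon"}, "13:00": set()},
}


@dataclass(frozen=True)
class Context:
    """Context of evaluation: who is speaking, when, and where."""
    speaker: str
    time: str
    place: str


def sam_ate_an_orange(world):
    # True iff some orange stands in the "ate" relation to Sam.
    return any(("sam", x) in world["ate"] for x in world["orange"])


def horses_are_mammals(world):
    # Simplified to set inclusion; the generic reading is not modeled.
    return world["horse"] <= world["mammal"]


def i_am_hungry(world, context):
    # "I" denotes the speaker of the context; the time also comes from it.
    return context.speaker in world["hungry"][context.time]


morning = Context(speaker="emmon", time="9:35", place="Paris")
afternoon = Context(speaker="emmon", time="13:00", place="Paris")
```

The first two functions need only the world, while the third cannot be evaluated without a context: the same sentence comes out true in the morning context and false in the afternoon one.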

Model structures

One part of studying natural language semantics is to set up a model
structure which provides the basic building blocks for an interpretation.
Setting aside the context box for now, we might start with this more or less
standard setup, call it M1 :

M1 =

i. BOOL = {1, 0}: truth values (the True, the False)

ii. E: set of individuals

iii. S: set of worlds or situations

iv. F: set of all functions built out of i - iii

In addition to the sets just enumerated for M1, the model
structure needs to specify some inherent relations among some of the elements,
for example, relations of accessibility, inclusion and precedence among
situations in S. I want to take these up separately below.
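
To see what "all functions built out of" the basic sets amounts to, here is a minimal Python sketch over a deliberately tiny model. The two individuals, the two situations, and the helper name functions are assumptions made for the illustration; only the first layers of the function space are enumerated, and the relations among situations are left out.

```python
from itertools import product

BOOL = (1, 0)        # the True, the False
E = ("a", "b")       # toy set of individuals (assumed)
S = ("s1", "s2")     # toy set of worlds or situations (assumed)


def functions(domain, codomain):
    """Return every total function from domain to codomain, each as a dict."""
    domain = tuple(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]


# One-place predicates: functions from individuals to truth values.
predicates = functions(E, BOOL)

# Propositions: functions from situations to truth values.
propositions = functions(S, BOOL)

# Properties: functions from situations to one-place predicates.
properties = [dict(zip(S, choice))
              for choice in product(predicates, repeat=len(S))]
```

Even with two individuals and two situations there are already 4 predicates, 4 propositions, and 16 properties, which gives some feel for how quickly the space of possible denotations grows.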

Note: the model structure of Montague's most well-known paper on natural
language (1973: PTQ) differs from M1 in that our S
here is split into two: a set I of worlds and a set of times J
with an antisymmetric ordering ≤ on the set of times.

This much of a model structure is surely common to the semantics of any
language, since it is little more than a schema for spelling out possible
denotations.

Time

Semantics and Grammar

Semantics and Lexicon

Three languages plus a bit of some others

Haisla: Number

About Haisla: a Northern Wakashan language (along with Heiltsuk, Ooweky'ala
and a number of languages lumped together as Kwakw'ala -- Franz Boas's
Kwakiutl). Haisla is spoken in Kitamaat Village in northern British Columbia
in the far west of Canada.

Some Haisla examples (e = ə):

begʷánem

person, people

bíbegʷanem

people

t̓íxʷa

black bear(s)

t̓ít̓exʷa

black bears

ketá

shoot

kiketá

shoot "plural"

ketátlnugʷa

I am going to shoot.

ketátlnis / kiketátlnis

we (inclusive) are going to shoot

kiketátlnugʷa.

I am going to shoot repeatedly / several times

kiketátlnugʷaʼi.

I am going to shoot them.

kiketá begʷánemax̄i t̓íxʷix̄i.

The man/men shot the bear/bears (repeatedly).

Note: there are a number of different reduplicative and ablaut forms, a number
of which are used for marking plurality etc.

As in many North American languages (and in fact probably languages around the
world), plurality is an optional category in Haisla and other Wakashan
languages. As one might expect, not every word has a definite plural form, and
the common hierarchy applies: nouns near the high end of an animacy scale tend
to have plurals; near the low end, they don't.

Here's how I would model a system like this: a plain noun like
t̓íxʷa denotes sets drawn from *[[t̓íxʷa]],
the big domain covering atoms and sets all the way up.
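
A minimal Python sketch of this idea follows. The three bear individuals and the helper name star are invented for the illustration, and the treatment of the reduplicated form as covering only sets of two or more is an assumption based on the gloss 'black bears', not a claim about Haisla.

```python
from itertools import combinations


def star(atoms):
    """All nonempty subsets of atoms: the singletons and every larger set."""
    atoms = sorted(atoms)
    return {frozenset(group)
            for size in range(1, len(atoms) + 1)
            for group in combinations(atoms, size)}


bears = {"bear1", "bear2", "bear3"}   # invented individuals

# Plain form t̓íxʷa 'black bear(s)': the whole big domain.
plain = star(bears)

# Reduplicated form t̓ít̓exʷa 'black bears': assumed here to be
# restricted to sets with at least two members.
plural = {group for group in plain if len(group) >= 2}
```

On this picture the plain form is simply uninformative about number: everything the plural form can denote, the plain form can denote too.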

You can tell a straightforward Gricean story about the interpretations of plain
and plural forms in such a language, as opposed to a language in which
plurality is enforced.

Obligatory choice: The interlocutor had to make a choice so I will interpret
the choice as telling me something important!

Optional choice: There may be no particular reason for making or not making
this choice, so I can't really conclude anything. If it's important, context
will probably tell me.

Number enters into Haisla and a number of neighboring languages in a different
way at the level of lexical choice. Some predicates are specialized as to
shape or other characteristics of the subject or object, among them number.
So for example Haisla hená means `to be located (somewhere): of a long
cylindrical object.' Compare Coast Tsimshian baa `run (singular)' and
k̓oł `run (plural).'

Kiowa Number

The Tanoan languages offer a fascinating system of number in the nominal and
pronominal systems. You can find a short introduction in Mithun, 1999: 81-82
(data from Jemez). Watkins (1984) gives an extended description of Kiowa (not to
be confused with Kiowa Apache, an Athapaskan language). I will cite Kiowa
here from Laurel Watkins' work (Watkins 1984, see also now Harbour 2003, 2007).

The most notable feature of the system is inverse number. Nouns are
divided into four classes. There are three number categories: singular, dual,
plural. According to its class membership, each noun has basic or inherent
number or numbers.

Class I: basic singular/dual; inverse plural (primarily animate)

Class II: basic dual/plural; inverse singular

Class III: basic dual; inverse singular/plural

Class IV: nouns in this class do not use the inverse suffix

(I omit discussion of the Class IV nouns.)
As you can see, the inverse picks the complement meaning with respect to
plurality. Disambiguation of the choices (e.g. dual/plural) comes about by
combinatorics with number marking in pronominal affixes on verbs and other
elements.

Examples:

tógúl `young man, two young men'

tógúlgɔ̀ `(more than two) young men'

But in combination with an Intransitive Prefix èͅ- on a
predicate, the first word must be interpreted as dual. This prefix marks
intransitives as predicates over pairs of entities.

gú: ribs (dual/plural)

gú:gò rib

In combination with an Intransitive Prefix èͅ-, the first
word must be interpreted as dual, the second as singular, while to get the
plural the intransitive prefix gyà- is required. You are
invited to consult the sources for more details of this complex system.

Now a question: is there any way to assign a uniform semantic value to the
inverse suffix?

Here's a try. Let's adopt the kind of structured domain proposed by Link and
others. For any common noun we have the set
of all groups formed from the atoms or basic atoms of the domain. For
languages like the Tanoan languages, for any noun N we have D(N) = the union
of the singletons (or atoms) I(N), the pairs II(N), and the pluralities
III+(N). The denotation of an uninflected Class I noun is just the union of
the singletons and the pairs, for Class II just the union of the pairs and the
plurals, for Class III the pairs. Now the inverse can be interpreted as an
operation that takes the denotation of the bare noun and delivers the
complement of that denotation within the whole domain of the common noun.
This treatment requires that we have available a denotation that is not
directly associated with any of the various forms of the noun itself. The
effect of the various inflections on other elements that disambiguate the
expressions then can be achieved by intersecting the denotation of the nominal
expression with cardinality sets, as in the example above.
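
The proposal can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an analysis of Kiowa: the four atoms are invented, the names domain, by_size, bare and inverse are hypothetical helpers, and the dual prefix is modeled simply as the set of all two-membered groups.

```python
from itertools import combinations


def domain(atoms):
    """D(N): all nonempty groups formed from the atoms of a noun N."""
    atoms = sorted(atoms)
    return {frozenset(group)
            for size in range(1, len(atoms) + 1)
            for group in combinations(atoms, size)}


def by_size(groups, test):
    """Keep only the groups whose cardinality passes the test."""
    return {g for g in groups if test(len(g))}


def bare(groups, noun_class):
    """Denotation of the uninflected noun, by class (Classes I-III only)."""
    one = by_size(groups, lambda n: n == 1)    # I(N): singletons
    two = by_size(groups, lambda n: n == 2)    # II(N): pairs
    many = by_size(groups, lambda n: n >= 3)   # III+(N): pluralities
    return {"I": one | two, "II": two | many, "III": two}[noun_class]


def inverse(groups, denotation):
    """Inverse suffix: the complement of the bare denotation within D(N)."""
    return groups - denotation


D = domain({"m1", "m2", "m3", "m4"})        # invented atoms
young_men = bare(D, "I")                    # cf. tógúl
young_men_inv = inverse(D, young_men)       # cf. tógúlgɔ̀

# Cardinality set assumed for the dual intransitive prefix.
dual = by_size(D, lambda n: n == 2)
young_men_dual = young_men & dual           # bare Class I noun + dual prefix
```

Note that inverse needs the full domain D as an argument, which is the point made above: the operation depends on a denotation that none of the noun's own forms expresses directly.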

Pirahã disputes

I focus here not on Pirahã itself, but rather on the controversies that
have attended the publication of Daniel Everett's paper. Thus this section may
be thought of as a contribution to the sociology or anthropology of
linguistics.

English: Gaps in Language and Culture

English of course is not a little studied language, but recent research has
shown that it has a number of astonishing gaps. We focus here on kinship
terminology. Most speakers of English are limited to the following terms:

Gaps: there is no primary way to distinguish sexes in items like
cousin nor to distinguish paternal and maternal versions of such
items as grandfather, nor is there any standard way to
distinguish first, second, ... wives/husbands/spouses/partners.

Some older speakers can use further terminology, such as first cousin
once-removed but there is much confusion about these items among
most speakers.

It has been conjectured that the severe gaps in kinship terminology noted here
result from cultural constraints such as this one:

Don't remember more than one former spouse/partner!

It is possible, of course, to use paraphrases to fill some of these gaps:

My mother's father... my maternal grandfather

But most speakers reject attempts to push such expressions beyond a limit of
2:

??John's mother's father's spouse's daughter's son...

Repeated attempts to teach adults the more elaborate older system have failed.
In fact, attempts to teach children have failed as well, as they quickly tire
of the task and turn to listening to their inevitable MP3 players. We do not
believe that this situation represents a cognitive deficit; it is no doubt
purely cultural in genesis.

Another astonishing gap in English occurs in its impoverished deictic system.
Phrases like his pencil are completely unspecified as to whether
the person referred to by his as well as the possessum in
question is near the speaker or the hearer or in some other place, real and
visible or invisible and perhaps non-existent, or whether either person or
pencil was here recently and is now gone (visible or invisible) (cf. Boas,
1947, and Bach 2006).

References

Boas, Franz. 1947. Kwakiutl Grammar with a Glossary of the Suffixes.
Edited by Helene Boas Yampolsky with the Collaboration of Zellig S. Harris.
Transactions of the American Philosophical Society. N.S. Vol 37,
Part 3, pp. 202[?]-377. Reprinted by AMS Press, New York.