Project Overview

Project

Project members:

Period of award:

January 2003 - December 2005

Funder:

Economic and Social Research Council (ESRC) - RES-000-23-0082

The relationship between frequency of use and grammar in natural language is poorly understood. To understand it better, we examined textual frequency distributions in Russian, a language that encodes a reasonable number of grammatical distinctions in its word forms. We had previously developed, for other purposes, a precise, computationally verified hierarchical model of Russian morphology. In this project we took the next logical step: using this model to determine how far distinct categorizations within it correspond to differences in use in Russian texts. To do so, we focused on a construct that we had already investigated cross-linguistically: syncretism, or grammatical ambiguity, in which one form has multiple functions. The major new element of this project was a corpus-based investigation of the relationship between frequency of use and syncretism. There are different types of syncretism, which we reflected by locating them at different points in the hierarchy of our formal model. This made syncretism an ideal construct for investigating the more general and harder question of the relationship between textual frequency and grammar.