We linguists must remember that we use two different definitions of the word ''grapheme''. They have been nicely distinguished by Kohrt (1986); I have paraphrased them in one of my forthcoming works as (1) something that is entirely inside the graphonomy of a language, which Kohrt calls the ''analogical view'' of a grapheme, and (2) something inside the phonology of a language that is related in a certain way to its graphonomy, which Kohrt calls the ''referential view'' of a grapheme.

Sproat uses the term ''grapheme'' with its referential meaning. (I use it with its analogical meaning in my own work.) Doing so allows Sproat to use it for talking about whatever bits of speech are referred to by any given mark that he is discussing without having to analyze the mark itself--which is, after all, not what he is interested in doing. And by a ''writing system'' he means a method of writing down marks that refer to sounds, and does not mean the marks themselves, or their differences that distinguish them from one another. In the same spirit, he uses the term ''linguistic'' to refer to speech rather than to writing, as when he asks ''what linguistic elements do written symbols encode?'' There is nothing wrong with doing this, as long as the reader knows what the writer means. Sproat's work has been in text-to-speech synthesis, and he is a professor both of linguistics and of electrical & computer engineering.

SUMMARY

In chapter 1, ''Reading Devices'', Sproat says that his topic is how to computerize text-to-speech synthesis (TTS). He assumes that the input for his procedures exists in some electronic form and that its output is a digital representation of speech. (He gives references for those who may be interested in optical character recognition and speech synthesis as input and output for his own model.) He notes that to pronounce a word aloud it is often necessary to know a lot about the language that is not present in the word's written form. And he also notes that to do so, things may have to be accounted for that people are often unaware of, such as the fact that many writing systems do not have spaces between their words.

As an example, Sproat discusses a pair of Russian words that are spelled the same but differ in their stress placement and also (consequently for Russian) in their reduced vowels. He describes them and a Chinese word by both an Attribute-Value Matrix (AVM) and an annotation graph, both of which show the words' pronunciations and their orthographies. He then presents definitions and axioms in logical notation, and he comes to the central claims of his theory, which are that (1) the mapping between the Orthographically Relevant Level (ORL) of a language and the written characters in its words is regular, and that (2) for a given writing system and a given language, the ORL represents a consistent level of linguistic representation.
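The point about identically spelled Russian words can be made concrete with a small sketch. The homograph used here is a familiar one chosen only for illustration and is not necessarily the pair that Sproat discusses; the table and the function names are likewise hypothetical. The sketch shows only that the spelling alone leaves two stress patterns open, and that some lexical information (here a gloss) is needed to choose between them.

```python
# Illustrative assumption: a familiar Russian homograph, not taken from the book.
# "\u0301" is the combining acute accent, used here to mark the stressed vowel.
STRESSED_FORMS = {
    ("замок", "castle"): "за\u0301мок",  # stress on the first syllable
    ("замок", "lock"): "замо\u0301к",    # stress on the second syllable
}


def stressed_form(spelling, gloss):
    """Return the stress-marked form for a spelling plus lexical information."""
    return STRESSED_FORMS[(spelling, gloss)]


def candidates(spelling):
    """Return every stress-marked form that the bare spelling is compatible with."""
    return sorted(
        form for (s, _gloss), form in STRESSED_FORMS.items() if s == spelling
    )


if __name__ == "__main__":
    print(candidates("замок"))             # two candidates for one spelling
    print(stressed_form("замок", "lock"))  # the gloss resolves the choice
```

A real text-to-speech system would of course have to infer the lexical information from context rather than receive it as an argument.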

As further issues, Sproat discusses why a theory of writing systems should be constrained and why a study of writing systems should rely on a segmental analysis of spoken language. He then concludes the chapter with an outline of terminology and conventions that he will use, and he adds an appendix on finite-state automata and transducers.

In chapter 2, ''Regularity'', Sproat deals with the fact that, although spoken utterances exist in one temporal dimension, written utterances exist in two spatial dimensions. He refers briefly to ''ordinary (string-based) regular languages'', which can be written with only a sequence catenator because the sequence of written characters matches the one-dimensionality of speaking. (He does refer to the problem of distinguishing between apostrophes and commas by mentioning their heights, but he merely mentions the tone diacritics of languages such as Thai, Vietnamese, and Navajo in a footnote, p. 4.) Sproat then discusses ''planar regular languages'', in which the written characters must have their relationships described in two dimensions and therefore require a richer set of catenation operators. He introduces the notion of the Small Linguistic Unit (SLU), within which the sequence of characters does not have to conform to their ''macroscopic'' (line- and document-level) order, and he provides a systematic notation for expressing catenation, both of characters within an SLU and of SLUs macroscopically.

Sproat then gives examples of four writing systems for which the catenation of components within the SLUs needs to be stated: Korean Hankul, Devanagari, Pahawh Hmong, and Chinese. The Hankul characters for (historical) syllables are arranged within square spaces, and those characters are arranged there by a simple rule, according to whether they are horizontal or vertical in shape. In Devanagari, initial vowels are written by separate characters. For every other graphic syllable (defined as everything after one phonological vowel and including the next such vowel, and not necessarily forming a phonological syllable) the consonants of the syllable, if there are more than one, are combined into a ligature, and the vowel, if not schwa, is written by a character in a certain place in the orbit around the consonant character or ligature. No vowel character is written when the spoken vowel is schwa. For the Pahawh Hmong system (Smalley et al., 1990), the characters are divided into two groups: onset-consonant and vowel-cum-lexical-tone. For each written syllable, the vowel-cum-tone character is written before the onset character, although their corresponding sounds are pronounced in the opposite order. For Chinese, Sproat agrees that its complex characters can be analyzed into many levels, and he introduces additional notation for their internal structure that is based on the catenation operators he has already used. He notes that almost every Chinese character can be divided into a phonetic component, giving some information about its pronunciation, and a semantic component, giving clues to its meaning, and he also notes that each character has a ''determining component'' (its semantic one, unless its phonetic one belongs to a certain set of eight) that controls the placement of its other components. He also notes that Chinese characters that are more regular in their pronunciation are also more regular in their structure. Sproat mentions possible counterexamples from Ancient Egyptian, Spanish, and Mayan.
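The Pahawh Hmong ordering described above is simple enough to sketch in a few lines. This is only an illustration of the written-versus-spoken order within a syllable, not Sproat's notation: the function names are hypothetical, and the labels such as "V1" and "C1" are placeholders standing for a vowel-cum-tone character and an onset character, not real glyphs.

```python
def spoken_order(written_syllable):
    """Swap one written syllable into the order in which it is pronounced.

    written_syllable is a pair (vowel_tone, onset), i.e. the written order
    within the syllable; the result is (onset, vowel_tone).
    """
    vowel_tone, onset = written_syllable
    return (onset, vowel_tone)


def read_text(written_syllables):
    """Keep the order of syllables in the text, swapping only inside each one."""
    return [spoken_order(syllable) for syllable in written_syllables]


if __name__ == "__main__":
    # Placeholder labels (hypothetical): V* = vowel-cum-tone, C* = onset consonant.
    written = [("V1", "C1"), ("V2", "C2")]
    print(read_text(written))
```

The sketch makes the point of the SLU visible: the reordering is local to each syllable, while the sequence of syllables follows the ordinary direction of the text.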

Finally, Sproat discusses ''macroscopic catenation: text direction''. He assumes that an ordinary written text can be modeled by a ''virtual tape'' which is arranged on a page in rows or in columns, and which, when it reaches the edge of a page, is cut and continued next to the previous row or column. In English, this ''tape'' runs from left to right, in Hebrew it runs from right to left, and in Chinese it runs from top to bottom. Sproat also mentions boustrophedon writing and points out that shop signs may show variations on the basic way that a text is arranged in a language.
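The ''virtual tape'' idea can be sketched as follows. This is a minimal illustration under assumptions of my own, not Sproat's formalism: the function name `lay_out`, its parameters, and the direction labels are hypothetical, each character stands for one symbol on the tape, and only row-based layouts are handled (columns, as in top-to-bottom Chinese, are left out for brevity).

```python
def lay_out(tape, width, direction="ltr"):
    """Cut a one-dimensional tape into rows as they would appear on a page.

    Each returned string is in visual order, leftmost symbol first.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    # Cut the tape wherever it reaches the edge of the page.
    rows = [tape[i:i + width] for i in range(0, len(tape), width)]
    if direction == "ltr":
        return rows
    if direction == "rtl":
        # The tape starts at the right edge, so each row appears reversed
        # and a short final row sits against the right margin.
        return [row[::-1].rjust(width) for row in rows]
    if direction == "boustrophedon":
        # Alternate rows run in opposite directions.
        return [
            row if n % 2 == 0 else row[::-1].rjust(width)
            for n, row in enumerate(rows)
        ]
    raise ValueError("unknown direction: " + direction)


if __name__ == "__main__":
    for mode in ("ltr", "rtl", "boustrophedon"):
        print(mode, lay_out("ABCDEFG", 3, mode))
```

The tape itself never changes; only the rule for cutting and placing it does, which is why a single macroscopic parameter can cover several scripts.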

In chapter 3, ''ORL [Orthographically Relevant Level] Depth and Consistency'', Sproat considers the levels within the phonology that are represented by various writing systems and the consistency with which those levels are represented, and he mentions that the spelling of some words must be ''lexically marked'', i.e. specified without reliance on the phonology.

He begins with a case study of Russian and Belarusian, which he says form a near-minimal pair for this comparison, each showing great internal consistency, but with ORLs that differ in depth.

For English, Sproat acknowledges the existence of Chomsky & Halle 1968 (hereafter SPE), but he says that much of it merely shows ''personal taste about how writing systems should be designed'', and he mentions that Sampson 1985 has pointed out serious defects in SPE. Sproat remarks that ''the system of English spelling is a great deal more chaotic than that of [...] almost any other language that uses a script whose original design was purely phonological.'' However, he rejects the often-expressed idea that English has a logographic writing system because the evidence for it is so inconsistent, unlike the consistent logographic elements in the Chinese writing system. Sproat gives a 32-page appendix of words of the sort that are central for the arguments presented in SPE, giving for each of them a ''Deep ORL'' and a ''Shallow ORL'', and he mentions that the addition of a great number of other English words would probably make the argument for a deep ORL less convincing, except for the question of how to write reduced vowels.

Sproat also discusses the devoicing of dental obstruents in certain environments in Serbo-Croatian, and he presents experimental evidence that casts doubt on the standard treatment of this phenomenon, although saying that it needs further investigation. He discusses a possible example of cyclicity in Dutch, and concludes that his theory has no problem with it if the cyclicity is internal to the orthography. Finally, he discusses surface orthographic constraints, and notes that they can be handled by environmental rules or lexical marking within the orthography.

In chapter 4, ''Linguistic Elements'', Sproat asks about the range of linguistic (phonological) elements that can be represented by writing systems. He looks at the influential taxonomies of writing systems presented by Gelb (1963), Sampson (1985), and DeFrancis (1989). He dismisses Gelb as teleological and outmoded, he presents Sampson's and DeFrancis's tree-shaped taxonomies, and he lists DeFrancis's disagreements with Sampson. He then presents his own taxonomy of writing systems, for which he uses two dimensions, with the parameters ''amount of logography'' and ''type of phonography''.

In considering Chinese, Sproat concentrates on the semantic-phonetic compounds that are the vast majority of Chinese characters, and notes that for such characters ''the phonological information provided by the phonetic component is sometimes perfect (only a few), frequently only partial (by far the greatest number), and in some cases completely useless (only a few)''. He therefore says that ''it is much more useful to view [Chinese writing] as an imperfect phonographic system with additional logographic attributes, than it is to view it as a wholly logographic system''. And he shows how logographic elements are used when writing disyllabic Chinese morphemes.

For the Japanese writing system, Sproat describes the complications that arise from its many layers of borrowing from the Chinese writing system, resulting in characters most of which are logographic, because a Japanese reader must simply memorize the association between the sounds and the marks. But he also notes that the use of kanji (Chinese characters) has declined steadily during the twentieth century, as more and more people have become literate. Finally, Sproat mentions written characters in some languages that show plurality of meaning, reduplication of sounds, and the zero pronunciation of other characters.

In chapter 5, ''Psycholinguistic Evidence'', Sproat asks what support there is in the psycholinguistic literature for the ''psychological reality'' of the model he proposes. He notes that there is little consistency in that literature, and he does not suggest that the computational devices he proposes actually exist in people's heads. He asks rather how the macroscopic properties of his model compare with what that literature has found, especially with respect to two questions: (1) whether the relationship between orthography and ''linguistic form'' is the same for all writing systems, and (2) whether the ''Orthographical Depth Hypothesis'' (ODH)--which claims that languages with ''deep'' orthographies such as English require readers to read by going through the lexicon while languages with ''shallow'' orthographies such as Serbo-Croatian allow readers to go directly from the graphonomy to the phonology--is valid. (Sproat also notes that, although Serbo-Croatian is often adduced as such a language, it does not write lexical stress, and that Spanish would be a better example.)

Sproat claims that multiple routes from the written marks to the phonology exist for all written languages. He cites arguments for and against the ODH and finds that for both Chinese and Japanese there is evidence that readers use such routes. (He notes as a possibly more familiar example that literates in English know how to pronounce the letter string without having to think of the word they are pronouncing.)

Sproat then considers the ''connectionist'' models that assume large numbers of simple, but massively interconnected, units. He mentions Seidenberg and McClelland (1989) as a classic statement of that approach, he summarizes their model, and he also mentions more recent work. He points out a defect in their model, he says that there is little reason for it to supersede other models of reading, and he concludes that the overall architecture of his own model is at least not at odds with what we know about human orthographic processing.

In chapter 6, ''Further Issues'', Sproat discusses some complications that can arise when trying to turn a written text into an internal linguistic representation, and he mentions Manx Gaelic as an example of an orthography constructed so as to be similar to that of another language (in this case, English). He discusses how the 1995 reforms in Dutch have introduced morphological and semantic complications into its spelling. He discusses the complicated relationships that exist in many languages between numerical notations and spoken number names. He discusses the problems involved in pronouncing ''abbreviatory devices'' (a term that he uses because ''abbreviation'', ''acronym'', and such terms have been used with so many different meanings). And he discusses how various languages pronounce logograms such as , and and the fact that some terms such as ''NATO'' have pronunciations based on their abbreviations.

Sproat also mentions that many linguists such as Vachek of the Prague school and some British linguists (although he does not mention Halliday 1989) have wanted to treat written communication separately from and in parallel to spoken communication. Sproat again emphasizes the fact that written texts are arranged in two dimensions, while spoken texts are arranged in only one dimension, and he asks whether mathematical notation should be regarded as language. He concludes that it is a matter of definition whether written texts are to be considered as linguistic, and he points out that he is dealing in the present book only with mapping from written to spoken forms.

Finally, Sproat points out that all previous work has only touched on what he discusses, and that he hopes other researchers will carry on the study of this topic. He mentions that SPE was not underpinned by any theory of orthography; similarly, he mentions that a large number of workers in speech technology do not realize the grammatical and semantic elements that they must consider. And he repeats that, for most languages, ''letter strings [...] do encode pronunciation, but only in combination with other information that cannot be computed from the letter string alone''.

EVALUATION

Sproat's monograph is the first to formally and systematically explore one direction of the relationship between spoken and written language. It surely must be taken as the basis for any such work in the future. The notation that he provides is powerful, he goes thoroughly into certain aspects of the relationship that he models, and he mentions the places where he sees that more work must be done.

As befits the first monograph in a new field, Sproat gives copious references to those who have worked in fields related to his and to the sources which he has drawn on for data. He borrows occasionally from various versions of the Chomskyan tradition, he creates new notations for some of the relationships he discusses, and he frequently makes logical statements in algebraic notation; however, all of his terminology is readily intelligible to those who will acquaint themselves with it. As Gleason (1976) has pointed out, we have a professional metalanguage which is composed of bits and pieces from various theories but which we all recognize and use. Sproat has provided us some more very useful items for our metalanguage.

In the best of all possible worlds, for every language that uses more than one communication channel, we would describe on its own terms each communication channel that it uses, whether spoken, written, or signed, and we would also describe all of the relationships that exist in both directions between all of the communication channels that it uses. Unfortunately, we are not there yet. Spoken sounds have been well studied on their own terms, but the study of written marks and gestured signs on their own terms, and of the relationships among all of these communication channels, is just beginning.

ABOUT THE REVIEWER

Earl M. Herrick has a Ph.D. in linguistics and is emeritus professor at Texas A&M University-Kingsville. His 1966 M.A. thesis, ''A linguistic description of Roman alphabets'' (Hartford Studies in Linguistics 19), which he wrote under H. A. Gleason, Jr., remains the only analysis ever published of the features that distinguish the characters of those alphabets from one another, although its stratificational notation is now antiquated. He has since published a considerable number of papers on graphonomy and on stratificational theory in LACUS Forum and in Visible Language, and he has promised a manuscript on the graphonomy of English to Springer.