Fry's blurb written for Routledge reads: "Fry demonstrates that Japanese conversation obeys certain principles of argument ellipsis that appear to be language universal: namely, the tendency to omit transitive and human subjects and the tendency to express at most one argument per clause. He identifies a set of syntactic and semantic factors that correlate significantly with the ellipsis of grammatical particles following a noun phrase. These factors include the grammatical construction type (question, idiom), length of the noun phrase (NP), utterance length, proximity of the NP to the predicate, and the animacy and definiteness of the NP. The animacy and definiteness constraints are of particular interest because these too seem to reflect language-universal principles."

EVALUATION

On John Fry's website (johnfry.org), he says "[a] book must exhibit not only outstanding scholarship, but also be a pleasure to read... the second criterion seems to be harder to meet." In my opinion, Ellipsis and wa-marking in Japanese Conversation meets both criteria with flying colors, and has certainly earned its place in Routledge's Outstanding Dissertations in Linguistics series.

I found only one typographical error in the English text, although there were a few more in the Japanese text (nanika "something" on pp 32, 52, 58, and 59 should be nanka; burukkrin "Brooklyn" on p. 53 should be burukkurin). As you can see from the six-page introduction, F has done an excellent job paring away the literature review and other dissertatorial fat. He kept the meat accessible to as wide an audience as possible, providing introductory material on natural language in Ch. 2, information on corpus linguistics research for specialists in Ch. 4, and information on the Japanese language in the appendices. This allows readers to skip sections they are not interested in while ensuring that no reader is left behind.

As indicated in the title, F uses natural language data to investigate ellipsis and wa-marking. F distinguishes systematic data, such as corpus or elicited data, from nonsystematic data, such as introspective or anecdotal data. F says that elicited, introspective, and anecdotal data may be useful for identifying linguistic phenomena and formulating hypotheses, but corpus data is unique in its ability to objectively and quantitatively measure linguistic phenomena (p. 12).

The merits of corpus data are indisputable in the abstract, but any data is only as good as the people who collected it. In Ch. 3, we see some specific problems with the CHJ, such as the paucity of the demographic information that was collected from participants. Far too often, basic information such as age, education, and the dialect spoken by participants is missing. In some cases, native speakers employed by the University of Pennsylvania to transcribe the CHJ made "judgments", as F politely puts it, as to the dialect spoken by a participant. I believe "guess" would be a more accurate description here. Obviously, no single Japanese native speaker can be a native speaker of all Japanese dialects, and it seems that none of the CHJ transcribers were native speakers of any of the Kyushu dialects, nor were any of F's assistants. It seems F's Figure 3.3 (p. 30) may have accidentally inflated the Kansai numbers at the expense of the Kyushu numbers, and Table A.3 (p. 181) may include some dialect particles, such as male wa and kashira, which F counts as standard Japanese particles, inflating those numbers. However, for the most part F handled dialect as best he could, carefully citing sources and in some cases excluding dialect data. It is probably safe to say that these items had little effect on F's discussion of ellipsis and wa-marking.

In Ch. 4, F discusses how he annotated the CHJ corpus to allow automatic processing. F advises "[r]eaders whose eyes glaze over at the fine[r] details of corpus annotation" to skim through Ch. 4. If you do choose to skim, be sure not to miss 4.5 Acoustic annotations (pp 67-76). I have long been intimidated by audio data, but F has cured me of that. F gives an excellent description of how to use the computer to measure audio files and how to use those measurements in quantitative research. If you have not worked with audio files before, this is a very painless introduction to the subject that will benefit you greatly.

F's claim that ellipsis in Japanese appears to follow language universals has potentially exciting ramifications for those working with gender language, an area where many feel that Japanese differs markedly from universals. F notes in 5.4.2 Sex and dialect (pp 101-104) that the CHJ data does not support any "categorical generalization about the effect of speaker sex on ellipsis in Japanese". F is very polite, but, reading between the lines, I believe that he is refuting Shibamoto's (S) claim that "male speakers are found to retain particles with much greater frequency than female speakers" (Shibamoto, 1985, as quoted by F). F suggests that S's use of elicited data and her small sample size may have been problematic.

S was the first person to write on ellipsis in Japanese women's speech in English, and is still widely cited on that subject today (Tanaka, 2004: 96). Naturally, she has been quite influential, and her work is seen as "statistical" (Tsujimura, 1996: 377-9). Nonetheless, I strongly agree with F's comments above, and I feel that as more researchers use corpus data for quantitative research into Japanese women's language, we will find that many of the features of women's Japanese described by S are perhaps not as marked as she suggests. However, because gender was not F's main theme, his description of gender items, such as the counts of female particles in Table A.3, must sometimes be taken with a grain of salt. Fortunately, Table A.3 is provided as background information in the appendix, and does not detract from F's primary conclusions.

I heartily recommend this book. F writes well and does a great job making what could be a dry read engaging. His is an excellent model to follow for anyone interested in working with corpus data. I am sure you will find this book especially useful if you want to work with audio corpus data. F is soon to publish an annotated CHJ corpus, which will make the CHJ data even more accessible. In addition to his findings on wa-marking and ellipsis, F uncovered what I felt were some very exciting finds in Japanese gender language, and has shown both the potential and some of the difficulties of using CHJ data for dialect research. As with all good research, he carefully limited his scope to ellipsis and wa-marking, and has left the books on gender and dialect for someone else to write.

Robert C. Albon graduated from the University of Wisconsin-Madison in 1995. He has been a freelance translator since 1992 (Japanese to English, Chinese to English, French to English, Creole to English) and was an official Japanese interpreter at the 2002 Salt Lake Olympics. He currently lives in Zama City, Japan. His research interests include informal language and dialects of Japanese, Chinese and French. Homepage: www.albon.us.