SUMMARY
This book consists of a collection of essays reporting the results of a wide-ranging corpus linguistic project at the Berlin Brandenburg Academy of Sciences, Germany, which concentrates on various aspects of German multi-word units, primarily verb phrase idioms and support verb constructions. The project is adjacent to another project, which deals with compiling a large corpus of 20th century German language that has provided the primary basis for the empirical studies in the volume.

The volume begins with Christiane Fellbaum's introduction, in which she explains the main aims of the project and situates it in the research tradition. The merits of large-scale corpora in studies on idiomatic language are well-acknowledged (see e.g. Moon 1998, Stubbs 2002), and this book continues the trend by showing how multi-word units that have traditionally been regarded as fixed, idiosyncratic expressions do not in the end dramatically differ from non-idiomatic, syntactically free expressions. While outlining the background for the book, Fellbaum's text also introduces the reader to the field of corpus-linguistic idiom studies in general, explaining the basic framework and terminology and surveying some of the seminal studies in the field.

The actual chapters in the book are divided into two parts. The first four chapters are called ''Corpus, extraction and workbench'' and discuss the technical, theoretical, and, to some extent, practical background of the project, while the final six chapters form the ''Linguistic analysis'' and present individual empirical studies conducted during the project.

The technical part of the book begins with Alexander Geyken's description of the corpus compilation project (''The DWDS corpus: a reference corpus for the German language of the twentieth century'', 23-40). It provides an account of the various questions related to creating a balanced reference corpus and thus offers a valuable lesson for anyone planning to compile a corpus of their own. The emphasis is, naturally, on the 100-million-word core corpus called the DWDS corpus, which contains German texts from each decade of the 20th century balanced chronologically and by text genre, but the project also involved the collection of a 900-million-word supplementary corpus of newspaper texts from the 1990s. Together these two corpora are large enough to be reliably used for studies on idioms, but the development work still continues.

The following three chapters describe some of the basic aspects related to the tools and methods used for searching and analyzing the corpora. First, Alexander Geyken and Alexey Sokirko (''Classifying NVGs/FVGs in an interactive parsing process'', 41-53) give a brief account of the interactive parser they have been developing to help lexicographers find suitable data from the corpus; the corpus, after all, is far too large for anyone to inspect manually. Their work is based on the idea of semi-automatic linguistic analysis, in which a shallow parser first extracts, on a syntactic basis, a suitably sized set of relevant examples, which can then be inspected manually. The parser development continues, but the results of the experiments that have so far been conducted on verb-nominalization constructions and function verb constructions look promising.

In his essay, Axel Herold (''Corpus queries'', 54-63) concentrates on the problems related to extracting enough relevant data from the corpus. It is important that the queries used to extract examples allow us to detect not just those variations that we could think of but also those that we would not expect to turn up. After all, intuition often cannot account for everything that actually happens in the real world, and this is especially true of idiomatic expressions.

Gerald Neumann, Fabian Körner and Christiane Fellbaum (''A lexicographic workbench for German collocations'', 64-77) describe the lexicographic workbench used for analyzing and representing the data. The idiom examples extracted from a corpus form example corpora, and the idioms in these corpora are analyzed manually and represented in annotated templates, which contain a lot of information about the syntactic and semantic nature and even the history of each example idiom, linking it to other relevant expressions. These templates constitute the core of the workbench and will be freely available to the research community, which makes the whole project a very valuable contribution to the field.

Katerina Stathi's chapter ''A corpus-based analysis of adjectival modification in German idioms'' (81-108) begins the empirical part of the volume and provides a comprehensive and careful investigation into the ways in which German idioms allow adjectival modification. Stathi takes a critical view of earlier suggestions and develops a fine-grained classification of adjectival modification, which contains five different levels and functions hierarchically in a fashion somewhat similar to Fraser's (1970) classic hierarchy of idiom transformations, i.e. each modification that is permitted at a certain higher level of the hierarchy automatically permits modifications at lower levels. Moreover, Stathi also ponders the consequences her analysis has for idiom theory in general.

In their article ''Types of changes in idioms – some surprising results of corpus research'' (109-137), Elke Gehweiler, Iris Höser and Undine Kramer take a diachronic view of idioms and discuss the semantic and structural changes that have occurred in idiomatic expressions in German during the 20th century. The development of idiomatic expressions follows the same routes that have been acknowledged for single lexemes. This diachronic development is something that cannot be adequately presented in conventional dictionaries, but the authors suggest that their idiom database could function as a suitable means to this end.

Christiane Hümmer discusses the possibly motivated nature of the contextual behavior of idiomatic expressions (''Meaning and use: a corpus-based case study of idiomatic MWUs'', 138-151). She comes to the conclusion that idiom behavior is at the same time both motivated and arbitrary. Motivation is important for explaining the links between the different semantic levels of idiomatic expressions, but arbitrariness can be seen, for example, in the way that only one of the possible motivated links between the lexicon and language use is realized.

In the chapter called '''You fool her' doesn't mean (that) 'you conduct her behind the light': (Dis)agglutination of the determiner in German idioms'' (152-163), Anna Firenze concentrates on the determiner variation that can be found in German idioms. Although various grammar books and earlier studies have categorically claimed that determiner changes in certain idioms automatically turn them into non-idiomatic expressions, Firenze shows that this is not true. On the contrary, the various determiner changes that are possible for non-idiomatic language are also possible for idioms without loss of idiomatic meaning. For some reason, the examples in this chapter are not glossed, which makes the chapter slightly different from the other chapters of the book.

Angelika Storrer's chapter ''Corpus-based investigations on German support verb constructions'' (164-187) analyzes German support verb constructions, also known as light verb or nominalization verb constructions. Storrer divides the constructions into two types, those in which the predicative noun following the verb forms part of a prepositional phrase (the construction type is abbreviated as PP-SVC) and those in which the noun is the head of a direct object (DO-SVC), and shows how these types behave slightly differently in terms of morphosyntactic variation. She also points out that the previous assumption, according to which support verb constructions can in most cases be freely substituted by the corresponding base verb constructions, is unjustified; in many cases, contextual or semantic restrictions prohibit such substitutions. The chapter contains a lot of interesting information. However, a few of the quantitative claims made in it would have benefited from testing with statistical methods, although most of the points made in the text do not require particular statistical verification.

The volume ends with Christiane Fellbaum's discussion of the roles of constructional meaning and lexical meaning in the semantics of idiomatic expressions (''Argument selection and alternations in VP idioms'', 188-202). Following the ideas of Goldberg (1995), she argues for the importance of lexeme-independent constructional meaning in explaining the semantics of idioms. As a consequence, the semantic analysis of idioms requires that idiom-specific syntactic frames that essentially contribute to the meaning be recognized.

EVALUATION
This book is a neat and compact package of studies illuminating the phenomenon of German multi-word units from various angles. Its object is interesting and current in linguistics, and the fact that the studies are corpus-based makes it even more topical. Although the articles approach the phenomenon from various perspectives and posit fairly different research questions, they closely relate to one another and support the claims made in the whole book.

An additional merit of the book is that it bridges the gap between the German and Anglo-American research traditions of idiomatic language. After all, phraseology plays an important role in German linguistic tradition, part of which is unfortunately little known by researchers who are not literate in German. In addition to containing original studies, which discuss idiomatic expressions in German and thus offer information that could be compared, for example, with the corresponding phenomena in English, the book also draws attention to a prominent body of idiom literature that has been published either in German or in French and therefore has so far remained mainly unrecognized in the English-speaking world.

Since this is a work in progress, one could always question whether it would have been a good idea to delay the publication, for example, by a year, because this would have allowed time for some of the work to be developed a bit further. I, however, prefer the publication at this point. The stage at which the work is at the moment (or was at the moment when the articles were written) is now reported in the articles and offers valuable information for researchers who are planning or have already begun to work on similar projects; had the articles been written later, some of the questions that are included in them and can be of help for future projects might have been left out, since they would have been solved already. Moreover, since the authors at various points emphasize that this is a work in progress and will be developed continuously, there would have been no guarantee that a slight delay would have found the project at a stage essentially different from its present condition.

Unfortunately, the print quality of some of the figures in chapters 2 and 4 is fairly poor. And it escaped the eye of the editor that the text on a few occasions refers to color codes that are used in the computer programs while the figures in the book are black and white. Nevertheless, the book reads well and the type editing is almost faultless. All in all, Fellbaum's _Idioms and Collocations_ is a very welcome contribution to the field of idiom research and offers valuable information about corpus-based study of multi-word units.

ABOUT THE REVIEWER
Esa Penttilä, PhD, is currently working as senior assistant at the University of Joensuu, Finland. His main research interests are in cognitive linguistics, in particular idiomatic language and idiomatic constructions. He is also interested in the philosophy of language.