Dominic Widdows

Quick: What do the following words have in common: ring, ideal, and field?

To a mathematician, the answer is that all three describe algebraic structures. To a linguist, however, all three are examples of "ambiguous" words, that is, words with more than one meaning. Human beings are quick to determine the meanings of such ambiguous words from their context (if the list had been "ring, clip, and pin", or "ring, vault, mat, and bars", our mental image of the word "ring" would change). Computers are often less adept at this, and the mistakes that computer translation programs make are a source of considerable humor to those of us who don't need to rely on them. Two supposed examples are the (apocryphal) translation of these two common expressions:

"The spirit is willing, but the flesh is weak," and "Out of sight, out of mind,"

into

"The vodka is good, but the meat is rotten," and "Invisible Idiot"

respectively. Both of these translations were the invention of a human sense of humor, although urban legend attributes them to computer error; the persistence of these legends attests to the important role that ambiguity plays in language.

Widdows' book describes how geometry can help computer software mimic the process of determining meaning through context. My own one-sentence oversimplification of the strategy is this: words linked by "and" are likely to have related meanings. [A fuller description appears toward the end of this review]. To drive home this point, I challenge the reader to create a natural-sounding sentence that includes the phrase "parsnips and honor".

At this point, before I expand on Widdows' illuminating book, I would like to take a digression on this word "and", particularly as it appears in the title. Faced with two dissimilar words that are linked by "and" — such as "parsnips and honor" or "Geometry and Meaning" — we have to work hard to seek a context that includes both. If I hadn't given away part of the plot, what might you have guessed about this book from its title? (If you prefer, you can skip to the end of this digression.)

I confess that I was drawn to this book by a mistaken guess that it might fit into one of two genres, which I'll call here the "Edward Tufte" and "Betty Edwards" genres. Tufte's books (The Visual Display of Quantitative Information, Envisioning Information, and Visual Explanations) have sold more than a half-million copies and started a minor Tufte cult; its followers are dedicated to making quantitative displays as informative and self-explanatory as possible. (I would place myself in an outer ring of this cult.) In this case, "Geometry" might more appropriately be termed "graphics" and "meaning" might better be termed "information"; the overarching purpose of Tufte's books and those of his followers is to design graphical displays that do the best possible job of helping the viewer understand the patterns (and exceptions to patterns) that lie in the data.

Betty Edwards is the author of both Drawing on the Right Side of the Brain and Drawing on the Artist Within. Her work — particularly that found in the first book — has changed the way art instructors everywhere teach their classes, albeit in the same way that Calculus Reform has changed the approach of all mathematics instructors. That is, either artists incorporate the Betty Edwards approach, or they make a conscious, defendable choice not to use it.

The Betty Edwards story is not as familiar to mathematicians as that of Tufte, so I'll use her words to summarize it quickly here.

I was teaching in a high school in 1965, my first year of public school teaching, and I was dismayed that I wasn't able to teach all of my students how to draw... It had always seemed to me that drawing, compared with other kinds of learning, was easy: after all, everything you need to know in order to draw is right there in front of the eyes. Just look at it, see it, and draw it. "Why can't they see what is right in front of their eyes?" I wondered. "What is the problem?"...

Then one day, out of sheer frustration, I announced to my class, "Right. Today we are going to draw upside down." I placed some copies of master drawings upside down on the students' desks: I told them not to turn the drawings right side up, and to make a copy of the original, doing their own drawings also upside down. The students, I believe, thought I had gone "round the bend." But the room became suddenly quiet, and the students settled in to the task with obvious enjoyment and concentration. When they had finished and we turned the drawings right-side up, to my surprise and to the students' surprise, every person in the class, not just a few, had made a good copy, a good drawing. [Drawing on the Artist Within, pages 21-22].

Her conclusion from this experiment, and the thesis of her first book, is that teaching students to draw really means teaching students to see the world around them. Her second book, appropriately enough, turns this conclusion upside down, and proposes the notion that in order to "see" the answers to perplexing riddles in our own world, it helps to understand the "vocabulary" of drawing. [The word "vocabulary" here does not mean "verbal terminology" but "artistic characteristics". Drawings whose main feature is a simple horizontal line (many landscapes, for example) tend to be seen as "tranquil", and the lines in an "angry" painting have very different characteristics from those in a "joyful" one.] Although her work leans heavily upon psychological research into the left-brain/right-brain dichotomy, she would be as dismayed as any mathematician to hear our students lament, "I can't do that; I'm just not left-brained!" Indeed, the book abounds with quotations by Poincaré, Einstein, and Hadamard, as well as philosophers, artists, and other scholars from a broad variety of disciplines.

Edwards and Tufte share the common goal of using visual displays to extract useful information: Edwards extracts the information from the artist's personal experience and environment, and Tufte extracts it from a collection of data. In both cases, the visual display allows the viewer to ask questions of the display, and (with any luck) to answer those questions. In both cases, this visual display is personal in the sense that it expects that a person will be its audience.

Widdows differs from both Edwards and Tufte in that his work is not about visual displays (although there are some of these in the book). The "Geometry" is not the paper-and-pencil geometry of similar triangles or ruler-and-compass constructions. Instead, it is the geometry of weighted graphs, linear algebra on high-dimensional vector spaces, lattices, and logic. (As for the last, Widdows states on page xviii that "the eventual union of logic and geometry could be as great a catalyst as the union of algebra and geometry wrought by Descartes over 350 years ago.")

Nor is his approach "personal": the geometric structures that he develops require significant computational power and so the "viewer" he most often names is a search engine. When we enter "rings ideals fields" into Google or MathSciNet, we hope that we don't have to wade through pages about jewelry and gymnastics, and so we want the search engine to understand context well enough to weed those pages out. Enter Widdows.

The three main audiences for this book are linguists, mathematicians, and computer scientists (particularly those interested in artificial intelligence or information engineering). Widdows has used the material from this book in a graduate course in language and informatics, and has exercises that are available "upon request". In spite of this, Widdows hopes that this book is a gentle introduction to the field for those with a "general scientific interest", and true to this goal, he carefully distinguishes sections that go into great detail for the benefit of a select group, reminding the rest of his readers of the main points and urging them not to get bogged down in details. His introduction states "my dearest (and for a mathematical book, most ambitious) desire has been that Geometry and Meaning should be enjoyable."

Any book that uses Grassmannian manifolds and projections from a 1000-dimensional vector space to a 100-dimensional subspace is not likely to be the same kind of "enjoyable" as, say, a new John Grisham novel or even a "visualizing graphs" book. You're not going to find yourself reading snippets of this book to strangers on a bus, or recommending it to your grocer. That said, anyone who reads reviews on MAA Online is likely to find this book "enjoyable"; I might read snippets of this book to my linear algebra students, and I would recommend it to my mathematical colleagues.

So what is in this book?

Chapter 1 ("Geometry, Numbers, and Sets") is an introduction to just what its title describes. I liked Widdows' use of three examples — packing a car with boxes, adjusting the rearview mirrors, and turning the steering wheel — to describe how geometry can be 3-, 2-, or 1-dimensional. In fact, I cruised through the first chapter wondering if it seemed so clear because I am a mathematician, or because of his writing.

I got to test this question in Chapter 2 ("The Graph Model: Networks of Concepts"), which introduced linguistic terminology like thesauri, corpora (bodies of text), kinds, and tokens. It was in this chapter that I began to appreciate Widdows' carefully thought-out choices of different fonts and his many clear examples. Also in Chapter 2 we get the first examples of linguistic graphs, where nouns (for example) are the vertices, and two nouns are joined by an edge if they are linked by an "and" in the corpus. It is here also that we get a hint of how to use the graph structure to identify ambiguous words (like "ring").
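
As a rough illustration (not taken from the book), such a graph can be sketched in a few lines of Python. The function name `build_and_graph` and the toy corpus are hypothetical, and the sketch links any two words joined by "and", whereas a real system would first tag parts of speech and keep only the nouns.

```python
import re
from collections import defaultdict

def build_and_graph(sentences):
    """Link two words by an edge whenever they occur as 'X and Y'."""
    graph = defaultdict(set)
    for sentence in sentences:
        # Find each pair of words joined directly by "and".
        for left, right in re.findall(r"\b(\w+) and (\w+)\b", sentence.lower()):
            graph[left].add(right)
            graph[right].add(left)
    return graph

# Hypothetical toy corpus; a real corpus would hold millions of sentences.
corpus = [
    "Every ring and field has an identity.",
    "She wore a ring and necklace.",
    "The necklace and bracelet were gold.",
    "Each field and ideal is defined in chapter two.",
]
graph = build_and_graph(corpus)
print(sorted(graph["ring"]))
```

In this toy graph the neighbors of "ring" include both an algebra word and a jewelry word, which is the kind of hint of ambiguity that the chapter goes on to exploit.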

Chapter 3 ("The Vertical Direction: Concept Hierarchies") leads us into directed graphs and metrics. These metrics give us a computational way of determining that

"Rings, ideals, and fields are all algebraic structures,"

is a more informative statement than

"Rings, ideals, and fields are all mathematical structures."

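One simple way to see why the first statement is more informative is to look for the most specific concept that sits above all three words in a hierarchy. The Python sketch below is a simplified stand-in for the metrics in the book, not Widdows' own method; the `PARENT` table and both function names are hypothetical.

```python
# Hypothetical toy hierarchy: each concept points to its parent.
PARENT = {
    "ring": "algebraic structure",
    "ideal": "algebraic structure",
    "field": "algebraic structure",
    "algebraic structure": "mathematical structure",
    "topological space": "mathematical structure",
    "mathematical structure": "abstraction",
}

def ancestors(concept):
    """Return the chain from a concept's parent up to the root, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def most_specific_common_ancestor(concepts):
    """Return the lowest node in the hierarchy that lies above every concept."""
    first, *rest = concepts
    shared = [set(ancestors(c)) for c in rest]
    for candidate in ancestors(first):
        if all(candidate in s for s in shared):
            return candidate
    return None

print(most_specific_common_ancestor(["ring", "ideal", "field"]))
```

The lower the common ancestor sits in the hierarchy, the more it tells us about the words beneath it.
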
Chapter 4 ("Measuring Similarity and Distance") begins by using distance notions to help identify "prototypes" of a class of objects (a prototypical algebraic structure might be a group, ring, or field, rather than a monoid). The topic then turns to symmetry, in particular noticing that "and" is not a transitive way of identifying the meanings of words, and using several notions (such as curvature of a graph) to identify ambiguous words and keep their various synonyms separate.
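
One common way to put a number on the curvature of a graph at a word is the fraction of that word's neighbor pairs that are themselves linked. Whether this matches the exact definition in the book is an assumption on my part, and the toy graph and the name `curvature` below are hypothetical.

```python
from itertools import combinations

def curvature(graph, node):
    """Fraction of pairs of neighbors of `node` that are linked to each other."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)

# Hypothetical toy graph: "ring" bridges an algebra cluster and a jewelry cluster.
graph = {
    "ring": {"field", "ideal", "necklace", "bracelet"},
    "field": {"ring", "ideal"},
    "ideal": {"ring", "field"},
    "necklace": {"ring", "bracelet"},
    "bracelet": {"ring", "necklace"},
}

print(curvature(graph, "ring"))
print(curvature(graph, "field"))
```

Here "field" is linked to "ring" and "ring" to "necklace", but "field" is not linked to "necklace", so the links are not transitive. A word whose neighbors rarely link to one another, like "ring" in this graph, scores low and is a candidate for ambiguity.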

Chapters 5 to 7 ("Word Vectors and Search Engines", "Exploring Vector Spaces in One and More Languages", "Logic with Vectors: Ambiguous Words and Quantum States") get to the heart of the book; this is where Boolean Algebra, Grassmannian Manifolds, and the projections onto 100-dimensional subspaces appear. It is hard to imagine that the reader who does not know the material of Chapter 1 can get his mind all the way around all of these concepts; to his credit, Widdows continues to give both technical and intuitive descriptions of the mathematical techniques he uses. The linguistic accomplishments that these techniques produce, however, are clear and impressive (especially to a novice like me). Some of these accomplishments include translation from one language to another:

(Our) techniques even found German translations that hadn't yet made it into the UMLS medical thesaurus [a multi-lingual thesaurus], in particular acronyms that were listed only as English terms which are also used in German with the same meaning. [p 191].

Other examples concern search engines, with queries of the type "A NOT B". Widdows supposes that we want to find a place to clean our business suit, and so we do a search of "suit NOT lawsuit". For such searches, the method of Vector Negation (which relies on orthogonal projections) allows search engines to remove documents with words related to B (such as "plaintiff"), and is therefore more efficient than the traditional Boolean methods at calling up only the most relevant documents.
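
The orthogonal projection behind "A NOT B" fits in a few lines: subtract from the vector for A its component along the vector for B, so that what remains is orthogonal to B. The Python sketch below uses made-up three-dimensional word vectors (real ones have hundreds of dimensions), and the function names `dot` and `negate` are my own.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def negate(a, b):
    """Return 'a NOT b': a minus its orthogonal projection onto b."""
    scale = dot(a, b) / dot(b, b)
    return [x - scale * y for x, y in zip(a, b)]

# Hypothetical word vectors; the axes loosely mean (clothing, law, weather).
suit = [0.7, 0.7, 0.0]
lawsuit = [0.0, 1.0, 0.0]
plaintiff = [0.0, 0.9, 0.1]

query = negate(suit, lawsuit)

# The query no longer overlaps with "lawsuit", and its overlap with the
# related word "plaintiff" is smaller than that of the original "suit".
print(dot(query, lawsuit))
print(dot(suit, plaintiff), dot(query, plaintiff))
```

Because the query vector is orthogonal to "lawsuit", documents dominated by legal vocabulary score low even when they never contain the word "lawsuit" itself.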

The book ends (as every good book should) with a chapter or two that sums everything up and points toward future research.

Perhaps I have made it clear that I am a stranger to mathematical linguistics. I can't tell you how this book fits into the field, judge the newness of the results, or even comment on the correctness of the author's assertions. On the other hand, I have the advantage of being able to judge how accessible this material might be to a new audience, and I can say quite honestly that this book opened up a whole new world to me. I can now speak confidently of latent semantic analysis and corpus data. When I next teach Linear Algebra, I will have a whole new set of examples close at hand, examples that are probably more immediate to my students than electrical circuit diagrams.

And I've gained a new respect for the importance of the word "and".

Annalisa Crannell's primary research is in topological dynamical systems, but she is also active in developing curricular materials for courses on "Mathematics and Art" as well as materials for writing across the curriculum.