Caveat Lector

Why (other than the catchy title) might one choose this book? Because of its intriguing subject, of course. Author Read Montague shows how computational neuroscience, animal neurophysiology and functional magnetic-resonance imaging (fMRI) have advanced our understanding of how reward signals in the brain guide our actions. Other incentives include the book's engaging style, fresh anecdotes and uninterrupted narrative flow (technical information and literature citations are relegated to endnotes).

But should you choose this book? I'm not so sure. Unfortunately, the text as a whole is not as rewarding as one might wish. Despite its grounding in computational neuroscience, Why Choose This Book? provides little description of how neural circuitry works or how computational models can extend our understanding of it. Too often, the discussion of technical concepts is a little off-key, and there are no illustrations to give the reader a better feel for empirical data or model structure. What's more, the chapters are uneven in quality. Although most of the notes are useful, they are not indexed. And important ideas hinted at in the text may lack the expected elaboration in the notes.

Let's take a brief tour of the book's contents. Montague observes that the ability to assign value and make choices is a fundamental feature of how the brain computes. He then states that "computations with goals mean computations that can care about something," carelessly conflating computational and psychological descriptors. He ignores the classic studies of goal seeking that grew out of Norbert Wiener's 1948 book Cybernetics: Or, Control and Communication in the Animal and the Machine. However, Montague does make the powerful observation that

humanity's special capacity to value arbitrary objects and behavioral acts confers on us a kind of behavioral superpower [italics added] not rivaled in the nervous systems of any other species—we can choose to veto our instincts for survival based on an idea.

And he later examines the impact of this power for good and for ill.

Montague summarizes the 1936 contribution of Alan Turing, who showed that any rule-based computation could be conducted by what is now known as a Turing machine equipped with a single program; that the program could be separated from the hardware; and that there is a universal Turing machine, which can simulate any other Turing machine. Turing's work was seminal, and an award that is the equivalent of a Nobel Prize for computer science is named in his honor. But it is hard to see the relevance of his work to this book. Turing separates software from hardware; the brain unites them in wetware. Turing machines can carry out any computation, true, but in a ploddingly serial manner that is quite unlike the distributed and parallel computations of the brain. And Turing machines do not care. The key idea that remains is that of a "virtual machine": Like a computer running different programs, the human prefrontal cortex links to the "value system" of its owner, enabling him or her to learn how to operate in many very different ways, as task and circumstance dictate.

Montague maintains in chapter 2 that the brain's slowness, noisiness and imprecision should be considered a sign of its near perfection, but his main argument in support of that position is that a laptop computer can become intolerably hot to the touch unless cooled, whereas a human brain maintains a bearable temperature. But to assert this is to ignore the diverse criteria that define optimality. If one wishes to store vast arrays of words or numbers and to recall them and operate on them with great reliability, an electronic computer is far more efficient than a brain. A more useful analysis would probe how brains evolved and how they combine cellular processing with the adaptive demands of embodied creatures; it would also provide illuminating examples of how the brain encodes information.

Montague is also wrong to say that valuation is essential to an efficient computational device (adding numbers efficiently is not based on evaluating a goal), but he is right to go beyond classic cybernetics in stressing the following idea:

To be adaptive to new environments or simply exist in a highly variable environment, an efficient machine must have a way to update its prestored values [these are valuations in the sense of goal-setting, not specific numerical values] and to learn new values altogether.

Reinforcement learning (Montague ignores other forms of learning and the brain regions that employ them) is the method of doing this that lies at the heart of the book. Strangely, the details of "slow, noisy, and imprecise" neural computation are largely missing from the presentation of this learning technique.

Chapters 4 and 5 center on the remarkable coming together of the mathematical theory of reinforcement learning and actual neurophysiology. The basic challenge is this: Reinforcement may be intermittent, so it is hard to know which of one's preceding actions contributed to the eventual outcome. The theory describes how someone can learn from success and failure by "backing up" from reinforced states to those that are likely to lead to success (these come to have the relatively high values of expected positive reinforcement) or likely to lead to failure (these gain the relatively high values of expected negative reinforcement). For example, for someone learning checkers, winning the game serves as positive reinforcement and losing it as negative reinforcement. The theory of reinforcement learning explains how, with experience, a player comes to form better judgments of which moves in the middle of the game are more likely to take one in the direction of a win and which in the direction of a loss.

In learning to estimate with increasing accuracy the expected level of reinforcement for more and more states, a crucial signal is the temporal-difference error, which occurs when an action brings about an unexpected change in reinforcement. In key experiments, Wolfram Schultz of the University of Cambridge studied the firing of the neurons that release the neuromodulator dopamine. In one study, he rewarded monkeys with a small slice of apple and found that at first these dopaminergic neurons fired at the exact moment when the monkey received the reward. However, the dispenser made a distinctive click when it cut off a piece of apple, and in due course the neurons fired when the monkey heard the click rather than when it got the piece of apple. The neurons, it seemed, were firing to signal the temporal difference in expected reinforcement. This insight is a triumph of the integration of computational neuroscience and neurophysiology, and Montague provides a good sampling of recent, related findings. He suggests that the priority for uncovering this linkage goes to himself, Peter Dayan and Terry Sejnowski for their 1996 article "A Framework for Mesencephalic Dopamine Systems Based on Predictive Hebbian Learning" (Journal of Neuroscience 16:1936-1947). However, the true pioneer was Andy Barto ("Adaptive Critics and the Basal Ganglia," in J. C. Houk, J. L. Davis and D. G. Beiser [editors], Models of Information Processing in the Basal Ganglia, MIT Press, 1995), who had earlier developed the theory of temporal-difference learning with his student Rich Sutton.
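For readers who want a concrete feel for the click-and-apple result, the shift of the error signal from reward to cue can be reproduced in a toy temporal-difference simulation. The sketch below is an illustration only, not the model from the book or from the papers cited above. Its assumptions are mine: time after the click is divided into a few discrete steps, the apple arrives at the last step, the value of the moment before the click is pinned at zero because the click's timing cannot be predicted, and the function name and all parameters are hypothetical.

```python
def simulate_conditioning(n_trials=500, n_steps=5, alpha=0.1, gamma=1.0, reward=1.0):
    """Toy TD(0) model of a click followed, n_steps later, by a reward.

    Returns a list of (error_at_click, error_at_reward) pairs, one per trial.
    All names and parameter values are illustrative assumptions.
    """
    # values[t] is the learned prediction of future reward t steps after the click.
    values = [0.0] * n_steps
    history = []
    for _ in range(n_trials):
        # The pre-click baseline is assumed to have value 0 (the click is unpredictable),
        # so the error at click onset is the jump from 0 to the first post-click value.
        error_at_click = gamma * values[0] - 0.0
        error_at_reward = 0.0
        for t in range(n_steps):
            is_last = (t == n_steps - 1)
            r = reward if is_last else 0.0              # apple arrives only at the last step
            next_value = 0.0 if is_last else values[t + 1]
            # Temporal-difference error: what happened minus what was predicted.
            td_error = r + gamma * next_value - values[t]
            values[t] += alpha * td_error               # back the prediction up one step
            if is_last:
                error_at_reward = td_error
        history.append((error_at_click, error_at_reward))
    return history


if __name__ == "__main__":
    history = simulate_conditioning()
    print("first trial (click, reward):", history[0])
    print("last trial  (click, reward):", history[-1])
```

On the first trial the error appears only when the reward arrives; after many trials the reward is fully predicted, so the error at reward time shrinks toward zero and the error appears at the click instead, which is the qualitative pattern reported for the dopaminergic neurons.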

Montague also reminds us that the transition from a primary to a secondary reinforcer—from apple to click—can be repeated, and in humans the process can go very far indeed, so that abstract ideas can come to provide more powerful reinforcement than do the things that are needed for basic survival. A human can become, quite literally, a martyr for an idea. Thus chapters 4 and 5 are titled, respectively, "Sharks Don't Go on Hunger Strikes: And Why We Can," and "The Value Machine: And the Idea Overdose."

Chapter 7, "From Pepsi to Terrorism: How Neurons Generate Preference," is not really about neurons so much as how the overall activity of brain regions can be monitored by techniques such as fMRI, which compares blood flow in the various regions of the brain while a human performs various tasks. Pepsi comes into play in experiments that show that being told a drink is Coke rather than Pepsi can change the neural correlates of tasting the beverage, even if one cannot actually distinguish one soda from the other in a blind taste test. This chapter thus helps us appreciate the links between computational neuroscience, animal neurophysiology and human psychology as understood through the use of brain imaging.

Chapter 6, "The Feelings We Really Treasure: Regret and Trust," and chapter 8, "Our Choice: It's Not Your Mother's Soul, but It's Still Alive," contain nuggets of interesting information and more of Montague's lively anecdotes. Both are nonetheless disappointing. Montague's commentary on the important questions under discussion is informed neither by deep philosophical analysis nor by his own expertise in computational modeling and brain imaging.

So, which parts of this book should you choose to read? Chapters 4, 5 and 7 and their many endnotes give a valuable tour of the topic of reinforcement learning, but other chapters are far less rewarding. I particularly regret that Montague failed to make the most of this valuable opportunity to provide a fuller sense of the importance of computational neuroscience.