O.K., Computer, Tell Me What This Smells Like

Can machine learning help unravel the bewildering relationship between scent and chemistry?

Illustration by Vanessa McKeown

Our sense of smell is gloriously specific. The mellow aroma of butter
and flour rising from warm pie crust, the synthetic bite of fresh paint,
the familiar odor of a new car—when we get a whiff of something, we know
immediately what it is. But this natural delicacy of perception far
exceeds our ability to tell how a given molecule, drawn on a blackboard
and considered as an abstraction, will strike our noses. Two substances
with completely different chemical shapes might smell almost identical,
while two others with similar shapes might smell nothing alike. That’s
in direct contrast to, say, color vision; by examining the wavelengths
of light bouncing off a rose or a child’s hat, a scientist can say that
a human will see them as red or blue (unless the human happens to be
color-blind—though, even then, the shade is predictable). But what about
a substance gives it the scent of grapefruit or sewage? As Leslie
Vosshall, a neurobiologist at Rockefeller University, in New York, told
me recently, “We have no way to tell.”

Of course, there are workarounds. In the mid-nineteen-eighties,
researchers with the American Society for Testing and Materials
recruited more than a hundred people to help compile a list of
molecules and their associated scents. From this trove of information
and others put together since, we know that benzaldehyde smells like
cherries and isoamyl acetate like bananas. But such atlases of odor,
Vosshall pointed out, are labor-intensive to make, and traditionally
they have served the rather limited needs of the flavor and fragrance
industries, failing to explore the full range of human olfaction. Over
the years, biologists who specialize in the psychophysics of smell have
continued to work away at the problem. Earlier this year, Vosshall and
her collaborators published a new take on it, this time using computer
algorithms.

The researchers first asked around fifty people to rate the intensity
and pleasantness of four hundred and seventy-four odor molecules, and to
describe them using terms such as “leather,” “fruit,” “bakery,” and
“chemical.” Then they provided groups of computer scientists, all
entrants in a competition called the DREAM Olfaction Prediction
Challenge, with more than four thousand pieces of information about the
molecules, ranging from their component atoms to their 3-D shapes. The groups used
machine-learning techniques to suss out connections between the
chemistry of the molecules and how they were perceived; some employed a
so-called random-forest strategy, which can uncover nonlinear
relationships, and others relied on regularized linear models, which are
simpler. A subset of the original data was kept apart so that it could
be used to test how accurate the models were.
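
For readers who want to see what holding data apart looks like in practice, here is a minimal sketch in Python. It is not the competitors' code: the data is synthetic, the single "feature" is a made-up stand-in for the thousands of chemical descriptors, and the helper names (`fit_ridge_1d`, `mean_squared_error`) are hypothetical. It fits a one-feature regularized linear model on most of the examples and scores it only on the ones it never saw.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Fit y ~ a + b*x with an L2 penalty `lam` on the slope b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # lam = 0 reduces to ordinary least squares
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(model, xs, ys):
    """Average squared gap between predicted and actual ratings."""
    intercept, slope = model
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic stand-in data (an assumption, not the study's measurements):
# one invented molecular feature and a noisy "pleasantness" rating.
rng = random.Random(0)
features = [rng.uniform(0, 10) for _ in range(100)]
ratings = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in features]

# Keep the last 20 examples apart; the model never sees them while fitting.
train_x, test_x = features[:80], features[80:]
train_y, test_y = ratings[:80], ratings[80:]

model = fit_ridge_1d(train_x, train_y, lam=1.0)
test_error = mean_squared_error(model, test_x, test_y)
print(f"held-out mean squared error: {test_error:.3f}")
```

The penalty pulls the slope toward zero, which is what makes the model "regularized"; a random forest would replace `fit_ridge_1d` with many decision trees but could be scored on the held-out examples in the same way.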

The groups’ predictions varied in accuracy, but, for some scents,
especially those that a human might label “garlicky” or “fishy,” Arizona
State University’s IKW Allstars and the University of Michigan’s Team
Guan Lab did fairly well. These categories of smell are both very
recognizable and very strongly linked to certain chemistry—sulfur
compounds for garlic, ammonia compounds for fish—which made them
particularly tractable. Not surprisingly, aggregating the various models
increased their power; twenty per cent of the time, Vosshall said, this
combined approach returned the correct answer, and sixty-five per cent
of the time the correct answer appeared in a list of the top ten
options. Still, despite having a great deal of information about how the
molecules looked, and despite improvements over earlier work, the groups
mostly weren’t able to furnish reliable predictions. Vosshall sees the
study as a preliminary skirmish in a longer, more demanding engagement.
“We’re not there,” she told me.
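
The two figures Vosshall cites, a correct first guess and a correct answer somewhere in the top ten, can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the scores are invented, the descriptors are a three-word toy vocabulary, and simple averaging stands in for whatever aggregation the study actually used.

```python
def average_scores(model_scores):
    """Combine several models by averaging their score for each descriptor."""
    count = len(model_scores)
    return {
        descriptor: sum(scores[descriptor] for scores in model_scores) / count
        for descriptor in model_scores[0]
    }


def top_k(scores, k):
    """Return the k descriptors with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def hit_rates(examples, k):
    """examples: (scores, true_descriptor) pairs. Returns (top-1, top-k) rates."""
    first = sum(1 for scores, truth in examples if truth in top_k(scores, 1))
    within_k = sum(1 for scores, truth in examples if truth in top_k(scores, k))
    return first / len(examples), within_k / len(examples)


# Invented scores from two hypothetical models for one molecule.
model_a = {"garlic": 0.6, "fish": 0.3, "fruit": 0.1}
model_b = {"garlic": 0.4, "fish": 0.5, "fruit": 0.1}
combined = average_scores([model_a, model_b])

# Pretend three molecules received the same combined scores but truly
# smell of garlic, fish, and fruit respectively.
examples = [(combined, "garlic"), (combined, "fish"), (combined, "fruit")]
top1_rate, top2_rate = hit_rates(examples, k=2)
print(top_k(combined, 2), top1_rate, top2_rate)
```

With a real vocabulary of descriptors, `k=10` would give the second of the two figures; the toy example uses `k=2` only because it has three descriptors to choose from.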

One way to interpret the results of the experiment is to conclude that
scientists need more data—more opinions from more people on how more
molecules smell. But another possibility, Vosshall said, is that she and
her colleagues are still thinking about olfaction too simplistically. In
the nose, there are hundreds of smell receptors. Unlike their brethren
in the mouth, each of which senses just one of the key tastes (sweetness,
saltiness, umami, sourness, or bitterness), they don’t appear to be
specialized. Instead, they seem to
interact with one another and with the environment in all kinds of ways,
sending myriad messages to the brain that it interprets as the scent of
chocolate cake, sawdust, lilacs. Clearly, the mechanisms of smell are
not as simple as a lock fitting into a keyhole, or even an alkaloid from
your morning coffee striking a bitter receptor on your tongue. The
difficulty of the problem makes it all the more marvellous that we can
recognize scents so effortlessly.