Crunching the Data for the Tree of Life

Biologys tree of life has grown out of a simple sketch by Darwin (center) into many and varied new attempts to visualize
the diversity of life. The Paloverde program (left) allows a user to cruise through thousands of species with the movements
of a mouse. Above right, a particular gene is traced to visualize how different species are related.Credit
Illustration by Thomas Porostocky; Photographs, from left, by Michael Sanderson; Mciary Altaffer/Associated Press; Tal Dagan and William Martin

Michael Sanderson is worried. Dr. Sanderson, a biologist at the University of Arizona, is part of an effort to figure out how all the estimated 500,000 species of plants are related to one another. For years now the researchers have sequenced DNA from thousands of species from jungles, tundras and museum drawers. They have used supercomputers to crunch the genetic data and have gleaned clues to how today’s diversity of baobabs, dandelions, mosses and other plants evolved over the past 450 million years. The pace of their progress gives Dr. Sanderson hope that they will draw the entire evolutionary tree of plants within the next few years. “It’s within striking distance,” Dr. Sanderson said.

There’s just one problem. “We have no way to visualize such a tree at the moment,” he said. If they tried, they would end up with a blurry, inscrutable thicket. “It would be ironic,” Dr. Sanderson said. “We’d be saying, ‘We’ve built it, but we can’t show it to you.’ ”

Ever since Charles Darwin first sketched a spindly sapling in 1837, biologists have relied on evolutionary trees to understand the history of life. Today biologists draw evolutionary trees to help them track the emergence of new diseases, identify species at risk of extinction, and trace the history of disease-related genes in the human genome. Within the next few decades, biologists may figure out how the millions of species on Earth are related to one another. But for people to actually see that tree of life, the tree itself will have to evolve.

Biologists have responded to the problem by enlisting the help of computer scientists and software designers from companies like Google and Adobe to find a new way of looking at evolution. Their goal is to create a program that allows scientists and nonscientists alike to fly through evolutionary trees.

“Just like Google Earth changed the way people look at geography, a sophisticated tree of life browser could really change the way we look at the life around us,” said Mark W. Westneat, the director of the Biodiversity Synthesis Center at the Field Museum in Chicago.

Darwin drew the first evolutionary tree when he was 28. He had recently returned to England from his five-year voyage around the world aboard the Beagle, and his theory of evolution was still in an embryonic state. It occurred to him that evolution could explain the similarities and differences between species. The descendants of an ancestral species might have evolved into different forms, splitting into separate lineages “like the branching of a great tree from a single stem,” as he would later write in “On the Origin of Species.”

Darwin’s first tree is now a familiar sight in books, museum exhibits and, of course, on Wikipedia. But David Kohn, the director of the Darwin Digital Library at the American Museum of Natural History, has recently discovered 10 other trees that Darwin drew in later years. “It’s a long-term preoccupation,” Dr. Kohn said. “It feels like he’s using it to think.” While pondering how humans evolved, Darwin drew a cluster of branches to represent our common ancestry with apes and monkeys.

Darwin left it to other biologists to figure out what real evolutionary trees looked like. In 1879, for example, the German biologist Ernst Haeckel published a tree, complete with bark and leaves, showing humans and animals evolving from single-celled creatures.

The science of tree-building took a significant step forward in the late 1900s. Biologists set up standard rules for comparing species and figuring out who was most closely related to whom. Once they were all speaking the same scientific language, they could test each other’s hypotheses with new evidence. They also began to get new kinds of evidence for their trees. It became possible to compare not just the skeletons or color patterns of species, but also their proteins and genes.

At first biologists could draw only small trees, typically with a dozen branches at most. They were held back by the fact that a group of species may possibly be related in many different ways. If a biologist adds more species to a group, the possibilities explode. “For 25 species, there are more possible trees than there are stars in the known universe,” Dr. Westneat said. “For 80 species, there are more trees than there are atoms in the known universe.”
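
The explosion Dr. Westneat describes can be checked with a little arithmetic. The sketch below is an illustration, not a calculation from the article: it assumes the trees being counted are rooted and fully bifurcating, with every species a labeled tip, in which case the number of possible trees for n species is the double factorial (2n-3)!!. The function name and the comparison figures for stars and atoms (rough, commonly quoted orders of magnitude) are assumptions added here.

```python
def rooted_tree_count(n_species: int) -> int:
    """Count rooted, fully bifurcating trees with n labeled tips: (2n-3)!!.

    Assumption: this is the counting convention behind the quoted
    comparison; the article does not say which one was meant.
    """
    if n_species < 1:
        raise ValueError("need at least one species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n <= 2).
    for k in range(3, 2 * n_species - 2, 2):
        count *= k
    return count


# Rough orders of magnitude, used only for comparison (assumed values).
STARS_IN_UNIVERSE = 10**24
ATOMS_IN_UNIVERSE = 10**80

for n in (3, 4, 10, 25, 80):
    trees = rooted_tree_count(n)
    print(f"{n:>3} species: about {trees:.3e} possible trees")

print(rooted_tree_count(25) > STARS_IN_UNIVERSE)
print(rooted_tree_count(80) > ATOMS_IN_UNIVERSE)
```

Under these assumptions, three species allow 3 trees and four allow 15, and each added species multiplies the total by the next odd number, which is why checking every tree quickly becomes impossible.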

Simply comparing every single tree would be impossible. Fortunately, mathematicians developed statistical methods for searching quickly through potential trees to find the ones that do the best job of explaining all the evidence. Computers could do millions of calculations for biologists and store a growing database of information on Web sites. Trees grew hundreds of new branches, then thousands. “We’re overwhelmed with information,” said David Hillis, a biologist at the University of Texas at Austin.

Today trees with thousands of branches, sometimes called “supertrees” or “megatrees,” are starting to appear in scientific literature. Their branches reveal patterns in evolution that were missed in smaller studies.

In 2007, for example, Olaf Bininda-Emonds, a biologist at Carl von Ossietzky University in Germany, and his colleagues published a tree of 4,500 mammals, which is to say just about every known mammal species. The tree allowed researchers to estimate the rate at which mammals have evolved into new lineages. For decades, many researchers have argued that most major groups of living mammals evolved after the dinosaurs became extinct 65 million years ago. Based on their mammal supertree, Dr. Bininda-Emonds and his colleagues argued that mammals were diversifying millions of years earlier.

Less than two years later, the mammal supertree is looking puny. In a paper to be published in the journal BMC Evolutionary Biology, Stephen Smith of the National Evolutionary Synthesis Center in North Carolina and his colleagues have created a tree containing 13,533 species of plants. Their study shows that ferns — sometimes considered as living fossils that have changed little for hundreds of millions of years — have actually been evolving faster than younger groups of plants, like conifers and flowering plants.

Plants are not just related to one another. They’re also related to us animals, fungi, bacteria and all other living things on Earth. Over the past seven years, the National Science Foundation has been financing a project known as Assembling the Tree of Life, the goal of which is “to reconstruct the evolutionary origins of all living things,” according to its Web site. Research teams are analyzing slices of the tree, while mathematicians and computer scientists work on methods to combine them into a single analysis. “You can just imagine how Darwin would have enjoyed it,” Dr. Kohn said.

Darwin would probably not enjoy trying to draw such a tree, though. “Even when the mammal supertree is printed out at two meters by two meters, the species names remain virtually unreadable,” Dr. Bininda-Emonds said. “It’s a Google Earth kind of problem. You can’t simultaneously see where Central Park is in New York, and where New York is in the United States.”

It has become clear to biologists that they are going to have to find new ways to draw evolutionary trees. “Our advances in understanding evolution are moving really fast now, but the tools for looking at these big trees are lagging behind,” Dr. Westneat said.

The future of evolutionary trees may be on display on a wall in Dr. Sanderson’s laboratory in Tucson. He and his colleagues have mounted a bank of flat-screen monitors that can show off a program they have designed called Paloverde. Dr. Sanderson can transform an evolutionary tree into a three-dimensional structure, and then use his mouse to navigate through it, zooming in on particular branches he wants to inspect.

It’s a mesmerizing sight, but Dr. Sanderson is quick to point out its limits. “My program can handle 1,000 species fairly effectively. When you get to 5,000 species, it gets very slow and not very beautiful,” he said.

To bring evolutionary trees up to date, biologists are working with computer scientists and other visualization experts. Dr. Westneat has been organizing meetings over the past year to bring the two cultures together. “It has the potential to move us beyond what biologists with a little bit of programming can do,” Dr. Westneat said.

Even with the help of visualization experts, biologists won’t be able to fly through the tree of life any time soon. “It’s definitely not small potatoes — it’s cutting-edge research,” said Tamara Munzner, a computer scientist at the University of British Columbia.

Dr. Munzner is working on methods to allow biologists to see details of the tree of life without losing sight of its overall shape. One of her programs acts like a fisheye lens, blowing up clusters of branches. She has also figured out how to make trees rubbery, so that a biologist can stretch some parts of a tree open and squeeze others down. Although a few thousand branches may slow down Paloverde, Dr. Munzner’s programs can handle millions of branches.
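
The fisheye idea can be illustrated with a textbook formula. The sketch below uses the graphical fisheye transform of Sarkar and Brown, swapped in here purely as an example; it is not Dr. Munzner’s software, and the function name and the `distortion` parameter are assumptions. Positions are expressed as a normalized distance from the point of focus, from 0 (at the focus) to 1 (at the edge of the view).

```python
def fisheye(x: float, distortion: float = 3.0) -> float:
    """Map a normalized distance from the focus (0..1) to its distorted value.

    Uses the Sarkar-Brown form g(x) = (d + 1) * x / (d * x + 1).
    With d = 0 nothing moves; larger d magnifies the region near the
    focus and compresses the region near the edge.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a normalized distance between 0 and 1")
    if distortion < 0.0:
        raise ValueError("distortion must be non-negative")
    return (distortion + 1.0) * x / (distortion * x + 1.0)


# Branch tips spaced evenly along one axis of a drawing (hypothetical data).
tips = [i / 10 for i in range(11)]
for position in tips:
    print(f"{position:.1f} -> {fisheye(position):.3f}")
```

Tips close to the focus are pushed apart, so their labels have room to be read, while tips far from the focus are squeezed together but stay on screen. The endpoints 0 and 1 do not move, so the overall outline of the tree is preserved.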

For Dr. Hillis, drawing the tree of life is not something to do simply because it’s there. He thinks it will become a practical tool, in the same way online databases of DNA have become practical tools for geneticists.

“What I’d really like is the entire tree of life on a small hand-held device,” Dr. Hillis said. Biologists would be able to put a tissue sample from a plant, animal or other organism in the machine, which would then scan its DNA and find its place in the tree of life, even if it’s a new species. The data could then be uploaded to a database, so that every biologist’s machine would get an updated tree. “It would be a ‘tricorder’-like device, able to identify any species on Earth in the field,” he said.

If biologists do ever succeed in drawing the tree of life, it will look profoundly different from Darwin’s sketch. Lineages do branch as they evolve, but sometimes the branches join back together. It has long been known that separate plant species sometimes produce hybrids that can no longer interbreed with their parent species. In other words, they become new species. When biologists draw the relationships of some groups of plant species, their pictures look more like webs than trees.

In other cases, genes don’t have to wait for two species to come together — they simply leap from one branch of life to another.

Viruses sometimes infect a new host species, and in the process they transfer genes from their previous host. Many species of bacteria can slurp up naked DNA or pass it to one another on tiny genetic ringlets.

“Each gene has its own evolution. It’s not inherited from mother to daughter; it’s inherited from a neighbor,” said Peer Bork of the European Molecular Biology Laboratory.

Biologists are just starting to understand how this different kind of heredity alters the tree of life. Although genes may move from one species to another fairly often, it may be rare that they become a permanent part of a new genome. Tal Dagan, a biologist at the University of Düsseldorf, has estimated their impact by analyzing hundreds of thousands of genes from microbes. She estimates that 80 percent of the genes in any microbe have been passed from one species to another at some point.

Dr. Dagan and her colleagues have not simply published their results as a table of numbers, though. “We had to have a new picture of evolution,” she said.

For Dr. Dagan, evolution is still shaped like a tree. “Most of the evolution is still going on in the branches,” she said. But over billions of years, thousands of genes have shuttled among the branches. To drive this point home, Dr. Dagan and her colleagues have drawn a dense filigree of lines between the branches of the tree of life. “You see the tree and you see the thousands of edges, and you know this is how it is,” she said.

Could these sorts of evolutionary vines be added to a complete tree of life without letting visitors get lost in the complexity? “It does make things more complicated,” Dr. Munzner said. “But it doesn’t mean it’s hopeless. My answer is, ‘Bring it on.’ ”

Correction: February 19, 2009

An article on Feb. 10 about efforts to create complete evolutionary trees misspelled the name of a computer program that can navigate through a three-dimensional evolutionary tree. It is Paloverde, not Paleoverde. The article also misspelled a type of tree that researchers have studied. It is the baobab, not baobob.

A version of this article appears in print on Page D1 of the New York edition with the headline: Darwin: Crunching the Data For the Tree of Life.