Origin of Species: How a T. Rex Femur Sparked a Scientific Smackdown

Everyone suspected dinosaurs were giant birds; then one researcher produced 68 million-year-old protein to prove it. Critics rejected those findings as statistical junk. How a femur sparked a new field of biology—and a scientific smackdown.
Photo: Christopher Griffith, T. rex photographed at the Natural History Museum of Los Angeles County

Sixty-eight million years ago, on a soggy marsh in what is now a desolate stretch of eastern Montana, a Tyrannosaurus rex died. In 2000 a team of paleontologists led by famed dinosaur hunter Jack Horner found it. These are scientific facts, as solid as the chunk of fossilized femur from that same T. rex that Horner gave to North Carolina State University paleontologist Mary Schweitzer in 2003. It was labeled sample MOR 1125.

Several facts concerning MOR 1125 are also beyond dispute: First, that a technician in Schweitzer's lab put a piece of the bone in a demineralizing bath to study its components but left it in longer than necessary; when she returned, all that remained was a pliable, fibrous substance. That Schweitzer, intrigued by this result, ground up and prepared another piece of the bone and sent it to John Asara, a mass spectrometry expert at Beth Israel Deaconess Medical Center and Harvard Medical School. That Asara treated the brown powder with an enzyme and injected it into a mass spectrometer the size of a washing machine, hoping to detect and sequence any T. rex proteins that had miraculously survived inside the bone. And finally, that the device purred and buzzed for an hour before spitting out data describing the molecular contents of the sample.

It was at this moment—when a fragment of 68 million-year-old dinosaur was rendered as strings of letters decipherable only by the most labyrinthine mathematical algorithms—that empirical certainty crumbled. What followed was a complex, contentious, and peculiarly modern scientific argument, one more about software and statistics than bones and pickaxes.

That argument began in earnest in April 2007, when Asara, Schweitzer, and several colleagues announced in the journal Science that the mass spectrometer had indeed uncovered seven preserved fragments of protein in MOR 1125. Five of those fragments closely matched sequences of collagen—the most common protein found in bones—from birds, specifically chickens.

The discovery generated international headlines ("Study: Tyrannosaurus Rex Basically a Big Chicken") as the first molecular confirmation of the long-theorized relationship between dinosaurs and birds. It was also the first-ever evidence that protein could survive even a million years, much less 68 million. The New York Times reported that the finding "opens the door for the first time to the exploration of molecular-level relationships of ancient, extinct animals." Some news outlets couldn't resist drawing parallels to a certain popular fictional tale. The research, suggested the UK Guardian, "also hints at the tantalizing prospect that scientists may one day be able to emulate Jurassic Park by cloning a dinosaur."

Before long, however, a distinctly human subplot emerged. Within 16 months, three separate rebuttals appeared, two in Science itself. Many researchers were skeptical of the quality of Asara's data and doubted that collagen could survive so long, even partially intact. "You're talking about something a hundred times older than anything ever sequenced," says Steven Salzberg, director of the Center for Bioinformatics and Computational Biology at the University of Maryland. "If you have extraordinary results, they require extraordinary evidence."

Asara and Schweitzer were forced to parry and retreat, admitting that statistical evidence for one of the protein fragments was too weak for them to claim they'd even found it. The pair's fiercest critic, a UC San Diego computational biologist named Pavel Pevzner, also questioned the other six fragments, demanding that Asara release all of his underlying data. In a caustic 2008 Science critique, he compared Asara to a boy who watches a monkey bang away randomly on a typewriter, sees it produce seven words, and "writes a paper called 'My Monkey Can Spell!'" Asara's findings, Pevzner told The Washington Post, were "a joke" that would make "serious evolutionary biologists laugh." Then things got contentious.

In many ways, the ongoing case of MOR 1125 exemplifies what can happen when the scientific process—a meticulous consensus built on a foundation of small findings, published in rigorously peer-reviewed journals—is interrupted by a headline-grabbing discovery. As one study catapults into the public sphere, careers and even entire scientific disciplines can come to hinge on its validity. This, then, is a story about what happens when the headlines fade and researchers are left to confirm or debunk the discovery of the week.

The battle over those T. rex proteins has spilled out into blogs and conferences, generating a cloud of public accusations—some more founded in science than others. It has also highlighted a real and growing tug-of-war between computational and traditional biological research, with debates that increasingly play out in databases and mathematical formulas. When findings are anchored to digital evidence instead of microscope slides, replicating another biologist's work starts to resemble recalculating a physicist's model. And without the public release of all experimental data, the peer reviews of even the leading scientific journals are rendered meaningless.

As the modern discipline of bioinformatics comes crashing into analog fields like paleontology, researchers are just beginning to grapple with questions that the dinosaur controversy inadvertently unearthed. And in the case of the disputed T. rex proteins, the answers may not be as they first appeared.

How a lab found chicken in a T. rex

Mass spectrometry has been used for decades to determine the molecular makeup of unidentified compounds. In recent years, mass spec machines have proliferated across scientific fields. Here's how a sample from a 68 million-year-old T. rex femur found in 2000 was analyzed to reveal the discovery of a lifetime.
— Venkat Srinivasan, Illustration: Peter Grundy

1) Extract the peptides. Mary Schweitzer's paleontology lab ground up a piece of the bone, prepared the sample chemically, and sent it to John Asara. Asara treated it with an enzyme to break any proteins into smaller molecules called peptides, which were then separated from one another.

2) Process the molecules. The separated peptides were sprayed into an instrument called a mass spectrometer, which weighed, sorted, and fragmented them. Each fragment was given a mathematical description called a spectrum. Asara's sample produced more than 48,000 such spectra.

3) Crunch the data. Because amino acids have unique weights, algorithms can be used to detect the sequence of amino acids, represented by letters, that make up each peptide. Asara then compared the sequences found in the T. rex sample to those of known, present-day animals.

4) Match the peptides. According to Asara's algorithm, seven peptides matched those found in other species, including chicken. Later, when the data was released, researchers using different algorithms found an eighth peptide with amino acids in a sequence typical of ostrich.
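
Step 3 can be illustrated with a minimal Python sketch. It is not Asara's software or any published algorithm: it assumes an idealized, noise-free ladder of fragment masses and simply reads amino acids off the gaps between neighboring masses. The function names and the tolerance are illustrative assumptions; the residue masses are standard monoisotopic values in daltons. One wrinkle shows up immediately: leucine (L) and isoleucine (I) weigh the same, so this approach cannot tell them apart.

```python
# Standard monoisotopic residue masses (daltons) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def prefix_ladder(peptide):
    """Return cumulative residue masses: an idealized fragment ladder.

    Real spectra are noisy and incomplete; this toy ladder is neither.
    """
    ladder = [0.0]
    for amino_acid in peptide:
        ladder.append(ladder[-1] + RESIDUE_MASS[amino_acid])
    return ladder


def read_ladder(ladder, tolerance=0.01):
    """Infer one residue per gap between consecutive ladder masses.

    Gaps that fit several residues (I and L) are reported as "I/L";
    gaps that fit none are reported as "?".
    """
    sequence = []
    for lighter, heavier in zip(ladder, ladder[1:]):
        gap = heavier - lighter
        fits = sorted(
            aa for aa, mass in RESIDUE_MASS.items()
            if abs(mass - gap) <= tolerance
        )
        sequence.append("/".join(fits) if fits else "?")
    return sequence


if __name__ == "__main__":
    # "GPAGL" is an arbitrary example string, not a peptide from the study.
    print(read_ladder(prefix_ladder("GPAGL")))
```

Production algorithms have to cope with missing peaks, noise, and chemical modifications, which is one reason different programs can disagree about the same spectrum.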

Pavel Pevzner couldn't care less about dinosaurs. What's important in this T. rex business, he tells me one day in his office at UCSD's Center for Algorithmic and Systems Biology, is the thorny mathematical puzzle that arises in the search for proteins. "Biology itself," he says matter-of-factly, "is now a computational science." Pevzner, a 50-something Russian native whose Strangelovian accent morphs his ths into zs, is tall and rugged, with a perpetual 5 o'clock shadow. He is known as one of the top thinkers in the world of bioinformatics, a man with unquestioned computational chops who views himself as a guardian of statistical rigor. "Pavel is a smart guy, but he kind of has … a style," one colleague told me. "He likes to stir the pot." A photo on the university's Web site shows Pevzner in full-on Western gear, complete with 10-gallon hat, a beer in one hand and a rifle in the other.

On this afternoon he is sporting a more typical academic costume of jeans and a blazer. But he seems to be feeling no less the sheriff. "In some areas absolutely fundamental to biology—for example, the sequencing of DNA—there are practically no biologists working in this," he says, only computational scientists. Pevzner specializes in developing algorithms to decode the proteins found in mass spectrometry research. The T. rex issue came to him when Science asked him to peer-review Asara's paper for publication. Even at first glance, he says, "it was clear that this paper was computationally illiterate."

Following his reasoning requires some understanding of how Asara's protein-detection experiments work. Proteins are chains of amino acids, common molecules known by single-letter names—P for proline, G for glycine, and so on. Schweitzer's biochemical tests on MOR 1125 had hinted that the sample contained amino acids. Asara, then, needed to do three things: detect chains of those amino acids, demonstrate that they were fragments of real proteins, and show that those fragments were organic remnants of the dinosaur itself.

An organism's proteome is the complete set of the proteins it contains. Think of it as a dictionary, a collection of words (proteins) made up of letters (amino acids). Now imagine finding a 68 million-year-old bag that appears to contain thousands of letters strung together in chains of varying lengths. That's MOR 1125. The purpose of mass spectrometry is to spell out those letter strings in order to compare fragments of words against the organism's protein dictionary.

To do that, the letter chains are first split into shorter segments called peptides, which are analyzed to determine their mass. The peptides are then sorted by weight and fragmented to reveal their constituent amino acid sequences, each of which is given a mathematical description called a spectrum. Software-based algorithms then determine the letter sequences of the peptides. There are several respected algorithms available to do this—including Pevzner's—and they can produce somewhat different results.

Once all the letters are identified and placed in sequence, the strings are compared against the dictionaries of different species. Because no T. rex proteins had ever been sequenced, Asara had to look for the closest matches in databases of modern animals.
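
The dictionary lookup can be sketched in a few lines of Python. Everything here is a toy: the sequences and species labels are invented for illustration, and the scoring (the best fraction of identical letters over every gap-free alignment) is a simplification, not the database search Asara actually ran.

```python
def best_identity(peptide, protein):
    """Best fraction of matching letters when sliding peptide along protein."""
    length = len(peptide)
    if length == 0 or len(protein) < length:
        return 0.0
    best = 0
    for start in range(len(protein) - length + 1):
        window = protein[start:start + length]
        same = sum(a == b for a, b in zip(peptide, window))
        best = max(best, same)
    return best / length


def rank_species(peptide, database):
    """Rank species by how well their protein matches the peptide."""
    scores = {name: best_identity(peptide, seq) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sequences, made up for this example only.
TOY_DATABASE = {
    "species_a": "GAPGPQGPAGERGEQ",
    "species_b": "GAPGPQGPSGERGDQ",
    "species_c": "GSPGAQGPTGDRGEN",
}

if __name__ == "__main__":
    for name, score in rank_species("GPQGPAGER", TOY_DATABASE):
        print(f"{name}: {score:.2f}")
```

A ranking like this says only which known dictionary is closest. It says nothing about whether the match could have arisen by chance, which is the question that came next.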

Asara's original paper asserted that the algorithm had identified seven peptides in MOR 1125. The spectra of five of those peptides aligned most closely with chicken collagen, followed by the collagen of frogs and newts. The implication—that T. rex was a closer relative of birds than of modern reptiles or amphibians—was just what paleontologists would have predicted.

When the paper landed in Pevzner's inbox, however, it contained the supporting spectra for only those seven peptides. Missing were the tens of thousands of "junk" spectra—strings of letters that Asara's machine had sequenced but couldn't match to anything in the database. Without them, it was impossible to know whether the peptides found in the T. rex sample matched chicken peptides out of mere chance. Asara's findings, Pevzner thus asserted, could be nothing more than statistical artifacts—random jumbles of letters that just happened to match words in the dictionary.
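
Pevzner's worry is at heart a multiple-comparisons problem, and a rough calculation makes it concrete. The sketch below assumes that each spectrum independently has some small probability of matching a database entry by pure chance. That probability is a hypothetical placeholder, not a figure from the study; only the size of the data set, 48,216 spectra, is the count reported when the data was eventually released.

```python
from math import comb


def expected_chance_matches(n_spectra, p_chance):
    """Expected number of spurious matches if every spectrum is noise."""
    return n_spectra * p_chance


def prob_at_least(n_spectra, p_chance, k):
    """Binomial probability of k or more chance matches among n_spectra."""
    below_k = sum(
        comb(n_spectra, i) * p_chance**i * (1 - p_chance) ** (n_spectra - i)
        for i in range(k)
    )
    return 1.0 - below_k


if __name__ == "__main__":
    n_spectra = 48_216   # spectra in the released data set
    p_chance = 1e-4      # ASSUMED per-spectrum chance-match rate
    print(expected_chance_matches(n_spectra, p_chance))
    print(prob_at_least(n_spectra, p_chance, 7))
```

Under that assumed rate of 1 in 10,000, a handful of spurious hits would be expected from the full set. Seeing only the seven winners, a reviewer has no way to estimate the real rate, which is why the unmatched spectra mattered.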

Pevzner strongly advised Science to reject the T. rex findings. But other reviewers—who remain anonymous—disagreed, and the paper was published. As the headlines rolled out, Pevzner expanded on his criticisms in an article of his own. Science rejected it.

Over the next year, however, other papers critical of Asara's and Schweitzer's work did appear. Sensing an opening, Pevzner resubmitted his own paper to Science, which published it in August 2008. The article lambasted Asara for failing to compute statistical significance values and again demanded that he release the junk spectra. "It is now the turn of the mass spectrometry community," Pevzner concluded, "to question whether the monkey can actually spell."

Meanwhile, the critics carried their attacks into blog postings and comment sections, and then into the press. In some articles, Asara's findings were mentioned alongside an infamous 1994 paper that claimed to have recovered dinosaur DNA, a result later debunked as lab contamination by, among others, Schweitzer. Asara's work—and the entire discovery—appeared increasingly beleaguered. "I knew the reception that this stuff was going to get," Schweitzer says. "I think it's been kind of hard on him."

When Asara refused to release the spectra, he planted himself firmly on one side of a battle over transparency. Scientific journals, as a rule, require that published experimental findings include enough information to allow other researchers to reproduce the results. Traditionally, though, other details can be kept tucked away in lab notebooks, to be mined for further publishable nuggets.

When an experiment relies entirely on statistical data, however, reproducing it in full requires the equivalent of everything in the lab notebook. The oldest branch of bioinformatics, genomics, settled the issue of data disclosure years ago, and today DNA sequencing data is generally released in full when—and sometimes even before—a paper is published. The newer field of proteomics is still a kind of scientific Wild West, but open data advocates argue that publishing the underlying data is just as crucial.

In practice, that ideal runs into the realities of the scientific job market. Researchers depend largely on publication to maintain their funding and academic standing. Releasing mass spec data before scouring it for every potential discovery, Asara complained, would have let others scoop up publishable findings.

To which open-data advocates had a simple answer: tough luck. Much of the research is publicly funded, and the only reason to sit on data is a selfish one.

In the fall of 2008, Asara relented. "I have learned from this process that transparency is always the best policy," he conceded in an online back-and-forth with Pevzner. With that, he posted all 48,216 spectra without restrictions in an online database. "We have nothing to hide," he told me at the time.

Within two weeks, a pair of scientists on the opposite coast turned Asara's own data against him. Martin McIntosh, a proteomics expert at Seattle's Fred Hutchinson Cancer Research Center, and computational biologist Matthew Fitzgibbon downloaded the spectra. When they ran their own set of algorithms, they turned up an unexpected twist: an eighth peptide, one that hadn't appeared in any of Asara's papers. And it yielded a match—not to collagen, but to a hemoglobin peptide found in ostriches.

That finding rang a bell. The pair remembered that Asara's lab had once done a project involving ostrich proteins, giving them an alternate story that could explain Asara's findings: After completing his previous work, they suggested, Asara hadn't managed to scrub all the ostrich molecules out of his equipment. When he then sequenced the T. rex sample, he had used some test tube or dropper or machine contaminated with an infinitesimal amount of ostrich protein. Of course the peptides Asara found matched up well with chicken—because they were from another bird.

McIntosh was careful to remain circumspect about the discovery, which, he told me in November, he had submitted for publication: "It just means that there is another parsimonious explanation"—a scientific term for the simplest explanation for a given set of facts. "The positive note is, we couldn't have done this without the data provided by Asara." But he suggested that Asara, in private conversations, was hurting his case by questioning his critics' motives. "We are not trying to get famous on this," McIntosh said. "You know that expression 'dramatic claims require dramatic evidence'?"

I did.

"With a lot of things in science, there is not necessarily anything objective that tells you this is the right answer." It was, he said, more like convincing a jury beyond a reasonable doubt—and here was a piece of evidence casting serious doubt.

On a balmy February afternoon, Pavel Pevzner steps onto a ballroom stage at the San Diego Westin before an audience of fellow scientists at the annual conference of the US Human Proteome Organization. For two years now, he and other critics have been chipping away at the T. rex protein discovery. Two researchers even published a paper asserting that the proteins were actually from a bacterial biofilm. Schweitzer countered that charge convincingly, but it still added to the thick cloud of doubt surrounding the research. In that context, Pevzner's topic—"Mass Spectrometry of T. rex: Treasure Trove of Ancient Proteins or Contamination/Statistical Artifacts?"—has the feel of a final demolition. Pevzner made it clear to me, two weeks prior to taking the stage, that he still thought of Asara's work as "speculative science."

The two researchers, in fact, have been circling each other all day, like kids on a junior high playground. "We're not exactly on a friendly basis," Asara tells me that morning. "But if I see him, I'll say hi, of course." He professes no intention of attending Pevzner's talk or any need to defend himself against whatever computational grenades the Russian is preparing to lob. "The last thing I need is to listen to someone who clearly has a biased view of the data," he had emailed a few weeks earlier.

"I'd like to see him, but I haven't," Pevzner tells me shortly before his talk. It's a strange comment, considering that I just saw him an hour earlier outside a room where Asara was manning a poster presentation of his research. Then, as Pevzner steps up to the stage, Asara slips in and takes a seat.

The T. rex controversy, Pevzner begins, offers "an excuse to discuss the arguably more important topic of statistical significance." He recaps the arguments of his Science article, taking apart the statistical significance of several of Asara's peptides on a giant screen. Asara's original paper, he emphasizes, had contained "no statistical analysis."

A few seats over from me, Asara listens quizzically, one leg propped casually over the other. But at the end of his outstretched arm, a finger nervously taps out a beat on a chair between us.

Two of the peptide identifications, Pevzner says, do look "reasonable," perhaps implying that "there are indeed T. rex collagen peptides in this sample." But then he pulls his trump card: McIntosh and Fitzgibbon's hemoglobin finding, the results of which have not been published but which McIntosh sent to Pevzner. The work yields an alternate hypothesis, Pevzner announces: ostrich contamination—perhaps suggesting that Asara's paper "ought to be withdrawn."

Biology is squishy, Pevzner knows, but numbers are firm, and he believes he's got the computational goods. The hemoglobin can only be from T. rex if you combine the astronomically unlikely possibility that T. rex collagen survived for 68 million years with the equally unlikely survival of hemoglobin. Which raises the question, Pevzner says, of whether "T. rex did indeed taste like chicken. Or maybe like beef?" The crowd chuckles. Asara smiles tightly.

Pevzner concludes that there is a simple choice: "We should either side with Asara et al., and join their claim that they found ostrich hemoglobin peptide from T. rex that was well preserved over 68 million years," he says, "or we should side with Martin's group, who claim it is contamination. Let's take a poll: Who thinks that the hemoglobin is actually T. rex hemoglobin?"

Not a single hand goes up.

Extraordinary claims require extraordinary evidence. Carl Sagan popularized that mantra, and it has served scientific skeptics, and science itself, well. The discovery of 68 million-year-old collagen and hemoglobin fragments in a dinosaur bone is clearly an extraordinary claim. Which leaves us with this question: Who gets to decide what constitutes extraordinary evidence?

Over lunch one day at the conference, I finally sit down with Asara after months of trying to arrange an interview. At 36, he is stocky and pale, with black hair combed straight back into a pile atop his head. Over email he had often sounded besieged and irritated—"if you read our responses, the answer should be quite clear," he curtly replied to my first inquiry about the controversy. In person, however, he is different: open rather than defensive, cheerfully optimistic instead of brusque.

Most of his research—on how cancer cells signal each other—is far removed from dinosaurs, he says. But he concedes that the T. rex finding "makes me a name that people recognize." And the evidence he proceeds to lay out for the discovery casts an entirely new light on MOR 1125.

First, he points out that he had used several standard mathematical techniques to reinforce the identification of the collagen peptides in his original paper. Still, Pevzner's original complaint, he says, "made us realize we should be more careful" with computational results. So he asked the author of a different algorithm—one favored for its conservative approach to matching peptides—to rerun the data independently. The results matched the original collagen spectra exactly.

Indeed, in granting the statistical significance of even two peptides, Pevzner was abandoning his original contention—that the proteins were mere statistical artifacts. After all, both criticisms can't be true: If you say the peptides result from contamination, you can't also argue that they're mere ghosts in the numbers. "I think we can reject the army of monkeys scenario," agrees Marshall Bern, a computer scientist at PARC, who like Pevzner writes mass spectrometry algorithms and who ran Asara's fully released data through his own algorithm.

Pevzner, when I call him later, concedes as much. "After the spectra were released, it became clear that at least two are reasonable quality spectra," he says. "The new argument came in, and this is contamination."

So sample MOR 1125 unequivocally contains some proteins. But are they from a T. rex or from an ostrich? For starters, Asara says, the hemoglobin peptide matches more than 30 birds, which suggests that McIntosh picked ostrich because he knew of Asara's previous work with that species.

What's more, Asara conducted his ostrich and T. rex experiments a year and a half apart, separated by roughly 1,500 mass spectrometry runs. According to Asara, none of those spectra, nor samples of the soil surrounding the fossils, nor his daily control runs—in which he sequences known solutions to check for contaminants—turned up any ostrich hemoglobin. Also, the ostrich that Asara had sequenced hadn't even produced the particular hemoglobin sequence McIntosh matched. And Science had actually rejected McIntosh's ostrich paper after receiving Asara's response.

Schweitzer, meanwhile, had published several articles reporting evidence of collagen in MOR 1125 obtained using traditional biological techniques. That work had been done in her own lab, on samples never sent to Asara. The pair also collaborated on an identical study of several-hundred-thousand-year-old mastodon proteins—without contamination or criticism.

When I ask McIntosh after the conference how he explains away this evidence, he says, "It's routine that you run a bunch of samples and only one of them is contaminated." The burden of proof lies with Asara, he contends. McIntosh maintains, too, that a certain chemical modification on the hemoglobin makes it more likely to be contamination (to which, of course, Asara offers a rebuttal).

Over lunch, Asara asserts that between his work and Schweitzer's, they have answered the critics. "This is biology we're doing here; it's not just computational analysis," he concludes, between bites of a BLT. "This is a story about protein preservation. When you look at all the validations we did, how can we make the story more convincing?"

Well, there is one way. In early May, a new paper by Asara and Schweitzer—together with more than a dozen coauthors—appeared in Science. In it, the team has replicated their protein experiments on MOR 2598, a bone fragment from an 80 million-year-old hadrosaur, an entirely different species, dug up in a different part of Montana in 2007.

This time, they have used even more rigorous controls, handling the fossils with sterile instruments from the beginning of the excavation. They have replicated both Schweitzer's biochemical results (which show evidence of degraded cells and blood vessels) and Asara's mass spec data (which reveal eight collagen peptides) in independent labs. Asara himself used a mass spec machine with much higher resolution and adhered to Pevzner's demands for rigorous statistical analysis. Once again, the ancient protein fragments have lined up with bird collagen. But they lined up most closely with something else: the T. rex peptides reported two years ago by Asara.

McIntosh declares himself swayed, though still circumspect. "It's a nice bit of work," he tells me. "I think they've been doing a good job of shutting the door. Whether the door is truly locked or not, I don't know." Some other explanation could potentially win out over time. But the hemoglobin-based ostrich contamination hypothesis, he says, "doesn't really bear on what they're trying to prove here."

Pevzner, characteristically, is still playing the sheriff. "I'm glad that Asara called the previous criticism appropriate," he says. "I had a commentary that their analysis was unprofessional; they agreed with this. I had a commentary that this work couldn't be evaluated unless they release the data; they agreed with that."

He maintains that Asara and his colleagues have erected a "wall of silence" around the issue of McIntosh's hemoglobin peptide discovery, which goes unmentioned in the new paper. "This is much bigger news than the collagen," he says. And the researchers are keeping it quiet, he adds, precisely because it is so extraordinary as to cast doubt on their conclusions.

It's a bold claim, but one that McIntosh himself swats down. Since the hemoglobin finding was not published, he points out, it essentially remains a scientific rumor—not a solid theory that demands addressing. Now, to be convincing, Asara's critics are the ones who need evidence to back their alternate hypotheses. "It's up to them to demonstrate it," McIntosh says.

Asara and Schweitzer, in other words, have done just what the critics asked. They've built a rigorous scientific case for the survival of 68 million-year-old proteins from a beast that animates children's imaginations. If it continues to hold up, it is research worthy of its international fanfare. The slow, grinding process of science, freed from the headlines, is working just as it's supposed to.

The one lesson that all sides of the debate now agree on is that the new age of computational biology must be one of data transparency. Such disputes can only be resolved—and the scientific method can only survive the digital age—if scientists dump their digital notebooks online for anyone to try to replicate. And in that sense, Pevzner has been right from the beginning.

Indeed, when Science published the new paper in early May—the one that Asara knew would silence many of his critics—he made a special arrangement to release the entire data set online the same day. Extraordinary claims, as they say, require extraordinary evidence.