December 15, 2002 | Can you build an industry around 70 percent accuracy? Though never framed starkly at the ninth Chips to Hits meeting in Philadelphia in late October, the question was definitely in the air.

Several speakers conceded that microarrays -- whether supplied by established vendors or homegrown -- cannot produce data as reliably as the integrated circuit chips that powered the economy in the 1980s and '90s. Perhaps 30 percent of the data coming off microarrays is bad, they acknowledged.

This year's Chips to Hits conference attracted 1,500 attendees and two especially eminent bioinformatics experts: John Quackenbush and Mark Boguski. Both seemed intent on casting realistic expectations about the promise of microarrays.

Quackenbush, an investigator at The Institute for Genomic Research (TIGR), presided over a bioinformatics track at the meeting. Although he trained as a physicist, his philosophy can be summed up in the phrase: "It's the biology, stupid."

"Biology can't be forgotten," Quackenbush said. "At the end of the day, when we interpret the data, what we need to do is interpret fundamental questions in biology. The challenge, as I see it, is to take the data that comes from these proteomic approaches and put it into a biological context."

Quackenbush surveyed an impressive series of software tools developed at TIGR, including MEV, a Java-based viewer of microarray data with numerous built-in algorithms; Midas, a program to filter and normalize microarray data across different experiments; and Madam, another Java tool to load and retrieve microarray data from databases.

"When we build these databases, we understand their limitations," Quackenbush modestly told the attendees. "When you look at data, you have to realize all data you work with is limited in some way. Understanding those limitations is vital to understanding the final results."

Mark Boguski, a visiting investigator at the Fred Hutchinson Cancer Center in Seattle, struck the same cautionary note. Boguski warned against overexcitement about any new technology, whether traditional gene expression or even less-tested platforms: "Before we rush into new technologies like proteomics, we need to be thinking about experimental design and what we are building into these projects. We have a lot of data, but many times it is not collected under the right kinds of conditions."

Boguski's central point was that the problems lie not so much in the microarrays themselves as in their rushed or haphazard development. He repeated a warning he has issued often: "There have been many red herrings in gene expression profiling. We have no idea what are the underlying expression patterns in the population we're studying."

As a pathologist, Boguski is also worried about the quality of the tissue samples that are the source of microarray experiments. According to Boguski, what's crucial to understand is that the already tricky-to-reproduce procedures for gene expression can be further compromised by a variety of human and biological factors in hospitals.

"One of the first words you learn is autolysis," Boguski said, referring to the enzymatic destruction of cells in the time it can take to transfer tissues to the lab. "By the time you get a sample of tissue, those specimens are pretty useless for anything having to do with medical study." And yet vast research operations are gearing up to do just that.

Physicians who collect and handle tissue samples have legitimate clinical priorities besides getting them down to a research lab as quickly as possible. There are also concerns about whether enough biological material is available for microarray experiments to be repeated and their findings validated. "Usually, sample sizes are too small to offer powerful statistical conclusions," Boguski said. "The surgeons are very busy, so they write one sentence of the clinical history. You don't have the information on confounding factors that inform your analysis."

Perhaps aware of the microarray and genechip vendors in the audience, Boguski proposed a relatively mild remedy: "If we're really going to solve biological and technical problems, we are going to have to leverage the technology in a more effective way. We have plenty of data. We must pay more attention to study design and biological variability in the input material. We need to be aware of the right kind of tools to apply."

Sidebar: There’s Sand in Them Thar Test Tubes

Analyzing the genetic machinery of people, mice, rats and viruses is so ... last year. The next scientific frontier in microarray research is ... beaches. Yes, sand. In technical terms, doing gene expression analysis on what you lie on during your summer vacation.

Jacques Corbeil of the University of California, San Diego, addressed the Chips to Hits meeting and noted he has no difficulties whatsoever attracting volunteers to collect samples from coastal areas. "We are doing molecular fingerprinting of the beach and looking at the complex microenvironment," said Corbeil, an AIDS researcher who received one of the earliest Affymetrix scanners to be used in academia. It turns out that the microscopic life seething in any particular stretch of seashore has a unique signature.