Gene microarrays are a kind of modern art. Thousands of colored squares make up
the grids of microarrays, each square representing the activity of one
gene under certain circumstances. A single microarray is a mere snapshot
of activity, and its patterns of red, green, and black are essentially
meaningless. But when collections of microarrays are assembled and analyzed,
as happened in a recent breast cancer study, the result can be new classifications
of disease and an appreciation for the nuances of biology.

A 23,000-gene microarray used in the David Botstein-Patrick
Brown laboratory at Stanford University School of Medicine (detail). Courtesy Charles M. Perou

Researchers at Stanford University School of Medicine in California have
recently found distinguishable differences in gene expression in a sample
of 40 breast cancer tumors. Based on the activity of 8,102 genes, the
researchers characterized what appear to be at least four subgroups of
the disease in the sample population. The researchers are now trying to
determine whether each subgroup is associated with a particular disease
outcome.

"Having profiled all the tumors, we can now go back and identify certain
types that are predictive of biological or clinical behavior," says Charles
M. Perou, of the Stanford laboratory jointly headed by David Botstein
and Patrick Brown. The group investigates a number of tumor types, including
breast, prostate, liver, and lung. "Our work is premised on the belief
that there are clear differences in gene expression within tumors of a
specific type," says Perou.

Two previous gene microarray studies have reported new classifications
of cancersin one case lymphoma and the other melanoma. Subjects
were grouped into two categories based on a mathematical analysis of tumor
gene activity. Like the breast cancer findings, the research appeared
this year in Nature.

Hierarchical-clustering analysis
and data display of gene-expression patterns for a set of 80 human
tumor samples.
Each row represents a gene, and each column represents a tumor sample.
The behavior of a gene in the experiment is represented by the color
and intensity of the squares: red indicates above average expression;
black indicates average expression; and green indicates below average
expression. Courtesy Charles M. Perou

Analyzing microarray data involves sorting relatively large amounts of
information, which is done using algorithms. From the initial results,
the researchers identify genes of particular interest: those that
are expressed at higher or lower than normal levels. After more
algorithmic analysis, the researchers arrange the data into hierarchical
'clusters' that reveal patterns among groups of genes, allowing them to
begin the work of classifying tumors according to the new subsets of disease.
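
The two steps described above, filtering for genes that depart from normal levels and then grouping samples hierarchically, can be sketched on a toy data set. This is only an illustration under stated assumptions: the article does not say which algorithm, distance measure, or cutoff the Stanford group used, so the average-linkage clustering, the Pearson-correlation distance, the threshold of 1.0, and every gene name, sample name, and value below are hypothetical.

```python
from math import sqrt

# Toy log-ratio values, invented for illustration: rows are genes, columns are samples.
# Positive = above-average expression (red); negative = below average (green).
samples = ["T1_before", "T1_after", "T2", "T3", "T4"]
genes = {
    "gene_a": [2.1, 1.8, 1.9, -1.7, -2.0],
    "gene_b": [1.5, 1.9, 1.6, -1.2, -1.6],
    "gene_c": [-1.8, -2.2, -1.7, 1.9, 2.3],
    "gene_d": [0.1, -0.1, 0.0, 0.1, -0.1],  # near average everywhere
    "gene_e": [-1.1, -1.4, -1.5, 1.6, 1.2],
}


def filter_genes(matrix, threshold=1.0):
    """Keep genes whose expression departs from average (0) by at least
    `threshold` in at least one sample. The threshold is an assumption."""
    return {g: row for g, row in matrix.items()
            if max(abs(v) for v in row) >= threshold}


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cluster_samples(columns, n_clusters):
    """Agglomerative clustering: start with one cluster per sample and
    repeatedly merge the two closest clusters (average linkage, with
    distance = 1 - Pearson correlation) until n_clusters remain."""
    clusters = [[name] for name in columns]

    def distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: distance(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


kept = filter_genes(genes)
# Turn the gene-by-sample table into one expression profile per sample.
columns = {s: [row[i] for row in kept.values()] for i, s in enumerate(samples)}
clusters = cluster_samples(columns, 2)
print(clusters)
```

In this toy example the uninformative gene is dropped, and the two samples labeled as coming from the same tumor end up in the same group. Real analyses involve thousands of genes and typically rely on established statistical libraries rather than hand-written loops.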

The researchers sampled twenty of the breast cancer tumors twice, once
before and once after a 16-week course of chemotherapy. The patterns of
gene activity in microarrays from the same individual were almost always
more similar to each other than either was to a sample from any other individual. Still, among
the entire sample were four distinct tumor types that no one had previously
reported.

The breast tumors in the sample are basically indistinguishable during
a clinical exam or under the microscope, according to Perou. Previous
genetic studies also failed to reveal that certain groups of genes play
an important role in tumor development. Genetic studies typically focus
on a single gene or several genes, not hundreds or even thousands. "When
you look at one gene at a time, you can't see relationships between genes
and groups of genes," says Perou, adding: "the more samples, the finer
the distinctions."

Toward personalized medicine

Preliminary data suggest that one type of tumor, derived primarily
from breast basal epithelial cells, may be associated with a very
poor prognosis, according to Perou. If confirmed, this information would
be critical for the treatment of this type of tumor. Indeed, gene array
technologies have generated so much interest in part because they seem
to promise more precise diagnoses, which might allow doctors and patients
to 'personalize' the treatment. "As we begin to individualize the therapy
based on the type of tumor, I suspect many treatments will prove more
effective than we now realize," says Perou.

Experimental microarrays are also being used to spot gene activity associated
with metastasisthe spread of tumor cells into previously unaffected
tissue. For example, researchers at the Whitehead Institute for Biomedical
Research and the Massachusetts Institute of Technology recently used arrays
to identify genes that are more highly expressed in metastatic mouse and
human melanoma cells compared with their non-metastatic counterparts.

For all the information gene microarrays provide, they reveal relatively
little about proteins, the molecules that carry out most of the functions
of a cell. Gene arrays detect the presence of messenger RNA, the intermediate
molecule that carries a gene's instructions from DNA to the cell's
protein-making machinery. Tracking this middle step in
the production process reveals nothing about three areas of interest to
researchers: protein function, the abundance of protein in a cell, and
modifications to proteins after they are produced, changes that may
be critical in the development of disease.

Solving technical problems

"If you really want to know what's going on in a cell, you have to look
at the molecules responsible for cellular functions, rather than intermediates
in the process," says Gavin MacBeath of the Center for Genomics Research
at Harvard University in Cambridge, MA. His laboratory is developing microarray
technologies for studying protein function and screening large numbers
of protein-protein interactions. Many academic and industry researchers
have tried in recent years to solve the technical problems that are delaying
the development of functional protein arrays.

The technology lags behind gene arrays in part because proteins are naturally
uncooperative. "DNA is very well behaved, and there are powerful ways
to amplify the chemical," says MacBeath. "Any DNA will work for a common
set of conditions, whereas proteins are very different from each other,
and some proteins are more stable than others." Another obstacle to protein
arrays is the difficulty of immobilizing proteins on slides while preserving
function (often the whole point of developing the array is to study function).

Robots spot 10,000 proteins on a slide

MacBeath and Stuart L. Schreiber, also of Harvard University, have worked
out some of the problems. In a recent issue of Science, the researchers
describe the construction of glass slides densely arrayed with proteins
for functional studies. Borrowing a technology from gene microarrays,
the researchers used a high-precision robot to print more than 10,000
protein samples on a surface about half the size of a microscope slide
(1,600 spots per square centimeter).

Previous studies have reported versions of a protein array, including
a kind of test tube array created at the University of Washington. The
Harvard team points out two distinguishing features of their project:
First, the method solves the problem of attaching proteins to a structure
without losing function. And second, the technology is relatively simple
to use and available to anyone. A purpose of the project, says MacBeath,
was to make the technology easily accessible and compatible with standard
instrumentation.

MacBeath's laboratory is using the array to study families of between 50
and 200 proteins in humans and yeast. The yeast proteome, he notes, includes
some 6,200 proteins. "As the technology improves, we'll go after the whole
thing," he says.