Life, Sugar-Coded

"She had a sweet child-fancy that her playmates understood her language
as she did theirs, and that birds, flowers, animals, and insects felt for
her the same affection which she felt for them."

Louisa May Alcott, from A Modern Cinderella

Glycomicsthe systematic identification and characterization of
all the carbohydrates (sugar) chains used by organismsis a young
and relatively undeveloped field when compared to genomics and proteomics.
And this comparison has caused glycomics to be likened to a scientific
Cinderella: she's hard working, but put upon, and has to stay home while
her big sisters genomics and proteomics go to the ball.

Of course, the fairy-tale analogy is not so much to say that the genome
and the proteome are evil stepsisters as to point out that, like Cinderella,
the glycome has much beauty and grace ripe for discovery.

"Carbohydrates carry information just like DNA and proteins do," says
James Paulson, who is a professor in The Scripps Research Institute (TSRI)
Department of Molecular Biology. "But glycomics is a field that has lagged
behind progress in these other[s] by about 20 years."

The complex, branched structures of carbohydrates have contributed to this
lag by slowing the development of efficient, routine methods for structural
analysis and synthesis, which are key to unraveling the information that
carbohydrates carry and the biological functions they mediate. As a result,
progress in elucidating the functions of information-carrying carbohydrates
has been slow.

Now, to rectify this, a large grant to study the functions of carbohydrates,
a field that Paulson calls "functional glycomics," is bringing together
a consortium of some 50 independently funded researchers at nearly 40
different institutions around the world, including several here in San
Diego.

The National Institute of General Medical Sciences, which supports basic
biomedical research, has awarded a multi-year grant to TSRI. The grant
carries a five-year package of $34 million, including $7.4 million for
the first year.

"It was [the National Institute of General Medical Sciences'] idea to
glue together the work of independently funded investigators," says Paulson,
"to provide the funds to accelerate the rate of research for [this] important
field." The beneficiaries are the scientific and biomedical community
as a whole, and, ultimately, the public as discoveries are translated
into treatments for disease and improved health. Information generated
by the consortium will be rapidly disseminated to participants and the
public alike through web-based databases.

"The goal of the Consortium for Functional Glycomics is to understand the
paradigms by which carbohydrate-binding proteins mediate cell function
through recognition of their carbohydrate ligands," he adds. "We know enough
to say that carbohydrates can carry zip code-like addresses to aid the
proper trafficking of cells in the body, and that carbohydrates can modulate
signaling from the outside of a cell to the inside, but what we know so
far is just the tip of the iceberg."

The Third Alphabet

Carbohydrate structures are very much a part of the language of life.
They are like the accents on spoken words: they change the meaning
without changing the spelling.

Some even call carbohydrates the third alphabet, behind DNA and proteins.
Though they are not charged with storing genetic information like DNA
or acting as enzymatic workhorses like proteins, carbohydrates nevertheless
do carry information and are responsible for important biological functions,
playing a central role in many types of intercellular communication events,
particularly in the immune system.

"This program is looking at the third alphabet," says Paulson. "It's
a smaller alphabet than the genome or proteome but is nevertheless critically
important to understanding many aspects of biology."

Carbohydrates are particularly important in the immune system because
all cells, foreign or human, are covered with them. Some viruses, like
HIV and influenza, use sugars on the outside of human cells to gain entry,
and immune system cells use carbohydrate-binding proteins to detect subtle
differences in sugar structures on the surface of cells to recognize foreign
pathogens. Sometimes the carbohydrate-binding proteins and their sugar
ligands are expressed on the same cell, and the sugar is part of the regulation
machinery of the cell. Indeed, the major histocompatibility complex, which
is responsible for the recognition, is composed almost entirely of glycosylated
proteins.

Moreover, sugar structures differ among cells and are regulated in development
and differentiation. "And, the differences are important," adds Paulson.
"If the right sugars are not there, the biology is altered."

For instance, one of the participating investigators, Jamey Marth at
the University of California, San Diego (UCSD), has shown that the absence
of a single carbohydrate expressed in T cells and displayed on their surfaces
will not affect CD4+ helper T cells, but will cause CD8+ killer T cells
to die prematurely.

Paulson was working as a postdoctoral fellow when he was struck by the
extraordinary specificity of the carbohydrate synthetases. As an assistant
professor, he began using enzymes to study the biochemistry of sugars
and their synthesis and, eventually, using sugars as a probe of biological
function. For years he worked for a San Diego biotech company as a senior
executive and researcher. While there, Paulson and others showed that
neutrophils and certain other white blood cells require a specific sugar
structure called sialyl-Lewis X to traffic normally to sites of inflammation
and lymph nodes. These sugars cause the cells to adhere to and roll on
cells lining the walls of a blood vessel that have produced complementary
binding proteins to 'recruit' the cells to that tissue. Once slowed down,
they stop and squeeze through the cells of the vessel wall into the surrounding
tissue.

In recent years, he has been studying the role of carbohydrates recognized
by another family of carbohydrate-binding proteins, examining how the
binding to carbohydrates modifies a cell's function.

In particular, he is studying CD22, one of the membrane-spanning "accessory"
proteins of the B cell receptor that recognizes a carbohydrate also expressed
on B cells. Previously, he had cloned a gene for the enzyme that makes
the carbohydrate recognized by CD22. Deleting this gene in mice, work done
with Marth, a collaborator on this project, resulted in immuno-suppressed
B cells, indicating that the sugar made by the enzyme is a negative
regulator of B cell receptor signaling.

"What we think is happening is that when [CD22] binds the ligand, it
is binding to glycoproteins that sequester it from the B cell receptor
complex," says Paulson. "When the ligand is not there, it is free to associate
and [CD22] exerts its maximum negative regulatory effect to dampen the
immune response."

The mechanism has direct implications for treatment of autoimmune disease
and inflammation, but understanding it fully is hampered by the difficulty
of the research, which spans disciplines from organic synthesis to pure
biochemistry to cell imaging to whole animal models. Such obstacles are
common in this field, and recognizing them was the impetus for Paulson and
other members of the consortium to consider submitting the grant
application.

The Sweet Smell of Success

The stated purpose of the grant is to define paradigms by which
protein-carbohydrate interactions mediate cell communication. More
specifically, the focus
is on the regulation of glycans displayed on the surfaces of cells and
the structure, function, and regulation of the four major families of
carbohydrate-binding proteins that mediate biological events. "[At the
moment], very little is known about the structure of carbohydrates on
a given cell and about how sugar constellations differ from cell to cell,"
says Paulson.

The program includes hypothesis-driven research of the participating
investigators and scientific cores funded by the grant that provide essential
resources for conducting research, a platform of information about carbohydrate
structures and their interaction with carbohydrate-binding proteins, and
an infrastructure of bioinformatics and databases to facilitate sharing
of data both within the consortium and with the public.

Information generated in the scientific cores will be accessible within
six weeks, and information deposited by participating investigators will
be released as soon as it is made public.

Any investigator with an existing grant in the area of carbohydrates
in cell communication can apply for membership in the consortium. Membership
entitles one to resources produced by the scientific cores funded by the
grant, and to view unpublished data deposited by other participating
investigators under the terms of a blanket non-disclosure agreement.
Investigators, in turn, commit to providing data back to the consortium.

The Cores

The carbohydrate synthesis and protein expression core will provide
a library of synthetic carbohydrates and other reagents to be used by
the other cores and by investigators. The investigators should benefit
greatly from this resource because one of the main bottlenecks holding
the field back has been the lack of commercially available carbohydrates
and the difficulty, particularly for biologists without extensive organic
synthesis training, of synthesizing these structures.

An analytical core located at UCSD will perform analyses of carbohydrate
structures. This serves a crucial purpose because, due to the complexity
of a branched glycan structure, systematic sequence analysis relies on
highly specialized methods, such as high-field mass spectrometry and
nuclear magnetic resonance, that involve stripping sugars off cells and
subjecting them to reliable but tedious separations and individual
identifications.
The application of these techniques is still not ready for high throughput,
but "that's something that I'm hoping this program will have an effect
on," says Paulson.

A gene microarray core located at TSRI will produce oligonucleotide
microarrays of human and murine carbohydrate-binding protein and glycosyltransferase
genes.

And a murine genetics core at TSRI and phenotype core at UCSD will generate
up to ten transgenic strains per year with altered carbohydrate-binding
protein and glycosyltransferase genes that will be phenotyped using all
the standard behavioral and biochemical assays. Once these models are
developed, they will be made publicly available.

A protein-carbohydrate interaction core at the University of Oklahoma
will determine the specificity and affinity of protein-carbohydrate interactions.

Finally, a bioinformatics core located at the Massachusetts Institute
of Technology (M.I.T.) will contain a central database.

The Table is on the Sugar

The database will allow participants to search and link information to
speed up some research activities of the cores and participating investigators.
For instance, for the analytical core, when the sequencing is linked to
the structure database, mapping the sugars on cell surfaces may become
easier, since about 90 percent of carbohydrates that are routinely found
on cells are likely to be common structures that are already identified.
Running a sugar "fingerprint" of a cell through a database could quickly
pinpoint new and interesting structures that need further characterization.
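
As a rough illustration of the fingerprint lookup described above, the
following Python sketch filters a cell's observed structures against a set
of already-identified ones and returns only the unfamiliar ones. The
function name, the data, and the placeholder labels are all hypothetical;
this is not the consortium's actual database or query interface, which the
article does not describe.

```python
def novel_structures(fingerprint, known_structures):
    """Return structures from the fingerprint that are not already known.

    Order of first appearance is preserved and duplicates are dropped.
    """
    known = set(known_structures)
    seen = set()
    novel = []
    for structure in fingerprint:
        if structure not in known and structure not in seen:
            seen.add(structure)
            novel.append(structure)
    return novel


# Hypothetical example data: labels are placeholders, not real database entries.
known = {"sialyl-Lewis X", "structure-A", "structure-B"}
fingerprint = [
    "structure-A",
    "sialyl-Lewis X",
    "unidentified-1",
    "structure-B",
    "unidentified-1",
]

print(novel_structures(fingerprint, known))
```

If most structures in a fingerprint are already catalogued, as the article
suggests, a lookup of this kind would leave only a short list of candidates
for detailed characterization.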

The linking of all types of data is another advantage. Synthesis data,
links to commercially available materials, links to publications, analysis
spectra of carbohydrates, and other information will eventually be linked
to genetic data on the four major families of carbohydrate-binding proteins
from all major organisms that have been sequenced, a database on the
families of glycosyltransferases, and a carbohydrate structure database
developed by Glycominds Ltd. as a contribution to the project.

Because the database servers are integrated and accessible through the
internet, data entry should be as user-friendly as data retrieval. Investigators
can effortlessly attach tables and figures to the database using forms
and templates that automatically tag the data appropriately as the database
builds and as the body of knowledge on carbohydrates and their related
binding proteins grows.

While having a complete picture of sugars in the body is still a long
way off, the availability of the database on the internet should benefit
the scientific community as a whole. The grant will also address the research
bottlenecks in functional glycomics, clearing the road to a more complete
understanding of the paradigms that have evolved to use information in
the glycome to mediate cell communication.