The retina is responsible for capturing images from the visual field. Retinitis pigmentosa, which refers to a group of inherited diseases that cause retinal degeneration, causes a gradual decline in vision because retinal photoreceptor cells (rods and cones) die. Images on the left are courtesy of the National Eye Institute, NIH; image on the right is courtesy of Robert Fariss, Ph.D., and Ann Milam, Ph.D., National Eye Institute, NIH.

Metabolomics, the comprehensive evaluation of the products of cellular processes, can provide new findings and insight in a vast array of diseases and dysfunctions. Though promising, metabolomics lacks the standing of genomics or proteomics. It is, in a manner of speaking, the new kid on the “omics” block.

Even though metabolomics is still an emerging discipline, at least some quarters are giving it a warm welcome. For example, metabolomics is being advanced by the Common Fund, an initiative of the National Institutes of Health (NIH). The Common Fund has established six national metabolomics cores. In addition, individual agencies within NIH, such as the National Institute of Environmental Health Sciences (NIEHS), are releasing solicitations focused on growing more detailed metabolomics programs.

Whether metabolomic studies are undertaken with or without public support, they share certain characteristics and challenges. Untargeted or broad-spectrum studies are used for hypothesis generation, whereas targeted studies probe specific compounds or pathways. Reproducibility is a major challenge in the field; many studies cannot be reproduced in larger cohorts. Carefully defined guidance and standard operating procedures for sample collection and processing are needed.

While these challenges are being addressed, researchers are patiently amassing metabolomic insights in several areas, such as retinal diseases, neurodegenerative diseases, and autoimmune diseases. In addition, metabolomic sleuths are availing themselves of a growing selection of investigative tools.

A Metabolomic Eye on Retinal Degeneration

The retina has one of the highest metabolic activities of any tissue in the body and is composed of multiple cell types. This fact suggests that metabolomics might be helpful in understanding retinal degeneration. At least, that’s what occurred to Ellen Weiss, Ph.D., a professor of cell biology and physiology at the University of North Carolina School of Medicine at Chapel Hill. To explore this possibility, Dr. Weiss began collaborating with Susan Sumner, Ph.D., director of systems and translational sciences at RTI International.

Retinal degeneration is often studied through the use of genetic mouse models that mimic the disease in humans. In the model used by Dr. Weiss, cells with a disease-causing mutation are the major light-sensing cells that degenerate during the disease. Individuals with the same or a similar genetic mutation will initially lose dim-light vision and then, ultimately, bright-light vision and color vision.

Wild-type and mutant phenotypes, as well as dark- and light-raised animals, were compared, since retinal degeneration is exacerbated by light in this genetic model. Retinas were collected as early as day 18, prior to symptomatic disease, and analyzed. Although data analysis is ongoing, distinct differences have emerged between the phenotypes as well as between dark- and light-raised animals.

“There is a clear increase in oxidative stress in both light-raised groups but to a larger extent in the mutant phenotype,” reports Dr. Weiss. “There are global changes in metabolites that suggest mitochondrial dysfunction, and dramatic changes in lipid profiles. Now we need to understand how these metabolites are involved in this eye disease and the relevance of these perturbations.”

For example, the glial cells in the retina, which upregulate a number of proteins in response to stress in an attempt to save the retina, are as likely as the light-receptive neurons to undergo metabolic changes.

“One of the challenges in metabolomics studies is assigning the signals that represent the metabolites or compounds in the samples,” notes Dr. Sumner. “Signals may be ‘unknown unknowns,’ compounds that have never been identified before, or ‘known unknowns,’ compounds that are known but that have not yet been assigned in the biological matrix.”

Internal and external libraries, such as the Human Metabolome Database, are used to match signals. Whether or not a match exists, fragmentation patterns are used to characterize the metabolite, and when possible a standard is obtained to confirm identity. To assist with this process, the NIH Common Fund supports Metabolite Standard Synthesis Cores (MSSCs). RTI International holds an MSSC contract in addition to being an NIH-designated metabolomics core.

Mitochondrial Dysfunction in Alzheimer’s Disease

Alzheimer’s disease (AD) is difficult to diagnose early due to its asymptomatic phase; accurate diagnosis occurs only in postmortem brain tissue. To evaluate familial AD, a rare inherited form of the disease, the laboratory of Eugenia Trushina, Ph.D., associate professor of neurology and associate professor of pharmacology at the Mayo Clinic, uses mouse models to study the disease’s early molecular mechanisms.

Synaptic loss underlies cognitive dysfunction. The length of neurons dictates that mitochondria move within the cell to provide energy at the site of the synapses. An initial finding was that, very early on, mitochondrial trafficking was affected, reducing energy supply to synapses and distant parts of the cell.

During energy production, the major mitochondrial metabolite is ATP, but the organelle also produces many other metabolites, molecules that are implicated in many pathways. One can assume that changes in energy utilization, production, and delivery are associated with some disturbance.

“Our goal,” explains Dr. Trushina, “was to get a proof of concept that we could detect in the blood of AD patients early changes of mitochondria dysfunction or other changes that could be informative of the disease over time.”

A Mayo Clinic aging study involves a cohort of patients, from healthy to those with mild cognitive impairment (MCI) through AD. Patients undergo an annual battery of tests including cognitive function along with blood and cerebrospinal fluid sampling. Metabolic signatures in plasma and cerebrospinal fluid of normal versus various disease stages were compared, and affected mitochondrial and lipid pathways were identified in MCI patients who progressed to AD.

“Last year we published on a new compound that goes through the blood/brain barrier, gets into mitochondria, and very specifically, partially inhibits mitochondrial complex I activity, making the cell resistant to oxidative damage,” details Dr. Trushina. “The compound was able to either prevent or slow the disease in the animal familial models.

“Treatment not only reduced levels of amyloid plaques and phosphorylated tau, it also restored mitochondrial transport in neurons. Now we have additional compounds undergoing investigation for safety in humans, and target selectivity and engagement.”

“Mitochondria play a huge role in every aspect of our lives,” Dr. Trushina continues. “The discovery seems counterintuitive, but if mitochondria function is at the heart of AD, it may provide insight into the major sporadic form of the disease.”

Distinguishing Types of Asthma

In children, asthma generally manifests as allergy-induced asthma, or allergic asthma. And allergic asthma has commonalities with allergic dermatitis/eczema, food allergies, and allergic rhinitis. In adults, asthma is more heterogeneous, and distinct and varied subpopulations emerge. Some have nonallergic asthma; some have adult-onset asthma; and some have obesity-, occupational-, or exercise-induced asthma.

Adult asthmatics may have markers of TH2-high versus TH2-low asthma (T helper 2 cell cytokines), and they may respond to various triggers: environmental antigens, occupational antigens, irritants such as perfumes and chlorine, and seasonal allergens. Exercise, too, can trigger asthma.

One measure that can phenotype asthmatics is nitric oxide, an exhaled breath biomarker. Nitric oxide is a smooth muscle relaxant, vasodilator, and bronchodilator that can have anti-inflammatory properties. There is a wide range of values in asthmatics, and a number of values are needed to understand the trend in a particular patient. L-arginine is the amino acid that produces nitric oxide when converted to L-citrulline, a nonessential amino acid.

According to Nicholas Kenyon, M.D., a pulmonary and critical care specialist who is co-director of the University of California, Davis Asthma Network (UCAN), some metabolomic studies suggest that there is a state of L-arginine depletion during asthma attacks or in severe asthma, suggesting a lack of substrate to produce nitric oxide. Dr. Kenyon is conducting clinical work on L-arginine supplementation in a double-blind cross-over intervention trial of L-arginine versus placebo. The 50-subject study in severe asthmatics should be concluded in early 2017.

Many new biologic therapies are coming to market to treat asthma; it will be challenging to determine which advanced therapy to provide to which patient. These therapeutics, which include anti-IL-5 and (soon) anti-IL-13 agents, among others, mostly target severe asthma populations and are intended for patients with evidence of higher numbers of eosinophils in the blood and lung.

Tools Development

Waters is developing metabolomics applications that use multivariate statistical methods to highlight compounds of interest. Typically these applications combine separation procedures, accomplished by means of liquid chromatography or gas chromatography (LC or GC), with detection methods that rely on mass spectrometry (MS). To support the identification, quantification, and analysis of LC-MS data, the company provides bioinformatics software. For example, Progenesis QI software can interrogate publicly available databases and process information about isotopic patterns, retention times, and collision cross-sections.

Mass spectrometry (MS) is the gold standard in metabolomics and lipidomics. But there is a limit to what accurate mass and resolution can achieve. For example, neither isobaric nor isomeric species are resolvable solely by MS. New orthogonal analytical tools will allow more confident identifications.
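The limit on what accurate mass can achieve can be illustrated with a short calculation. The sketch below is only an illustration, not part of any vendor workflow: the function names are hypothetical, and the isotope masses are standard reference values rounded for brevity. It computes a monoisotopic mass from a molecular formula and shows that two isomers, leucine and isoleucine (both C6H13NO2), yield exactly the same value, so mass alone cannot tell them apart.

```python
import re

# Monoisotopic masses (daltons) of the most abundant isotopes, rounded.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C6H13NO2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def ppm_error(observed, theoretical):
    """Mass error in parts per million, as used when matching signals."""
    return (observed - theoretical) / theoretical * 1e6


# Leucine and isoleucine are isomers: same formula, different structure.
leucine = monoisotopic_mass("C6H13NO2")
isoleucine = monoisotopic_mass("C6H13NO2")
print(round(leucine, 4), leucine == isoleucine)
```

Because the two masses are identical, any mass tolerance, however tight, would match both compounds. That is why an orthogonal coordinate such as retention time or a fragmentation pattern is needed.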

To improve metabolomics separations before MS detection, a post-ionization separation tool such as ion mobility, which is currently used to support traditional UPLC-MS and MS imaging metabolomics protocols, becomes useful. The collision cross-section (CCS), which reflects the shape of a molecule, can be derived from ion-mobility measurements, and it can be used as an additional identification coordinate.

Other new chromatographic tools are under development, such as microflow devices and UltraPerformance Convergence Chromatography (UPC2), which uses liquid CO2 as its mobile phase, to enable new ways of separating chiral metabolites. Both UPC2 and microflow technologies have decreased solvent consumption and waste disposal while maintaining UPLC-quality performance in terms of chromatographic resolution, robustness, and reproducibility.

Informatics tools are also improving. In the latest versions of Waters’ Progenesis software, typical metabolomics identification problems are resolved by allowing interrogation of publicly available databases and scoring according to accurate mass, isotopic pattern, retention time, CCS, and either theoretical or experimental fragments.

MS imaging techniques, such as MALDI and DESI, provide spatial information about the metabolite composition in tissues. These approaches can be used to support and confirm traditional analyses without sample extraction, and they generate images similar to those of immunohistochemistry, but without the use of antibodies.

“Ion-mobility tools will soon be implemented for routine use, and the use of extended CCS databases will help with metabolite identification,” comments Giuseppe Astarita, Ph.D., principal scientist, Waters. “More applications of ambient ionization MS will emerge, and they will allow direct-sampling analyses at atmospheric pressure with little or no sample preparation, generating real-time molecular fingerprints that can be used to discriminate among phenotypes.”

Microflow Technology

Microflow technology offers sensitivity and robustness. For example, at the Proteomics and Metabolomics Facility, Colorado State University, peptide analysis was typically performed using nanoflow chromatography; however, nanoflow chromatography is slow and technically challenging. Moving to microflow offered significant improvements in robustness and ease-of-use and resulted in improved chromatography without sacrificing sensitivity.

Conversely, small molecule applications were typically performed with analytical-scale chromatography. While this flow regime is extremely robust and fast, it can sometimes be limited in sensitivity. Moving to microflow offered significant improvements in sensitivity, 5- to 10-fold depending on the compound, without sacrificing robustness.

But broad-scale microflow adoption is hampered by a lack of available column chemistries and legacy HPLC or UPLC infrastructure that is not conducive to low-flow operation.

Compound annotation, along with comparability and transparency in data processing and reporting, remains a challenge in metabolomics research. Multiple groups are actively working on developing new tools and strategies; common best practices need to be adopted.

The continued growth of open-source spectral databases and new tools for spectral prediction from compound databases will dramatically improve the ability of metabolomics to yield novel discoveries. The move to a systems-level understanding through the combination of various omics data also will have a huge influence and will be enabled by the continued development of open-source and user-friendly pathway-analysis tools.

Fusion detection can be carried out with traditional opposing primer-based library preparation methods, which require target- and fusion-specific primers that define the region to be sequenced. With these methods, primers are needed that flank the target region and the fusion partner, so only known fusions can be detected. An alternative method, ArcherDX’s Anchored Multiplex PCR (AMP), can be used to detect the target of interest, plus any known and unknown fusion partners. This is because AMP uses target-specific unidirectional primers along with reverse primers that hybridize to the sequencing adapter that is ligated to each fragment prior to amplification.

In time, the narrow, tortuous paths followed by pioneers become wider and straighter, whether the pioneers are looking to settle new land or bring new biomarkers to the clinic.

In the case of biomarkers, we’re still at the stage where pioneers need to consult guides and outfitters or, in modern parlance, consultants and technology providers. These hardy souls tend to congregate at events like the Biomarker Conference, which was held recently in San Diego.

At this event, biomarker experts discussed ways to avoid unfortunate detours on the trail from discovery and development to clinical application and regulatory approval. Of particular interest were topics such as the identification of accurate biomarkers, the explication of disease mechanisms, the stratification of patient groups, and the development of standard protocols and assay platforms. In each of these areas, presenters reported progress.

Another crucial subject is the integration of techniques such as next-generation sequencing (NGS). This particular technique has been instrumental in advancing clinical cancer genomics and continues to be the most feasible way of simultaneously interrogating multiple genes for driver mutations.

Enriching nucleic acid libraries for target genes of interest prior to NGS greatly enhances the sensitivity of detecting mutations, as the enriched regions are sequenced multiple times. This is particularly useful when analyzing clinical samples, which generate low amounts of poor-quality nucleic acids.

Most target-enrichment strategies require prior knowledge of both ends of the target region to be sequenced. Therefore, only gene fusions with known partners can be amplified for downstream NGS assays.

Archer’s Anchored Multiplex PCR (AMP™) technology overcomes this limitation, as it can enrich for novel fusions, while only requiring knowledge of one end of the fusion pair. At the heart of the AMP chemistry are unique Molecular Barcode (MBC) adapters, ligated to the 5′ ends of DNA fragments prior to amplification. The MBCs contain universal primer binding sites for PCR and a molecular barcode for identifying unique molecules. When combined with 3′ gene-specific primers, MBCs enable amplification of target regions with unknown 5′ ends.

“Tagging each molecule of input nucleic acid with a unique molecular barcode allows for de-duplication, error correction, and quantitative analysis, resulting in high sequencing consensus. With its low error rate and low limits of detection, AMP is revolutionizing the field of cancer genomics.”
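The de-duplication and error-correction idea described in the quote can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Archer's actual analysis pipeline: the function name and the toy reads are hypothetical, reads are grouped by barcode alone (real workflows typically also use mapping position), and a simple per-position majority vote stands in for error correction.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads):
    """Collapse (barcode, sequence) pairs into one consensus per barcode."""
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)

    consensus = {}
    for barcode, seqs in groups.items():
        length = min(len(s) for s in seqs)
        # Majority vote at each position removes isolated PCR or sequencing errors.
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
        )
    return consensus


# Hypothetical reads: three copies of one tagged molecule, one of another.
reads = [
    ("AACGT", "ACGTAC"),
    ("AACGT", "ACGTAC"),
    ("AACGT", "ACTTAC"),  # duplicate carrying a single error
    ("GGTCA", "ACGTAC"),
]
result = consensus_by_barcode(reads)
print(len(result), result["AACGT"])
```

Counting the number of distinct barcodes, rather than the number of reads, is what makes the analysis quantitative: four reads here represent only two original molecules.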

In a proof-of-concept study, a single-tube 23-plex panel was designed to amplify the kinase domains of ALK, RET, ROS1, and MUSK genes by AMP. This enrichment strategy enabled identification of gene fusions with multiple partners and alternative splicing events in lung cancer, thyroid cancer, and glioblastoma specimens by NGS.

Over the last decade, the Biomarker/Translational Research Laboratory has focused on developing clinical genotyping and fluorescent in situ hybridization (FISH) assays for rapid personalized genomic testing.

“Initially, we analyzed the most prevalent hotspot mutations, about 160 in 25 cancer genes,” continued Dr. Borger. “However, this approach revealed mutations in only half of our patients. With the advent of NGS, we are able to sequence 190 exons in 39 cancer genes and obtain significantly richer genetic fingerprints, finding genetic aberrations in 92% of our cancer patients.”

Using multiplexed approaches, Dr. Borger’s team within the larger Center for Integrated Diagnostics (CID) program at MGH has established a high-throughput genotyping service as an important component of routine care. While only a few susceptible molecular alterations may currently have a corresponding drug, the NGS-driven analysis may supply new information for inclusion of patients into ongoing clinical trials, or bank the result for future research and development.

“A significant impediment to discovery of clinically relevant genomic signatures is our current inability to interconnect the data,” explained Dr. Borger. “On the local level, we are striving to compile the data from clinical observations, including responses to therapy and genotyping. Globally, it is imperative that comprehensive public databases become available to the research community.”

This image, from the Massachusetts General Hospital Cancer Center, shows multicolor fluorescence in situ hybridization (FISH) analysis of cells from a patient with esophagogastric cancer. Remarkably, the FISH analysis revealed that co-amplification of the MET gene (red signal) and the EGFR gene (green signal) existed simultaneously in the same tumor cells. A chromosome 7 control probe is shown in blue.

Tumor profiling at MGH has already yielded significant discoveries. Dr. Borger’s lab, in collaboration with oncologists at the MGH Cancer Center, found significant correlations between mutations in the genes encoding the metabolic enzymes isocitrate dehydrogenase (IDH1 and IDH2) and certain types of cancers, such as cholangiocarcinoma and acute myelogenous leukemia (AML).

Historically, cancer signatures have largely focused on signaling proteins. Discovery of a correlative metabolic enzyme offered a promise of diagnostics based on metabolic byproducts that may be easily identified in blood. Indeed, the metabolite 2-hydroxyglutarate accumulates to high levels in the tissues of patients carrying IDH1 and IDH2 mutations. The researchers have reported that circulating 2-hydroxyglutarate, as measured in the blood, correlates with tumor burden and could serve as an important surrogate marker of treatment response. …..

“Traditionally, it has been hard to use standard methods to quantify the amount of tRNA in the cell,” says Tavazoie. The lead authors of the article, Hani Goodarzi, formerly a postdoc in the lab and now a new assistant professor at UCSF, and research assistant Hoang Nguyen, devised and applied a new method that utilizes state-of-the-art genomic sequencing technology to measure the amount of tRNAs in different cell types.

The team chose to compare breast tissue from healthy individuals with tumor samples taken from breast cancer patients–including both primary tumors that had not spread from the breast to other body sites, and highly aggressive, metastatic tumors.

They found that the levels of two specific tRNAs were significantly higher in metastatic cells and metastatic tumors than in primary tumors that did not metastasize or healthy samples. “There are four different ways to encode for the protein building block arginine,” explains Tavazoie. “Yet only one of those–the tRNA that recognizes the codon CGG–was associated with increased metastasis.”

The tRNA that recognizes the codon GAA and encodes for a building block known as glutamic acid was also elevated in metastatic samples.

The team hypothesized that the elevated levels of these tRNAs may in fact drive metastasis. Working in mouse models of primary, non-metastatic tumors, the researchers increased the production of the tRNAs, and found that these cells became much more invasive and metastatic.

They also did the inverse experiment, with the anticipated results: reducing the levels of these tRNAs in metastatic cells decreased the incidence of metastases in the animals.

How do two tRNAs drive metastasis? The researchers teamed up with members of the Rockefeller University proteomics facility to see how protein expression changes in cells with elevated levels of these two tRNAs.

“We found global increases in many dozens of genes,” says Tavazoie, “so we analyzed their sequences and found that the majority of them had significantly increased numbers of these two specific codons.”
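The kind of sequence analysis described here, counting how often specific codons occur in a gene's coding sequence, can be sketched as follows. This is a minimal illustration and not the researchers' method: the function name and the example sequence are hypothetical, and the sequence is assumed to start in the correct reading frame.

```python
def codon_counts(cds, codons=("GAA", "CGG")):
    """Count in-frame occurrences of the given codons in a coding sequence."""
    cds = cds.upper()
    counts = {codon: 0 for codon in codons}
    # Step through the sequence three bases at a time, ignoring a trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in counts:
            counts[codon] += 1
    return counts


# Hypothetical coding sequence: ATG GAA CGG GAA GAG TAA
example = "ATGGAACGGGAAGAGTAA"
print(codon_counts(example))
```

Note that GAG, which also encodes glutamic acid, is not counted. Only the exact codons read by the two tRNAs of interest contribute, which is the distinction the study turns on.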

According to the researchers, two genes stood out among the list. Known as EXOSC2 and GRIPAP1, these genes were strongly and directly induced by elevated levels of the specific glutamic acid tRNA.

“When we mutated the GAA codons to GAG – a ‘silent’ mutation because they both spell out the protein building block glutamic acid – we found that increasing the amount of tRNA no longer increased protein levels,” explains Tavazoie. These proteins were found to drive breast cancer metastasis.

The work challenges previous assumptions about how tRNAs function and suggests that tRNAs can modulate gene expression, according to the researchers. Tavazoie points out that “it is remarkable that within a single cell type, synonymous changes in genetic sequence can dramatically affect the levels of specific proteins, their transcripts, and the way a cell behaves.”

Scientists have found that measuring how cancer treatment affects the levels of metabolites – the building blocks of fats and proteins – can be used to assess whether the drug is hitting its intended target.

This new way of monitoring cancer therapy could speed up the development of new targeted drugs – which exploit specific genetic weaknesses in cancer cells – and help in tailoring treatment for patients.

Scientists at The Institute of Cancer Research, London, measured the levels of 180 blood markers in 41 patients with advanced cancers in a phase I clinical trial conducted with The Royal Marsden NHS Foundation Trust.

They found that investigating the mix of metabolic markers could accurately assess how cancers were responding to the targeted drug pictilisib.

Their study was funded by the Wellcome Trust, Cancer Research UK and the pharmaceutical company Roche, and is published in the journal Molecular Cancer Therapeutics.

Pictilisib is designed to specifically target a molecular pathway in cancer cells, called PI3 kinase (PI3K), which has a key role in cell metabolism and is defective in a range of cancer types.

As cancers with PI3K defects grow, they can cause a decrease in the levels of metabolites in the bloodstream.

The new study is the first to show that blood metabolites are testable indicators of whether or not a new cancer treatment is hitting the correct target, both in preclinical mouse models and also in a trial of patients.

Using a sensitive technique called mass spectrometry, scientists at The Institute of Cancer Research (ICR) initially analysed the metabolite levels in the blood of mice with cancers that had defects in the PI3K pathway.

They found that the blood levels of 26 different metabolites, which were low prior to therapy, had risen considerably following treatment with pictilisib. Their findings indicated that the drug was hitting its target, and reversing the effects of the cancer on mouse metabolites.

Similarly, in humans the ICR researchers found that almost all of the metabolites – 22 out of the initial 26 – once again rose in response to pictilisib treatment, as seen in the mice.

Blood levels of the metabolites began to increase after a single dose of pictilisib, and were seen to drop again when treatment was stopped, suggesting that the effect was directly related to the drug treatment.

Metabolites vary naturally depending on the time of day or how much food a patient has eaten. But the researchers were able to provide the first strong evidence that despite this variation metabolites can be used to test if a drug is working, and could help guide decisions about treatment.

Researchers at the Gladstone Institutes say they have found a new pathway by which salicylic acid, a key compound in the nonsteroidal anti-inflammatory drugs aspirin and diflunisal, stops inflammation and cancer.

In a study (“Salicylate, Diflunisal and Their Metabolites Inhibit CBP/p300 and Exhibit Anticancer Activity”) published in eLife, the investigators discovered that both salicylic acid and diflunisal suppress two key proteins that help control gene expression throughout the body. These sister proteins, p300 and CREB-binding protein (CBP), are epigenetic regulators that control the levels of proteins that cause inflammation or are involved in cell growth.

By inhibiting p300 and CBP, salicylic acid and diflunisal block the activation of these proteins and prevent cellular damage caused by inflammation. This study provides the first concrete demonstration that both p300 and CBP can be targeted by drugs and may have important clinical implications, according to Eric Verdin, M.D., associate director of the Gladstone Institute of Virology and Immunology.

“Salicylic acid is one of the oldest drugs on the planet, dating back to the Egyptians and the Greeks, but we’re still discovering new things about it,” he said. “Uncovering this pathway of inflammation that salicylic acid acts upon opens up a host of new clinical possibilities for these drugs.”

Earlier research conducted in the laboratory of co-author Stephen D. Nimer, M.D., director of Sylvester Comprehensive Cancer Center at the University of Miami Miller School of Medicine, and a collaborator of Verdin’s, established a link between p300 and the leukemia-promoting protein AML1-ETO. In the current study, scientists at Gladstone and Sylvester worked together to test whether suppressing p300 with diflunisal would suppress leukemia growth in mice. As predicted, diflunisal stopped cancer progression and shrank the tumors in the mouse model of leukemia. ……

Researchers at Georgia State University have designed a new protein compound that can effectively target the cell surface receptor integrin αvβ3, mutations in which have been linked to a number of diseases. Initial results using this new molecule show its potential as a therapeutic treatment for an array of illnesses, including cancer.

The novel protein molecule targets integrin αvβ3 at a site that has not been targeted by other scientists. The researchers found that the molecule induces apoptosis, or programmed cell death, of cells that express integrin αvβ3. This integrin has been a focus for drug development because abnormal expression of αvβ3 is linked to the development and progression of various diseases.

“This integrin pair, αvβ3, is not expressed in high levels in normal tissue,” explained senior study author Zhi-Ren Liu, Ph.D., professor in the department of biology at Georgia State. “In most cases, it’s associated with a number of different pathological conditions. Therefore, it constitutes a very good target for multiple disease treatment.”

“Here we use a rational design approach to develop a therapeutic protein, which we call ProAgio, which binds to integrin αvβ3 outside the classical ligand-binding site,” the authors wrote. “We show ProAgio induces apoptosis of integrin αvβ3-expressing cells by recruiting and activating caspase 8 to the cytoplasmic domain of integrin αvβ3.”

The findings from this study were published recently in Nature Communications in an article entitled “Rational Design of a Protein That Binds Integrin αvβ3 Outside the Ligand Binding Site.” …..

“We took a unique angle,” Dr. Liu noted. “We designed a protein that binds to a different site. Once the protein binds to the site, it directly triggers cell death. When we’re able to kill pathological cells, then we’re able to kill the disease.”

The investigators performed extensive cell and molecular testing that confirmed ProAgio interacts and binds well with integrin αvβ3. Interestingly, they found that ProAgio induces apoptosis by recruiting caspase 8, an enzyme that plays an essential role in programmed cell death, to the cytoplasmic area of integrin αvβ3. ProAgio was much more effective in inducing cell death than other agents tested.

Noncoding RNAs Not So Noncoding

Bits of the transcriptome once believed to function as RNA molecules are in fact translated into small proteins.

In 2002, a group of plant researchers studying legumes at the Max Planck Institute for Plant Breeding Research in Cologne, Germany, discovered that a 679-nucleotide RNA believed to function in a noncoding capacity was in fact a protein-coding messenger RNA (mRNA).1 It had been classified as a long (or large) noncoding RNA (lncRNA) by virtue of being more than 200 nucleotides in length. The RNA, transcribed from a gene called early nodulin 40 (ENOD40), contained short open reading frames (ORFs)—putative protein-coding sequences bookended by start and stop codons—but the ORFs were so short that they had previously been overlooked. When the Cologne collaborators examined the RNA more closely, however, they found that two of the ORFs did indeed encode tiny peptides: one of 12 and one of 24 amino acids. Sampling the legumes confirmed that these micropeptides were made in the plant, where they interacted with a sucrose-synthesizing enzyme.

Five years later, another ORF-containing mRNA that had been posing as a lncRNA was discovered in Drosophila.2,3 After performing a screen of fly embryos to find lncRNAs, Yuji Kageyama, then of the National Institute for Basic Biology in Okazaki, Japan, suppressed each transcript’s expression. “Only one showed a clear phenotype,” says Kageyama, now at Kobe University. Because embryos missing this particular RNA lacked certain cuticle features, giving them the appearance of smooth rice grains, the researchers named the RNA “polished rice” (pri).

Turning his attention to how the RNA functioned, Kageyama thought he should first rule out the possibility that it encoded proteins. But he couldn’t. “We actually found it was a protein-coding gene,” he says. “It was an accident—we are RNA people!” The pri gene turned out to encode four tiny peptides—three of 11 amino acids and one of 32—that Kageyama and colleagues showed are important for activating a key developmental transcription factor.4

Since then, a handful of other lncRNAs have switched to the mRNA ranks after being found to harbor micropeptide-encoding short ORFs (sORFs)—those less than 300 nucleotides in length. And given the vast number of documented lncRNAs—most of which have no known function—the chance of finding others that contain micropeptide codes seems high.

Overlooked ORFs

From the late 1990s into the 21st century, as species after species had their genomes sequenced and deposited in databases, the search for novel genes and their associated mRNAs duly followed. With millions or even billions of nucleotides to sift through, researchers devised computational shortcuts to hunt for canonical gene and mRNA features, such as promoter regions, exon/intron splice sites, and, of course, ORFs.

ORFs can exist in practically any stretch of RNA sequence by chance, but many do not encode actual proteins. Because the chance that an ORF encodes a protein increases with its length, most ORF-finding algorithms had a size cut-off of 300 nucleotides, which translates to 100 amino acids. This allowed researchers to “filter out garbage, that is, meaningless ORFs that exist randomly in RNAs,” says Eric Olson of the University of Texas Southwestern Medical Center in Dallas.

Of course, by excluding all ORFs less than 300 nucleotides in length, such algorithms inevitably missed those encoding genuine small peptides. “I’m sure that the people who came up with [the cut-off] understood that this rule would have to miss anything that was shorter than 100 amino acids,” says Nicholas Ingolia of the University of California, Berkeley. “As people applied this rule more and more, they sort of lost track of that caveat.” Essentially, sORFs were thrown out with the computational trash and forgotten.
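To make the effect of the cut-off concrete, here is a minimal Python sketch of a length-filtered ORF scan. It is an illustration under stated assumptions, not a reconstruction of any published gene finder: the function name find_orfs, the min_nt parameter, and the toy transcript are all hypothetical. The sketch scans only the three forward reading frames and treats an ORF as running from an ATG to the first in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_nt=300):
    """Return (start, end, frame) for each forward-strand ORF.

    An ORF runs from an ATG to the first in-frame stop codon. Its length
    counts the start codon but not the stop codon, so min_nt=300
    corresponds to a 100-amino-acid cut-off.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i - start >= min_nt:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy transcript: one 12-codon ORF with short flanking sequence.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"

print(find_orfs(toy))             # [] : the 300-nucleotide cut-off hides the sORF
print(find_orfs(toy, min_nt=30))  # [(2, 41, 2)] : a relaxed cut-off recovers it
```

With the conventional cut-off the toy ORF is discarded. Lowering min_nt recovers it, but on a real transcript it would also admit many ORFs that arise by chance, which is the trade-off the cut-off was meant to manage.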

Aside from statistical practicality and human oversight, there were also technical reasons that contributed to sORFs and their encoded micropeptides being missed. Because of their small size, sORFs in model organisms such as mice, flies, and fish are less likely to be hit in random mutagenesis screens than larger ORFs, meaning their functions are less likely to be revealed. Also, many important proteins are identified based on their conservation across species, says Andrea Pauli of the Research Institute of Molecular Pathology in Vienna, but “the shorter [the ORF], the harder it gets to find and align this region to other genomes and to know that this is actually conserved.”

As for the proteins themselves, the standard practice of using electrophoresis to separate peptides by size often meant micropeptides would be lost, notes Doug Anderson, a postdoc in Olson’s lab. “A lot of times we run the smaller things off the bottom of our gels,” he says. Standard protein mass spectrometry was also problematic for identifying small peptides, says Gerben Menschaert of Ghent University in Belgium, because “there is a washout step in the protocol so that only larger proteins are retained.”

But as researchers take a deeper dive into the function of the thousands of lncRNAs believed to exist in genomes, they continue to uncover surprise micropeptides. In February 2014, for example, Pauli, then a postdoc in Alex Schier’s lab at Harvard University, discovered a hidden code in a zebrafish lncRNA. She had been hunting for lncRNAs involved in zebrafish development because “we hadn’t really anticipated that there would be any coding regions out there that had not been discovered—at least not something that is essential,” she says. But one lncRNA she identified actually encoded a 58-amino-acid micropeptide, which she called Toddler, that functioned as a signaling protein necessary for cell movements that shape the early embryo.5

Then, last year, Anderson and his colleagues reported another. Since joining Olson’s lab in 2010, Anderson had been searching for lncRNAs expressed in the heart and skeletal muscles of mouse embryos. He discovered a number of candidates, but one stood out for its high level of sequence conservation, suggesting to Anderson that it might have an important function. He was right that the RNA was important, but for a reason that neither Anderson nor Olson had considered: it was in fact an mRNA encoding a 46-amino-acid-long micropeptide.6

“When we zeroed in on the conserved region [of the gene], Doug found that it began with an ATG [start] codon and it terminated with a stop codon,” Olson says. “That’s when he looked at whether it might encode a peptide and found that indeed it did.” The researchers dubbed the peptide myoregulin, and found that it functioned as a critical calcium pump regulator for muscle relaxation.

With more and more overlooked peptides now being revealed, the big question is how many are left to be discovered. “Were there going to be dozens of [micropeptides]? Were there going to be hundreds, like there are hundreds of microRNAs?” says Ingolia. “We just didn’t know.”

Little things mean a lot. To any biologist, this time-worn maxim is old news. But it’s worth revisiting. As several articles in this issue of The Scientist illustrate, how researchers define and examine the “little things” does mean a lot.

Consider this month’s cover story, “Noncoding RNAs Not So Noncoding,” by TS correspondent Ruth Williams. Combing the human genome for open reading frames (ORFs), sequences bracketed by start and stop codons, yielded a protein-coding count somewhere in the neighborhood of 24,000. That left a lot of the genome relegated to the category of junk—or, later, to the tens of thousands of mostly mysterious long noncoding RNAs (lncRNAs). But because they had only been looking for ORFs that were 300 nucleotides or longer (i.e., coding for proteins at least 100 amino acids long), genome probers missed so-called short ORFs (sORFs), which encode small peptides. “Their diminutive size may have caused these peptides to be overlooked, their sORFs to be buried in statistical noise, and their RNAs to be miscategorized, but it does not prevent them from serving important, often essential functions, as the micropeptides characterized to date demonstrate,” writes Williams.

How little things work definitely informs another field of life science research: synthetic biology. As the functions of genes and gene networks are sussed out, bioengineers are using the information to design small, synthetic gene circuits that enable them to better understand natural networks. In “Synthetic Biology Comes into Its Own,” Richard Muscat summarizes the strides made by synthetic biologists over the last 15 years and offers an optimistic view of how such networks may be put to use in the future. And to prove him right, just as we go to press, a collaborative group led by one of syn bio’s founding fathers, MIT’s James Collins, has devised a paper-based test for Zika virus exposure that relies on a freeze-dried synthetic gene circuit that changes color upon detection of RNAs in the viral genome. The results are ready in a matter of hours, not the days or weeks current testing takes, and the test can distinguish Zika from dengue virus. “What’s really exciting here is you can leverage all this expertise that synthetic biologists are gaining in constructing genetic networks and use it in a real-world application that is important and can potentially transform how we do diagnostics,” commented one researcher about the test.

Moving around little things is the name of the game when it comes to delivering a package of drugs to a specific target or to operating on minuscule individual cells. Mini-scale delivery of biocompatible drug payloads often needs some kind of boost to overcome fluid forces or size restrictions that interfere with fine-scale manipulation. To that end, ingenious solutions that motorize delivery by harnessing osmotic changes, magnets, ultrasound, and even bacterial flagella are reviewed in “Making Micromotors Biocompatible.”

Cilengitide, a cyclic RGD pentapeptide, is currently in clinical phase III for treatment of glioblastomas and in phase II for several other tumors. This drug is the first anti-angiogenic small molecule targeting the integrins αvβ3, αvβ5 and α5β1. We developed it in the early 1990s using a novel procedure, spatial screening. This strategy resulted in c(RGDfV), the first superactive αvβ3 inhibitor (100 to 1000 times more active than the linear reference peptides), which in addition exhibited high selectivity against the platelet receptor αIIbβ3. This cyclic peptide was later modified by N-methylation of one peptide bond to yield c(RGDf(NMe)V), which has even greater antagonistic activity. This peptide was then dubbed Cilengitide and is currently being developed as a drug by the company Merck-Serono (Germany).

This article describes the chemical development of Cilengitide and the biochemical background of its activity, and gives a short review of the present clinical trials. The positive anti-angiogenic effects in cancer treatment can be further increased by combination with “classical” anti-cancer therapies. Several clinical trials in this direction are under way.

Integrins are heterodimeric receptors that are important for cell-cell and cell-extracellular matrix (ECM) interactions and are composed of one α and one β-subunit [1, 2]. These cell adhesion molecules act as transmembrane linkers between their extracellular ligands and the cytoskeleton, and modulate various signaling pathways essential in the biological functions of most cells. Integrins play a crucial role in processes such as cell migration, differentiation, and survival during embryogenesis, angiogenesis, wound healing, immune and non-immune defense mechanisms, hemostasis and oncogenic transformation [1]. The fact that many integrins are also linked with pathological conditions has converted them into very promising therapeutic targets [3]. In particular, integrins αvβ3, αvβ5 and α5β1 are involved in angiogenesis and metastasis of solid tumors, being excellent candidates for cancer therapy [4–7].

There are a number of different integrin subtypes which recognize and bind to the tripeptide sequence RGD (arginine, glycine, aspartic acid), which represents the most prominent recognition motif involved in cell adhesion. For example, the pro-angiogenic αvβ3 integrin binds various RGD-containing proteins, including fibronectin (Fn), fibrinogen (Fg), vitronectin (Vn) and osteopontin [8]. It is therefore not surprising that this integrin has been targeted for cancer therapy and that RGD-containing peptides and peptidomimetics have been designed and synthesized aiming to selectively inhibit this receptor [9, 10].

One classical strategy used in drug design is based on the knowledge about the structure of the receptor-binding pocket, preferably in complex with the natural ligand. However, this strategy, the so-called “rational structure-based design”, could not be applied in the field of integrin ligands since the first structures of integrin’s extracellular head groups were not described until 2001 for αvβ3 [11] (one year later, in 2002 the structure of this integrin in complex with Cilengitide was also reported [12]) and 2004 for αIIbβ3 [13]. Therefore, initial efforts in this field focused on a “ligand-oriented design”, which concentrated on optimizing RGD peptides by means of different chemical approaches in order to establish structure-activity relationships and identify suitable ligands.

We focused our interest on finding ligands for αvβ3 and based our approach on three chemical strategies pioneered in our group: 1) Reduction of the conformational space by cyclization; 2) Spatial screening of cyclic peptides; and 3) N-Methyl scan.

The combination of these strategies led to the discovery of the cyclic peptide c(RGDf(NMe)V) in 1995. This peptide showed subnanomolar antagonistic activity for the αvβ3 receptor, nanomolar affinities for the closely related integrins αvβ5 and α5β1, and high selectivity against the platelet receptor αIIbβ3. The peptide was patented together with Merck in 1997 (patent application submitted on 15.9.1995, opened on 20.3.1997) [14] and first presented with Merck’s agreement at the European Peptide Symposium in Edinburgh (September 1996) [15]. The synthesis and activity of this molecule were finally published in 1999 [16]. This peptide is now being developed by Merck-Serono (Darmstadt, Germany) under the name “Cilengitide” and has recently entered Phase III clinical trials for treating glioblastoma [17]. …..

The discovery 30 years ago of the RGD motif in Fn was a major breakthrough in science. This tripeptide sequence was also identified in other ECM proteins and was soon described as the most prominent recognition motif involved in cell adhesion. Extensive research in this direction allowed the description of a number of bidirectional proteins, the integrins, which were able to recognize and bind to the RGD sequence. Integrins are key players in the biological function of most cells and therefore the inhibition of RGD-mediated integrin-ECM interactions became an attractive target for the scientific community.

However, the lack of selectivity of linear RGD peptides represented a major pitfall which precluded any clinical application of RGD-based inhibitors. The control of the molecule’s conformation by cyclization and further spatial screening overcame these limitations, showing that it is possible to obtain privileged bioactive structures, which enhance the biological activity of linear peptides and significantly improve their receptor selectivity. Steric control imposed on RGD peptides, together with their biological evaluation and extensive structural studies, yielded the cyclic peptide c(RGDfV), the first small selective anti-angiogenic molecule described. N-Methylation of this cyclic peptide yielded the much more potent c(RGDf(NMe)V), nowadays known as Cilengitide.

The fact that brain tumors, which are highly angiogenic, are more susceptible to treatment with integrin antagonists, and the positive synergy observed for Cilengitide in combination with radio-chemotherapy in preclinical studies, encouraged subsequent clinical trials. Cilengitide is currently in phase III for GBM patients and in phase II for other types of cancers, with promising therapeutic outcomes to date. In addition, the absence of significant toxicity and the excellent tolerance of this drug allow its combination with classical therapies such as RT or cytotoxic agents. The controlled phase III study CENTRIC was launched in 2008, with primary outcome measures due in September 2012. The results of this and other clinical studies are awaited with great hope and interest.

Integrins are heterodimeric, transmembrane receptors that function as mechanosensors, adhesion molecules and signal transduction platforms in a multitude of biological processes. As such, integrins are central to the etiology and pathology of many disease states. Therefore, pharmacological inhibition of integrins is of great interest for the treatment and prevention of disease. In the last two decades several integrin-targeted drugs have made their way into clinical use, many others are in clinical trials, and still more are showing promise as they advance through preclinical development. This review examines and evaluates the various drugs and compounds targeting integrins and the disease states in which they are implicated.

Integrins are heterodimeric cell surface receptors found in nearly all metazoan cell types, composed of non-covalently linked α and β subunits. In mammals, eighteen α-subunits and eight β-subunits have been identified to date 1. From this pool, 24 distinct heterodimer combinations have been observed in vivo that confer cell-to-cell and cell-to-ligand specificity relevant to the host cell and the environment in which it functions 2. Integrin-mediated interactions with the extracellular matrix (ECM) are required for the attachment, cytoskeletal organization, mechanosensing, migration, proliferation, differentiation and survival of cells in the context of a multitude of biological processes including fertilization, implantation and embryonic development, immune response, bone resorption and platelet aggregation. Integrins also function in pathological processes such as inflammation, wound healing, angiogenesis, and tumor metastasis. In addition, integrin binding has been identified as a means of viral entry into cells 3. ….

Combination of cilengitide and radiation therapy and temozolomide. The addition of cilengitide to radiotherapy and temozolomide based treatment regimens has shown promising preliminary results in ongoing Phase II trials in both newly diagnosed and progressive glioblastoma multiforme 139–140. In addition to the Phase II objectives sought, these trials are significant in that they represent progress that has been made in determining tumor drug uptake and in identifying a subset of patients that may benefit from treatment. In a Phase II trial enrolling 52 patients with newly diagnosed glioblastoma multiforme receiving 500 mg cilengitide twice weekly during radiotherapy and in combination with temozolomide for 6 monthly cycles following radiotherapy, 69% achieved 6 months progression free survival compared to 54% of patients receiving radiotherapy followed by temozolomide alone. One-year overall survival was 67% and 62% for the cilengitide combination group and the radiotherapy and temozolomide group, respectively. Non-hematological grade 3-4 toxicities were limited, and included symptoms of fatigue, asthenia, anorexia, elevated liver function tests, deep vein thrombosis and pulmonary embolism in a total of 5.7% of the patients. Grade 3-4 hematological toxicities were more common and included lymphopenia (53.8%), thrombocytopenia (13.4%) and neutropenia (9.6%). This trial is significant in that it has provided the first evidence correlating a molecular biomarker with response to treatment. Decreased methylguanine methyltransferase (MGMT) expression was associated with favorable outcome. Patients harboring increased MGMT promoter methylation appeared to benefit more from combined treatment with cilengitide than did patients lacking promoter methylation. The significance of the MGMT promoter methylation in predicting response is likely due to inclusion of temozolomide in the treatment combination.

In the last two decades great progress has been made in the discovery and development of integrin targeted therapeutics. Years of intense research into integrin function have provided an understanding of the potential applications for the treatment of disease. Advances in structural characterization of integrin-ligand interactions have proved beneficial in the design and development of potent, selective inhibitors for a number of integrins involved in platelet aggregation, inflammatory responses, angiogenesis, neovascularization and tumor growth.

The αIIbβ3 integrin antagonists were the first inhibitors to make their way into clinical use and have proven to be effective and safe drugs, contributing to the reduction of mortality and morbidity associated with acute coronary syndromes. Interestingly, the prolonged administration of small molecules targeting this integrin for long-term prevention of thrombosis-related complications has not been successful, for reasons that are not yet fully understood. This suggests that modulating the intensity, duration and temporal aspects of integrin function may be more effective than simply shutting off integrin signaling in some instances. Further research into the dynamics of platelet activation and thrombosis formation may elucidate the mechanisms by which integrin activation is modulated.

The introduction of α4 targeted therapies held great promise for the treatment of inflammatory diseases. The development of Natalizumab greatly improved the quality of life for multiple sclerosis patients and those suffering with Crohn’s Disease compared to previous treatments, but its role in asthma-related inflammation could not be validated. Unfortunately for MS and Crohn’s patients, immune surveillance in the central nervous system was also compromised as a direct effect of α4β7 antagonism, with potentially lethal effects. Thus Natalizumab and related α4β7 targeting drugs are now limited to patients refractory to standard therapies. The design and development of α4β1 antagonists for the treatment of Crohn’s Disease may offer benefit with decreased risks. The involvement of these integrins in fetal development also raises concerns for widespread clinical use.

Integrin antagonists that target angiogenesis are progressing through clinical trials. Cilengitide has shown promising results for the treatment of glioblastomas and recurrent gliomas, cancers with notoriously low survival and cure rates. The greatest challenge facing the development of anti-angiogenic integrin targeted therapies is the overall lack of biomarkers by which to measure treatment efficacy.

Mapping the ligand-binding pocket of integrin α5β1 using a gain-of-function approach

Integrin α5β1 is a key receptor for the extracellular matrix protein fibronectin. Antagonists of human α5β1 have therapeutic potential as anti-angiogenic agents in cancer and diseases of the eye. However, the structure of the integrin is unsolved and the atomic basis of fibronectin and antagonist binding by α5β1 is poorly understood. Here we demonstrate that zebrafish α5β1 integrins do not interact with human fibronectin or the human α5β1 antagonists JSM6427 and cyclic peptide CRRETAWAC. Zebrafish α5β1 integrins do bind zebrafish fibronectin-1, and mutagenesis of residues on the upper surface and side of the zebrafish α5 subunit β-propeller domain shows that these residues are important for the recognition of RGD and synergy sites in fibronectin. Using a gain-of-function analysis involving swapping regions of the zebrafish α5 subunit with the corresponding regions of human α5 we show that blades 1-4 of the β-propeller are required for human fibronectin recognition, suggesting that fibronectin binding involves a broad interface on the side and upper face of the β-propeller domain. We find that the loop connecting blades 2 and 3 of the β-propeller (D3-A3 loop) contains residues critical for antagonist recognition, with a minor role played by residues in neighbouring loops. A new homology model of human α5β1 supports an important function for D3-A3 loop residues Trp-157 and Ala-158 in the binding of antagonists. These results will aid the development of reagents that block α5β1 functions in vivo.

Integrins are cell adhesion molecules that mediate cell-cell, cell-extracellular matrix, and cell-pathogen interactions. They play critical roles for the immune system in leukocyte trafficking and migration, immunological synapse formation, costimulation, and phagocytosis. Integrin adhesiveness can be dynamically regulated through a process termed inside-out signaling. In addition, ligand binding transduces signals from the extracellular domain to the cytoplasm in the classical outside-in direction. Recent structural, biochemical, and biophysical studies have greatly advanced our understanding of the mechanisms of integrin bidirectional signaling across the plasma membrane. Large-scale reorientations of the ectodomain of up to 200 Å couple to conformational change in ligand-binding sites and are linked to changes in α and β subunit transmembrane domain association. In this review, we focus on integrin structure as it relates to affinity modulation, ligand binding, outside-in signaling, and cell surface distribution dynamics.

The immune system relies heavily on integrins for (a) adhesion during leukocyte trafficking from the bloodstream, migration within tissues, immune synapse formation, and phagocytosis; and (b) signaling during costimulation and cell polarization. Integrins are so named because they integrate the extracellular and intracellular environments by binding to ligands outside the cell and cytoskeletal components and signaling molecules inside the cell. Integrins are noncovalently associated heterodimeric cell surface adhesion molecules. In vertebrates, 18 α subunits and 8 β subunits form 24 known αβ pairs (Figure 1). This diversity in subunit composition contributes to diversity in ligand recognition, binding to cytoskeletal components and coupling to downstream signaling pathways. Immune cells express at least 10 members of the integrin family belonging to the β2, β7, and β1 subfamilies (Table 1). The β2 and β7 integrins are exclusively expressed on leukocytes, whereas the β1 integrins are expressed on a wide variety of cells throughout the body. Distribution and ligand-binding properties of the integrins on leukocytes are summarized in Table 1. For reviews, see References 1 and 2. Mutations that block expression of the β2 integrin subfamily lead to leukocyte adhesion deficiency, a disease associated with severe immunodeficiency (3).
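The subunit counts quoted above imply that only a small share of conceivable heterodimers is actually observed. The short Python sketch below works through that arithmetic. It assumes, purely as a combinatorial upper bound and not as a biological claim, that any α subunit could pair with any β subunit; the constant and function names are illustrative.

```python
ALPHA_SUBUNITS = 18   # α subunits reported in vertebrates
BETA_SUBUNITS = 8     # β subunits reported in vertebrates
OBSERVED_PAIRS = 24   # known αβ heterodimers


def pairing_summary(n_alpha, n_beta, n_observed):
    """Return (possible pairings, fraction of those pairings observed)."""
    possible = n_alpha * n_beta  # one α combined with one β
    return possible, n_observed / possible


possible, fraction = pairing_summary(ALPHA_SUBUNITS, BETA_SUBUNITS, OBSERVED_PAIRS)
print(possible)             # 144
print(round(fraction, 3))   # 0.167, i.e. one in six
```

Under that assumption, 24 of 144 possible combinations are observed, so subunit pairing appears to be highly selective rather than free.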

As adhesion molecules, integrins are unique in that their adhesiveness can be dynamically regulated through a process termed inside-out signaling or priming. Thus, stimuli received by cell surface receptors for chemokines, cytokines, and foreign antigens initiate intracellular signals that impinge on integrin cytoplasmic domains and alter adhesiveness for extracellular ligands. In addition, ligand binding transduces signals from the extracellular domain to the cytoplasm in the classical outside-in direction (outside-in signaling). These dynamic properties of integrins are central to their proper function in the immune system. Indeed, mutations or small molecules that stabilize either the inactive state or the active adhesive state—and thereby block the adhesive dynamics of leukocyte integrins—inhibit leukocyte migration and normal immune responses.