An analysis of breast cancer isoforms by SURVIV, a method developed at UCLA. Blue lines are associated with longer survival times, and magenta lines with shorter survival times.

People with cancer are often told by their doctors approximately how long they have to live, and how well they will respond to treatments, but what if there were a way to improve the accuracy of doctors’ predictions?

A new method developed by UCLA scientists could eventually lead to a way to do just that, using data about patients’ genetic sequences to produce more reliable projections of survival time and of how patients might respond to possible treatments. The technique is an innovative way of using biomedical big data, in which patterns and trends are gleaned from massive amounts of patient information, to achieve precision medicine: giving doctors the ability to better tailor their care for each individual patient.

The approach is likely to enable doctors to give more accurate predictions for people with many types of cancers. In this research, the UCLA scientists studied cancers of the breast, brain (glioblastoma multiforme, a highly malignant and aggressive form, and lower-grade glioma, a less aggressive form), lung, ovary and kidney.

In addition, it may allow scientists to analyze people’s genetic sequences and determine which sequences are lethal and which are harmless.

The new method analyzes various gene isoforms (combinations of genetic sequences that can produce an enormous variety of RNAs and proteins from a single gene) using data from RNA molecules in cancer specimens. That process, called RNA sequencing, or RNA-seq, reveals the presence and quantity of RNA molecules in a biological sample. In the method developed at UCLA, scientists analyzed the ratios of slightly different genetic sequences within the isoforms, enabling them to detect important but subtle differences in the genetic sequences. In contrast, the conventional analysis aggregates all of a gene's isoforms together, meaning that it misses important differences among the isoforms.
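The contrast between ratio-based and aggregated analysis can be illustrated with a small sketch. This is not the SURVIV model itself, only a simplified illustration: the function names and read counts below are hypothetical, and the sketch assumes that each read can be assigned to one of two isoforms of a gene.

```python
def isoform_ratio(reads_a, reads_b):
    """Fraction of isoform-informative reads that support isoform A.

    Hypothetical helper for illustration; returns None when no
    informative reads are available and the ratio is undefined.
    """
    total = reads_a + reads_b
    if total == 0:
        return None
    return reads_a / total


def gene_level_count(reads_a, reads_b):
    """Conventional view: all isoforms of the gene are summed together."""
    return reads_a + reads_b


# Made-up read counts (isoform A, isoform B) for two tumors.
tumor_1 = (80, 20)
tumor_2 = (20, 80)

for name, (a, b) in [("tumor_1", tumor_1), ("tumor_2", tumor_2)]:
    print(name, "gene-level count:", gene_level_count(a, b),
          "isoform A ratio:", isoform_ratio(a, b))
```

In this made-up example both tumors have the same gene-level count, so an aggregated analysis cannot tell them apart, while the isoform ratio differs sharply between them.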

SURVIV (for “survival analysis of mRNA isoform variation”) is the first statistical method for conducting survival analysis on isoforms using RNA-seq data, said senior author Yi Xing, a UCLA associate professor of microbiology, immunology and molecular genetics. The research is published today in the journal Nature Communications.

The researchers report having identified some 200 isoforms that are associated with survival time for people with breast cancer; some predict longer survival times, others are linked to shorter times. Armed with that knowledge, scientists might eventually be able to target the isoforms associated with shorter survival times in order to suppress them and fight disease, Xing said.

The researchers evaluated the performance of survival predictors using a metric called C-index and found that across the six different types of cancer they analyzed, their isoform-based predictions performed consistently better than the conventional gene-based predictions.
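For readers unfamiliar with the C-index (concordance index), the following is a minimal sketch of one common form, Harrell's C. It is an illustration under stated assumptions, not the researchers' evaluation code: the function name and the toy data are hypothetical, and pairs with tied survival times are simply skipped. A value of 1.0 means the predicted risks order every comparable pair of patients correctly, and 0.5 is what random guessing would give on average.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times       -- observed follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified version
        # Order the pair so that `first` has the shorter observed time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair cannot be ordered
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
good_scores = [0.9, 0.7, 0.4, 0.1]   # risk falls as survival time rises
bad_scores = [0.1, 0.4, 0.7, 0.9]    # exactly the wrong ordering

print(concordance_index(times, events, good_scores))
print(concordance_index(times, events, bad_scores))
```

Comparing two predictors on the same patients, as the researchers did for isoform-based and gene-based predictions, amounts to computing this kind of score for each and seeing which is higher.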

The researchers studied tissues from 2,684 people with cancer whose samples were part of the National Institutes of Health’s Cancer Genome Atlas, and they spent more than two years developing the algorithm for SURVIV.

According to Xing, a human gene typically produces seven to 10 isoforms.

“In cancer, sometimes a single gene produces two isoforms, one of which promotes metastasis and one of which represses metastasis,” he said, adding that understanding the differences between the two is extremely important in combatting cancer.

“We have just scratched the surface,” Xing said. “We will apply the method to much larger data sets, and we expect to learn a lot more.”

What is RNA-Seq?

Long RNAs are first converted into a library of cDNA fragments through either RNA fragmentation or DNA fragmentation. Sequencing adaptors (blue) are subsequently added to each cDNA fragment and a short sequence is obtained from each cDNA using high-throughput sequencing technology. The resulting sequence reads are aligned with the reference genome or transcriptome, and classified as three types: exonic reads, junction reads and poly(A) end-reads. These three types are used to generate a base-resolution expression profile for each gene. (Nat Rev Genet 10(1):57-63, 2009)