Gene Sequencing Project Mines Data Once Considered 'Junk' for Clues About Cancer

Genome sequencing data once regarded as junk is now being used to gain important clues to help understand disease. The latest example comes from the St. Jude Children’s Research Hospital – Washington University Pediatric Cancer Genome Project, where scientists have developed an approach to mine the repetitive segments of DNA at the ends of chromosomes for insights into cancer.

These segments, known as telomeres, had previously been ignored in next-generation sequencing efforts because their repetitive nature meant the resulting information defied analysis, and the data were labeled as junk. But researchers have now traced changes in the amount of telomeric DNA to particular types of cancer and their underlying genetic mistakes. Investigators found that 32 percent of pediatric solid tumors carried extra telomeric DNA, compared to just 4 percent of brain tumors and none of the leukemia samples studied. The findings were published recently in the journal Genome Biology.

Using this new approach, the investigators have linked changes in telomeric DNA to mutations in the ATRX gene and to longer telomeres in patients with a subtype of neuroblastoma, a cancer of the sympathetic nervous system. Telomere length limits how many times cells can divide. Mechanisms that maintain or lengthen telomeres contribute to the unchecked cell division that is a hallmark of cancer.

“This paper shows how measuring the DNA content of telomeres can enhance the value of whole-genome sequencing,” said Matthew Parker, Ph.D., the paper’s first author and a St. Jude postdoctoral fellow. “In the case of the ATRX mutation, the telomere findings gave us information about the mutation’s impact that would have been hard to get through other means.”

The results stem from the largest study yet of whole-genome sequencing to measure the content of telomeric DNA. The effort involved whole-genome sequencing of normal and tumor DNA from 235 pediatric patients battling 13 different cancers. For comparison, normal DNA from 13 adult cancer patients was included in the research.

“There’s been a lot of interest among cancer researchers in telomere length,” said Richard Wilson, Ph.D., director of The Genome Institute at Washington University School of Medicine in St. Louis. “While more research remains, we think it’s important to begin to characterize the genetic sequences that make up the telomeres. That’s a crucial first step to understanding more precisely any role they may play in cancer.”

The Pediatric Cancer Genome Project sequenced the complete normal and cancer genomes of more than 600 children and adolescents with some of the most aggressive and least understood cancers. Investigators believe the project’s findings will lay the foundation for a new generation of clinical tools. Despite advances, cancer remains the leading cause of death by disease among U.S. children age 1 and older.

The human genome is stored in the four-letter chemical alphabet of DNA, a molecule that stretches more than 3 billion characters in length and provides the instructions for building and sustaining life. Those instructions are the genes that are organized into the 46 chromosomes found in almost every cell.

Each chromosome ends with the same six-letter DNA sequence that is associated exclusively with telomeres. The DNA sequence does not vary, but the number of times it is repeated does, affecting the length of the telomeres. Telomeres shorten each time cells divide, which explains why their length declines naturally with age.

Researchers have known that cancer cells use several mechanisms to circumvent this shortening and keep dividing. But until now, the repetitive nature of the telomeric DNA sequence meant telomeres had little to offer researchers using whole-genome sequencing to map the human genome. Genes can be assigned to a particular spot on a particular chromosome; telomeres cannot.

“For scientists analyzing whole-genome sequencing data, the telomeres were just a headache,” said the study’s corresponding author, Jinghui Zhang, Ph.D., an associate member of the St. Jude Department of Computational Biology. “We could not properly map them to a position on the human genome, so we didn’t really use them.”

Then, while listening to a colleague’s presentation, Parker had an idea: “Why not just count the telomeric DNA and look for changes between the normal and cancer cells of patients?”

Zhang said the question was a conceptual leap in thinking about how to use whole-genome sequencing data to study telomeres and cancer. “This is the classic story of how one person’s problem is another person’s gold,” she said.

Parker and his colleagues developed an approach that correctly distinguished between older and younger individuals based on the amount of telomeric DNA in their blood or bone marrow cells. Researchers used three other methods to confirm that whole-genome sequencing could be used to reliably capture telomeric DNA differences between normal and cancer cells. Additional supportive evidence came when investigators found that the method yielded similar estimates of the telomeric DNA content of twins with leukemia who shared similar genetic alterations.
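
The counting idea can be illustrated with a short Python sketch. This is not the investigators' published method, only a minimal illustration. It assumes that a sequencing read counts as telomeric when it contains at least four back-to-back copies of the human telomere repeat TTAGGG (or its reverse complement, CCCTAA), and that dividing by the total number of reads is an adequate correction for sequencing depth. The function names and the four-repeat threshold are hypothetical choices made for this example.

```python
import math

# Human telomeric repeat and its reverse complement
# (a read may come from either DNA strand).
TELOMERE_MOTIF = "TTAGGG"
TELOMERE_MOTIF_RC = "CCCTAA"


def is_telomeric(read, min_repeats=4):
    """True if the read holds at least min_repeats consecutive motif copies.

    The threshold of 4 is an illustrative assumption, not a published value.
    """
    read = read.upper()
    return (TELOMERE_MOTIF * min_repeats in read
            or TELOMERE_MOTIF_RC * min_repeats in read)


def telomeric_fraction(reads, min_repeats=4):
    """Fraction of reads that look telomeric.

    Dividing by the total read count is a simple way to make samples
    sequenced to different depths comparable.
    """
    total = 0
    hits = 0
    for read in reads:
        total += 1
        if is_telomeric(read, min_repeats):
            hits += 1
    if total == 0:
        raise ValueError("no reads supplied")
    return hits / total


def log2_tumor_vs_normal(tumor_reads, normal_reads, min_repeats=4):
    """log2 ratio of telomeric content, tumor over matched normal.

    Positive values suggest a gain of telomeric DNA in the tumor,
    negative values a loss.
    """
    tumor = telomeric_fraction(tumor_reads, min_repeats)
    normal = telomeric_fraction(normal_reads, min_repeats)
    if tumor == 0 or normal == 0:
        raise ValueError("no telomeric reads found in one of the samples")
    return math.log2(tumor / normal)


# Toy example: one of four normal reads and two of four tumor reads are telomeric.
normal_reads = ["TTAGGG" * 4, "ACGT" * 6, "ACGT" * 6, "ACGT" * 6]
tumor_reads = ["TTAGGG" * 4, "CCCTAA" * 4 + "ACG", "ACGT" * 6, "ACGT" * 6]
ratio = log2_tumor_vs_normal(tumor_reads, normal_reads)
```

In this toy data the tumor sample has twice the telomeric read fraction of the normal sample, so the log2 ratio is 1.0. Real whole-genome data would involve hundreds of millions of reads per sample and would need further quality filtering.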

When investigators used the method to study pediatric cancer patients, they found tumors that gained telomeric DNA were also more likely to contain chromosomal abnormalities, including rearrangements within and between chromosomes. Researchers also found that different cancers had distinct patterns of telomeric DNA change. In some cases, the change offered clues about the mechanism responsible for lengthening the telomeres, pointing to a process called alternative lengthening of telomeres.