Future of big data depends on collaboration

Someday a patient’s treatment will be exactly tailored to his personal medical situation, based on the type of tumor, genetic history and more.

Researchers and physicians such as Dr. Eric Holland of Fred Hutchinson Cancer Research Center can imagine what a difference that could make to patients like those he treats for brain tumors.

But the path that leads to precision medicine starts with big data, Holland and other experts know. Current technology allows researchers to generate reams and reams of data, whether delving into the genetics of tumors or the metabolites the tumors produce. They hope that these wellsprings of information contain the keys to tailoring medicine to every individual’s disease.

Vice President Joe Biden, who stopped by Fred Hutch last month during his moonshot “listening tour,” spoke of the potential for big data to lead to big answers. He likened the search for targeted cancer treatments to “looking for a needle in a haystack.”

“We have to allow the data to actually yield the answers,” said Biden.

The new wealth of information holds great potential, said Matthew Trunnell, chief information officer at Fred Hutch. “There is great belief that bringing clinical data into a research environment, and combining it with molecular and genomic data, will provide new insights into the mechanisms of disease and inform the therapeutic decisions for particular patients,” said Trunnell.

These efforts will require both databases, to store the medical, molecular and genomic information, and data science, to uncover lifesaving patterns and links between treatment response and patient and tumor information. Construction of the necessary infrastructure is already well underway at Fred Hutch. The Hutch Integrated Data Repository and Archive (HIDRA for short) is a database that will provide infrastructure for data science at Fred Hutch and its Cancer Consortium partners: the University of Washington, Seattle Children’s Hospital and Seattle Cancer Care Alliance.

HIDRA draws from years of data from consenting patients — from medical history to tumor genetics — and provides researchers with de-identified information they can survey while maintaining strict patient privacy. However, emerging resources like HIDRA present new analytical challenges, said Holland, the brain cancer researcher who directs Solid Tumor Translational Research (STTR) at Fred Hutch.

“Recently, we’ve had the ability to sequence huge amounts of DNA. And so all of a sudden there’s a lot more digital data that completely describes the tumor in a much better way than the way a microscope would. This is both a good thing and a bad thing,” he explained. “It poses a problem in that the amount of data for any given person is enormous. And all that information has to be compiled together in some giant database and then manipulated and visualized.”

In order to reveal the patterns buried in reams of data, Holland and his colleagues in STTR spearheaded the development of Oncoscape, an online data visualization tool. Oncoscape draws on publicly available molecular, genetic and medical information from The Cancer Genome Atlas, a collaboration between the National Cancer Institute and the National Human Genome Research Institute of the National Institutes of Health.

Oncoscape is designed to be modular, so new applications with specific functions can be easily integrated and contribute new functionality to the collection of existing apps.

“The Oncoscape platform acts as a hub where other developers can integrate their own tools or methods,” said Dr. Lisa McFerrin, a bioinformatics specialist in STTR.

The Oncoscape team turned to GitHub and Amazon Web Services to host Oncoscape, open-source and in the cloud, for any researcher to use — and any computer programmer or computational biologist to help improve. This is part of a larger vision, said Desert Horse-Grant, STTR’s director of strategy and operations.

“We want to break down silos across [scientific] disciplines and help researchers from different disciplines understand each other. We also want to break down data silos,” she explained.

GitHub users can access a list of potential improvements and contribute their own possible modifications. This way, they can contribute to cancer research — whether they know anything about cancer biology or not.

“We don’t want to recreate the wheel,” said McFerrin. “By integrating tools and methods using transparent resources, we can enhance reproducibility of results.”

On Tuesday, GitHub launched a video introduction to Oncoscape as part of its OctoTales video series. However, tools like Oncoscape are only the beginning of Hutch scientists’ foray into data science, said Trunnell.

“Oncoscape is just one of many tools that must be developed in order to visualize and analyze large datasets. Institutionally we expect to fund lots of research in the coming years,” he said.

The future of cancer research and precision medicine rests on collaboration among scientists from disparate disciplines, said Biden during the panel discussion at Fred Hutch. “We are at an inflection point. Four to five years ago, geneticists weren’t working with immunologists. Most immunologists were out there in the wilderness. But all that has changed,” he said.

Current efforts at Fred Hutch are focused on improving research, with the goal of someday parlaying data science into better patient care, said Trunnell.

“The tools developed today support research and discovery,” said Trunnell. “The additional challenge of being able to really make them effective for clinical decisions [will be tackled in the future].”

Sabrina Richards is a staff writer at Fred Hutchinson Cancer Research Center. She has written about scientific research and the environment for The Scientist and OnEarth Magazine. She has a Ph.D. in immunology from the University of Washington, an M.A. in journalism and an advanced certificate from the Science, Health and Environmental Reporting Program at New York University. Reach her at srichar2@fredhutch.org.
