Genomics and Health Impact Blog
A blog devoted to discussing best practices and questions about the role of genomics in disease prevention, health promotion and healthcare.

Muin J Khoury, Director, Office of Public Health Genomics, Centers for Disease Control and Prevention

In a previous post, I commented on the importance of a public health perspective to ensure the success of the proposed precision medicine large national research cohort. Here I offer additional thoughts on the need to balance short-term public health gains with long-term knowledge generation from this effort.

There is no doubt that “the time is right” [PDF 3.15 MB] for a large precision medicine cohort of 1 million people or more, given advances in scientific knowledge, technology and computing, empowerment of patients and the general public, as well as the existing resources of millions of people who already participate in ongoing cohort studies. While insights from this cohort would take years, if not decades, to materialize, what are the opportunities for short-term public health impact? At a recent NIH workshop, Dr. Francis Collins used examples of early success [PDF 3.15 MB] such as pharmacogenomic tests, new therapeutic targets for common diseases, insights into resilience and healthy aging, and new ways to evaluate m-Health technologies for chronic disease management. He also described a hypothetical 50-year-old woman with type 2 diabetes [PDF 3.15 MB] with suboptimal glucose control in spite of using a common prescription drug. Suppose within the next two years, she becomes part of the national cohort. A sample of her DNA is analyzed, and she agrees to track her glucose levels via a tiny implantable chip that sends wireless signals to her watch and the researchers’ computers. Using these data, she changes her diet and her medication schedule. Five years later, her doctor switches her to a new drug based on improved molecular understanding of diabetes. Of course, in order to fulfill the vision of better diabetes control based on precision approaches, researchers would not only have to make new discoveries from this large cohort but would also need to conduct follow-up studies, including randomized trials, to compare the relative clinical utility of interventions in diabetes control. At the current pace of research translation, it may be more realistic to expect a switch to improved diabetes therapy to occur much more slowly, perhaps over decades, a reminder of the translation bottlenecks that occur beyond “bench to bedside”.

I believe the proposed large national cohort would provide a unique opportunity for near term impact, by assessing and enhancing implementation of already proven interventions, especially if the cohort can adequately represent various population groups, including minorities and underserved populations. Moreover, since most members of the cohort will be healthy at the time of enrollment, we can combine discovery of new precision tools with implementation of what we already know now to prevent disease and save lives. Using the example of type 2 diabetes, we know that finding and enrolling people with pre-diabetes in a diabetes prevention program can help prevent the onset of disease. Millions of people with pre-diabetes in the United States can benefit from such interventions but do not know they have pre-diabetes. The proposed cohort can assess how to identify thousands of people with pre-diabetes and connect them with healthcare and lifestyle interventions. There are dozens of evidence-based prevention recommendations that we know work in clinical or community-based settings but are not optimally implemented in the general population. Follow up of a cohort of a million plus people would allow us to assess how to enhance the delivery and impact of proven prevention guidelines on population health.

Of course, a unique feature of this proposed large cohort study would be the whole genome sequencing of participants. While this would lead to numerous discoveries of new genome-based associations and possible interventions, it would take time to yield dividends. In the meantime, we have a real opportunity for near term genomic health impact by focusing on conditions for which evidence-based applications are available. We have created a three-tier classification schema of genomic applications based on the methods of evidence-based medicine. A growing list is available on the CDC website. Similarly, Berg et al proposed binning the human genome into buckets based on clinical validity and utility of genomic variants. Tier 1 (bin 1) genes and their variants are those with sufficient evidence for clinical validity and clinical utility to provide meaningful and actionable information to consumers and providers. Tier 2 (bin 2) genes/variants are those with established evidence of validity but insufficient evidence of utility to support a recommendation for use. Tier 3 (bin 3) genes/variants are those with either sufficient evidence for a lack of utility or presence of clear risk for harms, or those with insufficient evidence for both validity and utility.
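To make the tier logic above concrete, here is a minimal sketch in Python of how an application might be binned from its evidence profile. This decision function is a simplification of my own devising, not the actual CDC schema or the Berg et al binning algorithm.

```python
# Illustrative sketch of the three-tier evidence classification described above.
# The input categories and decision order are simplified assumptions.

def classify_application(clinical_validity, clinical_utility, evidence_of_harm=False):
    """Assign a genomic application to Tier 1, 2, or 3.

    clinical_validity: "sufficient" or "insufficient" evidence of validity.
    clinical_utility:  "sufficient", "insufficient", or "lack"
                       (sufficient evidence of NO utility).
    """
    # Tier 3: sufficient evidence for a lack of utility, or clear risk of harm
    if evidence_of_harm or clinical_utility == "lack":
        return 3
    # Tier 1: sufficient evidence for both validity and utility
    if clinical_validity == "sufficient" and clinical_utility == "sufficient":
        return 1
    # Tier 2: established validity but insufficient evidence of utility
    if clinical_validity == "sufficient":
        return 2
    # Tier 3 also covers insufficient evidence for both validity and utility
    return 3

print(classify_application("sufficient", "sufficient"))  # → 1 (e.g., an actionable test)
```

A real classification, of course, rests on systematic evidence review rather than two categorical inputs; the sketch only encodes the three definitions given above.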

Using this approach, we can evaluate how to obtain immediate benefits for participants of the cohort and their families. Examples of tier 1 conditions for which genetic testing is recommended in healthy people at high risk (for example, due to family health history) include hereditary breast and ovarian cancer syndrome (BRCA mutations); Lynch syndrome, associated with increased risk of colorectal cancer; and familial hypercholesterolemia, associated with increased risk of premature heart disease. An estimated 2 million people in the US have one of these conditions, and most are not aware of their risk. Once identified, there are evidence-based interventions that can significantly reduce their risk. The challenge lies in how to identify these individuals in the population. A cohort of a million or more people would be expected to include thousands of undiagnosed, unrecognized patients with these 3 disorders alone, who could be identified through genomic sequencing and clinical data. These individuals and their relatives could then take advantage of interventions to reduce their risk: an immediate health impact from this study.

In addition to these 3 conditions, other potential targets are the genes recommended by the American College of Medical Genetics and Genomics for inclusion in reports of clinical sequencing performed for other reasons (incidental findings) because of their clinical actionability. Moreover, a growing number of pharmacogenomic traits with varying levels of evidence are available for dozens of currently used drugs. There is also the possibility of expanded carrier testing for a wide variety of genetic disorders [PDF 1.31 MB] that could inform reproductive decisions. Thus, the proposed cohort provides an immediate opportunity to evaluate the use of genomic information in disease prevention and health care services, and to conduct applied research, involving communication and behavioral sciences as well as outcomes research, on patients, families, health care systems and communities.

Now within reach, our personal genomic sequence offers an incredible reflection of who we are, and great promise to improve human health, but there are serious concerns about embracing it too quickly.

Empowered Consumers in the Era of Me

If social media is any indication, we, like Narcissus of ancient myth, are surely self-obsessed creatures. Indeed, 21st century culture is epitomized by the endless sharing of one’s own image, along with constant records of daily personal and family life. Within a decade of its launch, Facebook had assembled more than 1.3 billion users by 2014, creating a community of people larger than China. With the widespread use of smart phones and tablets, participants can, wherever they happen to be, create and post images of themselves in a matter of seconds. In 2013, “selfie” became the Oxford Dictionaries word of the year after its use was said to have increased 17,000 percent from the previous year. Meanwhile, one of many YouTube tutorials on how to take the best self-portraits has garnered over 3.2 million views to date.

So it is not surprising that in a time of the “Me Me Me Generation” that there might be interest in possessing the ultimate description of “me”- the literal sequence of one’s own genome that can be carried in hand. The appeal of the genome may be uniquely powerful for those who feel they must know more about themselves, and indeed there has never been a “mirror” with so much potential power. The 3 billion base pairs in the individual cells in our bodies function, in conjunction with our environment, to shape who we are, how we look, how we age, and how healthy we are and will be. Francis Collins, Director of the National Institutes of Health described the human genome sequence as “… our own instruction book previously only known to God.” Moreover, many people view the genome as a means to better understand one’s ancestral past and as a window to their biological future. The greatest promise, however, is to use genomic information in preventing and treating disease, in a new era of precision medicine. For most aspects, the science isn’t there yet – but in the era of Instagram – many do not wish to wait.

“The Narcissome?”

…as costs have steadily decreased closer to the “$1,000 genome” benchmark, the number of human genomes sequenced has increased and may soon grow exponentially. …it has been estimated that by 2020 that 5 million people will have had their genomes sequenced.

Once prohibitively expensive, whole genome sequencing (WGS) was, until recently, available only to the very wealthy and accomplished and/or to researchers with unique access. In fact, the original human genome sequence, completed by the Human Genome Project, cost an estimated $2.7 billion; the first individual genome, belonging to acclaimed genomics pioneer Craig Venter, followed in 2007. Soon after, the genome of James Watson, co-discoverer of the structure of DNA, was sequenced as well. Since that time, other researchers have followed suit in volunteering their own biosamples for WGS, leading some critics to wryly describe such projects as the sequencing of the “narciss-ome.” But as costs have steadily decreased closer to the “$1,000 genome” benchmark, the number of human genomes sequenced has increased and may soon grow exponentially. In a recent study, 59% of people in a population-based sample expressed interest in WGS for themselves, and among participants who were parents, 58% were interested in WGS for their youngest children. The intense desire that some people have to know more about their personal genomes has been widely publicized, for example in documentaries such as “My Beautiful Genome.” Regardless, fascination with one’s genome is not mere vanity if it leads to improvements in health.

Depth of Data with Potential Health Benefits, But Also Potential Harms

It has long been predicted that affordable WGS will revolutionize the practice of medicine, and for people with rare genetic diseases, sequencing is already a useful diagnostic tool. However, the use of genomics to avoid future health problems became a mainstream issue in 2013, when actress and popular icon Angelina Jolie made public her decision to undergo surgery to reduce breast cancer risk, given that she carried a BRCA mutation and had a family history of ovarian cancer. The resulting increase in public awareness of issues involving genes, health, and risk was thought to be so strong that it has since been widely described as the “Angelina Effect.” Indeed, life-saving evidence-based recommendations and interventions exist for BRCA and hereditary breast and ovarian cancer, and for a few other examples like Lynch syndrome and colorectal cancer. These guidelines, for use in certain specific scenarios, are among the few genomic applications considered Tier 1, or ready for use in clinical practice today, but they apply to only a small percentage of the population. Nevertheless, research may soon lead us to many more such applications, which could identify millions of people who do not know they are at risk and could benefit from life-saving interventions.

In a recent study among 127 research participants, all but 4 expressed a desire to obtain their incidental findings if they could lead to treatment and surveillance though 61% (78/127) wanted to know them even if they did not.

Because WGS efficiently sequences the entire genome at once, soon there may no longer be a need for a variety of individual genetic tests that assess only one or a few variants at a time. However, what makes this “wide net” aspect of WGS a clinical game changer is that while it provides the information to diagnose people at risk for suspected genetic diseases, it simultaneously assays thousands of other genetic variants that are secondary to the reason for the test. Most of the resulting dataset provides information of as yet unknown importance, but it may also include “incidental findings [PDF 342.83 KB]”, variants with known clinical significance. There is currently great controversy in the clinical genetics field about how these findings should be reported to patients who undergo genome sequencing for any reason.

Many experts argue that trained genetic counselors and health practitioners should be involved in the interpretation [PDF 184.67 KB] of genetic test results, especially for WGS. One caution is that genetic information without known clinical significance imparts no useful information but can bring harms if healthy people act on it, including wasted economic resources; anxiety and psychological stress; misguided reproductive decisions; and unnecessary follow-up medical tests and procedures, including surgery. Very little research on WGS harms has been conducted to date, though a small study found that some people experience distress and regret related to their WGS results. The existence of harms in more established areas of genetic testing is more than a theoretical matter, for example false positives from DNA-based noninvasive prenatal testing and anecdotal reports of pregnancies terminated as a result. Other important ethical issues also remain, including privacy, genetic discrimination, and inequity in access.

Collectively, these concerns present a dilemma for the health care system, because they must soon be weighed against the demands and, many would argue, the rights of individuals to access their own medical data as WGS becomes increasingly mainstream. In a recent study among 127 research participants, all but 4 expressed a desire to obtain their incidental findings if they could lead to treatment and surveillance, though 61% (78/127) wanted to know them even if they did not.

While many patients are eager to find out what’s in their double helix, physicians are faced with a new predicament. Beyond the few available evidence-based recommendations, there are thousands of genetic studies suggesting that thousands of variants may or may not be associated with health outcomes. How does a doctor keep up with the exploding field? How does he or she answer questions about variants for which no clinical action can be taken? Are there liability concerns in revealing, or failing to discuss, this information?

A Public Health Approach is Needed

So, if the good news is that we are beginning to have affordable access to the “language of life,” the bad news is that we understand only a few words of it so far. There will be no Rosetta Stone to guide us. What we do know is that, except for single gene disorders, disease causation is complicated and involves interactions among many genetic and environmental factors. Thus, careful population-based research into both is critical for accurate interpretation.

However, as the price of WGS decreases and the knowledge base to interpret it rapidly increases, many believe that we will soon know enough about the benefits and harms for the technology to be ready for population-wide screening. If so, we as members of the general public, and our medical providers, may soon have a tremendous amount of uniquely personal information at our fingertips. Meanwhile, in the next few years, most WGS is expected to be conducted in association with clinical research efforts, including the Precision Medicine Initiative: a direct effort to increase the validity and utility of WGS data.

While WGS technology provides an unequivocal reflection of personal biology, it could be argued that for most people a common plate-glass mirror provides more useful information for health today. However, it has been estimated that by the year 2020, 5 million people will have had their genomes sequenced. In order for this promising new technology to provide the best balance of health benefits and harms for all people, evidence- and population-based science must lead the way to responsible implementation. Only then can the public health goals of precision medicine be realized. For now, we should look into the captivating pool of information that reflects our magnificent genome, strive to carefully learn and understand the murky depths of its complexity, and take care not to plunge in too quickly.

In the past month, two very large studies have made remarkable progress in quantifying levels of breast cancer genetic risk, both for hereditary cancer (associated with BRCA1/2 mutations) and for the more common breast cancer cases (associated with polygenes). In the first study, of more than 31,000 women with BRCA1/2 mutations from 55 centers in 33 countries on 6 continents, researchers estimated the magnitude of risk for breast and ovarian cancer based on mutation type, function, and position. They found that different BRCA1/2 mutations are associated with significantly different risks of breast and ovarian cancer depending on where the mutations occur within the genes. For example, mutations located near the ends of the BRCA1 coding sequence were associated with a greater risk of breast cancer, while mutations located near the middle conferred a higher risk of ovarian cancer. Of course, we have known about the BRCA1/2 genes for more than two decades, and these findings add to the knowledge already gained from the many types of mutations found in affected patients and their families over that time. The new data, if appropriately validated, will have implications for risk assessment and cancer prevention decision making for carriers of BRCA1 and BRCA2 mutations. Ultimately, as Dr Francis Collins writes, “our hope is not only to spare women with BRCA1/2 who are at low risk of cancer from needless surgery, but to use this newfound knowledge to develop drugs and other less-invasive strategies for cancer prevention in high-risk women.”

In the second study, of more than 33,000 breast cancer cases and 33,000 control women, researchers assessed the value of using 77 breast cancer-associated common variants for breast cancer risk stratification (rare variants such as BRCA1/2 mutations were not included in this study). They constructed a genetic risk score based on the combinations of variants. Women in the highest 1% of the genetic risk score had a three-fold increased risk of developing breast cancer compared with women in the middle range. The authors estimated that the lifetime risks of breast cancer for women in the lowest and highest quintiles of the risk score were 5.2% and 16.6% for a woman without a family history, and 8.6% and 24.4% for a woman with a first-degree family history of breast cancer. They concluded that the observed level of risk discrimination could inform targeted screening and prevention strategies. In addition, further stratification of risk may be achieved by combining genetic risks with lifestyle/environmental factors (which were not measured in this study). Until recently, we knew that multiple common susceptibility variants could be combined to identify women at different levels of breast cancer risk, but population data were not available. The findings of this study, if further validated, could lead to a general genetic risk assessment strategy for all women, not only those who harbor BRCA1/2 mutations.
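To see how a score like this combines many common variants, here is a minimal sketch under a standard log-additive model, in which each risk allele multiplies a person's odds by that variant's odds ratio. The effect sizes and allele frequencies below are hypothetical placeholders, not the published estimates for the 77 variants used in the study.

```python
import math

# Hypothetical per-variant effects: beta = log(odds ratio) per risk allele,
# freq = population frequency of the risk allele. Real scores use published values.
variants = [
    {"beta": 0.10, "freq": 0.30},
    {"beta": 0.26, "freq": 0.12},
    {"beta": 0.05, "freq": 0.45},
]

def polygenic_risk_score(genotypes, variants):
    """Sum of risk-allele counts (0, 1, or 2 per variant) weighted by log odds ratios."""
    return sum(g * v["beta"] for g, v in zip(genotypes, variants))

def relative_risk(genotypes, variants):
    """Odds relative to the population-average score, under the log-additive model."""
    mean_score = sum(2 * v["freq"] * v["beta"] for v in variants)
    return math.exp(polygenic_risk_score(genotypes, variants) - mean_score)

carrier = [2, 1, 2]  # risk-allele counts at each of the three variants
print(round(relative_risk(carrier, variants), 2))  # → 1.48
```

With only three variants the spread in risk is modest; it is the accumulation across dozens of variants, as in the 77-variant score, that produces the several-fold differences between the lowest and highest quintiles described above.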

Commenting on 25 years of breast cancer risk estimation, Mitchell Gail, the architect of the “Gail model,” discussed the importance of quantifying absolute risk, namely the probability that a woman with specific risk factors will develop breast cancer over a defined age interval, as was done in the second study. Breast cancer is a common cancer for which a screening intervention (mammography) has been developed and is recommended for women aged 50-74 years by the US Preventive Services Task Force (USPSTF). For women aged 40-49 years, however, the USPSTF determined that “The decision to start regular, biennial screening mammography before the age of 50 years should be an individual one and take into account patient context, including the patient’s values regarding specific benefits and harms.” Although age is the most important risk factor, many women in their forties can have a higher or lower level of risk based on other factors, including genetic risk scores.

Nevertheless, evaluating the inclusion of genetic risk scores in recommendations on breast cancer screening is not straightforward. Steve Narod writes in his recent commentary: “If genomewide association studies were the first step towards precision medicine and the development of the model is the second step, then the third step is to show that the personal risk score is useful. Who should be tested and who will pay?” For example, if a woman’s genetic risk score places her in the top percentile, should we offer her earlier and more frequent screening, or more intensive screening (e.g., with MRI)? The option of risk-reducing medications is also available. However, there is still no direct evidence on whether stratified screening and prevention using a genetic risk score will lead to overall net benefit versus harm for individual women and the population at large. In the United Kingdom, several workshops have examined scientific, ethical and logistical aspects of stratified population screening for breast cancer based on polygenic susceptibility. The promise of genetic stratification was recognized, with the combination of age and genetic risk profile theoretically providing a more efficient screening program compared with age alone. However, the workshops also recognized that key scientific, ethical and practice issues need to be addressed before genetic stratification for breast cancer can be implemented in practice.

In conclusion, data from two recent large population studies provide more precise estimates of breast cancer risk in women with high risk mutations in BRCA1/2 genes, and in all women based on their polygenic risk profile. These types of studies can influence clinical preventive services with additional evaluation of the utility of this information in reducing the burden of breast cancer morbidity and mortality. In the meantime, family history will continue to serve as a valuable low-tech tool in the stratification of breast cancer risk for screening and prevention. Beyond breast cancer, the large scale epidemiologic investigations of genetic risk will usher in a new era of precision prevention for many diseases in the years to come.

Denmark has gathered more data on its citizens than any other country. The Danish Civil Registration System (CRS) contains individual-level information on all residents of Denmark (and Greenland as of 1972). By January 2014, the CRS had registered 9.5 million individuals and more than 400 million person-years of follow-up. A unique ten-digit Civil Personal Register number assigned to all persons in the CRS allows individual-level record linkage of all Danish registers. Daily updated information on migration and vital status allows for nationwide cohort studies with virtually complete long-term follow-up on emigration and death. The CRS facilitates sampling of general population comparison cohorts, controls in case–control studies, family studies and targeted population surveys. The data in the CRS are virtually complete, have high accuracy, and can be retrieved for research purposes while protecting the anonymity of Danish residents. Although other Scandinavian countries have their own databases, Denmark has the reputation for possessing the most complete collection of statistics and databases touching on almost every aspect of life. The Danish government has compiled nearly 200 databases, some begun in the 1930s, on everything from medical records to socioeconomic data on jobs and salaries. These databases “allow for instant, large cohort studies that are impossible in most countries.” Examples of genetic studies using this unique resource include studies of genes and lifestyle in aging using the Danish Twin Register which includes 110,000 pairs of twins. Another example is a recent series of genomewide association studies that have identified genetic factors associated with febrile seizures in children after receiving the measles, mumps, and rubella (MMR) vaccine.
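The individual-level linkage the CPR number makes possible is, at its core, a key-based join across registers. Here is a minimal sketch of that mechanism; the register names, identifiers, and fields are invented for illustration and do not reflect the actual Danish register schemas.

```python
# Hypothetical sketch of record linkage via a shared personal identifier,
# as the Danish CPR number enables across national registers.

civil_register = {
    # personal ID -> demographic and vital-status data (illustrative)
    "0101701234": {"birth_year": 1970, "vital_status": "alive"},
    "1503651111": {"birth_year": 1965, "vital_status": "emigrated"},
}

hospital_register = [
    # clinical events keyed by the same personal ID (illustrative)
    {"cpr": "0101701234", "diagnosis": "type 2 diabetes", "year": 2010},
    {"cpr": "1503651111", "diagnosis": "febrile seizure", "year": 1967},
]

def link_records(civil, hospital):
    """Attach demographic data to each clinical record via the shared ID."""
    return [dict(rec, **civil[rec["cpr"]]) for rec in hospital if rec["cpr"] in civil]

linked = link_records(civil_register, hospital_register)
print(linked[0]["diagnosis"], linked[0]["birth_year"])  # → type 2 diabetes 1970
```

Because every register shares one identifier, a cohort defined in one register can be followed through all the others, which is what makes the "instant, large cohort studies" described above feasible.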

Assembling a comparable national resource in the United States faces several challenges:

Time required to obtain meaningful results – Longitudinal studies of chronic disease outcomes span decades to allow a robust number of endpoints to accrue.

Contact – Existing cohorts are heterogeneous with respect to permission for data sharing and the need for researchers to re-contact/consent participants.

Demographics – Existing US cohorts do not completely represent the American population or projected demographic changes.

Privacy – There are concerns about privacy, security and access to individual data and health records.

Dynamic technologies – Administrative claims, digital, and smart-phone technologies to track participants over time and space are rapidly evolving.

Scope – A sufficient sample size is required to capture the small proportion of people with a specific disease or genotype.

Coordination, transparency, and governance – Necessary information is not readily available, owing to fragmentation of electronic health records and claims data, data platforms, and health care systems.

The proposed solution to create a national cohort [PDF 185.66 KB] is to build upon a platform of existing cohorts. By assembling existing cohorts into a large consortium with a central infrastructure, NIH could harmonize data types; enhance data collection; achieve economies of scale; and provide a resource for addressing new scientific questions.

One model of a US cohort consortium has been successfully implemented for years at the National Cancer Institute (NCI). The NCI Cohort Consortium seeks to address the need for large-scale collaborations to pool the large quantity of data and biospecimens necessary to conduct a wide range of studies. The Cohort Consortium includes investigators responsible for more than 40 high-quality cohorts involving more than 4 million people. The cohorts cover large, rich, and diverse populations. Extensive risk factor data are available, and biospecimens including germline DNA collected at baseline, are available on more than 2 million individuals. Investigators team up to use common protocols and methods, and to conduct coordinated and pooled analyses.

Obviously, the conversation has just started among funding agencies, scientists, patients and other stakeholders on the optimal design of a US national consortium of cohorts. While the US cannot be a single “cohort,” the long-term benefit of assembling a nationally representative cohort of the population is definitely worth a try.


These same technologies are ushering in a parallel era of “precision public health” that goes beyond individualized treatment of sick individuals. The word “precision” in the context of public health can be simply described as improving the ability to prevent disease, promote health and reduce health disparities in populations by: 1) applying emerging methods and technologies for measuring disease, pathogens, exposures, behaviors, and susceptibility in populations; and 2) developing policies and targeted public health programs to improve health. We are currently seeing the initial drive towards precision public health but much more work lies ahead, especially in collaboration with health care. The following are emergent areas and some examples.

Improving Early Detection of Pathogens and Infectious Disease Outbreaks

CDC’s new surveillance strategy for the 21st century [PDF 110.88 KB] will accelerate the use of emerging tools and approaches to improve the availability, quality, and timeliness of surveillance data for policy and decision makers. The surveillance strategy will also enhance the linking of public health data with clinical systems and healthcare professionals. The Health Information Technology for Economic and Clinical Health (HITECH) Act and the associated Meaningful Use requirements are an unprecedented opportunity for clinicians, healthcare providers, and public health officials to benefit from greater electronic connectivity; public health reporting; and surveillance and tracking of health outcomes and of the effectiveness of laboratory tests and interventions in the “real” world.

A third area of precision public health will result from specific advances in biomedical and public health sciences to target disease prevention to subsets of the population at high risk. For the use of genomics in healthy populations, this could take years to mature; however, there is an emerging list of genomic applications that merit a population level approach, such as finding undiagnosed patients and their relatives with selected genetic disorders for which interventions can save lives. For example, the use of genome sequencing in healthy populations has already been proposed both to complement newborn screening programs, and to develop novel adult genetic screening for selected conditions. Genomics, however, is only one of many avenues for identifying high risk populations for screening and interventions. CDC’s public health programs already use targeted approaches, for example, by recommending screening for hepatitis C in people born from 1945 through 1965 (baby boomers) and identifying people with prediabetes through the National Diabetes Prevention Program.

There are many challenges for precision public health including developing a strong evidentiary foundation for using new methods and technologies, building a sustainable informatics capacity to enhance connectivity and interoperability of various systems, dealing with various ethical and social issues such as privacy, educating the public health workforce about the use of new technologies, and empowering the public with unbiased and accurate information that can improve health. Finally, only through the collaboration of health care and public health will we achieve optimal population health outcomes. These are the early days of precision public health and it is not just about “genes, drugs and disease.”


In January 2015, a paper in Science created a “buzz” in the scientific community and the media. Based on statistical modeling, the authors suggested that “only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions. The majority is due to ‘bad luck,’ that is, random mutations arising during DNA replication in normal, noncancerous stem cells.” They went on to suggest that this finding is important not only for understanding disease but also for designing strategies to limit the mortality it causes. The release of the paper was immediately accompanied by sensational “bad luck” headlines in the media.

Scientific and public health communities quickly questioned both the methods and the interpretation of the paper’s results, and countered the notion of bad luck with the message that many cancers can be prevented by acting on what we already know about cancer risk factors.

Regardless of the paper’s scientific merit and the media’s misinterpretation of it, the individual and population perspectives, both of which seem to have been lost in the heated back-and-forth debate, can be reconciled.

From an individual’s perspective, the perception of “bad luck” may be unavoidable. “Why me?” is a question that people who get cancer or any other disease may well ask themselves. Even for diseases with strong risk factors such as cigarette smoking and lung cancer, most smokers will not develop lung cancer. When smokers develop lung cancer, is it because of bad luck or because they have other known risk factors (such as family history, radon exposure) or because of unknown factors?

From a public health perspective, we know that a large number of new cancer cases and many cancer deaths can be avoided through lifestyle changes and the use of cancer screening. Cancer risk may be reduced by avoiding tobacco, limiting alcohol use, avoiding excessive exposure to ultraviolet rays from the sun and tanning beds, eating a diet rich in fruits and vegetables, maintaining a healthy weight, and being physically active. Screening for cervical and colorectal cancers as recommended helps prevent these diseases by finding precancerous lesions that can be treated before they become cancerous. Screening for cervical, colorectal, and breast cancers also helps find these diseases at an early, often highly treatable stage.

Vaccines also help reduce cancer risk. The human papillomavirus (HPV) vaccine helps prevent most cervical cancers and some vaginal and vulvar cancers, and the hepatitis B vaccine can help reduce liver cancer risk. Making cancer screening, information, and referral services available and accessible to all Americans can reduce cancer incidence and deaths.

Genetics also plays a role in the occurrence of many cancers. For example, people with family history of breast or ovarian cancer are at increased risk and may benefit from genetic counseling and testing, with available interventions to reduce their risks. Similarly, people with family history of colorectal cancer may benefit from earlier and more frequent screening for early detection and interventions. There is also increasing evidence that the interaction between genetic and environmental risk factors can increase risk for a wide variety of cancers.

The bottom line: for cancer and for most human diseases, the notion of bad luck should be reconciled with the evolving scientific knowledge on genetic and environmental factors associated with disease risk and progression. Such knowledge provides an important foundation for intervention and for reducing the burden of disease for individuals and populations.

The HuGE published literature database now contains more than 100,000 citations, a milestone reached at the end of 2014. The Office of Public Health Genomics has compiled this database since 2001 via weekly systematic sweeps of PubMed performed by a single curator. For the first five years, a complex PubMed query was used to identify studies of genotype prevalence, gene-disease association, gene-environment interaction, and the performance characteristics of genetic tests. In 2006, a data mining approach using support vector machines replaced the PubMed query, reducing the time needed for hand curation and improving both sensitivity and specificity. The database and a suite of online tools to explore it were re-launched as the HuGE Navigator.

Since the first draft of the human genome sequence was announced in 2001, PubMed has added more than one million articles on human genetics and genomics. Human genome epidemiology has grown, too, but studies of genetic variation and disease in populations—i.e., groups of people not defined by family relationships—still account for only a small fraction of the total (Figure 1).

Figure 1. Articles in the HuGE published literature database, by year of publication, 2001–2014*

A boom in gene discovery followed the introduction of genome-wide association studies (GWAS) in 2005; following up on these discoveries to unravel genetic contributions to disease, however, remains extremely challenging. There are no “high-throughput” shortcuts to understanding. Now that it seems clear that common genetic variants have only small effects on disease risk, the field has shifted toward studies of rare variants with large effects. This may look like a return to the pre-Human Genome Project roots of genetic epidemiology; discoveries in this phase, however, are just the next steps toward building the knowledge base for population-level interpretation.

Meta-analysis has become popular as a first step in knowledge synthesis. Concern over the proliferation of poorly conducted meta-analyses, however, led the editors of PLOS ONE to establish explicit quality criteria for submitted manuscripts and the American Journal of Epidemiology has endorsed this approach. Although rigorous meta-analysis can be useful for assessing and refining gene discoveries, it does not suggest next steps. Other methods are needed to integrate genetic data into ways of thinking that can help us understand, prevent and treat disease. Human genome epidemiology must evolve to help meet this challenge.

On Jan 5, 2015, the HuGE Navigator completed transition to a completely automated curation process based on machine learning and data extraction. This method has achieved 90% sensitivity and specificity when tested against the previous, semi-automated process. The HuGE published literature database will continue to be updated weekly with automatic indexing of gene symbols, study type (meta-analysis, GWAS), and category (pharmacogenomics, genetic testing).
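The fully automated curation step amounts to a text-classification problem: score each incoming PubMed abstract for relevance and keep those above a threshold. The real HuGE pipeline uses trained machine-learning models; the sketch below is only a minimal keyword-weight stand-in, with invented keywords, weights, and threshold, to illustrate the idea.

```python
# Toy relevance scorer for abstracts, standing in for the machine-learning
# classifier described above. The keywords, weights, and threshold are
# invented for illustration; they are not the actual HuGE pipeline.
KEYWORD_WEIGHTS = {
    "polymorphism": 2.0,
    "genotype": 2.0,
    "association": 1.5,
    "allele": 1.5,
    "cohort": 1.0,
    "case-control": 1.0,
}

def relevance_score(abstract: str) -> float:
    """Sum the weights of curation keywords found in the abstract text."""
    text = abstract.lower()
    return sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)

def is_huge_candidate(abstract: str, threshold: float = 3.0) -> bool:
    """Flag abstracts scoring at or above the (hypothetical) threshold."""
    return relevance_score(abstract) >= threshold
```

In a trained system the weights would be learned from curator-labeled examples rather than hand-assigned, but the decision rule (score, then threshold) has the same shape.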

Human genome epidemiology is a global enterprise. The first 100,000 articles in the database included authors from 151 countries (Figure 2). The HuGE Navigator will remain online as a freely accessible resource for all who are interested in human genetic variation and population health.

Muin J Khoury, Director, Office of Public Health Genomics, Centers for Disease Control and Prevention | http://blogs.cdc.gov/genomics/?p=3191 | Posted January 29, 2015

The announcement of a major new US Precision Medicine Initiative comes more than a decade after the completion of the Human Genome Project, the ambitious project that culminated in sequencing all 3 billion base pairs of our genome. Continuous improvement in the quality of sequencing, dramatic reductions in price, and ongoing advances in multiple sectors of biotechnology all promise a new era of medicine known variously as personalized medicine, genomic medicine and, more recently, precision medicine. With conventional medicine, patients are treated individually but typically receive the same treatment as everyone else with that condition. Thus an opportunity may be missed: certain medical interventions can be more effective, or cause fewer side effects, for some patients than for others, making it important to identify in advance which patients are more or less likely to benefit from an intervention. This is where precision medicine comes in. Precision medicine takes into account individual differences in the genes, environments, and lifestyles of people, allowing the design of targeted disease interventions from the start. While genomics is often suggested as the leading driver of personalization, other factors may be equally important. For example, health information technology can be used to integrate medical history into patient-centered approaches to improving health and treating disease.

As paradoxical as it may seem, while precision medicine focuses on individualized care for each patient, its success truly requires a population-based perspective. First, it is important to learn what works and what does not for one person, but it is impossible to infer causality by working with one person at a time. To be informative, data on an individual need to be compared with data from large numbers of people to recognize important individual characteristics and to identify relevant population subgroups that are likely to respond differently to drugs and other interventions.

Third, while precision medicine is currently focused on treatment, a compelling case can be made for giving even more attention to early detection and disease prevention. Although personalized treatments can help save the lives of people who are already sick, disease prevention applies to all of us. “Precision prevention” may then help direct both science and limited resources toward targeting prevention strategies to subsets of the population. For example, recent data suggest that knowing the speed with which some people metabolize nicotine, based on genetic and other factors, could lead to personalized smoking cessation interventions to complement the highly successful public health efforts that have reduced smoking over the past few decades. Another approach to precision prevention is increased screening of people at greater risk of cancer. Family health history collection is an inexpensive tool for identifying individuals and families that require earlier and more intensive screening for breast and ovarian cancer or colorectal cancer.

Finally, implementation of precision medicine requires the full participation and education of patients (all of us), communities, physicians, payers, and the healthcare community. This should be guided by strong “translational” implementation sciences which go beyond the traditional bench to bedside model (see recent paper on this topic). Society has a stake in assuring that the national investment in precision medicine research leads to tangible health benefits for all and does not worsen existing health disparities. This is where strong public health-healthcare partnerships are key in assessing the needs of individuals and communities, developing appropriate policies and guidelines, ensuring that all people have access to the intended benefits of technology, and tracking effectiveness and cost effectiveness outcomes in the real world.

These are the early days of precision medicine. The road ahead is long. Let us make sure that a public health perspective is included at the outset to ensure the success of research and ultimately the effective and responsible implementation of new scientific discoveries for the benefit of all.

asw6 | http://blogs.cdc.gov/genomics/?p=3074 | Posted December 30, 2014

Recent advances in next generation sequencing (NGS) could potentially revolutionize newborn screening, the largest public health genetics program in the United States and around the world. Over the last five decades, newborn screening has grown from screening for one condition, phenylketonuria (PKU), in one state, to nationwide screening for at least 31 severe but treatable conditions, most of which are genetic. Each year, thousands of babies in the United States are saved from lifelong disability and even death by timely diagnosis and initiation of treatment. An important aspect of newborn screening is speed; many of the diseases screened for are inborn errors of metabolism, in which the baby’s body cannot properly break down certain substances in food, which can then build up to toxic amounts.

Most newborn screening tests are biochemical tests that use drops of blood from a heel prick, dried onto a piece of filter paper. The introduction of tandem mass spectrometry made it possible to screen for many conditions at the same time using a single test, and led to a huge expansion in the number of conditions that could be included, with some U.S. states now screening for over 50 conditions.

The initial, first-tier screening tests detect those babies at increased risk of being affected by any of the diseases targeted by the program. Sometimes further screening tests (second-tier tests) are needed to narrow down the group of babies to determine which ones require diagnostic testing. For some conditions, the second-tier test is based on DNA. For example, the first-tier screen for cystic fibrosis looks at concentrations of the immunoreactive trypsinogen (IRT) protein. In some states, genetic screening is then done on the newborn screening blood spots from those babies with high IRT concentrations to look for mutations (changes) in the gene that causes cystic fibrosis. However, this screen currently uses DNA panels that look at a limited number of mutations. If the initial panel identifies only one mutation, in many cases the screen will be considered positive and the infant will be referred for diagnostic testing (a sweat test). However, in some cases, newborn screening labs may first perform further sequencing to determine whether a second mutation is present. Sequencing has also been proposed for second-tier screening for severe combined immunodeficiency (SCID).
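The two-tier cystic fibrosis screening logic described above can be sketched as a simple decision rule. The IRT cutoff, the handling of panel results, and the return strings below are hypothetical examples; actual cutoffs and referral rules vary by state program.

```python
def cf_screen_result(irt_ng_ml: float, mutations_found: int,
                     irt_cutoff: float = 60.0) -> str:
    """Illustrative two-tier cystic fibrosis screening logic.

    The cutoff value and decision rules here are invented for
    illustration; real protocols differ between state programs.
    """
    # First tier: immunoreactive trypsinogen (IRT) concentration.
    if irt_ng_ml < irt_cutoff:
        return "screen negative"
    # Second tier: DNA mutation panel run on the same dried blood spot.
    if mutations_found >= 1:
        # One or more panel mutations: refer for diagnostic sweat testing
        # (some programs first sequence to look for a second mutation).
        return "screen positive - refer for sweat test"
    # High IRT but no panel mutations: handling varies by program.
    return "screen negative (high IRT, no panel mutations)"
```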

A recent paper describes a method that takes this idea further, by using NGS to sequence 126 genes that together comprise the majority of the genes currently implicated in most newborn screening conditions (except hearing loss, critical congenital heart defects, and severe combined immunodeficiency). The method then uses a computer program to “filter” the sequence information so that only genetic variants potentially related to one of the specified diseases are reported. The researchers were able to use dried blood spots for some but not all of the targeted sequencing tests.

Although this method is suggested by the researchers as a second-tier screening test, it might someday be considered as a first-tier test for many of the newborn screening conditions. If all genes associated with the genetic conditions evaluated through newborn screening were sequenced at once, in a time- and cost-efficient manner, the need for follow-up testing might be greatly reduced. However, challenges remain before this strategy could be used in newborn screening, and further studies are needed. The recent study was not population-based, but instead used samples from children known to be affected by specific newborn screening disorders. Furthermore, while the technique worked well to identify potentially harmful sequence changes in the targeted 126 genes, this information alone only provided the correct diagnosis for 75% of the cases; for the rest, information about the child’s clinical condition was needed to complete the diagnosis.

Currently, NGS is expensive, although prices continue to drop as the technology improves. The need for rapid screening test results and the type of sample available for analysis (dried blood spots might not currently work in some cases) are also limitations that will potentially change as the technology evolves.

More information will be needed on the sensitivity (the percentage of affected infants that are picked up by the screen) and specificity (the percentage of unaffected children that are not picked up by the screen) of NGS for the different newborn screening conditions. Tandem mass spectrometry measures the concentrations of metabolites in the blood, so that concentrations that are out of range (too high or too low) are detected directly without needing to know the mechanism or mutations involved. Additionally, metabolite concentrations that are out of range might reflect rare conditions not detectable by mutation analysis, but that require prompt treatment, such as vitamin B12 deficiency of the newborn. Thus, tandem mass spectrometry, at least for now, provides a more accurate way to detect inborn errors of metabolism.

Most state newborn screening laboratories do not currently use NGS, so introducing this technology would require purchasing new equipment, hiring or training staff, and other considerations. Another challenge would be the large amount of data generated by NGS. Currently, newborn screening data systems deal with a much smaller set of data points for each screening test performed. For example, for cystic fibrosis screening, state laboratories would store the IRT concentration, the cutoff, and the results of the DNA mutation panel if second-tier screening was done. The informatics capabilities needed to store and analyze sequencing data would be much greater.
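Sensitivity and specificity, as defined above, can be computed directly by comparing screening calls against confirmed diagnoses. The function and toy data below are illustrative only, not real screening results.

```python
def sensitivity_specificity(affected, screen_positive):
    """Compute screening sensitivity and specificity.

    Both arguments are lists of booleans over the same infants:
    affected[i] is the confirmed diagnosis, screen_positive[i] is the
    screening call. The numbers below are toy data for illustration.
    """
    tp = sum(a and s for a, s in zip(affected, screen_positive))          # true positives
    fn = sum(a and not s for a, s in zip(affected, screen_positive))      # false negatives
    tn = sum(not a and not s for a, s in zip(affected, screen_positive))  # true negatives
    fp = sum(not a and s for a, s in zip(affected, screen_positive))      # false positives
    sensitivity = tp / (tp + fn)  # fraction of affected infants detected
    specificity = tn / (tn + fp)  # fraction of unaffected infants correctly excluded
    return sensitivity, specificity

# Toy example: 10 infants, 5 truly affected; the screen misses one
# affected infant and wrongly flags one unaffected infant.
affected        = [True, True, True, True, True, False, False, False, False, False]
screen_positive = [True, True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(affected, screen_positive)
# sens = 4/5 = 0.8, spec = 4/5 = 0.8
```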

For now, newborn screening using whole-genome or exome sequencing raises more questions than it answers. When state laboratories started using tandem mass spectrometry for newborn screening, some raised concern that the technology was driving the screening and cautioned that the ability to screen for a condition was not reason alone to report the results. Newborn screening using whole-genome or exome sequencing will face many of these same issues. The time to think about how to resolve them is now.

During the past decade, the number of laboratories offering genetic testing has remained relatively flat; however, the number of diseases for which testing is available has increased consistently and dramatically. As of November 2014, there were more than 42,000 tests available for just over 4,000 disorders. In terms of raw numbers, the recommendations from a decade of EGAPP fall drastically short of covering the field. Nevertheless, EGAPP, dubbed by CDC as a pilot initiative, has been enormously influential in prioritizing tests for evaluation, determining what questions need to be asked and answered, and identifying key crosscutting weaknesses in research that must be addressed in genomics.

Muin Khoury presenting a certificate of recognition to Ned Calonge

We have heard both high praise and harsh criticism of the EGAPP initiative over the past decade. Commendations often center upon the assertion that nobody else is doing what EGAPP does, and that the general approach EGAPP uses, of assessing the analytic and clinical validity, and especially the clinical utility of genomic tests, is critically needed. Criticisms, on the other hand, generally fall into three categories: 1) that the process is too slow, 2) that the evidentiary “bar” espoused by EGAPP (which relies on demonstration of clinical utility) is too high, and as a result, 3) guidelines with insufficient evidence to recommend for or against testing will proliferate, thus, in essence, offering no clinically actionable guidance whatsoever.

We gratefully accept the kudos; indeed, we have seen increased use of EGAPP-inspired concepts and frameworks in genomic evaluations. More and more, we see groups such as ACMG, ASCO, and NCCN producing guidelines that involve assessment of clinical utility. It can no longer be said, if it ever could, that EGAPP reviews and recommendations are unique in addressing this critical area in genomics. We acknowledge that high quality systematic reviews require significant investments in time and resources. Whether requiring demonstration of clinical utility is too stringent a criterion for implementation of tests in clinical practice may largely be a decision for individual practitioners, patients, and health payers. For implementation of genomic testing applications in public health programs, however, we argue that establishment of clinical utility is a necessary prerequisite. Finally, we acknowledge that two-thirds of the published EGAPP recommendations have concluded that insufficient evidence was available to recommend for or against testing. In these instances, we are still quite pleased that the process has led to the identification of key gaps in knowledge that should help to set productive future research agendas.
Needless to say, we are very pleased that EGAPP has been able to produce two recommendations for (cascade screening for Lynch syndrome, pharmacogenomic KRAS testing with anti-EGFR therapy), and one recommendation against (factor V Leiden testing for idiopathic venous thromboembolism), use of genomic applications in specific clinical scenarios. In addition, we appreciate the nuances available in EGAPP methods, which allow encouragement or discouragement (e.g., CYP450 testing for patients with major depression before treatment with specific SSRI drugs) of use in those instances where evidence is insufficient for a formal up-or-down recommendation.

As we celebrate a decade of the EGAPP initiative, the immediate priority for OPHG is to move more downstream along the translation continuum with a new initiative that benefits from the lessons learned by the EWG. As implied above, we believe that EGAPP has had a tremendously positive impact, primarily through influence on the methods of other groups evaluating genomic testing in the T2 phase of translation. In addition, the emerging ClinGen Resource appears poised to address expansive tracts in the genomics T2 space, with assessment of clinical utility in mind. Our new initiative will facilitate the development and application of stakeholder-centered methods, approaches, and best practices for reviewing the evidence on implementation and impact of public health genomic technologies in the real world. While guidelines may be developed as needed through the initiative, they will not be the central focus. Instead, emphasis will be placed on accelerating the implementation of genomic applications that hold particular promise in public health, with stakeholder input, through a variety of means and methods. An expansive group of stakeholders will be recruited to inform the initiative, with CDC programs and state health departments comprising the core stakeholder group. Beyond this, representatives from a broad array of federal agencies will be included, as will professional and patient organizations. One or more methods development groups will be assembled from the stakeholders, to review and address challenges in implementation identified by the stakeholders themselves. Methods groups will not be standing committees, but will be formed to address specific projects before dissolving back into the larger pool of stakeholders. Current and former EWG members will be invited to join the stakeholder group, as will members of the EGAPP steering committee.

Full details on the new initiative are still being hammered out, and you can expect to read more about it in the coming months here on the OPHG blog. In the meantime, the current EWG expressed strong interest in maintaining its independent status and will continue to develop and apply methods for genomic test evaluation. As we close out this chapter of public health genomics, we want to thank each and every member of the EWG for their amazing work and service. The field of genomics owes EGAPP a tremendous debt of gratitude for developing a strong evidentiary foundation for genomic medicine. We look forward to working with members of the EWG under the aegis of the new public health genomics initiative.