Using Statistical Models to Address Complex Environmental Health Data

DERT Success Story

Roger Peng, Ph.D., is an associate professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. He says that studying the public health impacts of exposure to particulate matter and its constituents is needed to better inform national air quality regulations and standards.


NIEHS grantee Roger Peng, Ph.D., serves as co-director of the data management and statistics core for the NIEHS Center for Childhood Asthma in the Urban Environment at the Johns Hopkins Bloomberg School of Public Health. As an environmental biostatistician, he recognizes the importance of developing statistical methods and using data science skills to address environmental health issues.

“Environmental health involves the integration of data from various disciplines, including epidemiology, toxicology, biostatistics, and clinical medicine,” Peng explained. “Data science skills are critical for bringing the people and data from these disciplines together to address complex environmental health problems.” Unfortunately, some data scientists do not naturally anticipate how their expertise and skills can be applied in environmental studies. “Scientists in both the data science and environmental health communities should offer more cross-disciplinary training programs to encourage the development of students and professionals within these fields,” he added.

Peng explores how statistical models can be used to demonstrate the human health impacts of spatial-temporal changes in air pollution. Particulate matter (PM) air pollution is a complex mixture of extremely small, inhalable particles. PM size, coarse (PM10) or fine (PM2.5), is a major determinant of its potential to cause toxicity and adverse health effects. Studies in people have shown that PM exposure is associated with a wide range of adverse health outcomes. However, there is some evidence that specific constituents, or components, of PM differ with regard to their toxicity and effects on human health.

Peng and collaborators performed the first-ever national, season- and region-specific analysis to explore associations between mortality and short-term exposure to fine PM constituents. Using statistical techniques, they analyzed the effects of seven major constituents across 72 U.S. urban communities. With single-pollutant models, they determined that four constituents (organic carbon matter, elemental carbon, silicon, and sodium ion) were most strongly associated with mortality between 2000 and 2005. However, they found no evidence of seasonal or regional effects for associations between mortality and fine PM or its constituents within this time period.

These findings suggest that some constituents may be more toxic than others, and regulating fine PM alone may not protect human health sufficiently. “Based on these findings, it is also clear that we should consider looking more closely at how the chemical composition of these harmful particles vary based on source of PM, such as automobiles, coal-fired power plants, and ports,” Peng explained. “This work could help policy makers and stakeholders develop more targeted strategies and standards for reducing fine PM exposure.”

PM air pollution is not the only public health concern; in the wake of global climate change predictions, rising temperatures compounded by more frequent and severe heat waves also present a major health threat. To quantify health effects related to climate change, Peng and colleagues used statistical models to predict the impacts of heat waves on future mortality risk in Chicago, Illinois, for the years 2081 to 2100. They estimated that, in the absence of climate adaptation, Chicago could experience 166 to 2,217 excess deaths per year from heat waves.

Sharing Data Science Skills and Education

Peng and his colleagues have recently launched a series of online data science courses, including Data Science Specialization and Genomic Data Science. “These courses are reasonably priced and open to all,” Peng mentioned. “Data science skills will become fundamental essentially for every field of study. Therefore, we want to make data science education available to as many people as possible.”

Recognizing the significant health risks that heat waves pose to vulnerable populations, such as the elderly, Peng and collaborators performed another study to examine heat-related emergency hospitalizations for respiratory diseases. In the largest study of the elderly to date — 12.5 million Medicare beneficiaries across 213 U.S. counties — the investigators used statistical inference models to estimate a national average for the relative risk of respiratory hospitalizations that occurred between 1999 and 2008. On average across the counties, each 10°F increase in daily outdoor temperature was associated with a 4.3 percent increase in same-day respiratory disease hospitalizations.
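To give a feel for what a figure like "4.3 percent per 10°F" implies at other temperature changes, the short Python sketch below rescales it. It assumes a log-linear relationship between temperature and hospitalization risk, which is a common convention for this kind of estimate but is an assumption here, not a description of the investigators' actual model. The function name `relative_risk` and its parameters are hypothetical.

```python
import math


def relative_risk(delta_f, pct_per_10f=4.3):
    """Relative risk for a temperature rise of delta_f degrees Fahrenheit.

    Assumes (for illustration only) a log-linear dose-response, so the
    reported percent increase per 10°F compounds multiplicatively.
    """
    # Convert the reported percent increase per 10°F into a log relative
    # risk per 1°F.
    beta_per_degree = math.log(1 + pct_per_10f / 100) / 10.0
    return math.exp(beta_per_degree * delta_f)


if __name__ == "__main__":
    for delta in (5, 10, 20):
        rr = relative_risk(delta)
        print(f"{delta:>2}°F rise: relative risk {rr:.3f} "
              f"({(rr - 1) * 100:.1f}% increase)")
```

Under this assumption, a 20°F rise corresponds to compounding the 10°F effect twice (1.043 × 1.043) rather than simply doubling the percentage.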

Moving forward, Peng and collaborators plan to explore the associations between air pollution exposure and asthma-related outcomes in children. They are also planning to expand their climate change work beyond heat waves to look at extreme weather phenomena more generally. This would include looking at the health impacts of natural disasters, such as hurricanes.