Environmental Statistics

The environmental statistics theme is, by definition, the development and application of statistical methodology to environmental issues. These can be based in the natural environment (both undisturbed and perturbed) or the urban environment. Environmental statistics is a broad discipline stretching from how and what to sample, through to modelling impacts on human and ecosystem health, and ultimately to providing predictions of what changes might occur in the future. Statistical methodologies in use include time series analysis, spatial modelling, Bayesian methods, wavelet analysis, extreme value modelling and non-parametric (particularly regression and additive) modelling.

The school also heads the EPSRC-funded SECURE network, which brings together the environmental and statistical communities to provide fresh intelligence and new insights into environmental change and society's management of that change.

Postgraduate opportunities

Spatial clustering of disease risk (PhD)

Disease mapping has applications in public health by allowing for identification of areas which are at high risk of particular health problems. Such approaches are generally based on areal data, which involves partitioning the study region into a set of non-overlapping areal units and recording counts of disease cases within each areal unit. The majority of approaches assume a spatially smooth risk surface, but this may not be realistic, and there has been recent interest in developing methodology which allows for discontinuities in this structure.

One way to account for such discontinuities is via spatial clustering, which groups together neighbouring areas which are similar in terms of disease risk. This clustering structure can then be taken into account when modelling the risk surface. Much of the existing spatial clustering methodology is based on agglomerative clustering approaches, but this can often be simplistic. This project will develop new methodology for spatial clustering by exploring new hierarchical and centroid-based clustering techniques.
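As a rough illustration of the agglomerative approach mentioned above (not the methodology the project will develop), the sketch below greedily merges neighbouring clusters whose mean risks are closest. The function name, the six-area layout and the risk values are all hypothetical, chosen only to show how the adjacency constraint works.

```python
from itertools import combinations


def spatial_agglomerative(risk, neighbours, n_clusters):
    """Greedy agglomerative clustering restricted to adjacent areas.

    risk       -- dict mapping area label to an estimated disease risk
    neighbours -- set of frozenset({a, b}) pairs of areas sharing a border
    n_clusters -- number of clusters at which merging stops
    """
    clusters = {area: {area} for area in risk}  # cluster label -> member areas

    def mean_risk(members):
        return sum(risk[a] for a in members) / len(members)

    def adjacent(c1, c2):
        # Two clusters are adjacent if any pair of their members share a border.
        return any(frozenset((a, b)) in neighbours
                   for a in clusters[c1] for b in clusters[c2])

    while len(clusters) > n_clusters:
        candidates = [
            (abs(mean_risk(clusters[c1]) - mean_risk(clusters[c2])), c1, c2)
            for c1, c2 in combinations(sorted(clusters), 2)
            if adjacent(c1, c2)
        ]
        if not candidates:
            break  # no adjacent clusters left to merge
        _, keep, drop = min(candidates)
        clusters[keep] |= clusters.pop(drop)

    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example: six areas in a row, A-B-C-D-E-F.
risk = {"A": 0.8, "B": 0.9, "C": 0.85, "D": 1.6, "E": 1.7, "F": 0.9}
neighbours = {frozenset(p) for p in ["AB", "BC", "CD", "DE", "EF"]}
result = spatial_agglomerative(risk, neighbours, n_clusters=3)
print(result)
```

In this toy layout, area F has a risk similar to A, B and C but is not adjacent to them, so it stays in its own cluster. This shows both the appeal of the adjacency constraint and why a purely greedy merge rule can be simplistic.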

Data fusion for environmental and ecological systems (PhD)

The current data explosion across many environmental application areas (e.g. environmental satellite remote sensing, automatic in situ sensors and long-term monitoring records) provides a rich data resource to answer questions which could not have been investigated previously. However, the data complexity across different temporal and spatial scales, specifically in terms of data volume, sparsity and linkage, and the unstructured design of such studies, present numerous statistical modelling challenges in revealing important processes and relationships within such data. This PhD will develop statistical methodology to address such challenges, with a focus on application areas such as investigating water quality and ecosystem health.

Estimating the effects of air pollution on human health (PhD)

Exposure to air pollution is thought to reduce average life expectancy by six months, with an estimated equivalent health cost of £19 billion each year (from DEFRA). These effects have been estimated using statistical models, which quantify the impact on human health of exposure in both the short and the long term. However, the estimation of such effects is challenging, because individual-level measures of health and pollution exposure are not available. Therefore, the majority of studies are conducted at the population level, and the resulting inference can only be made about the effects of pollution on overall population health. Moreover, the data used in such studies are spatially misaligned, as the health data relate to extended areas such as cities or electoral wards, while the pollution concentrations are measured at individual locations. Furthermore, pollution monitors are typically located where concentrations are thought to be highest, a practice known as preferential sampling, which is likely to result in overly high measurements being recorded. This project aims to develop statistical methodology to address these problems, and thus provide less biased estimates of the effects of pollution on health than are currently produced.
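The effect of preferential sampling can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not the project's methodology: the lognormal pollution surface, the number of monitors, and the rule that a site is chosen with probability proportional to its concentration are all hypothetical.

```python
import random
import statistics

random.seed(1)

# Hypothetical pollution surface: concentrations at 1000 candidate locations.
concentrations = [random.lognormvariate(3.0, 0.5) for _ in range(1000)]
true_mean = statistics.mean(concentrations)

n_monitors = 30
n_replicates = 500

random_means = []
preferential_means = []
for _ in range(n_replicates):
    # Design 1: monitors placed at locations chosen uniformly at random.
    random_sites = random.sample(concentrations, n_monitors)
    # Design 2 (assumed): a location is more likely to be chosen
    # the higher its concentration.
    preferential_sites = random.choices(
        concentrations, weights=concentrations, k=n_monitors
    )
    random_means.append(statistics.mean(random_sites))
    preferential_means.append(statistics.mean(preferential_sites))

random_avg = statistics.mean(random_means)
preferential_avg = statistics.mean(preferential_means)

print(f"true mean concentration:       {true_mean:.2f}")
print(f"average under random siting:   {random_avg:.2f}")
print(f"average under preferential:    {preferential_avg:.2f}")
```

Under these assumptions the randomly sited network recovers the area-wide mean on average, while the preferentially sited network overstates it. That overstatement is the bias the paragraph above describes.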

Analysis of spatially correlated functional data objects (PhD)

Historically, functional data analysis (FDA) techniques have widely been used to analyze traditional time series data, albeit from a different perspective. Of late, FDA techniques are increasingly being used in domains such as environmental science, where the data are spatio-temporal in nature, and hence it is typical to consider such data as functional data where the functions are correlated in time or space. An example where modeling the dependencies is crucial is in analyzing remotely sensed data observed over a number of years across the surface of the earth, where each year forms a single functional data object. One might be interested in decomposing the overall variation across space and time and attributing it to covariates of interest. Another interesting class of data with dependence structure consists of weather data on several variables collected from balloons, where the domain of the functions is a vertical strip in the atmosphere and the data are spatially correlated. One of the challenges with this type of data is missingness; to address it, one needs to develop appropriate spatial smoothing techniques for spatially dependent functional data. There are also interesting design of experiment issues, as well as questions of data calibration to account for the variability in sensing instruments. In spite of the research effort in analyzing dependent functional data, several problems remain unresolved, and the student will work on these.

Mapping disease risk in space and time (PhD)

Disease risk varies over space and time, due to similar variation in environmental exposures such as air pollution and risk-inducing behaviours such as smoking. Modelling the spatio-temporal pattern in disease risk is known as disease mapping, and the aims are to: quantify the spatial pattern in disease risk to determine the extent of health inequalities, determine whether there has been any increase or reduction in the risk over time, identify the locations of clusters of areas at elevated risk, and quantify the impact of exposures, such as air pollution, on disease risk. I am working on all these related problems at present, and I have PhD projects in all these areas.