Information content of household-stratified epidemics

Abstract

Household structure is a key driver of many infectious diseases, as well as a natural target for interventions such as vaccination programs. Many theoretical and conceptual advances on household-stratified epidemic models are relatively recent, but have successfully managed to increase the applicability of such models to practical problems. To be of maximum realism and hence benefit, they require parameterisation from epidemiological data, and while household-stratified final size data has been the traditional source, increasingly time-series infection data from households are becoming available. This paper is concerned with the design of studies aimed at collecting time-series epidemic data in order to maximize the amount of information available to calibrate household models. A design decision involves a trade-off between the number of households to enrol and the sampling frequency. Two commonly used epidemiological study designs are considered: cross-sectional, where different households are sampled at every time point, and cohort, where the same households are followed over the course of the study period. The search for an optimal design uses Bayesian computationally intensive methods to explore the joint parameter-design space combined with the Shannon entropy of the posteriors to estimate the amount of information in each design. For the cross-sectional design, the amount of information increases with the sampling intensity, i.e., the designs with the highest number of time points have the most information. On the other hand, the cohort design often exhibits a trade-off between the number of households sampled and the intensity of follow-up. Our results broadly support the choices made in existing epidemiological data collection studies. Prospective problem-specific use of our computational methods can bring significant benefits in guiding future study designs.