
To examine similarities and differences in the demographic and clinical profiles of young people (15–25 years of age) referred between the mental health services (MHS) and Jigsaw Galway.

Methods

A retrospective chart review was conducted of clinical files of individuals attending secondary MHS who had been referred to or from Jigsaw Galway over a 5-year period. Differences in demographic and clinical data between individuals referred to or from Jigsaw Galway were compared.

Results

A recent act of self-harm was more prevalent in individuals referred from Jigsaw to the adult MHS (p=0.02). No other demographic or clinical differences were detected between individuals attending Jigsaw Galway and the MHS.

Conclusions

Education sessions for clinical staff working in primary care, Jigsaw Galway and the MHS are suggested to support clinicians in choosing the best referral pathway, which may more optimally address young people’s mental health difficulties.

Childhood obesity rates are higher among Indigenous compared with non-Indigenous Australian children. It has been hypothesized that early-life influences beginning with the intrauterine environment predict the development of obesity in the offspring. The aim of this paper was to assess, in 227 mother–child dyads from the Gomeroi gaaynggal cohort, associations between prematurity, Gestation Related-Optimal Weight (GROW) centiles, maternal adiposity (percentage body fat, visceral fat area), maternal non-fasting plasma glucose levels (measured at mean gestational age of 23.1 weeks) and offspring BMI and adiposity (abdominal circumference, subscapular skinfold thickness) in early childhood (mean age 23.4 months). Maternal non-fasting plasma glucose concentrations were positively associated with infant birth weight (P=0.005) and GROW customized birth weight centiles (P=0.008). Maternal percentage body fat (P=0.02) and visceral fat area (P=0.00) were significantly associated with infant body weight in early childhood. Body mass index (BMI) in early childhood was significantly higher in offspring born preterm compared with those born at term (P=0.03). GROW customized birth weight centiles were significantly associated with body weight (P=0.01), BMI (P=0.007) and abdominal circumference (P=0.039) in early childhood. Our findings suggest that being born preterm, large for gestational age or exposed to an obesogenic intrauterine environment and higher maternal non-fasting plasma glucose concentrations are associated with increased obesity risk in early childhood. Future strategies should aim to reduce the prevalence of overweight/obesity in women of child-bearing age and emphasize the importance of optimal glycemia during pregnancy, particularly in Indigenous women.

Indigenous Australians continue to experience disparities in chronic diseases, many of which have nutrition-related trajectories. Optimal nutrition throughout the lifespan is protective for a number of adverse health outcomes; however, little is known about current dietary intakes and related anthropometric outcomes of Indigenous women and their infants. Research is required to identify nutrition issues to target for health promotion activities. The Gomeroi gaaynggal programme is an ongoing, prospective cohort of pregnant Indigenous Australian women and their children. A cross-sectional examination of postnatal dietary intakes and anthropometric outcomes of mothers and children is reported. To date, 73 mother–child dyads have participated postpartum. Breastfeeding initiation was 85.9% and median (interquartile range) duration of any breastfeeding was 1.4 (0.5–4.0) months. Infants were introduced to solid foods at 5.0 months (4.0–6.0) and cow’s milk at 12.0 (10.0–13.0) months. At 12 months postpartum, 66.7% of women were overweight or obese, 63.7% at 2 years. Compared with recommendations, reported median maternal nutrient intakes from 24-h recall were low in fibre, folate, iodine, calcium, potassium and vitamin D and high in proportions of energy from total and saturated fat. Limitations of this study include a small sample size and incomplete data for the cohort at each time point. Preliminary data from this ongoing cohort of Indigenous Australian women and children suggest that women may need support to optimize nutrient intakes and to attain a healthy body weight for themselves and their children.

FFQ are commonly used to examine the association between diet and disease. They are the most practical method for usual dietary data collection as they are relatively inexpensive and easy to administer. In Australia, the Cancer Council of Victoria FFQ (CCVFFQ) version 2 and the online Commonwealth Scientific and Industrial Research Organisation FFQ (CSIROFFQ) are used. The aim of our study was to establish the level of agreement between nutrient intakes captured using the online CSIROFFQ and the paper-based CCVFFQ. The CCVFFQ and the online CSIROFFQ were completed by 136 healthy participants. FFQ responses were analysed to give g per d intake of a range of nutrients. Agreement between twenty-six nutrient intakes common to both FFQ was measured by a variety of methods. Nutrient intake levels that were significantly correlated between the two FFQ were carbohydrates, total fat, Na and MUFA. When assessing ranking of nutrients into quintiles, on average, 56 % of the participants (for all nutrients) were classified into the same or adjacent quintiles in both FFQ, with the highest percentage agreement for sugar. On average, 21 % of participants were grossly misclassified by three or four quintiles, with the highest percentage misclassification for fibre and Fe. Quintile agreement was similar to that reported by other studies, and we concluded that both FFQ are suitable tools for dividing participants’ nutrient intake levels into high- and low-consumption groups. Use of either FFQ was not appropriate for obtaining accurate estimates of absolute nutrient intakes.

• The individual matching of controls to cases in a case-control study may be used to control for confounding or background variables at the design stage of the study.

• Matching in a case-control study has some parallels with pair matching in experimental designs. It uses one of the most basic methods of error control, comparing like with like.

• The simplest type of matched case-control study takes a matched-pair form, in which each matched set comprises one case and one control.

• Matched case-control studies require a special form of analysis. The most common approach is to allow arbitrary variations between matched sets and to employ a conditional logistic regression analysis.

• An alternative analysis suitable in some situations uses a regression formulation based on the matching variables.
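As a minimal illustration of the matched-pair form, consider a single binary exposure. In that special case the conditional analysis depends only on the discordant pairs, and the conditional maximum likelihood estimate of the odds ratio is the ratio of the two discordant counts. The function name and the example counts below are hypothetical and serve only to sketch the idea.

```python
import math

def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio estimate for 1:1 matched pairs.

    pairs is a list of (case_exposed, control_exposed) tuples with 0/1 entries.
    Concordant pairs carry no information about the odds ratio.
    """
    n10 = sum(1 for case, control in pairs if case == 1 and control == 0)
    n01 = sum(1 for case, control in pairs if case == 0 and control == 1)
    if n10 == 0 or n01 == 0:
        raise ValueError("both kinds of discordant pair are needed")
    odds_ratio = n10 / n01
    # Large-sample standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / n10 + 1 / n01)
    return odds_ratio, se_log_or

# Hypothetical data: 30 pairs with only the case exposed, 10 with only the control exposed.
pairs = [(1, 0)] * 30 + [(0, 1)] * 10 + [(1, 1)] * 25 + [(0, 0)] * 35
odds_ratio, se_log_or = matched_pair_odds_ratio(pairs)
print(odds_ratio, se_log_or)
```

With several exposures or more than one control per case there is no such closed form, and the conditional likelihood has to be maximized numerically.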

Preliminaries

An important and quite often fruitful principle in investigating the design and analysis of an observational study is to consider what would be appropriate for a comparable randomized experiment. What steps would be taken in such an experiment to achieve secure and precise conclusions? To what extent can these steps be followed in the observational context and what can be done to limit the loss of security of interpretation inherent in most observational situations?

In Chapter 2 we studied the dependence of a binary outcome, Y, on a single binary explanatory variable, the exposure, X.

The case-subcohort design, often called simply the case-cohort design, is an alternative to the nested case-control design for case-control sampling within a cohort.

The primary feature of a case-subcohort study is the ‘subcohort’, which is a random sample from the cohort and which serves as the set of potential controls for all cases. The study comprises the subcohort plus all additional cases, that is, those not in the subcohort.

In an analysis using event times the cases are compared with members of the subcohort who are at risk at their event time, using a pseudo-partial likelihood. This results in estimates of hazard ratios.

An advantage of this design is that the same subcohort can be used to study cases of different types.

A simpler form of case-subcohort study disregards event times and is sometimes referred to as a case-base study or hybrid epidemiologic design. In this the subcohort enables estimation of risk ratios and odds ratios.
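The risk-ratio claim for the simpler case-base form can be sketched numerically. Because the subcohort is a random sample of the whole cohort, its exposure odds estimate the exposure odds in the cohort, and dividing the exposure odds among all cases by that quantity gives an estimate of the risk ratio. The function name and counts are hypothetical, and the sketch ignores sampling variability.

```python
def case_base_risk_ratio(cases_exposed, cases_unexposed,
                         subcohort_exposed, subcohort_unexposed):
    """Risk ratio estimate from a case-base sample with one binary exposure.

    The case counts refer to all cases in the cohort; the subcohort counts
    refer to the random subcohort, whether or not its members become cases.
    """
    case_exposure_odds = cases_exposed / cases_unexposed
    cohort_exposure_odds = subcohort_exposed / subcohort_unexposed
    return case_exposure_odds / cohort_exposure_odds

# Hypothetical cohort: 1000 exposed (100 cases) and 3000 unexposed (150 cases),
# so the true risk ratio is (100/1000) / (150/3000) = 2.
# A 10% subcohort with the same exposure split has 100 exposed and 300 unexposed.
risk_ratio = case_base_risk_ratio(100, 150, 100, 300)
print(risk_ratio)
```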

Preliminaries

In this chapter we continue the discussion of studies described broadly as involving case-control sampling within a cohort. In the nested case-control design, discussed in Chapter 7, cases are compared with controls sampled from the risk set at each event time. A feature of the nested case-control design is that the sampled controls are specific to a chosen outcome and therefore cannot easily be re-used in studies of other outcomes of interest if these occur at different time points; in principle, at least, a new set of controls must be sampled for each outcome studied though some methods have been developed that do enable the re-use of controls.

The cornerstone of the analysis of case-control studies is that the ratio of the odds of a binary outcome Y given exposure X = 1 to that given X = 0 is the same as the ratio of the odds where the roles of Y and X are reversed. This result means that prospective odds ratios can be estimated from retrospective case-control data.

For binary exposure X and outcome Y there are both exact and large-sample methods for estimating odds ratios from case-control studies.

Methods for the estimation of odds ratios for binary exposures extend to categorical exposures and allow the combination of estimates across strata. The latter enables control for confounding and background variables.

For binary exposure X and outcome Y, the probabilities of X given Y and of Y given X can be formulated using two different logistic regression models. However, the two models give rise to the same estimates of odds ratios under maximum likelihood estimation.

Rate ratios can be estimated from a case-control study if ‘time’ is incorporated correctly into the sampling of individuals; a simple possibility is to perform case-control sampling within short time bands and then to combine the results.
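The symmetry of the odds ratio can be checked directly on a 2×2 table. The sketch below uses hypothetical counts and computes the odds ratio both ways round, together with a large-sample confidence interval based on the usual standard error of the log odds ratio (often attributed to Woolf). The function names are illustrative only.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return (a * d) / (b * c)

def large_sample_interval(a, b, c, d, z=1.96):
    """Approximate confidence interval from the log odds ratio and its standard error."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

a, b, c, d = 40, 60, 20, 80  # hypothetical counts

# Odds of being a case given exposure, relative to the odds given no exposure.
prospective = (a / c) / (b / d)
# Odds of exposure among cases, relative to the odds of exposure among controls.
retrospective = (a / b) / (c / d)

print(prospective, retrospective, odds_ratio(a, b, c, d))
print(large_sample_interval(a, b, c, d))
```

Both orderings reduce algebraically to ad/(bc), which is why retrospective sampling does not distort the odds ratio.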

Preliminaries

Many central issues involved in the analysis of case-control data are illustrated by the simplest special case, namely that of a binary explanatory variable or risk factor and a binary outcome or response.

Logistic regression can be used to estimate odds ratios using data from a case-control sample as though the data had arisen prospectively. This allows regression adjustment for background and confounding variables and makes possible the estimation of odds ratios for continuous exposures using case-control data.

The logistic regression of case-control data gives the correct estimates of log odds ratios, and their standard errors are as given by the inverse of the information matrix.

The logistic regression model is in a special class of regression models for estimating exposure-outcome associations that may be used to analyse case-control study data as though they had arisen prospectively. Another regression model of this type is the proportional odds model. For other models, including the additive risk model, case-control data alone cannot provide estimates of the appropriate parameters.

Absolute risks cannot be estimated from case-control data without additional information on the proportions of cases and controls in the underlying population.
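A small deterministic calculation illustrates why the slope is recoverable from case-control data while absolute risk is not. Assume, purely for illustration, a population logistic model with intercept alpha and slope beta, and suppose cases and controls are sampled with fractions f_case and f_control that do not depend on exposure. All names and values here are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(t):
    return 1 / (1 + math.exp(-t))

def case_probability_in_sample(x, alpha, beta, f_case, f_control):
    """Probability that a sampled individual with exposure x is a case."""
    risk = expit(alpha + beta * x)           # population risk at exposure x
    selected_case = f_case * risk            # case and selected
    selected_control = f_control * (1 - risk)  # control and selected
    return selected_case / (selected_case + selected_control)

alpha, beta = -5.0, 0.7          # assumed population model
f_case, f_control = 1.0, 0.01    # assumed sampling fractions

sample_intercept = logit(case_probability_in_sample(0, alpha, beta, f_case, f_control))
sample_slope = logit(case_probability_in_sample(1, alpha, beta, f_case, f_control)) - sample_intercept

# The slope (log odds ratio) is unchanged; the intercept is shifted by log(f_case / f_control).
offset = math.log(f_case / f_control)
print(sample_slope, sample_intercept, alpha + offset)

# Absolute risk can be recovered only if the offset is known from outside the sample.
recovered_risk = expit(sample_intercept - offset + sample_slope * 1)
print(recovered_risk, expit(alpha + beta))
```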

Preliminaries

The previous chapters have introduced the key features of case-control studies but their content has been restricted largely to the study of single binary exposure variables. We now give a more general development. The broad features used for interpretation are as before:

• a study population of interest, from which the case-control sample is taken;

• a sampling model constituting the model under which the case-control data arise and which includes a representation of the data collection process;

• an inverse model representing the population dependence of the response on the explanatory variables; this model is the target for interpretation.

In two-stage case-control designs, limited information is obtained on individuals in a first-stage sample and used in the sampling of individuals at the second stage, where full information on exposures and other variables is obtained. The first stage may be a random sample or a case-control sample; the second stage is a case-control sample, possibly within strata. The major aim of these designs is to gain efficiency.

Two-stage studies can be analysed using likelihood-based arguments that extend the general formulation based on logistic regression.

Special sampling designs for matched case-control studies include countermatching, which uses some information on individuals in the potential pool of controls to select controls in such a way as to maximize the informativeness of the case-control sets.

Family groupings can be used in case-control-type studies, and there is a growing literature in the epidemiological, statistical and genetics fields. In one approach, cases are matched to a sibling or other relative.

Preliminaries

So far we have discussed case-control studies in which cases and controls are sampled, in principle at random, from the underlying population on the basis of their outcome status. We have also considered extensions, including matched studies and stratified sampling, in both of which it is assumed that some features of individuals in the underlying population are easily ascertained. Sometimes it is useful to consider alternative ways of sampling in a case-control study. In this chapter we discuss some special case-control sampling designs.

The case-control approach is a powerful method for investigating factors that may explain a particular event. It is extensively used in epidemiology to study disease incidence, one of the best-known examples being Bradford Hill and Doll's investigation of the possible connection between cigarette smoking and lung cancer. More recently, case-control studies have been increasingly used in other fields, including sociology and econometrics. With a particular focus on statistical analysis, this book is ideal for applied and theoretical statisticians wanting an up-to-date introduction to the field. It covers the fundamentals of case-control study design and analysis as well as more recent developments, including two-stage studies, case-only studies and methods for case-control sampling in time. The latter have important applications in large prospective cohorts which require case-control sampling designs to make efficient use of resources. More theoretical background is provided in an appendix for those new to the field.

The accumulation and possible combination of evidence from different studies, as contrasted with an emphasis on obtaining individually secure studies, is crucial to producing convincing evidence of generality of application and interpretation.

Methods for combining estimates across studies include those that can be applied when the full data from individual studies are available and those that can be applied when only summary statistics are available. They make allowance for heterogeneity of the effect of interest. The methods are general and not specific to case-control studies.

Some special considerations may be required for the combining of results from case-control studies and full cohort studies or from matched and unmatched case-control studies.
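When only summary statistics are available, one common way to combine estimates is inverse-variance weighting, with a between-study variance added to allow for heterogeneity. The sketch below uses the DerSimonian-Laird moment estimator for that variance; this is one standard choice, not necessarily the method the chapter develops, and the input values are hypothetical log odds ratios.

```python
def pool_fixed(estimates, variances):
    """Inverse-variance weighted mean and its variance."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1 / total

def between_study_variance(estimates, variances):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    weights = [1 / v for v in variances]
    pooled, _ = pool_fixed(estimates, variances)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    scale = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - (len(estimates) - 1)) / scale)

def pool_random(estimates, variances):
    """Random-effects pooled estimate: each variance is inflated by tau squared."""
    tau2 = between_study_variance(estimates, variances)
    return pool_fixed(estimates, [v + tau2 for v in variances])

log_odds_ratios = [0.2, 0.5, 0.8]   # hypothetical study estimates
variances = [0.04, 0.04, 0.04]      # hypothetical squared standard errors

print(pool_fixed(log_odds_ratios, variances))
print(between_study_variance(log_odds_ratios, variances))
print(pool_random(log_odds_ratios, variances))
```

Allowing for heterogeneity leaves the pooled estimate unchanged in this balanced example but widens its variance, which is the intended effect.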

Preliminaries

The emphasis throughout this book has been on analysis and design aimed to produce individually secure studies. Yet in many fields it is the accumulation of evidence of various kinds and from various studies that is crucial, sometimes to achieve appropriate precision and, often of even more importance, to produce convincing evidence of generality of application and interpretation. We now consider briefly some issues involved, although most of the discussion is not particularly specific to case-control studies. The term meta-analysis is often used in this context. The distinctive issues are, however, not those of analysis but more those of forming appropriate criteria for the inclusion of data and of assessing how meaningfully comparable different studies really are. Important issues of the interpretation of apparent conflicts of information may be best addressed by the traditional method of descriptive review.

This book is about the planning and analysis of a special kind of investigation: a case-control study. We use this term to cover a number of different designs. In the simplest form individuals with an outcome of interest, possibly rare, are observed and information about past experience is obtained. In addition corresponding data are obtained on suitable controls in the hope of explaining what influences the outcome. In this book we are largely concerned with binary outcomes, for example indicating disease diagnosis or death. Such studies are reasonably called retrospective as contrasted with prospective studies, in which one records explanatory features and then waits to see what outcome arises. In retrospective studies we are studying the causes of effects and in prospective studies we are studying the effects of causes. We also discuss some extensions of case-control studies to incorporate temporality, which may be more appropriately viewed as a form of prospective study. The key aspect of all these designs is that they involve a sample of the underlying population that motivates the study, in which individuals with certain outcomes are strongly over-represented.

While we shall concentrate on the many special issues raised by such studies, we begin with a brief survey of the general themes of statistical design and analysis. We use a terminology deriving in part from epidemiological applications although the ideas are of much broader relevance.

We start the general discussion by considering a population of study individuals, patients, say, assumed to be statistically independent.

Case-control studies can involve more than two outcome groups, enabling us to estimate and compare exposure-outcome associations across groups.

Studies may involve multiple case subtypes and a single control group, or one case group and two or more control groups, for example.

Case-control studies with more than two outcome groups can be analysed using pairwise comparisons or special polychotomous analyses. The general formulation based on logistic regression extends to this situation, meaning that the data from such studies can be analysed as though arising from a prospective sample.

By contrast, in some situations case-only studies are appropriate; in these no controls are required. In one such situation the nature of the exposure contrasts studied may make the absence of controls reasonable. In another, each individual is in a sense his or her own control.

Preliminaries

In most of this book we are supposing that there are just two possible outcomes for each individual defining them as either cases or as controls. However, there are two contrasting situations where other than two outcomes are involved. First, it may be of interest to estimate and compare risk factors for three or more outcomes; an extended case-control design can be used to make comparisons between more than two outcome groups. The other, very contrasting, situation occurs when controls may be dispensed with, the case-only design. In the first part of this chapter we consider the former situation, and in the second part the latter.

The nested case-control design accommodates case event times into the sampling of controls.

In this design one or more controls are selected for each case from the risk set at the time at which the case event occurs. Controls may also be matched to cases on selected variables.

Nested case-control studies are particularly suited for use within large prospective cohorts, when it is desirable to process exposure information only for cases and a subset of non-cases.

The analysis of nested case-control studies uses a proportional hazards model and a modification to the partial likelihood used in full-cohort studies, giving estimates of hazard ratios. Extensions to other survival models are possible.

In the standard design, controls are selected randomly from the risk set for each case; however, more elaborate sampling procedures for controls, such as counter-matching, may gain efficiency. A weighted partial-likelihood analysis is needed to accommodate non-random sampling.
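The standard design with random sampling from the risk set can be sketched as follows. The sketch assumes that everyone enters the cohort at time zero, that there is a single exposure x, and that there is no additional matching; the record layout, function names and toy cohort are hypothetical. The log-likelihood function gives the contribution of each sampled set under a proportional hazards model, with the sum in the denominator taken over the case and its sampled controls rather than the full risk set.

```python
import math
import random

def sample_nested_case_control(cohort, controls_per_case, rng):
    """For each case, sample controls at random from those still at risk at the event time."""
    sampled_sets = []
    for case in cohort:
        if not case["event"]:
            continue
        # Individuals who become cases later are still eligible as controls now.
        risk_set = [p for p in cohort
                    if p["time"] >= case["time"] and p["id"] != case["id"]]
        m = min(controls_per_case, len(risk_set))
        sampled_sets.append((case, rng.sample(risk_set, m)))
    return sampled_sets

def log_partial_likelihood(sampled_sets, beta):
    """Log partial likelihood over the sampled sets for log hazard ratio beta."""
    total = 0.0
    for case, controls in sampled_sets:
        denominator = sum(math.exp(beta * p["x"]) for p in [case] + controls)
        total += beta * case["x"] - math.log(denominator)
    return total

# Hypothetical cohort: time is the event or censoring time.
cohort = [
    {"id": 1, "time": 2.0, "event": True, "x": 1},
    {"id": 2, "time": 5.0, "event": False, "x": 0},
    {"id": 3, "time": 3.5, "event": True, "x": 1},
    {"id": 4, "time": 6.0, "event": False, "x": 0},
    {"id": 5, "time": 1.0, "event": False, "x": 1},
    {"id": 6, "time": 4.0, "event": False, "x": 0},
]

sampled_sets = sample_nested_case_control(cohort, controls_per_case=2, rng=random.Random(0))
print(log_partial_likelihood(sampled_sets, beta=0.5))
```

Non-random schemes such as counter-matching would need sampling weights in the denominator, which this sketch does not include.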

Preliminaries

We have focused primarily so far on case-control studies in which the cases and controls are sampled from groups of individuals who respectively do and do not have the outcome of interest occurring within a relevant time window. This time window is typically relatively short. If cases are to be defined as those experiencing an event or outcome of interest occurring over a longer time period, or if the rate of occurrence of the event is high, then the choice of a suitable control group requires special care.

The retrospective case-control approach provides a powerful method for studying rare events and their dependence on explanatory features. The method is extensively used in epidemiology to study disease incidence, one of the best-known early examples being the investigation by Bradford Hill and Doll of the possible impact of smoking and pollution on lung cancer. More recently the approach has been ever more widely used, by no means only in an epidemiological setting. There have also been various extensions of the method.

A definitive account in an epidemiological context was given by Breslow and Day in 1980 and their book remains a key source with many important insights. Our book is addressed to a somewhat more statistical readership and aims to cover recent developments. There is an emphasis on the analysis of data arising in case-control studies, but we also focus in a number of places on design issues. We have tried to make the book reasonably self-contained; some familiarity with simple statistical methods and theory is assumed, however. Many methods described in the book rely on the use of maximum likelihood estimation, and the extension of this to pseudolikelihoods is required in the later chapters. We have therefore included an appendix outlining some theoretical details.

There is an enormous statistical literature on case-control studies. Some of the most important fundamental work appeared in the late 1970s, while the later 1980s and the 1990s saw the establishment of methods for case-control sampling in time.

• A case-control study is a retrospective observational study and is an alternative to a prospective observational study. Cases are identified in an underlying population and a comparable control group is sampled.

• In the standard design exposure information is obtained retrospectively, though this is not necessarily the case if the case-control sample is nested within a prospective cohort.

• Prospective studies are not cost-effective for rare outcomes. By contrast, in a case-control study the ratio of cases to controls is higher than in the underlying population in order to make more efficient use of resources.

• There are two main types of case-control design: matched and unmatched.

• The odds ratio is the most commonly used measure of association between exposure and outcome in a case-control study.

• Important extensions to the standard case-control design include the explicit incorporation of time into the choice of controls and into the analysis.

Defining a case-control study

Consider a population of interest, for example the general population of the UK, perhaps restricted by gender or age group. We may call a representation of the process by which exposures X and outcomes Y occur in the presence of intrinsic features W the population model. As noted in the Preamble, such a system may be investigated prospectively or retrospectively; see Figure 1.1. In a prospective or cohort study a suitable sample of individuals is chosen to represent the population of interest, values of (W, X) are determined and the individuals are followed through time until the outcome Y can be observed.