To report the longer-term outcomes following a strategy of either endovascular repair first or open repair of ruptured abdominal aortic aneurysm, which are necessary for both patient and clinical decision-making.

Methods and results

This pragmatic multicentre (29 UK and 1 Canada) trial randomized 613 patients with a clinical diagnosis of ruptured aneurysm; 316 to an endovascular first strategy (if aortic morphology is suitable, open repair if not) and 297 to open repair. The principal 1-year outcome was mortality; secondary outcomes were re-interventions, hospital discharge, health-related quality-of-life (QoL) (EQ-5D), costs, Quality-Adjusted-Life-Years (QALYs), and cost-effectiveness [incremental net benefit (INB)]. At 1 year, all-cause mortality was 41.1% for the endovascular strategy group and 45.1% for the open repair group, odds ratio 0.85 [95% confidence interval (CI) 0.62, 1.17], P = 0.325, with similar re-intervention rates in each group. The endovascular strategy group and open repair groups had average total hospital stays of 17 and 26 days, respectively, P < 0.001. Patients surviving rupture had higher average EQ-5D utility scores in the endovascular strategy vs. open repair groups, mean differences 0.087 (95% CI 0.017, 0.158), 0.068 (95% CI −0.004, 0.140) at 3 and 12 months, respectively. There were indications that QALYs were higher and costs lower for the endovascular first strategy, combining to give an INB of £3877 (95% CI £253, £7408) or €4356 (95% CI €284, €8323).
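As a rough check on the reported effect size, the odds ratio can be approximated from the two rounded mortality percentages. The sketch below also shows the usual definition of incremental net benefit (willingness-to-pay times QALY gain, minus incremental cost). The function names are illustrative, the inputs to the net-benefit call are hypothetical and are not trial data, and the trial's own estimates may have been adjusted, so only approximate agreement should be expected.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Unadjusted odds ratio comparing group A with group B."""
    return odds(p_a) / odds(p_b)


def incremental_net_benefit(delta_qaly, delta_cost, wtp):
    """INB = wtp * (QALY difference) - (cost difference)."""
    return wtp * delta_qaly - delta_cost


# 1-year all-cause mortality: 41.1% (endovascular strategy) vs 45.1% (open repair)
print(round(odds_ratio(0.411, 0.451), 2))  # 0.85

# Hypothetical inputs, for illustration only (not taken from the trial)
print(incremental_net_benefit(delta_qaly=0.05, delta_cost=-1000.0, wtp=30000.0))
```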

Conclusion

An endovascular first strategy for management of ruptured aneurysms does not offer a survival benefit over 1 year but offers patients faster discharge with better QoL and is cost-effective.

Individual participant time-to-event data from multiple prospective epidemiologic studies enable detailed investigation into the predictive ability of risk models. Here we address the challenges in appropriately combining such information across studies. Methods are exemplified by analyses of log C-reactive protein and conventional risk factors for coronary heart disease in the Emerging Risk Factors Collaboration, a collation of individual data from multiple prospective studies with an average follow-up duration of 9.8 years (dates varied). We derive risk prediction models using Cox proportional hazards regression analysis stratified by study and obtain estimates of risk discrimination, Harrell's concordance index, and Royston's discrimination measure within each study; we then combine the estimates across studies using a weighted meta-analysis. Various weighting approaches are compared and lead us to recommend using the number of events in each study. We also discuss the calculation of measures of reclassification for multiple studies. We further show that comparison of differences in predictive ability across subgroups should be based only on within-study information and that combining measures of risk discrimination from case-control studies and prospective studies is problematic. The concordance index and discrimination measure gave qualitatively similar results throughout. While the concordance index was very heterogeneous between studies, principally because of differing age ranges, the increments in the concordance index from adding log C-reactive protein to conventional risk factors were more homogeneous.
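The recommended weighting can be sketched as a simple weighted average of within-study estimates, using the number of events in each study as the weight. The function name and the per-study values below are hypothetical; this is a minimal illustration, not the collaboration's analysis code.

```python
def pooled_estimate(estimates, weights):
    """Weighted average of study-specific estimates.

    estimates: within-study values (e.g. C-index or change in C-index)
    weights:   non-negative study weights (here, number of events)
    """
    if len(estimates) != len(weights):
        raise ValueError("estimates and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(estimates, weights)) / total


# Hypothetical per-study C-index values and event counts
c_index_by_study = [0.70, 0.74, 0.68]
events_by_study = [100, 300, 50]
print(pooled_estimate(c_index_by_study, events_by_study))
```

Because each estimate is computed within a study before pooling, between-study differences (for example in age range) do not contribute to the pooled discrimination measure.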

Large-scale epidemiological evidence on the role of inflammation in early atherosclerosis, assessed by carotid ultrasound, is lacking. We aimed to quantify cross-sectional and longitudinal associations of inflammatory markers with common-carotid-artery intima-media thickness (CCA-IMT) in the general population.

Methods

Information on high-sensitivity C-reactive protein, fibrinogen, leucocyte count and CCA-IMT was available in 20 prospective cohort studies of the PROG-IMT collaboration involving 49,097 participants free of pre-existing cardiovascular disease. Estimates of associations were calculated within each study and then combined using random-effects meta-analyses.
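The two-stage approach (estimate within each study, then pool) can be sketched as follows. The abstract does not say which random-effects estimator was used; the DerSimonian-Laird moment estimator below is an assumption chosen because it is widely used, and the function name is illustrative.

```python
import math


def dersimonian_laird(estimates, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2.

    Returns (pooled estimate, standard error, tau^2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]            # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q statistic measures dispersion around the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sw_star
    return pooled, math.sqrt(1.0 / sw_star), tau2


# Hypothetical study-specific estimates and their variances
print(dersimonian_laird([0.0, 1.0], [0.01, 0.01]))
```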

Inflammation was independently associated with CCA-IMT cross-sectionally. The lack of clear associations with CCA-IMT progression may be explained by imprecision in its assessment within a limited time period. Our findings for ‘inflammatory load’ suggest important combined effects of the three inflammatory markers on early atherosclerosis.

A case–cohort study is an efficient epidemiological study design for estimating exposure–outcome associations. When sampling of the subcohort is stratified, several methods of analysis are possible, but it is unclear how they compare. Our objective was to compare five analysis methods using Cox regression for this type of data, ranging from a crude model that ignores the stratification to a flexible one that allows nonproportional hazards and varying covariate effects across the strata.

Study Design and Setting

We applied the five methods to estimate the association between physical activity and incident type 2 diabetes using data from a stratified case–cohort study and also used artificial data sets to exemplify circumstances in which they can give different results.

Results

In the diabetes study, all methods except the method that ignores the stratification gave similar results for the hazard ratio associated with physical activity. In the artificial data sets, the more flexible methods were shown to be necessary when certain assumptions of the simpler models failed. The most flexible method gave reliable results for all the artificial data sets.

Conclusion

The most flexible method is computationally straightforward, and appropriate whether or not key assumptions made by the simpler models are valid.

Carotid intima media thickness (IMT) progression is increasingly used as a surrogate for vascular risk. This use is supported by data from a few clinical trials investigating statins, but established criteria of surrogacy are only partially fulfilled. To provide a valid basis for the use of IMT progression as a study end point, we are performing a 3-step meta-analysis project based on individual participant data.

Objectives of the 3 successive stages are to investigate (1) whether IMT progression prospectively predicts myocardial infarction, stroke, or death in population-based samples; (2) whether it does so in prevalent disease cohorts; and (3) whether interventions affecting IMT progression predict a therapeutic effect on clinical end points.

Recruitment strategies, inclusion criteria, and estimates of the expected numbers of eligible studies are presented along with a detailed analysis plan.

During a median follow-up of 9.9 (interquartile range, 7.6-13.2) years, 20 840 incident fatal and nonfatal CVD outcomes (13 237 coronary heart disease and 7603 stroke outcomes) were recorded. In analyses adjusted for several conventional cardiovascular risk factors, there was an approximately J-shaped association between HbA1c values and CVD risk. The association between HbA1c values and CVD risk changed only slightly after adjustment for total cholesterol and triglyceride concentrations or estimated glomerular filtration rate, but this association attenuated somewhat after adjustment for concentrations of high-density lipoprotein cholesterol and C-reactive protein. The C-index for a CVD risk prediction model containing conventional cardiovascular risk factors alone was 0.7434 (95% CI, 0.7350 to 0.7517). The addition of information on HbA1c was associated with a C-index change of 0.0018 (0.0003 to 0.0033) and a net reclassification improvement of 0.42 (−0.63 to 1.48) for the categories of predicted 10-year CVD risk. The improvement provided by HbA1c assessment in prediction of CVD risk was equal to or better than estimated improvements for measurement of fasting, random, or postload plasma glucose levels.

CONCLUSIONS AND RELEVANCE

In a study of individuals without known CVD or diabetes, additional assessment of HbA1c values in the context of CVD risk assessment provided little incremental benefit for prediction of CVD risk.

Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated.

The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses.

The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity.
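The idea of numerical integration with an informative log-normal prior for the between-study variance can be sketched on a grid. This is not the authors' R code: it assumes a normal likelihood for the study estimates, a flat prior on the pooled mean, and hypothetical prior parameters supplied by the caller. The grid is placed on the log of the between-study variance, where a log-normal prior becomes a normal density.

```python
import math


def posterior_mean_effect(y, v, prior_mean, prior_sd, n_grid=400):
    """Posterior mean of the pooled effect, integrating over tau^2 on a grid.

    y, v:        study estimates (e.g. log odds ratios) and their variances
    prior_mean:  mean of log(tau^2) under the log-normal prior (assumed input)
    prior_sd:    SD of log(tau^2) under the prior, must be > 0 (assumed input)
    Returns (posterior mean of the effect, grid probabilities for tau^2).
    """
    lo = prior_mean - 5.0 * prior_sd
    step = 10.0 * prior_sd / (n_grid - 1)
    log_post, cond_mean = [], []
    for k in range(n_grid):
        u = lo + k * step                      # u = log(tau^2)
        tau2 = math.exp(u)
        w = [1.0 / (vi + tau2) for vi in v]
        sw = sum(w)
        m = sum(wi * yi for wi, yi in zip(w, y)) / sw
        q = sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y))
        # Marginal log-likelihood of tau^2 after integrating out the mean
        log_lik = 0.5 * sum(math.log(wi) for wi in w) - 0.5 * math.log(sw) - 0.5 * q
        log_prior = -0.5 * ((u - prior_mean) / prior_sd) ** 2
        log_post.append(log_lik + log_prior)
        cond_mean.append(m)
    top = max(log_post)
    dens = [math.exp(lp - top) for lp in log_post]  # stabilised exponentials
    total = sum(dens)
    probs = [d / total for d in dens]
    return sum(p * m for p, m in zip(probs, cond_mean)), probs


# Hypothetical data and prior parameters, for illustration only
mean_effect, grid_probs = posterior_mean_effect(
    [0.2, 0.5, -0.1], [0.04, 0.09, 0.06], prior_mean=-2.0, prior_sd=1.5)
print(mean_effect)
```

A grid of this kind is deterministic and fast, which is the practical appeal over Markov chain Monte Carlo for a model with a single heterogeneity parameter.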

Genome-wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual-level data in simulation studies. We investigate the impact of gene–gene interactions, linkage disequilibrium, and ‘weak instruments’ on these estimates. Both an inverse-variance weighted average of variant-specific associations and a likelihood-based approach for summarized data give similar estimates and precision to the two-stage least squares method for individual-level data, even when there are gene–gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P-value in a linear regression of the risk factor for each variant is less than , then weak instrument bias will be small. We use these methods to estimate the causal association of low-density lipoprotein cholesterol (LDL-C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL-C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are similarly efficient to those using individual-level data, although the necessary assumptions cannot be so fully assessed.
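An inverse-variance weighted estimate from summarized data can be sketched as below. It uses a common form of the estimator, in which each variant's ratio of outcome association to risk factor association is weighted by its precision; the variable names are illustrative and the input values are hypothetical. The sketch assumes uncorrelated variants, consistent with the caveat above about linkage disequilibrium.

```python
import math


def ivw_estimate(bx, by, se_by):
    """Inverse-variance weighted causal estimate from summarized data.

    bx:    per-variant associations with the risk factor
    by:    per-variant associations with the outcome
    se_by: standard errors of the outcome associations
    Assumes the variants are uncorrelated.
    """
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_by))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, se_by))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summarized associations for three variants
estimate, se = ivw_estimate([0.1, 0.2, 0.3], [0.2, 0.4, 0.6], [0.05, 0.05, 0.1])
print(estimate, se)
```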

Finding individual-level data for adequately-powered Mendelian randomization analyses may be problematic. As publicly-available summarized data on genetic associations with disease outcomes from large consortia are becoming more abundant, use of published data is an attractive analysis strategy for obtaining precise estimates of the causal effects of risk factors on outcomes. We detail the necessary steps for conducting Mendelian randomization investigations using published data, and present novel statistical methods for combining data on the associations of multiple (correlated or uncorrelated) genetic variants with the risk factor and outcome into a single causal effect estimate. A two-sample analysis strategy may be employed, in which evidence on the gene-risk factor and gene-outcome associations is taken from different data sources. These approaches allow the efficient identification of risk factors that are suitable targets for clinical intervention from published data, although the ability to assess the assumptions necessary for causal inference is diminished. Methods and guidance are illustrated using the example of the causal effect of serum calcium levels on fasting glucose concentrations. The estimated causal effect of a 1 standard deviation (0.13 mmol/L) increase in calcium levels on fasting glucose (mM) using a single lead variant from the CASR gene region is 0.044 (95 % credible interval −0.002, 0.100). In contrast, using our method to account for the correlation between variants, the corresponding estimate using 17 genetic variants is 0.022 (95 % credible interval 0.009, 0.035), a more clearly positive causal effect.

Electronic supplementary material

The online version of this article (doi:10.1007/s10654-015-0011-z) contains supplementary material, which is available to authorized users.

A conventional Mendelian randomization analysis assesses the causal effect of a risk factor on an outcome by using genetic variants that are solely associated with the risk factor of interest as instrumental variables. However, in some cases, such as the case of triglyceride level as a risk factor for cardiovascular disease, it may be difficult to find a relevant genetic variant that is not also associated with related risk factors, such as other lipid fractions. Such a variant is known as pleiotropic. In this paper, we propose an extension of Mendelian randomization that uses multiple genetic variants associated with several measured risk factors to simultaneously estimate the causal effect of each of the risk factors on the outcome. This “multivariable Mendelian randomization” approach is similar to the simultaneous assessment of several treatments in a factorial randomized trial. In this paper, methods for estimating the causal effects are presented and compared using real and simulated data, and the assumptions necessary for a valid multivariable Mendelian randomization analysis are discussed. Subject to these assumptions, we demonstrate that triglyceride-related pathways have a causal effect on the risk of coronary heart disease independent of the effects of low-density lipoprotein cholesterol and high-density lipoprotein cholesterol.
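One way to picture the multivariable approach with summarized data is a weighted regression, through the origin, of the variants' outcome associations on their associations with each risk factor. The sketch below handles two risk factors by solving the 2x2 normal equations directly. It is an illustration under that assumption, not necessarily the estimation method used in the paper, and all names and values are hypothetical.

```python
def mvmr_two_factors(bx1, bx2, by, se_by):
    """Weighted least squares of by on (bx1, bx2) with no intercept.

    bx1, bx2: per-variant associations with risk factors 1 and 2
    by:       per-variant associations with the outcome
    se_by:    standard errors of the outcome associations (weights = 1/se^2)
    Returns the two estimated causal effects (theta1, theta2).
    """
    w = [1.0 / s ** 2 for s in se_by]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("risk factor associations are collinear")
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2


# Hypothetical associations for four variants
bx1 = [0.1, 0.2, 0.3, 0.1]
bx2 = [0.3, 0.1, 0.2, 0.4]
by = [0.5 * a - 0.2 * b for a, b in zip(bx1, bx2)]
print(mvmr_two_factors(bx1, bx2, by, [0.05, 0.04, 0.06, 0.05]))
```

Each coefficient is interpreted as the effect of one risk factor holding the other fixed, which is why variants associated with several risk factors can still contribute.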

The causal effects of these three lipid fractions can be better identified using the extended methods of ‘multivariable Mendelian randomization’. We employ this approach using published data on 185 lipid-related genetic variants and their associations with lipid fractions in 188,578 participants, and with CAD risk in 22,233 cases and 64,762 controls. Our results suggest that HDL-c may be causally protective of CAD risk, independently of the effects of LDL-c and triglycerides. Estimated causal odds ratios per standard deviation increase, based on 162 variants not having pleiotropic associations with either blood pressure or body mass index, are 1.57 (95% credible interval 1.45 to 1.70) for LDL-c, 0.91 (0.83 to 0.99, p-value = 0.028) for HDL-c, and 1.29 (1.16 to 1.43) for triglycerides.

Significance

Some interventions on HDL-c concentrations may influence risk of CAD, but to a lesser extent than interventions on LDL-c. A causal interpretation of these estimates relies on the assumption that the genetic variants do not have pleiotropic associations with risk factors on other pathways to CAD. If they do, a weaker conclusion is that genetic predictors of LDL-c, HDL-c and triglycerides each have independent associations with CAD risk.

There has been limited study of factors influencing response rates and attrition in online research. Online experiments were nested within the pilot (study 1, n = 3780) and main trial (study 2, n = 2667) phases of an evaluation of a Web-based intervention for hazardous drinkers: the Down Your Drink randomized controlled trial (DYD-RCT).

Objectives

The objective was to determine whether differences in the length and relevance of questionnaires can impact upon loss to follow-up in online trials.

Methods

A randomized controlled trial design was used. All participants who consented to enter DYD-RCT and completed the primary outcome questionnaires were randomized to complete one of four secondary outcome questionnaires at baseline and at follow-up. These questionnaires varied in length (additional 23 or 34 versus 10 items) and relevance (alcohol problems versus mental health). The outcome measure was the proportion of participants who completed follow-up at each of two follow-up intervals: study 1 after 1 and 3 months and study 2 after 3 and 12 months.

Results

At all four follow-up intervals there were no significant effects of additional questionnaire length on follow-up. Randomization to the less relevant questionnaire resulted in significantly lower rates of follow-up in two of the four assessments made (absolute difference of 4%, 95% confidence interval [CI] 0%-8%, in both study 1 after 1 month and in study 2 after 12 months). A post hoc pooled analysis across all four follow-up intervals found this effect to be of marginal statistical significance (unadjusted difference, 3%, range 1%-5%, P = .01; difference adjusted for prespecified covariates, 3%, range 0%-5%, P = .05).

Conclusions

Apparently minor differences in study design decisions may have a measurable impact on attrition in trials. Further investigation is warranted of the impact of the relevance of outcome measures on follow-up rates and, more broadly, of the consequences of what we ask participants to do when we invite them to take part in research studies.

Ageing populations may demand more blood transfusions, but the blood supply could be limited by difficulties in attracting and retaining a decreasing pool of younger donors. One approach to increase blood supply is to collect blood more frequently from existing donors. If more donations could be safely collected in this manner at marginal cost, then it would be of considerable benefit to blood services. National Health Service (NHS) Blood and Transplant in England currently allows men to donate up to every 12 weeks and women to donate up to every 16 weeks. In contrast, some other European countries allow donations as frequently as every 8 weeks for men and every 10 weeks for women. The primary aim of the INTERVAL trial is to determine whether donation intervals can be safely and acceptably decreased to optimise blood supply whilst maintaining the health of donors.

Methods/Design

INTERVAL is a randomised trial of whole blood donors enrolled from all 25 static centres of NHS Blood and Transplant. Recruitment of about 50,000 male and female donors started in June 2012 and was completed in June 2014. Men have been randomly assigned to standard 12-week versus 10-week versus 8-week inter-donation intervals, while women have been assigned to standard 16-week versus 14-week versus 12-week inter-donation intervals. Sex-specific comparisons will be made by intention-to-treat analysis of outcomes assessed after two years of intervention. The primary outcome is the number of blood donations made. A key secondary outcome is donor quality of life, assessed using the Short Form Health Survey. Additional secondary endpoints include the number of ‘deferrals’ due to low haemoglobin (and other factors), iron status, cognitive function, physical activity, and donor attitudes. A comprehensive health economic analysis will be undertaken.

Discussion

The INTERVAL trial should yield novel information about the effect of inter-donation intervals on blood supply, acceptability, and donors’ physical and mental well-being. The study will generate scientific evidence to help formulate blood collection policies in England and elsewhere.

Background: Mendelian randomization uses genetic variants, assumed to be instrumental variables for a particular exposure, to estimate the causal effect of that exposure on an outcome. If the instrumental variable criteria are satisfied, the resulting estimator is consistent even in the presence of unmeasured confounding and reverse causation.

Methods: We extend the Mendelian randomization paradigm to investigate more complex networks of relationships between variables, in particular where some of the effect of an exposure on the outcome may operate through an intermediate variable (a mediator). If instrumental variables for the exposure and mediator are available, direct and indirect effects of the exposure on the outcome can be estimated, for example using either a regression-based method or structural equation models. The direction of effect between the exposure and a possible mediator can also be assessed. Methods are illustrated in an applied example considering causal relationships between body mass index, C-reactive protein and uric acid.

Results: These estimators are consistent in the presence of unmeasured confounding if, in addition to the instrumental variable assumptions, the effects of both the exposure on the mediator and the mediator on the outcome are homogeneous across individuals and linear without interactions. Nevertheless, a simulation study demonstrates that even considerable heterogeneity in these effects does not lead to bias in the estimates.

Conclusions: These methods can be used to estimate direct and indirect causal effects in a mediation setting, and have potential for the investigation of more complex networks between multiple interrelated exposures and disease outcomes.

The extent to which diabetes mellitus or hyperglycemia is related to risk of death from cancer or other nonvascular conditions is uncertain.

METHODS

We calculated hazard ratios for cause-specific death, according to baseline diabetes status or fasting glucose level, from individual-participant data on 123,205 deaths among 820,900 people in 97 prospective studies.

RESULTS

After adjustment for age, sex, smoking status, and body-mass index, hazard ratios among persons with diabetes as compared with persons without diabetes were as follows: 1.80 (95% confidence interval [CI], 1.71 to 1.90) for death from any cause, 1.25 (95% CI, 1.19 to 1.31) for death from cancer, 2.32 (95% CI, 2.11 to 2.56) for death from vascular causes, and 1.73 (95% CI, 1.62 to 1.85) for death from other causes. Diabetes (vs. no diabetes) was moderately associated with death from cancers of the liver, pancreas, ovary, colorectum, lung, bladder, and breast. Aside from cancer and vascular disease, diabetes (vs. no diabetes) was also associated with death from renal disease, liver disease, pneumonia and other infectious diseases, mental disorders, nonhepatic digestive diseases, external causes, intentional self-harm, nervous-system disorders, and chronic obstructive pulmonary disease. Hazard ratios were appreciably reduced after further adjustment for glycemia measures, but not after adjustment for systolic blood pressure, lipid levels, inflammation or renal markers. Fasting glucose levels exceeding 100 mg per deciliter (5.6 mmol per liter), but not levels of 70 to 100 mg per deciliter (3.9 to 5.6 mmol per liter), were associated with death. A 50-year-old with diabetes died, on average, 6 years earlier than a counterpart without diabetes, with about 40% of the difference in survival attributable to excess nonvascular deaths.

CONCLUSIONS

In addition to vascular disease, diabetes is associated with substantial premature death from several cancers, infectious diseases, external causes, intentional self-harm, and degenerative disorders, independent of several major risk factors. (Funded by the British Heart Foundation and others.)

Attrition from follow-up is a major methodological challenge in randomized trials. Incentives are known to improve response rates in cross-sectional postal and online surveys, yet few studies have investigated whether they can reduce attrition from follow-up in online trials, which are particularly vulnerable to low follow-up rates.

Objectives

Our objective was to determine the impact of incentives on follow-up rates in an online trial.

Methods

Two randomized controlled trials were embedded in a large online trial of a Web-based intervention to reduce alcohol consumption (the Down Your Drink randomized controlled trial, DYD-RCT). Participants were those in the DYD pilot trial eligible for 3-month follow-up (study 1) and those eligible for 12-month follow-up in the DYD main trial (study 2). Participants in both studies were randomly allocated to receive an offer of an incentive or to receive no offer of an incentive. In study 1, participants in the incentive arm were randomly offered a £5 Amazon.co.uk gift voucher, a £5 charity donation to Cancer Research UK, or entry in a prize draw for £250. In study 2, participants in the incentive arm were offered a £10 Amazon.co.uk gift voucher. The primary outcome was the proportion of participants who completed follow-up questionnaires in the incentive arm(s) compared with the no incentive arm.

Results

In study 1 (n = 1226), there was no significant difference in response rates between those participants offered an incentive (175/615, 29%) and those with no offer (162/611, 27%) (difference = 2%, 95% confidence interval [CI] –3% to 7%). There was no significant difference in response rates among the three different incentives offered. In study 2 (n = 2591), response rates were 9% higher in the group offered an incentive (476/1296, 37%) than in the group not offered an incentive (364/1295, 28%) (difference = 9%, 95% CI 5% to 12%, P < .001). The incremental cost per extra successful follow-up in the incentive arm was £110 in study 1 and £52 in study 2.
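The reported differences in response rates can be reproduced from the counts above. The abstract does not state how the confidence intervals were calculated; the sketch assumes a simple Wald interval for a difference in two proportions, and the function name is illustrative.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in proportions (A minus B) with a Wald confidence interval."""
    pa = events_a / n_a
    pb = events_b / n_b
    diff = pa - pb
    se = math.sqrt(pa * (1.0 - pa) / n_a + pb * (1.0 - pb) / n_b)
    return diff, diff - z * se, diff + z * se


# Study 1: incentive offered (175/615) vs no offer (162/611)
d1, lo1, hi1 = risk_difference(175, 615, 162, 611)
print(round(100 * d1), round(100 * lo1), round(100 * hi1))  # 2 -3 7

# Study 2: incentive offered (476/1296) vs no offer (364/1295)
d2, lo2, hi2 = risk_difference(476, 1296, 364, 1295)
print(round(100 * d2), round(100 * lo2), round(100 * hi2))  # 9 5 12
```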

Conclusion

Whereas an offer of a £10 Amazon.co.uk gift voucher can increase follow-up rates in online trials, an offer of a lower incentive may not. The marginal costs involved require careful consideration.

The case-cohort study design combines the advantages of a cohort study with the efficiency of a nested case-control study. However, unlike more standard observational study designs, there are currently no guidelines for reporting results from case-cohort studies. Our aim was to review recent practice in reporting these studies, and develop recommendations for the future. By searching papers published in 24 major medical and epidemiological journals between January 2010 and March 2013 using PubMed, Scopus and Web of Knowledge, we identified 32 papers reporting case-cohort studies. The median subcohort sampling fraction was 4.1% (interquartile range 3.7% to 9.1%). The papers varied in their approaches to describing the numbers of individuals in the original cohort and the subcohort, presenting descriptive data, and in the level of detail provided about the statistical methods used, so it was not always possible to be sure that appropriate analyses had been conducted. Based on the findings of our review, we make recommendations about reporting of the study design, subcohort definition, numbers of participants, descriptive information and statistical methods, which could be used alongside existing STROBE guidelines for reporting observational studies.

Health care and health care services are increasingly being delivered over the Internet. There is a strong argument that interventions delivered online should also be evaluated online to maximize the trial’s external validity. Conducting a trial online can help reduce research costs and improve some aspects of internal validity. To date, there are relatively few trials of health interventions that have been conducted entirely online. In this paper we describe the major methodological issues that arise in trials (recruitment, randomization, fidelity of the intervention, retention, and data quality), consider how the online context affects these issues, and use our experience of one online trial evaluating an intervention to help hazardous drinkers drink less (DownYourDrink) to illustrate potential solutions. Further work is needed to develop online trial methodology.

Carotid intima-media thickness (cIMT) is related to the risk of cardiovascular events in the general population. An association between changes in cIMT and cardiovascular risk is frequently assumed but has rarely been reported. Our aim was to test this association.

Methods

We identified general population studies that assessed cIMT at least twice and followed up participants for myocardial infarction, stroke, or death. The study teams collaborated in an individual participant data meta-analysis. Excluding individuals with previous myocardial infarction or stroke, we assessed the association between cIMT progression and the risk of cardiovascular events (myocardial infarction, stroke, vascular death, or a combination of these) for each study with Cox regression. The log hazard ratios (HRs) per SD difference were pooled by random effects meta-analysis.

Findings

Of 21 eligible studies, 16 with 36 984 participants were included. During a mean follow-up of 7·0 years, 1519 myocardial infarctions, 1339 strokes, and 2028 combined endpoints (myocardial infarction, stroke, vascular death) occurred. Yearly cIMT progression was derived from two ultrasound visits 2–7 years (median 4 years) apart. For mean common carotid artery intima-media thickness progression, the overall HR of the combined endpoint was 0·97 (95% CI 0·94–1·00) when adjusted for age, sex, and mean common carotid artery intima-media thickness, and 0·98 (0·95–1·01) when also adjusted for vascular risk factors. Although we detected no associations with cIMT progression in sensitivity analyses, the mean cIMT of the two ultrasound scans was positively and robustly associated with cardiovascular risk (HR for the combined endpoint 1·16, 95% CI 1·10–1·22, adjusted for age, sex, mean common carotid artery intima-media thickness progression, and vascular risk factors). In three studies including 3439 participants who had four ultrasound scans, cIMT progression did not correlate between occasions (reproducibility correlations between r=−0·06 and r=−0·02).

Interpretation

The association between cIMT progression assessed from two ultrasound scans and cardiovascular risk in the general population remains unproven. No conclusion can be derived for the use of cIMT progression as a surrogate in clinical trials.

Background The extent to which adult height, a biomarker of the interplay of genetic endowment and early-life experiences, is related to risk of chronic diseases in adulthood is uncertain.

Methods We calculated hazard ratios (HRs) for height, assessed in increments of 6.5 cm, using individual–participant data on 174 374 deaths or major non-fatal vascular outcomes recorded among 1 085 949 people in 121 prospective studies.

Results For people born between 1900 and 1960, mean adult height increased 0.5–1 cm with each successive decade of birth. After adjustment for age, sex, smoking and year of birth, HRs per 6.5 cm greater height were 0.97 (95% confidence interval: 0.96–0.99) for death from any cause, 0.94 (0.93–0.96) for death from vascular causes, 1.04 (1.03–1.06) for death from cancer and 0.92 (0.90–0.94) for death from other causes. Height was negatively associated with death from coronary disease, stroke subtypes, heart failure, stomach and oral cancers, chronic obstructive pulmonary disease, mental disorders, liver disease and external causes. In contrast, height was positively associated with death from ruptured aortic aneurysm, pulmonary embolism, melanoma and cancers of the pancreas, endocrine and nervous systems, ovary, breast, prostate, colorectum, blood and lung. HRs per 6.5 cm greater height ranged from 1.26 (1.12–1.42) for risk of melanoma death to 0.84 (0.80–0.89) for risk of death from chronic obstructive pulmonary disease. HRs were not appreciably altered after further adjustment for adiposity, blood pressure, lipids, inflammation biomarkers, diabetes mellitus, alcohol consumption or socio-economic indicators.

Conclusion Adult height has directionally opposing relationships with risk of death from several different major causes of chronic diseases.

Case-cohort studies are increasingly used to quantify the association of novel factors with disease risk. Conventional measures of predictive ability need modification for this design. We show how Harrell’s C-index, Royston’s D, and the category-based and continuous versions of the net reclassification index (NRI) can be adapted.

Methods

We simulated full cohort and case-cohort data, with sampling fractions ranging from 1% to 90%, using covariates from a cohort study of coronary heart disease, and two incidence rates. We then compared the accuracy and precision of the proposed risk prediction metrics.

Results

The C-index and D must be weighted in order to obtain unbiased results. The NRI does not need modification, provided that the relevant non-subcohort cases are excluded from the calculation. The empirical standard errors across simulations were consistent with analytical standard errors for the C-index and D but not for the NRI. Good relative efficiency of the prediction metrics was observed in our examples, provided the sampling fraction was above 40% for the C-index, 60% for D, or 30% for the NRI. Stata code is made available.

Conclusions

Case-cohort designs can be used to provide unbiased estimates of the C-index, D measure and NRI.
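
As a rough illustration of the weighting idea, the Python sketch below computes a pairwise-weighted version of Harrell's C-index. The weighting scheme is an assumption made for this example and is not taken from the paper: every case gets weight 1, each subcohort non-case gets the inverse of the subcohort sampling fraction, and a comparable pair is weighted by the product of its two weights. The function name `weighted_c_index` and the toy data are hypothetical.

```python
def weighted_c_index(times, events, risk, weights):
    """Pairwise-weighted concordance index (illustrative sketch only).

    A pair (i, j) is comparable when subject i had an event strictly
    before subject j's observed time. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # only subjects with an event can anchor a pair
        for j in range(n):
            if i == j or not times[i] < times[j]:
                continue
            w = weights[i] * weights[j]  # assumed pair weight
            comparable += w
            if risk[i] > risk[j]:
                concordant += w          # earlier event, higher predicted risk
            elif risk[i] == risk[j]:
                concordant += 0.5 * w    # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: two cases, two subcohort non-cases sampled at 25%.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 0]
risk = [0.9, 0.3, 0.5, 0.1]
sampling_fraction = 0.25
weights = [1.0 if e else 1.0 / sampling_fraction for e in events]

c_weighted = weighted_c_index(times, events, risk, weights)
c_unweighted = weighted_c_index(times, events, risk, [1.0] * len(times))
print(c_weighted, c_unweighted)
```

With all weights set to 1 the function reduces to the ordinary unweighted C-index (ignoring tied times), so the two calls show how up-weighting sampled non-cases changes the estimate.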

Background An allele score is a single variable summarizing multiple genetic variants associated with a risk factor. It is calculated as the total number of risk factor-increasing alleles for an individual (unweighted score), or the sum of weights for each allele corresponding to estimated genetic effect sizes (weighted score). An allele score can be used in a Mendelian randomization analysis to estimate the causal effect of the risk factor on an outcome.

Methods Data were simulated to investigate the use of allele scores in Mendelian randomization where conventional instrumental variable techniques using multiple genetic variants demonstrate ‘weak instrument’ bias. The robustness of estimates using the allele score to misspecification (for example non-linearity, effect modification) and to violations of the instrumental variable assumptions was assessed.

Results Causal estimates using a correctly specified allele score were unbiased with appropriate coverage levels. The estimates were generally robust to misspecification of the allele score, but not to instrumental variable violations, even if the majority of variants in the allele score were valid instruments. Using a weighted rather than an unweighted allele score increased power, but the increase was small when genetic variants had similar effect sizes. Naive use of the data under analysis to choose which variants to include in an allele score, or for deriving weights, resulted in substantial biases.

Conclusions Allele scores enable valid causal estimates with large numbers of genetic variants. The stringency of criteria for genetic variants in Mendelian randomization should be maintained for all variants in an allele score.
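
The two scores defined in the Background can be sketched in a few lines of Python. This is a minimal illustration, assuming each genotype is coded as the count (0, 1 or 2) of risk factor-increasing alleles; the function name `allele_scores` and all numbers are hypothetical. In line with the Results, the weights are assumed to come from an external source rather than from the data under analysis.

```python
def allele_scores(genotypes, weights):
    """Return (unweighted, weighted) allele scores for one individual.

    genotypes: counts (0, 1 or 2) of risk factor-increasing alleles per variant
    weights:   per-variant effect sizes, assumed to be externally estimated
    """
    if len(genotypes) != len(weights):
        raise ValueError("one weight is needed per genetic variant")
    unweighted = sum(genotypes)
    weighted = sum(g * w for g, w in zip(genotypes, weights))
    return unweighted, weighted


# Hypothetical individual with four variants and made-up weights.
genotypes = [0, 1, 2, 1]
weights = [0.5, 0.25, 0.125, 1.0]
unweighted, weighted = allele_scores(genotypes, weights)
print(unweighted, weighted)
```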

Background Within-person variability in measured values of a risk factor can bias its association with disease. We investigated the extent of regression dilution bias in calculated variables and its implications for comparing the aetiological associations of risk factors.

Results Regression dilution ratios (RDRs) in calculated risk factors depend strongly on the RDRs, correlation, and comparative distributions of the components of these risk factors. For measures of adiposity, the RDR was lower for waist-to-hip ratio (WHR) [RDR: 0.72 (95% confidence interval 0.65–0.80)] than for either of its components [waist circumference: 0.87 (0.85–0.90); hip circumference: 0.90 (0.86–0.93)] or for BMI [0.96 (0.93–0.98)] and waist-to-height ratio (WHtR) [0.87 (0.85–0.90)], predominantly because of the stronger correlation and more similar distributions observed between waist circumference and hip circumference than between height and weight or between waist circumference and height. Error-corrected HRs for BMI, waist circumference, WHR, and WHtR were, respectively, 1.24, 1.30, 1.44, and 1.32 per SD change in baseline levels of these variables, and 1.24, 1.27, 1.35, and 1.30 per SD change in error-corrected levels.

Conclusions The extent of within-person variability relative to between-person variability in calculated risk factors can be considerably larger (or smaller) than in their components. Aetiological associations of risk factors should be compared through the use of error-corrected HRs per SD change in error-corrected levels of these risk factors.
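
The distinction between the two kinds of error-corrected HR can be sketched as follows. This is a simplified illustration under the classical measurement error model, not the authors' estimation procedure: it assumes the corrected log HR equals the observed log HR divided by the regression dilution ratio (RDR), and that the SD of error-corrected levels equals the baseline SD multiplied by the square root of the RDR. The function names and input values are hypothetical.

```python
import math


def hr_per_baseline_sd(observed_hr, rdr):
    """Error-corrected HR per SD of baseline levels (log HR divided by RDR)."""
    if observed_hr <= 0 or rdr <= 0:
        raise ValueError("observed_hr and rdr must be positive")
    return math.exp(math.log(observed_hr) / rdr)


def hr_per_corrected_sd(observed_hr, rdr):
    """Error-corrected HR per SD of error-corrected levels.

    Assumes SD(corrected) = sqrt(RDR) * SD(baseline), so the corrected
    log HR is rescaled by sqrt(RDR).
    """
    if observed_hr <= 0 or rdr <= 0:
        raise ValueError("observed_hr and rdr must be positive")
    return math.exp(math.log(observed_hr) / math.sqrt(rdr))


# Hypothetical inputs: uncorrected HR of 1.20 per baseline SD, RDR of 0.72.
observed = 1.20
rdr = 0.72
per_baseline = hr_per_baseline_sd(observed, rdr)
per_corrected = hr_per_corrected_sd(observed, rdr)
print(round(per_baseline, 3), round(per_corrected, 3))
```

Under these assumptions, a risk factor with a low RDR moves further from the uncorrected HR on the baseline-SD scale than on the error-corrected-SD scale, which is why the choice of scale matters when ranking risk factors.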

Genetic markers can be used as instrumental variables, in an analogous way to randomization in a clinical trial, to estimate the causal relationship between a phenotype and an outcome variable. Our purpose is to extend the existing methods for such Mendelian randomization studies to the context of multiple genetic markers measured in multiple studies, based on the analysis of individual participant data. First, for a single genetic marker in one study, we show that the usual ratio of coefficients approach can be reformulated as a regression with heterogeneous error in the explanatory variable. This can be implemented using a Bayesian approach, which is next extended to include multiple genetic markers. We then propose a hierarchical model for undertaking a meta-analysis of multiple studies, in which it is not necessary that the same genetic markers are measured in each study. This provides an overall estimate of the causal relationship between the phenotype and the outcome, and an assessment of its heterogeneity across studies. As an example, we estimate the causal effect of blood concentrations of C-reactive protein on fibrinogen levels using data from 11 studies. These methods provide a flexible framework for efficient estimation of causal relationships derived from multiple studies. Issues discussed include weak instrument bias, analysis of binary outcome data such as disease risk, missing genetic data, and the use of haplotypes.
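
For orientation, the usual ratio of coefficients approach for a single genetic marker in one study can be sketched in Python as below. This shows only the simple ratio estimate (the marker-outcome regression slope divided by the marker-phenotype regression slope), not the Bayesian or hierarchical models the abstract goes on to describe. The helper names and the toy data are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    return sxy / sxx


def ratio_estimate(marker, phenotype, outcome):
    """Ratio of coefficients: (outcome on marker) / (phenotype on marker)."""
    slope_marker_phenotype = ols_slope(marker, phenotype)
    if slope_marker_phenotype == 0:
        raise ValueError("marker is not associated with the phenotype")
    return ols_slope(marker, outcome) / slope_marker_phenotype


# Hypothetical toy data for six individuals (allele counts 0, 1 or 2).
marker = [0, 0, 1, 1, 2, 2]
phenotype = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]
outcome = [3.1, 3.3, 4.0, 4.4, 5.1, 5.3]

estimate = ratio_estimate(marker, phenotype, outcome)
print(estimate)
```

When the marker-phenotype slope is close to zero the ratio becomes unstable, which is one way to see the weak instrument issue mentioned in the abstract.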