The announcement last summer that the risks outweigh the benefits of postmenopausal hormone replacement therapy (HRT) took many by surprise. A large randomized trial by the National Institutes of Health–funded Women's Health Initiative (WHI) 1 found that Prempro, a combination of estrogen and progestin often prescribed to postmenopausal women, increases the risk of breast cancer, coronary heart disease (CHD), stroke and pulmonary embolism. The drug reduces risk for bone fractures and colon cancer, but not enough to outweigh the accompanying risks.

Why had this drug been prescribed so ubiquitously? Had women and their physicians been misled by the observational studies? Here we take a brief look at the findings of case-control and cohort studies on HRT in relation to breast cancer, CHD, stroke, bone fractures and colon cancer. We shall see that, with the notable exception of CHD, the observational studies fare well: they predicted the WHI findings for all of the other endpoints.

Observational Studies vs Trials: How Well Do They Agree?

Shortly after publication of the WHI findings, Nelson et al. 2 reviewed the evidence on benefits and risks of HRT for the primary prevention of breast cancer, CHD, thromboembolism, osteoporosis and colon cancer. For breast cancer, the data reviewed by these authors suggest that current estrogen users have a 20%–40% increased risk 3–5 and that the risk increases with duration of use. 3–7 However, Nelson and colleagues report that observational studies and several meta-analyses show no difference in risk between women with any prior estrogen use and never-users. 3–9

The review by Nelson et al. 2 also found a statistically significant increase in stroke incidence (relative risk [RR] = 1.12; 95% confidence interval [CI] = 1.01–1.23), but not in stroke mortality, in ever-users of HRT compared with never-users. A combined analysis of 12 studies, including three randomized clinical trials, revealed a two-fold increase in risk for deep vein thrombosis and pulmonary embolism among current HRT users (RR = 2.1; CI = 1.6–2.8). 2 They also noted a roughly 3.5-fold increase in risk within the first year of use (RR = 3.5; CI = 2.3–5.6).

Several observational studies reviewed by Nelson et al. 2 found statistically significant reductions in osteoporotic bone fracture risk associated with HRT use. Two cohort studies reported a 60% reduction in risk for wrist fractures 10,11 and a 40% reduction in risk for vertebral fractures 11 among current or ever-users of HRT compared with never-users. A combined analysis of six cohort studies showed a nonsignificant reduction in hip-fracture risk for current users of HRT (RR = 0.64; CI = 0.32–1.04) and for ever-users of HRT (RR = 0.76; CI = 0.56–1.01) compared with never-users. 10–15

Current or ever-users of HRT are also at reduced risk for colon cancer compared with never-users, based on a meta-analysis of 18 observational studies. 16

The big difference between the observational studies and the WHI findings concerns the effects of HRT on heart disease. Based on their meta-analysis of 21 observational studies, Nelson et al. 2 noted reductions in CHD incidence (RR = 0.8; CI = 0.68–0.95) and mortality (RR = 0.62; CI = 0.40–0.90) among current users of HRT compared with never-users. However, estimated risk reductions did not achieve statistical significance in past users or ever-users of HRT. Moreover, statistically significant reductions were not seen when the authors analyzed only those studies that controlled for socioeconomic status (RR = 0.91; CI = 0.67–1.3). 17–20 The possibility of confounding was further suggested when Nelson et al. 2 restricted analysis to studies that controlled for alcohol consumption, physical activity and other CHD risk factors and again found no significant association between HRT and CHD. 18–21
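
As a reading aid, the pooled estimates quoted in this section can be sorted by whether their 95% confidence interval excludes the null value of RR = 1, which is the usual criterion for statistical significance at the 5% level. The short Python sketch below does only that. The numbers are those quoted above from Nelson et al.; the labels, function names and table layout are our own illustrative choices, not part of the review.

```python
# Pooled estimates as quoted in the text: (label, RR, CI lower, CI upper).
# Labels are illustrative shorthand, not wording from the review itself.
REPORTED = [
    ("Stroke incidence, ever-users", 1.12, 1.01, 1.23),
    ("DVT/pulmonary embolism, current users", 2.1, 1.6, 2.8),
    ("DVT/pulmonary embolism, first year of use", 3.5, 2.3, 5.6),
    ("Hip fracture, current users", 0.64, 0.32, 1.04),
    ("Hip fracture, ever-users", 0.76, 0.56, 1.01),
    ("CHD incidence, current users", 0.80, 0.68, 0.95),
    ("CHD mortality, current users", 0.62, 0.40, 0.90),
    ("CHD, studies controlling for socioeconomic status", 0.91, 0.67, 1.30),
]


def excludes_null(lower, upper, null=1.0):
    """Return True when the interval lies entirely on one side of the null."""
    return lower > null or upper < null


def summarize(rows):
    """Classify each estimate by direction and by whether its CI excludes 1."""
    summary = []
    for label, rr, lower, upper in rows:
        direction = "increased" if rr > 1.0 else "reduced"
        verdict = "significant" if excludes_null(lower, upper) else "not significant"
        summary.append((label, direction, verdict))
    return summary


for label, direction, verdict in summarize(REPORTED):
    print(f"{label}: {direction} risk, {verdict} at the 5% level")
```

Note that this criterion says nothing about bias: an interval that excludes 1 is only as trustworthy as the adjustment for confounding behind it.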

Why the Discrepancies for CHD?

The most plausible explanation for the disagreement between the observational studies and the WHI findings for CHD is residual confounding in the observational studies. Compared with never-users, HRT users in virtually every observational study were better educated, leaner, more physically active, less likely to smoke, more health conscious and more likely to seek medical care. It is noteworthy that the protective effect of HRT did not achieve statistical significance when the meta-analysis of Nelson et al. 2 was restricted to those observational studies that controlled for socioeconomic status.

Lessons to Be Learned

The first lesson is that it is not yet time for epidemiologists to abandon their fieldwork and become trialists. The good agreement between the observational studies and the trial on endpoints other than CHD confirms the utility and validity of observational studies as monitors of new preventive agents. Moreover, randomized trials cannot evaluate the long-term effects of preventive agents; we will continue to need observational studies to provide this important information. Indeed, long-term observational follow-up of the WHI participants for delayed effects will provide essential data on the lifetime risks and benefits of HRT use among former users.

What can we learn from the CHD discrepancy? Two lessons come to mind. The first is that large sample sizes may be necessary, but they are not sufficient for accurate answers. Bigger need not be better. In the presence of unrecognized residual confounding, the precision in risk estimates gained by large sample sizes can be misleading: a narrow confidence interval can be far removed from the true measure of association. Indeed, the danger of impressively tiny P-values and of impressively narrow confidence intervals that lie far from the mark is a well-known pitfall of meta-analysis. Epidemiology is grounded in the principle of confirming findings by repeating them in different populations and different settings. Nevertheless, the same residual confounding may plague all of the data. Indeed, the consistency of the observational studies may even make it impossible to launch a costly randomized trial. During the planning of the WHI, for instance, the view that HRT prevents heart disease was so entrenched that some argued that it would be unethical to deny some women the drug and instead give them a placebo.
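
The point that precision is not accuracy can be made concrete with a small calculation. The Python sketch below is a toy example under stated assumptions, not a reanalysis of any study: it posits a single unmeasured binary confounder ("health-conscious"), a drug with no effect whatsoever on CHD, and invented prevalences and risks. All parameter names and values are hypothetical. It computes the expected crude relative risk and its 95% confidence interval (standard log-RR method) for cohorts of increasing size.

```python
import math


def crude_rr_with_ci(n, p_healthy=0.5, p_use_healthy=0.6, p_use_other=0.2,
                     risk_healthy=0.05, risk_other=0.10, z=1.96):
    """Expected crude RR (users vs never-users) and 95% CI for a cohort of size n.

    All parameter values are hypothetical. By construction, risk depends only
    on the confounder stratum, so the true RR for the drug is exactly 1.
    """
    # Expected numbers of users and never-users in each confounder stratum.
    users_h = n * p_healthy * p_use_healthy
    users_o = n * (1 - p_healthy) * p_use_other
    never_h = n * p_healthy * (1 - p_use_healthy)
    never_o = n * (1 - p_healthy) * (1 - p_use_other)

    # Expected cases: same risk for users and never-users within a stratum.
    cases_users = users_h * risk_healthy + users_o * risk_other
    cases_never = never_h * risk_healthy + never_o * risk_other
    n_users = users_h + users_o
    n_never = never_h + never_o

    # Crude (unadjusted) relative risk, ignoring the confounder.
    rr = (cases_users / n_users) / (cases_never / n_never)

    # Standard error of log(RR) for cumulative-incidence data.
    se = math.sqrt(1 / cases_users - 1 / n_users + 1 / cases_never - 1 / n_never)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


for n in (1_000, 10_000, 100_000):
    rr, lower, upper = crude_rr_with_ci(n)
    print(f"n = {n:>7}: crude RR = {rr:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Under these assumed values the crude RR is 0.75 at every sample size, because the bias comes from the mix of health-conscious women among users, not from chance. Only the interval changes: it includes 1 for the cohort of 1,000 and excludes it for the cohort of 100,000, so the larger study is more confidently wrong.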

The second lesson to be learned from the CHD discrepancy is that intermediate endpoints do not tell the whole tale. The observational data showing a CHD risk reduction in HRT users were backed by many “mechanistic” studies showing a favorable effect on markers such as cholesterol levels and atherosclerosis. But these endpoints are not synonymous with CHD, and HRT may act adversely on other endpoints that also contribute to risk. Thus, reliance on an intermediate endpoint may be misleading. For example, fluoride supplements were thought to reduce the risk of osteoporotic fractures because fluoride increases bone mineral density. 22 However, randomized trials have indicated no effect on spinal fractures despite a marked increase in spinal bone mass, and they have found an increase in fractures in areas outside the spine. 23–25 Denser bones are not necessarily stronger ones; the bone formed during the administration of fluoride may be structurally abnormal because of defective mineralization of the newly synthesized bone. 26

The utility of an intermediate endpoint rests on its predictive power for the ultimate outcome of interest. 27 As intermediaries between HRT and CHD, lipid levels score fairly well on this front. However, using them as surrogates for CHD oversimplifies a multifaceted problem. Instead, we need to consider a “systems approach” to the effects of perturbing one or more of the many metabolic pathways leading to a complex disease such as CHD. 28

Thus, using the WHI findings as a gold standard, we give the observational studies a grade of B on their performance in charting the short- to moderate-term effects of combined estrogen and progestin in postmenopausal women. Future challenges include continued rigorous attention to the pitfalls of confounding in observational studies and a healthy skepticism about the predictive power of intermediate endpoints.

About the Authors

ALICE S. WHITTEMORE conducts research on the genetic epidemiology of cancers of the prostate, ovary and breast. She is Professor of Epidemiology and Biostatistics at Stanford University School of Medicine. She is a member of the Institute of Medicine, a Fellow of the American Association for the Advancement of Science, a Fellow of the American Statistical Association and a member of the American Epidemiological Society.

VALERIE MCGUIRE has published on the epidemiology of site-specific cancers, neurologic disorders and cardiovascular disease.