It’s increasingly clear that surrogate endpoints don’t tell the
entire story when it comes to a treatment’s effectiveness. Just a couple
of weeks ago, we learned that a drug which raises “good” cholesterol (a
surrogate for cardiovascular disease risk) had no effect on the incidence of heart attacks and strokes. Previous research found that aggressively lowering blood sugar (a surrogate for diabetes complications) actually increased the risk of death among individuals with type 2 diabetes.

Such findings remind us to focus more closely on the real outcomes
that matter to patients – things like death, disease severity, time
spent in the hospital, and patient quality of life. But even these
supposedly “real” outcomes can give us an inflated sense of how well a
treatment works if we don’t evaluate the results carefully. This is
especially true for studies that employ so-called composite endpoints,
or outcomes that consist of two or more components that are combined
into a single result.

Such outcomes are commonly used in studies testing new treatments for
cardiovascular disease. An example might be a composite outcome
consisting of non-fatal heart attacks, cardiovascular deaths, or
emergency surgery to treat a blocked coronary artery.

Why use composites? The main advantage of this approach is increased
statistical efficiency. By measuring more than one result and combining
the data in a single outcome, researchers have an easier time showing a
statistically significant difference between the treatment group and
controls. This allows for studies that require fewer patients, take less
time, and ultimately are more cost-effective. However, this approach
can also open the door to misdirection and statistical sleight of hand.
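
To make the efficiency argument concrete, here is a rough sketch using the standard normal-approximation formula for comparing two proportions. The event rates are hypothetical numbers chosen for illustration, not figures from any trial: a single endpoint that occurs in 4% of controls versus 3% of treated patients, and a composite that occurs in 12% versus 9% (the same 25% relative reduction).

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect a difference between
    two event rates (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical event rates, both with a 25% relative reduction.
single = patients_per_group(0.04, 0.03)      # one rare endpoint
composite = patients_per_group(0.12, 0.09)   # more common composite

print(f"Single endpoint: {single} patients per group")
print(f"Composite:       {composite} patients per group")
```

Under these assumptions the composite needs roughly a third as many patients (on the order of 1,600 per group rather than 5,300), simply because events are more common.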

For example, suppose a drug leads to a large reduction in a composite
outcome of “death or chest pain.” This finding could mean that the drug
resulted in fewer deaths and less chest pain. But it is also possible
that the composite was driven entirely by a reduction in chest pain with
no change, or even an increase, in death.
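
A small worked example shows how this can happen. All of the counts below are invented for illustration, and the sketch makes the simplifying assumption that no patient is counted under both components.

```python
# Hypothetical trial: 1,000 patients per arm. All counts are invented.
PATIENTS_PER_ARM = 1000
events = {
    "control": {"death": 20, "chest pain": 200},
    "drug": {"death": 22, "chest pain": 120},
}


def relative_risk_reduction(control_events, drug_events):
    """Relative risk reduction; a negative value means risk went up."""
    control_risk = control_events / PATIENTS_PER_ARM
    drug_risk = drug_events / PATIENTS_PER_ARM
    return 1 - drug_risk / control_risk


# Simplifying assumption: each patient has at most one of the two events,
# so the composite count is just the sum of the component counts.
composite_rrr = relative_risk_reduction(
    sum(events["control"].values()), sum(events["drug"].values())
)
death_rrr = relative_risk_reduction(
    events["control"]["death"], events["drug"]["death"]
)
chest_pain_rrr = relative_risk_reduction(
    events["control"]["chest pain"], events["drug"]["chest pain"]
)

print(f"Composite:  {composite_rrr:+.0%}")
print(f"Death:      {death_rrr:+.0%}")
print(f"Chest pain: {chest_pain_rrr:+.0%}")
```

In this made-up trial the composite falls by about 35%, yet deaths rise by 10%. The headline number is carried entirely by the 40% drop in chest pain.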

To help readers understand which components of the composite are most
responsible for any treatment effect, many experts emphasize the
importance of presenting data for all composite components in published
research studies. But in their systematic review of 40 randomized trials
that were published in 2008, Cordoba et al found that only 60% of the
studies they looked at provided reliable estimates for all composite
components. In many studies, there was a misleading implication that the
results applied to the most important clinical component of the
composite, when the results were primarily attributable to less serious
components.

Here’s another concern with composites: Many studies use components,
such as hospital admissions, that are based on a judgment call made by
the clinicians conducting the study. And these are often the components
of the composite that show the largest effect and contribute most to an
overall positive result. This is problematic, Cordoba and colleagues
note, because clinicians often aren’t blinded to the treatment that
study patients are receiving (i.e. they know whether the patient is in
the experimental treatment group or the placebo/control group). And so
their judgment in these cases could easily be biased by their knowledge
of the patient’s study group allocation. Not surprisingly, studies that
include such “clinician driven” components are more likely to report a statistically significant result for the primary outcome.

A final warning involves studies that “cherry pick” data to include
in the composite. When the components of the composite aren’t clearly
identified prior to the study, researchers may be tempted to mix and
match outcome components until they arrive at a statistically
significant result (something that’s bound to happen eventually due to
chance). In one study singled out by Cordoba et al,
the primary outcome, a composite of 8 different components,
wasn’t statistically significant. However, the authors also reported on a
number of secondary composites that consisted of “combinations of
primary end points as well as death from any cause.” These combinations
weren’t specified in the study, but Cordoba and colleagues calculated 502 ways that
these components could be combined. It’s no shock that the researchers
ultimately turned up a statistically significant result for one of these
combinations — a finding that was singled out for emphasis in the study
abstract, but which is of uncertain clinical importance.
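
For a sense of where a number like 502 could come from, and why it matters, consider the sketch below. It rests on an assumption about the counting method: that a secondary composite is any combination of two or more of the 9 available components (the 8 primary components plus death from any cause). That happens to reproduce the figure, but it may not be exactly how Cordoba and colleagues counted. The 0.05 significance threshold is likewise assumed as the conventional cutoff.

```python
from math import comb

# Assumption: 8 primary components plus death from any cause.
N_COMPONENTS = 9

# Count every combination of two or more components.
n_composites = sum(comb(N_COMPONENTS, k) for k in range(2, N_COMPONENTS + 1))

# If the treatment truly did nothing, each test run at the conventional
# 0.05 threshold has a 5% chance of a false positive. The expected number
# of false positives adds up across tests, whether or not the tests overlap.
ALPHA = 0.05
expected_false_positives = ALPHA * n_composites

print(f"Possible composites: {n_composites}")
print(f"Expected false positives if all were tested: {expected_false_positives:.1f}")
```

Nothing suggests the researchers tested every combination, and overlapping composites are strongly correlated with one another. The point is only that with this many options, a "significant" result somewhere is close to guaranteed.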

The bottom line is that we need to be careful when reporting on
studies that use composite outcomes. When these studies report a
benefit, reporters should evaluate whether there was a similar effect on
all components of the composite; if not, they should identify which
component of the composite was primarily responsible for the result, and
explain whether that component is more or less important than the
others. Be especially careful when a component depends on a judgment
call by the clinician (e.g. hospital admissions, referral
for surgery, initiation of new antibiotics), as these measures are more
likely to show a positive result that may reflect bias on the part of
the researchers.

Lastly, it’s also important to check whether the components of the
composite were determined before the study was initiated (a priori) or
after it was completed (post hoc). This can often be gleaned either from
a careful reading of the study itself or by checking its registry
listing (if one exists) at clinicaltrials.gov. Trial registries provide a
record of what outcomes were specified before the study started, so
that researchers can’t later decide to cherry pick other results that
showed a benefit. Post hoc changes to the composite components should
generally be viewed with skepticism.

Gary Schwitzer has specialized in health care journalism in a career of more than 30 years in radio, television, interactive multimedia and the Internet. He is publisher of the website HealthNewsReview.org, leading a team of more than two dozen people who grade daily health news reporting by major U.S. news organizations. In its first year, the project was honored with several journalism industry awards - the Mirror Award, honoring those who "hold a mirror to their own industry for the public’s benefit," and the Knight-Batten Award for Innovations in Journalism. His blog - which is embedded within HealthNewsReview.org - was voted 2009 Best Medical Blog in a competition hosted by Medgadget.com.
