Abstract

Background: Health related quality of life (HRQOL) is increasingly recognised as an important outcome in epilepsy. However, interpretation
of HRQOL data is difficult because there is no agreement on what constitutes a clinically important change in the scores of
the various instruments.

Objectives: To determine the minimum clinically important change, and small, medium, and large changes, in widely used epilepsy specific
and generic HRQOL instruments.

Methods: Patients with difficult to control focal epilepsy (n = 136) completed the QOLIE-89, QOLIE-31, SF-36, and HUI-III questionnaires
twice, six months apart. Patient centred estimates of minimum important change, and of small, medium, and large change, were
assessed on self administered 15 point global rating scales. Using regression analysis, the change in each HRQOL instrument
that corresponded to the various categories of change determined by patients was obtained. The results were validated in a
subgroup of patients tested at baseline and at nine months.

Results: The minimum important change was 10.1 for QOLIE-89, 11.8 for QOLIE-31, 4.6 for the SF-36 mental composite score, 3.0 for the SF-36 physical
composite score, and 0.15 for HUI-III. All instruments differentiated between no change and minimum important change with precision,
and QOLIE-89 and QOLIE-31 also distinguished accurately between minimum important change and medium or large change. Baseline
HRQOL scores and the type of treatment (surgical or medical) had no impact on any of the estimates, and the results were replicated
in the validation sample.

Conclusions: These estimates of minimum important change, and small, medium, and large changes, in four HRQOL instruments in patients
with epilepsy are robust and can distinguish accurately among different levels of change. The estimates allow for categorisation
of patients into various levels of change in HRQOL, and will be of use in assessing the effect of interventions in individual
patients.

Health related quality of life (HRQOL) is recognised as an important outcome in epilepsy treatment, and various instruments
have been developed to assess it in epilepsy. Typically, studies exploring the impact of epilepsy treatments on HRQOL compare
the mean score of instruments among various treatment groups and assess whether the differences are statistically significant.
However, it is difficult to interpret the importance of mean changes in HRQOL, regardless of their statistical significance.
This is because aggregate data (group means) convey no information about the number of individuals in a group who experience
clinically important change. For example, when the mean change for the group is not statistically significant or when it is
lower than a prespecified minimum threshold, clinicians may erroneously conclude that the treatment has no important effects.
As shown by Guyatt et al, small mean changes can conceal clinically important treatment effects in a substantial number of patients.1 Conversely, large mean changes can be accounted for by a small number of individuals experiencing large changes, while the
majority of the group remains unchanged.1 Clinical interpretation of HRQOL requires a notion of what constitutes clinically important, small, medium, and large changes
in instrument scores in individual patients.

We quantified the amount of change in commonly used epilepsy specific and generic HRQOL instruments that patients consider
as important—that is, the minimum clinically important change (MIC)—and we also obtained estimates of small, medium, and large
changes in these instruments.

METHODS

Patients

We prospectively assessed 136 consecutive adults with medically refractory focal epilepsy with or without secondary generalisation
who were investigated for epilepsy surgery. Patients aged 16 years or older were eligible if they could complete self administered
HRQOL questionnaires. They were excluded if they had non-epileptic seizures, learning disability, progressive central nervous
system disorders, or medical conditions precluding epilepsy surgery. We aimed to enrol a broad clinical spectrum of patients
representative of adults with medically refractory focal epilepsy. Patients gave informed consent and the institutional ethics
review board approved the study.

Health related quality of life instruments

We conceptualised HRQOL as the patients’ own experience of health, and assessed it through the patients’ perception of change in their
own health status.2 We quantified change in four HRQOL instruments. Two are epilepsy specific—the quality of life in epilepsy inventory-89 (QOLIE-89)3 and the QOLIE-31,4 a shorter instrument derived from QOLIE-89; and two are generic—the health utilities index mark III (HUI-III),5 and the medical outcomes study short form (SF-36).6 The latter is contained entirely within the QOLIE-89 as its generic core. The HUI-III generates utility scores that can be
used to obtain quality adjusted life years, the common metric for cost-effectiveness analysis. All instruments have satisfactory
internal consistency and content, construct and convergent validity, and responsiveness in epilepsy.7–10

Instruments were self administered twice, six months apart, following developers’ guidelines. All participants received the
QOLIE-89, from which QOLIE-31 and SF-36 are derived, while a subgroup of 80 patients received the HUI-III. Patients answered
all questionnaires at the same time and in the same order. All questionnaires were immediately reviewed for completeness and
patients were contacted for missing or ambiguous responses.

Patient centred assessment of change in HRQOL

To ascertain meaningful change in HRQOL11 we used patient centred global ratings of change, as described by Jaeschke et al.12 Because HRQOL in epilepsy is multidimensional, to obtain an overall impression of change we asked patients to specify the
direction and amount of change in five areas (questions): overall HRQOL, general health, social activities and work, seizures,
and drug side effects. Patients first rated the five areas as worse, about the same, or better compared with six months earlier,
and then scored the amount of change using a 15 point scale ranging from −7 (a very great deal worse) through 0 (no change)
to +7 (a very great deal better) (see appendix). Participants completed all five global rating questions at the same time
and in the same order, concurrently with the six month HRQOL questionnaires. All responses were reviewed for completeness
and validity, and patients were contacted if clarification was necessary.

The mean score of the five questions served as the summary global rating of change for each patient. In general, global rating
scores between 0 and ±1 are considered as benchmarks for no change, 2 to 3 or −2 to −3 as small change, 4 to 5 or −4 to −5
as medium change, and 6 to 7 or −6 to −7 as large change. These categories, initially specified by Jaeschke et al on the basis of experience and clinical intuition,12 have subsequently been validated in studies of diverse populations13–15 and using different techniques.16
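The summary rating and its benchmark bands can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the handling of fractional summary scores that fall between two bands (for example 1.4) is an assumption, since the text defines the bands only for whole numbers.

```python
from statistics import mean

def summary_global_rating(ratings):
    """Mean of the five global rating answers, each an integer from -7 to +7."""
    if len(ratings) != 5:
        raise ValueError("expected five global rating answers")
    if any(not -7 <= r <= 7 for r in ratings):
        raise ValueError("each rating must lie between -7 and +7")
    return mean(ratings)

def change_category(score):
    """Map a summary score to a benchmark band.

    Assumption: a fractional score lying between two bands is assigned
    to the higher band; the source does not specify this.
    """
    magnitude = abs(score)
    if magnitude <= 1:
        return "no change"
    direction = "better" if score > 0 else "worse"
    if magnitude <= 3:
        size = "small"
    elif magnitude <= 5:
        size = "medium"
    else:
        size = "large"
    return f"{size} change ({direction})"

# Hypothetical answers to the five questions, for illustration only.
answers = [3, 2, 4, 3, 3]
score = summary_global_rating(answers)
print(score, change_category(score))
```

A different rule for scores between bands (for example rounding to the nearest integer first) would shift a few borderline patients from one category to another.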

Investigators have equated a small change (a global rating score of 2 to 3) with the MIC. For patients with medically refractory
epilepsy, the MIC in global ratings is 3. This value arises from a previous study in which patients with medically refractory
epilepsy ascertained the minimum amount of worthwhile change in HRQOL, using a seven point scale ranging from 1 = no change
at all to 7 = a very great deal of change. On average, patients considered a change of 3 as the MIC.17 Other investigators have arrived at similar values for MIC in other conditions.12,13,16,18

Statistical analysis

Linear regression analyses were used to assess the relation between the patients’ summary global rating of change and change
(six month minus baseline total score) in QOLIE-89, QOLIE-31, HUI-III, and the physical (PCS) and mental (MCS) composite scores
of SF-36. The fitted regression line was anchored at zero (that is, no intercept). We examined the assumption that zero change
in summary global ratings corresponded to no change in HRQOL instruments by assessing the magnitude of the intercepts. The
values of the intercepts ranged from 0.05 to 1.0 and therefore justified our assumption. R² was used to assess the strength of the relation between the global rating and change in HRQOL. Estimates of change in HRQOL,
and the corresponding 95% confidence intervals (CI), were calculated from the fitted regression line using the midpoint of
the four benchmarks of change for the summary global rating: 0.5, no change; 2.5, small change; 4.5, medium change; and 6.5,
large change. An estimated change in HRQOL for a MIC of 3 was also calculated from the fitted regression line.
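The regression step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the data are invented, the confidence interval uses a normal approximation (z = 1.96) rather than a t quantile, and the R² shown is the uncentred form often reported for models without an intercept, which may differ from the definition the authors used.

```python
import numpy as np

# Midpoints of the global rating benchmarks, plus the MIC of 3, as given in the text.
BENCHMARKS = {"no change": 0.5, "small": 2.5, "MIC": 3.0, "medium": 4.5, "large": 6.5}

def fit_through_origin(global_rating, hrqol_change):
    """Least squares fit of change = slope * rating, with no intercept."""
    x = np.asarray(global_rating, dtype=float)
    y = np.asarray(hrqol_change, dtype=float)
    sxx = np.sum(x * x)
    slope = np.sum(x * y) / sxx
    residuals = y - slope * x
    # One parameter is estimated, so there are n - 1 residual degrees of freedom.
    slope_se = np.sqrt(np.sum(residuals ** 2) / (len(x) - 1) / sxx)
    # Uncentred R^2, a common convention for no-intercept models (assumption).
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum(y ** 2)
    return slope, slope_se, r2

def benchmark_estimates(slope, slope_se, z=1.96):
    """Predicted HRQOL change and approximate 95% CI at each benchmark."""
    return {
        label: (slope * x0, (slope - z * slope_se) * x0, (slope + z * slope_se) * x0)
        for label, x0 in BENCHMARKS.items()
    }

# Hypothetical data for illustration only; not the study data.
ratings = [-3.0, -1.0, 0.0, 1.5, 2.5, 4.0, 6.0]
changes = [-9.0, -5.0, 1.0, 4.0, 10.0, 12.0, 22.0]
slope, slope_se, r2 = fit_through_origin(ratings, changes)
for label, (estimate, low, high) in benchmark_estimates(slope, slope_se).items():
    print(f"{label}: {estimate:.1f} ({low:.1f} to {high:.1f})")
```

Because the line passes through the origin, each benchmark estimate and its interval are simply the slope and its confidence limits multiplied by the benchmark value.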

Several investigators have noted that baseline HRQOL scores can substantially influence the magnitude of change in HRQOL.18–20 We assessed the extent of this influence, as recommended by Bland and Altman,21 by calculating the Pearson product–moment correlation coefficients between HRQOL change scores and the mean of baseline and
six month scores. In addition, we used Pearson correlation analyses to determine whether the level of baseline HRQOL influenced
global ratings—that is, whether patients with poorer baseline HRQOL systematically appraised change differently from those
with better HRQOL. Finally, we validated our results by performing the same analyses in a subgroup of 80 patients who answered
the same set of HRQOL questionnaires and global ratings at nine months.
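The check for dependence of change on the level of HRQOL can be sketched as below: the change score is correlated with the mean of the baseline and six month scores, as described above. The function name and the data are hypothetical, and the sketch does not include a significance test.

```python
import numpy as np

def change_vs_mean_correlation(baseline, follow_up):
    """Pearson r (and r squared) between change scores and the mean of both visits."""
    b = np.asarray(baseline, dtype=float)
    f = np.asarray(follow_up, dtype=float)
    change = f - b            # follow up minus baseline
    average = (b + f) / 2.0   # mean of baseline and follow up scores
    r = np.corrcoef(change, average)[0, 1]
    return r, r ** 2

# Hypothetical scores for illustration only; not the study data.
baseline = [40.0, 50.0, 60.0, 70.0, 55.0]
six_month = [50.0, 55.0, 75.0, 72.0, 54.0]
r, r_squared = change_vs_mean_correlation(baseline, six_month)
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```

Correlating change with the average of the two measurements, rather than with the baseline alone, avoids the spurious correlation that arises because the baseline score is itself part of the change score.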

RESULTS

Of 136 patients, 70 underwent surgery and 66 continued to receive medical treatment for their epilepsy. The patients’ clinical
characteristics were representative of patients suffering from difficult to control focal epilepsy (table 1). The response
rate for all instruments and global ratings ranged from 94% to 100%. There were no ceiling or floor effects in any of the
HRQOL instruments. Mean scores at baseline, follow up, and change were similar to those described in previous reports (table
2). The summary global ratings showed that 85 patients (63%) rated themselves improved (> 1), 25 (18%) as unchanged (0 to
1), and 26 (19%) as worse (≤ −1); eight patients (6%) endorsed maximum change. The mean (SD) of the summary global rating score
for the group was 1.9 (2.8).

Table 2 Mean scores and mean change for health related quality of life instruments

The R² values assessing the strength of the relation between summary global rating and change in HRQOL score are shown in table
3. The summary global rating was a good predictor of change for the epilepsy specific QOLIE-89 (fig 1) and QOLIE-31, and it
was a modest predictor of change for the generic HUI-III and a poor predictor of change for both composite scores of the generic
SF-36 (table 3). There were no significant correlations between the baseline HRQOL scores and change scores in QOLIE-89 (R² = 0.01), PCS (R² = 0.0001), MCS (R² = 0.002), and HUI-III (R² = 0.004). Although there was a statistically significant correlation between baseline and change scores in the QOLIE-31,
the R² (0.06) approached zero and is probably clinically insignificant. We found no correlations between baseline HRQOL and global
ratings. Baseline HRQOL explained only 1–8% (average 2%) of the variance in global ratings.

Figure 1 Scatterplot and fitted linear regression line showing the relation between change in QOLIE-89 scores and summary global rating
scores of change. The regression line is forced through the origin (that is, there is no intercept) because zero represents
no change in both scales. The slope coefficient (β1) is 3.38 and R² is 0.45. The regression equation is: QOLIE-89 change score = coefficient × summary global rating score. For example, a change
in QOLIE-89 of 10.14 corresponds to a summary global rating score of 3.
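The arithmetic in the legend can be reproduced directly from the reported slope. The sketch below uses only the coefficient 3.38 given above; the function names are hypothetical, and values computed from this rounded slope at other benchmarks may differ slightly from the published table.

```python
SLOPE_QOLIE89 = 3.38  # slope coefficient reported in the figure legend

def qolie89_change_for_rating(rating):
    """QOLIE-89 change score predicted for a summary global rating score."""
    return SLOPE_QOLIE89 * rating

def rating_for_qolie89_change(change):
    """Summary global rating score implied by a QOLIE-89 change score."""
    return change / SLOPE_QOLIE89

# The legend's example: a global rating of 3 (the MIC).
print(round(qolie89_change_for_rating(3), 2))

# Midpoints of the small, medium, and large benchmark bands.
for midpoint in (2.5, 4.5, 6.5):
    print(midpoint, round(qolie89_change_for_rating(midpoint), 2))
```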

Table 3 shows the amount of change in HRQOL instruments that corresponds to no change, small, medium, or large change, and
MIC in summary global rating scores. Inspection of the 95% confidence intervals in table 3 shows that QOLIE-89 (fig 2) and
QOLIE-31 can distinguish precisely between no change, small, medium, or large change, as demonstrated by 95% confidence intervals
with minimum or no overlap. HUI-III distinguishes correctly between no change, small change, and medium change, but broad
95% confidence intervals around the estimate for large change limit the usefulness of the latter category. The MCS and PCS
components of SF-36 discriminate accurately only between no change and small change, estimates for large change being too
imprecise to be meaningful.

The MIC was 10.1 for QOLIE-89, 11.8 for QOLIE-31, 4.6 for SF-36 MCS, 3.0 for SF-36 PCS, and 0.15 for HUI-III. All instruments
were able to differentiate between no change and MIC with statistical significance, and QOLIE-89 and QOLIE-31 also distinguished
between MIC and medium or large change. As expected, the MIC is very close to small change, and their 95% confidence intervals
overlap in all instruments.

We examined whether the type of treatment (surgical or medical) influenced the results and found no differences in any of
the estimates between the two treatment groups.

A separate analysis using the same methods and instruments in a subgroup of 80 medically or surgically treated patients tested
at nine months generated similar results to those of the first analysis. The MICs were 12.0 (95% CI, 8.6 to 15.5) for QOLIE-89,
11.1 (7.7 to 14.5) for QOLIE-31, 4.4 (1.5 to 7.4) for SF-36 MCS, 4.1 (2.4 to 5.9) for SF-36 PCS, and 0.13 (0.07 to 0.19) for
HUI-III.

DISCUSSION

Assessing clinically important change in individual patients is increasingly recognised as a prerequisite for judging the
impact of interventions on HRQOL.22 This allows clinicians to obtain clinically useful measures such as absolute differences between treatments and the numbers
needed to treat1 for one additional patient to benefit. The importance of exploring change in individual patients rather than in the group
mean can be illustrated by further scrutiny of our data. Although the group mean change in all instruments was relatively
small (table 2), and it fell below the estimated MIC for all instruments (table 3), 53 patients (39%) experienced an improvement
equivalent to the MIC or larger (global rating ≥ 3); and of equal importance, HRQOL declined (global rating ≤ −1) in 26 patients
(19%).
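The move from group means to individual responders can be sketched as follows: count the patients whose change reaches the MIC in each group, take the absolute difference in proportions, and invert it to obtain the number needed to treat. The function names and the change scores are hypothetical; only the QOLIE-89 MIC of 10.1 comes from this study.

```python
def proportion_reaching_mic(changes, mic):
    """Fraction of patients whose change score is at least the MIC."""
    return sum(change >= mic for change in changes) / len(changes)

def number_needed_to_treat(p_treated, p_control):
    """NNT for one additional patient to reach the MIC."""
    difference = p_treated - p_control
    if difference <= 0:
        raise ValueError("treated group shows no benefit over control")
    return 1.0 / difference

QOLIE89_MIC = 10.1  # estimate reported in this study

# Hypothetical QOLIE-89 change scores for illustration only; not the study data.
treated_changes = [12.0, 3.0, 15.0, -2.0, 11.0, 20.0, 4.0, 10.5]
control_changes = [2.0, 11.0, -5.0, 0.0, 6.0, 13.0, 1.0, -3.0]

p_treated = proportion_reaching_mic(treated_changes, QOLIE89_MIC)
p_control = proportion_reaching_mic(control_changes, QOLIE89_MIC)
print(p_treated, p_control, number_needed_to_treat(p_treated, p_control))
```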

Lydick and Epstein have reviewed several methods to ascertain the MIC in individual patients.11 We chose the global ratings method because it is patient centred, clinically based, easy to use, has been validated in several
conditions, and generates similar MICs to those obtained by other approaches.16,18,23,24

Because the global ratings explained a relatively small proportion of the variance in both composite scores of the SF-36 (R² values of 0.12 and 0.15), our estimates of clinically meaningful change for this instrument are less accurate than for the
other three instruments. Research in epilepsy9 and in conditions such as carpal tunnel syndrome,25 chronic sinusitis,26 and angina,27 shows that SF-36 is less responsive to change than disease specific instruments. Some commentators have concluded that SF-36
is less apt than disease specific instruments for assessing clinically important change.25 Nevertheless, the MIC (3 to 4.6) and the effect size (0.3 to 0.4) for SF-36 in this study are similar to previous estimates
in other conditions and using different methods,28,29 and the 95% confidence intervals around the MIC were narrow. Therefore, we believe that SF-36 can adequately distinguish
between no change and MIC in epilepsy. The results are more accurate for HUI-III (R² = 0.30) and the estimates can distinguish among no change, MIC, or small, medium, and large change. We know of no reports
assessing clinically meaningful change for the HUI-III in other conditions. Finally, the estimates for QOLIE-89 and QOLIE-31
are quite accurate. The R² value is well within the range reported for other instruments in this type of analysis18,30,31 and the 95% confidence intervals are sufficiently narrow to distinguish accurately between categories of change.

An important question is whether baseline HRQOL scores influence the estimates of MIC. We found no correlation between baseline
scores and global ratings. Therefore, our estimates of clinically meaningful change seem applicable to a broad range of baseline
HRQOL values, including negative and positive scores (fig 1). However, because only 26 patients (19%) rated themselves as
worse, the results may be less accurate for judging worsening than improvement.

The MICs generated in this analysis should be viewed in the context of other measures of change in HRQOL. First, the effect
sizes corresponding to the MIC were 0.58 (medium) for QOLIE-89, 0.72 (medium to large) for QOLIE-31, 0.38 (small to medium)
for MCS, 0.3 (small to medium) for PCS, and 0.5 (medium) for HUI-III. Thus at least medium effect sizes are needed to detect
an MIC. Second, some commentators suggest that the standard error of measurement (SEM) is a reasonable approximation to the
MIC in some instruments.24 Using the SEM as a surrogate for the MIC in this epilepsy population is not supported because the MIC was considerably larger
than the reported SEM for QOLIE-89 (6.6), QOLIE-31 (5.5), and HUI-III (0.11).10 Finally, it is important to consider how certain one can be that the relatively small changes corresponding to the MIC in
various instruments represent real change as opposed to chance or measurement error. The upper 95% confidence limits of
the MIC for QOLIE-89, QOLIE-31, and HUI-III approach the reported threshold values for 90% certainty that real change has
occurred; and 95% certainty is achieved with medium change.10 We know of no such threshold values for SF-36 in epilepsy.
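The comparison between the MIC and the standard error of measurement can be made explicit with the values quoted above. The sketch below simply divides each MIC by the reported SEM; the variable names are hypothetical and no values beyond those in the text are used.

```python
# MIC estimates from this study and the reported SEM values quoted in the text.
mic = {"QOLIE-89": 10.1, "QOLIE-31": 11.8, "HUI-III": 0.15}
sem = {"QOLIE-89": 6.6, "QOLIE-31": 5.5, "HUI-III": 0.11}

# Ratio of MIC to SEM; a ratio near 1 would support the SEM as a surrogate.
ratios = {name: mic[name] / sem[name] for name in mic}

for name, ratio in ratios.items():
    print(f"{name}: MIC is {ratio:.2f} times the SEM")
```

In all three instruments the ratio is well above 1, which is the basis for not treating the SEM as a stand-in for the MIC in this population.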

It remains to be determined whether patients with different types and severity of epilepsy differ systematically in their
specification of MIC, or of small, medium, and large changes in HRQOL. For example, we have shown that patients with milder
forms of epilepsy usually stipulate a higher MIC than those with intractable epilepsy.17 Because the MICs reported here are based on the minimum change that patients with difficult to control epilepsy consider
worthwhile,17 they should be used with caution in those with milder forms of epilepsy. On the other hand, our estimates of small, medium,
and large change denote empirically derived benchmarks that may not be directly dependent on any particular patient population,
and may be applicable to a wide variety of patients with epilepsy. Finally, it is of interest that the type of treatment (surgery
or anticonvulsants) had no impact on any of the estimates. This is in agreement with a previous analysis showing that the
severity of epilepsy is a stronger determinant of MIC than the type of treatment.17

Conclusions

In summary, we ascertained the minimum clinically important change, and small, medium, and large changes in four HRQOL instruments
for individual patients with difficult to control epilepsy. The validity of these estimates is supported by our analysis demonstrating
applicability across a wide range of baseline HRQOL scores and different treatment modes, narrow 95% confidence intervals,
and replication in a validation sample. The measures are most accurate for the epilepsy specific instruments QOLIE-89 and
QOLIE-31, moderately accurate for the generic HUI-III, and least accurate for the generic tool SF-36. These estimates can
assist clinicians and researchers in determining the magnitude of the effect of interventions on individual patients’ quality
of life.