Abstract

Background

Quality of life weights based on valuations of health states are often used in cost
utility analysis and population health measures. This paper reports on an attempt
to develop quality of life weights within the Zimbabwe context.

Methods

2,384 residents in randomly selected small residential plots of land in a high-density
suburb of Harare valued descriptors of 38 health states based on different combinations
of the five domains of the EQ-5D (mobility, self-care, usual activities, pain or discomfort
and anxiety or depression). The English version of the EQ-5D was used. The time trade-off
method was used to determine the values, and 19,020 individual preferences for health
states were analysed. A residual maximum likelihood linear mixed model was used to
estimate a function for predicting the values of all possible combinations of levels
on the five domains. The model was fit to a random subset of two-thirds of the observations,
with the remaining observations reserved for analysis of predictive validity. The
results were compared to a similar study undertaken in the United Kingdom.

Results

A credible model was developed to predict the values of states that were not valued
directly. In the subset of observations reserved for validation, the mean absolute
difference between predicted and observed values was 0.045. All domains of the EQ-5D
were found to contribute significantly to the model, both at the moderate and severe
levels. Severe pain was found to have the largest negative coefficient, followed by
the inability to wash and dress oneself.

Conclusion

Despite a generally lower education level than their European counterparts, urban
Zimbabweans appear to value health states in a consistent manner, and the determination
of a global method of establishing quality of life weights may be feasible and valid.
However, as the relative weightings of the different domains, although correlated,
differed from the standard set of weights recommended by the EuroQol Group, the locally
determined coefficients should be used within the Zimbabwean context.

Background

The resources available to health care are obviously finite, and prioritisation or
rationing of public health provision is on government agendas across the world [1]. Cost-utility analysis is one method of investigating the relationship between the
costs and benefits of health care that allows for comparison of different interventions
across different health states. The quality-adjusted life year (QALY) forms the basic
unit of measure in such evaluation and is the most widely used method for measuring
health outcomes [2]. The QALY is the arithmetic product of data on quantity of life and quality of life.
Whilst the former is typically measured in life years, the latter is measured in terms
of utility weights. There is little consensus as to how these weights should be developed,
but the measure should have at least interval properties and should represent the
preferences of society [3].
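
As a minimal sketch of the arithmetic described above, the following Python function multiplies the time spent in each health state by its utility weight and sums the result. The function name `qalys` and the example weights are hypothetical, chosen for illustration only; they are not taken from this study.

```python
def qalys(periods):
    """Sum of (years in state * utility weight of that state).

    `periods` is a list of (years, weight) pairs; weights are on a scale
    where 1.0 is full health and 0.0 is equivalent to death.
    """
    return sum(years * weight for years, weight in periods)


# Hypothetical profile: 4 years at weight 0.5, then 6 years at weight 1.0.
example_total = qalys([(4, 0.5), (6, 1.0)])
```

In this illustration ten calendar years count as eight QALYs, because the four years spent in the poorer state are discounted by half.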

There is a plethora of instruments for describing health-related quality of life,
most of which demonstrate acceptable psychometric properties [4]. Some of these measures, such as the SF-36 [5], are primarily profile measures that provide descriptors of health states. Others,
such as the Health Utilities Index (HUI) [6] and the EQ-5D [7], are linked directly to utility estimates, derived from population studies using
some method of eliciting population preferences, such as the standard gamble.

The EQ-5D describes health-related quality of life in terms of five dimensions: mobility
(MO), self-care (SC), usual activities (UA) (work, study, housework, family or leisure),
pain/discomfort (PD) and anxiety/depression (AD). Each dimension is subdivided into
three levels indicating no problem, a moderate problem or an extreme problem [7]. Different health states can be described by a five-digit code number relating to
the relevant level of each dimension, with the dimensions always listed in the order
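given above. Thus a health state of 11223 means no problems with mobility or self-care, moderate problems with usual activities, moderate pain or discomfort, and extreme anxiety or depression.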
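
The coding convention can be sketched in Python as follows. The names `DIMENSIONS`, `LEVELS` and `decode_state` are hypothetical helpers introduced for illustration; the level labels follow the wording used in the text.

```python
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = {1: "no problem", 2: "moderate problem", 3: "extreme problem"}


def decode_state(code):
    """Map a five-digit EQ-5D code such as '11223' to dimension levels."""
    if len(code) != 5 or any(ch not in "123" for ch in code):
        raise ValueError(f"not a valid EQ-5D state: {code!r}")
    # Digits are read in the fixed dimension order MO, SC, UA, PD, AD.
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, code)}


decoded = decode_state("11223")
```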

The validity and reliability of the EQ-5D have been found acceptable in Europe among
different populations and patient groups [9-11]. Despite the limited number of dimensions and levels, the instrument has been found
to be sensitive to improvements in health-related quality of life [12]. A test-retest study was undertaken in Zimbabwe to determine the reliability of the
English language version of the EQ-5D. Forty-four randomly selected subjects who had
a minimum of seven years of education and whose health status had remained static
over the previous seven days completed the instrument twice, one week apart. In all
domains except SC, approximately half of the respondents reported some or severe problems.
The kappa statistics were 0.695 (fair to good agreement) for SC, 0.878 for MO, 0.884
for UA, 0.892 for PD and 0.893 for AD (all excellent agreement beyond chance [13]). A similar reliability study on the version of the EQ-5D in Shona, the local Zimbabwean
language, reported that the kappa statistics between the two sets of scores were high
and ranged from 0.78 to 1.00 for different domains [14]. Although the Shona version was not used in the current exercise, multiple translators
examined the cross-cultural equivalence of meaning of the EQ-5D during the process
of forward and back translation. One of the conclusions of the translators of the
instrument was that "although it is likely that the Shona respondents will identify
it as a foreign instrument, Shona is able to capture the EQ-5D concepts. The respondents
will be able to recognise the concepts and respond appropriately..." [15]. It was concluded that, despite the different cultural understanding of determinants
of ill health, the English version of the EQ-5D could be used with confidence in an
educated urban Zimbabwe population.
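
For readers unfamiliar with the kappa statistic reported above, the following sketch shows how Cohen's kappa could be computed for one dimension from two administrations of the instrument. The function name and the ten hypothetical response pairs are illustrative assumptions, not data from the reliability study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal frequencies of each rating.
    freq1, freq2 = Counter(first), Counter(second)
    expected = sum(freq1[c] * freq2[c] for c in freq1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical test-retest levels (1-3) for one dimension, ten respondents.
week1 = [1, 1, 2, 2, 1, 3, 1, 2, 1, 1]
week2 = [1, 1, 2, 1, 1, 3, 1, 2, 1, 1]
kappa = cohen_kappa(week1, week2)
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it can be well below the raw agreement when most respondents fall in a single category.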

Several methods of valuation of health states have been developed, including rating
scales or visual analogue scales, magnitude estimation, standard gamble, time trade-off
and person trade-off methods [3]. The standard gamble has been extensively used to develop utility weights, and is
regarded by some as being the most theoretically sound method of determining utility
weights [16]. However, it is conceptually difficult and requires an ability to discriminate between
probabilities close to one [3]. Nord [17] proposes that time trade-off techniques are likely to be the most valid technique
for establishing preference weights for life years both in the clinical situation
and in program evaluation.

The Measurement and Valuation of Health Group (MVH), headed by Williams, used time
trade-off exercises to elicit preferences from 3,235 respondents in the United Kingdom
for a range of different EQ-5D descriptor states [8]. Regression analysis was used to develop a set of values for each individual component
of the five dimensions that can be used to calculate the value of health states not
observed directly [8]. The test-retest reliability of health state valuations collected with the EQ-5D
questionnaire is reported to be stable over time [18].

It is unlikely that preferences for health states are universal, although some health
states might be given similar valuations across cultures [19]. The greater the divergence of the local culture from the Western worldview, the
less likely it is that health state valuations will be the same [20]. Barker and Green [21] state that health state values should be developed locally based on the judgments
and priorities of local communities, in the service of these communities.

Much work has been done in developed countries on the valuation of health states [7,8,22-24], but there is a need to develop locally applicable measures of health that may be
used to monitor the impact of interventions in developing countries. The WHOQOL is
one of the few attempts to develop a genuinely international quality of life assessment
[25], but so far it has no direct link with a utility index. The primary objective of
this study was the generation of a set of weights for the different health states
as described by the EQ-5D that would represent the values of urban high-density dwellers
in Zimbabwe. Urban dwellers were chosen, as they were more likely to have the numeracy
and literacy skills necessary to participate in the exercise. Where appropriate the
results were compared with the results of the MVH study [8].

Methods

Subjects

In March 2000, 2,488 residents of randomly selected small residential plots of land
in Glenview, a high-density suburb of Harare, were interviewed in their homes. The
entrance criteria included completion of primary school education and a minimum age
of 15 years. The oldest person in each household who met the criteria was interviewed.

Instruments

English descriptors of 38 different health states based on the different combinations
of the five EQ-5D domains used in the original MVH study [8] were compiled on flash cards (see Appendices I and II). Thirty-eight health states, rather than
the original 42, were used, as unconscious and death were not valued
and two other states were excluded due to an administrative error. Each respondent
was asked to complete a self-assessment using the EQ-5D and to value his or her own
health condition on the EQ-5D visual analogue scale (VAS), which ranges from the "Worst
possible health state" at 0 to the "Best possible health state" at 100. Respondents
then each valued a different set of seven randomly selected health states (which included
one or two very mild, mild, moderate and severe states). All respondents also valued
an eighth state, the 33333 state. Valuation of states was undertaken using the time
trade-off (TTO) approach. For states better than death, subjects identified the length
of time in perfect health (11111), x, that they regarded as equivalent to spending ten years in the target state. In this case, a
larger x indicates a better health state. For states worse than death, the choice was between
dying immediately and spending a length of time (10 - x1) in the target state followed by x1 years in full health. A visual aid was used to clarify this choice. The greater the
number of years in full health perceived to compensate for the time spent in the target
state, the worse the health state. States worse than death were thus given negative
values in analysis.
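
The conversion of a TTO response into a value can be sketched as below. The text does not spell out the exact transformation, so this sketch assumes the simplest linear scoring: x / 10 for states better than death and -x1 / 10 for states worse than death. That scoring rule, the function name `tto_value` and its parameters are assumptions made for illustration and may differ from the scoring actually used.

```python
def tto_value(years_full_health, worse_than_death=False, horizon=10):
    """Convert a time trade-off response to a health state value.

    ASSUMPTION: linear scoring. For states better than death the value is
    x / horizon; for states worse than death it is -x1 / horizon, where x1
    is the number of years in full health needed as compensation.
    """
    if not 0 <= years_full_health <= horizon:
        raise ValueError("years must lie between 0 and the time horizon")
    value = years_full_health / horizon
    return -value if worse_than_death else value


better = tto_value(7)                           # 7 healthy years ~ 10 years in state
worse = tto_value(2.5, worse_than_death=True)   # 2.5 compensating healthy years
```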

Procedure

The full procedure is described in Appendices II and III – see 1. Nine interviewers, all of whom had higher degrees or diplomas of some kind, participated
in a training workshop over three days, which included a pilot study. All residential
plot numbers in Glenview were identified from a municipal map of the area, and a random
sample of 2,500 were chosen. In the event of no one being present at the identified
residential plot, the interviewers returned once more. The research assistants were
instructed to conduct interviews in the evenings and weekends to the extent possible,
but this was difficult because of the political unrest and weekend rallies at the
time, and many interviews took place during work hours. Before each interview, the
research assistant shuffled the 38 health states (excluding the 33333 health state)
and randomly chose seven states for the respondent to value. Check visits were conducted
by a supervisor to ensure that the randomly chosen residential plot had been visited.

Data analysis

Statistical analyses were undertaken using GenStat version 4.2 [26] and SPSS for Windows, Release 10 [13]. Descriptive statistics, χ² tests and 95% confidence intervals (CIs) were used to delineate the demographic characteristics
of the subjects and to compare them with population demographics of high-density dwellers
in Harare derived from census findings. The health characteristics of the respondents
in terms of the five EQ-5D domains were described. The sample of respondents was randomly
divided into three, and analysis was performed on two-thirds of the sample, the internal sample. The results were then used to estimate the values of the remaining one-third, the
external sample. The dependent variable was the TTO score divided by ten. A residual maximum likelihood
(REML) linear mixed model was fitted. Residual maximum likelihood estimation is a
method of estimating variance components in the context of unbalanced incomplete block
designs. It takes account of the loss of degrees of freedom in estimating the mean
and produces unbiased estimating equations for the variance parameters [27]. Interviewer effect and subject nested within interviewer were fitted as the random
effects. The three levels of the five domains were entered as the fixed effects and
a weighted least squares model was fitted. The fixed effects were entered in a forward
and backward sequence and their effects assessed using Wald statistics. An ANOVA full
factorial model with Type III least squares (N = 15,671) was used to establish the
source of variance. The dependent variable was the TTO score, and random factors entered
included research assistant, health state, occupational category and gender. Interactions
between the main effects were also investigated.
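
The random division into an internal (model-fitting) and an external (validation) sample can be sketched as follows. Splitting by respondent keeps all of one person's valuations in the same sample. The function name, the seed and the use of integer identifiers are hypothetical choices for illustration, not a description of the software actually used.

```python
import random


def split_respondents(respondent_ids, internal_fraction=2 / 3, seed=0):
    """Randomly assign respondents to an internal (fitting) sample and an
    external (validation) sample."""
    ids = list(respondent_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    cut = round(len(ids) * internal_fraction)
    return ids[:cut], ids[cut:]


# 2,384 analysed respondents, identified here by hypothetical integer IDs.
internal, external = split_respondents(range(2384))
```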

Results

Subjects

Forty-eight respondents refused to answer the questionnaire. The data from 56 respondents
were incomplete, and the replies from 201 respondents demonstrated inconsistency but
were included in the analysis (see discussion below). Inconsistent data included responses
in which all states were given the same value, fewer than three states were valued,
or there were more than three logical inconsistencies (e.g. if a 11112 state were
valued as being more severe than a 11113 state). Ultimately, the responses of 2,384
subjects were analysed. The demographic details of the respondents were compared with
the results for Highfield in the 1992 census Harare Profile [28] or, if not available, with results for Harare Province (Table 1). Males were underrepresented at 38.3%, a proportion which fell outside the 95% CI
for the population (52.4 – 53.1%). There were more young adult respondents (46.2%)
than in the general population (38.9%, CI = 38.6 – 39.3), but the decline in numbers
with increasing age was similar to the population of Harare.
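
The logical inconsistency criterion mentioned above can be illustrated with a short sketch. A state is taken to be logically better than another if it is no worse on any dimension and better on at least one; an inconsistency is counted when the logically better state receives the lower value. Treating ties as consistent is an assumption of this sketch, and the function names and example responses are hypothetical.

```python
def dominates(better, worse):
    """True if `better` is at least as good on every dimension and strictly
    better on at least one (a lower level means fewer problems)."""
    pairs = list(zip(better, worse))
    return all(b <= w for b, w in pairs) and any(b < w for b, w in pairs)


def count_inconsistencies(valuations):
    """Count pairs where a logically better state received a lower value.

    `valuations` maps five-digit state codes to one respondent's values.
    """
    states = list(valuations)
    return sum(
        1
        for a in states
        for b in states
        if dominates(a, b) and valuations[a] < valuations[b]
    )


# Hypothetical respondent: 11112 is valued below the logically worse 11113.
responses = {"11112": 0.60, "11113": 0.70, "22222": 0.50, "33333": -0.20}
n_inconsistent = count_inconsistencies(responses)
```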

Tenure status, which is an indicator of socio-economic status, indicated that, as
in the population, the majority were lodgers (45.2%) or owners (34.8%). The sample
was better educated than the population with 67.1% having attained 11 to 13 years
of schooling compared to the population estimate of 48.3 – 48.5%. The percentage employed
in the sample (33.3%) was smaller than the reference population (CI = 54.1 – 54.3).
Approximately 72% of the respondents interviewed by the female research assistants were
female, compared to 54.5% of those interviewed by the male research assistants (χ² = 552.0, p < 0.001), and 57.3% were unemployed, compared to 38.6% of the male interviewers'
samples (χ² = 1060.6, p < 0.001).
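
One way to judge how far a sample percentage lies from a census interval is to place a confidence interval around the sample estimate as well. The sketch below uses the normal approximation for a proportion and the employment figures quoted above. It assumes that all 2,384 analysed respondents answered the employment question, which the text does not confirm, and the function name is hypothetical.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation 95% CI for a proportion p observed in n people."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Proportion employed in the sample: 33.3% (n assumed to be 2,384).
low, high = proportion_ci(0.333, 2384)
# Census interval for the reference population quoted in the text.
overlaps_census = high >= 0.541 and low <= 0.543
```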

Health status

Disability was reported by 104 (4.4%) of the sample, and 290 (12.2%) reported having
a serious illness in the prior three months (in total, 14.7% reported disability and/or illness).
Approximately half (47.8%) reported no problem on any of the five dimensions.
Information regarding self-described health status appears in Table 2. With regard to the self-reported scores on the EQ-5D dimensions, nearly one-third
of the respondents reported either some or severe problems in the dimensions of pain/discomfort
and anxiety/depression. The mean score on the VAS was 79.8 (CI = 79.1 – 80.5).

Table 2. Self-reported health status of subjects in urban Zimbabwe (N = 2,183) compared to
the United Kingdom [9]. Frequency of levels on each dimension of EQ-5D are reported as percentages.

Valuations

There were 19,020 values of 38 EQ-5D health states analysed, 12,663 in the internal
sample and 6,357 in the external sample. Fifty-two percent of the variance
was due to the domain scores, 7% to the interaction between interviewer and health state,
6% to interviewer effects and 35% to error (Table 3).

The Wald statistic was highly significant for all main fixed effects, both when the
effects were fitted in a forward and in a backward sequence (p < 0.001) (Table 4). Previous models included a variable (N3) which reflected that at least one domain
was valued at the severe level. However, this led to only a small increase in R² and resulted in a model in which moderate problems in several domains were counter-intuitively
valued as being worse than extreme problems. This model was subsequently discarded.

There was some evidence of significant interaction effects. However, because not all
combinations of health states are plausible and the implausible ones were not valued, numerical problems
were experienced in the fitting of the model, and the estimates appeared unreliable.

Coefficients from the model were used to generate predicted values for each of the
states included in the study. For example, for the health state 22331 (i.e., some
problems in walking about, some problems with self care, unable to perform usual activities,
severe pain and no anxiety/depression), the predicted value would equal 0.90 - 0.056
- 0.092 - 0.135 - 0.302 - 0.0 = 0.315. The actual and predicted value and residuals
for each health state for the internal and external samples and the observed results
of the United Kingdom MVH study are reported in Table 6. The Pearson's correlation between mean values of health states was 0.914 (p < 0.001).
The 33333 state was the only state that was valued as being worse than death in the
current study (mean value = -0.24). There were three health states in the external
sample for which the mean difference between the observed and predicted values was
more than 0.1. In the external sample, the mean absolute difference for all health
states was 0.045.
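
The worked example for state 22331 and the mean absolute difference can be reproduced with the sketch below. Only the constant and the four decrements quoted in the text are included; the remaining coefficients are deliberately not guessed, so other states raise a `KeyError`. All names are hypothetical and the code is an illustration of a main-effects prediction, not the software used in the study.

```python
# Constant and decrements quoted in the worked example for state 22331.
# The remaining coefficients are not given in the text and are omitted.
CONSTANT = 0.90
DECREMENTS = {
    ("MO", 2): 0.056,
    ("SC", 2): 0.092,
    ("UA", 3): 0.135,
    ("PD", 3): 0.302,
}
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")


def predict_value(state):
    """Main-effects prediction: constant minus one decrement per dimension."""
    value = CONSTANT
    for dim, digit in zip(DIMENSIONS, state):
        level = int(digit)
        if level > 1:  # level 1 (no problem) carries no decrement
            value -= DECREMENTS[(dim, level)]  # KeyError if not quoted above
    return value


def mean_absolute_difference(observed, predicted):
    """Mean of |observed - predicted| over states present in both mappings."""
    common = observed.keys() & predicted.keys()
    return sum(abs(observed[s] - predicted[s]) for s in common) / len(common)


predicted_22331 = predict_value("22331")
```

Applied to the mean observed and predicted values of the validation sample, `mean_absolute_difference` corresponds to the summary statistic reported above.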

Figure 1 depicts the predicted UK scores compared to the Zimbabwe internal sample scores for
the 39 health states (the UK sample did not include the 11111 health state).

Whereas there is initial convergence in scores between the Zimbabwe and MVH sample
at high health levels, values diverge as the health states become more severe and
domains at level three are included. Spearman's rank correlation between the values
for the different states was 0.95 (p < 0.001).
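
Spearman's rank correlation is the Pearson correlation of the ranks of the two sets of values, which is why it can be high even when the cardinal values diverge. A minimal sketch follows; the function names and the five pairs of mean values are hypothetical and are not taken from either study.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical mean values for five states in two samples.
sample_a = [0.85, 0.70, 0.55, 0.30, -0.24]
sample_b = [0.80, 0.62, 0.65, 0.10, -0.50]
rho = spearman(sample_a, sample_b)
```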

Figure 1. Observed TTO scores (divided by 10) of the full Zimbabwe sample (N = 19,020 values
from 2,183 respondents), compared to observed scores from the Measurement and Valuation
of Health study in the United Kingdom [8].

Discussion

To the knowledge of the authors, this is the first paper to present the preferences
for health states by urban Zimbabweans. The self-reported health-related quality of
life of the Zimbabwe subjects was similar to that of UK counterparts. Kind et al.
[9] found that 30% of a large UK sample reported some or severe pain/discomfort. However,
the number reporting some or severe anxiety/depression was smaller in the UK sample
(21%) than in the present study. The two samples were similar in finding very few
people reporting problems in the area of self-care, or extreme problems with mobility
or usual activities. The mean score on the Visual Analogue Scale (VAS) in the Zimbabwe
sample was 79.8 (CI = 79.1 – 80.5), which was similar to the British sample (mean
82.5).

As the questionnaire was administered in English and numeracy was required, the methodology
precluded gathering valuations from a truly representative sample. The educational
inclusion criteria and the limitations imposed on the times for data gathering by
female interviewers resulted in a sample in which females, younger people and those
with a higher level of literacy were over-represented. In addition, the interviewer
effect was considerable. As the interviewer and subject were entered in the computation
as random effects, the REML linear mixed model allowed for the demographic deviations
from census findings, the non-independence of the measures and the interviewer effect.

The interviewer effect appeared in spite of training sessions, piloting and standardisation
of the format of the interview. It is possible that the approach and amount of interpretation
given by each interviewer differed. The effect of the gender of the interviewers was
evident, and female interviewers apparently did not conduct interviews during the
evenings or weekends to the same extent as their male counterparts. This imbalance
might have compounded the interviewer effect, which suggests that the gender of interviewers
should receive careful attention in community surveys, particularly in socially unstable
conditions.

However, a credible model was ultimately developed in this study. The mean absolute
difference between the actual and estimated means for the external sample (0.045)
is comparable to that of the UK study (0.039) and a similar study in Japan (0.01)
[24], although in each case different models were used. The inclusion of inconsistent
responses is controversial, with some researchers excluding these data from analysis
[29]. These responses were included in this analysis on the assumptions that inconsistencies
do not necessarily indicate a lack of understanding of the task, that all those who
participate have the right to have their data included, and that human beings are
not always rational in their judgments regarding health states.

Significant interaction effects were found but, as noted above, appeared unreliable
due to an incomplete data set. The inclusion of interaction effects resulted in a
model that was difficult to interpret, which would likely limit the use of the model
in practice. Similarly, the inclusion of the N3 term, which indicates severe problems
on at least one domain, resulted in a more complicated and less intuitive model. It
was therefore decided to adopt the simple main effects model.

The UK and Zimbabwe samples produced similar descriptions of their own health states
and similar rank orderings of the hypothetical health states. (As a different model
was used, the coefficients of the valuation function could not be compared directly
between the UK and Zimbabwe samples.) The mean self-reported VAS was 3% lower in the
Zimbabwe sample than in the UK sample. Pearson's correlation for the predicted
health state values (0.95) was high. Although previous studies based on the EQ-5D
have reported similarities in valuations, with low sensitivity for socio-demographic
variables across European countries [22], the results of this study were unexpected. A previous study on the rank ordering
of health states had found no correlation between the international and locally determined
Zimbabwean ranking [20]. It would appear that a deconstructed approach to valuation in which impairments
or activity limitations (e.g. pain or problems in moving around) are valued [30,31] rather than disease conditions (e.g. rheumatoid arthritis) is more likely to tap
into commonly understood constructs and yield universal preferences.

However, there were important differences between the samples that should be noted.
Respondents in the UK study valued 16 health states as being worse than death, whereas
in the Zimbabwe sample only the 33333 state was awarded a negative value. The inclusion
of 16 negative values in the UK model resulted in generally lower values being assigned
to health states in which an "extreme" problem was included. Consequently the predictions
from the UK model for about two-thirds of the health states are lower than those from
the Zimbabwe model. The reluctance to value states as worse than death in the Zimbabwe
sample might reflect a fundamentally different attitude towards the sacrifice of years
of life. There is, for example, no national debate on either euthanasia or abortion
in Zimbabwe, and both are illegal and likely to remain so for the near future. The
general state of health of the population might also contribute. The life expectancy
is now dropping drastically because of the HIV/AIDS pandemic. The expected number
of equivalent healthy years (Disability Adjusted Life Expectancy, DALE) is now estimated
to be 32.9 (cf. UK 71.7), and Zimbabwe ranks 184 out of 191 nations [32]. (To calculate DALE, the years of ill-health are weighted according to severity and
subtracted from the expected overall life expectancy to give the equivalent years
of healthy life [32]). There may be a greater reluctance to sacrifice life years in a society in which
each individual is likely to have had direct contact with death or illness. This conclusion
is supported by the results of a Spanish study of preferences of 103 patients who
were severely ill. The patients tended to rate the worst health states higher than
proxies and rated no states as worse than death. The authors of that study concluded
that within the EQ-5D descriptive system, there are no health states worse than death
for seriously ill patients [33].
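
The DALE calculation described in the parenthesis above can be sketched as a severity-weighted subtraction. The function name, the 0 to 1 severity scale and all numbers in the example are hypothetical; they are not the figures behind the estimates cited in [32].

```python
def dale(life_expectancy, ill_health_periods):
    """Life expectancy minus severity-weighted years of ill health.

    `ill_health_periods` is a list of (years, severity) pairs, with
    severity assumed to run from 0 (full health) to 1 (equivalent to death).
    """
    lost = sum(years * severity for years, severity in ill_health_periods)
    return life_expectancy - lost


# Hypothetical: 60 years of life, of which 10 at severity 0.25 and 4 at 0.5.
healthy_years = dale(60, [(10, 0.25), (4, 0.5)])
```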

For Zimbabweans, the inability to wash and dress oneself is a major contributor to
poor quality of life, and SC level 3 was ranked second. In contrast, SC level 3 was
ranked fourth in the UK study. This difference may possibly be due to the importance
that Zimbabweans attach to self-presentation. It is regarded as insulting to ask whether
people are able to wash or dress themselves, if in any way it is implied that they
have not done so [15,34]. In a poorer country, self-presentation may also be regarded as indicative of socio-economic
standing and hence valued more highly. The important differences between the results
of the two studies illustrate the dangers of applying measures developed in one culture
without adequate testing of items for cultural meaning and appropriateness.

Severe AD was ranked similarly in the UK (Rank 3) and Zimbabwe (Rank 4) samples. Of
all the EQ-5D concepts, the idea of depression and anxiety is most difficult to capture
in the sensibility of the Shona-speaking Zimbabwean. There is no specific word for
depression; it is usually implied from symptoms rather than self-report. Anxiety and
depression are not regarded simply as health states in Shona custom. They are understood
as occasional psychological (social/alienation) or spiritual (religious) states. In
addition, severe anxiety is seen to border on a psychiatric state known as "mhopu"
[15]. It is therefore not surprising that extreme anxiety or depression should be regarded
as being very serious.

The choice of the EQ-5D as the instrument to define the different domains of health-related
quality of life needs justification. The measure is limited in that there are only
five domains with three possible levels on each domain. The content validity may be
questioned, as it may be that important areas that contribute to quality of life,
such as cognitive function or energy, are excluded. However, even with this relatively
crude measure, 243 hypothetical health states can be described. Researchers have to
be cautious about transposing any measure across very different cultural contexts.
The current study required a robust, relatively simple measure and, despite the shortcomings
of the instrument, the EQ-5D appeared to be reliable and relatively insensitive to
cultural context.
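
The figure of 243 health states follows from five dimensions with three levels each (3 to the power of 5). A short sketch that enumerates them, using a hypothetical variable name:

```python
from itertools import product

# Every combination of levels 1-3 across the five EQ-5D dimensions.
all_states = ["".join(levels) for levels in product("123", repeat=5)]
```

The list runs from 11111 (no problems on any dimension) to 33333 (extreme problems on all five).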

Conclusions

This study attempted to elicit cardinal values of health states from urban Zimbabweans.
The limitation imposed by the educational criteria resulted in a sample that was more
educated than the general population of high-density areas, and the results should
be generalized with care to other urban populations in the country. Despite this limitation,
the values derived from the study are more likely to represent the values of urban
Zimbabwe than values derived from valuation exercises performed in other countries.
The parameter estimates for each level of the five domains generated by the TTO exercise
are credible and are comparable to those generated by other studies. The rankings of
observed preferences for health states by Zimbabweans and UK residents are remarkably
similar, and if consensus could be reached on the valuation of states worse than death,
it is possible that QALY weights based on EQ-5D descriptors might be developed which
are valid globally.

However, the observed cardinal values for health states are much lower overall in
the UK sample. It is therefore recommended that the parameter estimates developed
in this study be used both to describe health-related quality of life and as an outcome
measure of health interventions in the Zimbabwe urban population.

Authors' contributions

JJ carried out the data collection and analysis and wrote the paper. KH assisted in
data collection and analysis of the data. WdW, PDC and PK assisted in conceptualising
the study and developing the methodology. All authors have read and approved the manuscript.

Competing interests

None declared.

Acknowledgements

Funding for this study was made available by the Zimbabwe Burden of Disease Steering
Committee, which in turn is supported by DFID and DANIDA. Thanks to Allan Williams,
Aki Tsuchiya and Jan Busschbach for comments and insightful reviewing of the paper
and to June Juritz and Jaqui Somerville for assistance with statistical analysis.

Macdonagh RP, Cliff AM, Speakman MJ, O'Boyle PJ, Ewings P, Gudex C: The use of generic measures of health-related quality of life in the assessment of
outcome from transurethral resection of the prostate.

Devlin NJ, Hansen P, Kind P, Williams A: The health state preferences and logical inconsistencies of New Zealanders: A tale
of two tariffs. York, York Centre for Health Economics UK and University of Otago, New Zealand; 2000.

World Health Organisation: The International Classification of Functioning and Disability - Beta-2 Draft. Geneva, WHO; 1999.
