Abstract

Background: The number of thyroid function tests (TFTs) performed in the UK and other countries has increased considerably in recent years. Inconsistent clinical practice associated with inappropriate requests for tests is thought to be an important cause for this increase.

Aim: To study the extent of variability in requests for TFTs from general practices.

Methods: We analysed routine data on all TFTs on patients aged 16 years and over carried out by two hospitals in south-west England (Royal Cornwall Hospital and Royal Devon & Exeter Hospital) during 2010 at the request of 107 general practices.

Results: A total of 195 309 TFT requests were made for 148 412 patients (63% female). The total requests included 192 108 tests for thyroid-stimulating hormone (TSH), 43 069 for free thyroxine (FT4) and 1972 for free tri-iodothyronine (FT3). The number of TSH tests per 1000 list size varied widely across the practices, ranging from 84 to 482. Most of the variation was due to heterogeneity across practices and only 24% of this was accounted for by prevalence of hypothyroidism and socio-economic deprivation.

Conclusions: There is wide variation in TFT requests from general practice and scope to reduce both unnecessary TFTs and the variability in the clinical practice. Further studies are required to understand the causes for the variability in testing thyroid function.

Keywords

Introduction

Thyroid disorders are common in the community,
with hypothyroidism alone affecting about 2% of
women.[1] Because symptoms of thyroid disorders are
often non-specific and highly prevalent in the general
population, biochemical tests are necessary for their
diagnosis and monitoring. Thyroid function tests
(TFTs) include serum thyroid-stimulating hormone
(TSH), free thyroxine (FT4) and free tri-iodothyronine
(FT3). In primary care, the first point of investigation
for thyroid disorders, TSH is the first-line test for most
patients, followed by an FT4 test if TSH is outside the
reference range. FT3 testing is generally reserved for
endocrinologists or done at the discretion of the
laboratory.

The number of TFTs performed in the UK and
other countries has increased considerably in recent
years,[2] and it has been suggested that inappropriate
requests for tests are an important cause.[3,4] National
guidelines to help clinicians use TFTs appropriately
have been published.[2] Although it is difficult to measure
the degree of ‘appropriateness’ of requests for TFTs at
the population level, the presence of variation in
clinical practice is an indicator.[4] Our study aims to
establish the extent of variability in TFT requesting
across general practices referring to two hospitals in
south-west England.

Methods

The analyses are based on routine data on all TFTs
carried out by the biochemistry laboratories at the
Royal Cornwall Hospitals NHS Trust (RCHT) and the
Royal Devon & Exeter Hospital NHS Foundation
Trust (RDE) and requested by general practices during
2010. The catchment areas of these hospital
laboratories are the predominantly rural counties of
Cornwall and Devon in the south-west of England,
with a combined population of 1.7 million persons.
Together, the two hospitals serve a catchment population of approximately 800 000. The datasets included
patients’ gender and age; type (TSH, FT4
or FT3) and result of TFTs; and the name of the
requesting general practice. Both laboratories use
chemiluminescent immuno-assay (Roche Modular
Analytics E170 analyser) for analysing TFTs.

We also obtained data on the following practice
characteristics from the Network of Public Health
Observatories[5] for the year 2010: list size, percentage
of patients with hypothyroidism recorded on practice
disease registers, Index of Multiple Deprivation (IMD)
score as a measure of socio-economic deprivation, and
percentage of patients aged over 65 years.

Eligible records were TFT requests made by NHS
general practices for patients aged 16 and over. We
excluded records relating to practices outside the
usual catchment areas of the hospital laboratories.
Data for two separate practices in the RDE dataset
were excluded because their records were combined
under a common name and there was insufficient
information to disaggregate them. In addition, records
were excluded for a practice for which characteristics
were not available. We analysed records from
107 practices (RCHT, 57; RDE, 50).

Statistical analysis

We summarised the distribution of tests per 1000 list
size at general practice level within each hospital site
and overall. Both means (with standard deviations)
and medians (with interquartile ranges) are reported
as the former reflects the volume of tests and the latter
is appropriate for quantifying the average of skewed
distributions. Random effects Poisson regression models
were fitted to TSH test request rate (outcome) in order
to quantify the extent to which prevalence of hypothyroidism,
deprivation score and percentage of
patients aged over 65 years (predictors) account for
variability in practice test request behaviour. This
model explicitly recognises the variation across clusters
beyond chance. Analyses were carried out using
Stata Statistical Software (Release 12.1, 2011; Stata
Corporation, College Station, TX, USA).
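
The model described above can be written out as follows. This is a sketch of one common formulation rather than the exact specification fitted in the study: the source does not state the distribution of the random effect, so a normally distributed practice-level intercept is assumed here, and the symbols (y_i, n_i, u_i, the covariate names) are introduced for illustration only.

```latex
% y_i: number of TSH tests requested by practice i in 2010
% n_i: list size of practice i (enters as an offset, so the model is for a rate)
% u_i: practice-level random intercept (assumed normal; not stated in the source)
\begin{align}
y_i \mid u_i &\sim \mathrm{Poisson}(\mu_i), \\
\log \mu_i &= \log n_i + \beta_0
  + \beta_1\,\mathrm{hypothyroidism}_i
  + \beta_2\,\mathrm{IMD}_i
  + \beta_3\,\mathrm{over65}_i + u_i, \\
u_i &\sim N(0, \sigma_u^2).
\end{align}

% Share of between-practice variance accounted for by the predictors,
% comparing a model without predictors (0) to one with predictors (1):
\begin{equation}
\frac{\sigma_{u,0}^2 - \sigma_{u,1}^2}{\sigma_{u,0}^2}
\end{equation}
```

Under this formulation, the variance component of the random intercept captures variation across practices beyond chance, and the ratio in the last line is the quantity reported as the percentage of the between-cluster variance accounted for by the predictors.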

Results

A total of 195 309 test requests for 148 412 patients
[mean (SD) age 60 years (19); age range 16–105 years;
63.1% female] were made to the two hospital laboratories
(Table 1). Of these, 46 897 (24%) were repeat
tests on patients who had already been tested
earlier in the year. The total requests included 192 108
tests for TSH, 43 069 for FT4 and 1972 for FT3; 15.5%
of the TSH results were outside the laboratory reference
range (0.35–4.5 mIU/L).

The number of TSH tests per 1000 list size varied
widely across practices, ranging from 84 to 482 (Table 2 and Figure 1). Using the formula from Hayes and
Bennett,[6] the standard deviation of the test rate across
practices that would be expected given the overall test
rate and the list sizes is 7. The observed standard
deviation of 83 was considerably greater than this, indicating that most of the variation across practices
(over 99%) is due to between-practice heterogeneity
as opposed to mere sampling variability. Marked
variation was also seen in the rates for FT4 and FT3
(Table 2). The inclusion of the prevalence of hypothyroidism
(P<0.001) and deprivation score (P = 0.02)
as predictors in the random effects Poisson regression
model together accounted for only 23.8% of the
between-cluster variance component, indicating that
they account for relatively little of the marked differences
in TSH test rate. The percentage of patients aged
over 65 years did not explain any extra variability.
When the model was further adjusted for the proportion
of TSH tests that were outside the laboratory reference
range, it accounted for 53.8% (an extra 30%) of the
between-cluster variance component. The higher the
proportion of tests in the normal range, a crude proxy
for unnecessary testing, the greater the TSH test rate
(P < 0.001).
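
The comparison of observed and expected standard deviations above can be illustrated with a short sketch. It assumes that test counts follow a Poisson distribution within each practice, so that the sampling variance of a practice rate is the overall rate divided by the harmonic mean list size; this is one reading of the approach, not necessarily the exact formula the authors applied. The function names and the example list sizes are hypothetical, and the expected standard deviation of 7 per 1000 is an assumption read from the text.

```python
import math


def expected_sampling_sd(overall_rate_per_1000, list_sizes):
    """SD of practice test rates (per 1000 list size) expected from
    Poisson sampling variability alone.

    Assumption: sampling variance = r / harmonic mean list size,
    where r is the overall rate per person.
    """
    r = overall_rate_per_1000 / 1000.0
    harmonic_mean = len(list_sizes) / sum(1.0 / n for n in list_sizes)
    return 1000.0 * math.sqrt(r / harmonic_mean)


def heterogeneity_share(observed_sd, sampling_sd):
    """Fraction of the observed variance that is NOT explained by
    sampling variability, i.e. attributable to between-practice
    heterogeneity."""
    return 1.0 - (sampling_sd / observed_sd) ** 2


# Values taken from the text: observed SD 83, expected SD assumed to be 7.
share = heterogeneity_share(observed_sd=83, sampling_sd=7)
print(round(share, 3))  # about 0.993, i.e. over 99%

# Hypothetical illustration: ten practices of 4000 patients each,
# overall rate 250 tests per 1000 list size.
print(expected_sampling_sd(250, [4000] * 10))
```

With an observed standard deviation of 83 and an expected one of 7, the sampling component is 49/6889 of the variance, which is consistent with the statement that over 99% of the variation reflects heterogeneity between practices.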

Table 2: Distribution of number of tests per 1000 list size across practices in 2010

Discussion

This study shows wide variation in the TFT test rate
across general practices in south-west England. Three
quarters of this variability was not explained by
practice-level prevalence of hypothyroidism, socio-economic
deprivation and percentage of patients aged
over 65 years. O’Kane and colleagues also found wide
variation in the rates of requests for different
biochemistry tests, including TFTs, from general practices in
Northern Ireland.[7]

The recent increases in the number of tests ordered, unmatched
by any obvious change in the level of disease, alongside
wide variation in practice strongly suggest that
some test ordering is inappropriate. Our result, that
high levels of test ordering are related to high proportions
of results in the normal range for a practice,
provides support for this view. There is thus potential
for both health gain and savings to be made in NHS
expenditure. The latter will be important as the UK
Department of Health has set a target of £20 billion
efficiency savings to be made through the Quality,
Innovation, Productivity and Prevention (QIPP) programme.[8]
Our data show that if the TSH test rates in
the practices were all reduced to 229 per 1000 list size
(the current median) there would be a 7% reduction
in tests ordered. In Cornwall and Exeter this would
equate to approximately 13 000 TSH tests per annum,
costing £91 000 if priced at £7 per test. Furthermore,
inappropriate tests may also harm individual patients
by increasing the risk of false-positive results leading
to increased anxiety, a cascade of further investigations
and unnecessary treatments.

A variety of educational and administrative strategies
have been trialled to reduce unnecessary TFTs,[4,9–11]
but the results are sometimes disappointing. This may
be because the interventions were chosen as they were
standard approaches thought to be worthy of evaluation,
rather than interventions appropriate to the context and the particular behaviours needing to be
tackled to achieve reduced testing. Current behaviour-change
science indicates that the barriers to change
must be clearly identified.[12]

A limitation of our study is the lack of data on why
the tests were requested, as many of the requests had
incomplete or unclear indications for the tests. This
precluded us from further exploring the potential
causes for the variability in TFT requests seen in the
study. Although we have gained some insights into the
reasons for variability in test ordering in primary
care,[9,13,14] this knowledge is incomplete. We believe
that understanding the causes of variation between
high- and low-ordering practices using mixed methods
research is the next step. As well as informing our
ultimate behaviour change strategy, this will also lead
to better definition of the most appropriate level of test
ordering: any assumption that this is the median rate
needs to be explored.

Acknowledgements

This report presents independent research funded by
the National Institute for Health Research (NIHR)
Collaboration for Leadership in Applied Health Research
and Care (CLAHRC) for the South West
Peninsula. The views expressed in this publication
are those of the authors and not necessarily those of
the NHS, the NIHR or the Department of Health in
England.

Ethical Approval

As this was an audit of routine clinical practice,
formal ethical approval was not necessary.

Contributors

The research question originated from the Peninsula
CLAHRC question generation workshop in Cornwall
led by JT and SF. SF, CH, AP, BV, JS and OU were
involved in the development and design of the study.
AB, AL and JS acquired the data, and OU did the
statistical analysis. BV, JT, OU and JS prepared the
draft of the manuscript. All authors were involved in
the critical revision of the draft and approved the final
version of the manuscript. BV is guarantor for the
study.