The Smartest Person is a popular Dutch television quiz show in which three contestants receive a few seconds to answer trivia questions. In one of the rounds, to answer the question, contestants must hit certain key words, for example, ‘Apartheid’, ‘Prison’, ‘Nobel Peace Prize’ and ‘South Africa’ for the question ‘What do you know about Nelson Mandela?’ Contestants who mention one of these key words, regardless of context, hear a rewarding ‘ting!’ and receive 20 extra seconds. The most efficient strategy to win is to mention the four key words without a linking sentence; answering with a complete and cogent sentence costs precious seconds. In fact, contestants can win even when they get the context wrong, for example: ‘Nelson Mandela is an Italian actor starring in a film about the Apartheid (ting!). He recently spent a night in Prison (ting!) after he egged the house of a Nobel Peace Prize winner (ting!) from Zimbabwe… Namibia… well, somewhere in South Africa (ting!)’.

The show is advertised as a ‘knowledge quiz with a wink’. The show’s ‘smartest person’ is not the most knowledgeable, but the one who knows how best to play the game.

Similar challenges affect performance measurement in healthcare. For example, consider the proportion of patients with 10-year cardiovascular risk >7.5% prescribed statins, as a measure of good preventive care. Or the proportion of patients with type 2 diabetes with haemoglobin A1c (HbA1c) levels <7.5%, as a measure of good diabetes care. Or the proportion of patients getting low-dose CT (LDCT) to screen for lung cancer who participated in shared decision making, as a measure of patient-centred care. These measures are set up to improve the quality of care for patients, but if you know how best to play this game, you would be rewarded for prescribing statins (ting!) to patients who are not willing or able to take them, or for getting an HbA1c of 6.9% (ting!) in a patient living in fear of recurrent hypoglycaemic episodes, or for giving a brochure and ‘engaging patients’ (ting!) but only to determine their eligibility for Medicare reimbursement of LDCT. Clinicians are gearing up to play a high-stakes version of this game, as they try to earn the 60% of their score in Medicare’s Merit-based Incentive Payment System (MIPS) that comes from acing (ting!) quality measures of their choice.1

Playing the measurement game to win has important unplanned consequences beyond unintended patient harm. Clinicians spend about twice as much time on documentation and billing tasks, interfacing with the medical record, as they spend facing patients.2 Health systems in the USA spend over US$15 billion each year collecting, processing and reporting performance measures.3 It is not self-evident that the upside of this attention, time, work and cost is getting care right, that the care improves because of measurement and reporting, or that these care improvements are sustained.4 Nor is it self-evident that this improvement proceeds without being distracted by efforts to improve the scores (to get more ‘tings’) and to secure the resulting incentives, or that winning does not come at the expense of addressing issues that are not subject to measurement but are of greater importance to a patient.5

The impact of measurement on the nature of clinical care and on clinicians has received limited attention.6 What effect does securing ‘tings’ and winning prizes have on the soul of the contestants? To have caring turned into the unending pressing of levers for sweet rewards, rewards that barely reflect evident improvements in patient well-being. To check HbA1c levels until, by chance, one falls in the rewards area (ting!), or, if not, to add another antihyperglycaemic agent, even in patients for whom this is likely to be harmful or futile.7 To prescribe a statin (ting!) to a patient ‘just in case you change your mind’ after the patient reasons that their risk is too low to justify one. To simply send a decision aid through the patient portal (ting!), and assume that it will help a patient participate in decisions about their healthcare. We are training health professionals and organisations to respond to quality gaps others value and to respond in a manner that ensures the rewards.8 This is not the smartest healthcare. This is not care.

Quality must always improve and quality improvement is a professional obligation and a healthcare competency. It was important to the early efforts to improve quality to reliably measure and lower HbA1c levels, to demonstrate that high-risk patients were less likely to receive statins, and to describe patients who receive unnecessary care or care that is inconsistent with their informed preferences. Like those pioneering efforts, improving care must result from our commitment to careful and kind patient care. We should continue to use measures, like an accurate map, to know we are moving north. But we must also improve quality where the terrain is complicated or when only inaccurate maps exist. In these cases, we should be able to imagine improvements in care that reflect and respond not to incentives, but to what is pertinent and important to each patient’s situation, particularly when this cannot be quantified.

It is difficult to improve care without the risk of falling into the kinds of decontextualised measures and incentives that erode rather than promote better care. Personal, social and comorbid complexities force increasing levels of sophisticated tailoring of clinical action, a simultaneous attention to biology and to biography. Today, that attention often places clinicians in a soul-killing bind to either be patient-centred or to gain quality points and rewards. This conflict must be avoided; quality programmes must reward deep individualisation,9 variability as signal rather than noise. McGlynn and colleagues, for example, proposed a measurement system that embraces such variation, measuring quality as the extent to which patients’ needs are met and evidence is used to do so.10 To date, measurement has been almost exclusively numerical. Numbers have undeniably desirable practical advantages — for example, easy to extract from medical or billing records, enabling statistical adjustments and comparisons with minimal-to-no subjectivity. Yet the evaluation of care in context is fundamentally a subjective judgement best aided by descriptive ‘a-metrical’ assessments. The need to involve human judgement in making meaning of the descriptions, of the words in these descriptions, as Iona Heath stated, ‘is perhaps the one attribute of words that make them so peculiarly appropriate for judging quality within healthcare’.11

Our central point, however, is not about the nature of quality measures, but rather about the perverse effect of being rewarded by obtaining a desirable result on a metric without demonstrating a true improvement in the quality of care. Narratives that make judgements about quality may inhibit the simple paths connecting acing the measure and receiving the ‘ting’ without actual improvements. Deploying other measures that triangulate observations about the underlying care patients receive may also serve as balancing procedures (akin to balancing measures) to uncover when ‘measurement with a wink’ is affecting a quality programme. Examples of potentially helpful balancing procedures include accounting for the number of HbA1c measures in a period of time (as clinicians are trying to get under a quality threshold), or obtaining nurse or patient accounts of how statin prescriptions are prepared or dispensed (as clinicians try to ace statin use metrics by prescribing statins without shared decision making). These balancing procedures, like their linked improvement programmes, in their complexity, should not crowd out care. As such, before widespread adoption, measurement and improvement programmes should be tested for safety and efficacy and for minimally disrupting care.
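To make the first of these balancing procedures concrete, the count of HbA1c measurements per patient in a period could be sketched as follows. This is a minimal illustration under stated assumptions, not a validated method: the function name `flag_frequent_testing`, the record layout (patient identifier, test date) and the cut-off of four tests are all hypothetical, and the cut-off in particular is a placeholder rather than a clinical recommendation.

```python
from collections import Counter
from datetime import date


def flag_frequent_testing(tests, start, end, max_tests=4):
    """Count HbA1c tests per patient within [start, end] (inclusive).

    `tests` is an iterable of (patient_id, test_date) pairs.
    Returns {patient_id: count} for patients tested more than
    `max_tests` times. The default cut-off is an arbitrary
    placeholder chosen for illustration only.
    """
    counts = Counter(pid for pid, d in tests if start <= d <= end)
    return {pid: n for pid, n in counts.items() if n > max_tests}


# Hypothetical records: patient "A" is tested monthly, patient "B" twice a year.
tests = [
    ("A", date(2022, 12, 1)),  # outside the review period, not counted
    ("A", date(2023, 1, 10)),
    ("A", date(2023, 2, 7)),
    ("A", date(2023, 3, 14)),
    ("A", date(2023, 4, 11)),
    ("A", date(2023, 5, 9)),
    ("A", date(2023, 6, 13)),
    ("B", date(2023, 2, 1)),
    ("B", date(2023, 8, 1)),
]

flagged = flag_frequent_testing(tests, date(2023, 1, 1), date(2023, 12, 31))
print(flagged)
```

A flag of this kind would not show that a clinician is gaming the threshold; frequent testing can be appropriate. It would only mark where a narrative account of the care is worth seeking.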

Measurement misleads when it fails to award the best score to the best care, but instead rewards the best player. High scores on the measures provide the illusion of good, better or improved care, while favouring measurable care that is predominantly standard, technical, mechanical and context-blind.12 In the TV show, key words can be hit without context, in perfect context or in the wrong context, all leading to the same scores. Playing this game can help healthcare organisations and professionals learn the basics of improvement science, but mastering this level is not the endgame. We can get it completely right, ace it and yet be completely wrong. The best quality programmes — deploying multidimensional assessments that closely track with underlying improvements and are linked to intrinsically rewarding care experiences and better patient outcomes — may be immune to this concern. Focusing on acing the test instead of improving the underlying care that should be careful and kind dehumanises care and de-professionalises. It is time to move beyond measurement with a wink.

Footnotes

Contributors MK came up with the idea for this article and wrote the first draft; discussion with VMM and NDS led to revisions to include specific examples and discussions. All authors approved the final version.
