Choosing Wisely: Prevalence and Correlates of Low-Value Health Care Services in the United States

Abstract

Background

Through the American Board of Internal Medicine Foundation’s Choosing Wisely initiative, specialty societies in the United States identified low-value tests and procedures that contribute to waste and poor health care quality.

Objective

To develop claims-based algorithms, to use them to estimate the prevalence of select Choosing Wisely services and to examine the demographic, health and health care system correlates of low-value care at a regional level.

Design

Using Medicare data from 2006 to 2011, we created claims-based algorithms to measure the prevalence of 11 Choosing Wisely-identified low-value services and examined geographic variation across hospital referral regions (HRRs). We created a composite low-value care score for each HRR and used linear regression to identify regional characteristics associated with more intense use of low-value services.

Patients

Fee-for-service Medicare beneficiaries over age 65.

Main Measures

Prevalence of selected Choosing Wisely low-value services.

Key Results

The national average annual prevalence of the selected Choosing Wisely low-value services ranged from 1.2% (upper urinary tract imaging in men with benign prostatic hyperplasia) to 46.5% (preoperative cardiac testing for low-risk, non-cardiac procedures). Prevalence across HRRs varied significantly. Regional characteristics associated with higher use of low-value services included greater overall per capita spending, a higher specialist to primary care ratio and higher proportion of minority beneficiaries.

Conclusions

Identifying and measuring low-value health services is a prerequisite for improving quality and eliminating waste. Our findings suggest that the delivery of wasteful and potentially harmful services may be a fruitful area for further research and policy intervention for HRRs with higher per-capita spending. These findings should inform action by physicians, health systems, policymakers, payers and consumer educators to improve the value of health care by targeting services and areas with greater use of potentially inappropriate care.

Keywords

Low-value care; Medicare; Geographic variation

Electronic supplementary material

The online version of this article (doi:10.1007/s11606-014-3070-z) contains supplementary material available to authorized users.

Introduction

Recent health policy initiatives prioritize improved organization and delivery of care to reduce fragmentation and prevent expensive complications of chronic illness or iatrogenic disease. This approach may miss an important opportunity to address quality concerns and rising health care spending: the overuse of low-value services. In 2012, the Institute of Medicine estimated that 30% ($750 billion) of annual health care spending is wasteful and that over half of this spending is on unnecessary services and inefficient care.1 Elimination of low-value services as a cost control strategy has much economic appeal because it would improve quality while reducing costs.

Society increasingly recognizes the importance of excess medical care, but it remains difficult to pinpoint the services and populations that represent health care overuse. There is general agreement on the definition of overtreatment (treatment of indolent disease, aggressive treatment at the end-of-life) and overdiagnosis (diagnosis and treatment of disease that would not have affected the lives of patients), but consensus has not been sufficient to facilitate their identification in clinical practice.2 Identification of low-value and potentially harmful services is an essential first step in improving quality and reducing overuse. The second critical step is engaging physicians and patients in efforts to reduce use of these services. Together with physician specialty societies, the American Board of Internal Medicine (ABIM) Foundation launched the Choosing Wisely initiative in 2011 to advance both of these aims. Over 60 participating physician societies have now each identified five specialty-specific, low-value services whose avoidance would improve the efficiency of care through higher quality, reduced risks and lower costs.3

In this study, we developed claims-based algorithms to examine 11 services identified in one or more Choosing Wisely lists and estimated the prevalence of these services at the regional and national levels. We created a regional composite measure of overuse based on the prevalence of these 11 services and explored the demographic, health and health care system correlates of overuse at a regional level. Based on this information, we estimate the magnitude of the harm and wasteful spending attributable to each service. This information may aid decision makers in prioritizing areas for intervention and provide a baseline against which to test the impact of policies aimed at reducing use of low-value services.

Methods

Data

We used 100% Medicare administrative claims data (2006–2011) to determine the prevalence of low-value services. We limited our analysis to fee-for-service beneficiaries enrolled in Medicare Parts A and B (inpatient and outpatient insurance); we also required enrollment in Part D (prescription insurance) for three measures of Choosing Wisely services related to prescription drugs (analyses employing Part D data were limited to a 40% sample). We used residential ZIP codes to assign each beneficiary to a Dartmouth Atlas of Health Care hospital referral region (HRR).

Choosing Wisely Measurement

We developed claims-based algorithms for 11 services, representing 37 Choosing Wisely recommendations due to overlap in recommendations across specialty societies. A team of two physician health services researchers, two health economists and a Medicare claims data analyst reviewed all 130 available Choosing Wisely recommendations for inclusion in the study (available as of July 1, 2013). Each recommendation was scored based on: (i) the applicability to the over-65 Medicare population; and (ii) the feasibility of measuring prevalence using claims. Those Choosing Wisely recommendations that scored highest were included in the analysis. Choosing Wisely services excluded from analysis were either universally difficult to measure in claims data (e.g., “Don’t perform stenting of non-culprit lesions during percutaneous coronary intervention for uncomplicated hemodynamically stable ST-segment elevation myocardial infarction”) or not applicable to the elderly Medicare population (e.g., “Don’t schedule elective, non-medically indicated inductions of labor or Cesarean deliveries before 39 weeks, 0 days gestational age”). Measured services included a non-indicated subset of the following: back pain imaging; benign prostatic hypertrophy imaging; cardiac screening in low-risk patients; cervical cancer screening; dual-energy x-ray absorptiometry testing; preoperative cardiac testing in low-risk patients ahead of low-risk surgery (cataract and non-cardiac); vitamin D screening; prescribing antipsychotics and feeding tubes in dementia patients; and prescribing opioids for migraine. We categorize eight of the measures as low-value diagnostic services and three as low-value treatments (Table 1).

Table 1. Measures developed to assess prevalence of services identified as low-value through Choosing Wisely

Columns: Choosing Wisely recommendation; Specialty societies; Health service; Affected population; Cohort exclusions

Low-value diagnostic services

Choosing Wisely recommendation: Don’t do imaging for low back pain when no red flags are present
Specialty societies: American Academy of Family Physicians, American College of Physicians, North American Spine Society
Health service: Beneficiaries who received a low back x-ray, CT, or MRI within six weeks of incident low back pain diagnosis
Affected population: Beneficiaries with low back pain over age 65 without other imaging indication
Cohort exclusions: Prior diagnosis of low back pain, trauma and neurological impairment, within previous 12 months and cancer at any point during the study period; “E” code (external causes of injury) or trauma diagnosis on imaging event claim

Health service: Beneficiaries who received an intravenous pyelogram or an abdominal CT, MRI, or ultrasound within 60 days of the index diagnosis
Affected population: Male beneficiaries diagnosed with BPH over age 65 without other indications for imaging
Cohort exclusions: Cancer diagnosis at any point during the study period (e.g., chronic renal failure, nephritis, calculus of kidney and ureter, kidney stones, abdominal pain) within 60 days of index diagnosis

Choosing Wisely recommendation: Don’t order cardiac tests on low-risk, asymptomatic patients
Specialty societies: American Academy of Family Physicians, American College of Cardiology, American College of Physicians, American Society of Echocardiography, American Society of Nuclear Cardiology, Society of Cardiovascular Computed Tomography
Cohort exclusions: Indications of cardiac disease or other conditions that could indicate cardiac testing (e.g., HIV/AIDS, diabetes, peripheral vascular disease, pulmonary disease, cancer) or use of a prescription drug associated with the above conditions in a calendar year; enrollment in hospice; appropriate clinical indication on testing event claim

Choosing Wisely recommendation: Don’t screen women older than 65 years of age for cervical cancer who have had adequate prior screening and are not otherwise at high risk for cervical cancer

Specialty societies: American College of Cardiology, American College of Physicians, American College of Radiology, American College of Surgeons, American Society of Anesthesiologists, American Society of Echocardiography, American Society of Nuclear Cardiology, Society of Cardiovascular Computed Tomography, Society of General Internal Medicine, Society of Thoracic Surgeons, Society for Vascular Medicine
Health service: Beneficiaries who received a non-indicated cardiac test, including stress tests, echocardiograms, electrocardiograms, CTs, MRIs or PETs within 30 days before low-risk surgery

We used a combination of International Classification of Diseases, Ninth Revision (ICD-9), and Current Procedural Terminology (CPT) codes to construct cohorts at risk for 11 Choosing Wisely services and to identify health service events highlighted by the Choosing Wisely recommendations (Online Appendix 2). We also used Medicare Part D prescription records, where applicable, for cohort inclusion/exclusion or to identify Choosing Wisely prescription service events. In all cases, we conservatively excluded beneficiaries not targeted by the Choosing Wisely recommendation. We limited our analysis to non-indicated tests and procedures, excluding services with claims diagnoses that suggest appropriate medical indication. We drew from measure definitions in the literature and conducted claims-based sensitivity analyses to optimize the measure construct when possible.4, 5, 6, 7, 8, 9, 10 For example, we studied the characteristics and follow-up events for those we deemed “low risk” for the cardiac screening measure. All measures not drawn from the literature were developed by a clinician; each was then reviewed by a second clinician. Disagreements were resolved via discussion. Although we used 2006–2011 data, some measures were limited to smaller windows to permit sufficient look-back periods within the data to identify, for example, prevalent disease states (e.g., long-standing back pain that would result in denominator exclusion for the back pain imaging measure). In Table 1, we describe the data, the time window for cohort qualification, event definitions, measure-specific cohorts, and cohort and event exclusions for each measure.
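To make the cohort-and-event structure described above concrete, the following is a minimal sketch of a back pain imaging measure in Python. It is an illustration under stated assumptions, not the study's algorithm: the code sets are hypothetical placeholders (the actual ICD-9 and CPT lists are in Online Appendix 2), the `Claim` record and function names are invented for this example, the first observed low back pain diagnosis is treated as the incident diagnosis, and the six-week window is taken as 42 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical placeholder codes, NOT real ICD-9 or CPT values.
LOW_BACK_PAIN_DX = frozenset({"DX_LOW_BACK_PAIN"})
LOOKBACK_EXCLUSION_DX = frozenset({"DX_TRAUMA", "DX_NEURO_IMPAIRMENT"})
ANYTIME_EXCLUSION_DX = frozenset({"DX_CANCER"})
LOW_BACK_IMAGING = frozenset({"PX_XRAY", "PX_CT", "PX_MRI"})


@dataclass(frozen=True)
class Claim:
    """One simplified claim line (hypothetical structure)."""
    beneficiary_id: str
    service_date: date
    diagnoses: frozenset = frozenset()
    procedures: frozenset = frozenset()


def back_pain_imaging_flags(claims, lookback_days=365, window_days=42):
    """Map each at-risk beneficiary to True if imaged within the window."""
    by_person = {}
    for claim in claims:
        by_person.setdefault(claim.beneficiary_id, []).append(claim)

    flags = {}
    for beneficiary_id, history in by_person.items():
        pain_dates = [c.service_date for c in history
                      if c.diagnoses & LOW_BACK_PAIN_DX]
        if not pain_dates:
            continue  # no low back pain diagnosis: not in the cohort
        index_date = min(pain_dates)  # simplification: first diagnosis = incident
        lookback_start = index_date - timedelta(days=lookback_days)
        window_end = index_date + timedelta(days=window_days)

        # Denominator exclusions: cancer at any time, or trauma or
        # neurological impairment during the look-back period.
        excluded = any(
            (c.diagnoses & ANYTIME_EXCLUSION_DX)
            or ((c.diagnoses & LOOKBACK_EXCLUSION_DX)
                and lookback_start <= c.service_date <= index_date)
            for c in history
        )
        if excluded:
            continue

        # Numerator event: low back imaging within the window after diagnosis.
        flags[beneficiary_id] = any(
            (c.procedures & LOW_BACK_IMAGING)
            and index_date <= c.service_date <= window_end
            for c in history
        )
    return flags


def prevalence(flags):
    """Share of the at-risk cohort that received the service."""
    return sum(flags.values()) / len(flags)
```

Keeping the denominator exclusions separate from the numerator event mirrors the distinction drawn in Table 1 between cohort exclusions and the health service being counted.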

Area-Level Variables

Based on a conceptual framework for decisions regarding health care services, we created HRR-level covariates to include in an exploratory regression analysis.11 These HRR-level measures characterized population demographics, health and health care systems for each area based on Medicare, Behavioral Risk Factor Surveillance System (BRFSS), U.S. Census and American Community Survey data. Explanatory variables included the following: per-beneficiary Medicare spending (a measure of health care use intensity); physician group concentration (a measure of market competition); the ratio of specialists to primary care physicians; the age-, sex- and race-adjusted mortality rate and the percent of adults reporting fair or poor health (measures of health state); the percent of Medicare beneficiaries of black race; the percent of Medicare beneficiaries of Hispanic ethnicity; a Medicare effective care use score; the percent of HRR residents living in a rural area; and the percent of residents below 150% of the federal poverty limit.

Statistical Analysis

We calculated an average annual prevalence in the at-risk population for each Choosing Wisely service, both nationally and at the HRR level, along with the coefficient of variation across HRRs. We estimated national spending associated with each service by multiplying observed average spending per low-value care event by the national number of low-value care events among fee-for-service Medicare beneficiaries. We constructed an overall composite measure of low-value care for each HRR, equal to the average of the 11 standardized rates (z scores or standard deviations from the mean, Cronbach’s alpha = 0.66). We examined geographic variation in the overall composite measure by dividing the HRRs into quintiles and mapping the results. We used ordinary least squares regression to determine the association of HRR-level characteristics with the composite low-value care scores (N = 306 HRRs).
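The coefficient of variation and the composite score described above can be sketched as follows. This is an illustrative reading of the text, not the study's SAS or Stata code: it assumes unweighted HRR rates and the population standard deviation, neither of which the text specifies, and the function names are invented for the example.

```python
from statistics import mean, pstdev


def coefficient_of_variation(rates):
    """Standard deviation of HRR rates divided by their mean."""
    return pstdev(rates) / mean(rates)


def z_scores(rates):
    """Standardize rates to standard deviations from the mean."""
    mu, sd = mean(rates), pstdev(rates)
    return [(r - mu) / sd for r in rates]


def composite_scores(rates_by_measure):
    """Average the standardized rates across measures for each HRR.

    rates_by_measure maps measure name -> {hrr: rate}; every measure is
    assumed to cover the same set of HRRs.
    """
    hrrs = sorted(next(iter(rates_by_measure.values())))
    standardized = []
    for by_hrr in rates_by_measure.values():
        zs = z_scores([by_hrr[h] for h in hrrs])
        standardized.append(dict(zip(hrrs, zs)))
    return {h: mean(z[h] for z in standardized) for h in hrrs}
```

Standardizing before averaging keeps a high-prevalence measure, such as preoperative cardiac testing, from dominating a low-prevalence one, such as BPH imaging, in the composite.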

Statistical analyses were performed using SAS and Stata software. The study was approved by the institutional review board at Dartmouth College. See Online Appendix 1 for further methodology details.

Results

Of the 11 health care services included in our analysis (Table 1), non-indicated preoperative cardiac testing before low-risk, non-cardiac surgery was the most prevalent, received by 46.5% of the at-risk cohort (Table 2). The use of antipsychotics in dementia patients and opioids in migraine patients were also highly prevalent (31.0% and 23.6%, respectively). Low-value services with low prevalence included non-indicated imaging for benign prostatic hypertrophy (1.2%), non-indicated cervical cancer screening (3.1%) and non-indicated vitamin D screening (8.8%).

*“Back Pain Imaging” is the average annual percent of beneficiaries with uncomplicated, incident low-back pain who received non-indicated low-back imaging in the six weeks following diagnosis, 2007–2011. “BPH Imaging in Low-Risk Patients” is the average annual percent of male beneficiaries with benign prostatic hypertrophy (BPH) who received non-indicated upper urinary tract imaging in the 60 days after the index diagnosis, 2006–11. “Cardiac Testing in Low-Risk Patients” is the average annual percent of low-risk beneficiaries who received one or more non-indicated cardiac tests, 2006–11. “Cervical Cancer Screening” is the average annual percent of female beneficiaries who received at least one non-indicated screening test for cervical cancer, 2006–11. “DXA Testing (short interval)” is the average annual percent of non-indicated dual-energy x-ray absorptiometry (DXA) tests performed within 23 months of a previous DXA test, 2008–11. “Preoperative Cardiac Testing (cataract surgery)” is the average annual percent of beneficiaries undergoing cataract surgery who received one or more non-indicated cardiac tests in the 30 days before surgery, 2006–11. “Preoperative Cardiac Testing (non-cardiac surgery)” is the average annual percent of beneficiaries undergoing low-risk, non-cardiac surgery who received one or more non-indicated cardiac tests in the 30 days before surgery, 2006–11. “Vitamin D Screening” is the average annual percent of beneficiaries who received at least one non-indicated vitamin D screening test, 2006–11. “Antipsychotics in Dementia Patients” is the average annual percent of beneficiaries with diagnosed dementia and without a severe mental illness who received antipsychotic medication, 2006–11. “Feeding Tubes in Dementia Patients” is the average annual percent of beneficiaries with advanced dementia who received a feeding tube, 2006–11.
“Opioids in Migraine Patients” is the average annual percent of beneficiaries with a diagnosed migraine who received a non-indicated opioid prescription in the 21 days after an office visit with migraine diagnosis, 2006–11.

†Annual spending estimate represents the total annual allowed amount attributed to utilization of the Choosing Wisely services in 2011. Spending estimates and population amounts are scaled to 100% of the fee-for-service Medicare population.

The prevalence of low-value care varied across the United States. Non-indicated imaging for benign prostatic hypertrophy had the highest variation (coefficient of variation of 0.82), likely in part due to its low prevalence and relatively small affected population. Use of antipsychotics in dementia patients and non-indicated imaging for back pain had relatively low levels of regional variation (coefficients of variation of 0.12 and 0.16, respectively). Overall, use of low-value services, as indicated by our composite measure, was highest in the southern and eastern parts of the United States (Fig. 1).

The spending amount associated with each of the low-value services was a function of the prevalence of the service, the size of the affected population and the cost of the test or treatment. Non-indicated use of antipsychotics in dementia patients had the highest amount of associated spending ($765.1 million), followed by non-indicated vitamin D screening ($198.6 million). Non-indicated imaging for benign prostatic hypertrophy and non-indicated preoperative cardiac testing for cataract surgery had the lowest levels of associated spending ($0.3 and $0.6 million, respectively).
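One way to write the spending relationship described above is sketched here. The notation is introduced only for this illustration and does not appear in the paper: for service k, \bar{c}_k is the observed average allowed amount per low-value event, n_k is the number of events observed in the analytic sample, and f_k is the sampling fraction (assumed to be 1 for measures built on 100% claims and 0.4 for measures built on the 40% Part D sample).

```latex
% Estimated national annual spending on low-value service k
S_k = \bar{c}_k \times \frac{n_k}{f_k}

% If each at-risk beneficiary contributes at most one event (an assumption),
% the event count is prevalence times the size of the at-risk sample:
n_k \approx p_k \times N_k
\quad\Longrightarrow\quad
S_k \approx \bar{c}_k \times \frac{p_k \, N_k}{f_k}
```

Under this reading, a service can rank high on spending through any of the three factors: a high prevalence p_k, a large at-risk population N_k, or a high cost per event \bar{c}_k.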

Health care and health system characteristics were associated with use of low-value services at the regional level in the Medicare population (Table 3). In our exploratory regression model, we found that higher age-, sex-, race- and price-adjusted total Medicare spending per capita was associated with greater low-value care utilization, as were a higher ratio of specialist to primary care physicians, a higher proportion of minority beneficiaries and a higher proportion of residents with poor or fair health. In contrast, a higher proportion of residents with income under 150% of the federal poverty limit was associated with lower low-value care utilization, as was a higher physician group concentration. Notably, use of low-value services was not associated with the quality index, a standardized rate of underuse for a collection of measures thought to represent more effective care.

*“Adjusted Reimbursements” is the total, average, annual age-, sex-, race- and price-adjusted reimbursement per beneficiary in thousands of dollars (Dartmouth Atlas, 2010). “Physician Group Concentration” is the Herfindahl-Hirschman Index of allowed Medicare charges to physician groups (the mean of the square of each provider tax identification number’s allowed charges divided by total allowed charges within a given HRR) (Medicare, 2010). “Specialist/PCP Ratio” is the ratio of specialists per 100,000 residents to primary care physicians per 100,000 residents (Dartmouth Atlas, 2010). “Quality Score” is a composite of the standardized rates for the following measures, ranging from approximately −2 to 2: the percent of beneficiaries filling at least one prescription for beta-blockers within six months of a heart attack; the percent of female beneficiaries aged 67–69 who received mammography every two years; the percent of diabetics who received appropriate hemoglobin monitoring; and the percent of diabetics who received appropriate eye exams (Dartmouth Atlas, 2010). “Mortality Rate” is age-, sex- and race-adjusted mortality per 1,000 Medicare enrollees (Dartmouth Atlas, 2010). “Percent Black” is the percent of Medicare beneficiaries identified as black (RTI International, 2010). “Percent Hispanic” is the percent of Medicare beneficiaries identified as Hispanic (RTI International, 2010). “Percent with Poor or Fair Health” is the percent of adults that reported fair or poor health in the region (Behavioral Risk Factor Surveillance System, 2010). “Percent Rural” is the percent of residents in rural areas (U.S. Census, 2010). “Poverty Rate” is the percent of residents in the region below 150% of the federal poverty limit (American Community Survey, 2010).

†Mean across hospital referral regions, weighted by number of Medicare beneficiaries over age 65

Discussion

The Choosing Wisely initiative identified a set of low-value services via high-level expert opinion and consensus. We carefully constructed 11 claims-based algorithms to quantify and track utilization likely to represent overuse by relying on the recommendations from the Choosing Wisely program. Analysis of these services revealed substantial overuse and variation in overuse in the Medicare population by measure and geography. From both patient and societal perspectives, use of these services may have substantial health and economic implications. Some of the measured services represent treatments that may directly confer risk of harm (e.g., opioids in migraine patients), some may directly confer risk of harm and result in significant spending (e.g., antipsychotics in dementia patients) and others may indirectly confer risk of downstream harm by prompting additional testing and possibly resulting in false positive results (e.g., non-indicated preoperative cardiac testing). Our analysis provides an estimate of the opportunity for improving quality while reducing spending on these 11 services.

We found adjusted Medicare spending was positively associated with use of low-value services after controlling for regional health indicators. Many areas identified by others as having consistently high adjusted Medicare spending (e.g., McAllen, TX; Manhattan and Long Island, NY; Miami, FL; and Los Angeles, CA) also have high use of low-value services, indicating that at least some of their high spending results from wasteful services. The strong association between the proportion of racial and ethnic minority beneficiaries in the region and greater use of low-value services in these exploratory regressions raises questions. We suspect this association is not due to individual-level differences in treatment between racial and ethnic groups, but rather is an artifact of practice styles in regions where these population sub-groups live. Previous research has shown that where a patient lives can affect the level and quality of health care the patient receives independent of individual characteristics, and that overuse patterns do not differ by insurance type.12, 13, 14, 15

Recent evidence indicates provider organizations and regions with a higher proportion of primary care physicians have lower utilization and spending and better use of recommended preventive and chronic care.16 Moreover, workforce characteristics explain 42% of the state-level variation in Medicare spending per beneficiary.17 The magnitude of the association between specialist ratio and low-value care in our study echoes these results, but does not suggest an obvious policy intervention. It is unknown whether this observation reflects excess testing by specialists or by all types of physicians in regions with a higher relative concentration of specialists.

Overuse of other services not included in our analysis may display different patterns than the 11 services we measured. Our estimates of overuse, however, include generalist- and specialist-directed care, expensive and inexpensive tests and procedures, and a broad range of specialty society lists. The main limitation of this research is our reliance on administrative claims to identify and describe use of low-value services.18 Claims may not provide the clinical detail needed to definitively identify certain examples of low-value care. Claims may miss important patient history such as long-term, untreated back pain that contributes to clinical decision-making and justifies services that would appear in claims as low-value. Often the same service can be high- or low-value depending on the patient; if the cohort exclusions are not adequately detailed, the measure will represent utilization of the procedure and not overuse. While claims data are not ideal for measurement of patient risk or symptoms, we provide algorithms to represent each recommendation and believe they are conservative starting points to estimate the use of these services, associated spending, variation in spending and correlates of use. These algorithms are valuable for research and discussion; use of these algorithms for quality measurement or payment by payers will require validation by chart review. We do not expect the “right” rate for these claims-based measures to be zero, but the differences across geography suggest what is achievable. In research on claims-based measurement of cancer treatment quality, Earle et al. define the 10th percentile as the benchmark for health care systems to set as a goal.19 In their work evaluating the intensity of end-of-life cancer care, for example, this meant that hospitals would be providing appropriate-intensity care if less than 2% of patients started a new chemotherapy regimen in the last 30 days of life. 
We report the 25th percentile for each measure in Table 2 as a conservative initial benchmark for clinicians and health systems to work toward. This benchmark may have to change as the quality of care improves.
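A percentile benchmark of this kind can be computed from regional rates as in the sketch below. It is illustrative only: it assumes an unweighted percentile across HRRs with linear interpolation, which may differ from how the Table 2 values were derived, and the function names are invented for the example.

```python
from statistics import quantiles


def benchmark_rate(hrr_rates, percentile=25):
    """Return the given percentile (1 to 99) of the HRR-level rates."""
    cut_points = quantiles(hrr_rates, n=100, method="inclusive")
    return cut_points[percentile - 1]


def regions_above_benchmark(rates_by_hrr, percentile=25):
    """List the HRRs whose rate exceeds the benchmark, in sorted order."""
    target = benchmark_rate(list(rates_by_hrr.values()), percentile)
    return sorted(h for h, r in rates_by_hrr.items() if r > target)
```

Because the benchmark is defined relative to the observed distribution, it would need to be recomputed as regional rates change.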

Our analysis of the correlates of low-value care was exploratory and aimed at generating hypotheses. As an ecological correlation analysis, it was not based on individual patient- and provider-level modeling. Each low-value test or procedure likely has its own profile and is differentially affected by payment incentives, malpractice liability concerns, physician comfort with diagnostic uncertainty and patient demand for services, among other factors. Nonetheless, several of the patterns observed, including the association of higher spending and greater specialist supply with a greater provision of low-value care, are consistent with previous work and should serve as the basis for developing a conceptual framework for decision making around low-value care utilization.17,20

The measures developed for this study may help policymakers and payers focus attention on the forms of low-value care that are most harmful, prevalent or costly. Our conservative estimate of the spending for the low-value care services included in our analysis represents a small part of the overall cost problem and does not include other costs associated with the service or downstream costs, but is still an important starting point and opportunity for savings. Reduction in use of the services we measure will improve quality while lowering costs – changes that are hard to find in health care. Our measures also provide a baseline against which to test the impact of policies aimed at controlling costs and improving the efficiency of health care delivery, including, but not limited to, those that target low-value services directly.

The Choosing Wisely initiative has labeled services as low-value and has begun educating both patients and physicians through outreach material.21 Future work should examine the effects of the Choosing Wisely initiative and related programs on use of these services, as well as other reforms (such as accountable care organizations or value-based insurance design) that are intended to slow spending growth and reduce waste in health care. Identifying and eliminating low-value care is a critical component of health care reform and one in which careful measurement and targeting of policies will be essential to maximizing value and minimizing unintended harm.

Notes

Contributors

The authors would like to thank Daniel Gottlieb and Rebecca Zaha for analytic support, as well as Brook Martin and Joan Teno for assistance with measure development.

Funders

This study was supported by grants from the Robert Wood Johnson Foundation’s Changes in Health Care Financing and Organization (HCFO) Initiative (#70729), the National Institute on Aging (P01 AG019783 and K23 AG035030), and The Commonwealth Fund (#20130339). These organizations had no role in the collection, analysis and interpretation of the data, or approval of the finished manuscript.

Prior Presentations

This work was previously presented in a policy roundtable discussion sponsored by the Robert Wood Johnson Foundation’s Changes in Health Care Financing and Organization (HCFO) Initiative on January 27, 2014.