Abstract

Background

Nine breast cancer quality measures (QM) were selected by the American Society of Breast Surgeons (ASBrS) for the Centers for Medicare and Medicaid Services (CMS) Quality Payment Programs (QPP) and other performance improvement programs. We report member performance.

Results

A total of 1,286,011 QM encounters were captured from 2011–2015. For 7 QM, first and last PM rates were as follows: (1) needle biopsy (95.8, 98.5%), (2) specimen imaging (97.9, 98.8%), (3) specimen orientation (98.5, 98.3%), (4) sentinel node use (95.1, 93.4%), (5) antibiotic selection (98.0, 99.4%), (6) antibiotic duration (99.0, 99.8%), and (7) no surgical site infection (98.8, 98.9%); all p values < 0.001 for trends. Variability and reasons for noncompliance by surgeon for each QM were identified. The CMS-calculated target goals (ABC™ benchmarks) for PM for 6 QM were 100%, suggesting that not meeting performance is a “never should occur” event.

Conclusions

Surgeons self-reported a large number of specialty-specific patient-measure encounters into a registry for self-assessment and participation in QPP. Despite high levels of performance demonstrated initially in 2011 with minimal subsequent change, the ASBrS concluded “perfect” performance was not a realistic goal for QPP. Thus, after review of our normative performance data, the ASBrS recommended different benchmarks than CMS for each QM.

Recognizing the need to search for gaps in care, the American Society of Breast Surgeons (ASBrS) built a patient registry called Mastery of Breast Surgery℠ (Mastery) and developed quality measures (QM) to audit.24 In Mastery, surgeons can view their own performance and, immediately after entering data, compare themselves with other surgeons. As early as 2009, nearly 700 member surgeons of the ASBrS demonstrated their commitment to QM reporting by entering data on 3 QM for each of 28,000 breast cancer cases.13 Here we update the results of the ASBrS measurement program for those QM accepted by the Centers for Medicare and Medicaid Services (CMS) for their quality payment programs (QPP) (Table 1).36 Our purpose was to provide transparency of member performance, investigate variability of care, and describe how this information was used to develop quality targets (benchmarks). To our knowledge, we report the largest sample of breast surgeon-entered QM encounters assembled to date.

Specimen orientation

Numerator: Number of patients age 18 and older undergoing a therapeutic breast surgical procedure considered an initial partial mastectomy or “lumpectomy” for a diagnosed cancer or an excisional biopsy for a lesion that is not clearly benign based on previous biopsy or clinical and radiographic criteria with surgical specimens properly oriented for pathologic analysis such that six margins can be identified

Denominator: Number of patients age 18 and older undergoing a therapeutic breast surgical procedure considered an initial partial mastectomy or “lumpectomy” for a diagnosis of cancer or an excisional biopsy for a lesion that is not clearly benign based on previous biopsy or clinical and radiographic criteria

Antibiotic selection

Numerator: Surgical patients aged 18 years and older undergoing procedures with indications for a first- or second-generation cephalosporin prophylactic antibiotic, who had an order for a first- or second-generation cephalosporin for antimicrobial prophylaxis

Denominator: All surgical patients aged 18 years and older undergoing procedures with the indications for a first- or second-generation cephalosporin prophylactic antibiotic

Antibiotic duration

Numerator: Noncardiac surgical patients who have an order for discontinuation of prophylactic parenteral antibiotics within 24 h of surgical end time

Denominator: All noncardiac surgical patients aged 18 years and older undergoing procedures with the indications for prophylactic parenteral antibiotics and who received a prophylactic parenteral antibiotic

Mastectomy reoperation

Description: Unplanned 30-day reoperation rate after mastectomy

Numerator: Patients undergoing mastectomy who do not require an unplanned secondary breast or axillary operation within 30 days of the initial procedure

Denominator: Patients undergoing unilateral or bilateral mastectomy as their initial procedure for breast cancer or prophylaxis

a Mastectomy with reconstruction is included

Methods

De-identified QM data were obtained from the ASBrS for the years 2011–2015. Because the data were de-identified, the Institutional Review Board of the Gundersen Health System determined that the study was not human subjects research and waived the need for formal IRB approval.

CMS Rules and Formulas

All QM must be specified with inclusion, exclusion, and exception criteria (Table 1).36,37

Using “performance met” (PM) and “performance not met” (PNM) for each QM, the formula for performance rate (PR) was as follows: PR = [PM]/[PM + PNM]. Patients with exceptions are included in the PR only if there was PM. Excluded patients are never included in the PR. For example, patients undergoing lumpectomy are excluded from the mastectomy reoperation QM.
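The counting rules above can be illustrated with a short sketch. It is not the registry's actual implementation; the `Encounter` fields and the `performance_rate` function are hypothetical names chosen for illustration. The sketch assumes each encounter records whether performance was met, whether an exception applies, and whether the patient is excluded from the measure.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Encounter:
    """One patient-measure encounter (hypothetical structure)."""
    performance_met: bool
    has_exception: bool = False
    excluded: bool = False


def performance_rate(encounters: Iterable[Encounter]) -> Optional[float]:
    """Return PR = PM / (PM + PNM), or None if no encounter is countable."""
    pm = 0   # performance met
    pnm = 0  # performance not met
    for e in encounters:
        if e.excluded:
            continue  # excluded patients are never counted
        if e.performance_met:
            pm += 1   # counted even if an exception also applies
        elif not e.has_exception:
            pnm += 1  # not met, no justifying exception
        # not met with an exception: dropped from the rate entirely
    total = pm + pnm
    return pm / total if total else None


# Hypothetical example: 8 met (one of which also had an exception),
# 1 not met with an exception, 1 not met without, 1 excluded.
example = (
    [Encounter(True)] * 7
    + [Encounter(True, has_exception=True)]
    + [Encounter(False, has_exception=True)]
    + [Encounter(False)]
    + [Encounter(False, excluded=True)]
)
rate = performance_rate(example)  # 8 / (8 + 1)
```

In this example the exception case and the excluded case leave the denominator untouched, so the rate is 8/9 rather than 8/11.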

For calculating the total number of surgeon–patient-measure encounters captured in Mastery, we summed the total reports for each individual QM for all study years and all providers who entered data. Statistical Analysis Software, version 9.3 (SAS Institute Inc., Cary, NC) was used to report performance.

Benchmarks for performance for each QM were calculated by the Achievable Benchmark of Care™ (ABC) methodology recommended by CMS.38,39 ABC benchmarks were reviewed by the ASBrS Board of Directors in person on January 22, 2016. By the ABC method, calculated benchmarks for six QM were 100% performance met. Thus, for these measures, performance not met became a de facto “never should occur” event. As a result, the Patient Safety and Quality (PSQ) and the executive committees recommended different benchmarks, based on our member normative performance data and society expert opinion. This methodology of setting a target goal for passing a test has been termed a modified Angoff approach by educators and is similar to the process used by the European Society of Breast Cancer Specialists (EUSOMA).28,40, 41, 42, 43 To assess annual trends in performance, the Cochran-Armitage test was used.
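As an illustration of the trend test named above, the following sketch computes the standard Cochran-Armitage statistic for a proportion measured over ordered years. The study itself used SAS; this standalone function, its name, and the yearly counts are assumptions made for illustration and are not the study data.

```python
import math
from typing import Sequence, Tuple


def cochran_armitage(met: Sequence[int], total: Sequence[int]) -> Tuple[float, float]:
    """Two-sided Cochran-Armitage trend test with scores 0, 1, 2, ...

    met[i]   = encounters with performance met in year i
    total[i] = all countable encounters in year i
    Returns (z, p_value) using the normal approximation.
    """
    scores = range(len(total))
    n_all = sum(total)
    p_bar = sum(met) / n_all  # pooled proportion across years
    # Score-weighted deviation of observed from expected counts
    t = sum(s * (x - n * p_bar) for s, x, n in zip(scores, met, total))
    var = p_bar * (1 - p_bar) * (
        sum(n * s * s for s, n in zip(scores, total))
        - sum(n * s for s, n in zip(scores, total)) ** 2 / n_all
    )
    if var <= 0:
        raise ValueError("trend test undefined: no variation in outcome or score")
    z = t / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Hypothetical counts: 20,000 encounters per year, rate rising 95.8% -> 98.5%
z_up, p_up = cochran_armitage(
    [19160, 19300, 19450, 19600, 19700], [20000] * 5
)
```

With samples of this size, an absolute change of a few percentage points yields a very small p value, which is why statistical significance and clinical significance need to be considered separately.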

Society Actions

The ASBrS performed an annual review of participating member performance for the QM captured in Mastery. The results were presented to their Board of Directors by the PSQ Committee. Initiatives to address quality concerns were then discussed or planned.

Results

Encounters Captured

A total of 1,286,011 unique provider-patient-measure encounters were captured in Mastery during 2011–2015 for 9 QCDR QM.44 Encounters varied by QM from 275,619 for the specimen orientation QM to 2680 for a recently introduced hereditary risk QM (Table 2). The number of encounters differed by QM due to its eligibility requirements and the time point when it was first available for reporting. The dropout rate of surgeons who did not enter any encounters for the last reporting year (2015) but who had entered data in prior years was 43% (354/832).

a CMS Benchmark was derived from the ABC™ formula; the ASBrS Benchmarks were determined after calculating and not endorsing the ABC™ benchmarks. The ASBrS Benchmarks were based on the observed normative performance data in this study and expert opinion of the ASBrS Quality and Executive Committees

The hereditary assessment and the unplanned reoperation after mastectomy QM were not included in the trending analysis because they are new measures

The most common reasons for performance not met (PNM) by QM were “patient refusal” for NB (0.6%, 583/105,541), “fragmented tissue” for SO (0.6%, 599/105,186), “imaging not available” for SI (0.04%, 41/95,534), “attempted, not successful” for SN (0.6%, 627/99,172), “no reason given” for AS (0.4%, 419/97,206), “no reason given” for AD (0.1%, 136/96,583), “infection” for NSSI (1.4%, 1987/141,963), and “bleeding” for mastectomy reoperation (0.2%, 152/73,886). Other reasons for PNM and exceptions for each QM are in Table 4.

Cefazolin or cefuroxime NOT ordered- Allergy to penicillin or cephalosporin

7889/97,206 (8.1%)

Cefazolin or cefuroxime NOT ordered, no reason specified (419)

Cefazolin or cefuroxime NOT ordered for medical reason

236/97,206 (0.2%)

Antibiotic stopped after 24 h

Antibiotic NOT discontinued – ordered by plastic surgeon for expander or implant insertion

5898/96,583 (6.1%)

No reason specified, antibiotic NOT discontinued (or ordered to be) within 24 h (and given within 4 h prior to incision or intraoperatively) (136)

For medical reasons, antibiotic NOT discontinued (or ordered to be) within 24 h (and given within 4 h prior to incision or intraoperatively)

1417/96,583 (1.5%)

Antibiotic NOT discontinued—ordered by plastic surgeon for autologous flap

842/96,583 (0.9%)

Hereditary risk

Genetic testing denied by insurance

100/7047 (1.4%)

Genetic testing not ordered, no reason specified (166)

Patient refused genetic testing (117)

Post-op surgical site infection

No “exceptions”

Cellulitis (435)

Superficial SSI (310)

Deep SSI (116)

Exceptions mean the surgeon is not penalized in their performance rate for not meeting performance because the reason for PNM is justifiably not related to quality, as determined by the American Society of Breast Surgeons

Benchmarks

With the CMS ABC formula, the benchmarks were 100% performance met for every QM except for the hereditary risk measure, which was 98%. In contrast, the ASBrS-recommended benchmarks were as follows: needle biopsy (90%), specimen orientation (95%), specimen imaging (95%), sentinel node use (90%), mastectomy reoperation rate (< 10%), hereditary risk assessment (90%), and surgical infection (< 6%; Table 2). Benchmarks for the two antibiotic QM were not established, because they have been discontinued.
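To show how a data-driven formula can return a 100% benchmark, the following sketch follows the ABC steps as commonly described in the published methodology: rank providers by an adjusted performance fraction (x + 1)/(d + 2), take the top-ranked providers until they cover at least 10% of all patients, and pool their unadjusted results. The function name and the surgeon counts are hypothetical, and this is not the CMS or ASBrS implementation.

```python
from typing import List, Tuple


def abc_benchmark(providers: List[Tuple[int, int]], coverage: float = 0.10) -> float:
    """Achievable Benchmark of Care sketch.

    providers: list of (met, eligible) counts, one pair per provider.
    coverage:  minimum share of all eligible patients the top group must hold.
    """
    total_eligible = sum(d for _, d in providers)
    # Adjusted performance fraction damps the influence of tiny denominators
    ranked = sorted(providers, key=lambda xd: (xd[0] + 1) / (xd[1] + 2), reverse=True)
    met_sum = 0
    eligible_sum = 0
    for x, d in ranked:
        met_sum += x
        eligible_sum += d
        if eligible_sum >= coverage * total_eligible:
            break
    # Benchmark is the pooled, unadjusted rate of the top group
    return met_sum / eligible_sum


# Hypothetical registry of nine surgeons, 1,000 eligible patients in total.
surgeons = [
    (60, 60), (50, 50), (95, 100), (180, 200), (90, 100),
    (270, 300), (85, 100), (45, 50), (30, 40),
]
benchmark = abc_benchmark(surgeons)
```

Here two surgeons with perfect records already account for 110 of the 1,000 patients, so the pooled rate of the top group, and therefore the benchmark, is 100%. When overall performance is high and many providers report no failures, this outcome is expected.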

Discussion

Background

In 2008, the ASBrS launched its Mastery program for breast surgeons to document the quality of their clinical performance.13 After modified Delphi ranking, 9 of 144 breast surgical QM were chosen for ASBrS member self-assessment, benchmarking, and CMS QPP.44 Program developers and ranking participants were diverse in practice location and type and included nonspecialty breast surgeons.

Participation

By 2017, spurred by landmark legislation and the need to improve quality, nearly 70 organizations had developed patient registries for clinicians to report more than 300 QM to CMS.36,45, 46, 47, 48 Our registry, which started much earlier, has already captured over one million unique patient-measure encounters and provided real-time benchmarking.

Performance

There was a high rate of performance for eight of our nine measures. For these eight measures, compliance was met in more than 94% of patient encounters. Notable examples include the high rate of preoperative diagnosis of breast cancer made by a needle biopsy (97.5%) and the low rates of surgical site infection and unplanned reoperation after mastectomy, both less than 2%. This level of performance exceeded most historical reports.18,49, 50, 51 The QM with the lowest aggregate performance was “documentation of surgeon hereditary assessment of a newly diagnosed breast cancer patient” at 86%. Although overall performance for the other eight QM was excellent, we recognize that disparities of care may still be present. During the last 6 years of measurement, there were statistically significant changes in performance for all measures. Despite both upward- and downward-trending changes, the absolute differences by year were small, all less than 3%, which raises the question of whether these changes are clinically significant. Because the performance level of surgeons reporting in Mastery is so high, these surgeons may be a self-selected group of high performers. Supporting this concept, surgeons voluntarily reporting in a cardiac surgery registry demonstrated better performance than nonparticipants.52 Our findings of such high performance in the initial study years, followed by minimal annual change, are similar to a recent report from European breast centers.53 When this scenario occurs, there is concern that these QM may have “topped out,” resulting in less opportunity for future improvement. However, because the level of performance for nonparticipants in our program is unknown, we have not yet retired our QM; rather, by continuing to support them, we are endorsing their importance inside and outside our society membership.

Although aggregate performance rates were high, variability of performance existed, best demonstrated by histograms (Fig. 1). Whenever variability coexists with evidence that high performance is achievable, there is opportunity to improve overall care.54

When Performance is Not Met, What Can We Learn?

The most common reasons for not meeting performance for each measure are documented in Table 3. Even with high overall performance, there is value to identifying causes of measure noncompliance. Understanding causation affords opportunity to improve. For example, one reason for omission of a needle biopsy for diagnosing cancer was “needle biopsy not available in my community,” which represents a system and resource issue, rather than a surgeon-specific issue. Supporting solutions, the ASBrS has provided education and certification for both ultrasound-guided and stereotactic core needle biopsy.55 In another example, the second most common reason that patients underwent an unplanned reoperation after mastectomy was for a positive margin. Potentially, surgeons learning they are comparative outliers for margin involvement may reevaluate their care processes to better assess the cancer’s proximity to the mastectomy margins preoperatively.

If performance is not met for a QM due to a justifiable “nonquality” reason, then CMS defines this encounter as an “exception.” In such cases, the encounter did not penalize the surgeon, because it was not included in their performance rate. An exception to not meeting performance for achieving a cancer diagnosis by a needle biopsy occurred in 8264 patients undergoing prophylactic mastectomy and in 1814 patients having an imaging abnormality that was “too close” to an implant or the chest wall to permit safe needle biopsy. This granular level of information potentially aids improvement strategies. For example, in high-risk patients undergoing risk-reducing mastectomy, surgeons ought to pursue guideline-concordant preoperative imaging to identify nonpalpable cancers. Doing so would both improve the needle-biopsy rate for cancer and reduce the mastectomy reoperation rate, because sentinel nodes could be excised during the initial mastectomy in patients later found to have invasive cancer.

Capturing exceptions also allowed for accurate attribution assignments. For example, in our registry, a surgeon can attribute a reoperation after mastectomy to themselves, such as for axillary bleeding, or to the plastic surgeon for flap donor site bleeding.

Benchmarking

Benchmarking (profiling) means that participants can compare their performance to others and is a method for quality improvement.23,31,39,53,56,57 Benchmarking programs differ. Navathe et al. recently summarized eight different design factors.56 Using this categorization, our program is identity-blind, reports textually (not graphically), encourages high-value care, discourages low-value care, compares an individual to a group, contains measures with both higher and lower levels of evidence supporting them, has a national scope, and to our knowledge has not resulted in any unintended adverse outcome.

The term benchmark means a point of reference. A benchmark may simply be an observation of results of contemporary care, perhaps when first described in a specific patient population.39,58 A benchmark can also be an organizational target goal, such as a zero percent infection rate, or a data-driven reference, reached when content experts scrutinize observed ranges of performance and subsequently endorse a specific percentile.40, 41, 42, 43 In 2008, 24 breast cancer experts attended a workshop in Europe and established benchmarks for 17 quality indicators for breast centers, calling them minimum standards and quality targets.28,53 The establishment of a quality target is a known method for improving quality beyond the effect of peer comparison.35 Recognizing this concept, CMS requires that QCDR stewards determine ABC benchmarks for each of their QM.38 Conceptually, the ASBrS Board of Directors agreed that benchmarks can be catalysts for improvement. After application of the ABC formula, the CMS benchmarks for six of our QM were 100% “performance met.” After review, the ASBrS Board concluded that achieving perfection in every patient encounter was desirable but should not be considered the “standard of care”; nor should “performance not met” be considered a “never” event. As a result, the ASBrS Quality Committee and Board reviewed the member performance presented here, as well as relevant literature, and then endorsed different benchmarks that reflected high-quality and clinically achievable care (Table 2).

Was our Quality Program the Driver of Observed Improvements in Performance?

For the first three QM, which measured needle biopsy, specimen imaging, and specimen orientation rates in our program before 2011, performance in 2011–2015 was markedly improved compared with the earlier period.13 Overall, there was significant improvement for seven of our nine QM from 2011 to 2015. Whether this improvement was directly related to our measurement and benchmarking, as the natural consequence of measurement driving improvement, cannot be conclusively determined given multiple competing reasons that could explain improvement. These potential confounders include some changes in QM specifications over time, as well as our own educational programs and scholarly publications within and outside our Society.53

Program and Study Strengths and Limitations

The strengths and limitations of the ASBrS Mastery patient registry have been described elsewhere.44 Strengths include large sample sizes, immediate peer comparison, and appropriate attribution assignments.44 In addition, our registry is flexible in terms of its ability to capture additional data fields after appropriate vetting by the society and in its ability to output data across a number of domains. While it was initially developed for quality measurement, it also has been used for clinical outcomes research.13,59, 60, 61, 62

Limitations are recognized.44 A selection bias is possible because the surgeons who self-select to participate may be “above average.” They may share certain characteristics, such as a focus on quality and safety, better resources or a different case-mix compared to nonparticipating surgeons. If so, our results may not be reproducible in other settings. Due to this concern, the ASBrS Board agreed to offer participation to non-ASBrS members for pilot studies. Other limitations include an unknown rate of nonconsecutive case entry and an unknown rate of surgeon dropout due to their perception of poor performance. In addition, most of our QM are process rather than outcome measures, and we are not providing risk-adjusted comparisons. As a result, investigations are underway to identify the interactions between patient, surgeon, and facility characteristics that affect our measured outcomes. Lastly, formal reliability testing of our measures and advanced analytic tools to disentangle each surgeon’s intrinsic performance from their supporting institution have not been performed.63,64

Conclusions

The ASBrS successfully constructed an electronic patient registry and then engaged breast surgeons to capture more than a million organ-specific QM encounters, providing proof of surgeons’ commitment to self-assessment as well as evidence of our society’s compliance with a mission “continually to improve the practice of breast surgery.”65 Functionality was provided for surgeon profiling, program data were used to establish quality targets, and a service was provided to surgeon members allowing them to participate in CMS-incentivized reimbursement programs. Much work remains to include more advanced analytic methods for benchmarking and to decide when to retire existing measures that may have “topped out.” For now, we encourage all surgeons not participating in our program to compare their personal performance to our benchmarks. In addition, we are currently searching for inequities and disparities of care by surgeon and patient characteristics.

Institute of Medicine (US) Committee on Quality of Health Care in America. Crossing the quality chasm: a new health system for the 21st century. Washington (DC): National Academies Press (US); 2001.

Copyright information

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.