The blog of Ashish Jha — physician, health policy researcher, and advocate for the notion that an ounce of data is worth a thousand pounds of opinion.

Now we’re giving star ratings to hospitals? Does anyone think this is a good idea? Actually, I do. Hospital rating schemes have cropped up all over the place, and sorting out what’s important and what isn’t is difficult and time-consuming. The Centers for Medicare & Medicaid Services (CMS) runs the best-known and most comprehensive hospital rating website, Hospital Compare. But, unlike most “rating” systems, Hospital Compare simply reports data on a large number of performance measures – from processes of care (did the patient get the antibiotics in time?) to outcomes (did the patient die?) to patient experience (was the patient treated with dignity and respect?). The measures they focus on are important, generally valid, and usually endorsed by the National Quality Forum. The one big problem with Hospital Compare? It isn’t particularly consumer-friendly. With the large number of data points, it might take consumers hours to sort through all the information and figure out which hospitals are good and which ones are not on which set of measures.

To address this problem, CMS just released a new star rating system, initially focusing on patient experience measures. It takes a hospital’s scores on a series of validated patient experience measures and converts them into a single star rating (rating each hospital 1 star to 5 stars). I like it. Yes, it’s simplistic – but it is far more useful than the large number of individual measures that are hard to follow. There was no evidence that patients and consumers were using any of the data that were out there. I’m not sure that they will start using this one – but at least there’s a chance. And, with excellent coverage of this rating system from journalists like Jordan Rau of Kaiser Health News, the word is getting out to consumers.

Our analysis

To understand the rating system a little better, I asked our team’s chief analyst, Jie Zheng, to help us see who did well and who did badly on the star ratings. We linked the hospital rating data to the American Hospital Association annual survey, which has data on structural characteristics of hospitals. She then ran both bivariate and multivariable analyses looking at whether a set of hospital characteristics predicts receiving 5 stars. Because the bivariate analyses are the most straightforward and useful for patients, we present only those data here.

Our results

What did we find? We found that large, non-profit, teaching, safety-net hospitals located in the northeastern or western parts of the country were far less likely to be rated highly (i.e. receiving 5 stars) than small, for-profit, non-teaching, non-safety-net hospitals located in the South or Midwest. The differences were big. There were 213 small hospitals (those with fewer than 100 beds) that received a 5-star rating. Number of large hospitals with a 5 star rating? Zero. Similarly, there were 212 non-teaching hospitals that received a 5-star rating. The number of major teaching hospitals (those that are a part of the Council of Teaching Hospitals)? Just two – the branches of the Mayo Clinic located in Jacksonville and Phoenix. And safety net hospitals? Only 7 of the 800 hospitals (less than 1%) with the highest proportion of poor patients received a 5-star rating, while 106 of the 800 hospitals with the fewest poor patients did. That’s a 15-fold difference. Finally, another important predictor? Hospital margin – high margin hospitals were about 50% more likely to receive a 5-star rating than hospitals with the lowest financial margin.

Here are the data:

Interpretation

There are two important points worth considering in interpreting the results. First, these differences are sizeable. Huge, actually. In most studies, we are delighted to see 10% or 20% differences in structural characteristics between high and low performing hospitals. Because of the approach of the star ratings, especially with the use of cut-points, we are seeing differences as great as 15-fold (on safety-net status, for instance).
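The safety-net comparison can be checked directly from the counts in the results. A minimal sketch in Python, using only the reported figures (7 and 106 five-star hospitals out of 800 in each group); the variable names are mine, chosen for illustration:

```python
# Counts reported in the results: 5-star hospitals among the 800 hospitals
# with the most poor patients vs. the 800 hospitals with the fewest.
safety_net_five_star = 7
low_poverty_five_star = 106
group_size = 800

safety_net_rate = safety_net_five_star / group_size    # share with 5 stars
low_poverty_rate = low_poverty_five_star / group_size

fold_difference = low_poverty_rate / safety_net_rate
percent_higher = (fold_difference - 1) * 100

print(f"Safety-net hospitals with 5 stars: {safety_net_rate:.2%}")
print(f"Low-poverty hospitals with 5 stars: {low_poverty_rate:.2%}")
print(f"Fold difference: {fold_difference:.1f}x ({percent_higher:.0f}% higher)")
```

The ratio comes out to a little over 15, which is the 15-fold gap; expressed as a percentage increase, that is roughly 1,400% higher.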

The second point is that this is only a problem if you think it’s a problem. The patient surveys, known as HCAHPS, are validated, useful measures of patient experience and important outcomes unto themselves. I like them. They also tend to correlate well with other measures of quality, such as process measures and patient outcomes. The star ratings nicely encapsulate which types of hospitals do well on patient experience, and which ones do less well. One could criticize the methodology for the cut-points that CMS used for determining how many stars to award for which scores. I don’t think this is a big issue. Any time you use cut-points, there will be organizations right on the bubble, and surely it is true that someone who just missed being a 5 star is similar to someone who just made it. But that’s the nature of cut-points – and it’s a small price to pay to make data more accessible to patients.

Making sense of this and moving forward

CMS has signaled that they will be doing similar star ratings for other aspects of quality, such as hospital performance on patient safety. The validity of those ratings will be directly proportional to the validity of the underlying measures used. For patient experience, CMS is using the gold standard. And the goals of the star rating are simple: motivate hospitals to get better – and steer patients towards 5-star hospitals. After all, if you are sick, you want to go to a 5-star hospital. Some people will be disturbed by the fact that small, for-profit hospitals with high margins are getting the bulk of the 5 stars while large, major teaching hospitals with a lot of poor patients get almost none. It feels like a disconnect between what we think are good institutions and what the star ratings seem to be telling us. When I am sick, or when my family members need hospital care, I usually choose these large, non-profit academic medical centers. So the results will feel troubling to many. But this is not really a methodology problem. It may be that sicker, poor patients are less likely to rate their care highly. Or it may be that the hospitals that care for these patients are generally not as focused on patient-centered care. We don’t know. But what we do know is that if patients start really paying attention to the star ratings, they are likely to end up at small, for-profit, non-teaching hospitals. Whether that is a problem or not depends wholly on how you define what is a high quality hospital.

Of all the pressing challenges in the US health care system, lack of innovation in delivery may be the most important. Indeed, as we come upon the 50th anniversary of Medicare, a few facts seem apparent. What we do for patients—whether they have infectious diseases, heart disease, or cancer—has changed dramatically. Yet, how we do those things—the basic structure of our health care delivery system—has changed very little.

I’m sorry I haven’t had a chance to blog in a while – I took a new job as the Director of the Harvard Global Health Institute and it has completely consumed my life. I’ve decided it’s time to stop whining and start writing again, and I’m leading off with a piece about adjusting for socioeconomic status. It’s pretty controversial – and a topic where I have changed my mind. I used to be against it – but having spent some more time thinking about it, it’s the right thing to do under specific circumstances. This blog is about how I came to change my mind – and the data that got me there.

Changing my mind on SES Risk Adjustment

We recently had a readmission – a straightforward case, really. Mr. Jones, a 64-year-old homeless veteran, intermittently took his diabetes medications and would often run out. He had recently been discharged from our hospital (a VA hospital) after admission for hyperglycemia. The discharging team had been meticulous in their care. At the time of discharge, they had simplified his medication regimen, called him at his shelter to check in a few days later, and set up a primary care appointment. They had done basically everything, short of finding Mr. Jones an apartment.

Ten days later, Mr. Jones was back — readmitted with a blood glucose of 600, severely dehydrated and in kidney failure. His medications had been stolen at the shelter, he reported, and he’d never made it to his primary care appointment. And then it was too late, and he was back in the hospital.

The following afternoon, I spoke with one of the best statisticians at Harvard, Alan Zaslavsky, about the case. This is why we need to adjust quality measures for socioeconomic status (SES), he said. I’m worried, I said. Hospitals shouldn’t get credit for providing bad care to poor patients. Mr. Jones had a real readmission – and the hospital should own up to it. Adjusting for SES, I worried, might create a lower standard of care for poor patients and thus, create the “soft bigotry of low expectations” that perpetuates disparities. But Alan made me wonder: would it really?

To adjust or not to adjust?

Because of Alan’s prompting, I re-examined my assumptions about adjustment for SES. As he walked me through the data, I concluded that the issue of adjustment was far more nuanced than I had appreciated.

Here’s the key: effective socio-economic adjustment doesn’t reward providers for giving bad care to poor patients. It just ensures that they aren’t penalized for taking care of more of them. In my clinical example, if people like Mr. Jones had a higher readmission rate, adjusting for SES wouldn’t give hospitals credit for lower quality care to poor patients. Done right, it would give credit to hospitals for having more poor patients, and that’s an important difference. Consider three scenarios of hospital performance on readmission rates (modified from our JAMA piece).

In scenarios 1 and 2, let’s assume that patients are readmitted 20% of the time on average, whether or not they’re poor. In scenario 1, Hospital A (a safety-net hospital) has higher readmission rates for everyone. They may have more poor patients, but their readmission rate is high for both poor and non-poor patients. So, compared to Hospital B, they look worse in unadjusted and adjusted scores. Adjustment doesn’t help.

In scenario 2, Hospital A has higher readmission rates for its poor patients and therefore has an overall readmission rate of 25%. Hospital B doesn’t suffer from readmitting its poor patients too often – hence its readmission rate is 20%. In this case, Hospital A looks worse than Hospital B in both unadjusted and adjusted analyses. Again, adjustment doesn’t help.

In scenario 3, Hospital A and B both struggle with readmissions for their poor patients – as does the rest of the country. The only thing that differentiates Hospital A from Hospital B is the proportion of poor patients in the hospital. In this case, adjustment makes a big difference. By adjusting, we account for the different proportions of poor patients between Hospital A and B. Adjustment ensures that organizations are judged by how well they care for their patients, not by how many poor patients they have.
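The three scenarios can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions, not the method CMS or the JAMA piece uses: it applies one simple form of adjustment (direct standardization to a common patient mix), and every number other than the 20% and 25% overall rates mentioned above is hypothetical, as are the names `Hospital`, `REFERENCE_POOR_SHARE`, and `gaps`.

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str
    poor_share: float    # fraction of the hospital's patients who are poor
    rate_poor: float     # readmission rate among its poor patients
    rate_nonpoor: float  # readmission rate among its non-poor patients

    def crude_rate(self) -> float:
        """Unadjusted rate, weighted by the hospital's own patient mix."""
        return (self.poor_share * self.rate_poor
                + (1 - self.poor_share) * self.rate_nonpoor)

    def standardized_rate(self, reference_poor_share: float) -> float:
        """Adjusted rate: same stratum rates applied to a common patient mix."""
        return (reference_poor_share * self.rate_poor
                + (1 - reference_poor_share) * self.rate_nonpoor)


# Hypothetical common case mix used to standardize every hospital.
REFERENCE_POOR_SHARE = 0.4

# Illustrative numbers only; Hospital A is the safety-net hospital.
SCENARIOS = {
    1: (Hospital("A", 0.6, 0.25, 0.25), Hospital("B", 0.2, 0.20, 0.20)),
    2: (Hospital("A", 0.5, 0.30, 0.20), Hospital("B", 0.2, 0.20, 0.20)),
    3: (Hospital("A", 0.6, 0.30, 0.20), Hospital("B", 0.2, 0.30, 0.20)),
}


def gaps(a: Hospital, b: Hospital) -> tuple[float, float]:
    """Return (unadjusted gap, adjusted gap) in readmission rate, A minus B."""
    crude = a.crude_rate() - b.crude_rate()
    adjusted = (a.standardized_rate(REFERENCE_POOR_SHARE)
                - b.standardized_rate(REFERENCE_POOR_SHARE))
    return crude, adjusted


for number, (a, b) in SCENARIOS.items():
    crude, adjusted = gaps(a, b)
    print(f"Scenario {number}: unadjusted gap {crude:+.3f}, "
          f"adjusted gap {adjusted:+.3f}")
```

With these made-up inputs, Hospital A still looks worse after adjustment in scenarios 1 and 2, while in scenario 3 the gap disappears, because the only thing separating the two hospitals was the share of poor patients.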

One Size Does Not Fit All

The debate about whether to adjust for socioeconomic status needs to be far more nuanced than it has been to date. Specifically, we must recognize that quality measurement has multiple purposes, and we need to think about each one when deciding whether to adjust or not. If the goal is transparency – letting patients know how they are likely to fare – then the best approach is stratified data. In scenario 3 (where adjustment makes a difference) a poor patient will do about as well at both hospitals – and unadjusted numbers are misleading, because they tell poor patients that Hospital B is better. If Hospital B has a larger co-pay or is out-of-network, you have done real harm by pushing a patient to a more expensive place that doesn’t provide better care.

To push hospitals to improve quality, unadjusted numbers are best. In all three scenarios, Hospital A should be more motivated to get better than Hospital B because for its patients, it tends to have worse performance. But in each scenario, the hospitals need stratified data. Without it they will have no idea where to target their efforts.

For penalties, we should use adjusted data. It will make no difference in scenarios 1 and 2. But, in scenario 3, it makes little sense to penalize the safety net hospital compared to other hospitals just for taking care of more poor patients. That’s not a smart policy. Penalties for bad care for poor patients? Sure. Penalties just for caring for more poor patients? Not so sure.

A way forward

The bottom line is that the care of poor patients is not evenly distributed across all U.S. hospitals. Some hospitals have a lot more patients like Mr. Jones than others have. And caring for people like him, who are homeless and without a social network, is challenging. None of us are very good at it. Why penalize the safety-net hospitals just for taking care of more poor patients?

Given the concern that safety-net hospitals may be disproportionately penalized, a bi-partisan group of Senators (3 Democrats and 3 Republicans) has signed on to a bill that would require CMS to account for SES when it doles out penalties for the HRRP (Senate Bill 2501). It’s an excellent start.

Adjusting for SES is an acknowledgement that medicine is not the only factor – and indeed may be a relatively minor factor – in health outcomes. For Mr. Jones, homelessness and poverty clearly contributed to his readmission to the hospital. Bad medical care did not. We should have no qualms penalizing safety-net hospitals for providing sub-standard care. But we shouldn’t penalize them simply because they have more poor patients.

Adverse events – when bad things happen to patients because of what we as medical professionals do – are a leading cause of suffering and death in the U.S. and globally. Indeed, as I have written before, patient safety is a major issue in American healthcare, and one that has gotten far too little attention. Tens of thousands of Americans die needlessly because of preventable infections, medication errors, surgical mishaps, and so forth. As I wrote previously, according to the Office of Inspector General (OIG), when an older American walks into a hospital, he or she has about a 1 in 4 chance of suffering some sort of injury during the stay. Many of these are debilitating, life-threatening, or even fatal. Things are not much better for younger Americans.

Given the magnitude of the problem, many of us have decried the surprising lack of attention and focus on this issue from policymakers. Well, things are changing – and while some of that change is good, some of it worries me. Congress, as part of the Affordable Care Act, required the Centers for Medicare and Medicaid Services (CMS) to penalize hospitals that had high rates of “HACs” – Hospital Acquired Conditions. CMS has done the best it can, putting together a combination of infections (as identified through clinical surveillance and reported to the CDC) and other complications (as identified through the Patient Safety Indicators, or PSIs). PSIs are useful – they use algorithms to identify complications coded in the billing data that hospitals send to CMS. However, there are three potential problems with PSIs: hospitals vary in how hard they look for complications, they vary in how diligently they code complications, and finally, although PSIs are risk-adjusted, their risk-adjustment is not very good – and sicker patients generally have more complications.

So, HACs are imperfect – but the bottom line is, every metric is imperfect. Are HACs particularly imperfect? Are the problems with HACs worse than with other measures? I think we have some reason to be concerned.

HACs – Who Gets Penalized?

Our team was asked by Jordan Rau of Kaiser Health News to run the numbers. He sent along a database that listed CMS’s calculation of the HAC score for every hospital, and the worst 25% that were likely to get penalized. So, we ran some numbers, looking at characteristics of hospitals that do and do not get penalized:

These are bivariate relationships – for example, major teaching hospitals were 2.9 times more likely to be penalized than non-teaching hospitals. The numbers do not simultaneously adjust for the other characteristics because, as a policy matter, it’s the unadjusted value that matters. If you want to understand to what degree academic hospitals are being penalized because they also happen to be large, then you need multivariable analyses, so we went ahead and ran a multivariable model (a logistic model with each of the above variables in it). Even in that model, the results are qualitatively similar, although not all the differences remain statistically significant.

What Does This Mean?

So how should we interpret these data? A simple way to think about it is this: who is getting penalized? Large, urban, public, teaching hospitals in the Northeast with lots of poor patients. Who is not getting penalized? Small, rural, for-profit hospitals in the South. Here are the data from the multivariable model: The chances that a large, urban, public, major teaching hospital that has lots of poor patients (i.e. top quartile of DSH Index) will get the HAC penalty? 62%. The chances that a small, rural, for-profit, non-teaching hospital in the South with very few poor patients will get the penalty? 9%.
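To put those two predicted probabilities on a common scale, here is a quick sketch. The 62% and 9% come from the model described above; the rest is plain arithmetic, and the names are illustrative rather than anything from the original analysis.

```python
# Predicted probabilities of receiving the HAC penalty, as reported above.
p_large_teaching = 0.62   # large, urban, public, major teaching, many poor patients
p_small_rural = 0.09      # small, rural, for-profit, non-teaching, few poor patients


def odds(p: float) -> float:
    """Convert a probability into odds (p against not-p)."""
    return p / (1 - p)


# How many times more likely the first kind of hospital is to be penalized.
risk_ratio = p_large_teaching / p_small_rural

# The same comparison on the odds scale, which is what a logistic model works in.
odds_ratio = odds(p_large_teaching) / odds(p_small_rural)

print(f"Risk ratio: {risk_ratio:.1f}")
print(f"Odds ratio: {odds_ratio:.1f}")
```

The risk ratio is just under 7, so the first kind of hospital is roughly seven times as likely to be penalized; the odds ratio is larger (about 16.5) because odds stretch out as probabilities rise, which is worth remembering when reading logistic-model output.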

Is that a problem? You could make the argument that these large, Northeastern teaching hospitals are terrible places to get care – while the hospitals that are really doing it well are the small, rural, for-profit hospitals in the South. Maybe. I suspect this is much more about the underlying patient population and vigilance than actual safety. Beth Israel Deaconess Medical Center (BIDMC) in Boston is one of the very few hospitals in the country with exceptionally low mortality rates across all three publicly reported conditions and a hospital that I have written about as having great leadership and laser-focused attention on quality. And yet, it is being penalized as being one of the hospitals with, according to the HAC metric, a poor record on safety. So is Brigham and Women’s (though I’m affiliated there, so watch my bias) – a pioneer in patient safety whose chief quality and safety officer is David Bates, one of the nation’s foremost safety gurus. So are the Cleveland Clinic and Barnes Jewish, RWJF Medical Center, LDS Hospital in Salt Lake, and Indiana University Hospital, to name a few.

So what are we to do? Is this just whining that our metrics aren’t perfect? Don’t we have to do something to move the needle on patient safety? Absolutely. But, we are missing a great opportunity to do something much more useful. Patient safety as a field has been stuck. It’s been 15 years since the IOM’s To Err is Human report came out – and by all accounts, progress has been painfully slow. Therefore, I am completely on board with the sentiment behind Congressional intent and CMS’s efforts. We have to do something – but I think we should do something a little different.

If you look across the safety landscape, one thing becomes clear: when we have good measures, we make progress. We have made modest improvements in hospital-acquired infections – because of tremendous work by the CDC (and their clinically-based National Healthcare Safety Network) that collects good data on patient safety and feeds it back to hospitals. We have also made some progress on surgical complications, partly because a group of hospitals are willing to collect high quality data, and feed it back to their institutions. But the rest of the field of patient safety? Not so much. What we need are good measures. And, luckily, there is still a window of opportunity if we are willing to make patient safety a priority.

How to Move Forward

This gets us to the actual solution: harnessing the power of meaningful use in the Electronic Health Records incentive program. We need clinically-based, high quality patient safety metrics. Electronic health records can capture these far more effectively than billing codes can. The federal government is giving out billions of dollars to doctors and hospitals that “meaningfully use” certified EHRs. A couple of years ago, David Classen and I wrote a piece in NEJM that outlined how the federal government, if it wanted to be serious about patient safety, could require that EHR systems measure, track, and feed back patient safety events as part of certification and requirements for meaningful use. The technology is there. Lots of companies have developed adverse event monitoring tools. It just requires someone to decide that improving patient safety is important – and that clinically-based metrics are useful.

So here we are – HACs. Well intentioned – and a step forward, I think, in the effort to make healthcare better. Everyone I know thinks HACs have important limitations – but reasonable people disagree over whether their flaws make them unusable for financial incentives or not. The good news is that all of us can agree that we can do much better. And now is the time to do it.

Last year, about 43 million people around the globe were injured from the hospital care that was intended to help them; as a result, many died and millions suffered long-term disability. These seem like dramatic numbers – could they possibly be true?

If anything, they are almost surely an underestimate. These findings come from a paper we published last year, funded by and done in collaboration with the World Health Organization. We focused on a select group of “adverse events” and used conservative assumptions to model not only how often they occur, but also with what consequence to patients around the world.

Our WHO-funded study doesn’t stand alone; others have estimated that harm from unsafe medical care is far greater than previously thought. A paper published last year in the Journal of Patient Safety estimated that medical errors might be the third leading cause of death among Americans, after heart disease and cancer. While I find that number hard to believe, what is undoubtedly true is this: adverse events – injuries that happen due to medical care – are a major cause of morbidity and mortality, and these problems are global. In every country where people have looked (U.S., Canada, Australia, England, nations of the Middle East, Latin America, etc.), the story is the same. Patient safety is a big problem – a major source of suffering, disability, and death for the world’s population.

The problem of inadequate health care, the global nature of this challenging problem, and the common set of causes that underlie it, motivated us to put together PH555X. It’s a HarvardX online MOOC (Massive Open Online Course) with a simple focus: health care quality and safety with a global perspective. I believe that this will be a great course—not because I’m teaching it, but because we have assembled a team of terrific experts. But, let me be clear: putting this MOOC together is unlike any educational experience I have ever had before.

First, you get to assemble the faculty – and here, I had almost no constraints. Want to learn about quality measurement? We have Jishnu Das (World Bank economist whose ground-breaking work includes sending trained, fake patients into doctors’ offices in Delhi) and Niek Klazinga (a Dutch physician who led the creation of the Health Care Quality Indicators for the OECD). These two guys have thought more deeply and broadly about quality measurement than almost anyone else in the world. What about the role of leadership? We have Agnes Binagwaho (Minister of Health, Rwanda) and Julio Frenk (former Minister of Health, Mexico and current dean of the Harvard School of Public Health) speaking about what leadership in quality looks like from a health minister’s perspective. We have T.S. Ravikumar, the CEO of a massive public hospital system in Pondicherry, India talking about how his decision to prioritize quality transformed his institution.

Sometimes, when you want the best people in the world, you don’t even have to go very far. On patient safety, we only had to cross the street for David Bates, Chief Quality Officer at Brigham and Women’s Hospital and patient safety maven. When we wanted to learn about the empirical basis for the role of management in improving quality, we went across town to Harvard Business School to spend time with Raffaella Sadun. And when we wanted to learn about quality improvement, we only had to cross the Charles River to find Maureen Bisognano, CEO of IHI.

Beyond getting to assemble an excellent, world-class faculty, the MOOC is a completely different approach to education. Because this course has never been offered before, we had the freedom to write a fresh syllabus specifically for online learners. This is not a live course copied onto a web platform. These are not hour-long lectures videotaped from the back of a classroom. Our lectures are short, pithy conversations on pressing topics. Instead of asking Professor Ronen Rozenblum, an Israeli expert on patient experience, to lecture about how and why we might measure patient-reported outcomes, we are having a meaningful discussion – back and forth, where I get to challenge his assumptions and let him articulate why patient experience should be considered an integral part of quality and more importantly, why he cares.

Beyond the discussions, we have interactive sessions where students create content. One of my favorites? Through this course we will crowdsource the first global “atlas” on healthcare quality. Let’s be honest, it’s one thing for me to point to individual studies on hospital infections in Canada or India, but right now, we have no place to turn to if we want to really understand key issues in healthcare quality around the globe and how they compare to one another. The goal of this exercise is as simple as it is ambitious. By the end of the course, we will draft a resource that maps out where the world is on the journey towards safe, effective, patient-centered healthcare systems. It will be created by the collective energy and creativity of people in the course – a range of students, providers, policy folks and people just simply passionate about improving the delivery of healthcare. It will be a public good for us all to use and improve.

Finally, we have a few enticements to keep everyone engaged. The attrition rate in these courses tends to be high, so we have a few carrots. First, half-way through the course we will have a series of live discussions in which expert faculty will help students solve pressing quality and safety problems in their own institutions. Have a problem with high infection rates in your ICU? We will get an expert on nosocomial infections to help you think it through and figure out how to begin to solve it. Wondering how to keep your family members safe during their hospital visit? We will have healthcare consumer experts help you navigate those waters. Second, at the end of the course, students have the opportunity to submit a 1,200-word thought piece on the importance of improving quality and safety in their own context, whether as a clinician, patient, or health policy expert. The top three pieces will be published in BMJ Quality & Safety, arguably the most influential global quality and safety journal.

This is a grand experiment in a new way of teaching, engaging, and creating information on quality and safety of healthcare. I’m sure there are parts that won’t work, but we will learn along the way. I’m also sure that the pressing issues facing the US – healthcare that is not nearly as safe, effective, or patient-centered as it should be – are similar to issues facing not just other high-income countries, but also low- and middle-income countries. Thinking globally about these issues, and their adaptable solutions, can help us all deliver better care.

Quality needs to be on the global health agenda. Don’t believe me? Take the course.

I recently spoke to a quality measures development organization and it got me thinking — what makes a good doctor, and how do we measure it?

In thinking about this, I reflected on how far we have come on quality measurement. A decade or so ago, many physicians didn’t think the quality of their care could be measured and any attempt to do so was “bean counting” folly at best or destructive and dangerous at worst. Yet, in the last decade, we have seen a sea change. We have developed hundreds of quality measures and physicians are grumblingly accepting that quality measurement is here to stay. But the unease with quality measurement has not gone away and here’s why. If you ask “quality experts” what good care looks like for a patient with diabetes, they might apply the following criteria: good hemoglobin A1C control, regular checking of cholesterol, effective LDL control, smoking cessation counseling, and use of an ACE Inhibitor or ARB in subsets of patients with diabetes. Yet, when I think about great clinicians that I know – do I ask myself who achieves the best hemoglobin A1C control? No. Those measures – all evidence-based, all closely tied to better patient outcomes – don’t really feel like they measure the quality of the physician.

So where’s the disconnect? What does make a good doctor? Unsure, I asked Twitter:

Over 200 answers came rolling in. Listed below are the top 10. Top answer? Having empathy. #2? Being a good listener. It isn’t until we get to #5 that we see “competent/effective”.

Even though the survey results above come from those I interact with on Twitter, I suspect the results reflect what most Americans would want. As I read the discussions that followed, I came to conclude one thing: most people assume that physicians meet a threshold of intelligence, knowledge, and judgment and therefore, what differentiates good doctors from mediocre ones is the “soft” stuff.

It’s an interesting set of assumptions, but is it true? It is, at least somewhat. Most American physicians meet a basic threshold of competence – our system of licensure, board exams, etc. ensures that a vast majority of physicians have at least a basic level of knowledge. What most people don’t appreciate, however, is that even among this group, there are large, meaningful variations in capability and clinical judgment. And, of course, a small minority of people are able to get licensed without meeting the threshold at all. We all know these physicians – a small number to be sure – who are dangerously ineffective. We, the medical community, have been terrible about singling these physicians out and asking them to get better – or leave the profession.

In the Twitter discussion, John Birkmeyer raised a second point that was likely on the minds of many respondents. He said “I’d want different things from my PCP and heart surgeon. Humility. Over-rated for the latter.” John was raising a key distinction between what we want out of a physician (an internist or a family practitioner) versus a surgeon. Yes, in order to be “good”, humility and empathy are important, even for cardiac surgeons. But when they are cutting into your sternum? You want them to be technically proficient and that trait trumps their ability (or lack thereof) to be empathic. Surgeons’ empathy and kindness matter – but they may not be as critical to their being effective surgeons as their technical and team management skills. For internists, effectiveness is much more dependent on their ability to listen, be empathic, and take patients’ values into consideration.

A final point. My favorite tweet came from Farzad Mostashari, who asked: “If your doctor doesn’t use the best data available to them to take care of you, do they really care about you?” In all the discussions about being a good doctor, we heard little about effective use of beta-blockers for heart disease, or good management of diabetes care. That’s the stuff we measure, and it’s important. We use them as part of the Physician Quality Reporting System (PQRS). But I’m not sure they really measure the quality of the physician. They measure quality of the system in which the physician practices. You can have a mediocre physician, but on a good team with excellent clinical support staff, those things get done. Even the smartest physician who knows the evidence perfectly can’t deliver consistently reliable care if there isn’t a system built around him or her to do so.

So, when it comes to thinking about ambulatory care quality – we should think about two sets of metrics: what it means to be a good doctor and what it means to work in a good system. In measuring doctor quality, we might focus on “soft” skills like empathy, which we can measure through patient experience surveys. But we also have to focus on intellectual skills, such as ability to make difficult diagnoses and emotional intelligence, such as the ability to collaborate and effectively lead teams – and we don’t really measure these things at all, erroneously assuming that all clinicians have them. For measuring good systems, we could use our current metrics such as whether they achieve good hypertension and diabetes control. We need to keep these two sets of metrics separate and not confuse one for the other. And, alas, for surgeons, we need a different approach yet. Yes, I still believe that humility and empathy go a long way – but these qualities are no substitute for sound judgment and a steady hand.

March 2nd through the 8th is National Patient Safety Awareness Week – I don’t really know what that means either. We seem to have a lot of these kinds of days and weeks – my daughters pointed out that March 4 was National Pancake Day – with resultant implications for our family meals. But back to patient safety. It is National Patient Safety Awareness Week, and in recognition, I thought it would be useful to talk about one organization that is doing so much to raise our awareness of the issues of patient safety. Which organization is this? Who seems to be leading the charge, reminding us of the urgent, unfinished agenda around patient safety? It’s an unlikely one: The Office of the Inspector General of the Department of Health and Human Services. Yes, the OIG. This oversight agency strikes fear into the hearts of bureaucrats: OIG usually goes after improper behavior of federal employees, investigates fraud, and makes sure your tax dollars are being used for the purposes Congress intended.

In 2006, Congress asked the OIG to examine how often “never events” occur and whether the Centers for Medicare and Medicaid Services (CMS) adequately denies payments for them. The OIG took this Congressional request to heart and has, at least in my mind, used it for far greater good: to begin to look at issues of patient safety far more broadly. Taken from one lens, the OIG’s approach makes sense: the federal government spends hundreds of billions of dollars on healthcare for older and disabled Americans and Congress obviously never intended those dollars pay for harmful care. So, the OIG thinks patient safety is part of its role in oversight, and thank goodness it does. Because in a world where patient safety gets a lot of discussion but much less action, the OIG keeps the issue on the front burner, reminding us of the human toll of inaction.

While the OIG has had multiple important reports in this area, the watershed one was their eye-opening November 2010 report. If you haven’t read at least the executive summary, you should. The OIG looked at care for a national sample of Medicare beneficiaries and what it found was unexpected: 13.5% of Medicare beneficiaries suffered an injury in the hospital that prolonged their hospital stay, caused permanent harm, or even death. An additional 13.5% of Medicare patients suffered “temporary” harm – such as an allergic reaction or hypoglycemia – things that are reversible and treatable, but quite problematic nonetheless. Taken together, these data suggest that 27% of older Americans suffer some sort of injury during their hospitalization – much higher than previous numbers.

There are three more statistics from the OIG report that should give us all pause: First, they estimate that unsafe care contributes to 180,000 deaths of Medicare beneficiaries each year. This is a stunningly high number. Second, Medicare pays at least an additional $4.4 billion to cover the costs of caring for these injuries. And finally, about half of these events are preventable based on today’s technology and know-how. I suspect that if we actually make safety a priority, many more events would become preventable over time. And yet, although hospitals are supposed to identify, study, and track adverse events, the OIG says it mostly isn’t happening. At least not in any systematic way.

This is all old news, of course, so on to new news: the OIG just released another excellent report, this time on harm in skilled nursing facilities (SNFs). While we have paid a lot of attention to acute hospitals, we have generally paid far less attention to what happens when patients leave. And, about 20% of Medicare patients, after discharge, go to a SNF. So, the OIG went looking at SNF care, and what they found is both unsurprising and quite disappointing: during their SNF stay, 22% of Medicare beneficiaries suffered a harm that prolonged their stay, caused permanent harm, or even death. And, an additional 11% suffered temporary harm that could be reversed with a medical intervention. Physician reviewers considered 59% of these events to be preventable and “attributed much of the preventable harm to substandard treatment, inadequate resident monitoring, and failure or delay of necessary care.” These adverse events add an additional $2.8 billion to Medicare spending. And remember, none of these financial calculations include the financial harm patients suffer because of lost work, family members having to take time off to provide additional care, etc.
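To put the two OIG reports side by side, here is a quick back-of-the-envelope tally in Python. The percentages are the ones quoted above; the variable and function names are mine, and adding the two categories assumes that "serious" and "temporary" harm do not overlap, which is how the reports are summarized here.

```python
# Shares of Medicare beneficiaries harmed (percent), as quoted from the OIG reports above.
harm_rates = {
    "hospital": {"serious": 13.5, "temporary": 13.5},
    "snf": {"serious": 22.0, "temporary": 11.0},
}


def total_harm(setting: str) -> float:
    """Add the serious and temporary harm percentages for one care setting."""
    rates = harm_rates[setting]
    return rates["serious"] + rates["temporary"]


for setting in harm_rates:
    print(f"{setting}: {total_harm(setting):.1f}% of beneficiaries experienced some harm")
```

By this rough tally, harm of some kind is at least as common in SNFs as in acute hospitals, which is the point of the newer report.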

It’s been 15 years since To Err is Human and patient safety has gone from a niche topic to something far more mainstream. We now recognize that safety is a huge problem. However, over the past few years, we have seen consistently disappointing data that we aren’t making much progress. It has caused many people to stop trying. Of course, we can’t publicly admit that we are giving up when the human toll is so high. So, instead, we are encouraging “voluntary reporting” that ignores most errors, using metrics to assess performance that don’t really reflect the safety of underlying care, and putting tiny incentives in place that aren’t meaningful enough to really change behavior. In 5 years, when we talk about the 20th anniversary of the To Err is Human report, will we wonder again why we have made so little progress?

The path forward, although difficult, is pretty clear. I’ve previously described a set of proposed solutions but in a nutshell, I think we should do three things. First, measure and monitor adverse events in a systematic and robust way. This is increasingly possible with EHRs and we have described how before. Second, make safety data public. It will catalyze professional ethos, create real competition for safety, and force hospitals to get better. Third, put big incentives on the table so that there is a clear business case for safety. There are lots of ways to do it, and they are well described. And if we actually want to do this, we will have to reform our malpractice system so that these data can’t be turned into information for litigation. Finally, we need to move beyond hospital safety (despite having made so little progress in this arena) and start including safety in a much broader context. As the OIG points out, there are lots of safety problems in post-acute care as well. That’s my wish list for what we need to do. I’m not sure it’s right, and others surely have better ideas. But we can’t be satisfied with our current efforts. And, thanks to the OIG, we are fully aware of the size and scope of the problem.

So, during Patient Safety Awareness Week, we should all take a moment to thank the Office of the Inspector General at HHS for reminding us that patient safety remains a pressing concern. Fixing it, of course, will require tough solutions and a lot of unhappy “stakeholders” who like the status quo. But, as the OIG reminds us, the human and financial costs of waiting are very high.

I was just recently in Guiyang, the capital of the Guizhou province in China and had a chance to visit the Huaxi District People’s Hospital (HDPH), one of the largest “secondary” hospitals in the province. Like the rest of China, it has been gripped by the construction boom, recently opening a new surgery center and revamped medical facilities. They had a terrific EHR from a local vendor — probably more sophisticated than a majority of U.S. hospitals. Despite being in one of the poorest regions of China, the hospital has more money than it knows what to do with (so says its leadership) and is planning further expansion. The source of its wealth? A growing middle class that wants more healthcare services and has the ability to pay for it.

Background on hospitals in China

There are approximately 2853 counties in China across 33 provinces. Each county has a county hospital, a government owned facility that serves the people of that community. When the patient is too complicated to be managed there, he or she is usually transferred to a secondary hospital. Patients who need an even higher level of care are sent to the regional tertiary care hospital. The gatekeeping system is weak – one need not start at the county hospital – and in fact, a majority of the inpatients at HDPH came there directly.

A few years ago, China launched a major health reform with the goal of getting to universal coverage. They got close and nearly every citizen now has health insurance that covers at least part of the costs of their care. The insurance has substantial co-pays and doesn’t cover more expensive drugs and tests. What does this mean for a hospital like HDPH? About 40% of their revenues came from insurance. And, despite being a government hospital, only about 5% of revenues came from the government. The rest? From the patients themselves. This revenue mix is supposedly pretty typical of county and secondary hospitals across the nation. Out of pocket spending remains substantial, despite universal health insurance. In fact, in absolute dollar terms, patients are paying about as much out of pocket now as they were before social insurance kicked in.
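The revenue mix is easy to tally. This small Python sketch uses only the two approximate shares quoted above and treats everything else as patient payments, as the text describes; the variable names are mine.

```python
# Approximate revenue shares for HDPH, in percent, as described above.
insurance_share = 40
government_share = 5

# Whatever is left over is paid directly by patients.
patient_share = 100 - insurance_share - government_share

print(f"Paid directly by patients: about {patient_share}% of revenue")
```

In other words, patients paying directly are the hospital's single largest source of revenue, ahead of insurance.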

Huaxi District People’s Hospital

Outpatient clinics, where a typical appointment might last 2-3 minutes, are by far the biggest source of admissions to the hospital. But the hospital also has an ER. Actually, two: a Medicine ER and a Surgery ER. The patient gets to choose. Unsure about which you need? There is an “Enquiry” nurse who can help. I peppered the one on duty with various clinical scenarios and was impressed with the speed and confidence with which she made decisions. The flow is simple: you choose your ER, you register, pay the fee in cash, and go inside to wait.

In the Surgery ER, I encountered a young surgeon who was smart and technically competent. Based on the symptoms I was faking, he decided I likely had appendicitis and suggested surgery. Appendectomy costs 3200 RMBYs (approximately $528). For a typical person with social insurance, the deductible co-pay would cost approximately 780 RMBYs ($130) while the rest would be covered by insurance. You had to pay before they would prep the operating room for you.
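For a sense of scale, the appendectomy numbers above work out as follows. This is a quick Python sketch using only the figures quoted in the paragraph; the variable names are mine, and the exchange rate is simply the one implied by the two dollar conversions, not an official rate.

```python
# Figures quoted above for an appendectomy at HDPH.
price_rmb = 3200   # posted price
copay_rmb = 780    # what a typical insured patient pays
price_usd = 528    # approximate dollar equivalents given in the text
copay_usd = 130

# Share of the bill that falls on the insured patient.
patient_fraction = copay_rmb / price_rmb
insurer_rmb = price_rmb - copay_rmb

# Exchange rate implied by the price conversion (RMB per dollar).
implied_rate = price_rmb / price_usd

print(f"Patient pays {patient_fraction:.1%} of the bill; insurance covers {insurer_rmb} RMB")
print(f"Implied exchange rate: about {implied_rate:.2f} RMB per dollar")
```

So even with social insurance, the patient is on the hook for roughly a quarter of the price, in cash, before the operating room is prepped.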

On the medical floors, I saw and discussed two patients with the physicians. The first was in the hospital with an Acute MI. He had been there for several days, receiving anti-coagulation. No need for a cardiac catheterization, his cardiologist told me, because he did not have ST segment elevation. Clinically, taking a more cautious approach is reasonable. But, it was also clear that sending him for cardiac catheterization would mean that the hospital would lose a paying customer. No need to do that unless it is absolutely necessary. The clinical plan for him was simple: three weeks in hospital, mostly for observation and medication titration. In the U.S., the average length of stay (LOS) for a non-ST elevation MI is closer to 3 days.

The second patient I saw, a woman, had come in with swollen legs and a cough. She had gone to the community hospital, been treated (incorrectly) for pneumonia and discharged. She hadn’t felt better so she came to HDPH. The patient had been in the hospital for a couple of days and had received an EKG, a metabolic panel and basic blood counts. The plan was to get liver function tests the next day and if those were normal, urine tests the day after. Making a diagnosis and treating her was likely to be a several week process. In the U.S., within the first 24 hours, we would have gotten all of the blood and urine tests, EKG, an echocardiogram, chest x-ray, likely lower extremity ultrasounds, and a trial of treatment with a diuretic. Average length of stay? Probably 3 to 4 days.

The difference in pace and tempo of caring for patients in China and the U.S. reflects the underlying payment approach. In the U.S., we get a single payment for the entire hospitalization, so the goal is simple: get all the tests, start treatment right away (often before we know the diagnosis) and see how the person responds. And most importantly, get them out quickly. Given what we know about the safety of care in hospitals, quick diagnosis and early discharge is hardly a bad thing. In China (and in many other countries with a similar payment structure), it’s hard to get motivated to send people home quickly when each additional day is additional revenue.

But the most interesting part was how they do the billing. Each day, the woman got a bill for her care and had to pay it. What happens if she runs out of money half way through her work-up, I asked? She would be sent home. It’s simple: daily bill, daily pay. No credit. No collection agencies.

A picture of posted prices in front of the ER at Huaxi District People’s Hospital

What can we learn?

While I strongly prefer our approach of hospital payments, there are a couple of useful things in the fee-for-service approach that China uses that are at least worth reflecting on. First, the Chinese doctors seemed far more deliberate in their work-up, getting one test at a time and moving forward after the results are back. The downside of our approach, getting everything at once, is that it leads to a lot of over-testing. And over-testing causes more false-positives, which lead to more tests and procedures. Our payment approach makes being deliberate financially imprudent.

The second observation is that while I found the approach to patient charges – the daily bill and the immediate discharge if the patient cannot pay – jarring (to say the least), there is something strangely honest about it. When I asked one of the physicians about whether it bothered him to send people home who could no longer afford the treatment they needed, he reminded me that in every part of the economy, you have to pay your bill before you receive services. In America, we do all the tests and treatments without thinking about the costs or the payment. We believe we are protecting the patient, but too often we are not. For the uninsured patient who gets the bankruptcy-inducing bill a month after she’s discharged? Not sure we did her any favors by not paying attention to the costs along the way.

China is undergoing a series of social and economic changes that are breathtaking in scope and speed. The changes in its healthcare system are no different. In one generation, these publicly owned hospitals have come to function as for-profit entities (not unlike most of our non-profit hospitals) and there’s a lot here that’s interesting to watch. Patients have lots of “skin in the game” when it comes to payment, and it keeps prices in check. You couldn’t charge $100,000 for an appendectomy the way our hospitals do. Nobody would pay it. But price transparency combined with high out-of-pocket costs is a challenge, because most hospitals are monopolies and there are no data on quality of care. It makes it difficult for patients to be consumers. The Chinese policymakers are beginning to address this by pushing for more competition among hospitals (primarily by allowing in more private hospitals). My hope is that they add some quality/outcomes data as well. Then, we can have a much better sense of the degree to which patients can function as consumers, and whether their approach can lead to both low costs and high quality care.

The most commonly heard comment in healthcare these days is that we have to move from paying for volume to paying for value. While it may sound trite, it also turns out to be pretty true. Right now, most healthcare services are paid for on a fee-for-service basis – with little regard for the quality of that service. We clearly need to move towards value-based payments (sometimes referred to as pay-for-performance or P4P).

Although a few folks remain skeptical about whether VBP/P4P can work (as though our pay for volume strategy is working out so well), asking whether we should pay for volume versus pay for quality no longer seems like a particularly interesting question. The far more compelling and difficult question is how best to pay-for-performance? As I have written before, we need bold experiments with new payment models that employ three key principles: putting real money on the table, focusing on outcomes, and keeping the reward system simple (i.e. the better you do, the more you should get).

The key question for VBP is whether it will work – whether patients will be better off because of it. We don’t know and realistically, we won’t for another year or so. But what we do know is that 2 years into the program, certain hospitals seem to be doing well and others, not so much. Yes, the incentives are small and my guess is that any impact will be very modest as well. But, it’s still worth taking a look at how different types of hospitals are faring under VBP. So we ran some numbers.

What we did (Methods)

We took the latest release of the CMS data on how much each hospital is penalized. We linked these data to the American Hospital Association annual survey, which provides us with a series of hospital characteristics and the Medicare Impact File, which tells us a hospital’s Disproportionate Share Hospital (DSH) Index. The DSH Index is widely used to measure the percentage of patients who are poor and high DSH Index hospitals are often referred to as safety-net institutions. We then examined “bivariate” relationships – like whether size of the hospital, ownership, or DSH Index was associated with a higher or lower VBP penalty. Finally, we built a multivariable model to see if, holding other characteristics constant, certain types of hospitals seem to do better than others.
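For readers who want to see the shape of the linkage and bivariate steps, here is a minimal Python sketch. It is not the code we ran: the hospital IDs, field names, and numbers are made-up toy values for illustration only, and the multivariable step is left out because it would normally be done with a regression library.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy rows for illustration only; NOT real CMS, AHA, or Impact File data.
# Penalty is a hypothetical percent of Medicare payments (negative = bonus).
penalties = {"H1": 0.25, "H2": -0.10, "H3": 0.30, "H4": 0.05}
characteristics = {
    "H1": {"ownership": "public", "dsh_quartile": 4},
    "H2": {"ownership": "private", "dsh_quartile": 1},
    "H3": {"ownership": "public", "dsh_quartile": 4},
    "H4": {"ownership": "private", "dsh_quartile": 2},
}


def link(penalties, characteristics):
    """Join the two sources on hospital ID, keeping hospitals present in both."""
    return [
        {"hospital_id": hid, "penalty": pen, **characteristics[hid]}
        for hid, pen in penalties.items()
        if hid in characteristics
    ]


def mean_penalty_by(rows, key):
    """Bivariate summary: average penalty within each level of one characteristic."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["penalty"])
    return {group: mean(values) for group, values in groups.items()}


linked = link(penalties, characteristics)
by_ownership = mean_penalty_by(linked, "ownership")
print(by_ownership)
```

The bivariate summaries are just group averages; the multivariable model asks the same question for each characteristic while holding the others constant.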

What we found (Results)

Interestingly, we found that some things don’t matter that much — hospital size and teaching status just don’t have much of an effect. Small hospitals did about as well as large ones and major teaching hospitals showed up in each of the performance quartiles about equally often. There seem to be moderate regional differences: hospitals in the Northeast and West do not do as well as Midwestern hospitals.

What’s interesting, and possibly most challenging, is that public hospitals and safety-net hospitals – those in the highest quartile of DSH Index – tend to do worse. They are losing money on VBP (see first column of results in table). Because many of these hospital characteristics are overlapping (major teaching hospitals are usually large, public hospitals often have high DSH Index, etc), the table also displays results from a multivariable model. The story is similar: hospitals in the Northeast and West get bigger penalties, as do public hospitals and those in the highest quartile of DSH. It’s additive. A large urban public hospital in the Northeast with a high DSH Index gets an average penalty of about 0.30% of its Medicare payments. Is that a lot? No – but it’s not irrelevant for a safety-net hospital that may be operating with razor thin financial margins.

Table: Average total penalty, unadjusted and adjusted for VBP 2014

* Adjusted for all other variables in the table. For example, public hospitals will get an additional 0.10% penalty, holding size, teaching, urban, region, DSH Index and proportion of Medicare patients constant.

So how do we interpret this? Is the VBP program disproportionately penalizing safety-net hospitals? Yes. Is it unfair? We don’t know. We don’t know why public, safety-net hospitals do worse on VBP. My suspicion is that much of the difference is driven by differences in patient experience scores. The challenge for all of us is to understand why safety-net hospitals generally have worse patient experience scores. Is it that poorer or minority patients are just less likely to give high scores on patient experience? Or are safety-net hospitals not doing as good of a job on patient-centered care? Until we know, we must be careful declaring that this is an unfair playing field. This is in contrast to the medical readmissions measure, where we know that so much of what drives a readmission is about what happens after the patient leaves the hospital (resources at home, access to effective primary care, etc). If you have more poor patients, your readmission rate will likely be higher. In the readmissions program, creating a more even playing field, as MedPAC has suggested, is a good idea. For VBP, the jury is still out.

So – VBP is heading into year 2 – and we can see that some hospitals are doing well while others are struggling. What we don’t know is if this program is fundamentally changing the way hospitals are working on quality. Over the past few years, I’ve spoken to a lot of hospital leaders. For most, the VBP measures remain a “checkbox” item – something they invest just enough time and energy in to hit their marks, but not much more. As more outcomes measures are included in VBP, I hope that mentality changes and VBP becomes part of a broader agenda to make quality and value central to how we pay for healthcare.

In my previous blog, I made the argument that whatever strategy we use to improve care in hospitals will not be implemented and executed well without proper focus by hospital leadership. So, it is in this context, that we recently published some pretty disappointing findings that are worth reflecting on.

We examined the pay of CEOs across U.S. hospitals and found that some CEOs got paid a lot more than others. This was not surprising. CEOs of larger, urban, teaching hospitals get paid a lot more than CEOs of small, rural, non-teaching institutions. But the disappointment was around quality: we found no relationship between a hospital’s quality performance and the pay of the CEO. Holding size, teaching, and other factors constant, what was the pay of CEOs of hospitals with high mortality rates? About the same as CEOs of hospitals with low mortality rates. What about other quality measures? Most of them didn’t really seem to matter, with the exception of patient experience, which correlated nicely with CEO compensation. It seems that when setting CEO compensation, patient outcomes are not a big part of the discussion. How could this be, and why does it matter?

How you set incentives for senior managers says a lot about your priorities. Boards generally set the salary for their CEOs and they clearly reward patient satisfaction scores. That’s good. They also seem to reward the things that build hospital reputations: having the latest technology such as a PET scanner or academic status. But are boards rewarding CEOs based on mortality rates or adherence to basic quality metrics? Not so much. Why not? I’ve spoken to a lot of board chairpersons over the years and the answer is not that they don’t care. Most boards want to reward quality and believe that they do. The problem is that most board members lack sufficient expertise on quality metrics and can’t decipher, from the large number of quality metrics, which ones are important (like mortality rates) and which ones are not. Hamstrung, they focus on satisfaction but also end up rewarding things that feel like proxies for quality, such as having the latest technology. And here’s the part that’s frustrating – our national efforts on quality measurement and improvement are not helping. We seem to have done very little to prioritize what’s really important, and shine a light on them.

So what do we do to move forward? Some states have started requiring that boards undergo training in quality. Medicare, as a condition of participation, could certainly require that boards (or at least some members thereof) show a degree of expertise with quality. I like these ideas but worry that training programs would themselves be of variable quality, and for some boards it would become an onerous requirement without achieving real gains in expertise.

Of course, if we really want to help boards be more effective and engage healthcare leaders, the biggest thing that we could do is actually reward hospitals, in a meaningful way, based on quality. Yes, we have the value-based purchasing program, and it is well-intentioned. But, as I’ve written before, it has several big problems. First and foremost: the incentives are very weak and there is little reason to believe it will have a meaningful impact on patient outcomes. Second, the measures are diffuse – we have too many of them, some of which matter (mortality) and many of which don’t in the absence of the appropriate clinical context (checking the ejection fraction on a heart failure patient). It’s hard for hospital boards to really get a clear signal on what matters if they aren’t seeing it clearly and consistently from national leaders on quality.

So how might we move forward? I’d like to see, from CMS and other payers, strong incentives tied to patient safety, such as low hospital-acquired infection rates and patient outcomes (i.e. low mortality). That would send clear signals to boards that their chief executives need to be focused on what matters to patients. If the incentives are sizeable enough, and the metrics clear enough, boards will take notice and have clearer guidance for where to focus their efforts to hold management accountable.

The bottom line is that leadership matters enormously. Leaders set priorities, create the culture, and define what constitutes success for the organization. Currently, as I often hear Don Berwick say, we have a system that is perfectly designed to give us the results that it does. We can do better. Too often, we look to the Virginia Masons and the Intermountains of the world and say that if they can do it, anyone can. That’s fundamentally not right – they do it despite the fact that the incentives are stacked against them. We need to build a system for the ordinary, and not the extraordinary, CEO – those leaders who, despite commitment and the best of intentions, prioritize the things that their incentive structure tells them to prioritize. And remember, these organizations, run by ordinary CEOs, care for a vast majority of Americans. And the job of boards and policy leaders should be simple: align the incentives so that hospitals and their leaders can really focus on doing what’s good for patients.