What It Takes To Make Health Care Reliable
I don’t have to reiterate the degree of errors, defects, adverse events, harms and deaths due to unreliable health care processes. I don’t have to reiterate the incredible inefficiencies in the current health care systems, whether the delivery of care or the financing of health care (like insurance companies and government health care programs like Medicare and Medicaid). I don’t have to because you can easily Google umpteen references to these things.
And this is a huge problem: all of the references you can Google are ignored. Ignored by health care. Ignored by Health Insurance Companies. Ignored by government agencies.
Oh, sure, they give lip service to it and relate “ways” they are addressing these things, but the interventions are weak and not nationally organized.
Sure, they can relate some interventions that are somewhat successful, like reducing central line infections in ICUs. But what about PICC lines put in elsewhere? Why haven’t there been significant reductions of adverse events and deaths nationally overall? You’ll see all of these articles about saving money, but the Medical Price Index is always 3-5 times the Consumer Price Index and cost is twice as high as any other nation’s.
It’s because the current cultures and leaders of health care don’t have the will to make where you get your medical care reliable.
In my previous posts, I’ve gone over a number of these things, so, now it’s time to offer solutions.
So, here they are.
Foundational premises:
- It starts at the top.
- It has to be inclusive.
- It has to not promote fear and reprisals.
- It has to promote learning.
- It has to promote listening.
- It has to use data wisely (data transformed into actionable information).
- It has to be intolerant of less than excellent execution.
First, look at this book cover, you’ll be seeing it again:
The number one national priority
The number one national priority for making the health care industry, and keeping it, as reliable as companies like Toyota and Motorola is:
Reinstate Certificate of Need (CON). Place a moratorium on building new buildings and expanding health care services (unless there is a validated shortage that meets CON requirements) and funnel the money currently being spent on expansion into creating reliable and safe health care processes.
For example, it costs (depending on the number of beds) between $200,000,000 and $500,000,000 to build a hospital. But hospital care processes are known to be inefficient and error-prone, so, the money is being spent to perpetuate the health care unreliability status quo. No hospital system has ever devoted (let’s use the low end) $200,000,000 to radically improving the reliability, efficiency and safety of the care processes in a hospital. Never. Not even over a two-year period (which you could argue is the time it takes to build a new hospital).
Until the US nationally decides to do something like this, it will be a long time (or perhaps never) before our health care system has an error rate similar to Toyota.
Leadership solutions
I already gave you an example of leadership that demands excellence and reliability (the Sakichi Toyoda story in a previous post). This mentality is absent in health care. Leaders don’t champion the types of tools and measurement systems that Toyota does. Companies that adopt the Toyota Production System (TPS) excel in reliability and competitive cost of product. Take Hyundai, which was producing less than mediocre automobiles in 1993 and now rivals Toyota in reliability and cost. They adopted the TPS, and it was due to relentless leadership from the CEO. (You should read the Hyundai Story.) How about Jack Welch’s adoption of Six Sigma at GE? Similar story. Leadership driven.
Health Care leaders are too invested in making profits their legacy way (cost plus) and building new buildings and services, which has been the core competency of the health care industry leadership for 100 years (instead of safely and efficiently taking care of patients). So they are selected or self-selected for that purpose, not for making sure you get the right care at the right time in the right place without errors or defects or harm.
Here’s a typical example of that legacy way of making a profit from the American Hospital Association Underpayment Fact Sheet, 2015:
“In the aggregate, both Medicare and Medicaid payments fell below costs:
- Combined underpayments were $51 billion in 2013. This includes a shortfall of $37.9 billion for Medicare and $13.2 billion for Medicaid.
- For Medicare, hospitals received payment of only 88 cents for every dollar spent by hospitals caring for Medicare patients in 2013.
- For Medicaid, hospitals received payment of only 90 cents for every dollar spent by hospitals caring for Medicaid patients in 2013.
- In 2013, 65 percent of hospitals received Medicare payments less than cost, while 62 percent of hospitals received Medicaid payments less than cost.”
So, for Medicare, they were -12% and for Medicaid they were -10%. And they get that “extra” charge for that “Facility Fee” (See my post about “Prices”). Their strategy is to lobby Medicare and Medicaid for higher payments. Here’s a recent example of this lobbying strategy:
(Source: Modern Healthcare August 3, 2017)
Where is the AHA’s relentless program for reducing waste = reducing costs?
Nowhere. It’s a cost plus strategy.
Didn’t I already explain that experts from places like the Rand Corp. and the Institute of Medicine have estimated that there is 30-40% waste in the current system? These hospitals aren’t UNDERPAID! They have too much waste in their processes and their way of doing business. It costs too much for them to provide the services. Even if you took the lowest waste estimate (30%) and halved it, you would get a savings of 15%, which means the hospitals would have a 3-5% positive margin on Medicare and Medicaid if they eliminated waste. But, instead, they have a VICTIM’s mentality. Ooohh, we’re underpaid! Poor us! Pay us more!
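Here is that margin arithmetic as a minimal sketch. It assumes payments stay exactly where the AHA fact sheet puts them (88 and 90 cents per dollar of cost), that the 15% waste cut comes straight off cost, and that margin is expressed as a share of the original cost. The function name is made up for illustration.

```python
def margin_after_waste_cut(payment_per_cost_dollar, cost_reduction):
    """Margin, as a fraction of the ORIGINAL cost, once cost is trimmed.

    payment_per_cost_dollar: what the payer pays per dollar of current cost
    cost_reduction: fraction of current cost removed as waste (assumed)
    """
    new_cost = 1.0 - cost_reduction
    return payment_per_cost_dollar - new_cost


# 2013 AHA figures quoted above; the 15% cut is half of the 30% estimate.
medicare = margin_after_waste_cut(0.88, 0.15)
medicaid = margin_after_waste_cut(0.90, 0.15)
print(f"Medicare: {medicare:+.0%}, Medicaid: {medicaid:+.0%}")
```

Measured against the new, lower cost instead of the original cost, the percentages come out slightly higher, so the 3-5% range is the conservative reading.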
How about getting busy and investing in efficiency and safety to be able to make money on Medicare and Medicaid instead of building that new hospital in the next suburb?
There is actually a company called Premier that helps hospitals try to be less costly. They broker contracts for better rates for hospitals for supplies. They can show hospitals where there might be cost opportunities. They have a program called “Medicare Breakeven” that started in 2013, whose goal is to help hospitals re-design for efficiency and reduce costs so they don’t lose money on Medicare. If hospitals break even on Medicare, they have a +2% margin on Medicaid.
Premier partners with almost 500 hospitals across the US. I wonder how many hospitals are enrolled in the program. I wonder if they actually made it to breakeven. I’ve never seen an article about it. I’m thinking, no one has made it yet or there would be press releases about it. I know the hospital system I was in when I heard about this decided to decline to participate, and they were losing 12% on Medicare.
They need to Lean out their processes, not get paid more.
Solution 1. The Boards of Directors of health care institutions must start demanding excellence and reliability and drawing C-suite leaders from industries that have a proven record in safety and reliability. They must mandate the elements of being a reliable organization as described by AHRQ. Otherwise, nothing will change in health care. This is called demanding a culture of reliability.
MBA colleges and universities need to start imbuing their students with product reliability and how to build and maintain reliability cultures, not just how to read spreadsheets and optimize profits. Maybe they all should have to certify as Lean Six Sigma Black Belts to get their MBA.
Here’s an example of what goes on now. Financials rule.
I was in an integrated delivery system. It was March and the Finance Department and CFO presented financial projections to the Senior Executives. Their findings projected that the organization would miss its profit margin by 1%, which in that company was $13,000,000. Take note that I’m saying missing their margin, not losing money. Let’s say they budgeted a margin of 4%. The projections said it would be 3%, so, $39,000,000 profit instead of $52,000,000.
The CEO, CFO, CMO and COO called all of the next level “down” leaders (so, the VPs) into a room and told them that, as of the next pay period, their salaries would be cut by 10% (which apparently approximated $13,000,000) and they couldn’t get that money back unless they reversed the projected $13,000,000 shortfall PLUS ANOTHER $1,000,000!! Luckily, that happened and the VPs were “trued-up” in the first quarter of the next year.
In my entire 45+ year career in medicine, I have never seen a health care system dock people’s pay because of not meeting a clinical goal, including goals for things like the Mortality Rate (yes, YOU DYING!!), or NQF Serious Harm Events, or Readmission Rate, etc.
If a health care system had a culture of clinical reliability, they would use the same tactics they use to make profits and grow and apply them to clinical measures. Like cutting people’s pay for missing a clinical goal. I don’t believe that has ever happened in health care. Not even places like the Mayo Clinic.
Here’s another one. I’m a Medical Director in a hospital. They are having a terrible time getting patients from the Emergency Department (ED) into the inpatient floors where they really belong. Patients are staying in the ED waiting for a bed. They call these patients “ED Boarders”. Some of them are in there for 1-2 days waiting. There is evidence like the below related to patients boarding in the ED who should be in an inpatient bed:
“Mortality generally increased with increasing boarding time, from 2.5% in patients boarded less than 2 hours to 4.5% in patients boarding 12 hours or more (p < 0.001). Mean hospital LOS (Length of Stay) also showed an increase with boarding time (p < 0.001), from 5.6 days (SD ± 11.4 days) for those who stayed in the ED for less than 2 hours to 8.7 days (SD ± 16.3 days) for those who boarded for more than 24 hours. The increases were still apparent after adjustment for comorbid conditions and other factors.” (“The association between length of emergency department boarding and mortality”; Singer, AJ et al; Acad Emerg Med. 2011 Dec;18(12):1324-9.)
We actually validated this with our own data, which said people staying in the ED longer than 10 hours had a higher mortality rate and length of stay.
So, I’m in a strategic planning meeting in the Fall with all of the hospital’s Senior Leaders. They are proposing what the “threshold” measure (a goal that must be met) should be for Senior Leaders to get their annual bonuses. I proposed that we should choose “Zero boarders in the ED for longer than 10 hours.” I figured this would galvanize the Senior Leaders to analyze and process improve as a company-wide team to decrease inpatient throughput time and reduce the time boarders were in the ED, TO SAVE LIVES! I was the only leader to propose a clinical measure for bonuses. Did they choose that? Nope. They chose a financial measure. They chose a threshold within their core competency bandwidth. Not to be crass, but, basically guaranteeing they will get their bonuses. Moving a clinical measure was too risky.
Solution 2. Senior Leaders must spend time “on the workplace floor” a significant and specified amount of time to truly experience what the systems and processes in their organizations are like.
Senior Leaders tend to be reluctant to spend time on a clinical floor to actually experience what is going on. This includes physicians who take on Medical Director roles. For me, it was an easy decision to continue to see patients even while I assumed Medical Director roles, even to the C-suite level. I actually made the HR folks write it into my job description. I know many physicians who, once they assumed Medical Director roles, never saw another patient. I actually had a colleague of mine who called me one day because the company he was working for re-organized and his Medical Directorship was eliminated. He was asking if I knew of any Medical Director positions that were available. I didn’t know of any. He was disappointed. I said, “Well, you’re a board certified ER physician, right? You can work in an ER until you find something as a Medical Director, can’t you?” He said, “Oh no, I haven’t been in an ER or seen a patient for 20 years! I need a Medical Director position.” I thought to myself, “Whose fault is that?”
This was not an isolated incident. I would say 75% of physicians who get to the “full time” Medical Director level stop seeing patients. One physician was hired into a Chief Medical Officer position where part of his role required applying medical judgment, which in turn required a license to practice. When he applied for the license, the Medical Board declined it: he hadn’t seen a patient for so long that they felt he might not be competent anymore.
The nurses are worse. I’ve NEVER seen a nurse who got into a full time Nurse Director or “above” position spend any time doing nursing on a clinical floor, be it inpatient or ambulatory. Never.
And as far as the MBA’s (or MHA; Masters of Healthcare Administration), spending time on a clinical floor, or even an administrative floor, is viewed as equivalent to cleaning out septic tanks. Once they get promoted, they never want to “go back”. As a matter of fact, it’s hard to get them to do anything except hang out in their offices and go to meetings. Some will do “walk arounds” where they schmooze with staff then disappear. They are pretty clueless as to the realities of what is going on in their companies. Very different from the CEO of the Ritz Carlton who spends at least a week a year going to one of the Ritz Carlton hotels and doing a “job a day”, like being a housekeeper one day, doing room service one day, working in maintenance one day, etc. I really admire that guy for that.
Here’s a story exemplifying this. I was a Medical Director in a hospital system where the Board Quality Sub-Committee tasked the leadership with reducing the mortality rate, which had been unacceptably high for years, despite several attempts to reduce it. I told the senior leaders I knew how to move that dot. They asked me to come back in two weeks with a plan, which I did. One of the elements was requiring all of the leaders at a Director level and “above” to spend 4 hours twice a month actually doing either what they were trained to do (like nurses doing nursing, physicians seeing patients, MBA’s being ward clerks, MHA’s being receptionists, or intake persons, etc.) or shadowing a clinical or administrative person if they felt they were now incompetent due to absence from doing the job they were originally trained to do or had done in past positions. (I actually had Medical Directors reporting to me whom I had asked to consider seeing patients for four hours a week who told me they didn’t want to do it because they hadn’t seen patients for so long they felt they would be dangerous to patients. Sad.)
Anyway, at the next meeting they had the previous meeting evaluation comments at the back of the meeting packet. There were several comments next to the agenda item that included my plan that said things like “There is no way I can spend 4 hours every two weeks doing line staff work. I’m too busy” or “I don’t see how my spending time doing line staff work would be productive”, etc.
See, no appetite to experience the dysfunctional processes they created or maintained. I was thrilled when Undercover Boss came out a year or so later. I revisited the subject using the experiences of the bosses on Undercover Boss as leverage, but, alas, I got the same responses.
I’m sorry, but, unless you are personally experiencing what patients and staff are going through, you will never feel you are on a burning platform and motivate the company into making the changes that need to be made.
Solution 3. Mandated adoption of models like Lean and Six Sigma by certifying and accreditation agencies.
As noted in my previous post “The Real Story Behind the Opioid Epidemic”, The Joint Commission was quick to mass-mandate pain as a fifth vital sign and ramp up the opioid epidemic leading to thousands of deaths a year, but they fail to mandate a relentless focus on reliability and safety. Why won’t they mandate implementation of Lean or Six Sigma? Why won’t state hospital certifying agencies, Medicare or Medicaid mandate this? Why didn’t the Affordable Care Act mandate this? Oh, that’s easy, the ACA was pretty much a financing law, not a “let’s make health care affordable by making health care more efficient and reliable”. (It’s all about the money, again.)
By the way, Lean and Six Sigma include The Improvement Model in their foundations. Using The Improvement Model alone, which is an incremental improvement model, it would take 30-50 years to get to the degree of reliability you can get with Lean and Six Sigma in 5-10 years if they are implemented correctly. Cases in point: The Institute for Healthcare Improvement, which primarily relies on hospital system “collaboratives” using improvement models like The Improvement Model, has had pretty negligible impact. They totally missed their 5 Million Lives Campaign goal. Amusingly, they never even figured out how to measure it! And Intermountain Healthcare has used and taught The Improvement Model for 30 years. Their reliability rate is nowhere near Toyota’s or that of the nuclear or airline industries.
Remember this?:
This book is about mistake proofing. Mistake proofing virtually eliminates errors and defects. There are industries in the US where this book is a “bible” for their safety and reliability programs. Mistake proofing is a key element in Lean. It’s called Poka-Yoke.
“A poka–yoke is any mechanism in a lean manufacturing process that helps an equipment operator avoid (yokeru) mistakes (poka). Its purpose is to eliminate product defects by preventing, correcting, or drawing attention to human errors as they occur.” (Poka-Yoke – Wikipedia)
You especially want to use mistake proofing when 1. A human is a step in the process and, 2. A defect causes a catastrophic event (like a train wreck).
Here’s an example of the difference between Six Sigma and Lean with mistake proofing:
Putting your car in gear or starting your car engine: The difference between Six Sigma and Mistake Proofing.
As of 2015, 263.6 million vehicles were registered in the US. Assume that each vehicle moves twice a day. That results in 527.2 million opportunities a day for a car to lurch forward or backward if a car is in gear when the engine starts or with the engine running as the transmission is engaged.
Six Sigma means you’ve improved a process so that there are only 3.4 defects or errors per 1,000,000 opportunities. If the process of starting your car or putting the car in gear was refined to the six sigma level, then the formula would be 3.4/1,000,000 x 527,200,000 = about 1,792 defects per day. In other words, a car would lurch forward or backward, potentially causing injury or property damage, about 1,792 times a day, or roughly 654,000 times a year.
Mistake proofing eliminates this potential. You cannot start your car or put it in gear unless you have your foot on the brake. Number of errors a year = 0.
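To make the idea concrete, here is a toy sketch of that kind of interlock in code. The class and method names are hypothetical and this is not how any real vehicle controller is written. The point is only that the unsafe action is impossible, not merely discouraged.

```python
class InterlockError(Exception):
    """Raised when an action is attempted without the brake pressed."""


class Car:
    """Toy model: the brake pedal gates both starting and shifting."""

    def __init__(self):
        self.engine_on = False
        self.gear = "P"

    def start_engine(self, foot_on_brake):
        # The mistake-proofing step: no brake, no start.
        if not foot_on_brake:
            raise InterlockError("press the brake to start the engine")
        self.engine_on = True

    def shift(self, gear, foot_on_brake):
        # Same gate on the gear selector.
        if not foot_on_brake:
            raise InterlockError("press the brake to change gear")
        self.gear = gear


car = Car()
try:
    car.shift("D", foot_on_brake=False)
    blocked = False
except InterlockError:
    blocked = True
print(blocked, car.gear)
```

Notice there is no training, reminder or checklist in there. The process itself refuses to proceed, which is what separates mistake proofing from "educating the staff."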
If you applied Six Sigma to surgeries:
There are about 23,982,000 surgeries in the US per year. If you could “six sigma” surgical processes to avoid errors/defects, and assume that the maximum errors per surgery is one, that would be only 81.5 surgical errors a year in the US. For 5500 hospitals. Wouldn’t you want that as a patient?
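If you want to check the Six Sigma arithmetic for the car and surgery examples yourself, here is a small sketch. It assumes one error opportunity per car start and per surgery, as the text does, and the function name is just for illustration.

```python
SIX_SIGMA_DPMO = 3.4  # defects per million opportunities at six sigma


def expected_defects(opportunities, dpmo=SIX_SIGMA_DPMO):
    """Expected defect count for a given number of opportunities."""
    return opportunities * dpmo / 1_000_000


# Opportunity counts taken from the text above.
car_defects = expected_defects(527_200_000)
surgery_defects = expected_defects(23_982_000)
print(round(car_defects), round(surgery_defects, 1))
```

Swap in any other opportunity count or defect rate to see how quickly "rare" errors add up at national volumes.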
Here’s how weird it can get in health care. About 15 years ago, McKesson developed and released a robot to dispense medications in hospitals. The way the robot works is, medications are loaded into the machine in unit dose packets. Each packet has a UPC code on it that has the medicine name (like aspirin), the type (like tablet) and the dose (like 325 mg). The robot is 99.999% error free and the 0.001% error is when the robot occasionally drops a packet, not that it picks the wrong packet (it then goes back and picks the next same, correct packet and delivers it, so the error never reaches the patient). It turned out that, at the time the robots were deployed, no drug company supplied drugs in unit dose packets carrying the elements the robot needed to pick the packet. The UPC codes were on the box that contained unit doses, or the drug only came in bottles with 30-90 tablets. And the UPC codes only included inventory control information. In order to have packets to load into the robot with the proper UPC coding, a person had to manually put the medicines into packets and create and print UPC codes to stick on the packets. Since a human doing only one task, on average, has a 2 sigma level error rate (5 errors/100 opportunities or 50,000 errors/1,000,000 opportunities), the entire robot process was only 2 sigma instead of better than six sigma.
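A rough way to see why the manual step drags the whole thing down is to chain the two error rates together. This sketch assumes the steps fail independently and counts the robot’s dropped packets as errors purely for illustration, even though they never reach the patient. The names are hypothetical.

```python
def combined_error_rate(step_error_rates):
    """Chance that at least one step in a serial process goes wrong,
    assuming each step fails independently of the others."""
    success = 1.0
    for rate in step_error_rates:
        success *= 1.0 - rate
    return 1.0 - success


manual_packaging = 0.05   # 5 errors per 100 opportunities (from the text)
robot_picking = 0.00001   # 0.001% (from the text)

overall = combined_error_rate([manual_packaging, robot_picking])
print(f"{overall * 1_000_000:,.0f} errors per million opportunities")
```

The overall rate is essentially the human rate. Making the robot even better changes nothing until the manual packaging step is fixed.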
I actually emailed C. Martin Hinckley, the author of Make No Mistake!, asking him if mistake proofing could be applied to health care processes. Here is his email response:
“In general, the level of mistake-proofing in health care is minimal or poorly implemented, but desperately needed. Perhaps a few examples can illustrate.
Observing a nurse insert an IV, the materials were laid out in preparation. The order of executing the cleansing was done incorrectly twice. Each time the nurse had to go to a storage closet on the other side of the floor to retrieve the correct materials, and start over again. This added 10 minutes to the execution of the task. If someone had not been observing the process (even though I am not familiar with the correct sequence), it is not clear whether or not the nurse would have retrieved additional materials until the process could be executed correctly. The key problem is that the sequence is not an obvious part of the task. The correct solution would be to have an IV insertion kit, where each required item is presented in the correct sequence. Ideal would be a folded kit, where one and only one item can be accessed at a time, and then only in the correct order. Although there is a minor packaging cost to present the materials in this way, the risk of doing the task incorrectly is dramatically reduced and the elimination of wasted motion and materials more than pays for the expense.
Observing phlebotomists, we watched as labels for vials were dropped (without detection by phlebotomists) as they departed to collect samples. At times, they selected the incorrect specimen label, or used the incorrect specimen vial. In a stat situation, this can result in delayed testing, and life threatening consequences. A better solution that would be relatively inexpensive would be to upload a prioritized list of draws to a bluetooth device. When the phlebotomist scans the patient’s armband, a portable printer (weighing just a few ounces) prints the exactly the correct number of labels, where each label has a symbol linked to the correct vial to use for each label. Because the labels are only available for each patient it is virtually impossible to select an incorrect label. All required specimens are collected, and errors in using the wrong vial can be completely eliminated.
EMAR carts represent an example of an extremely poor implementations. The intent is that the nurse scans the meds and the patient’s armband to assure that the meds are consistent with the patient needs. Unfortunately, moving the carts into the room creates a major problem for nurses. It takes longer than traditional methods, and if an urgent situation arises, they can not leave the cart unattended without securing the meds. To avoid these problems, nurses scan the barcodes in the patient books at the nurses station, circumventing the entire purpose of the EMAR carts since the nurse may then deliver the meds to the wrong patient. Good mistake-proofing always makes the process easier. As an alternative to an EMAR cart would be a small secure box prepared for each patient. The nurse carries the small box with a barcode scanner to the patient’s bedside. When they scan the patient’s armband, the scanner provides a bluetooth signal which is detected by the box. If the box matches the patient ID, the box opens. Now, the nurse doesn’t have to push a cart around. The equipment and use is significantly simpler. In an urgent situation they can simply close to box to the secure the meds and leave it in the room (rescanning the correct patient armband reopens the box).
We have observed similar problems in Clinical Chemistry, in OR with tool packs, in Emergency Rooms, in the arrangement heart catheters, in cleansing of scopes for colonoscopies and so forth.
It is generally quite difficult for implementers to translate the mistake-proofing techniques from another industry into their own environment. To illustrate, an automotive example at first glance may seem to have little or no application in healthcare. However, the principles that prevent omitted parts in an automobile assembly are generally the same as those that prevent an omitted part in an OR tool pack! This is the reason that classifying the problem is so essential. Use the classification scheme to identify the outcome you are trying to prevent, such as an omitted part. Then review solution principles for that classification to see if you can find one that you can use. Look up examples for the selected principle, and try to translate the solution into your environment.”
I tried for eight years in one healthcare system to get mistake proofing incorporated into every improvement effort, to no avail. I even worked with a process improvement engineer from a manufacturing company that used C. Martin Hinckley’s book to create an easily used solution tool (we called it the “solution starter”). It turns out there is a “hierarchy” to mistake proofing. Here is a copy of that tool with the hierarchy, with examples in the right column:
I’m here to tell you that the majority of health care interventions in process improvements are in the low impact section of the above chart. Health care needs more high impact solutions.
Making improvements in health care processes currently looks like this:
An error occurs that makes it to the attention of the risk management department or the quality department. It’s usually something big, like a sponge or instrument left in a patient, a wrong side surgery, or a patient seriously harmed by something a staff person or a doctor did or ordered. As per Joint Commission standards, a root cause analysis is done to determine what could have been done differently and the quality committee or department recommends a change in the process. The staff is “trained” (again, really educated) and it is presumed the recommended changes are happening. There is no measurement of compliance with the new process to ensure a) that the staff are compliant with the new actions and b) that the new process step change actually worked.
Plus, it is one hospital’s answer to a national systemic problem. So, there are 5500 different ways people are trying to make things better. Wonder why it is taking so long and progress is so slow?
What should be done is like what happened with Central Line Infection Prevention (except they didn’t go “all the way”). Dr. Pronovost and his team proved that by using a certain protocol, central line infections could approximate zero incidence. What didn’t happen is a national mandate of the protocol. Adopting it is voluntary. I guarantee you not every hospital in the US is following that protocol.
To prove my point, read “Wrong Site Surgery”; Kathryn E. Engelhardt, MD; Cynthia Barnard, PhD, MBA; Karl Y. Bilimoria, MD,MS; JAMA November 28, 2017 Volume 318, Number 20; pp. 2033-2034. Here’s the URL so you can read it. https://jamanetwork.com/journals/jama/article-abstract/2664463. Notice how often they say what they recommend “may” do something. No proof. They admit that their interventions are solely relegated to their hospital/hospital system. I actually can’t figure out why JAMA would even publish this weak article except their editors are probably as clueless about what it takes to make true improvements as the rest of health care leaders.
So, here’s a synopsis of why the current Joint Commission model doesn’t work:
1. There usually isn’t a measurement system set up to monitor and prove success.
2. The intervention is hospital or hospital system specific and is usually created and implemented by process improvement amateurs.
3. Most interventions are weak and people based, meaning prone to error, distraction and apathy.
4. Interventions are not nationally organized and are always voluntary/optional.
Until this changes not much progress will be made. What is needed is funding similar to the way biomedical research is funded, maybe through the NIH, and process improvements, like the central line infection protocol, when proven to be truly effective, should be mandated for every hospital in the nation.
Solution 4. Change the health care Quality Committee make-up
This category is especially irritating to me. With few exceptions, health care Quality Committees are made up of the following types of individuals: Board of Director members, C-Suite Execs, VPs, Executive Directors and Directors. Who is missing? Line staff (like nurses doing nursing, Physicians and APCs who are seeing patients, clerical staff working at patient contact sites or in cubicles), and, most importantly, patients.
Whenever I requested a shake-up in Quality Committee make-up to include the missing folks, I got these types of responses:
“Well, we all used to do those things so we remember what it was like” (yeah, but, like 10-20 years ago)
“Some of our board members are our customers/patients, so we have that covered” (yeah, but, their fiduciary responsibility is to the company, they are biased)
I actually got to do this in one health care organization. It was because I was running a region of the organization and the C-suite guys were 80 miles away. We had a practicing physician, an APC (both had to not have any administrative duties), a clerical person (like telephone switchboard operator) and a staff nurse on our Quality Committee. The remainder of the committee was staff who were required to be on it by NCQA (Administrator, Medical Director, Marketing Director, Quality Manager, Utilization Manager, etc.).
You have no idea how many times the “non-traditional” members of the committee would say, “What, you guys really think that would work?” or “I need to talk about this with my peers and I’ll report next committee.” Otherwise, the “administration folks” would have gone forward. The other perk with having these “non-traditional” folks on the committee is, they would tell their co-workers what was going on and why (minus any confidential information, like case reviews). The Quality process became transparent to the staff. Can you imagine the committee member who was a telephone operator talking about the committee over lunch with her co-workers?
Unfortunately, that was the only place I could get this to happen. Too much elitism and too much fear on the part of the Execs of folks like line staff, physicians/APCs and patients knowing what was going on.
Unless the Quality Committee make-up is changed as noted above, don’t expect meaningful changes at the necessary speed required to make a difference.
Solution 5. Stop using Risk Adjustment to manage clinical change
Ok, I’m going to say something that probably all of the Medical pundits will call heresy. Everyone should stop using risk adjustment to create and manage/maintain clinical improvements.
I’ll say it again. Everyone should stop using risk adjustment to create and manage/maintain clinical improvements.
For those of you who don’t want to hear this, read my lips: Everyone should stop using risk adjustment to create and manage/maintain clinical improvements.
Of course, at this point the Medical pundits are likely demanding I have evidence to support my statement. Well, I do! They do too, but 1. they ignore it, and 2. risk adjustment is a sacred cow that makes all the clinical people feel better about not doing so well. Remember, the WHO ranked the US something like 44th in the world for health care quality. And there are pundits saying, “It’s because our population is sicker.”
So, what is Risk Adjustment? It’s a group of statistical models that take various aspects of a patient, like age, gender, medical conditions, medications, etc. (depending on the model), and give that person or a group of patients a risk score (or something similar). Then you can aggregate the individual risk scores to create a “population severity rating” or a population-level risk score.
For example, Medicare does this with what they call Hierarchical Condition Categories. Based on data in bills sent to Medicare by providers, Medicare calculates a risk score for each Medicare patient. Then Medicare applies the risk scores to various things related to the Medicare population, like payments to Medicare Advantage plans and penalties for hospitals for select conditions related to mortality and readmissions. If a Medicare Advantage plan has a higher-than-average risk score for the population it is serving, Medicare “risk adjusts” the payment upward, i.e., it pays the health plan more because the population is “sicker” and the plan “would be expected to spend more money.” For the hospitals, the higher the risk score, the higher the mortality or readmission rate Medicare will tolerate. That’s because when they “risk adjust”, a higher risk score makes the mortality or readmission rate LOWER “statistically”. Here’s how that works:
Hospital “A” has a mortality rate for heart attack of 3%. Let’s say Medicare doesn’t allow a mortality rate higher than 2% and would penalize this hospital if their risk score is 1 or less. But, due to the bills sent in by the hospital and other providers caring for Medicare beneficiaries in that hospital region, the Medicare Risk Adjustment model says Hospital “A” has a severity weight of 1.5. They divide the actual mortality rate (3%) by the severity weight (1.5) and the rate is risk adjusted to 2%. Voilà! No penalty. But wait! 3% of the patients actually died! Is that OK?
So, what happens now is, Hospital A says it is “doing OK”! They didn’t get penalized. They don’t have to do anything about people dying from heart attacks.
This is the problem with risk adjustment. Hospital “A” now has a false sense of security relative to its heart attack patients. And, since hospital Quality Departments prioritize working on regulatory and accreditation compliance over anything else, which includes things like Medicare penalties, Hospital “A” will take improving heart attack care off their work list. After all, they didn’t get penalized. The CEO and CFO won’t be on their backs about heart attack mortality because no money was lost in penalties.
I’ve seen it go the other way. A hospital I worked in had a “raw” mortality rate for a Medicare penalty condition that was lower than the Medicare average, but Medicare’s risk adjustment model said the risk score was so much below 1.0 that when they divided the risk score into the mortality rate, the mortality rate was higher than allowed (i.e., the mortality rate went up “statistically”). [Example: Hospital Mortality rate is 1.5%. Risk adjustment index is 0.7. Therefore, 1.5% divided by 0.7 = 2.1% “statistically”]
So, the hospital had to pay a penalty! Turns out, the lower risk score was due to inadequate coding on the bills being sent to Medicare by the hospital and the providers in that hospital region. It wasn’t that the patients were really less sick than the average Medicare beneficiary.
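For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the two examples above. It only mirrors the simplified division used in this post (raw rate divided by severity weight); it is not Medicare's actual methodology. The function names are made up, and applying the same 2% allowed rate to the second hospital is an assumption for illustration.

```python
def risk_adjusted_rate(raw_rate_pct: float, severity_weight: float) -> float:
    """Simplified adjustment used in this post: raw rate / severity weight."""
    if severity_weight <= 0:
        raise ValueError("severity weight must be positive")
    return raw_rate_pct / severity_weight


def is_penalized(raw_rate_pct: float, severity_weight: float,
                 allowed_pct: float = 2.0) -> bool:
    """Penalty applies only when the ADJUSTED rate exceeds the allowed rate.

    allowed_pct defaults to the 2% used in the Hospital A example; using it
    for the second hospital as well is an assumption.
    """
    return risk_adjusted_rate(raw_rate_pct, severity_weight) > allowed_pct


# (raw mortality %, severity weight) taken from the two examples in the text
hospitals = {
    "Hospital A": (3.0, 1.5),
    "Under-coded hospital": (1.5, 0.7),
}

for name, (raw, weight) in hospitals.items():
    adjusted = risk_adjusted_rate(raw, weight)
    print(f"{name}: raw {raw:.1f}%, adjusted {adjusted:.1f}%, "
          f"penalized: {is_penalized(raw, weight)}")
```

Run it and the hospital where 3% of patients died escapes the penalty, while the hospital where 1.5% died gets penalized. The only thing that changed the verdict was the severity weight, which comes from coding on the bills.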
Here are the problems with risk adjustment.
- The Risk Scoring Models are imperfect models. Most are linear models that report accuracy using R-squared and Mean Absolute Error (MAE). Based on R-squared, most explain only up to 50% of the variation in concurrent risk scoring (meaning where a patient’s risk is right now) for “Commercial” populations (ages 0-64). That means at least 50% of the variation is left unexplained. It’s much lower, in the 15-20% range, for “prospective” risk (trying to predict if a patient is going to be worse or better) (Accuracy of Claims-Based Risk Scoring Models; Society of Actuaries, 2016). The MAEs are ~100% (an MAE of zero would indicate that the estimated risk score was always perfectly accurate). The smaller the population, the lower the accuracy. When you get down to the individual person level, you might as well have a monkey throw a dart at a matrix of risk scores tacked to a wall. The models vary in accuracy based on how you filter them. For example, they are pretty accurate when dealing with the group “top 1% of spenders”. But that seems logical, because the most ill patients fall in a narrow band with relatively similar conditions. These models are better used to assess groups of over 10,000 people or people who are already really sick.
There are some emerging non-linear models out there now using things like machine learning. They are too new to fully evaluate yet. The Society of Actuaries evaluated one model and found the R-squared values lower than those of the linear models, but the MAEs were better. Still a work in progress. Maybe they need different measures than R-squared or MAE, since they are non-linear models.
- Risk adjusting/Severity weighting requires a comparator group. What this means is, your population is compared to another population or group of populations to get a “severity rating”. You usually end up with something like an “index” or observed vs. expected ratio, where 1.0 is average, below 1.0 is better than average and over 1.0 is worse than average. Each model uses different comparator populations. For example, MIDAS, a time-honored hospital Quality Department data tool, uses about 800 hospitals, only 85 of which are categorized as a >400 bed hospital. The other 715 are smaller hospitals. Premier has about 500 hospitals, 80% of which are >400 bed hospitals. Not the same populations. Even though you can “filter” by hospital size, the statistical reliability of a comparison based on 400 hospitals (Premier’s 80% of 500) is different from one based on just the 85 or so >400 bed hospitals in MIDAS. I’ve actually had the same data submitted to both of these companies and gotten very different results for what is presumably the same measure (mortality index).
This is true of all severity weighting/risk adjusting models. So, if the comparative group is performing poorly and you are performing a little less poorly, you look good, but you are really still not good. You are just the best of the bad. The whole risk adjustment idea inhibits motivation to improve care processes.
And clinical folks feel assured when their index is 1.0, when that is the “average” or, in other words, usually, mediocre. With an index of 1.0 they feel that they don’t have to do anything to get better. Scoring a 1.0 is an OK place to be to them when really it isn’t. Unless of course, everyone is performing almost perfectly, which isn’t the current reality.
- Risk adjustment/Severity weighting and Risk Scoring models rely on bills submitted by hospitals and providers. So, the final scoring depends on how good a hospital or provider is with coding admissions or visits. You can be terrible at coding and doing a great job taking care of patients and get dinged by Medicare or an insurance company. Or, you can game the coding system, not be taking care of patients well, and look really good! Believe me, I’ve seen both of these things.
Here’s an example. A nurse in Michigan was working for a consulting group that went into hospitals and showed them how to “optimize” their coding. The nurse left the consulting company and got a job in the Claims Department (remember, a claim is a bill when it reaches an insurance company) of our health plan. Her job was looking for hospitals and providers who were “over-coding”. That could be coding for something that really didn’t exist, or unbundling services that were considered part of a global service and should have been billed under the global service code. So, this nurse went back to the same hospitals where she had taught them to “optimize” their coding and audited them against “correct coding” to look for “over-coding”. The same coding she had previously taught them as optimal, she now discounted as coding abuse!! That led to reductions in the hospitals’ reimbursement. The hospitals angrily ushered her to the door and told her to never return!
What do you think that did to the hospitals’ risk scoring? For the health plan I was in, the risk score went down. For insurance companies that didn’t do the audits that nurse did (or Medicare), the risk score was higher.
- There is a high degree of variability in the way each model is calculated. This means you can look good in one model and bad in another. When I was a Medical Director in insurance companies, this was the biggest complaint of providers. They would have contracts with several insurance companies who were profiling them for payment rates and quality outcomes and bonuses. Each insurance company told them something different. The variation was so high that the providers just tossed the profile reports in the trash can. I had a network Pediatrics group tell me that our profile results said they were in the worst quartile of risk adjusted performance and our biggest competitor said they were the best Pediatric group in their network! In reality it can’t be both! It was risk adjustment methodology differences.
There was an article published in the New England Journal of Medicine that describes this phenomenon (“Variability in the Measurement of Hospital-wide Mortality Rates”; David M. Shahian, M.D., et al; N Engl J Med 2010;363:2530-9.) Here’s the Abstract results and conclusions sections:
“Vendors applied their risk-adjustment algorithms and provided predicted probabilities of in-hospital death for each discharge and for hospital-level observed and expected mortality rates.
Results
The proportions of discharges that were included by each method ranged from 28% to 95%, and the severity of patients’ diagnoses varied widely. Because of their discharge-selection criteria, two methods calculated in-hospital mortality rates (4.0% and 5.9%) that were twice the state average (2.1%). Pairwise associations (Pearson correlation coefficients) of discharge-level predicted mortality probabilities ranged from 0.46 to 0.70. Hospital-performance categorizations varied substantially and were sometimes completely discordant. In 2006, a total of 12 of 28 hospitals that had higher-than-expected hospital-wide mortality when classified by one method had lower-than-expected mortality when classified by one or more of the other methods.
Conclusions
Four common methods for calculating hospital-wide mortality produced substantially different results. This may have resulted from a lack of standardized national eligibility and exclusion criteria, different statistical methods, or fundamental flaws in the hypothesized association between hospital-wide mortality and quality of care. (Funded by the Massachusetts Division of Health Care Finance and Policy.)”
So, there you have it. These models are too imperfect to be used in creating and maintaining reliability and safety of processes in health care systems. They could be useful in following progress over time, such as when measured annually, to see if health care improvements are making an impact in a population, or for actuaries to use for large groups of beneficiaries in insurance plans. But risk adjustment/severity weighting should not be applied when managing the processes of care. The Raw Rate is the better measure. And the goal rate should be the Raw Rate, not an index or a risk adjusted rate.
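To make the comparator problem concrete, here is a rough Python sketch. Everything in it is hypothetical: the vendor names, the death and case counts, and the expected rates are invented for illustration and are not taken from MIDAS, Premier, or the NEJM study. It simply shows how one raw rate can yield opposite observed vs. expected verdicts when two models expect different things.

```python
def raw_rate(deaths: int, cases: int) -> float:
    """Raw (unadjusted) mortality rate as a fraction."""
    return deaths / cases


def oe_index(observed: float, expected: float) -> float:
    """Observed vs. expected ratio; 1.0 means 'average' for that comparator."""
    return observed / expected


def label(index: float) -> str:
    """Turn an index into the kind of category a profile report would show."""
    if index < 1.0:
        return "better than expected"
    if index > 1.0:
        return "worse than expected"
    return "as expected"


# Hypothetical hospital: 30 deaths in 1,000 cases
deaths, cases = 30, 1000

# Hypothetical expected mortality from two vendors' models, each built on
# a different comparator population
expected_by_vendor = {"Vendor X": 0.040, "Vendor Y": 0.025}

observed = raw_rate(deaths, cases)
results = {vendor: oe_index(observed, expected)
           for vendor, expected in expected_by_vendor.items()}

print(f"Raw rate: {observed:.1%}")
for vendor, index in results.items():
    print(f"{vendor}: index {index:.2f} ({label(index)})")
```

The raw rate never moves. Only the story told about it changes, depending on who is doing the expecting. That is why the sketch keeps the raw rate as its own function: it is the one number both vendors would have to agree on.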
As I said, this stance is heresy in health care. Even an esteemed physician leader from the Mayo Clinic told me I was nuts. But, he really didn’t know anything about clinical process reliability.
That’s because the first thing a health care provider (using the term generically) will say when they get a report that doesn’t look so good is, “Well, MY patients are sicker!” And you have to prove that is or is not the case before they will talk to you again. So, the risk adjustment paradigm became “the law” in medicine. Then it got applied universally, when it should only be used for certain situations.
It’s like allowing an airline that flies various airplane sizes, from Boeing 727s to 747s, to have more crashes than an airline that flies only one size, like a Boeing 737. The justification would be that the variety of airplanes carries a higher theoretical complexity risk, because the pilots need to know more to fly the various planes and the mechanics need to know more to service them. You can hear the CEO now saying, “Well, MY airplane situation is more complex than the other company’s”.
Really?
Not!
Summary
- US health care systems and their leaders do not have the will or training to create reliable health care processes.
- The Boards of Directors of health care institutions must drive the culture of reliability.
- There are five solutions that, if implemented forcefully and fully in the US, would dramatically improve health care process reliability.
- It is unlikely that anything in this post will be adopted by the US health care system.