Study Calls Medicare Pay-for-Performance Program Results Into Question

— MIPS doesn't consistently correlate with process, outcome measure performance in primary care

by Crystal Phend, Contributing Editor, December 6, 2022

Physicians' rankings by the Medicare Merit-based Incentive Payment System (MIPS) only occasionally correlated with process and outcome measure performance, a large primary care study showed.

Physicians whose MIPS scores fell in the range to qualify for "exceptional bonus" payments had significantly better mean performance on three of five process measures, but significantly worse performance on the other two, compared with those whose MIPS scores qualified for penalties, according to the study of primary care physicians in the U.S. participating in the program, which drew on a 20% sample of Medicare fee-for-service claims.

In terms of risk-adjusted acute patient outcomes, the high-scoring physicians had fewer patients require emergency department visits but more all-cause hospitalizations compared with the low scorers, while there was no difference on four other admissions measures sensitive to ambulatory care.

For composite outcomes, about as many high MIPS scorers were in the bottom quintile as low scorers were in the top quintile -- "approximately as effective as chance," researchers led by Amelia M. Bond, PhD, of Weill Cornell Medical College in New York City, reported in JAMA.

"These findings suggest that the MIPS program may be ineffective at measuring and incentivizing quality improvement among U.S. physicians," the group concluded.

It's "a reminder that redistributions resulting from pay for performance are sensitive to measure selection" and suggests "that the MIPS program does not reward the care society wants," according to an accompanying editorial by J. Michael McWilliams, MD, PhD, of Harvard Medical School in Boston.

So "should the MIPS be ? Theory and evidence would argue yes. Should it be replaced with a refined or scaled-back pay-for-performance program? For the same reasons, no," McWilliams argued.

MIPS emerged from the 2015 Medicare and CHIP Reauthorization Act (MACRA) as one of the two tracks for participation in the Quality Payment Program, cobbled together from three legacy programs: the Physician Quality Reporting System, the Value-Based Payment Modifier, and the Medicare EHR Incentive Program for Eligible Professionals.

Based on performance in 2019 -- the year analyzed by the study -- MIPS scores of 30 or less on the 100-point scale meant clinicians would have their Medicare Part B payments docked by up to 7% in 2021. Those scoring over 30 to 75 got a positive adjustment of up to 7% in 2021, and even higher scores got an "exceptional performance" bonus (although the actual positive adjustments were smaller in order to be budget neutral).
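As a rough illustration of the 2019 cutoffs described above, the following sketch sorts a MIPS score into the three payment categories. The function name and category labels are hypothetical, and the boundaries follow this article's wording rather than the text of the CMS regulation, so it should be read as an approximation only.

```python
def classify_2019_mips(score: float) -> str:
    """Map a 2019 MIPS score (0-100) to the payment category described in the article.

    Assumption: boundaries follow the article's wording (30 or less, over 30 to 75,
    above 75). The labels are illustrative, not official CMS terms.
    """
    if not 0 <= score <= 100:
        raise ValueError("MIPS scores are on a 100-point scale")
    if score <= 30:
        return "penalty"  # Part B payments docked by up to 7% in 2021
    if score <= 75:
        return "positive adjustment"  # up to 7% on paper, smaller in practice
    return "exceptional performance bonus"


# The mean scores of the study's low and high groups fall in opposite categories.
low_group_category = classify_2019_mips(29.9)
high_group_category = classify_2019_mips(92.8)
```

Under these assumptions, the study's low-scoring group (mean 29.9) lands in the penalty range and the high-scoring group (mean 92.8) in the bonus range.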

Since then, along with the relative weighting for quality and cost. Physicians also now need to score at least 75 to avoid a potential penalty and at least 89 to get an "exceptional performance" bonus. One's 2022 MIPS score will mean up to a 9% adjustment up or down for 2024 Medicare Part B claims.

But simply isn't going to fix anything in a system this "impressively flawed," McWilliams said.

"Apart from conceptual reservations, a decade of empirical evidence on the effects of pay for performance is not encouraging. Less charitably, it is damning," he wrote, adding that "the intractability of the drawbacks -- long recognized in the economics and management literature -- should not be underestimated."

Quality can't be bought; rather, the wins of the quality improvement movement have largely come through "leveraging peer motivation (e.g., via teams), cultivating collective wisdom to support decision-making, and engaging clinicians in systems change," he argued.

"Competition is another mechanism that deserves more attention in the quality movement," McWilliams added, suggesting possible levers via "tougher antitrust measures, limits on noncompete provisions in physician employment contracts, and technical assistance from payers to ease market entry of promising delivery models."

Notably, in the study, physicians who scored the worst on MIPS but who achieved superior patient outcomes tended to have more medically complex and socially vulnerable patients and were more likely to work in small and independent practices, whereas high-scoring physicians with poor patient outcomes cared for fewer medically complex and socially vulnerable patients.

"These findings suggest that physicians caring for vulnerable populations are more likely to be penalized by MIPS scoring, even when delivering relatively high-quality care," Bond and colleagues wrote.

Along with potentially adding to inequity in healthcare, it also suggests that "high MIPS scores may reflect the ability of physicians to collect, analyze, and report data -- not the delivery of better medical care," the researchers added.

The study encompassed the 80,246 primary care physicians participating in MIPS in 2019, using Physician Compare files together with a 20% sample of 2018-2019 Medicare fee-for-service claims for their approximately 3.4 million patients. Among the physicians, 5.9% received a low MIPS score (mean 29.9) and 86.4% received a high MIPS score (mean 92.8).

The performance measures selected by Bond's group showed that the low scorers did worse on diabetic eye examinations (56.1% vs 63.2%), diabetic hemoglobin A1c screening (84.6% vs 89.4%), and mammography screening (58.2% vs 70.4%), which were all significant at P<0.001. But the low MIPS scoring group actually had higher rates of influenza vaccination (78.0% vs 76.8%, P=0.045) and tobacco screening (95.0% vs 94.1%, P=0.001).
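The size and direction of these gaps can be tallied with a few lines of arithmetic. The sketch below uses only the percentages reported above; the variable names are hypothetical, and it computes simple percentage-point differences rather than reproducing the study's statistical tests.

```python
# Reported rates (%) for each process measure: (low MIPS scorers, high MIPS scorers)
process_measures = {
    "diabetic eye examination": (56.1, 63.2),
    "diabetic hemoglobin A1c screening": (84.6, 89.4),
    "mammography screening": (58.2, 70.4),
    "influenza vaccination": (78.0, 76.8),
    "tobacco screening": (95.0, 94.1),
}

# Percentage-point gap, high minus low; a positive value favors the high scorers.
gaps = {
    name: round(high - low, 1)
    for name, (low, high) in process_measures.items()
}

favoring_high = [name for name, gap in gaps.items() if gap > 0]
favoring_low = [name for name, gap in gaps.items() if gap < 0]

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} percentage points")
```

On these figures, three measures favor the high scorers and two favor the low scorers, with the largest gap in mammography screening.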

Possible reasons for the discordance between MIPS scores and the outcomes selected for the study include that "physicians can select the measures they report to the MIPS program from a broad set of quality measures (including measures outside of their specialty), making meaningful comparisons across physicians challenging," or that MIPS measures are invalid or of uncertain validity, the researchers pointed out.

On the other hand, the editorialist said, "acute events such as hospitalizations and emergency department visits also may not be the right basis for measuring quality, because enhanced outreach and access may increase acute care use."

Other study limitations included not encompassing all outcomes of importance to patients and clinicians, the potential for correlations to change from year to year as the MIPS parameters change, and possible confounding.

Crystal Phend is a contributing editor.

Disclosures

The study was supported by the Physicians Foundation Center for the Study of Physician Practice and Leadership at Weill Cornell Medicine.

Bond reported receiving grants from Arnold Ventures, the Commonwealth Fund, and the American Medical Association.

McWilliams reported receiving consulting fees from RTI International, Blue Cross Blue Shield of North Carolina, and Abt Associates; serving as a senior advisor for the Center for Medicare and Medicaid Innovation; and serving as an unpaid member of the board of directors for the Institute for Accountable Care.

Primary Source

JAMA

Bond AM, et al "Association between individual primary care physician Merit-based Incentive Payment System score and measures of process and patient outcomes" JAMA 2022; DOI: 10.1001/jama.2022.20619.

Secondary Source

JAMA

McWilliams JM "Pay for performance: When slogans overtake science in health policy" JAMA 2022; DOI: 10.1001/jama.2022.20945.