A new study by Stanford scholars and their colleagues shines a stark spotlight on governance issues that have plagued a cornerstone of the nation’s administrative system for years: rampant errors and a backlog of appeals cases involving veterans’ benefits.

Law Professor Daniel Ho is lead author of a study showing no improvement in veterans’ claim resolutions. (Image credit: Rod Searcey)

The volume of veterans’ appeals – of which the vast majority are related to disability compensation claims – is huge. Some 90 judges in the Board of Veterans’ Appeals (BVA) historically decided about 50,000 cases per year, with an inventory of over 425,000 cases pending.

It takes an average of seven years for a veteran’s disputed claim to get resolved. In fact, the Inspector General’s Office at the Department of Veterans Affairs estimated that in one quarter of 2016, 7 percent of cases were deemed “resolved” at the Veterans Benefits Administration because the veterans had died while waiting.

Part of the problem, the study suggests, stems from a mismanaged trade-off between quantity and quality. Thousands of cases that should not necessarily go up the chain of appeal end up on appeal. That prolongs the wait for decisions and increases the backlog of cases. And a quality review program intended to reduce the rate of appeals and erroneous decisions largely failed to do so, the study finds.

“What was shocking about our findings is that the quality review program had none of its intended effects to reduce errors,” said Daniel Ho, the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford and a senior fellow at the Stanford Institute for Economic Policy Research (SIEPR).

“When a veteran challenges the denial of disability benefits, Supreme Court cases require accurate decision-making as a matter of due process,” Ho said. “Based on our study, it’s hard to believe the BVA is meeting that goal of accuracy. Or it is overstating it in pretty dramatic terms.”

The stakes go beyond veterans’ lives, too.

The study raises broader questions about government oversight and the value of these kinds of review systems, which have become a management linchpin for federal agencies, Ho said. Under the Government Performance and Results Act of 1993, federal agencies are required to provide performance measures, and many agencies have instituted quality assurance systems to improve and measure the quality of service delivery. The deficiency of BVA opinions may even rise to the level of a constitutional problem, the authors note.

“Our results paint a sobering picture about the ability for an agency to internally develop such quality assurance initiatives,” the study states. And the BVA’s quality review program, as it is structured today, will “unlikely address the longstanding quality problems in (veterans’) adjudication.”

The study is detailed in a working paper set to be published in the Journal of Law, Economics, and Organization, and in a companion paper on the more general crisis in mass adjudication that is forthcoming in the Stanford Law Review.

This research is the first to rigorously examine the effectiveness of quality assurance systems, long espoused by scholars and policymakers. It uses data on nearly 600,000 veterans’ appeals cases from 2003 to 2016 that had never before been accessible to outside researchers. Ho and his fellow researchers also interviewed a wide range of officials and secured a rich set of internal records using the Freedom of Information Act.

Ho’s co-authors are David Ames, former chief of the Office of Quality Review at the U.S. Department of Veterans Affairs; David Marcus, a law professor at the University of California, Los Angeles; and Cassandra Handan-Nader, a doctoral candidate in political science at Stanford.

Ames developed concerns about the effectiveness of quality review during his time overseeing the process. “I found it increasingly difficult to shake the suspicion that our work was not benefiting veterans,” he said.

Detecting errors

At issue, the researchers say, are decision errors, including those based on legally inadequate explanations, inaccurate documentation, and due process mistakes. Their analysis also shows evidence of discrepancies, or inconsistent judgments between similar cases. The BVA established its quality review program in 1998 to try to fix some of these problems, but the researchers found that it had no substantial effect.

Under the program, 5 percent of appeals cases were randomly selected to undergo an additional layer of review by an elite slate of staff attorneys to detect and correct any errors before a decision moved forward. Random selection enabled researchers to test whether such review in fact reduced subsequent appeals and remands – the rate at which disputed BVA decisions are sent back to the agency for further review.

Yet the appeal and remand rates for reviewed cases remained indistinguishable from those for cases that were not reviewed, despite quality review efforts that were supposed to catch and deter errors to begin with.
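The comparison that random selection makes possible can be sketched in a few lines of Python. This is only an illustration, not the study's method: it uses a standard two-proportion z-test, the function name `two_proportion_z` is made up for this sketch, and the counts are hypothetical placeholders rather than figures from the paper.

```python
import math


def two_proportion_z(events_a, n_a, events_b, n_b):
    """Compare two rates; return (difference, z statistic, two-sided p-value)."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a - rate_b, z, p_value


# HYPOTHETICAL counts for illustration only, not data from the study:
# cases randomly selected for quality review vs. cases that were not.
reviewed_appealed, reviewed_total = 500, 5_000
unreviewed_appealed, unreviewed_total = 9_600, 95_000

diff, z, p = two_proportion_z(
    reviewed_appealed, reviewed_total, unreviewed_appealed, unreviewed_total
)
print(f"difference in appeal rate: {diff:+.4f}, z = {z:.2f}, p = {p:.3f}")
```

Because cases were chosen for review at random, the two groups should differ only in whether they were reviewed. A large p-value in a test like this would mean the data cannot distinguish the reviewed group's rate from the unreviewed group's.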

“The caseload makes it difficult to guarantee no errors, but intensive review by an elite set of attorneys to correct errors had little effect,” Ho said.

Measuring accuracy or gaming statistics?

The program was ineffective, Ho explained, because quality review served cross-purposes. The agency charged with reviewing its own work also has an interest in keeping case numbers and accuracy figures high for its performance measures, which in turn affect its funding allocations. For over a decade, the BVA has published and touted its “accuracy rate” to Congress and the public as being between 91 and 95 percent.

In their analysis, however, researchers found that the BVA used an extremely deferential way of counting errors, inflating the agency’s measure of accuracy. Even when the quality review team deemed a decision error-free, a case that was appealed further was still remanded – sent back to the agency – nearly three-fourths of the time.

“It is well-known in the social science literature that creating your own performance measures poses conflicts of interest,” Ho said. “We found that over time, the quality review process was used to generate the appearance of effectiveness [rather] than to actually improve performance.”

The findings also bolster criticisms that the VA’s Office of General Counsel and others have raised in public records and internal documents, according to the study.

Back in 2010, for instance, the general counsel questioned the BVA’s reported accuracy rates in light of high remand rates. And some 100 staff attorneys submitted a loss-of-confidence statement to congressional committees in 2017, contending that the BVA’s increased production quota, “gross mismanagement” and inadequate training have failed to deliver accurate decisions to veterans.

Despite increasing the output to 85,000 cases over the last year, the BVA continues to tout a 94 percent accuracy rate.

The study concludes that this accuracy statistic is inaccurate.

“We were wasting some of the Board’s most talented attorneys on producing an essentially arbitrary number that glossed over quality problems, when those same attorneys could have been proactively working to reduce errors and inefficiencies,” Ames said.
