Virginia Hughes wrote in How Many People Are Wrongly Convicted? Researchers Do the Math.:

So then the next terrifying question is, geeze, how many innocent people have actually been executed?
Fortunately it’s probably not many. Innocent defendants are far more likely to have their sentences changed to life in prison than to be executed. Still, with an error rate of 4 percent, the researchers write, “it is all but certain that several of the 1,320 defendants executed since 1977 were innocent.”
It’s impossible to say whether this 4.1 percent false conviction rate applies to defendants who never went to death row. But I’ll leave you with one last depressing thought. Of all of the people found guilty of capital murder, less than half actually get a death-penalty sentence. And when juries are determining whether to send a defendant to death row or to life in prison, surveys show that they tend to choose life sentences when they have “residual doubt” about the defendant’s guilt.
That means, then, that the rate of innocent defendants serving life in prison is higher than the rate among those on death row. “They are sentenced,” the authors write, “and then forgotten….” http://phenomena.nationalgeographic.com/2014/04/28/how-many-people-are-wrongly-convicted-researchers-do-the-math/

Abstract
The ability of the experienced forensic scientist to evaluate his or her results given the circumstances and propositions in a particular case and present this to the court in a clear and concise way is very important for the legal process. Court officials can neither be expected to be able to interpret scientific data, nor is it their task to do so (in our opinion). The duty of the court is rather to perform the ultimate evidence evaluation of all the information in the case combined, including police reports, statements from suspects and victims, witness reports, forensic expert statements, etc. Without the aid of the forensic expert, valuable forensic results may be overlooked or misinterpreted in this process. The scientific framework for forensic interpretation stems from Bayesian theory. The resulting likelihood ratio, which may be expressed using a verbal or a numerical scale, compares how frequent the obtained results are given that one of the propositions holds with how frequent they are given that the other proposition holds. A common misunderstanding is that this approach must be restricted to forensic areas such as DNA evidence where extensive background information is present in the form of comprehensive databases. In this article we argue that the approach with likelihood ratios is equally applicable in areas where the results rely on scientific background data combined with the knowledge and experience of the forensic scientist. In such forensic areas the scale of the likelihood ratio may be rougher compared to a DNA case, but the information that is conveyed by the likelihood ratio may nevertheless be highly valuable for the court. https://academic.oup.com/lpr/article-abstract/11/4/303/931682/The-likelihood-ratio-as-value-of-evidence-more
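The abstract above defines the likelihood ratio as a comparison of how frequent the results are under each of two competing propositions, reportable on a numerical or a verbal scale. The sketch below illustrates that idea only; the probabilities and the verbal cut-offs are assumptions chosen for illustration, not values or a scale taken from the cited paper.

```python
def likelihood_ratio(p_given_hp: float, p_given_hd: float) -> float:
    """LR = P(results | Hp) / P(results | Hd), where Hp and Hd are the
    prosecution and defense propositions."""
    return p_given_hp / p_given_hd

def verbal_scale(lr: float) -> str:
    """Map an LR to a rough verbal equivalent, as when numeric precision
    is unwarranted. These cut-offs are illustrative, not a standard."""
    if lr >= 10000:
        return "extremely strong support for Hp"
    if lr >= 1000:
        return "very strong support for Hp"
    if lr >= 100:
        return "strong support for Hp"
    if lr >= 10:
        return "moderate support for Hp"
    if lr > 1:
        return "weak support for Hp"
    if lr == 1:
        return "neutral"
    return "supports Hd"

# Hypothetical: results almost certain under Hp, 1-in-1,000 under Hd.
lr = likelihood_ratio(0.99, 0.001)
print(round(lr), verbal_scale(lr))  # prints: 990 strong support for Hp
```

A verbal report ("strong support") is coarser than the number, which is the sense in which the authors say the scale "may be rougher" outside DNA casework.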

Science Daily also reported the findings under the headline Courtroom use of ‘Likelihood Ratio’ not consistently supported by scientific reasoning approach; its report reproduces the NIST press release, which is quoted in full below. https://www.sciencedaily.com/news/health_medicine/

Citation:

Caution in use of courtroom evidence presentation methods urged
Courtroom use of ‘Likelihood Ratio’ not consistently supported by scientific reasoning approach
Date:
October 12, 2017
Source:
National Institute of Standards and Technology (NIST)
Summary:
Two experts are calling into question a shorthand method of presenting forensic evidence in courtrooms, arguing that it risks allowing personal preference to creep into expert testimony and potentially distorts evidence for a jury.
Journal Reference:
1. Steven P. Lund, Hari Iyer. Likelihood Ratio as Weight of Forensic Evidence: A Closer Look. Journal of Research of the National Institute of Standards and Technology, 2017; 122 DOI: 10.6028/jres.122.027

Here is the press release from NIST:

NIST Experts Urge Caution in Use of Courtroom Evidence Presentation Method
Use of ‘Likelihood Ratio’ not consistently supported by scientific reasoning approach, authors state.
October 12, 2017
Two experts at the National Institute of Standards and Technology (NIST) are calling into question a method of presenting evidence in courtrooms, arguing that it risks allowing personal preference to creep into expert testimony and potentially distorts evidence for a jury.
The method involves the use of Likelihood Ratio (LR), a statistical tool that gives experts a shorthand way to communicate their assessment of how strongly forensic evidence, such as a fingerprint or DNA sample, can be tied to a suspect. In essence, LR allows a forensics expert to boil down a potentially complicated set of circumstances into a number—providing a pathway for experts to concisely express their conclusions based on a logical and coherent framework. LR’s proponents say it is appropriate for courtroom use; some even argue that it is the only appropriate method by which an expert should explain evidence to jurors or attorneys.
However, in a new paper published in the Journal of Research of the National Institute of Standards and Technology, statisticians Steve Lund and Hari Iyer caution that the justification for using LR in courtrooms is flawed. The justification is founded on a reasoning approach called Bayesian decision theory, which has long been used by the scientific community to create logic-based statements of probability. But Lund and Iyer argue that while Bayesian reasoning works well in personal decision making, it breaks down in situations where information must be conveyed from one person to another such as in courtroom testimony.
These findings could contribute to the discussion among forensic scientists regarding LR, which is increasingly used in criminal courts in the U.S. and Europe.
While the NIST authors stop short of stating that LR ought not to be employed whatsoever, they caution that using it as a one-size-fits-all method for describing the weight of evidence risks conclusions being driven more by unsubstantiated assumptions than by actual data. They recommend using LR only in cases where a probability-based model is warranted. Last year’s report from the President’s Council of Advisors on Science and Technology (PCAST) mentions some of these situations, such as the evaluation of high-quality samples of DNA from a single source.
“We are not suggesting that LR should never be used in court, but its envisioned role as the default or exclusive way to transfer information is unjustified,” Lund said. “Bayesian theory does not support using an expert’s opinion, even when expressed numerically, as a universal weight of evidence. Among different ways of presenting information, it has not been shown that LR is most appropriate.”
Bayesian reasoning is a structured way of evaluating and re-evaluating a situation as new evidence comes up. If a child who rarely eats sweets says he did not eat the last piece of blueberry pie, his older sister might initially think it unlikely that he did, but if she spies a bit of blue stain on his shirt, she might adjust that likelihood upward. Applying a rigorous version of this approach to complex forensic evidence allows an expert to come up with a logic-based numerical LR that makes sense to the expert as an individual.
The trouble arises when other people—such as jurors—are instructed to incorporate the expert’s LR into their own decision-making. An expert’s judgment often involves complicated statistical techniques that can give different LRs depending on which expert is making the judgment. As a result, one expert’s specific LR number can differ substantially from another’s.
“Two people can employ Bayesian reasoning correctly and come up with two substantially different answers,” Lund said. “Which answer should you believe, if you’re a juror?”
In the blueberry pie example, imagine a jury had to rely on expert testimony to determine the probability that the stain came from a specific pie. Two different experts could be completely consistent with Bayesian theory, but one could testify to, say, an LR of 50 and another to an LR of 500—the difference stemming from their own statistical approaches and knowledge bases. But if jurors were to hear 50 rather than 500, it could lead them to make a different ultimate decision.
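The pie example above can be sketched numerically using Bayes’ rule in odds form, where posterior odds = prior odds × LR. The LR values of 50 and 500 come from the passage; the juror’s prior is a hypothetical assumption added for illustration.

```python
def posterior_odds(prior_odds: float, lr: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * lr

# Hypothetical prior: the juror starts at odds of 1:99 (about 1% probability).
prior = 1 / 99

for lr in (50, 500):
    post = posterior_odds(prior, lr)
    prob = post / (1 + post)  # convert odds back to a probability
    print(f"LR={lr}: posterior probability = {prob:.2f}")
# prints:
# LR=50: posterior probability = 0.34
# LR=500: posterior probability = 0.83
```

Because the posterior odds scale linearly with the LR, the tenfold gap between the two experts' numbers moves the juror from roughly one-in-three to better-than-four-in-five, which is the NIST authors' concern about hearing 50 rather than 500.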
Viewpoints differ on the appropriateness of using LR in court. Some of these differences stem from the view that jurors primarily need a tool to help them to determine reasonable doubt, not particular degrees of certainty. To Christophe Champod, a professor of forensic science at the University of Lausanne, Switzerland, an argument over LR’s statistical purity overlooks what is most important to a jury.
“We’re a bit presumptuous as expert witnesses that our testimony matters that much,” Champod said. “LR could perhaps be more statistically pure in the grand scheme, but it’s not the most significant factor. Transparency is. What matters is telling the jury what the basis of our testimony is, where our data comes from, and why we judge it the way we do.”
The NIST authors, however, maintain that for a technique to be broadly applicable, it needs to be based on measurements that can be replicated. In this regard, LR often falls short, according to the authors.
“Our success in forensic science depends on our ability to measure well. The anticipated use of LR in the courtroom treats it like it’s a universally observable quantity, no matter who measures it,” Lund said. “But it’s not a standardized measurement. By its own definition, there is no true LR that can be shared, and the differences between any two individual LRs may be substantial.”
The NIST authors do not state that LR is always problematic; it may be suitable in situations where LR assessments from any two people would differ inconsequentially. Their paper offers a framework for making such assessments, including examples for applying them.
Ultimately, the authors contend it is important for experts to be open to other, more suitable science-based approaches rather than using LR indiscriminately. Because these other methods are still under development, the danger is that the criminal justice system could treat the matter as settled.
“Just because we have a tool, we should not assume it’s good enough,” Lund said. “We should continue looking for the most effective way to communicate the weight of evidence to a nonexpert audience.”
Paper: S.P. Lund and H. Iyer, Likelihood Ratio as Weight of Forensic Evidence: A Closer Look. Journal of Research of the National Institute of Standards and Technology, Published online 12 October 2017. DOI: 10.6028/jres.122.027
The ability to afford to challenge evidence often determines the outcome of very serious charges.

Matthew Shaer wrote in The False Promise of DNA Testing: The forensic technique is becoming ever more common—and ever less reliable:

Modern forensic science is in the midst of a great reckoning. Since a series of high-profile legal challenges in the 1990s increased scrutiny of forensic evidence, a range of long-standing crime-lab methods have been deflated or outright debunked. Bite-mark analysis—a kind of dental fingerprinting that dates back to the Salem witch trials—is now widely considered unreliable; the “uniqueness and reproducibility” of ballistics testing has been called into question by the National Research Council. In 2004, the FBI was forced to issue an apology after it incorrectly connected an Oregon attorney named Brandon Mayfield to that spring’s train bombings in Madrid, on the basis of a “100 percent” match to partial fingerprints found on plastic bags containing detonator devices. Last year, the bureau admitted that it had reviewed testimony by its microscopic-hair-comparison analysts and found errors in at least 90 percent of the cases. A thorough investigation is now under way…. https://www.theatlantic.com/magazine/archive/2016/06/a-reasonable-doubt/480747/

The reliability of the evidence and the ability of a particular accused to defend against the evidence presented in a court hearing are crucial to preventing the innocent from being convicted.