Thursday, June 28, 2012

Preface: This posting replaces one from June 28. Part of that initial discussion of the Arizona Supreme Court's opinion was, I think, unwarranted. In particular the criticism of the court's treatment of the state interests may not have been accurate. Complex opinions, like good literature, rarely can be fully grasped on a first reading.

* * *

A few days ago, the Supreme Court of Arizona promulgated a creative “don’t peek” rule for DNA samples routinely taken from juveniles before a finding of delinquency. Justice Andrew Hurwitz (who has just moved to the U.S. Court of Appeals for the Ninth Circuit) penned the unanimous opinion in Mario W. v. Kaipio, Commissioner, No. CV-11-0344-PR (Ariz. June 27, 2012). The opinion injects some new ideas and analysis into the legal controversy over arrestee DNA sampling, but I have to question whether the reasoning is sufficient to support the result the court reaches and to ask how far the court's theory of Fourth Amendment privacy extends.

At the outset, the Arizona court quite properly sets out the normal rule that Fourth Amendment reasonableness requires a warrant and probable cause unless a categorical exception to these requirements exists. But then the court states that “[t]he parties do not dispute the applicability of the totality of the circumstances test, and we therefore analyze the Arizona scheme under that rubric.” This is hardly a ringing endorsement of this mode of analysis, but it is the way most courts approach the issue [1].

Getting to the specifics of DNA sampling on arrest, the court observes that there are “two separate intrusions” and “two searches — ‘the physical collection of the DNA sample’ and the ‘processing of the DNA sample.’” The former observation is basically correct. “The seizure of buccal cells is a physical intrusion, but does not reveal by itself intimate personal information about the individual.”1/

But the laboratory analysis probably is not a “later search.” The U.S. Supreme Court, at any rate, has yet to hold that physical testing or inspection is a separate search simply because it produces information about the substance being analyzed. Indeed, the Court, in two opinions—United States v. Edwards, 415 U.S. 800 (1974), and United States v. Jacobsen, 466 U.S. 109 (1984)—has held the opposite.

The Arizona court relies on an analogy to containers. It maintains that human cells are like steamer trunks or purses that contain private possessions. The police engage in a search when they open such a container and rummage through its contents.

The analogy looks good at first blush. People surely have reasonable expectations of privacy in the contents of their luggage and their purses. The Orthodox Jew on Yom Kippur with an apple core in her purse, the Catholic juvenile with birth control pills in hers, and the English literature professor with sleazy novels in his trunk all have a fair claim to freedom from unregulated intrusions into their purses or luggage. The police will all but inevitably espy these legal but embarrassing items if they look through the container without a warrant.

But compare this with the laboratory analysis of the epithelial cells. The laboratory extracts a single kind of molecule—DNA. It does not look at the rest of the cell. Within the DNA, it looks at a tiny fraction of the genome—locations (“loci”) that are not potentially embarrassing (except insofar as they match crime-scene samples).4/ The situation begins to resemble cases in which dogs that (supposedly) alert only to drugs are used to sniff luggage—and that, the Court has twice held, is not a search.2/

Because the government does not look through the parts of the genome in which an individual has a strong expectation of privacy, a better analogy is required. Imagine, then, that every time a person commits a crime, a mysterious being delivers an envelope to the police that always contains only two things—a card with the name of an individual who was at the scene of the crime (but not necessarily at the time the crime occurred) and a key to a safe deposit box in that person’s name. Is opening the envelope a “search” that triggers the need for a warrant or an exception to the warrant requirement? Maybe, but the cases and the doctrine cited in Mario W. are insufficient to establish this result. All that the container cases establish is that the police must abide by the constitutional requirements for searches before and when they use the key to open the safe deposit box. The box, of course, is the vast part of the human genome that the police do not open in DNA testing for identity. In DNA profiling for law enforcement databases, they only read the name on the card.3/

Yet, whether one denominates the laboratory analysis as a separate search is not decisive. It might be a constitutionally permissible, warrantless, probable-causeless search, at least under the totality-of-the-circumstances balancing test. The Arizona justices reject this conclusion in favor of the following rule: (1) the state’s “important interest in locating an absconding juvenile and, perhaps years after charges were filed, ascertaining that the person located is the one previously charged” justifies collecting the sample—“even if a formal judicial determination of probable cause was not made at the advisory hearing.” However, (2) no combination of state interests justifies the warrantless laboratory analysis of the DNA sample (a) to determine whether it matches unsolved crime samples or (b) to have a profile in a database that will identify the juvenile as the contributor of DNA found in future crimes.

But why is taking DNA solely for “locating an absconding juvenile” so critical when the state already takes fingerprints that can be used for this purpose? Doesn't the fingerprint on file eliminate the need to house the DNA as well, as the Maryland Court of Appeals recently held in King v. State, 42 A.3d 549 (Md. 2012)?

The Arizona court’s answer is that “[o]ne arrested for a serious crime may be fingerprinted before a judicial determination of probable cause. ... A judicial order to provide a buccal cell sample occasions no constitutionally distinguishable intrusion.” This suggests that the state can choose either fingerprints or DNA as the source of identifying marks. However, if a DNA profile is “intimate personal information about the individual” merely because it constitutes “uniquely identifying information”—which is all that Mario W. says about informational privacy—then fingerprints are equally “intimate personal information.” They too provide “uniquely identifying information.” Indeed, they are better for this purpose, for they permit differentiation of identical twins.

So does Mario W. prohibit the state from examining the minutiae in fingerprints unless or until arrestees are convicted (the no-peeking rule)? From running an arrestee’s print against a database of prints from unsolved crimes? From adding the fingerprint to the national Automated Fingerprint Identification System database (AFIS) before that point? Of course, DNA loci might be significantly more threatening to privacy than fingerprint details, but that conclusion is far from obvious [1].

In analyzing the state’s interests in pre-conviction DNA analysis, the opinion correctly notes that the value in solving unrelated crimes (and in deterring future ones) is reduced considerably by two features of the Arizona law. As with all pre-conviction profiling and databasing, many of the arrestees would have their samples analyzed and included in the state database after they are convicted anyway. As for the ones who are not convicted, the Arizona law does not permit continued use of the profiles. Thus, the opinion notes, with current technology and staffing, the government has the benefit of the profiles for only a month or so (for those who are not adjudicated delinquent) and for only an extra month or so for the others.

These points help explain the court's balancing, but how enduring are they? Advances in technology, making it possible to analyze profiles in a matter of hours, easily could extend the period of pre-conviction use. In addition, what would happen if the law did not require the samples to be removed from the database in the event that the state does not prove delinquency? Obviously, that would advance the state's law enforcement interests (although it might not be politically popular). The sad fact is that lots of people who are arrested but never convicted commit later crimes. If DNA is to be believed, California's "Grim Sleeper" killer is one. Lives would have been saved had his profile been acquired at his first of sixteen arrests and kept in a state database. Of course, the mere fact that law enforcement could gain by keeping tabs on more people cannot make all such practices constitutional. Still, post-adjudication retention of juvenile DNA profiles in all cases adds something to the state's interests that is missing in the Arizona system of juvenile arrestee DNA databasing and that therefore must be considered in totality balancing for that more extensive system.

Interestingly, the Mario W. court intimates that expungement is mandatory "given the constitutional presumption of innocence" and the fact that those accused of crimes "do not forfeit Fourth Amendment protections." This part of the opinion raises several puzzles. Given the history and cases on the presumption of innocence, it is an expansive reading of the presumption [2]. Moreover, if the presumption does mean that the state may not include DNA profiles of those arrested but not ultimately convicted in databases, what of fingerprints, which are retained indefinitely? As noted earlier, the court's theory as to why DNA profiling invades informational privacy seems to apply with equal force to AFIS databases.

That people do not forfeit Fourth Amendment rights just because they are accused of crimes—or, for that matter, convicted of them—is important, but it does not imply that the Fourth Amendment is an absolute barrier to suspicionless profiling and databasing. The opinion asserts that

[O]ne accused of a crime, although having diminished expectations of privacy in some respects, does not forfeit Fourth Amendment protections with respect to other offenses not charged absent either probable cause or reasonable suspicion. An arrest for vehicular homicide, for example, cannot alone justify a warrantless search of an arrestee’s financial records to see if he is also an embezzler.

As with the purse and the trunk, the financial records of the arrestee merit strong Fourth Amendment protection (unless, according to the U.S. Supreme Court, they are held by a bank or other third party). But what is it about the DNA loci that merits similar protection? The state’s claim is not that an arrest justifies every unrelated search. It is (or should be) that the custodial arrest justifies using identifying marks—whether they are within fingerprint impressions or DNA molecules—for identification of the person and then for speculative searching against the marks left at past and future unsolved crimes.

To be sure, a sensitive balancing of individual and public interests might lead to the conclusion that the latter goes too far. But the assumption in Mario W. seems to be that tokens of an individual’s identity are necessarily “intimate personal information” that impose a "serious intrusion on ... privacy interests." Without a clearer and more convincing analysis of the actual privacy interests associated with the many things that mark us as individuals—DNA profiles, fingerprints, iris scans, even photographs—Mario W. raises more questions than it answers.

Notes

1. One can quibble with the term “seizure,” for the extraction of the cells in the inner surface of the cheek does not seem to be a seizure in the Fourth Amendment sense. Unlike keeping a person away from his home or luggage or stopping him, it is not a substantial interference with the individual’s use of his possessions or his person. It is, however, probably a search under Cupp v. Murphy, 412 U.S. 291 (1973) (physical intrusion under fingernail), Schmerber v. California, 384 U.S. 757 (1966) (physical intrusion with syringe), or United States v. Jones, No. 10–1259 (U.S. Jan. 23, 2012) (the GPS tracking case that applied a trespass-with-intent-to-acquire-information test for ascertaining a “search”).

2. But Caballes and Place also are distinguishable in that DNA loci are not contraband.

3. How much, if any, other information the card contains is an interesting question.

4. I am oversimplifying. When profiles from putative close relatives are available, the loci can be used for kinship testing. For example, if the state has the profiles of a mother-father-child trio, it could determine whether they are in the specified biological relationship or whether, for example, someone else is the biological father. The reader is invited to make his own comparison between the strength of the privacy interest in the contents of all manner of containers of personal effects and records on the one hand, and the STR loci used for identification, on the other.

¶26 The State argues that once it has lawfully obtained the cell samples, the Fourth Amendment provides no greater bar to the processing of those samples and the extraction of the DNA profile than it does to the analysis of fingerprints. But the State's reliance on the fingerprinting analogy here is misplaced. Once fingerprints are obtained, no further intrusion on the privacy of the individual is required before they can be used for investigative purposes. In this sense, the fingerprint is akin to a photograph or voice exemplar. But before DNA samples can be used by law enforcement, they must be physically processed and a DNA profile extracted. See Erin Murphy, The New Forensics: Criminal Justice, False Certainty, and the Second Generation of Scientific Evidence, 95 Cal. L. Rev. 721, 726-30 (2007).

This is a distinction without a difference. First, in both fingerprinting and DNA analysis, a sample (an exemplar) must be collected from an arrestee. Elsewhere, the opinion describes the intrusion on the individual in this step with unusual clarity. Second, with both fingerprinting and DNA profiling, the physical sample must be examined "before [the] samples can be used by law enforcement."

The fingerprint information lies in minutiae that must be studied by eye or by computer to extract useful data. The DNA information lies in particular loci that must be characterized by chemical reactions and computers to extract useful data. What matters is not the physics or the chemistry, but the transformation into identifying information. If extracting this information is a separate search for DNA, then extracting the identifying information also is a separate search for fingerprints. If this "second search" requires a warrant for DNA, it requires it for fingerprints.

Sunday, June 24, 2012

How accurate is latent fingerprint identification? Considering that fingerprints are the most common form of trace evidence (for crimes in general), this is a vital question. The other day, I mentioned one court’s view that a false identification rate of 0.1% is only marginally higher than 1 in 11 million. The former statistic comes from a well designed experiment—the Noblis-FBI study published in 2011.\1/ The latter comes from a U.S. Court of Appeals opinion citing the earlier testimony of an FBI fingerprint supervisor.

A few months after the publication of the Noblis-FBI study, a short report of a second controlled experiment on the ability of latent print examiners to match those prints to exemplars appeared, this time in the journal Psychological Science.\2/ University of Queensland psychology lecturer Jason M. Tangen and two co-authors recruited 37 “qualified practicing fingerprint experts from five police organizations” and 37 college students to see how they would do in evaluating pairs of prints—some from the same fingers (mates) and some from different fingers (nonmates). Id. at 995.

The Task

Although the Australian researchers wrote that the task they gave their subjects “emulates the most forensically relevant aspect of the identification process,” id., the professional examiners and the students did not have the options of declaring a pair of images unsuitable for analysis or ultimately inconclusive.\3/ Instead, these “[p]articipants were asked to judge whether the prints in each pair matched, using a confidence rating scale ranging from 1 (sure different) to 12 (sure same) ... [with] ratings of 1 through 6 indicat[ing] no match [and] ratings of 7 through 12 indicat[ing] a match.” Id.

The members of the two groups each received “36 simulated crime-scene prints that were paired with fully rolled prints.” Id. Some of the nonmates were supposed to be similar to the latent print of the pair because they were the closest matches (according to a computer program) in the Australian National Automated Fingerprint Identification System (ANAFIS). The other nonmates were plucked at random from the research database and ANAFIS. Each participant received 12 latent prints paired with “similar” nonmates, 12 paired with “nonsimilar” nonmates, and 12 paired with mates. Id. at 996.

False positive rate. For the pairs intended to be difficult (the similar nonmates), “experts correctly declared nearly all of the pairs (99.32%) to be nonmatches (correct rejections); only 3 pairs (0.68%) out of the 444 in this condition were incorrectly declared to be matches (false alarms).” Id. at 997. Furthermore, not a single expert “misidentified any of the 12 nonsimilar distractor prints as matches.” Id.

Students. The undergraduates did not fare nearly as well as the practitioners. For example, they “mistakenly identified 55.18% of the [pairs of] similar ... [nonmates] as matches.” Id.

The following two tables list more or less comparable error rates from both studies.\4/

Table 1. False negative rates

         Professionals (US)   Professionals (Australia)   Undergraduates
Mates    450/4113 (10.9%)     35/444 (7.88%)              113/444 (25.45%)

Table 2. False positive rates

                      Professionals (US)   Professionals (Australia)   Undergraduates
Similar nonmates      6/3628 (0.17%)       3/444 (0.68%)               245/444 (55.18%)
Nonsimilar nonmates   —                    0/444 (0.00%)               102/444 (22.97%)
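The percentages in the two tables follow directly from the reported counts. As a rough check, the short Python sketch below recomputes each rate from its numerator and denominator. The dictionary layout and the function name `rate_percent` are illustrative choices, not anything taken from either study.

```python
# Counts (errors, comparisons) as reported in Tables 1 and 2.
# The labels and structure are illustrative; only the counts come from the tables.
FALSE_NEGATIVES = {
    "Professionals (US)": (450, 4113),
    "Professionals (Australia)": (35, 444),
    "Undergraduates": (113, 444),
}

FALSE_POSITIVES_SIMILAR = {
    "Professionals (US)": (6, 3628),
    "Professionals (Australia)": (3, 444),
    "Undergraduates": (245, 444),
}

FALSE_POSITIVES_NONSIMILAR = {
    "Professionals (Australia)": (0, 444),
    "Undergraduates": (102, 444),
}


def rate_percent(errors: int, comparisons: int) -> float:
    """Return the error rate as a percentage of all comparisons."""
    return 100.0 * errors / comparisons


if __name__ == "__main__":
    tables = [
        ("False negatives (mates)", FALSE_NEGATIVES),
        ("False positives (similar nonmates)", FALSE_POSITIVES_SIMILAR),
        ("False positives (nonsimilar nonmates)", FALSE_POSITIVES_NONSIMILAR),
    ]
    for title, table in tables:
        print(title)
        for group, (errors, comparisons) in table.items():
            pct = rate_percent(errors, comparisons)
            print(f"  {group}: {errors}/{comparisons} = {pct:.2f}%")
```

Note that 444 is simply 37 participants times 12 pairs per condition, which is why the Australian denominators are identical across rows.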

Discussion

As both sets of researchers appreciate, these rates do not necessarily generalize to casework. They simply exemplify what can be achieved under the experimental conditions, in which the subjects knew they were being tested. Nevertheless, the sensitivity and specificity of professional examiners in ascertaining when a pair of prints emanates from the same finger should help mute the most extreme criticism of the field. By the same token, they should prompt investigations of the conditions under which misclassifications tend to occur and, as Tangen et al. note, they “should affect the testimony of forensic examiners and the assertions that they can reasonably make.” Id. at 997.

The Australian researchers are impressed by the extent to which professional examiners outperformed undergraduate students. But some of the gap could stem from a difference in motivation. The students received course credit for turning in answers, but they may have had little incentive to agonize over the best classification in every one of the 36 comparisons they were asked to make. This confounding variable should be considered before making unequivocal claims of “a real performance benefit” that “may satisfy legal admissibility criteria.” Id.

3. The third author, a latent print analyst, believed that all the simulated latent prints were of value for identification. Id. at 996.

4. The studies differ in significant ways, limiting the number of comparisons that one can make and the confidence that one can have in direct comparisons of the statistics. The Noblis-FBI study used no randomly selected nonmate exemplars—all the nonmate pairs were “similar” within the meaning of the Australian study. Also, only practicing fingerprint examiners participated, so the student-professional dichotomy has not been replicated. To account as much as possible for the fact that all the pairs of prints had to be compared and an ultimate conclusion had to be drawn in the Australian experiment, the rates from the U.S. study are based on instances in which the subjects deemed the prints to be of value for individualization and reached a conclusion about the comparison. Even if this adjustment is adequate for some purposes, however, it does not eliminate the possibility that the inability to use the “inconclusive” category contributed to the larger false positive rate of the Australian subjects.

Friday, June 22, 2012

In United States v. Love, No. 10cr2418–MMM, 2011 WL 2173644 (S.D. Cal. June 1, 2011), the “government charged Donny Love, Sr. with crimes related to the May 4, 2008 bombing of the federal courthouse in San Diego. ... Before trial, Love moved to exclude the testimony of Robin Ruth, the standards and practices program manager” of the FBI's latent fingerprint unit. Id. at *1.

Love argued that latent fingerprint analysis lacks the scientific or other foundation for admissibility under the federal rules of evidence. The district judge, M. Margaret McKeown, held a pretrial hearing at which Ruth, who "possesses degrees in genetic engineering and in forensic and biological anthropology," id. at *9, was the only witness. In her (inexplicably unreported) opinion, Judge McKeown described the testimony on the risk of error as follows:

Latent fingerprint examiners have sometimes stated that their analyses have a zero error rate. . . . Ruth did not so testify. Rather, she stated that the ACE–V methodology is unbiased and that the methodology itself introduces no random error. Errors occur, but those errors are human errors resulting from human implementation of the ACE–V process. . . . Because human errors are nonsystematic, Ruth believes that there is no overall predictive error rate in latent fingerprint analysis.

Id. at *5. This sounds like gobbledy-gook to me. To begin with, the FBI’s continued insistence that there is an ACE-V “methodology” separate from the human being who is doing the analysis, comparison, and evaluation is mystifying. As a NIST expert working group on human factors in fingerprinting recently wrote, human errors are a function of the system that includes human beings. Such a system can be structured to favor errors of one type—false identifications—or the other—false eliminations. The Noblis-FBI study (see postings of April 26, 27, 30, and May 1) indicates that in the analysis phase, examiners systematically discard prints that contain useful information. In the evaluation phase, they systematically err in favor of false eliminations over false identifications.

Moreover, the reasoning that “[b]ecause human errors are nonsystematic, ... there is no overall predictive error rate in latent fingerprint analysis” is hard to decipher. Long-term features of random (nonsystematic) processes are predictable. If a laboratory’s examiners make false identifications and false eliminations, each with a constant probability of 0.001, then the “human errors are nonsystematic” but the 0.001 error rate can be used to predict the incidence of errors in casework. Ms. Ruth probably was saying that error probabilities are not constant and that existing data do not permit very realistic or accurate statistical modeling of errors in a laboratory.
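The predictability of a random process can be made concrete with a little arithmetic. The following is a minimal Python sketch, assuming that comparisons are independent and that the error probability really is constant at the hypothetical 0.001 used above. The caseload sizes are arbitrary illustrations, not data from any laboratory.

```python
def expected_errors(n_comparisons: int, p_error: float) -> float:
    """Expected number of errors in n independent comparisons."""
    return n_comparisons * p_error


def prob_at_least_one_error(n_comparisons: int, p_error: float) -> float:
    """Probability of one or more errors in n independent comparisons."""
    return 1.0 - (1.0 - p_error) ** n_comparisons


if __name__ == "__main__":
    p = 0.001  # hypothetical constant error probability from the text
    for n in (100, 1_000, 10_000):  # illustrative caseload sizes (assumed)
        mean = expected_errors(n, p)
        at_least_one = prob_at_least_one_error(n, p)
        print(f"n={n}: expected errors={mean:.1f}, P(at least one)={at_least_one:.3f}")
```

Under these assumptions, no individual error can be foreseen, yet the long-run incidence is quite predictable. If the probabilities vary from case to case or examiner to examiner, the simple model no longer applies, which may be what the witness meant.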

In any event, a better argument is available. It is clear that fingerprint examinations can produce useful information—they can make correct source attributions at levels far greater than chance alone would produce. The opinion refers to

very few . . . cases in which an examiner identifies a latent print as matching a known print even though the two prints were actually made by different individuals. Most significantly, the May 2011 [Noblis-FBI] study of the performance of 169 fingerprint examiners revealed a total of six false positives among 4,083 comparisons of non-matching fingerprints for “an overall false positive rate of 0.1%.” See Ulery et al., supra, at 7733, 7735.

Id. Of course, the 0.1% figure from a controlled experiment in which the examiners knew they were being tested might not be “predictive” of the rates in casework. An experiment reveals what happens under the conditions in the experiment. Generalizing to other situations takes judgment. Apparently, Ms. Ruth thought that the error rate in practice would be smaller “because the prints used in the study were of relatively low quality.” Id. And, the court noted that verification by a second examiner should reduce the rate still more. Id.

The observed error rate of 0.1%, the court wrote, was only “marginally higher than the rates the FBI has previously estimated.” Id. The only previous statistic that the court cites is “1 in 11 million cases.” Id. The difference between 1/1,000 and 1/11,000,000 strikes me as more serious than the term “marginal” connotes. The government ought to devote more resources to reducing the risk of error if the chance of a false identification is as high as 1/1,000 than if it is a mere 1/11,000,000. The former may be "very low," but the latter is incredibly low.
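To see how far apart the two figures are, consider the short Python sketch below. It uses exact fractions to compare the rates and to compute the number of false identifications each would imply in a given number of comparisons. The volumes passed to the function are arbitrary illustrations, not casework statistics, and the names are hypothetical.

```python
from fractions import Fraction

# The two rates discussed in the opinion.
RATE_STUDY = Fraction(1, 1_000)              # roughly 0.1%, from the controlled experiment
RATE_FBI_ESTIMATE = Fraction(1, 11_000_000)  # the earlier "1 in 11 million" figure


def expected_false_ids(rate: Fraction, n_comparisons: int) -> Fraction:
    """Expected number of false identifications in n comparisons at a fixed rate."""
    return rate * n_comparisons


# How many times larger the experimental rate is than the earlier estimate.
RATIO = RATE_STUDY / RATE_FBI_ESTIMATE

if __name__ == "__main__":
    n = 1_000_000  # illustrative volume of comparisons (assumed)
    print(f"Ratio of the two rates: {RATIO}")
    print(f"Expected false IDs at 1/1,000: {float(expected_false_ids(RATE_STUDY, n))}")
    print(f"Expected false IDs at 1/11,000,000: {float(expected_false_ids(RATE_FBI_ESTIMATE, n)):.3f}")
```

A rate that is 11,000 times another is hard to describe as "marginally higher."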

Furthermore, historically derived error statistics such as 1/11,000,000 are all highly problematic in the absence of any mechanism that would uncover possible errors in run-of-the-mill cases. Consequently, and contrary to the view expressed in the opinion, one should not regard a supervisor's recollections of errors and the outcomes of investigations of occasional complaints of misidentifications as "evidence suggest[ing] that the rate is much lower than that figure [of 0.1%]." Id. at *6.

Appendix

Love is the first case to generate a widely available opinion discussing the Noblis-FBI study. It is reproduced below:

The government charged Donny Love, Sr. with crimes related to the May 4, 2008 bombing of the federal courthouse in San Diego. A trial on those charges began on May 23, 2011. Before trial, Love moved to exclude the testimony of Robin Ruth, the standards and practices program manager of Federal Bureau of Investigation's (“FBI”) latent fingerprint unit. Love argued that latent fingerprint analysis generally, and therefore Ruth's specific testimony about fifteen latent prints she analyzed for this case, are insufficiently reliable for admission under Federal Rule of Evidence 702 and the Supreme Court's opinions in Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993), and Kumho Tire Co. v. Carmichael, 526 U.S. 137 (1999). At Love's request, on May 16, 2011, the court held a Daubert hearing regarding the latent fingerprint evidence. Ruth testified at that hearing regarding latent fingerprint analysis generally, the FBI's practices, and the analyses she performed in the course of her work on this case. No other witnesses were called to testify at the hearing. Because the May 16 hearing took place one week before trial, in order to provide sufficient notice of the ruling, the court provided an oral ruling at the close of the hearing and denied Love's motion to exclude the latent fingerprint testimony. The court also gave a brief explanation of that ruling, but stated that it would issue a written order elaborating on its reasons for denying Love's motion. This order provides that further explanation. To the extent anything in this order differs from the court's oral ruling of May 16, 2011, this order supersedes the oral ruling.

I. Latent Fingerprint Analysis

There are, broadly speaking, two kinds of fingerprints. “Rolled,” “full,” or “known” prints are taken under controlled circumstances, often for official use. “Latent” prints, by contrast, are left—often unintentionally—on the surfaces of objects touched by a person. Latent fingerprint examiners compare unidentified latent prints to known prints as a means of identifying the person who left the latent print.

In United States v. Mitchell, 365 F.3d 215 (3d Cir.2004), the Third Circuit provided a succinct introduction to the terminology used by latent fingerprint examiners and to the methodology employed by FBI examiners. In that court's words,

[f]ingerprints are left by the depositing of oil upon contact between a surface and the friction ridges of fingers. The field uses the broader term “friction ridge” to designate skin surfaces with ridges evolutionarily adapted to produce increased friction (as compared to smooth skin) for gripping. Thus toeprint or handprint analysis is much the same as fingerprint analysis. The structure of friction ridges is described in the record before us at three levels of increasing detail, designated as Level 1, Level 2 and Level 3. Level 1 detail is visible with the naked eye; it is the familiar pattern of loops, arches, and whorls. Level 2 detail involves “ridge characteristics”--the patterns of islands, dots, and forks formed by the ridges as they begin and end and join and divide. The points where ridges terminate or bifurcate are often referred to as “Galton points,” whose eponym, Sir Francis Galton, first developed a taxonomy for these points. The typical human fingerprint has somewhere between 75 and 175 such ridge characteristics. Level 3 detail focuses on microscopic variations in the ridges themselves, such as the slight meanders of the ridges (the “ridge path”) and the locations of sweat pores. This is the level of detail most likely to be obscured by distortions.

The FBI ... uses an identification method known as ACE–V, an acronym for “analysis, comparison, evaluation, and verification.” The basic steps taken by an examiner under this protocol are first to winnow the field of candidate matching prints by using Level 1 detail to classify the latent print. Next, the examiner will analyze the latent print to identify Level 2 detail (i.e., Galton points and their spatial relationship to one another), along with any Level 3 detail that can be gleaned from the print. The examiner then compares this to the Level 2 and Level 3 detail of a candidate full-rolled print (sometimes taken from a database of fingerprints, sometimes taken from a suspect in custody), and evaluates whether there is sufficient similarity to declare a match.

Id. at 221–22. At the evaluation step, the examiner may reach three conclusions—that the prints are a match (“identification”), that the prints are not a match (“exclusion”), or that more information is needed (“inconclusive”). See Hearing Ex. 1, at 42. “In the final step, the match is independently verified by another examiner.” Mitchell, 365 F.3d at 222. It bears noting that since Mitchell was decided in 2004, there have been important refinements and developments with respect to latent print identification, as documented in testimony before the court.

II. General Admissibility of Latent Fingerprint Evidence

As with all scientific or technical evidence, latent fingerprint evidence is admissible in court if it “will assist the trier of fact to understand or to determine a fact in issue” and “if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.” Fed.R.Evid. 702. “Many factors ... bear on the [reliability] inquiry.” Daubert, 509 U.S. at 592–93. The Supreme Court has identified several factors that may often be relevant. These include whether the “technique can be (and has been) tested,” “[w]hether it has been subjected to peer review and publication,” the “known or potential rate of error,” “whether there are standards controlling the technique's operation,” and “whether the ... technique enjoys general acceptance within a relevant scientific community.” Kumho Tire, 526 U.S. at 149–50 (internal quotation marks and alteration omitted); accord Daubert, 509 U.S. at 593–94. Nevertheless, “the test of reliability is flexible, and Daubert 's list of specific factors neither necessarily nor exclusively applies to all experts or in every case.” Kumho, 526 U.S. at 141 (internal quotation marks omitted).

“The test under Daubert is not the correctness of the expert's conclusions but the soundness of [her] methodology.” Primiano v. Cook, 598 F.3d 558, 564 (9th Cir.2010). As a result, “the role of the courts in reviewing proposed expert testimony is to analyze expert testimony in the context of its field to determine if it is acceptable science.” Boyd v. City & Cnty. of S.F., 576 F.3d 938, 946 (9th Cir.2009). So long as “an expert meets the threshold established by Rule 702 as explained in Daubert, the expert may testify and the jury decides how much weight to give that testimony.” Primiano, 598 F.3d at 565. Put another way, although the court “must ensure that ‘junk science’ plays no part in the [jury's] decision,” Elsayed Mukhtar v. Cal. State Univ., Hayward, 299 F.3d 1053, 1063 (9th Cir.2002), “[s]haky but admissible evidence is to be attacked by cross examination, contrary evidence, and attention to the burden of proof, not exclusion,” Primiano, 598 F.3d at 564.

In this case, the parties discuss each of the five factors mentioned in Daubert and Kumho. They also address two additional factors drawn from United States v. Downing, 753 F.2d 1224 (3d Cir.1985), namely, whether latent fingerprint analysis has a “relationship to ... established modes of scientific analysis” and whether it has “non-judicial uses.” Id. at 1238–39. The court addresses each factor in turn.

A. Testing

The first factor discussed in Daubert and Kumho asks whether a methodology can be tested and whether it has been tested. The parties do not dispute that the reliability of latent fingerprint analysis can be tested, and the record reveals three categories of potential tests. First, latent fingerprint analysis rests on two hypotheses: that fingerprints are unique to an individual and that prints persist, which is to say that they do not change over the course of an individual's life.\1/ These hypotheses are open to testing. The uniqueness hypothesis would be falsified if two people were found to share identical fingerprints, while the persistence hypothesis would be falsified if the same finger produced prints with details that evolved over time. Second, it is possible, at least in principle, to learn “the prevalence of different ridge flows and crease patterns” and “ridge formations and clusters of ridge formations” across individuals. See National Research Council of the National Academies, Strengthening Forensic Science in the United States: A Path Forward 144 (2009) (“NAS Report”), available at: http://www.nap.edu/catalog/12589.html. This information would facilitate estimates of the reliability of the conclusion that two specific fingerprints are from the same individual. See Opp'n, Ex. B, at 8. Third, it is possible to test the reliability of conclusions reached by a given fingerprint analyst through controlled examinations.

1. At the hearing, Ruth testified that permanent scars alter an individual's fingerprints and thus are a known exception to the persistence hypothesis.

The fact that latent fingerprint analysis can be tested for reliability, without more, allows the first Daubert “factor to weigh in support of admissibility.” See Mitchell, 365 F.3d at 238. Ruth also testified, however, that at least some actual testing and research has been performed along each of the three dimensions discussed above. Testing of the uniqueness and persistence hypotheses dates to the eighteenth century and includes, among other things, a simple longitudinal study performed by Sir Francis Galton in the 1890s, a 1982 study of twins, and contemporary Bayesian statistical models. See Hearing Ex. 1, at 26–28; see also Mitchell, 365 F.3d at 236 & n. 16 (discussing a test of 50,000 fingerprints for uniqueness). Some recent statistical models also bear on the distribution of particular ridge characteristics across the population as a whole. See Opp'n, Ex. B, at 8 (citing three additional studies). Finally, several studies of the performance of fingerprint examiners have been performed. The most recent such study was published in May 2011. See Bradford T. Ulery et al., “Accuracy and Reliability of Forensic Latent Fingerprint Decisions,” 108 Proceedings of the National Academy of Sciences 7733 (May 10, 2011); see also Hearing Ex. 1, at 45 (citing five additional articles). The FBI also conducts proficiency examinations of its examiners. Even if taken under conditions that “do not accurately represent [those] encountered in the field,” these examinations are of some value in assessing the reliability of individual examiners. See United States v. Baines, 573 F.3d 979, 990 (10th Cir.2009).

The court recognizes that the NAS Report called for additional testing to determine the reliability of latent fingerprint analysis generally and of the ACE–V methodology in particular. See NAS Report at 143–44. The Report also questions the validity of the ACE–V method. See id. at 142. However, Daubert, Kumho, and Rule 702 do not require “absolute certainty,” Daubert v. Merrell Dow Pharmaceuticals, Inc., 43 F.3d 1311, 1316 (9th Cir.1995) (opinion on remand); instead, they ask whether a methodology is testable and has been tested. On this record, the court finds that latent fingerprint analysis can be tested and has been subject to at least a modest amount of testing—some of which, like the study published in May 2011, was apparently undertaken in direct response to the NAS's concerns. The court therefore concludes that this factor weighs in favor of admitting latent fingerprint evidence. See Baines, 573 F.3d at 990 (“[W]hile we must agree with defendant that this record does not show that the technique has been subject to testing that would meet all of the standards of science, it would be unrealistic in the extreme for us to ignore the countervailing evidence.”).

B. Peer Review and Publication

The second factor in Daubert and Kumho concerns whether a methodology has been subject to publication and peer review. Publications regarding latent fingerprint analysis are relatively few in number. The government introduced into evidence a list of roughly thirty publications touching on various aspects of the field. See Hearing Ex. 1, at 66–69. Love's moving papers also contain citations to a handful of other relevant publications. See Mot. at 19–21. Although limited in quantity, many of these publications appear to “address ... theoretical/foundational questions” in latent fingerprint analysis. Mitchell, 365 F.3d at 239. The articles are therefore relevant to the reliability inquiry.

Ruth stated that at least one of these articles was published in a peer-reviewed journal, and the government's brief cites three other examples of peer-reviewed work. See Opp'n at 17. Even assuming that all of the articles cited by Ruth and the parties are peer-reviewed, latent fingerprint analysis would have only a small fraction of the number of peer reviewed publications found in established sciences. Nonetheless, because there are a handful of publications that concern the reliability of latent fingerprint analysis, at least a few of which are peer-reviewed, this factor is either neutral or weighs slightly in favor of admissibility. See, e.g., Boyd, 576 F.3d at 946 (affirming the admission of evidence of a theory supported by fourteen publications, ten of which were peer-reviewed); United States v. Prime, 431 F.3d 1147, 1153–54 (9th Cir.2005) (admitting handwriting analysis evidence in part because of publications that were peer reviewed by other forensic scientists).

C. Error Rates

The third factor deals with the known or potential error rate of a methodology. Latent fingerprint examiners have sometimes stated that their analyses have a zero error rate. See, e.g., NAS Report at 143. Ruth did not so testify. Rather, she stated that the ACE–V methodology is unbiased and that the methodology itself introduces no random error. Errors occur, but those errors are human errors resulting from human implementation of the ACE–V process. See Hearing Ex. 1, at 47–52. Because human errors are nonsystematic, Ruth believes that there is no overall predictive error rate in latent fingerprint analysis.

Ruth's testimony does not mean that the ACE–V process is perfect, or even that it is necessarily the best possible process for identifying latent prints. See NAS Report at 143 (“The [ACE–V] method, and the performance of those who use it, are inextricably linked, and both involve multiple sources of error (e.g., errors in executing the process steps, as well as errors in human judgment).”). Nevertheless, all of the relevant evidence in the record before the court suggests that the ACE–V methodology results in very few false positives, which is to say, very few cases in which an examiner identifies a latent print as matching a known print even though the two prints were actually made by different individuals. Most significantly, the May 2011 study of the performance of 169 fingerprint examiners revealed a total of six false positives among 4,083 comparisons of non-matching fingerprints for “an overall false positive rate of 0.1%.” See Ulery et al., supra, at 7733, 7735. This false positive rate is higher than the rates the FBI has previously estimated. See, e.g., Baines, 573 F.3d at 990–91 (recounting testimony suggesting that the FBI's error rate was 1 in 11 million cases). That discrepancy may be partially due to the fact that identifications in the study were not subject to verification.\2/ At least in Ruth's estimation, the error rate in the study may also have been marginally higher because the prints used in the study were of relatively low quality.\3/ In any case, a false positive rate of 0.1% remains quite low.\4/

2. No two individuals made the same false positive conclusion. See Ulery et al., supra, at 7735.

3. The examiners who participated in the study generally said that the prints in the study were similar in quality to those they encounter in their work. See Ulery et al., supra, at 7734.

4. The false negative rate in the May 2011 study was somewhat higher. See Ulery et al., supra, at 7733, 7736. The rate of false negatives is, however, not relevant here insofar as Ruth's testimony will identify various fingerprints as belonging to Love or others. See Mitchell, 365 F.3d at 239. The FBI also concluded that two fingerprints in this case do not match those of Love or of the three individuals who pled guilty to the bombing. However, Love's brief does not specifically challenge the ACE–V methodology on the basis of its false negative rate, perhaps because a higher rate of false negatives could result if examiners take special care not to misidentify prints. See id. at 239 n. 19. The rate of false negatives may therefore be inversely related to the rate of false positives.
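The 0.1% figure quoted from the May 2011 study can be checked against the counts the opinion reports (six false positives among 4,083 non-matching comparisons). The sketch below computes the raw proportion and, as an added illustration, a 95% Wilson score interval. The choice of the Wilson interval is an assumption made here for illustration; it is not taken from the opinion and is not presented as the study's own method.

```python
import math

def wilson_interval(errors, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage. This interval is an
    illustrative choice, not a method attributed to the study.
    """
    p = errors / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half

# Counts as reported in the opinion's description of the study.
false_positives = 6
comparisons = 4_083

rate = false_positives / comparisons
low, high = wilson_interval(false_positives, comparisons)

print(f"raw false positive rate: {rate:.3%}")
print(f"approx. 95% interval:    {low:.3%} to {high:.3%}")
```

The raw ratio comes out slightly above the rounded 0.1% figure, and the interval is a reminder that six events is a small count on which to pin a precise rate.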

To counter the evidence of a low false positive rate, Love relies heavily on a high-profile misidentification made by the FBI in its investigation into the terrorist bombing of a train in Madrid. However, one confirmed misidentification is in no way inconsistent with an exceedingly low rate of error. Nor is the only other evidence on which Love relies. An examiner who served for fourteen years on a board that investigates complaints of misidentification has stated that he knew of twenty-five to thirty misidentifications that occurred in the United States during those fourteen years. See Mot. at 57 (citing Office of the Inspector General, A Review of the FBI's Handling of the Brandon Mayfield Case 137 (2006), available at: http://www.justice.gov/oig/special/s0601/final.pdf). Of course, any misidentification is troublesome. Without more foundation, however, this statement does not translate into a quantifiable error rate.\5/ The Inspector General's report does not include a benchmark for comparison.

5. For example, some rudimentary math suggests that this statement may imply a very low error rate. Ruth testified that she has made 1,200 identifications in six years as an examiner. At the same rate, she would perform 2,800 identifications over fourteen years, and it would take only ten examiners to perform 28,000 identifications over that time—roughly the number needed for 25–30 false positives assuming a 0.1% false positive rate. Because there are undoubtedly many more than ten latent fingerprint examiners in the United States, the examiner's statement is, if anything, indicative of a false positive rate lower than 0.1%.
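The "rudimentary math" in footnote 5 can be reproduced in a few lines. The sketch below uses only the figures stated in the footnote; the variable names are illustrative, and the assumption that every examiner works at Ruth's rate for the full fourteen years is the footnote's own simplification, not an established fact.

```python
# Figures stated in footnote 5.
identifications_by_ruth = 1_200   # identifications Ruth testified to
years_as_examiner = 6             # period over which she made them
board_period_years = 14           # span covered by the examiner's statement
examiners = 10                    # pool size posited in the footnote
false_positive_rate = 0.001       # 0.1%, from the May 2011 study

# Assumes a constant workload, identical for every examiner.
per_examiner_per_year = identifications_by_ruth / years_as_examiner
per_examiner_total = per_examiner_per_year * board_period_years
total_identifications = per_examiner_total * examiners
expected_false_positives = total_identifications * false_positive_rate

print(f"identifications per examiner over {board_period_years} years: "
      f"{per_examiner_total:.0f}")
print(f"identifications across {examiners} examiners: "
      f"{total_identifications:.0f}")
print(f"false positives expected at a 0.1% rate: "
      f"{expected_false_positives:.0f}")
```

Raising `examiners` to a larger number while holding the 25 to 30 reported misidentifications fixed is what drives the footnote's inference that the implied rate would fall below 0.1%. The inference also assumes that the board member knew of every misidentification in the period.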

The court acknowledges that, as Ruth testified, historical error rates do not necessarily reflect the predictive error rate that should be applied to identifications made by any given examiner. See also United States v. Cordoba, 194 F.3d 1053, 1059 (9th Cir.1999) (discussing polygraphs). However, there is no evidence in the record to suggest that the rate of misidentifications made by latent fingerprint examiners is greater than an average of 0.1%, and some evidence suggests that the rate is much lower than that figure. The court therefore concludes that the error rate favors admission of latent fingerprint evidence. Accord, e.g., United States v. John, 597 F.3d 263, 275 (5th Cir.2010); Mitchell, 365 F.3d at 241; see also United States v. Chischilly, 30 F.3d 1144, 1154–55 (9th Cir.1994) (“conclud[ing] that there was a sufficient showing of low error rates” in DNA profiling despite arguments that “the potential rate of error in the forensic DNA typing technique is unknown”) (internal quotation marks omitted).

D. Standards

The fourth factor discussed in Daubert and Kumho is whether standards control a technique's operation. It is not disputed that the ACE–V methodology leaves much room for subjective judgment. According to Ruth, a fingerprint examiner must decide whether a latent print has value for comparison. Then, if the print does have value, the examiner must choose both the points at which to begin her comparison of the latent print to the known print and the sequence of comparisons to make from those starting points. Examiners in the United States also make a subjective decision at the evaluation stage; it is up to the examiner to determine whether or not prints match on the basis of the comparison and her experience and expertise. The verifying examiner then makes the same subjective evaluation as the original examiner. Love suggests that the standards factor weighs against admission because these subjective elements in the ACE–V process imply a lack of specific, objective standards that guard against examiner bias and error.

The standards used to guide an examiner's judgment vary across laboratories. There is not a single national standard. In part for this reason, and in part because of the subjective judgments made during the ACE–V process, the court acknowledges that the standards used in fingerprint analysis “are insubstantial in comparison to the elaborate and exhaustively refined standards found in many scientific and technical disciplines.” Mitchell, 365 F.3d at 241.

The court finds, however, that various standards imposed by the FBI's latent fingerprints unit, which conducted the analyses relevant to this case, sufficiently safeguard against bias and error. The FBI uses three different types of standards in an effort to reach reliable conclusions through the ACE–V process. At the laboratory level, the FBI adheres to the standards for calibration laboratories provided by the International Organization for Standardization and the standards for forensic laboratories promulgated by the American Society of Crime Laboratory Directors Laboratory Accreditation Board (“ASCLD/LAB”). See Prime, 431 F.3d at 1153–54 (noting that a laboratory conformed to ASCLD/LAB standards). The FBI also follows the consensus-based guidelines for laboratories engaged in friction ridge analysis issued by the Scientific Working Group on Friction Ridge Analysis Study and Technology.

At the level of individual examiners, the FBI applies relatively stringent standards for qualification. In addition to a college degree, examiners must have a significant amount of training in the physical sciences. They are then given eighteen months of FBI training and a four-day proficiency examination. After passing that test, an examiner has a six-month probationary period in which every aspect of her work is fully reviewed. Even thereafter, examiners undergo annual audits, proficiency tests, and continuing education.

Finally, at the level of individual comparisons and evaluations, the ACE–V methodology provides procedural standards that must be followed, such as the requirement that examiners assess each ridge and ridge feature in the prints under comparison. See United States v. Crisp, 324 F.3d 261, 269 (4th Cir.2003). The FBI has also recently incorporated documentation requirements that record the process used to detect latent prints as well as the examiner's comparison of the latent print to a known print. These documentation requirements and other procedures are enforced through the technical and administrative review of examiners' work. And the verification process serves as an error-reducing backstop.\6/

6. Under certain relatively rare circumstances, the FBI now performs verifications that are blind to both the identity of the original examiner and the conclusion that examiner reached. Blind verification was used for only one print at issue in this case.

In short, despite the subjectivity of examiners' conclusions, the FBI laboratory imposes numerous standards designed to ensure that those conclusions are sound. See United States v. Llera–Plaza, 188 F.Supp.2d 549, 571 (E.D.Pa.2002) (concluding that the subjective elements in latent fingerprint analysis are of a significantly “restricted compass”). The court therefore concludes that this factor weighs in favor of admission.\7/

7. Love's brief points to an ongoing controversy regarding whether a minimum number of matching details should serve as a prerequisite to the identification of a latent fingerprint. Ruth testified that a minimum points standard would be misguided, because three matching details in an area of the finger containing relatively few characteristics can be more telling than, say, eight matching points in a detail-rich area. In holding that the standards factor supports admission, the court does not attempt to resolve this dispute. The court instead concludes that other standards unrelated to a minimum points standard provide sufficient guidance to satisfy Daubert, Kumho, and Rule 702.

E. General Acceptance

The final factor discussed in Daubert and Kumho is whether a technique is generally accepted in a relevant scientific community. Love argues that the NAS report's criticisms of latent fingerprint analysis in general and the ACE–V methodology in particular demonstrate that friction ridge analysis is not accepted in the relevant scientific community. That assertion contains a kernel of truth. The NAS report does demonstrate some hesitancy in accepting latent fingerprint analysis on the part of the broader scientific community. Love's claim is subject to two significant qualifications. First, the NAS report itself states that “friction ridge analysis has served as a valuable tool, both to identify the guilty and to exclude the innocent.” NAS Report at 142. Instead of a full-fledged attack on friction ridge analysis, the report is essentially a call for better documentation, more standards, and more research. Cf. Chischilly, 30 F.3d at 1154 (“[T]he mere existence of scientific institutions that would interpret data more conservatively scarcely indicates a ‘lack of general acceptance’ ....”).

Second, Love does not dispute that the forensic science and law enforcement communities strongly support the use of friction ridge analysis. Acceptance in that narrower community is also relevant to the Daubert inquiry. See, e.g., Baines, 573 F.3d at 991 (stating that “acceptance of other experts in the field should ... be considered”); Mitchell, 365 F.3d at 241 (“[W]e consider as one factor in the Daubert analysis whether fingerprint identification is generally accepted within the forensic identification community.”); see also Prime, 431 F.3d at 1154 (affirming the admission of handwriting evidence in part because the district court “recognized the broad acceptance of handwriting analysis and specifically its use by such law enforcement agencies as the CIA, FBI, and the United States Postal Inspection Service”). For both of these reasons, the court concludes that the general acceptance factor at least weakly supports the admission of latent fingerprint evidence.

F. Relationship to Established Techniques

“[A] court assessing reliability may consider the ‘novelty’ of the new technique, that is, its relationship to more established modes of scientific analysis.” Downing, 753 F.2d at 1238. Insofar as such a relationship exists, “the scientific basis of the new technique [may have] been exposed to” indirect “scientific scrutiny.” See id. at 1239. The Third Circuit held in Mitchell, and Ruth testified, that friction ridge analysis is related to the undisputedly scientific “fields of developmental embryology and anatomy,” which explain “the uniqueness and permanence of areas of friction ridge skin.” Mitchell, 365 F.3d at 242. Love argues that this strong tie to the biological sciences is irrelevant, because “‘uniqueness and persistence are necessary’” but not sufficient conditions for friction ridge analysis to be reliable. Mot. at 67 (quoting NAS Report at 144). Even if uniqueness and persistence alone cannot validate latent fingerprint analysis, however, it remains true that “[i]ndependent work in [developmental embryology and anatomy] bolsters” two critical “underlying premises of fingerprint identification.” Mitchell, 365 F.3d at 242. This factor weighs in favor of admission.

G. Non–Judicial Applications

“[N]on-judicial use of a technique can imply that third parties ... would vouch for the reliability of the expert's method.” Id. at 242–43. Ruth testified that friction ridge analysis is used for various other purposes, including to identify disaster victims, to identify newborns at hospitals, for biometric devices, for some passports and visas, and for certain jobs. See Hearing Ex. 1, at 64–65. Love stresses that these non-judicial applications generally use rolled and not latent prints. Ruth testified, however, that latent prints are used to identify disaster victims when rolled prints are unavailable. See also Baines, 573 F.3d at 990 (noting the use of latent prints in this context). But see Mitchell, 365 F.3d at 243 (stating that post-disaster identifications “differ from latent fingerprint identification because [those] identification[s] us[e] actual skin[, which] eliminates the challenges introduced by distortions”). On the basis of the widespread use of fingerprints, and occasional use of latent prints, for non-judicial identification purposes, the court concludes that this factor modestly supports the admission of latent fingerprint evidence.

H. Summary

The court recognizes that the NAS Report and other publications cited by Love critique some aspects of latent fingerprint analysis. However, the forensic science community generally and the FBI in particular have begun to take appropriate steps to respond to that criticism. On this record, in part because of recent developments regarding testing, publication, error rates, and the FBI's governing standards, none of the seven factors discussed by the parties weighs against the admission of latent fingerprint evidence. Friction ridge analysis is not foolproof, but it is also far removed from the types of “junk science” that must be excluded under Rule 702, Daubert, and Kumho. Considering and weighing all of the factors, the credible testimony of Ruth, and the written submissions by both parties, the court concludes that latent fingerprint analysis is sufficiently reliable to be admitted. The court denies Love's motion to exclude the testimony of Robin Ruth insofar as that motion is based on the supposed unreliability of latent fingerprint analysis generally. See, e.g., John, 597 F.3d at 276 (affirming the admission of latent fingerprint evidence); United States v. Pena, 586 F.3d 105, 110–11 (1st Cir.2009) (same); Baines, 573 F.3d at 992 (same); Mitchell, 365 F.3d at 245–46 (same); Crisp, 324 F.3d at 269 (same); United States v. Hernandez, 299 F.3d 984, 991 (8th Cir.2002) (same); United States v. Havvard, 260 F.3d 597, 601 (7th Cir.2001) (same); see also United States v. Sherwood, 98 F.3d 402, 408 (9th Cir.1996) (holding that it was not error to admit fingerprint evidence when the party challenging the evidence admitted that several Daubert factors were satisfied).

III. The Evidence in this Case

Love's motion to exclude the specific evidence the government wishes to introduce in this case also implicates the questions of whether Robin Ruth is qualified to testify as an expert and whether Ruth's analyses are reliable.

There can be no doubt that Ruth is qualified to testify as an expert in latent fingerprint analysis. Since October 2010, Ruth has served as the standards and practices program manager in the FBI's latent prints unit, meaning that she is tasked with overseeing the quality control efforts in that unit. She previously spent five years as an FBI fingerprint examiner; since joining the FBI, she has conducted over 150,000 comparisons and has made roughly 1,200 fingerprint identifications. Ruth possesses degrees in genetic engineering and in forensic and biological anthropology, see Opp'n, Ex. A, at 2, has co-authored two articles about friction ridge analysis, and routinely teaches and lectures about the field, see id. at 4–5.

Love argues that Ruth should not be allowed to testify because she is not certified by the International Association for Identification (“IAI”).\8/ Love's contention relies on the NAS Report, but that report does not state that all examiners should be IAI-certified. It instead simply recommends that forensic scientists be accredited by some outside body, and suggests that the not-yet-existing National Institute of Forensic Science should “determin[e] appropriate standards for accreditation and certification.” See NAS Report 208, 215. Rule 702 requires only that an expert be “qualified as an expert by knowledge, skill, experience, training, or education.” “No specific credentials or qualifications are mentioned,” and the Ninth Circuit has “held that an expert need not have [any] official credentials in the relevant subject matter to meet Rule 702's requirements.” United States v. Smith, 520 F.3d 1097, 1105 (9th Cir.2008). The court rejects Love's challenge to Ruth's qualifications.

8. Ruth has never taken the IAI's certification test for fingerprint examiners, but she is an IAI member, and she passed the FBI's own proficiency test.

Love does not argue that the evidence in this case is less reliable than other latent fingerprint evidence, and there is no reason to believe it is. The fingerprint evidence in this case was analyzed using the ACE–V methodology and the FBI's standard operating procedures. See Sherwood, 98 F.3d at 408 (noting that the examiner's “technique [was] the generally-accepted technique for testing fingerprints”). Ruth is actually the third FBI examiner to analyze the evidence at issue: After an examiner performed the initial comparison and evaluation, those steps were verified by the senior examiner who performs technical reviews for the FBI's latent print unit. Ruth then re-examined the prints. All three examiners reached the same conclusions regarding each of the fifteen prints at issue. Ruth also testified that no recourse to Level 3 details (the very small details in a print like pores, which Love contends are easily misinterpreted) was necessary to reach or support her conclusions. In short, nothing about the latent prints in this case suggests that Ruth's conclusions are less reliable than other conclusions reached using the ACE–V method as implemented by the FBI. Those conclusions are therefore sufficiently reliable to be admitted into evidence.\9/ Of course, Ruth will be subject to cross-examination about her background, methods, analysis, conclusions, and latent fingerprint analysis generally.

9. It is undisputed that the fingerprint evidence—which includes evaluations of latent prints taken from literature related to the construction of pipe bombs—is highly relevant to this case.

For these reasons, it is ORDERED that Love's motion to exclude the testimony of Robin Ruth is DENIED.

Friday, June 15, 2012

A trial court in Vermont has gone where no court has gone before. In State v. Abernathy [1], Chittenden Superior Court Judge Alison Sheppard Arms found that because "[s]ix CODIS loci ... have associations with an increased risk of disease or have functional properties," the custodians of law enforcement DNA databases can make "probabilistic predictions of disease." According to the judge, modern research has established that "some of the CODIS loci have associations with identifiable serious medical conditions," making the scientific evidence "sufficient to overcome the previously held belief[s]" about the innocuous nature of the CODIS loci.

Emphasizing this finding that richly information-laden STR profiles reside in identification databases, the court proceeded to strike down "Vermont's new pre-conviction DNA testing requirement ... that requires submission of a DNA sample from a 'person for whom the court has determined at arraignment there is probable cause that the person has committed a felony ... .'" In an atypical opinion, the court applied a "special needs" balancing test, placed the burden of proof on the state, and held that this law violates the state constitution.

A major theme in Judge Arms' discussion of human genetics is that there has been a revolution in our understanding of what used to be called "junk DNA." Even though the CODIS loci originally were described as "junk" in "good faith," that understanding was wrong--we now know that even DNA that does not code for proteins is biologically important.1 Other judges, advocacy groups, and at least one law professor have jumped from the discovery that the triplet code for proteins is not the sole message inscribed in DNA to the conclusion that all the CODIS loci may well convey significant information about disease states or propensities.

There are a couple of problems with this reasoning. All that we actually know is that some non-protein-coding DNA regulates gene expression. Scientists do not believe that all non-protein-coding sequences are regulatory. In particular, whether noncoding, nontranscribed, and largely nonconserved sequences are part of a regulatory system (even if their presence might have some function) is far from established.2 The opinion cites an essay I wrote making this point [3] but then ignores its content. It quotes the legal treatise, Modern Scientific Evidence, for the view that "while it is generally agreed that no single loci [sic] contains a gene that definitively determines any discernible characteristic of significance, there are nonetheless indications that they may play a role in some sensitive matters, and continued debates about their importance." Before Abernathy, it appeared that the "continued debates" ended five years ago with agreement with what already was clear -- that even if the loci do not play a functional role, they might, like certain fingerprint patterns or blood types, have some statistical associations with diseases.3

Venturing beyond the inconclusive generalities like these, Abernathy refers to the biomedical literature on five loci and to a testifying expert's characterization of the literature (with no specific references) on another locus. The opinion does not give the magnitude of any putative association, let alone any measure of predictive utility.4 It uses the following phrases: "a fairly large effect size," "a modest association," "not the most strongly associated," "small but ... not zero,"5 and "cannot find that this marker has no association." It does not provide measures of the uncertainty in these estimates. Finally, the opinion does not discuss the extent to which the studies said to show that the associations are real have been replicated.6

Of course, few judges could confidently review the flood of studies on human genetics. Unlike some previous opinions and law review articles, however, this opinion does not rely entirely or largely on newspaper headlines and stories about "junk DNA." Here, the iconoclastic findings came after an evidentiary hearing. But, as has happened before with DNA evidence [8], the evidentiary hearing was one-sided. The defendants presented the testimony of Professor Gregory Wray of Duke University, a specialist in genetics and evolutionary biology, and the state chose not to present an expert in medical genetics or genomics to counter his testimony. Although Professor Wray reviewed the biomedical literature before he testified, the defense submitted no written report, and the state rather than the defense introduced the papers cited in the opinion as exhibits. Scanning the testimony, it seems to me that Dr. Wray never was asked a series of critical questions:

Is it generally accepted that the associations he pointed to apply to the population of individuals whose DNA is placed in law enforcement databanks?

Assuming that they do apply to that population, what is the positive and negative predictive value of any inferences about disease based on the CODIS alleles?

How would the predictive or diagnostic disease-related information in a state DNA database compare to that of (a) color photographs, (b) fingerprints, (c) the blood types used in old-fashioned serology, and (d) the HLA-A and HLA-B haplotypes once used in parentage testing?

Are the CODIS genotypes likely to be substantially more predictive in the future?

Until these questions are answered, there is reason to ask whether the trial court's findings fairly represent the scientific status quo or instead are grim predictions of what could come to pass.

2. According to Judge Arms, "[t]he term 'junk DNA' was coined in the early 1980s." In fact, the phrase normally is attributed to Susumu Ohno, who used it in the title of a 1972 paper [2]. Ohno did not reason that "we don't know what noncoding DNA does, therefore, it is useless junk." Indeed, he proposed that the duplication and inactivation of genes produce non-protein-coding DNA (now designated pseudogenes) that might have a function. A video introducing Ohno and reading an excerpt from the paper about the role of the noncoding sequences as "spacers" with evolutionary importance can be found at http://www.youtube.com/watch?v=nomI35DJB40&noredirect=1. Since 1972, other possible functions for noncoding DNA have been proposed. Some functions imply that the sequences should be conserved as one species evolves into another. Others, such as Ohno's suggestion that noncoding sequences act as buffers between genes, do not.

3. See [5, p. 228] (referring to "a brief debate in the legal literature" necessitated by "a misunderstanding by Simon Cole over some of the things I [John Butler] had written in a review article on STR markers" and emphasizing that "STR markers used for human identity testing do not predict disease."). One source of confusion, which also infects the Abernathy opinion, is the thought that a statistical association between a locus and a disease detected in a family study in, say, Northern India, establishes that the same association exists throughout the United States population.

4. Even a strong association (large relative risk) would not make for a useful predictive test if the prevalence of the condition is very small. See [3].

5. The sentence "[t]he relative risk of developing schizophrenia associated with this marker is small but it is not zero" is technically flawed. A relative risk of 1, not 0, would express the absence of any association.
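The arithmetic behind notes 4 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `risk_by_genotype` and every number in the example (a relative risk of 5, a prevalence of 1 in 1,000, an allele carried by 10% of the population) are hypothetical and are not drawn from the opinion or the studies it cites.

```python
def risk_by_genotype(relative_risk, prevalence, carrier_freq):
    """Split an overall disease prevalence into the risk for carriers and
    non-carriers of an allele, given the relative risk between the groups.

    Hypothetical helper. It uses the identity
        prevalence = carrier_freq * RR * r0 + (1 - carrier_freq) * r0,
    where r0 is the risk among non-carriers.
    """
    r0 = prevalence / (carrier_freq * relative_risk + (1.0 - carrier_freq))
    return relative_risk * r0, r0


# Illustrative numbers only: a "strong" association (RR = 5) with a rare condition.
carrier_risk, noncarrier_risk = risk_by_genotype(5.0, 0.001, 0.10)
print(f"risk among carriers:     {carrier_risk:.4%}")
print(f"risk among non-carriers: {noncarrier_risk:.4%}")

# A relative risk of 1 means no association: both groups share the overall prevalence.
same_a, same_b = risk_by_genotype(1.0, 0.001, 0.10)
print(f"RR = 1 gives {same_a:.4%} and {same_b:.4%}")
```

With these made-up inputs, well under 1% of carriers would develop the condition, so knowing the genotype would predict very little about any one individual even though the relative risk is large.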

Tuesday, June 12, 2012

As we have seen, the fingerprint analyst who testified about a match between the two ten-print cards from the man named Acosta-Roque and the two from the man named Pecheca-Aromboles said next to nothing about her examination in the particular case. Anthony Capece, Assistant Chief Counsel for DHS, asked only the following very general questions about the methodology:

Q. Okay now, how do you — if you can answer this, how do you actually identify someone by means of fingerprints?
A. I have two impressions that I'm comparing, I use — I personally use two fingerprint magnifiers and two pointers is what we call them, and I'll have a glass over each impression, and I use the pointer to what I call walk through the ridges and count my points.
Q. Okay now, is there a set number of points that needs to compare in order for you to be able to say that this is the same person?
A. There is no established number. I normally would not want to sign off on anything with less than eight, but there is no set established policy for minimum.
Q. Okay, is there an amount that's required by criminal courts?
A. No, there's not.
Q. Okay, so eight would be acceptable in criminal courts then?
A. Yes. Yes, if you sign off on it, you're saying that that's the same person.

It is impossible to discern from the answers to these poorly framed questions how complete the comparisons among the four cards were. They could have been exceedingly thorough or they could have been disturbingly cursory. The fingerprint analyst, Susan Blei, has yet to respond to emails seeking clarification.

The amicus brief (2012) of "scientists and scholars" offers the following guidance:

Although the record is not altogether clear on this point, it is possible that the examiner based this conclusion on eight corresponding ridge details from a single finger. Amici are aware of no research that would support concluding from eight corresponding details that “this is the same person.” Indeed, recent research suggests that while some comparisons showing eight corresponding characteristics may well have significant probative value, the probability of observing such number of characteristics in agreement between fingerprints from unrelated individuals is not negligible.28 The scientific data collected to date clearly support the general value of fingerprint characteristics for constituting very strong evidence in favor of a same-origin hypothesis, but are in significant contradiction with the claims made by the expert during her testimony.

I wonder if, like the examiner’s testimony, this description of "scientific data" is subject to a charge of overclaiming. Just what do the data show about “the probability of observing such number of characteristics in agreement between fingerprints from unrelated individuals” and the probative value of most eight-point matches? Indeed, how would one measure the probability of a false eight-point match and the probative value of an eight-point match?

An experiment could investigate how often fingerprint examiners detect eight (or more) matching points among impressions of the same fingers (mates) as opposed to different fingers (non-mates). This experiment could provide an estimate of the probability of an eight-point match for different fingers and show that it "is not negligible” (if that is the case). Moreover, if eight-point matches turned out to be not much more common among pairs of impressions from the same finger than among pairs from different fingers, an eight-point match would not give strong support to the hypothesis that the marks come from mates. Thus, the ratio of the probability of a match with a mate to the probability with a non-mate—a quantity known as the “likelihood ratio” (LR)—could tell us whether the probative value of such matches is high or low.
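To make the ratio concrete, here is a minimal Python sketch of how the likelihood ratio would be estimated from such an experiment. The function name and all of the counts are hypothetical. No study of human examiners applying an eight-point rule is cited in the brief, so the numbers only show how the calculation behaves.

```python
import math


def likelihood_ratio(mate_matches, mate_pairs, nonmate_matches, nonmate_pairs):
    """Estimate LR = P(match | same finger) / P(match | different fingers)
    from counts in a hypothetical examiner experiment."""
    p_match_given_mates = mate_matches / mate_pairs
    p_match_given_nonmates = nonmate_matches / nonmate_pairs
    if p_match_given_nonmates == 0:
        # No false matches observed: the point estimate is unbounded.
        return math.inf
    return p_match_given_mates / p_match_given_nonmates


# Invented scenario A: matches are common for mates and rare for non-mates.
print(likelihood_ratio(950, 1000, 5, 1000))    # a large LR, strong support

# Invented scenario B: matches are nearly as common for non-mates as for mates.
print(likelihood_ratio(600, 1000, 400, 1000))  # an LR near 1, weak support
```

In scenario A an eight-point match would be far more probable for mates than for non-mates. In scenario B it would barely discriminate between the two hypotheses.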

The amicus brief cites no data on eight-point matches of human examiners. The footnote accompanying the claim that the probability of eight-point matches from different fingers is “not negligible” cites three studies. The researchers computed quantities akin to a likelihood ratio according to an algorithm and model of distortion in impressions for different numbers of points (even if a human would not call them matches). This is great research, but it does not investigate the properties of an eight-point matching rule as employed by fingerprint examiners, at least not directly.

Still, it might be suggestive, which is all that the brief claims. So what do the papers reveal about the probabilities for eight-point non-mates versus eight-point mates? Are only “some” likelihood ratios large? Are probabilities of non-mate matches “not negligible”? The first paper (Neumann et al. 2006) is confined to three minutiae. Hence, it is largely beside the point.

The second paper (Neumann et al. 2007) reports that computed LRs with 8 minutiae rarely exceeded 1 for prints from different sources (non-mates). The percentages of LRs in this range in different types of patterns are 1.2%, 0.5%, 0.3%, and 0.1% (Table 3). In contrast, the LRs exceeded 1 for prints from the same fingers (mates) 99.6% of the time (Figure 11). Of course, an LR only slightly greater than 1 has little probative value, but the LRs for mates exceed 10,000 nearly 90% of the time, judging from Figure 11.

In the third paper (Neumann et al. 2012), the separation in the distributions of the LRs for eight minutiae as between mates and non-mates was even sharper. The LRs almost always exceeded 1,000 for mates, almost never reached this level for non-mates, and never reached 10,000 for non-mates. The text accompanying Figure 3 refers to only a “few occasions when LR(Hd) [the LR for non-mates] exceed[ed] 1" and suggests that even this small number is inflated “when it is borne in mind that the prints have been retrieved as the closest among 600 million.” Later, the paper states that “very few instances of LR(Hd) > 1 were observed” and “the probability of misleading evidence ... is very low.” Moreover, the “wider range of features [used] by examiners would make their examination more discriminating.”

Thus, the research seems more supportive of the value of an eight-point match than the brief's suggestion that significant probative value is merely a sometimes thing for such a match. “Significant probative value” seems to characterize not just “some comparisons” but the vast majority of them. Likewise, the research indicates that different fingers match at eight points at a low rate. There could be some false eight-point matches, but this hardly means that an eight-point matching rule would produce a major proportion of false positives. (And even a small proportion would be of concern.)

So is it true that “[t]he scientific data collected to date are in significant contradiction with the claims made by the expert during her testimony”? Well, the answer is both yes and no. The research actually seems to support the examiner’s good feelings about eight matching minutiae when making a positive identification. At the same time, the studies do limit how positive a knowledgeable examiner can be about an eight-point identification. Two of the studies suggest that the examiner’s degree of confidence in the claim that no one else in the world has a finger that could produce eight or more similar minutiae was unfounded.

Of course, for the reference in the testimony to eight points of comparison to have much bearing on the specific identification in issue, the principal examiner would have had to have made her identifications among four ten-print cards using only one of the ten fingers and stopping upon finding eight corresponding points in the impressions of that one finger. In addition, the second verifying examiner also would have had to have used only eight features—and the very same ones at that. The amicus brief treats this as a possibility worth considering because the record does not “affirmatively” show otherwise. At the same time, the brief recognizes the distinct possibility “that the examiner relied upon a much greater number of corresponding details than the figure of eight mentioned in the record” (but never mentioned as having been used in the case itself).

The brief's response to this obvious possibility is curious. The scientists and scholars state that “[i]f this is the case, however, there is no information available regarding how the forensic expert witness adjusted her assessment of the probative value of the evidence in light of this greater amount of discriminating information.”

But why would an expert who believes that there is zero probability of a false match for 8 points have to “adjust her assessment” when there are, say, 32 points? Suppose I assert that (1) the probability that all the air molecules around my mouth and nose will happen to move away and stay away for one second is essentially zero, and (2) the chance of a temporary vacuum lasting two seconds also is virtually zero. The second statement is neither false nor misleading because of the first statement. Similarly, it makes little sense to demand that a fingerprint examiner who would give an opinion that the chance of a false match is negligible for eight features of the sort seen here has to “adjust[] her assessment” when there are more such features. Naturally, her confidence would increase with more data pointing to the same conclusion. Once again, the real problem with the testimony is not the examiner’s personal rule of thumb. The problem lies with the protestation of 100% confidence in the first place.

What can an examiner who is sensitive to the growing base of scientific research into her craft do to explain the striking similarities in a pair of fingerprints? Here is a hypothetical dialogue:

Q. What did you find when you compared the prints?
A. I observed that the overall patterns and the minutiae were very similar. The prints had the same general appearance and the same detailed features.
Q. And what does that signify?
A. Well, all these similarities strongly support the conclusion that the marks come from the same finger rather than from two different fingers.
Q. How strongly do they support the fact they are from the same finger?
A. I cannot give you a simple number, but patterns like these are much more likely to arise from prints of one and the same finger than from impressions of different fingers.
Q. How certain are you that they are from the same finger?
A. I did not say they came from the same finger. All I can tell you is that they have the definite appearance of coming from the same finger. They do not look like a pair of marks coming from different fingers.
Q. How certain are you of this?
A. I am very, very certain.

References

Brief of Scientists and Scholars of Fingerprint Identification as Amici Curiae in Support of Petitioner and in Favor of Reversal, Acosta-Roque v. Holder, No. 11-70705 (9th Cir., Mar. 8, 2012).

Sunday, June 10, 2012

Back in March, Who Is Nelson Acosta-Roque? (Part I) described the pending appeal of an Alaska waiter being deported because the government believes he is really Victor Antonio Pecheca-Aromboles — a previously deported Pennsylvania cocaine dealer from the Dominican Republic who changed his name and returned to the U.S.
The Department of Justice resolved the question of the true identity on the basis of “trust-me testimony.” Below are some key questions and answers in the hearing in 2010, along with some thoughts about "100% certain" testimony. Susan R. Blei, supervisor of Alaska’s Criminal Records and Identification Bureau, is testifying over the telephone:

Q. [W]hat is your occupation?
A. I'm a fingerprint analyst and I supervise the criminal records and identification bureau.
* * *
Q. Okay now, how do you — if you can answer this, how do you actually identify someone by means of fingerprints?
A. I have two impressions that I'm comparing, I use — I personally use two fingerprint magnifiers and two pointers is what we call them, and I'll have a glass over each impression, and I use the pointer to what I call walk through the ridges and count my points.
* * *

Comment: That’s it? Just count the number of Level 2 details in common, with no concern for which ones are common and which ones are rare? How many common points were there in this case? How many fingers were compared? The lawyer for the Department of Homeland Security and the immigration judge never thought to ask. And neither did Mr. Acosta-Roque, who was representing himself. You would think that Ms. Blei knows, but she has not responded to my emails.

Q. What is the likelihood of two people possessing the same fingerprints?
A. That won't happen. No two people have the same fingerprints.
Q. Okay, is that — and how — I mean, how do you know that? Is there a scientific study?
A. There have been scientific studies, yes.
Q. So even identical twins wouldn't have the same fingerprints?
A. No, they would not.
* * *

Comment: These answers attracted the scorn of the 39 scholars and scientists, whose amicus brief stated that “The record does not report what studies the expert was referring to, and amici have found no study that proves that ‘No two people have the same fingerprints.’ Amici are skeptical that it would be possible to even conduct a study that proved that proposition." [1]

Impossible to prove? Well, it cannot be proved directly, by a census of all fingerprints that exist, that have existed, and that will exist [2], but studies of identical twins show that their fingerprints are distinguishable, and statistical analyses purport to show very small probabilities of duplicating a fingerprint pattern. For example, the NIST report, released in February, observes that “[o]ne study reports that in 100,000 randomly chosen fingerprints of exemplar quality, there is only a 10⁻¹⁴ probability that some pair of them will match in regard to both minutiae and ridge shape.” [3] Of course, there are more than 100,000 fingerprints in the world’s population, but the same generative model estimates the probability of at least one match among 10 billion to be less than 0.0001. [4]
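For readers who want to see how the two figures hang together, here is a rough Python check. It is not the study's generative model. It reads the quoted figure as a probability of 1 in 10^14 for 100,000 prints, and it assumes that every pair of prints has the same small chance of matching and that pairs are independent. The function names are hypothetical.

```python
import math


def number_of_pairs(n):
    """Number of distinct pairs among n prints."""
    return n * (n - 1) // 2


def per_pair_probability(p_any, n):
    """Per-pair match probability q implied by a probability p_any of at
    least one match among n prints: solve 1 - (1 - q)**pairs = p_any."""
    return -math.expm1(math.log1p(-p_any) / number_of_pairs(n))


def prob_at_least_one_match(q, n):
    """Probability of at least one matching pair among n prints, assuming
    an identical, independent per-pair match probability q."""
    return -math.expm1(number_of_pairs(n) * math.log1p(-q))


q = per_pair_probability(1e-14, 100_000)
p_ten_billion = prob_at_least_one_match(q, 10_000_000_000)
print(f"implied per-pair probability:            {q:.3e}")
print(f"P(at least one match among 10 billion): {p_ten_billion:.3e}")
```

Under these simplifying assumptions, the result for 10 billion prints is of the same order as the figure quoted above. The agreement says nothing about whether the independence assumption itself is sound.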

To be sure, the fact that some scientific studies purport to establish the uniqueness of complete fingerprints does not conclusively prove the point. But it does make the statement that “[t]here have been scientific studies, yes,” understandable as a response to the lawyer’s question.

The problem with the expert’s answer, then, is not that it refers to a fact outside the domain of potential scientific knowledge (which is always tentative in the sense of being subject to revision) [5]. The problem is that it fails to inform the judge that the expert's belief can be and has been disputed. The assertion that "there have been scientific studies, yes" may be the literal truth, but it is one-sided, and an expert forensic scientist or technician has some responsibility to tell the whole truth. The NIST report explains:

During a trial, counsel’s questioning frames the expert’s presentation. The expert may not simply decide what information to discuss but must answer counsel’s questions. This can create some tension between the goal of being complete and the need to be responsive. In resolving this tension, “ethical considerations and professional standards properly place a number of constraints on the expert’s behavior.” One such constraint is “a requirement of candor. While an expert is ordinarily under no legal obligation to volunteer information, professional ethics may compel this ... when the expert believes that withholding information will change dramatically the picture that his … analyses, properly understood, convey.” Thus, the “guiding principles” proposed by the American Society of Crime Laboratory Directors/Laboratory Accreditation Board (ASCLD/LAB) advise forensic scientists to “attempt to qualify their responses while testifying when asked a question with the requirement that a simple ‘yes’ or ‘no’ answer be given, if answering ‘yes’ or ‘no’ would be misleading to the judge or the jury.” [3, p. 115 (notes omitted)]

In Acosta-Roque, it would not have been hard for the expert to answer, “Yes, there have been scientific studies that bear on the question of whether every person has distinguishable fingerprints. Although the conclusion is accepted in most forensic science textbooks and fingerprint identification manuals, some commentators have questioned it.”

The amicus brief also chastises the expert for not listing specific studies, stating that “forensic expert witnesses have an ethical obligation to specify to courts the studies they rely upon in forming their conclusions and should refrain from the dubious practice of invoking ‘studies’ without specifying what those ‘studies’ are.” However, experts should not be expected to cite studies on every general proposition that is part of their reasoning. Certainly, an expert should not try to bolster testimony with uncalled for references to unnamed “studies”; however, it is fair to answer a lawyer’s unanticipated question, “Is there a scientific study?” in the affirmative even without being able to name the study that the expert knows exists.

Here, the expert could have replied, “Yes, there have been scientific studies that bear on the question of whether every person has distinguishable fingerprints. Because I did not know you would be asking about these, I do not have a list of them at hand.”

Q. Were you able to obtain the Pennsylvania fingerprints?
A. Yes, I was.
* * *
Q. And did you compare all four sets of prints?
A. Yes, I did.
Q. And what was your conclusion upon comparing the prints?
A. They were made by one and the same individual.

Comment: This is source-attribution testimony for ten-print cards. It is based on the theory that an examiner can distinguish every possible pair of prints from the same individual’s ten fingers from every possible pair of ten-print cards from different individuals. The recent reports of the National Academy of Sciences and the NIST Expert Working Group emphasized in the amicus brief do not focus on this claim. Rather, they concern latent print identification, in which far less information is available. In that context, the NIST report concludes that “[g]iven the current state of scientific and professional knowledge, . . . it is best to avoid [source-attribution] testimony based on the theory of global general uniqueness.” [3, p. 138] The same point could be made for ten-print identifications. There are viable alternatives to source attributions on the part of the testifying expert and to reliance on the difficult-to-prove “theory of global general uniqueness.”

Q. Okay, now how did you go about comparing these sets of prints?
A. I began with the Pennsylvania prints and I compared them one at a time to each of the other cards.
Q. And why did you use the Pennsylvania prints?
A. They were the ones that had been requested. You could do it in any order. That was just the way I did it.
Q. Okay, and you determined that all four matched up to the same person?
A. Yes, they did.
Q. And did you have — was there any sort of back-up check done on this?
A. I did have another employee do comparisons on the very same cards, and that employee came to the same conclusion.
* * *

Comment: The record is skimpy about the verification. Was it blind? Appellate counsel for Acosta-Roque asserted that “Neither the initial analysis nor the verification were blind,” [6], but how does he know what the first examiner told the second? The shallow hearing reveals nothing other than that “another employee” looked at the “same cards and came to the same conclusion.”

Q. And so what are the chances that this could — that the fingerprints from Mr. Acosta-Roque maybe don't match Mr. Victor Arumboles?
A. They do match.
Q. Okay, so I mean, can you state — I mean, what are they — is there a percentage? Can you quantify it, or ...
A. No. No, I'm 100 percent sure the prints are the same and made by the same individual.
* * *

Comments: Despite the inarticulate questions, the witness professes 100% subjective certainty in her source attribution. The amicus brief denigrates all claims of certainty as irresponsible because human error is possible:

Claims of absolute certainty go beyond mere expressions of the degree of an expert’s confidence, and are, in essence, an unsupported form of bolstering the expert’s own testimony. In addition to failing to recognize the possibility of an erroneous match, they also fail to recognize that there might be other errors that could occur, such as mislabeling of a record within a database, or a misattribution of a source print to the incorrect individual. No human endeavor is always utterly error-free, and assertions of 100 percent certainty in this context are both misleading and scientifically unwarranted.

I know what the brief is driving at, but its rationale for eschewing "assertions of 100 percent certainty" is unconvincing. The analyst attributed the four prints she examined to the same source: "the prints [were] made by the same individual." This conclusion does not turn on how the exemplar prints were labelled.

As for the charge of "bolstering," a claim of total certainty is no less an “expression of the degree of an expert’s confidence” than is a claim of near certainty. Any assertion of a high degree of confidence could be overstated because of an inadequate perception of the risk of human error. Any assertion of great confidence “gives a boost to” a more tentatively stated conclusion that all the fingerprint impressions on the four cards come from the same set of ten fingers. There is nothing wrong with this type of "bolstering" if (1) the asserted degree of confidence is warranted, and (2) the expert acknowledges that, like all statements about the world, it is subject to some degree of doubt. An expert who testifies that “I am as sure about this as I am about my own name or the fact that the earth orbits the sun” also could be mistaken. Degree-of-confidence testimony is misleading only if it is misunderstood as a statement that the objective probability that the same fingers are represented on all the cards is the specified percentage.

That said, it would be best for experts to avoid 100% even as an expression of subjective probability. From a Bayesian perspective, this level of certainty corresponds to accepting infinite odds on a bet that the match is false. Would Ms. Blei be willing to accept eternal damnation and unbearable torture if she were proved wrong? Total certitude means that no contrary evidence, no matter how convincing, could change her mind. Thus, the use of 100% smacks of dogma or advocacy rather than a neutral statement of the evidence. Most likely, this extreme probability is what the amicus brief means by “bolstering.”
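The Bayesian point can be illustrated with a short Python sketch. The helper names and the numbers are hypothetical. The likelihood ratio here stands for new evidence, and a value far below 1 represents evidence that strongly contradicts the same-source hypothesis.

```python
import math


def odds(probability):
    """Convert a probability to odds in favor. Certainty means infinite odds."""
    if probability >= 1.0:
        return math.inf
    return probability / (1.0 - probability)


def update(prior, likelihood_ratio):
    """Posterior probability of the hypothesis after evidence with the given
    likelihood ratio (assumed to be greater than 0), by Bayes' rule."""
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


strong_contrary_evidence = 1e-9  # hypothetical LR heavily favoring a different source

print(odds(0.999999))                              # very large, but finite
print(odds(1.0))                                   # inf
print(update(0.999999, strong_contrary_evidence))  # drops to roughly 0.001
print(update(1.0, strong_contrary_evidence))       # stays at 1.0
```

A prior just short of 100% gives way to sufficiently strong contrary evidence. A prior of exactly 100% cannot move, whatever the evidence shows.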

The more fundamental and general issue, however, is whether an assertion of a number bordering on 100% for the subjective certainty of identity is “scientifically unwarranted.” Science certainly permits a witness to report probabilities conditioned on explicit assumptions. Do scientific studies permit an examiner to assert that, having extensively analyzed sets of ten-print cards, she is (nearly) 100% confident that the only plausible hypothesis to explain the comparable patterns is a common source? If so, was her examination sufficient to warrant such a claim? Without more information on the actual examination than the government and the expert deigned to provide, the last question cannot be answered.

Q. Okay, and you concluded that they're one and the same to the exclusion of all others?
A. Yes, sir.
Q. Okay and I just — one last time, how certain are you about your conclusion that these prints belong to the same individual?
A. I'm 100 percent certain.

Comment: Objection — asked and answered.

This is not the end of the testimony. I'll pick up the thread and wind it up soon with a discussion of the amici's criticism of the expert's desire to have at least eight matching points before making a source attribution.

References

1. Brief of Scientists and Scholars of Fingerprint Identification as Amici Curiae in Support of Petitioner and in Favor of Reversal, Acosta-Roque v. Holder, No. 11-70705 (9th Cir. Mar. 8, 2012).

3. NIST Expert Working Group on Human Factors in Latent Print Analysis, Latent Print Examination and Human Factors: Improving the Practice through a Systems Approach: The Report of the Expert Working Group on Human Factors in Latent Print Analysis 82 (David H. Kaye ed., 2012). The report adds that “[w]ithout the assumption of independence, however, the computed probability could be orders of magnitude higher.” Id.