Patrick L. Meehan, United States Attorney, Laurie Magid, Deputy United States Attorney for Policy and Appeals, Michael L. Levy, Assistant United States Attorney, Robert A. Zauzmer (Argued), Assistant United States Attorney, Paul A. Sarmousaki, Assistant United States Attorney, Senior Appellate Counsel, Eastern District of Pennsylvania, Philadelphia, for Appellee.

This appeal by Byron Mitchell from a judgment in a criminal case raises important questions concerning the admissibility of latent fingerprint identification evidence under Fed.R.Evid. 702. We adjudicate on the basis of a voluminous record developed at a Daubert hearing, see Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 113 S. Ct. 2786, 125 L. Ed. 2d 469 (1993), and explore in considerable detail the application of the various Daubert factors to the prosecution's expert testimony. We conclude that the testimony passes Daubert muster, and that there are "good grounds," id. at 590, 113 S. Ct. 2786, for its admission. In a related matter, we must decide whether the District Court properly took judicial notice that "human friction ridges are unique and permanent throughout the area of the friction ridge skin, including small friction ridge areas, and that... human friction ridge skin arrangements are unique and permanent." App. 1472a. We conclude that the District Court erred in taking judicial notice, but that the error was harmless.

We also consider Mitchell's contention that the District Court erroneously excluded from trial significant portions of his proffered expert testimony on the unreliability of latent fingerprint identification. Portions of the colloquies between the Court and counsel are less than pellucid, but we are satisfied that what the Court really did was to operate on a three-tier theory of what expert testimony was admissible: allowing (1) specific criticisms and (2) general reliability criticisms, but excluding (3) testimony about whether latent fingerprint identification is a "science." Within that framework, the exclusion of evidence that latent fingerprint identification is not a science was proper under Kumho Tire Co. v. Carmichael, 526 U.S. 137, 119 S. Ct. 1167, 143 L. Ed. 2d 238 (1999).

The final fingerprint-related issue concerns the putative withholding by the government of a Department of Justice solicitation for research proposals directed at validating the reliability of latent fingerprint identification. This solicitation, Mitchell contends, was not only improperly and intentionally withheld by the government in violation of its obligations under Brady v. Maryland, 373 U.S. 83, 83 S. Ct. 1194, 10 L. Ed. 2d 215 (1963), but would have been powerful evidence, not only substantively but also to impeach the government's expert witnesses who testified that latent fingerprint identification was a well-established discipline with a strong and well-verified foundation. The District Court concluded that the solicitation was not material under the "reasonable probability of a different outcome" standard of Brady and its progeny. We agree.

The remaining issue on appeal is whether plain error was committed by the admission of testimony that a key government witness gave a statement to the FBI and testified at a prior proceeding. Mitchell characterizes the admission of this evidence as improper under the hearsay rules, Fed.R.Evid. 801, 802. We conclude that testimony about the existence of a statement is not itself a "statement"; that the testimony was not "offered ... to prove the truth of the matter asserted," Fed.R.Evid. 801(c), and thus not inadmissible under Fed.R.Evid. 802; and that, at all events, the plain error standard is not met. We will therefore affirm the judgment.

A. The Offense and Mitchell's First Trial and Appeal

This case began in 1991 when two men with handguns robbed an armored car employee of approximately $20,000 as he entered a check cashing agency at 29th Street and Girard Avenue in North Philadelphia. The robbers then got into a beige car driven by a third person, engaging in gunfire with the armored car employees as they fled. The beige car, which had been stolen about an hour beforehand, was abandoned by the robbers roughly a mile from the agency. The government sought to prove at trial that the robbers were William Robinson (a/k/a "Bookie") and Terrence Stewart (a/k/a "T"), and that the getaway driver was Mitchell. According to the government, the robbery had a fourth participant, Kim Chester, who knew of the plans, helped case the robbery site, and assisted the others in spending the proceeds of the robbery. Chester testified for the prosecution at Mitchell's trial as an uncharged accomplice. Both Robinson and Stewart died before trial, and thus Mitchell was the sole defendant.

Mitchell was charged with conspiracy to commit and commission of Hobbs Act robbery, 18 U.S.C. § 1951, and use of and carrying a firearm during a crime of violence, 18 U.S.C. § 924(c). In the first trial, at which Mitchell was convicted of all counts, the government introduced into evidence an anonymous note that had been left in the front seat of the abandoned beige car, apparently written by someone who observed the robbers exiting the beige car and getting into a different car. The note read, "Light green ZPJ-254. They changed cars; this is the other car." On appeal, we held the note to be inadmissible hearsay not subject to any exception in Fed.R.Evid. 803. United States v. Mitchell, 145 F.3d 572 (3d Cir. 1998). In view of the limited other evidence connecting Mitchell to the robbery-Chester's testimony was questionable, no robbery proceeds were ever linked to Mitchell, and the fingerprints recovered from the beige getaway car were identified as Mitchell's but in poor condition-we concluded that admission of the anonymous note was not harmless error. Id. at 579-80. Accordingly, we vacated Mitchell's conviction and remanded for a new trial. Id.

Prior to the retrial, the District Court conducted a lengthy Daubert hearing on the admissibility under Fed.R.Evid. 702 of the government's expert testimony (and Mitchell's counter-experts) on the identification of fingerprints found on the gear shift lever and driver's side door of the beige getaway car. This hearing was to adjudicate a major attack mounted by Mitchell on the government's fingerprint evidence. As with any expert testimony, some background in the field and an introduction to the jargon is helpful, and so we discuss the field of latent fingerprint identification in general before turning to the particulars of the Daubert hearing.

1. The Field of Latent Fingerprint Identification

Criminals generally do not leave behind full fingerprints on clean, flat surfaces. Rather, they leave fragments that are often distorted or marred by artifacts, terms we explain in the margin.1 These "latent" prints-from the Latin lateo, "to lie hidden," because they are often not visible to the naked eye until dusted or otherwise revealed-are the typical grist for the fingerprint identification expert's mill. Testimony at the Daubert hearing suggested that the typical latent print is a fraction-perhaps 1/5th-of the size of a full fingerprint. App. 435a-436a. A "full" fingerprint is familiar to anyone who has been fingerprinted for identification or law enforcement reasons: It is the print made by rolling the full surface of the fingertip onto a fingerprint card or electronic fingerprint capture device. (These prints are, for obvious reasons, also referred to as "rolled prints" or "full-rolled prints.") A full set of full-rolled fingerprints on a card-as would be taken during a police booking, for example-is known as a "ten-print card." Ten-print cards usually also have space at the bottom of the card for "flat impressions" or "plain impressions," where all four fingers of the hand are pressed at once onto the card without rolling.

Rolled prints and latent prints alike are subject to artifacts and distortions, though the problems with latent prints are more acute because they are smaller, are left more carelessly than full-rolled prints, and are left on surfaces that many other fingers have also touched. Appellant Br. at 10-11. See Andre Moenssens et al., Scientific Evidence in Civil and Criminal Cases, § 8.08 at 514 (4th ed. 1995) ("Many latent impressions developed at crime scenes are badly blurred or smudged, or consist of partially superimposed impressions of different fingers.").

Fingerprints are left by the depositing of oil upon contact between a surface and the friction ridges of fingers. The field uses the broader term "friction ridge" to designate skin surfaces with ridges evolutionarily adapted to produce increased friction (as compared to smooth skin) for gripping. Thus toeprint or handprint analysis is much the same as fingerprint analysis. The structure of friction ridges is described in the record before us at three levels of increasing detail, designated as Level 1, Level 2 and Level 3. Level 1 detail is visible with the naked eye; it is the familiar pattern of loops, arches, and whorls. Level 2 detail involves "ridge characteristics"-the patterns of islands, dots, and forks formed by the ridges as they begin and end and join and divide. The points where ridges terminate or bifurcate are often referred to as "Galton points," whose eponym, Sir Francis Galton, first developed a taxonomy for these points. The typical human fingerprint has somewhere between 75 and 175 such ridge characteristics. Level 3 detail focuses on microscopic variations in the ridges themselves, such as the slight meanders of the ridges (the "ridge path") and the locations of sweat pores. This is the level of detail most likely to be obscured by distortions.

The FBI-the agency that made the primary identification in this case-uses an identification method known as ACE-V, an acronym for "analysis, comparison, evaluation, and verification." The basic steps taken by an examiner under this protocol are first to winnow the field of candidate matching prints by using Level 1 detail to classify the latent print. Next, the examiner will analyze the latent print to identify Level 2 detail (i.e., Galton points and their spatial relationship to one another), along with any Level 3 detail that can be gleaned from the print. The examiner then compares this to the Level 2 and Level 3 detail of a candidate full-rolled print (sometimes taken from a database of fingerprints, sometimes taken from a suspect in custody), and evaluates whether there is sufficient similarity to declare a match. In the final step, the match is independently verified by another examiner, though there is some dispute about how truly independent this verification is.

The standards used by the FBI at the evaluation stage of the ACE-V protocol are somewhat less concrete than the numerical descriptions found in television police dramas that extol "twenty-point matches" and the like. An n-point match refers to a match between an unknown latent print and a known full print in which the examiner has identified n corresponding Galton points in the correct geometry relative to one another. A number of jurisdictions both outside the United States and within seem to rely on a system where a minimum number of corresponding points must be found before a match may be declared, irrespective of Level 3 detail. See, e.g., 2 Paul C. Giannelli & Edward Imwinkelried, Scientific Evidence § 16-7(A), at 768 (3d ed. 1999) ("In France, the required number [of points for a match] used most often is 24 while the number is 30 in Argentina and Brazil."). Such jurisdictions are said to use a "point system." On the other hand, Canada does not have a minimum point threshold for identification, and the United Kingdom recently eliminated a minimum point threshold. See United States v. Llera Plaza, 188 F. Supp. 2d 549, 569-70 (E.D. Pa. 2002) (quoting Lord Lester of Herne Hill's colloquy with Lord Rooker). The alternative approach-which gained favor with the FBI in the late 1940s, App. 378a-is to use a combination of quantity and quality: If ridge characteristics are abundant, then the quality of Level 3 detail is unimportant; but a paucity of Galton points can be compensated for by high-quality Level 3 detail. While this has the advantage of allowing an examiner to find a match in situations where an examiner using a strict point-based standard would not find one, this flexibility comes at the price of substituting a degree of subjectivity for an objective numerical standard.
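By way of illustration only, the point system described above can be reduced to a minimal sketch. The function name and the sample count of corresponding points are hypothetical; only the thresholds of 24 and 30 are taken from the treatise passage quoted above. The sketch does not attempt to model the quantity-and-quality approach, which is described here as a judgment rather than a formula.

```python
# Minimum-point thresholds as quoted from the treatise passage above.
THRESHOLDS = {"France": 24, "Argentina": 30, "Brazil": 30}


def point_system_match(corresponding_points: int, minimum_points: int) -> bool:
    """Hypothetical point-system rule: declare a match only if the examiner
    has found at least `minimum_points` corresponding Galton points.
    Level 3 detail plays no role in this rule."""
    return corresponding_points >= minimum_points


# Hypothetical latent print in which 24 corresponding points were found.
found = 24
for jurisdiction, minimum in THRESHOLDS.items():
    print(jurisdiction, point_system_match(found, minimum))
```

Under such a rule the same comparison would qualify as a match under the French figure but not under the Argentine or Brazilian one, regardless of what the Level 3 detail shows.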

The District Court held a five-day hearing pursuant to Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 113 S. Ct. 2786, 125 L. Ed. 2d 469 (1993), to rule on the admissibility of the government's and Mitchell's proposed expert testimony. The record of this marathon hearing alone comprises nearly one thousand pages of testimony and a similarly voluminous array of exhibits. The government called six witnesses (plus one rebuttal witness), and Mitchell, four. The District Court found all the offered expert witnesses to be qualified in their respective fields, and neither party raises a challenge to the qualifications, as such, of the witnesses. Rather, both sides' issues lie with the content of the testimony accepted by the District Court. We briefly describe the areas of testimony of each of the witnesses, starting with the government's witnesses.

a. The Government's Experts

Steven Meagher, an FBI special agent, testified at the hearing about Level 1, Level 2, and Level 3 detail (as described above), and other aspects of fingerprint identification. With regard to the FBI's practices, technology, and operations, he testified about the ACE-V protocol; that the FBI does not rely on a minimum "points" standard for matching fingerprints (and why it does not); and about the Automated Fingerprint Identification System ("AFIS") computer system (which automates some preliminary aspects of fingerprint matching). Meagher also described a survey (which we discuss, infra) of state fingerprint identification agencies that he prepared and circulated for the purpose of demonstrating that the fingerprint match in this case was, by wide consensus, correct. He also described an experiment (which we also discuss, infra) designed and run in cooperation with the contractor for the FBI's AFIS computer system, Lockheed Martin, that would search a portion of the AFIS database for identical fingerprints. Donald Zeisig, of Lockheed Martin, and Bruce Budowle, a statistician and population geneticist with the FBI, were also involved in this experiment, and both testified at the Daubert hearing. Zeisig also testified in greater detail about the technical background of the AFIS computer system.

The government offered two witnesses focusing principally on the biological aspects of fingerprints. Dr. William Babler, of Marquette University, testified about the prenatal development of friction ridges, opining that unique arrangements of friction ridges develop in the womb within a matter of months after conception. He also testified to the medical community's accepted understanding of the anatomical and cellular bases for the permanence of friction ridge arrangements. Ed German, of the United States Army Criminal Investigation Laboratory, testified to the lack of similarity found between corresponding fingerprints of identical twins, a conclusion established by his own research on identical twins and confirmed by other studies of identical twins.

The government also offered David Ashbaugh, of the Royal Canadian Mounted Police, who testified broadly about the development, comparison, and identification of friction ridge skin and impressions. Like the other government witnesses who were examined on the matter (viz., Agent German, Agent Meagher, and Dr. Budowle) he responded that it was his opinion that friction ridge arrangements were unique (the "uniqueness proposition") and permanent (the "permanence proposition"), and that positive identifications can be made from fingerprints containing sufficient quantity and quality of ridge detail. Dr. Babler also opined that friction ridge arrangements are unique and permanent. These propositions were the foundation of the government's argument that latent fingerprint identification evidence satisfies Daubert.

The government conducted two experiments in anticipation of the Daubert hearing: (1) a survey of state fingerprint identification agencies asking them, inter alia, if they could match the latent prints in this case to Mitchell's ten-print card; and (2) a search for identical fingerprints using data in the AFIS computer system.2 The specifics of these experiments bear on their relevance as expert evidence, and so we describe them in some detail.

For purposes of this case, Meagher created a survey packet that was sent out to the principal law enforcement agency of each of the fifty states, plus the District of Columbia, Canada's Royal Canadian Mounted Police, and the United Kingdom's Scotland Yard. The survey contained three parts: Part A involved questions about whether the agency currently accepts fingerprints as a means to individualize (i.e., make an identification), and about whether the agency regards fingerprints as unique and permanent. All fifty-three recipients responded in the affirmative to both queries. Joint Supp.App. at 56. Part C inquired whether the agencies had ever found two individuals to have the same fingerprint; the response was, unanimously, no. Part C also revealed that, in the aggregate, the ten-print records of nearly 70 million individuals-or about 700 million fingerprints-have been examined during the course of the agencies' operations.

Part B of the survey was designed as a demonstration of the ACE-V identification protocol, and it used the latent fingerprints at issue in this case. Part B offered each agency photographs of the two latent prints and of Mitchell's ten-print card. Agencies were asked first to attempt to identify the ten-print card using their own computerized fingerprint database. It is common practice (for efficiency's sake) to "filter" the database in making an identification, by considering only the subset of records (by race, sex, date of birth, etc.) that are likely to result in a match. Meagher requested that agencies not filter their database for this test, to ensure that the prints were compared against the maximum possible number of print records. Of the forty-seven agencies that responded, the only match that was found was in Pennsylvania, where Mitchell's ten-print record was already on file.

In the second segment of Part B, agencies were asked to attempt to match the latent prints to their existing records. The only "hits" were made by the two agencies (Mississippi and South Dakota) that inputted the ten-print card supplied by Meagher into their system prior to running the search (and thus raised the likelihood of a match). Pennsylvania was unable to run this search because of equipment troubles, but represented that it undoubtedly would have made a match if its system were fully operative.

The third segment of Part B asked agencies to perform manual comparisons of the latent prints to the ten-print card provided to them. This survey was single-blind, i.e., while Meagher knew that the latent prints had been identified as Mitchell's, knew that the ten-print card was Mitchell's, and believed the latents could be matched to the ten-print card, none of the survey recipients was told any of this. Roughly two thirds of the agencies responded to this portion. Over three quarters of the responding agencies matched both prints consistently with the FBI's identification. Of those that did not match both prints, half matched only one print consistent with the FBI's identification, and half matched neither print. In followup communications, the FBI either convinced these non-identifying agencies that a match did exist and they so acknowledged (though it took the strong suggestion of annotated blown-up photographs of the prints), or otherwise established reasons for the non-identification (e.g., the examiner deemed the quality of the supplied photographs to be too poor to make an identification, and would have preferred an original; or the comparison was performed by an inexperienced examiner, and on review, a senior examiner was able to find a match).

A critical summary point is that no agency ever registered a "false" positive (i.e., a positive match that contradicted the FBI's result): In the first segment of Part B, no agency matched Mitchell's ten-print card to someone else's ten-print card; in the second segment, no agency matched the latent prints to anyone other than Mitchell; and in the third segment, no agency matched a latent print to any finger other than the one to which the FBI had matched the latent print.

The second experiment conducted by the government's experts was known as the "50/50" experiment. This was an empirical examination by computer of a subset of the FBI's fingerprint records to search for pairs of very similar fingerprints taken from different sources. Finding such a pair would undermine the uniqueness proposition, see supra page 223, that the government's other experts testified was well-established. The experiment data set was a set of fifty thousand prints (out of about 340 million in the FBI's AFIS computer system). Rather than select these fifty thousand prints at random, the experimenters (Agent Meagher, Mr. Zeisig, and Dr. Budowle) took them from the subset of prints that were from white males and exhibited a left-sloped loop pattern at Level 1 detail. The experimenters also ensured that multiple prints from the same person were included in the set of fifty thousand. The effect of these restrictions was to bias, from the outset, the prints toward being more similar (and hence more likely to contain a matching pair).3

In the first part of the test, a computer program-using the same algorithms as the FBI's AFIS computer system uses to match prints-attempted to match each of the fifty thousand prints against the full set of fifty thousand prints (hence the moniker "50/50"). Thus, a total of 50,000 × 50,000, or 2.5 billion, comparisons were performed. For each print, the best match was, by an enormous margin, itself.4 Based on statistical extrapolation from these results, the experimenters put the chances of a single full-rolled print matching another full-rolled print from anyone in the world other than the person who deposited the print at approximately one in ten to the eighty-sixth power (i.e., 1 chance in 1 followed by 86 zeroes), a very low probability indeed.
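The arithmetic of this first part of the test can be checked with a short sketch. The helper names and the small score matrix are invented for illustration; they do not reproduce the AFIS matching algorithms, which are not set out in the record.

```python
def count_comparisons(n_prints: int) -> int:
    """Ordered (probe, candidate) pairs when every print is scored against
    every print in the set, itself included."""
    return n_prints * n_prints


def best_match_indices(scores):
    """For each probe (row) of a square score matrix, return the index of
    the highest-scoring candidate (column)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


total = count_comparisons(50_000)
print(f"{total:,}")  # 2,500,000,000, i.e. 2.5 billion

# Hypothetical 3 x 3 score matrix: each print scores highest against itself.
toy_scores = [
    [100, 3, 5],
    [2, 100, 4],
    [6, 1, 100],
]
print(best_match_indices(toy_scores))  # [0, 1, 2]

# "One in ten to the eighty-sixth power" is 1 followed by 86 zeroes.
print(len(str(10**86)) - 1)  # 86
```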

Apparently recognizing that analysis of full-rolled prints was not particularly germane to the question of the identification of latent partial prints, the government's witnesses conducted a second experiment. From each of the fifty thousand prints, they had the computer create a simulated latent print (referred to as a "pseudolatent print" or simply a "pseudolatent"), as might be recovered from a crime scene, by taking only about a fifth of the full-rolled print.5 They then ran a similar fifty thousand-by-fifty thousand comparison to see how strongly the pseudolatent prints matched full prints from which they had not been derived. With one exception which we identify in the margin, each pseudolatent was a strong match with the full print from which it had been derived, by a wide margin over any other full print.6 Statistical computations based on this experiment put the probability of a latent partial print matching the full print of anyone in the world other than the person who deposited the print at approximately one in ten to the sixteenth power (i.e., 1 in 10,000,000,000,000,000), also a very low probability.
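The pseudolatent procedure can be pictured with a toy sketch, offered under loudly stated assumptions: a print is modeled as a set of minutiae positions on a 100 x 100 grid, a pseudolatent is made by keeping only the minutiae inside a window covering one fifth of that grid, and similarity is the count of shared positions. None of this is the government's actual software or data; every name and parameter is hypothetical.

```python
import random


def make_print(rng, n_points=100, size=100):
    """Hypothetical full print: a set of minutiae positions on a grid."""
    return {(rng.randrange(size), rng.randrange(size)) for _ in range(n_points)}


def pseudolatent(full_print, x0=0, y0=0, width=100, height=20):
    """Keep only the minutiae inside a window; the default window is
    100 x 20, one fifth of the 100 x 100 grid."""
    return {
        (x, y)
        for (x, y) in full_print
        if x0 <= x < x0 + width and y0 <= y < y0 + height
    }


def score(latent, full_print):
    """Toy similarity: number of latent minutiae also present in the full print."""
    return len(latent & full_print)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
prints = [make_print(rng) for _ in range(200)]
latents = [pseudolatent(p) for p in prints]

# Compare every pseudolatent against every full print and record whether
# the strongest match is the print it was derived from.
hits = 0
for i, latent in enumerate(latents):
    scores = [score(latent, p) for p in prints]
    best = max(range(len(prints)), key=scores.__getitem__)
    hits += best == i

print(hits, "of", len(prints), "pseudolatents matched their source best")
```

Because each pseudolatent here is an exact subset of its source, the sketch also shows why a masked print is an easy case: nothing in it is distorted or smudged.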

b. Mitchell's Experts

Mitchell's first witness at the Daubert hearing was Marilyn Peterman, an investigator with the Defender Association of Philadelphia who took statements from those fingerprint examiners at state agencies who had failed to match the latent prints to Mitchell's ten-print card in completing Part B of the FBI's survey.7 She described which agencies adhered to a point system, how many points they required to make an identification, and noted that the agencies that did not find a match generally reported that they had found an insufficient number of points of similarity between the latent print and the ten-print card. Ms. Peterman also reported on the varying levels of experience and accreditation of the examiners who performed the comparisons for the agencies.

The first of Mitchell's three major experts was Dr. David Stoney, the director of the McCrone Research Institute in Chicago, a not-for-profit organization engaged in teaching and research in the forensic sciences. Dr. Stoney was, in Mitchell's counsel's words at the Daubert hearing, offered as an expert "with respect to whether a fingerprint examiner's conclusion that a latent fingerprint came from a particular individual is a scientific determination." App. 763a. The nucleus of Dr. Stoney's opinion is summarized in a portion of his testimony at the hearing:

The determination that a fingerprint examiner ... makes when comparing a latent fingerprint with a known fingerprint, specifically the determination that there is sufficient basis for an absolute identification, is not a scientific determination.... It is a subjective determination without objective standards to it.

Now, by "subjective" I mean that it is one that is dependent on the individual's expertise, training, and the consensus of their agreement of other individuals in the field. By "not scientific" I mean that there is not an objective standard that has been tested; nor is there a subjective process that has been objectively tested. It is the essential feature of a scientific process that there be something to test, that when that something is tested, the test is capable of showing it to be false.

App. 765a. Dr. Stoney opined that the evaluation phase of the ACE-V protocol requires the examiner to make a binary determination: Either two prints match sufficiently to make an absolute identification, or they do not. This Dr. Stoney contrasted to certain other forensic disciplines in which intermediate determinations are expressed in probabilistic terms. Dr. Stoney further objected to any characterization of fingerprint identification as having a "zero error rate," explaining that "something with a zero error rate cannot be a science.... [I]f we start out saying fundamentally something can't be shown to be wrong, then it means that we can't test it. If we can't test it, ... there's no way to show that it is wrong." App. 781a.

Dr. Stoney also criticized the 50/50 experiment. He noted first the undisputed proposition that two impressions of the same friction ridges will not be identical-artifacts and distortions will invariably appear.8 In that experiment, see supra page 225 and note 4, a fingerprint was compared against itself and 49,999 other fingerprints taken from the FBI's database. Hence, Dr. Stoney explained, the simulated task modeled by the 50/50 experiment was that of matching Print 1 and (the identical) Print 1 of Finger A. In his submission, the task in real-world fingerprint identification is one of matching Print 1 and Print 2 of Finger A. Thus, Stoney reasoned, the 50/50 experiment as executed assessed how much better a match is found between Print 1 and (the identical) Print 1 of Finger A than between Print 1 of Finger A and Print 1 of Finger B. A more meaningful version of the 50/50 experiment, Dr. Stoney explained, would have asked how much better a match is found between Print 1 and Print 2 of Finger A than between Print 1 of Finger A and Print 1 of Finger B.9
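Dr. Stoney's distinction can be made concrete with a toy sketch. It assumes, purely for illustration, that a print is a set of minutiae positions, that a second impression of the same finger loses some minutiae at random, and that similarity is the count of shared positions. Real distortions also shift and smear detail, so this model is a simplification; all names and parameters are hypothetical.

```python
import random


def make_print(rng, n_points=100, size=100):
    """Hypothetical print: a set of minutiae positions on a grid."""
    return {(rng.randrange(size), rng.randrange(size)) for _ in range(n_points)}


def second_impression(rng, first, keep_prob=0.7):
    """Hypothetical distortion model: each minutia of the first impression
    survives into the second with probability keep_prob."""
    return {pt for pt in first if rng.random() < keep_prob}


def score(a, b):
    """Toy similarity: number of shared minutiae positions."""
    return len(a & b)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
a1 = make_print(rng)             # Print 1 of Finger A
a2 = second_impression(rng, a1)  # Print 2 of Finger A
b1 = make_print(rng)             # Print 1 of Finger B

# The experiment as run: Print 1 of Finger A against itself.
margin_as_run = score(a1, a1) - score(a1, b1)
# The comparison Dr. Stoney proposed: Print 1 against Print 2 of Finger A.
margin_proposed = score(a1, a2) - score(a1, b1)

print(margin_as_run, margin_proposed)
```

In this model the self-comparison margin can never be smaller than the margin between two different impressions of the same finger, which is the sense in which the experiment as executed measured the easier task.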

Dr. Stoney further criticized the method used to create the pseudolatent prints in the second part of the experiment. Dr. Stoney explained that it was established in the literature that simple masking, and even computer-generated blurring, of full prints cannot adequately simulate real latent partial prints. Dr. Stoney's ultimate conclusion was that these experimental defects rendered the probabilities derived by the government experts meaningless.

The defense's second principal expert was James Starrs, a professor in the Department of Forensic Sciences and the law school at George Washington University. Prof. Starrs has had a long career at the intersection of law and forensic science; indeed, an article by Prof. Starrs was cited by the Supreme Court in Daubert. See Daubert, 509 U.S. at 591, 113 S. Ct. 2786 (citing James E. Starrs, Frye v. United States Restructured and Revitalized: A Proposal to Amend Federal Evidence Rule 702, 26 Jurimetrics J. 249, 258 (1986)). Prof. Starrs was offered as an "exert [sic] in forensic science qualified to provide an opinion as to whether latent fingerprint examination meets the criteria of science." App. 813a-814a. Like Dr. Stoney, Prof. Starrs testified that it was his opinion that " [the current practice of] fingerprint comparison and analysis is not predicated on a sound and adequate scientific basis for purposes of making an individualization to one person from a fragmentary print to the exclusion of all other persons in the world." App. 828a.

To support his conclusion, Prof. Starrs highlighted five aspects of fingerprint examination that in his opinion were inconsistent with a scientific discipline: (1) claims to "absolute certainty"; (2) "the failure to carry out controlled empirical-data-searching experimentation"; (3) a failure to engage in error-rate analysis; (4) the lack of uniformity, objectivity, systematization, and standards; (5) "a failure to show a due regard to a vigorous and uncompromising skepticism." App. 828a-829a. In elaborating on each of these points, Prof. Starrs gave illustrations. For example, he briefly described a case of false identification; he described some of the subtle and non-systematized aspects of analyzing Galton points, see supra page 221-22, and he criticized some aspects of the training of new fingerprint examiners. Prof. Starrs also explained that he viewed the government's testimony and experiments involving full-rolled prints as irrelevant to the question of latent partial print identification. However, under cross-examination Prof. Starrs was agnostic on whether the propositions he challenged as unproven might, in the end, be scientifically supportable.

Mitchell's final expert at the Daubert hearing was Simon Cole, a post-doctoral fellow at Rutgers University, with expertise in "science and technology studies with particular expertise regarding the fingerprint profession." App. 939a. Dr. Cole had no experience in latent print examination. From his research, Dr. Cole identified four explanations for the widespread acceptance of fingerprint identification evidence: First, from the earliest days of the discipline, fingerprint examiners have developed an "occupational norm of unanimity," i.e., examiners would not publicly disagree with one another about an identification. Second, the fingerprint examination community handled known instances of misidentification, Dr. Cole explained, by blaming them on practitioner incompetence or misconduct.10 Third was a simple lack of judicial scrutiny-a sort of snowball effect of string citations to cases and treatises approving fingerprint identification evidence. Fourth was a lack of an organized counter-expert group, a notable difference, Dr. Cole explained, between fingerprint identification and, say, psychiatric diagnosis. Dr. Cole also opined that fingerprint identification was not scientific because, inter alia, the fingerprint identification community had not engaged in studies that attempt to falsify the discipline's premises; did not engage in anonymous, critical (as opposed to positive) peer review; and did not recognize error rates.

c. Mitchell's Exhibits

As part of the Daubert hearing, Mitchell also introduced several hundred pages of documentary exhibits, principally journal articles and other excerpts from the corpus of literature criticizing the practice and theory of latent fingerprint identification, authored by his experts and by others. Also introduced were the results of some fingerprint proficiency tests, which suggested that examiners were prone to both false negatives (i.e., declaring a nonidentification where an identification should have been made) and false positives (i.e., making an incorrect identification). App. 3014a, 3063a. Finally, the defense introduced a survey of jurors that found that 93% agreed with the statement "fingerprint identification is a science" and 85% agreed with the statement "fingerprints are the most reliable means of identifying a person." App. 3047a-3048a.

d. The Government's Rebuttal Witness

To respond to defense testimony regarding the "occupational norm of unanimity" among fingerprint examiners, the government offered Pat Wertheim, a fingerprint examiner, as a rebuttal witness. Wertheim testified that he and David Grieve (who was present but did not testify) were involved as defense experts in a case of false identification in the United Kingdom. Based on their examination of the evidence in that case-which was both independent of the U.K. authorities and independent of each other-they testified, in opposition to the prosecution's expert, that the latent print in that case could not be matched to the defendant. The purpose of this testimony was to counter Dr. Cole's contentions about the occupational norm of unanimity within the discipline.

Two months after the Daubert hearing concluded, the District Court ruled from the bench on the admissibility of expert testimony at trial. In relevant part, the Court stated:

The matter presently pending before the Court is in reference to the defense motion to exclude the government's fingerprint identification evidence, and based on the Daubert hearing and also Kumho, this Court denies the defendant's motion. And pursuant thereto, this court is not going to make a determination as to the particular area of scientific knowledge and technical or specialized knowledge.

* * *

Further, pursuant to this Court's ruling, this Court finds that the government's fingerprint evidence is highly probative and substantially outweighs any danger of unfair prejudice to defendant.

* * *

We find that the government's expert witness-at this juncture it appears it's Duane Johnson [sic Wilbur Johnson?], an FBI latent fingerprint examiner who testified first in the previous trial, and those other latent experts that testified in the Daubert hearing-are capable of testifying in these proceedings, and in that regard, I am not going to limit the defense from calling latent fingerprint experts to testify as to the ability not to identify or make an identification from the fingerprints, and I am also going to allow the defense to call any latent fingerprint expert who indicates that fingerprints are not reliable sources of information.

Only for that limited purpose and I am going to exclude evidence as to whether or not [latent fingerprint identification is] scientific, technical, or whatever. It has no relevance before the jury here. The question is whether or not an identification can be made by examination of fingerprints-latent fingerprints.

App. 1029a-1031a (repunctuated for clarity).

As we understand the ruling, the District Court held that the government's expert witnesses and Mitchell's expert witnesses could testify, but with the caveat that the latter could not testify to the question whether latent fingerprint identification is a "science." This ruling forms at least the baseline of two of Mitchell's issues on appeal: the admission of government experts, and the restriction of his own experts. The Court again discussed the admissibility of the defense's expert witnesses in a colloquy with counsel immediately before jury voir dire, an exchange that we will discuss in greater detail, infra Part IV.

Immediately following its ruling on the admissibility of expert testimony, the District Court addressed what would become another ground of Mitchell's appeal. Again from the bench, the Court ruled:

This Court will take judicial notice that human friction ridges are unique and permanent throughout the area of the friction ridge skin, including small friction ridge areas, and further that human friction skin arrangements are unique and permanent, and if called upon, we will instruct the jury as so.

App. 1031a (repunctuated for clarity). The Court so instructed the jury. On appeal, Mitchell asserts that it was error for the District Court to take judicial notice of these matters.

The case against Mitchell rested on eleven lay witnesses and two experts. The government's star witness was Bookie's girlfriend, Kim Chester. Ms. Chester testified that she was present when Bookie and T were planning the robbery, and that she helped Bookie watch the comings and goings of the armored car in the weeks before the robbery. Ms. Chester said that she and T first met Mitchell and his wife at Mitchell's house, where she heard Mitchell and T discussing plans for the robbery. Mitchell's wife, Anita, invoked her spousal privilege and did not testify. Eileen Lambert, T's girlfriend at the time, testified that she also witnessed meetings between T and Mitchell.

Ms. Chester testified that the night before the robbery, Mitchell, Bookie, and T discussed the need to obtain a stolen car to use in the robbery. She explained that the next morning-September 12th-Bookie, T, and Mitchell drove her to work. She described how Mitchell and Bookie were arguing about what car to use in the robbery-the car they were in was Mitchell's wife's car, and he did not want to use it in the robbery. Ms. Chester testified that they dropped her off at her work, and that when she next spoke to Bookie, he indicated that they had gone through with the planned robbery. At that time, he had a substantial amount of cash, some of which he used to purchase a car and redeem several pieces of jewelry from a pawn shop.

Alma Shaw testified about her car being stolen the morning of September 12th. Emanuel Glover and Vernon Muse, the armored car guards, and Kim Kover-Jacobs, the check cashing agency manager, testified about the robbery itself. Messrs. Glover and Muse both identified Ms. Shaw's car as the getaway car; also, a fragment of the getaway car's license plate was noted by a bystander, Regan Wiggins, and this fragment was consistent with Ms. Shaw's car's license plate.

Laura Barnett, a Philadelphia police officer, testified that she recovered Ms. Shaw's car shortly after the robbery. It was found (with a bullet hole through the trunk) a few blocks from the check cashing agency. FBI Special Agent Donald Halfpenny testified that Ms. Shaw's car had been secured by the Philadelphia police at the time he took control of it. Wilbur Johnson, an FBI fingerprint examiner whom the Court qualified as an expert, testified that in Ms. Shaw's car he found, photographed, and preserved two latent fingerprints-one from the gearshift knob on the steering column, and one from the driver's side door handle-that he later identified as matching Mitchell's ten-print card as the right and left thumbs, respectively.

Mitchell was arrested the afternoon of September 12th. Special Agent Kevin Mimm and Special Agent Daniel Murphy, both of the FBI, testified to the circumstances of the arrest. They explained how they had been conducting surveillance operations in Philadelphia as a result of a number of armored car robberies; Agent Murphy was in charge of these operations. Agent Mimm testified that while he was engaged in covert surveillance of Mitchell and tailing Mitchell's car, Mitchell began to flee; Mimm described how he chased Mitchell at high speed for several blocks, and was ultimately able to stop him.11 Mitchell was arrested, and $1400 in five and ten dollar bills was recovered from him. This currency was never identified, however, as having been part of the armored car delivery.

Agent Meagher returned to testify at trial about many of the matters brought out by the government at the Daubert hearing. He discussed the embryology of friction ridge skin, the fingerprints of identical twins, and the biological basis for the permanence of fingerprints. He described how latent prints are left and how they are processed by examiners, and the various conclusions that examiners can draw from a comparison of prints. During Meagher's testimony, the government invoked the Court's promise to take judicial notice of the uniqueness of small areas of friction ridge skin. The government also read a stipulation detailing some of the results of the survey that Meagher testified about at the Daubert hearing, and the prosecutor examined Meagher regarding the agencies that did not make a positive identification of the latent prints. Meagher then demonstrated to the jury in some detail his use of the ACE-V technique in matching the latent prints to Mitchell's ten-print card. He stated definitively that the fingerprints from the beige car matched Mitchell's ten-print card. Agent Johnson also stated definitively that he had matched the latent prints from the beige car to Mitchell's ten-print card, though he did not give an in-depth demonstration to the jury as Agent Meagher did.

2. Mitchell's Case and Cross-Examination of the Government's Experts

The entirety of Mitchell's case was the testimony of individuals at state agencies who examined or supervised the examination of the latent prints sent by Agent Meagher in the survey. Specifically, Mitchell called thirteen latent fingerprint experts from nine states, all of whom were initially unable to identify one or both of the latent prints as belonging to Mitchell.12

Mitchell also cross-examined the government's experts, Agents Johnson and Meagher. Cross-examination of Johnson concentrated on questions about his presentation to the jury of the fingerprints he matched-Johnson's demonstrative exhibits identified only nine points of Level 2 similarity between the latent prints from the car and Mitchell's ten-print card, despite Johnson's and Meagher's claims of a greater number of similarities. Through cross-examining Agent Johnson, Mitchell also probed the existence and maintenance of minimum-point standards and other quality-control measures at the FBI in particular, and in the discipline more generally. Cross-examination of Agent Meagher ranged into more general considerations, most notably the limited studies performed specifically to establish an error rate for fingerprint identification, and the limited means for detecting errors in particular examinations. Meagher was also cross-examined on his highly suggestive follow-up communications to those state agencies that did not match Mitchell's prints in the survey.

D. Withholding of the NIJ Solicitation and Mitchell's Post-Trial Motion

On February 7, 2000, the jury returned a verdict of guilty on all counts. Mitchell's May 15, 2000 motion for a new trial pursuant to Fed. R. Crim. P. 33 was founded on the discovery of a research proposal solicitation released by the National Institute of Justice (an arm of the United States Department of Justice) entitled Forensic Friction Ridge (Fingerprint) Examination Validation Studies (the "solicitation"). The solicitation sought proposals for research studies on "validation of the basis for friction ridge individualization and standardization of comparison criteria." App. 3078a. Creation of the solicitation had been underway before Mitchell's trial, but the solicitation was not released until March 2000-after Mitchell's trial had concluded.

The District Court held a four-day hearing to take testimony and receive exhibits on the creation and import of the solicitation. At that hearing, Mitchell established that Agent Meagher (as well as some of the government's other witnesses at the Daubert hearing) had been involved in drafting the solicitation. Prof. Starrs testified that he regarded the solicitation as "a bolt out of the blue" that suggested to him "that the sponsors of the solicitation ... admitted ... [to] serious shortcomings in fingerprinting as it has been done up to this time." App. 2325a.

Moreover, Mitchell suggested that even the government regarded the solicitation as material. His most damaging evidence came from Dr. Richard Rau of the NIJ, who coordinated the drafting of the solicitation. Rau testified to conversations at a September 1999 meeting among himself, Donald Kerr (the Assistant Director of the FBI in charge of the FBI crime laboratory), David Boyd (the Deputy Director of the NIJ), and others. Rau claimed that at that meeting Kerr and Boyd agreed to withhold release of the solicitation until the end of Mitchell's trial. In response to Dr. Rau's testimony, the government called Kerr, Boyd, and the other individuals at the meeting to testify that Dr. Rau's account of the delay in releasing the solicitation was incorrect and that the delay was caused by budgetary issues.

The District Court denied Mitchell's motion, reasoning that the solicitation was not material for two independently sufficient reasons: First, the solicitation would not have been admissible at trial because attacks on the reliability of latent fingerprint identification were not permitted at trial based on the Court's Daubert ruling; and second, the solicitation was "not meant to set forth the state of the current research" and so its "claimed impeachment value ... either during the trial or for Daubert purposes is questionable at best." App. 12a-13a. On appeal, the government disclaims the first ground, but defends the District Court's ruling on the second ground, as well as on alternative grounds not reached by the District Court.

The District Court had jurisdiction over this case under 18 U.S.C. § 3231. Mitchell filed a timely appeal from the final judgment of conviction and sentence, and we have jurisdiction under 28 U.S.C. § 1291.

On appeal, Mitchell asserts that the District Court committed five errors. First, he challenges the District Court's ruling following the Daubert hearing that admitted the prosecution's expert testimony on fingerprint identification. Second, Mitchell claims that the District Court erred in precluding his experts from testifying at trial that fingerprint identification is not a science, and is otherwise unreliable. Third, Mitchell finds error in the District Court's decision to take judicial notice of the uniqueness of small areas of friction ridge skin. Fourth, Mitchell contends that the government's withholding of the NIJ solicitation, which could have been used as impeachment evidence, merited a new trial under Fed. R. Crim. P. 33, or that this nondisclosure violated the government's obligation under Brady v. Maryland, 373 U.S. 83, 83 S. Ct. 1194, 10 L. Ed. 2d 215 (1963). Fifth, Mitchell asserts that the District Court improperly admitted hearsay in the testimony of the government's principal lay witness, Ms. Chester. We will address each of these contentions in turn.

III. Admissibility of the Government's Expert Testimony

The parties disagree about the standard of review we should apply in evaluating the District Court's decision to admit the government's expert testimony. It is well-settled that, as a general matter, we review a district court's decision to admit expert testimony for abuse of discretion. See In re TMI Litig., 193 F.3d 613, 666 (3d Cir. 1999). We exercise plenary review, however, over a district court's legal interpretation of Fed.R.Evid. 702, under which the evidence in question was admitted. See id. On this much the parties agree.

Disagreement arises about the standard of review where, as here, the District Court made no findings of fact to support its admission of the testimony; indeed, after the lengthy Daubert hearing, the District Court elected not to make findings of fact or conclusions of law (written or oral), and simply ruled from the bench. This absence of factual findings, Mitchell contends, requires plenary review. We reject the rule that Mitchell urges for four reasons. First, Mitchell has provided no precedent for such a heightened standard of review over a field historically committed to the sound discretion of district courts.13 Second, the exception that Mitchell proposes would swallow the rule that district courts' evidentiary rulings are generally reviewed only for abuse of discretion. The vast majority of evidentiary rulings are made on-the-fly and without written findings of fact, yet this Court routinely affords deference to such judgments. Third, Mitchell's argument misconceives the rationale for using a deferential standard of review. Deferential review is employed not because the court being reviewed labored to produce a long opinion-there are lengthy but incorrect opinions just as there are brief but sagacious ones. Rather, deferential review is used when the matter under review was decided by someone who is thought to have a better vantage point than we on the Court of Appeals to assess the matter. See Ruggero J. Aldisert, The Judicial Process 728-29 (2d ed.1996) (quoting Maurice Rosenberg, Judicial Discretion of the Trial Court, Viewed from Above, 22 Syracuse L.Rev. 635, 663 (1971) ("[P]robably the most pointed and helpful [reason] for bestowing discretion on the trial judge is [that].... he sees more and senses more [than the Court of Appeals].")). This case is a good example: The District Court assessed extensive live testimony, while we work from a cold record. Fourth, the Supreme Court has in other contexts rejected heightened appellate review of district court rulings on expert testimony. See Gen. Elec. Co. v. Joiner, 522 U.S. 136, 118 S. Ct. 512, 139 L. Ed. 2d 508 (1997).

Thus we reject Mitchell's proposed standard of review, and adhere to the usual precepts of abuse-of-discretion review over the District Court's decision to admit the government's expert testimony.

At the time of Mitchell's trial, Fed.R.Evid. 702 provided:

If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise.

Daubert identified the twin concerns of "reliability" (also described as "good grounds") and "helpfulness" (also described as "fit" or "relevance") as the "requirements embodied in Rule 702."15 Daubert, 509 U.S. at 589-92, 113 S. Ct. 2786. Daubert was "limited to the scientific context because that [wa]s the nature of the expertise offered [t]here," id. at 590 n. 8, 113 S. Ct. 2786, but Kumho Tire extended Daubert's "general principles" to all of "the expert matters described in Rule 702." Kumho Tire, 526 U.S. at 149, 119 S. Ct. 1167. Thus "technical knowledge," under which heading the discipline of latent fingerprint examination and identification seems to fall, is generally subject to the same considerations as "scientific" expertise.

The "general principles" adverted to in Kumho Tire comprised not only the fundamental concerns of reliability and helpfulness, but also a method for assessing reliability. The Daubert Court articulated "general observations" to this end by offering a nonexclusive list of five factors that a district court might consider in deciding whether to admit evidence under Rule 702. The Advisory Committee summarized these factors:

The specific factors explicated by the Daubert Court are (1) whether the expert's technique or theory can be or has been tested-that is, whether the expert's theory can be challenged in some objective sense, or whether it is instead simply a subjective, conclusory approach that cannot reasonably be assessed for reliability; (2) whether the technique or theory has been subject to peer review and publication; (3) the known or potential rate of error of the technique or theory when applied; (4) the existence and maintenance of standards and controls; and (5) whether the technique or theory has been generally accepted in the scientific community.

Fed.R.Evid. 702 advisory committee's note.

Citing Kumho Tire, the Advisory Committee noted that "[o]ther factors may also be relevant," id., and indeed, courts have augmented this list. In Paoli II we drew on Daubert and our earlier decision in United States v. Downing, 753 F.2d 1224 (3d Cir. 1985), to lay out an expanded list of factors:

(1) whether a method consists of a testable hypothesis; (2) whether the method has been subject to peer review; (3) the known or potential rate of error; (4) the existence and maintenance of standards controlling the technique's operation; (5) whether the method is generally accepted; (6) the relationship of the technique to methods which have been established to be reliable; (7) the qualifications of the expert witness testifying based on the methodology; and (8) the non-judicial uses to which the method has been put.

Paoli II, 35 F.3d at 742 n. 8.

These factors address only reliability, and not "helpfulness" or "fit." But the fit inquiry in the case of fingerprint identification is not a significant factor, because identity evidence is the archetypal relevant evidence in criminal cases. Thus, the analysis that follows only addresses the reliability prong of Daubert.

We first consider whether the premises on which fingerprint identification relies are testable-or, better yet, actually tested. "Testability" has also been described as "falsifiability." See, e.g., Daubert, 509 U.S. at 593, 113 S. Ct. 2786 (citing Karl R. Popper, Conjectures and Refutations: The Growth of Scientific Knowledge 37 (5th ed.1989)). A proposition is "falsifiable" if it is "capable of being proved false; defeasible." Webster's Third New International Dictionary 820 (unabridged ed.1966). Proving a statement false typically requires demonstrating a counterexample empirically-for instance, the hypothesis "all crows are black" is falsifiable (because an albino crow could be found tomorrow), but a clairvoyant's statement that he receives messages from dead relatives is not (because there is no way for the departed to deny this).

In this case, the relevant premises were posed as explicit questions to many of the government experts: (1) Are human friction ridge arrangements unique and permanent? and (2) Can a positive identification be made from fingerprints containing sufficient quantity and quality of detail? The government's experts responded in the affirmative. We must consider not whether we agree as a factual matter with their responses, see Paoli II, 35 F.3d at 744, but rather whether these hypotheses are testable (or tested). We conclude that they are.

Consider the first premise (which is really two hypotheses in one)-that human friction ridge arrangements are unique and permanent. The uniqueness proposition is testable because it would immediately be shown false upon the production of identical friction ridge arrangements taken from different fingers (either from different fingers on the same person, or from two different people). The uniqueness proposition has also been tested in several ways: First, the full-print matching portion of the FBI's 50/50 experiment tested it and found no true matches.16 Second, studies on identical twins (testified about by Agent German) showed unique fingerprints. While this is a small sample, there are independent and solid genetic grounds for believing that if identical friction ridge arrangements are to be found, they are most likely to be found in identical twins. Third, in the course of routine fingerprint examination, there are certainly opportunities to encounter identical fingerprints; as several witnesses testified, such a discovery would be very notable and word would spread quickly throughout the fingerprint examiner community. Yet no reports of non-unique friction ridge arrangements were introduced, and, indeed, the FBI survey sent to state agencies revealed that none had ever encountered two different persons with the same fingerprint. Joint Supp.App. at 55.

The permanence component of the first hypothesis is also easily testable-simply take fingerprints from an individual at one time and compare them to the prints taken at another time. The Daubert hearing did not provide much evidence of actual testing of this hypothesis, however.

We turn next to the testability of the second hypothesis-that positive identification can be made from fingerprints containing sufficient quantity and quality of detail. Much of the debate in this case is masked by the word "sufficient." For example, a sufficiency standard of "100 points of matching Level 2 detail in an undistorted fingerprint lifted from a clean, smooth surface" would surely attract less objection than a sufficiency standard of "four points of matching Level 2 detail and passable quality." The actual standard employed by any given FBI examiner falls somewhere between these extremes, yet the FBI's reliance on an unspecified, subjective, sliding-scale mix of "quantity and quality of detail" makes meaningful testing elusive, for it is difficult to design an experiment to test a hypothesis with unspecified parameters. Two things rescue fingerprint identification from this apparent failure of testability: First, the examiner can testify to how much detail (quantitative and qualitative) was necessary for the particular identification at issue; and second, any testing directed toward falsifying the premise that a greater or equal amount of detail is sufficient to make an identification will serve as an attempt (albeit an imperfect one) to falsify the adequacy of the identification standard actually used.17

Just how much testing has been done to this end is unclear from the testimony at the Daubert hearing. On the one hand, it might be that examiners compare a latent print to a series of full-rolled prints until a match is found, and then terminate the process. If this protocol is used for routine examinations, those examinations will not tend to turn up multiple matches, because the examiner stops work after finding one match. In essence, the examiner has assumed the conclusion-that no other prints will match the latent, and therefore no further search is required. On the other hand, testimony at the Daubert hearing about the AFIS computer system suggests that the system tests a given latent print against its entire database (or a selected subset) of full-rolled prints, and returns a set of the best candidate matches. This protocol would tend to expose multiple full-rolled prints that match a given latent. Consequently, a lack of multiple matches from AFIS searches can constitute testing of the hypothesis that single positive identifications can be made from latent fingerprints. Whatever the case, no state agency claimed in response to the FBI survey that it had found a latent fingerprint that was "identified with two different fingers of the same person or even different persons." Joint Supp.App. at 55. This is perhaps the strongest support for the government on this point.

Modest support also comes from the second part of the government's 50/50 experiment, which matched simulated latent prints (pseudolatents) against the 50,000 full-rolled prints in the sample under examination. Setting aside spurious results due to mistakes in the FBI's database, the experiment found that each pseudolatent strongly matched one and only one full-rolled print. In other words, the experiment did not reveal any counterexample to the hypothesis that identifications can be made. Moreover, statistical computations extrapolating this to a much larger population of prints suggested that such duplicate matches would still be highly improbable.

Mitchell's experts, however, attacked the design of the 50/50 experiment, most effectively on the ground that pseudolatents are poor approximations of real latent prints.18 This lack of correspondence undermines the utility of the experiment because the issue for Daubert purposes is the testing of the hypothesis that positive identification can be made from actual latent fingerprints containing sufficient detail. As we recount above, see supra page 226-27, Mitchell's experts (particularly Dr. Stoney) convincingly explained why the process used by the government experts to generate the pseudolatents for the 50/50 experiment renders them poor substitutes for actual latent prints. In brief, the failing flagged by Dr. Stoney is that actual prints are subject to distortions and artifacts that were not simulated by the pseudolatent generator. Arguably, the pseudolatents resembled actual latents only in that the former were similar in areal size to the latter. Dr. Stoney's contention rings true: Distorted, real-world latent prints should tend to be harder to match to full-rolled prints than should computer-generated simulated latents. Since the 50/50 experiment did not adequately model real-world conditions, we cannot say that it significantly supports the government's position.

In sum, if directed, specific actual testing were the requirement of Daubert, we might be hesitant to find this factor weighing in favor of the government. There is some force to Budowle's point that "[n]o one would say any one test or any kind of thing [that] has been done in one hundred years proves uniqueness." App. 1013a. But his further point about a long history of implicit testing is equally forceful: "It's the culmination of all of the experiences that [demonstrate uniqueness]." App. 1013a. Moreover, testability-which assures the opponent of proffered evidence the possibility of meaningful cross-examination (should he or someone else undertake the testing)-is one of the factors announced by the Daubert Court as an indicium of reliability. Accordingly, the hypotheses that undergird the discipline of fingerprint identification are testable, if only to a lesser extent actually tested by experience, and so we find this factor to weigh in favor of admitting the evidence.

The evidence at the Daubert hearing on peer review was not particularly extensive. Much of the testimony centered around the question whether the "verification" step in the ACE-V protocol-where a second examiner confirms the identification made by the first examiner-constitutes effective peer review. On the one hand, this could be viewed as stringent peer review, equivalent to the best sort used in, for example, the physical sciences, where peer review most often consists of anonymously reviewing a given experimenter's methods, data, and conclusions on paper. Sometimes the review takes the form of reproducing in full the results under review-that is, a second investigator repeats the entire course of experiments. Thus the verification step of ACE-V seems usually to be akin to this heightened form of peer review: The government's experts testified that verification often amounts to repeating the whole identification process de novo, though sometimes the verifying examiner will merely confirm the match found by the initial examiner. See App. 161a. Moreover, in this particular case, the survey of state law enforcement agencies constitutes verification many times over of the match of Mitchell's fingerprints.

Mitchell's experts, however, (Dr. Cole in particular) cast some doubt on the purity of the verification step. Backed by his research, Dr. Cole suggested that fingerprint examiners have developed an "occupational norm of unanimity" that strongly discourages the verifying examiner from challenging the identification made by the initial examiner. Moreover, Dr. Cole criticized peer review of latent fingerprint identification conclusions for not being anonymous. We also acknowledge that the cultural mystique attached to fingerprint identification may infect the peer review process. But the government's experts countered that they were aware of cases where the results of the verification step caused the initial examiner to withdraw his initial identification. Looking at the entire picture, the ACE-V verification step may not be peer review in its best form, but, on balance, the peer review factor does favor admission.

The peer review factor also encompasses publication, as the dissemination of a work tends to subject it to scrutiny in the same way that prepublication peer review does. See Daubert, 509 U.S. at 593-94, 113 S. Ct. 2786. On the one hand, a significant fraction of the publications in the field are articles on technique (for example, the best practices for preserving latent prints), and such materials say little about the field's reliability. On the other hand, there are articles, introduced both by the government and by Mitchell, that address more theoretical/foundational questions, such as an appropriate minimum point standard, the likelihood of two persons having identical friction ridge arrangements, and so on. Thus the publication facet of peer review is not a strong factor, and neither reinforces nor detracts from our conclusion that the peer review factor favors admission.

The parties have waged a considerable battle of experts over whether a known error rate exists for latent fingerprint identification. Assuming that such a rate has been soundly established, it is surely a low rate of error. But the existence of any error rate at all seems strongly disputed by some latent fingerprint examiners.

The question whether an error rate can be established on the existing data is subtler than the parties seem to acknowledge. Preliminarily, we must distinguish between two error rates: false positives and false negatives. In this context, false positives are incorrect affirmative identifications, and false negatives are incorrect findings of dissimilarity. A fair amount of the government's evidence-and also much of Mitchell's response-centers on the existence vel non of failed identifications. For example, the government stresses the large number of state agencies that confirmed its identifications, and Mitchell counters by pointing to the agencies that failed to identify the prints. But these observations go to the rate of false negatives: While a system of identification with a high false negative rate may be unsatisfactory as a matter of law enforcement policy, in the courtroom the rate of false negatives is immaterial to the Daubert admissibility of latent fingerprint identification offered to prove positive identification because it is not probative of the reliability of the testimony for the purpose for which it is offered (i.e., for its ability to effect a positive identification).19

Thus we must focus on evidence that is probative of the rate of false positives. Perhaps the government's most powerful evidence is the fact that, in the course of the FBI survey of state agencies, no jurisdiction ever matched the latent prints from the gearshift knob and door handle to anyone other than Mitchell himself-despite searches run against (in the aggregate) nearly 70 million ten-print records. Assuming that every record had 10 fingerprints, and that the latents actually were left by Mitchell, the test of the two latent prints against these records implies something on the order of 1.4 billion comparisons resulting in no false positives. The government can also draw support from the very limited number of reports of false positive identifications throughout the many decades that the technique has been in use. Furthermore, the government's 50/50 experiment using pseudolatents, representing 2.5 billion comparisons, also did not register any false positives, though as we have noted, see supra page 237, it had flaws.
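The order-of-magnitude figure for the state agency survey can be checked with a short calculation. What follows is only an illustrative sketch: the function and variable names are hypothetical, the record count is rounded up to 70 million, and every record is assumed to carry exactly ten prints, as the text itself assumes.

```python
# Back-of-the-envelope check of the comparison count for the state agency survey.
# Assumptions (all illustrative): the record count is rounded to 70 million,
# every ten-print record holds exactly 10 prints, and each of the 2 latent
# prints is compared against every print on file.
TEN_PRINT_RECORDS = 70_000_000
PRINTS_PER_RECORD = 10
LATENT_PRINTS = 2


def implied_comparisons(records: int, prints_per_record: int, latents: int) -> int:
    """Return the number of latent-to-print comparisons implied by the inputs."""
    return records * prints_per_record * latents


total = implied_comparisons(TEN_PRINT_RECORDS, PRINTS_PER_RECORD, LATENT_PRINTS)
print(f"{total:,} comparisons")  # 1,400,000,000 comparisons
```

Because the aggregate record count is described as nearly 70 million, the exact product would be somewhat smaller than this rounded result. The calculation counts comparisons only; it does not by itself yield an error rate.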

Mitchell counters this evidence in two different ways, but neither of them fully refutes the government's evidence. First, he raises a legal challenge, claiming that the burden of proof under Fed.R.Evid. 104(a) is up-ended by effectively requiring him to come forward with examples of false positives. While Mitchell is correct that Rule 104(a) places the burden of proof on the proponent of the evidence (here, the government), see Bourjaily v. United States, 483 U.S. 171, 175, 107 S. Ct. 2775, 97 L. Ed. 2d 144 (1987), this does not mean that the burden is static, at least in terms of a burden of going forward. Particularly in a case like this, where what is sought to be proved is essentially a negative (i.e., the absence of false positives), it seems quite appropriate to us to use a burden-shifting framework. Such a framework was applied here: The government's experts-qualified as knowledgeable in matters pertaining to fingerprint identification-testified to their being unaware of significant false positive identifications. At that point, it becomes quite reasonable to shift the burden to the opponent of the evidence (here, Mitchell) to counter this claim with affirmative examples.

Mitchell's second attack on the government's evidence of error rates is factual. He presented evidence that fingerprint examiners sometimes make false positive identifications on proficiency examinations. This evidence is troubling, but we view it as evidence relating only to the competency of those practitioners, leaving undisturbed the government's evidence about the near-absence of false positive identifications.20

We therefore accept that the error rate has been sufficiently identified to count this factor as strongly favoring admission of the evidence. The error rate has not been precisely quantified, but the various methods of estimating the error rate all suggest that it is very low. This follows from three pieces of evidence we identify above as favoring the government: (1) the absence of significant numbers of false positives in practice (despite the enormous incentive to discover them), (2) the absence of false positives in the FBI's state agency survey, and (3) the statistical computations based on the 50/50 experiment.

Closely related to the question of error rate is the maintenance of standards to guide the application of the method. This is lacking here in some measure. The FBI maintains that its flexibility to consider a mixture of Level 2 and Level 3 detail in making identifications renders its method superior to and more flexible than the minimum-points standards used in some states and various foreign jurisdictions. The tradeoff, though, is that the FBI's method lacks a significant yardstick of standard-based objectivity. In contrast, with a minimum-point standard there is at least some agreement about what constitutes a Galton point and what does not.

Some standards do remain: There are procedural standards (such as ACE-V) and terminological standards (such as the naming conventions for Galton points). But these are insubstantial in comparison to the elaborate and exhaustively refined standards found in many scientific and technical disciplines. As such, we find that this factor does not favor admitting the evidence.

Prior to the adoption of the Federal Rules of Evidence, admission of expert testimony was governed by the Frye test, which required that the evidence must have gained "general acceptance in the particular field in which it belongs." Frye v. United States, 293 F. 1013, 1014 (D.C. Cir. 1923). Daubert held that Congress's adoption of Rule 702 legislatively overruled Frye, see 509 U.S. at 588-89, but at the same time acknowledged that "`general acceptance' can yet have a bearing on the inquiry," id. at 594, 113 S. Ct. 2786. Thus we consider as one factor in the Daubert analysis whether fingerprint identification is generally accepted within the forensic identification community. The answer is yes, as demonstrated by the results of the FBI's survey of state agencies. See App. 383a. Mitchell's only argument with respect to this factor is that there is no scientific community that generally accepts fingerprint identification. But the scientific/nonscientific distinction is irrelevant after Kumho Tire, and accordingly we reject the argument. We also note that the Court of Appeals for the Fourth Circuit, in addressing the same question that we are considering here, relied heavily on general acceptance to support the admission of fingerprint identification evidence. See United States v. Crisp, 324 F.3d 261 (4th Cir. 2003). We likewise conclude that this factor weighs in favor of admitting the evidence.

6. Relationship to Established Reliable Techniques

Although the parties have not provided us with extensive analysis of the relationship of the principles and practice of latent fingerprint identification to "`more established modes of ... analysis,'" Paoli II, 35 F.3d at 742 (quoting Downing, 753 F.2d at 1238-39), it seems to us that this is the best heading under which to consider the government's evidence from the fields of developmental embryology and anatomy. The testimony and documentary materials introduced on these topics during the Daubert hearing-especially through Dr. Babler-tended to establish biological bases for the uniqueness and permanence of areas of friction ridge skin. Since no question was raised about the soundness and reliability of the work in these specialties, we are comfortable that the reliability of these fields is well-established. Independent work in these fields bolsters the underlying premises of fingerprint identification, and so we find that this factor lends additional support to admitting the latent fingerprint identification evidence.

7. Degree to Which the Expert Testifying Is Qualified

As we have noted before, there were essentially no challenges to the qualifications of the government's experts (or of Mitchell's experts, for that matter), but the binary question whether an expert is or is not qualified to testify to a particular subject is analytically distinct, under Rule 702, from the more finely textured question whether a given expert's qualifications enhance the reliability of his testimony. See Schneider ex rel. Estate of Schneider v. Fried, 320 F.3d 396, 407 (3d Cir. 2003) ("[The defendant's] argument appears to challenge the qualification of [the plaintiff's expert]; although we note that `the degree to which the expert testifying is qualified' also implicates the reliability of the testimony." (quoting Paoli II, 35 F.3d at 742)).

The qualifications of Agents Meagher and Johnson matter the most, because they were the government's experts at trial. Both had estimable qualifications. The putative blemish on their qualifications, which we hint at above, see supra note 20, is that neither testified extensively about his own known error rate as a practitioner (as might be revealed, for example, by proficiency tests they had taken). While this is by no means fatal to the admissibility of the testimony, prosecutors would be well-advised to elicit testimony about their experts' personal proficiency, rather than relying on the discipline's good general reputation among lay jurors. Failing that, we are confident that defense counsel will use cross-examination to expose incompetent fingerprint examiners. In this case, Agent Meagher's uniquely strong qualifications and the confirmatory identifications from state agencies are a surrogate for testimony about Agent Meagher's and Agent Johnson's personal proficiency as examiners.21 Thus this factor supports admitting the government's evidence.

We have recognized that evidence of the non-judicial uses of the technique in question is relevant to the Daubert reliability inquiry. See Paoli II, 35 F.3d at 742. This is because non-judicial use of a technique can imply that third parties-i.e., persons other than the proponent of the expert testimony, for whom the testimony is typically self-serving-would vouch for the reliability of the expert's methods.22 The government offered some evidence of the non-judicial uses of fingerprint identification, particularly through Dr. Budowle. App. 639a-641a. In analyzing this factor, the government relies on three categories of non-judicial uses of fingerprints: (1) the identification of arrested persons (e.g., checking an arrestee's record at the time of booking); (2) biometric identification as a security measure (e.g., authenticated access to a computer system) or for regulatory purposes (e.g., fingerprinting for driver licensing as an anticounterfeiting measure); and (3) identification of partial remains following disasters. While at first blush this seems like a factor strongly supporting admissibility, the bloom recedes upon close analysis.

Latent fingerprint identification works from fingerprints that are partial and subject to distortions. All the nonjudicial uses listed above either use full-rolled prints, or avoid the difficulties introduced by distortion-or both. Both differences are critical, as Mitchell's experts testified and as the government's experts acknowledged: It is significantly easier to match one clean full-rolled print to another than it is to match a somewhat distorted latent fragment to a full-rolled print.23 Thus, in the case of identification of arrestees, the booking officer will take a ten-print card with a full set of full-rolled prints, and if the prints do not come out cleanly, the officer has the opportunity to take a second set of impressions. Likewise, the security and regulatory uses of fingerprinting generally rely on clean, full-rolled prints.24 As for disaster-victim identification, the government's experts did testify that fragments of friction ridge skin have been used to make identifications, but even those identifications still differ from latent fingerprint identification because identification using actual skin eliminates the challenges introduced by distortions.25 Thus there is less here than meets the eye, and while this factor supports admitting the government's evidence, it does so only weakly.

Although it is clear from the foregoing analysis of the Daubert factors that the government's fingerprint evidence passes muster, Mitchell contends that the government's inability to establish that its evidence is correct, and its failure to show that its evidence meets the standards required of "science," mean that the government's evidence must be excluded. Mitchell is wrong. This is established by Daubert itself, which requires no more than that the Court satisfy itself that "good grounds" exist for the expert's opinion. See 509 U.S. at 590, 113 S. Ct. 2786.

Judge Selya has put it well:

Daubert does not require that a party who proffers expert testimony carry the burden of proving to the judge that the expert's assessment of the situation is correct. As long as an expert's scientific testimony rests upon "good grounds, based on what is known," it should be tested by the adversary process-competing expert testimony and active cross-examination-rather than excluded from jurors' scrutiny for fear that they will not grasp its complexities or satisfactorily weigh its inadequacies. In short, Daubert neither requires nor empowers trial courts to determine which of several competing scientific theories has the best provenance. It demands only that the proponent of the evidence show that the expert's conclusion has been arrived at in a scientifically sound and methodologically reliable fashion.

To the extent that Mitchell's attack rests on his experts' claim that latent fingerprint examiners do not engage in "science," he does not heed the text of Rule 702 or the Supreme Court's teachings in Kumho Tire. Rule 702 "makes no relevant distinction between `scientific' knowledge and `technical' or `other specialized' knowledge." Kumho Tire, 526 U.S. at 147, 119 S. Ct. 1167. The very holding of Kumho Tire is that those categories simply address what type of testimony is covered by the rule, and that, in addressing admissibility under Rule 702, the same factors generally apply to all categories of expert testimony. Kumho Tire explicitly rejected as unworkable and unnecessary any "distinction between `scientific' knowledge and `technical' or `other specialized' knowledge." Id. at 148, 119 S. Ct. 1167. That a particular discipline is or is not "scientific" tells a court little about whether conclusions from that discipline are admissible under Rule 702; at best, there will be some overlap between the factors that bear on a field's status as "science" and Daubert's factors addressed to reliability. Reliability remains the polestar.

Mitchell seeks a significantly higher threshold of admissibility under Rule 702, and, consequently, a very different allocation of responsibility between judge and jury. Yet Rule 702 and Daubert put their faith in an adversary system designed to expose flawed expertise. Mitchell misconceives this balance struck by the framers of Rule 702 and the Daubert Court. As the Advisory Committee explained in the context of the December 1, 2000 amendment to Rule 702, "Daubert did not work a `seachange over federal evidence law,' and `the trial court's role as gatekeeper is not intended to serve as a replacement for the adversary system.'" Fed.R.Evid. 702 advisory committee's note (quoting United States v. 14.38 Acres of Land Situated in Leflore County, Miss., 80 F.3d 1074, 1078 (5th Cir. 1996)). Daubert itself emphasized the point: "Vigorous cross-examination, presentation of contrary evidence, and careful instruction on the burden of proof are the traditional and appropriate means of attacking shaky but admissible evidence." 509 U.S. at 596, 113 S. Ct. 2786. These trial practices and procedural devices like the directed verdict, "rather than wholesale exclusion under an uncompromising ... test, are the appropriate safeguards where the basis of scientific testimony meets the standards of Rule 702." Id. We echoed this in Paoli II, where we noted "Rule 702 mandates a policy of liberal admissibility." 35 F.3d at 741.

In this context, the court is often referred to as a "gatekeeper." This metaphor is particularly apt because it works two ways: On the one hand, the court must exclude some evidence as a gatekeeper, by "preventing opinion testimony that does not meet the requirements of qualification, reliability and fit from reaching the jury," Schneider, 320 F.3d at 404. But on the other hand, the court is only a gatekeeper, and a gatekeeper alone does not protect the castle; as we have explained, "[a] party confronted with an adverse expert witness who has sufficient, though perhaps not overwhelming, facts and assumptions as the basis for his opinion can highlight those weaknesses through effective cross-examination." Stecyk v. Bell Helicopter Textron, Inc., 295 F.3d 408, 414 (3d Cir. 2002).

Indeed, as our discussion of the various Daubert factors suggests, many of them are guarantees that cross-examination and adversary testing will be possible: Testability ensures the basic possibility of meaningful cross-examination. Peer review and publication also provide raw material for the cross-examining attorney to confront the expert with. The existence of a known error rate may force an expert to admit to the limitations of his or her methods. The maintenance of standards provides an objective benchmark to confirm that the expert did indeed follow her method. And so on. Since these factors were well-satisfied in this case, it was with confidence that the baton was passed from the Court to the adversary system.

The principle that cross-examination and counter-experts play a central role in the Rule 702 regime has three important applications to this case. First is the core holding of United States v. Velasquez, 64 F.3d 844, 848-49 (3d Cir. 1995): Experts with diametrically opposed opinions may nonetheless both have good grounds for their views, and a district court may not make winners and losers through its choice of which side's experts to admit, when all experts are qualified. Rather, the same standards of reliability and helpfulness should be applied to both sides, with a "`preference for admitting any evidence having some potential for assisting the trier of fact.'" Id. at 849 (quoting DeLuca v. Merrell Dow Pharm., Inc., 911 F.2d 941, 956 (3d Cir. 1990)). We return to this in the next section, where we discuss the District Court's handling of Mitchell's experts.

Second, district courts will generally act within their discretion in excluding testimony of recalcitrant expert witnesses-those who will not discuss on cross-examination things like error rates or the relative subjectivity or objectivity of their methods. Testimony at the Daubert hearing indicated that some latent fingerprint examiners insist that there is no error rate associated with their activities or that the examination process is irreducibly subjective. This would be out-of-place under Rule 702. But we do not detect this sort of stonewalling on the record before us.

Third, this case does not announce a categorical rule that latent fingerprint identification evidence is admissible in this Circuit, though we trust that the foregoing discussion provides strong guidance. And as we explained in Velasquez, both Rule 702 and the Sixth Amendment's Confrontation Clause permit any criminal defendant to put the prosecution to its proof at trial. None of this, however, should be read to require extensive Daubert hearings in every case involving latent fingerprint evidence. The Supreme Court has emphasized that district courts "have the same kind of latitude in deciding how to test an expert's reliability" as they do in deciding "whether or not that expert's relevant testimony is reliable." Kumho Tire, 526 U.S. at 152, 119 S. Ct. 1167. Thus a district court would not abuse its discretion by limiting, in a proper case, the scope of a Daubert hearing to novel challenges to the admissibility of latent fingerprint identification evidence, or even by dispensing with the hearing altogether if no novel challenge was raised.

E. Conclusion on the Admissibility of the Government's Evidence

We conclude, on the record before us read in light of the basic Daubert principles, that most factors support (or at least do not disfavor) admitting the government's latent fingerprint identification evidence. There are good grounds for its admission. We therefore conclude that the District Court did not abuse its discretion in holding the government's evidence admissible.

IV. Admissibility of Mitchell's Expert Testimony

Mitchell asserts that he was not permitted to put on all of his experts at trial, and hence was not able to effectively counter or undermine the government's fingerprint identification evidence. Specifically, Mitchell contends that his three principal experts at the Daubert hearing-Dr. Stoney, Prof. Starrs, and Dr. Cole-were, as a practical matter, excluded from the trial by the District Court's rulings limiting the scope of their testimony. Mitchell argues that our holding in United States v. Velasquez, 64 F.3d 844 (3d Cir. 1995), requires that he be able to present qualified expert testimony before the jury to challenge the government's expert testimony. The government does not dispute this as a legal matter; instead it takes issue with Mitchell's premise, arguing that the District Court did not in fact exclude Mitchell's witnesses. The foregoing discussion about the central role of adversary testing in expert testimony has direct application.

If Mitchell were correct that his experts-who were undoubtedly qualified to offer their expert opinions-were precluded from testifying in opposition to the government's experts, our holding in Velasquez would obligate us to vacate Mitchell's conviction and remand for a new trial at which their testimony would be heard. But our review of the record does not disclose that Mitchell's experts were excluded or the scope of their testimony improperly limited. To the extent that the record is even ambiguous, the onus was on Mitchell's counsel to make a clear record, especially given the multiple, nuanced categories of testimony being discussed in the colloquies with the District Court on this matter.

As in the previous section, we review the District Court's decision to admit or exclude expert testimony for abuse of discretion, see In re TMI Litig., 193 F.3d at 666, but also note that an error of law-such as a failure to follow Velasquez-is an abuse of discretion, see Planned Parenthood v. Attorney Gen., 297 F.3d 253, 265 (3d Cir. 2002). We begin with a discussion of Velasquez and then turn to the District Court's rulings.

The defendant in Velasquez was tried on federal drug, firearms, and conspiracy charges. A fact in issue at trial was the origin of certain packages with handwritten mailing labels, packages the government sought to connect to Velasquez's coconspirators. The government proposed to make the connection by way of forensic handwriting identification, and the District Court qualified an analyst from the Postal Inspection Service to testify to the handwriting identification. In response, Velasquez proffered his own expert-a law professor critical of handwriting analysis whose research, we held, qualified him as an expert in handwriting analysis-to testify that handwriting analysis in general is not reliable, and, in the alternative, that the particular identifications made by the government's expert were unreliable. The District Court declined to admit Velasquez's expert's testimony, reasoning that "whether or not handwriting expertise is admissible in a courtroom is a legal question that was resolved against the defense when the court permitted [the government's expert] to testify as a qualified expert in the field of handwriting analysis." Velasquez, 64 F.3d at 846-47 (internal quotation marks omitted).

On appeal, we reversed. The central error in the District Court's reasoning was its failure to follow the "axiom" that "the reliability of evidence goes `more to the weight than to the admissibility of the evidence.'" Id. at 848 (quoting United States v. Jakobetz, 955 F.2d 786, 800 (2d Cir. 1992)). Following that principle, the substantive reliability question is as much for the jury (in the context of courtroom adversary testing) as it is for the court (in the context of a Daubert hearing). Consequently, we held that it was an error of law to fail to admit the testimony of a qualified opposing expert, provided that the testimony meets the usual criteria for admission under Rule 702. Moreover, in situations covered by Velasquez, the opposing expert's testimony will ordinarily be helpful to the jury precisely because it is opposing-it will help the jury to evaluate the reliability of the opinion offered by the proponent expert. See Velasquez, 64 F.3d at 852 (holding that Velasquez's expert "would have assisted the jury in determining the proper weight to accord [the government's expert's] testimony").

In sum, Velasquez announces a parity principle: If one side can offer expert testimony, the other side may offer expert testimony on the same subject to undermine it, subject, as always, to offering a qualified expert with good grounds to support his criticism. Having this in mind, we turn to what happened in Mitchell's case.

C. The Parties' Interpretations of the District Court's Rulings

The District Court addressed the scope of Mitchell's proposed trial experts' testimony on two occasions before trial: first at the time it ruled on the admissibility of the government's expert testimony (the "first colloquy"), and again immediately prior to jury voir dire (the "second colloquy"). Because our discussion may be illuminated for some readers by a transcript of these colloquies, we reproduce the relevant passages in the Appendix.

In brief, the government claims that the District Court simply precluded Mitchell's experts from testifying to the (irrelevant, it argues) issue of whether or not latent fingerprint identification is a science; all other testimony by Mitchell's experts regarding the reliability of the discipline, the government says, was ruled admissible by the District Court. Mitchell, however, submits that the District Court expressly precluded two of his witnesses (Prof. Starrs and Dr. Cole) from testifying at trial, and severely (and impermissibly, he submits) restricted the scope of the testimony of his third expert (Dr. Stoney). To support these positions, both parties offer interpretations of the colloquies with the District Court.

The government advances a three-tier theory of the rulings of the Court on defense expert testimony, supported principally by the following statement by the District Court during the first colloquy:

I am not going to limit the defense from calling latent fingerprint experts to testify as to the ability not to identify or make an identification from the fingerprints and I am also going to allow the defense to call any latent fingerprint expert who indicates that fingerprints are not reliable sources of identification.

Only for that limited purpose and I am going to exclude evidence as to whether or not it's scientific, technical or whatever. It has no relevance before this jury here. The question is whether or not an identification can be made by examination of fingerprints-latent fingerprints-and the record of this case, as far as the Daubert hearing will remain intact with these proceedings and will go with it through the life of this case.

App. 1030a-1031a.

The government interprets the three tiers as follows: First, the defense could challenge the specific identifications made of Mitchell's prints. (Something like this was actually done-Mitchell put on the fingerprint examiners who responded to the FBI survey and who initially did not match the latent prints found in the car to his fingerprints.) Second, the defense could challenge the reliability of latent fingerprint identification in general, by arguing, for example, that the discipline lacked an error rate, and thus the government expert witnesses' testimony was unreliable. (This, the government recognizes, is compelled by Velasquez.) Third, the defense could not put on witnesses to speak to the essentially definitional question of whether latent fingerprint identification was a science.

The government primarily directs our attention to four points in the colloquies. First is the passage quoted above from the beginning of the first colloquy, before counsel for either side had even spoken. Second, moving to the second colloquy (nearly five months later), the Court arguably suggested a more blanket exclusion of defense testimony, but the government counters that the written record of the colloquy is misleading because the whole topic of discussion had caught the Court by surprise and the Court's recollection needed to be refreshed. (Indeed, for much of the colloquy, the Court did not even have a transcript of the prior ruling before it.) Third, the government points out that during the second colloquy, the prosecutor advanced his own recollection of the ruling, saying that, in addition to permitting the defense to call experts that would testify that the fingerprints in this case did not match Mitchell's, "[the Court] also said [to the defense] that they can call any qualified expert... that would testify that fingerprints are not reliable sources of identification." App. 1071a. The government emphasizes that this was consistent with the three-tier theory.

Fourth, the government reads the ultimate ruling at the end of the second colloquy-especially the Court's approval of defense expert testimony by experts addressing "Mr. Mitchell's fingerprints or anyone else's fingerprints," App. 1072a-as a reaffirmance of the three-tier ruling. This should have special significance because it was the Court's last word on the subject. Finally, looking beyond the colloquies, further circumstantial support for the prosecution's three-tier theory can be drawn from the District Court's ruling at trial that Mitchell was allowed to cross-examine Agent Meagher on several issues pertaining to the general reliability of latent fingerprint identification. See App. 1543a.

For his part, Mitchell first directs our attention to the first colloquy where the Court seemed to make a specific ruling against admitting testimony by experts other than Dr. Stoney. There, the Court said, "the only one that appears close [to admissible] ... would be Dr. David A. Stoney."26 App. 1032a. Second, Mitchell points to the Court's statement near the end of the first colloquy that "I am not getting into the issue of latents in general. That's been established," App. 1033a, contending that this runs directly counter to our holding in Velasquez. Third, Mitchell disagrees with the government's claim that some of the second colloquy was colored by the need to refresh the Court regarding the issue; Mitchell would have us take the Court's statements literally, reading, for example, a "yes" from the Court following a statement by defense counsel that Mitchell had been "precluded from introducing [testimony] that the fingerprint field is of questionable reliability," App. 1067a, as evincing agreement rather than as a signal to "go on." Fourth, Mitchell does not read the Court's ultimate ruling at the end of the second colloquy to be a blanket authorization to put on any reliability-related expert testimony, but rather a very limited approval of testimony assailing any government testimony that relied on a particular point-based standard for identification. This interpretation seems consistent with Mitchell's counsel's contemporaneous representation that they had no witness that would meet the Court's requirement.27

We begin our analysis with the point on which the parties are in agreement: The District Court excluded expert testimony on the subject of whether latent fingerprint identification is a science. We hold that it was correct to do so. Kumho Tire renders the question of "is it science?" immaterial to the jury's determination (and the court's, for that matter, see supra page 244). Consequently, such testimony will not "assist the trier of fact ... to determine a fact in issue," making the testimony not admissible under Rule 702. Since the evidence is opinion testimony, there is no other appropriate basis on which to admit it, and so the District Court was correct to exclude it.

On balance we agree with the government that the District Court consistently operated on a three-tier theory of what expert testimony was admissible: allowing specific criticisms and general reliability criticisms, but excluding testimony about whether latent fingerprint identification is a "science." At the same time, we acknowledge the force of Mitchell's reading. But even if Mitchell's reading were correct, he would not prevail because the record does not establish an affirmative exclusion of testimony that should have been admitted under Velasquez. Counsel simply did not seek rulings on the admissibility of proposed expert testimony, and instead simply discussed admissibility in terms of proposed expert witnesses. From these rulings, we cannot say that the District Court erred.

To elaborate, both Mitchell and the District Court framed the issue as whether a given witness was or was not admissible, and not as whether testimony on a given subject matter was admissible. While this approach may seem pragmatic (after all, from a logistical point of view, what matters is whether a given witness will or will not testify), it has serious pitfalls for creating an appellate record. If an expert witness is excluded, it is generally because he or she is unqualified; but this is irrelevant here because the parties do not dispute the qualifications of the witnesses. To be sure, expert witnesses may also as a practical matter be excluded because they cannot testify to any admissible subject matter. But in such a case, the legally operative question is "what is (are) the proposed subject matter(s) of the witness's testimony?" This is necessarily so because the only way for appellate courts to state the law for future cases is to do so in terms regarding the subject matter of proposed testimony, as we did in Velasquez, for example. Thus speaking in terms of which witness is admissible is actually one step removed from the legally operative question. Using witnesses as shorthand for subject matters may be convenient, but it becomes confusing and the law becomes difficult to apply, especially when a given witness testifies on multiple subject matters.

This is precisely what happened here: All of the principal defense experts testified in some measure on whether fingerprint identification was a "science." This, we have already held above, was properly excluded. Those same experts also testified to the reliability (or lack thereof) of fingerprint identification. That evidence, under Velasquez, would have been unambiguously admissible. Yet the admissibility question was not, as best we can divine from the colloquies, framed in this way.

At the Daubert hearing, Mitchell's counsel cast his case as an assault on the scientific status of fingerprint identification. Indeed, at the Daubert hearing, Dr. Stoney was offered as "an expert with respect to scientific status or lack thereof with respect to latent fingerprint identification," App. 761a; Prof. Starrs was offered as "an exert [sic] in forensic science qualified to provide an opinion as to whether latent fingerprint examination meets the criteria of science," App. 813a-814a; and Dr. Cole was offered as "an expert in the field of science and technology studies with particular expertise regarding the fingerprint profession," App. 939a. At no point thereafter did Mitchell attempt to have these witnesses qualified differently.

Mitchell's attorneys hewed to this rubric even after the hearing, and so interpreted the District Court's (proper) exclusion of "is it science?" testimony as a wholesale exclusion of their witnesses. They were not required to approach the matter in this way, and the District Court was surely not required to disabuse Mitchell's counsel of this notion. Mitchell could have asked the Court whether Prof. Starrs and Dr. Cole would be permitted to testify as to the reliability of fingerprint identification, provided that they did not opine on the irrelevant issue of whether it was science. Instead, he accepted their exclusion. Mitchell could have proffered the subject matter of testimony he would like to present. Instead, he proffered the witnesses he would like to call. Mitchell could have attempted to put his witnesses on the stand to preserve his objections. Instead, they never appeared at trial.

At best, Mitchell offers a modest circumstantial case that, if he had posed the question of the admissibility of defense expert testimony that fingerprint identification is unreliable, the District Court would have excluded it, contrary to Velasquez. But if the question was never asked (and our review of the record suggests it was not), then it is hardly grounds for reversal that the District Court might have ruled incorrectly. Thus the District Court committed no error.

V. The District Court's Declaration of Judicial Notice

We next turn to the question whether the District Court properly took judicial notice that "human friction ridges are unique and permanent throughout the area of the friction ridge skin, including small friction ridge areas, and that ... human friction ridge skin arrangements are unique and permanent." App. 1472a. "[A] court's decision whether to take judicial notice of certain facts is reviewed for abuse of discretion." In re NAHC, Inc. Sec. Litig., 306 F.3d 1314, 1323 (3d Cir. 2002).

Federal Rule of Evidence 201(b) specifies what matters are the proper subject of judicial notice:28

A judicially noticed fact must be one not subject to reasonable dispute in that it is either (1) generally known within the territorial jurisdiction of the trial court or (2) capable of accurate and ready determination by resort to sources whose accuracy cannot reasonably be questioned.

The actual phrasing offered by the government and adopted by the District Court is opaque; while we can comprehend the notion that friction ridge arrangements are permanent, we are unsure what it means to describe "arrangements," considered in the abstract, as "unique." On one level, this seems irrelevant: Since the issue at trial was latent fingerprints, it is difficult to see how general propositions about "arrangements" are related to any "fact that is of consequence to the determination of the action," Fed.R.Evid. 401. Moreover, "small friction ridge areas" seems problematic: what is "small"? (In light of the issues at trial, we imagine that it was a reference to areas the size of typical latent fingerprints.) Even without reference to the substantive standard in Rule 201(b), we wonder whether the very phrasing of the judicially noticed material signals that the District Court erred.

Vagueness and irrelevance aside, judicial notice of these matters clearly failed Rule 201(b). The Rule requires that the matter "not [be] subject to reasonable dispute." Yet much of Mitchell's presentation at the Daubert hearing was directed at disputing this very proposition;29 if the question merited such an extensive Daubert hearing, it surely was not suitable for resolution by judicial notice. Moreover, Rule 201 speaks in terms of "fact[s]." Here, the Court took judicial notice of a scientific conclusion (something which is subject to revision), not a "fact."30 One of the purposes of a Daubert hearing is to educate the Court as to the relevant expertise. That the Daubert hearing consumed five days before the Court could take judicial notice only further compels the conclusion that this "fact" was neither "generally known" nor "capable of ... ready determination."

The government's defense of the District Court's taking of judicial notice focuses on the large number of cases where courts have taken judicial notice of the uniqueness of fingerprints. None of the cases cited by the government is binding on this Court. More to the point, none of them concerns judicial notice of the uniqueness and permanence of "small areas" of friction ridge skin; rather, the cases generally concern the uniqueness of full fingerprints, or the method of fingerprint identification. While we have doubts about the propriety of taking judicial notice even in those cases (one need only look at our Daubert analysis above to see that the matter is in dispute), for present purposes we need only note that the cases cited by the government are clearly distinguishable. Thus we conclude that it was error for the Court to take judicial notice as it did.

Having concluded that it was error for the District Court to take judicial notice as it did, we must consider whether the error was harmless. Under our precedent, an error is harmless if "`it is highly probable that the error did not contribute to the judgment.'" United States v. Davis, 183 F.3d 231, 255 (3d Cir. 1999) (quoting Murray v. United of Omaha Life Ins. Co., 145 F.3d 143, 156 (3d Cir. 1998)). We conclude that the error was harmless.

The record of the Daubert hearing establishes that the government could have adduced estimable testimony (both in its quantity and quality) in place of the District Court's taking judicial notice. The ready availability of probative, credible substitute evidence suggests with a high probability that the jury's verdict would not have changed had the District Court declined to take judicial notice and the government been forced to put on live testimony. The Court of Appeals for the Fifth Circuit has endorsed the view that the availability of cumulative or substitute evidence can make admission of evidence harmless. See United States v. Arroyo, 805 F.2d 589 (5th Cir. 1986); cf. United States v. Anderskow, 88 F.3d 245, 251 (3d Cir. 1996) (holding that improper admission of cumulative evidence is generally harmless error). We also note that Mitchell was free to put on evidence to rebut the substance of the Court's judicial notice, see Gov't of V.I. v. Gereau, 523 F.2d 140, 147 n. 17 (3d Cir. 1975), but did not do so.

We recognize the possibility that the Sixth Amendment's Confrontation Clause may be implicated when a court undertakes a harmless error analysis in a criminal case (such as we are doing here) and bases its conclusions on the probable outcome of a hypothetical trial where hypothetical witnesses are called. See United States v. Gallego, 191 F.3d 156, 164-65 & n. 3 (2d Cir. 1999). This would not present an obstacle here, however, because the putative substitute testimony was actually given at the Daubert hearing and was subject there to cross-examination by Mitchell, who had the same motive to attack the government's experts as he would have had at trial. Thus the Confrontation Clause would not, at all events, be offended by our harmless error analysis. See Crawford v. Washington, 541 U.S. 36, 124 S. Ct. 1354, 1374, 158 L. Ed. 2d 177 (2004) ("Where testimonial evidence is at issue, however, the Sixth Amendment demands what the common law required: unavailability and a prior opportunity for cross-examination."); cf. Fed.R.Evid. 804(b)(1) (permitting introduction of hearsay under these conditions).

Mitchell counters that the District Court's declaration of judicial notice lent an imprimatur of authority to the government's fingerprint case that no amount of expert testimony could have replaced, and no amount of rebuttal could have overcome. We acknowledge that the consequences of a district court's taking judicial notice of disputed facts can be considerable, for the unique imprimatur of the district court can render judicial notice of a disputed fact not harmless, even when there is cumulative (or substitute) evidence. But we do not think the facts here support that argument, principally because the government had not only substitute evidence, but almost overwhelming substitute evidence: The Daubert hearing record discloses a wealth of testimony on this point from credible and well-qualified experts. In fact, at the Daubert hearing the government asked each of five distinguished expert witnesses his opinion of essentially the matters the District Court judicially noticed. All five took the same position as the District Court did in taking judicial notice. See supra page 223. Thus, this was not a case where judicial notice replaced limited and shaky evidence. Any additional authority the government drew by the Court's taking judicial notice was, at most, marginal. Thus we conclude that, though error, the District Court's taking of judicial notice was harmless.

Mitchell argued in his Fed. R. Crim. P. 33 motion that the government violated its obligations under Brady v. Maryland, 373 U.S. 83, 83 S. Ct. 1194, 10 L. Ed. 2d 215 (1963), by failing to disclose the solicitation for fingerprint validation studies which it ultimately released to the public shortly after Mitchell was convicted. Several prongs must be met to establish a Brady violation, but we need only concern ourselves, as the District Court did, with Brady's materiality prong. We agree with the District Court that, even if Mitchell had had the solicitation at trial, there was not a reasonable probability that he would have been acquitted.

We have explained that "[o]rdinarily we review a district court's ruling on a motion for new trial on the basis of newly discovered evidence for abuse of discretion." United States v. Perdomo, 929 F.2d 967, 969 (3d Cir. 1991) (citing Gov't of V.I. v. Lima, 774 F.2d 1245 (3d Cir. 1985)). But "[b]ecause a Brady claim presents questions of law as well as questions of fact, we will conduct a de novo review of the district court's conclusions of law as well as a `clearly erroneous' review of any findings of fact where appropriate." Id. (citing Carter v. Rafferty, 826 F.2d 1299, 1306 (3d Cir. 1987)).

In Brady, the Supreme Court announced that "`the suppression by the prosecution of evidence favorable to an accused upon request violates due process where the evidence is material either to guilt or to punishment, irrespective of the good faith or bad faith of the prosecution.'" Banks v. Dretke, 540 U.S. 668, 124 S. Ct. 1256, 1267, 157 L. Ed. 2d 1166 (2004) (quoting Brady, 373 U.S. at 87, 83 S. Ct. 1194). "[T]he three components or essential elements of a Brady prosecutorial misconduct claim," the Court recently reiterated, are: "`The evidence at issue must be favorable to the accused, either because it is exculpatory, or because it is impeaching; that evidence must have been suppressed by the State, either willfully or inadvertently; and prejudice must have ensued.'" Id. at 1272 (quoting Strickler v. Greene, 527 U.S. 263, 281-82, 119 S. Ct. 1936, 144 L. Ed. 2d 286 (1999)).

In evaluating a Brady claim, the "touchstone on materiality is Kyles v. Whitley." Id. at 1276. "[T]he materiality standard for Brady claims is met when `the favorable evidence could reasonably be taken to put the whole case in such a different light as to undermine confidence in the verdict.'" Id. (quoting Kyles, 514 U.S. 419, 435, 115 S. Ct. 1555, 131 L. Ed. 2d 490 (1995)). This a defendant must show by demonstrating a "`reasonable probability' of a different result," had the withheld evidence been available. Kyles, 514 U.S. at 434, 115 S. Ct. 1555 (citing United States v. Bagley, 473 U.S. 667, 678, 105 S. Ct. 3375, 87 L. Ed. 2d 481 (1985)). This standard is relatively lenient; "[t]he question is not whether the defendant would more likely than not have received a different verdict with the evidence, but whether in its absence he received a fair trial, understood as a trial resulting in a verdict worthy of confidence." Id.

Two other questions of law bear on the somewhat unusual circumstances of the alleged Brady violation in this case. First, assuming that the government acted in bad faith to withhold publication of the solicitation, we must consider how, if at all, the bad faith aspect affects the Brady calculus. We are deeply discomforted by Mitchell's contention (supported by Dr. Rau's account of events, though contradicted by other witnesses) that a conspiracy within the Department of Justice intentionally delayed the release of the solicitation until after Mitchell's jury reached a verdict. Dr. Rau's story, if true, would be a damning indictment of the ethics of those involved.

The District Court declined to reach the issue of whether the government suppressed the solicitation, and it made neither a finding of fact nor even an implicit credibility determination on the conflict between Dr. Rau's account and the testimony of the government's witnesses. Thus we have no factual determination to which we may defer. But as a legal matter, the question of good faith versus bad faith is a distinction without a difference in the Brady context. Indeed, the Brady Court itself said that its holding was "irrespective of the good faith or bad faith of the prosecution," 373 U.S. at 87, 83 S. Ct. 1194, and this was reaffirmed in United States v. Agurs, 427 U.S. 97, 110, 96 S. Ct. 2392, 49 L. Ed. 2d 342 (1976) ("If the suppression of evidence results in constitutional error, it is because of the character of the evidence, not the character of the prosecutor."). Mitchell does not suggest, nor do we adopt, a rule of per se materiality in the face of bad faith withholding by the prosecution.

Mitchell does, however, urge us to adopt the position enunciated in United States v. Jackson, 780 F.2d 1305 (7th Cir. 1986). There the Court of Appeals explained that the existence of bad faith on the part of the prosecution is probative of materiality because it is "doubtful that any prosecutor would in bad faith act to suppress evidence unless he or she believed it could affect the outcome of the trial." Id. at 1311 n. 4. We agree that the existence of bad faith on the part of the prosecution is a factor for the court to consider in weighing the materiality of the withheld evidence. The District Court erred to the extent that it undertook its Brady materiality inquiry without evaluating and incorporating the government's alleged bad faith. In the next section we will consider the alleged bad faith in making our own materiality determination.

The second question of law that we must address arises because the government proffered extensive evidence to rebut Mitchell's contentions regarding the solicitation. Therefore we must determine whether we are to assess Brady materiality by reference to a hypothetical trial at which the withheld evidence alone is introduced, or one at which both the withheld evidence and reasonable rebuttal evidence are introduced. The Supreme Court has made clear that the Brady (or, in its citations, Bagley) materiality determination displaces a harmless error inquiry. See Kyles, 514 U.S. at 435-36, 115 S. Ct. 1555. Thus, assuming that the Confrontation Clause bears on this issue, see supra page 253, its significance is the same.

In deference to the possible Confrontation Clause implications, absent an opportunity for cross-examination of prosecution rebuttal evidence (which would satisfy Crawford), we will undertake the Brady materiality inquiry with reference only to the evidence withheld, and not consider the prosecution's rebuttal. We note, however, that the typical case will be the exception to this rule: Normally a Brady claim will be assessed in light of an evidentiary hearing (as was the case here) and the defendant will have an opportunity for cross-examination at that hearing. Such cross-examination satisfies Crawford, 124 S. Ct. at 1374, and thus would clearly be properly considered in evaluating Brady's materiality prong. Since Mitchell had the opportunity for cross-examination in his new trial hearing, we will consider the full record in determining whether there is a reasonable probability that the solicitation would have changed the outcome of the trial.

The first Brady prong ("favorable to the accused") is met, for the parties do not dispute that the existence of the solicitation is favorable to Mitchell (though just how favorable it is very much in dispute). We do not reach the question whether the second prong ("suppressed by the State"), which we have held requires that the prosecution have "actual knowledge or cause to know" of the undisclosed material, see United States v. Veksler, 62 F.3d 544, 550 (3d Cir. 1995), is met by virtue of either (1) the involvement of government experts in the solicitation's preparation, or (2) the fact that the NIJ and the United States Attorney for the Eastern District of Pennsylvania are both under the United States Department of Justice. Therefore, we confine our discussion to the third prong ("prejudice must have ensued").

As we have noted, the District Court gave two reasons why the solicitation was not material under Brady: first, that it would not have been admissible, and second, that even had it been admitted, there was not a reasonable probability that the outcome of the trial would have changed. On appeal, the government does not defend the District Court's first ground; the parties correctly recognize that under Velasquez, 64 F.3d 844, the solicitation would have been admissible both at trial and at the Daubert hearing as tending to undermine the government's claim that latent fingerprint identification is reliable.

Mitchell principally presses on appeal that use of the solicitation at the trial itself would have had a reasonable probability of changing the verdict, but we will first consider whether the solicitation was material to the Daubert ruling, since a Daubert ruling favorable to Mitchell would very likely have changed the outcome at trial. Based on our thorough review of the admissibility under Daubert of the government's latent fingerprint identification evidence, see supra Part III, it is clear that the Daubert calculus does not materially change in light of the solicitation.

Mitchell's main contention requires that we consider whether the absence of the solicitation at trial "undermine[s] confidence in the verdict." Kyles, 514 U.S. at 435, 115 S. Ct. 1555. We assume, but do not decide, that the solicitation would have been admissible at trial for its contents as a non-hearsay admission of a party opponent (the government) under Fed.R.Evid. 801(d)(2), and would have been admissible as impeachment evidence under Fed.R.Evid. 801(d)(1)(A) against Agent Meagher, who participated in the preparation of the solicitation.

Mitchell hypothesizes that "[t]he jury most probably would have been stunned to learn ... that the government and its fingerprint experts have `invited' ... `basic research' to determine whether fingerprints are truly unique and testing to determine whether fingerprint examiners can produce correct results with acceptable error rates." Reply Br. at 39. If the solicitation were to be taken in a vacuum, this might be true. But the government witnesses at the new trial hearing explained (and the District Court found as a factual matter) that this solicitation (like other NIJ solicitations) is not "meant to set forth the state of the current research, but rather is only intended to set forth sufficient information such that researchers can apply for funds to perform further research." App. 12a. Apart from direct testimony from several government witnesses familiar with the NIJ solicitation process, there was also evidence that the NIJ routinely issues solicitations for research in other well-established fields of forensic expertise, such as DNA identification. Thus the District Court's finding regarding the purpose of the solicitation is not clearly erroneous. In that light, we conclude that a reasonable jury would not conclude that the solicitation was the smoking gun that Mitchell makes it out to be.

The government's bad faith, if any, in withholding the solicitation does not appreciably alter this because intentional withholding in these circumstances is consistent not only with a guilty mind but also with a concern on the government's part that the solicitation would be misunderstood. Moreover, the solicitation would have been only a small part of a large mosaic of evidence put on at trial about the reliability and operation of latent fingerprint identification. In our view, the impact of the solicitation would have been dwarfed by other evidence favorable to the government.

Relatedly, Mitchell contends that the solicitation would have been powerful impeachment evidence against Agent Meagher, who was the government's principal expert witness at trial, because Meagher was involved in the drafting of the solicitation. In ruling on Mitchell's Rule 33 motion, the District Court credited "the testimony of the Government's witnesses at the Solicitation Hearing that the Solicitation does not change their testimony regarding fingerprint technology." App. 12a-13a. In other words, the District Court discounted the impeachment value of the solicitation even after having seen Mitchell's actual cross-examination of the government's experts both with the solicitation (at the new trial hearing) and without it (at trial). The District Court had the best vantage point, at both proceedings, to assess the government's witnesses (especially Agent Meagher), and we defer to its finding. See United States v. Perez, 280 F.3d 318 (3d Cir. 2002).

In sum, the solicitation does not undermine our confidence in the verdict from a substantive or impeachment vantage point. We conclude that it was not material, and therefore reject Mitchell's Brady claim.

VII. Admission of Alleged Prior Consistent Statements

Mitchell's final objection is to what he regards as the admission of certain prior consistent statements by the government's key lay witness, Kim Chester. Mitchell contends that, following his attack on Chester's credibility during cross-examination, the government on redirect sought to rehabilitate her by introducing prior consistent statements. Mitchell's argument is that the District Court erred in letting the prosecution proceed as it did because those statements were hearsay not within any hearsay exception. We conclude that, in fact, no hearsay was introduced, and therefore Mitchell's objection fails.

Although counsel for Mitchell objected at pertinent points during the redirect examination of Chester on various specific grounds, no hearsay objection was made. Thus Mitchell has failed to preserve this objection for appeal. See Fed.R.Evid. 103(a)(1); United States v. Sandini, 803 F.2d 123, 126 (3d Cir. 1986) (holding that specific objections are required to preserve issues for appeal); United States v. Gomez-Norena, 908 F.2d 497, 500 (9th Cir. 1990) (holding that a party fails to preserve an issue for appeal by making an incorrect specific objection).

Accordingly, our review is for plain error only. See United States v. Brink, 39 F.3d 419, 425 (3d Cir. 1994). To establish plain error, a defendant must prove that there is "(1) `error,' (2) that is `plain,' and (3) that `affects substantial rights.' If all three conditions are met, an appellate court may then exercise its discretion to notice a forfeited error, but only if (4) the error `seriously affects the fairness, integrity, or public reputation of judicial proceedings.'" Johnson v. United States, 520 U.S. 461, 467, 117 S. Ct. 1544, 137 L. Ed. 2d 718 (1997) (citations omitted).

The government's redirect examination of Ms. Chester elicited three things. First, she had met with FBI agents and given them a statement. Second, that statement included discussions of Mitchell, Bookie, and T's activities. Third, she had testified before regarding their activities. (This testimony was in Mitchell's first trial, though the jury, of course, did not learn this.) The examination did not establish the contents of those prior statements, merely their existence and subject matter. The prosecution used the existence of these prior statements during closing arguments to bolster Chester's credibility with a "dog that did not bark" argument. That is, the prosecutor offered the jury the line of reasoning that if these statements existed, and they were harmful to Ms. Chester's credibility, then Mitchell surely would have introduced them. The fact that he did not, the prosecutor argued, must mean that they were not inconsistent, and that Ms. Chester was in fact a reliable and consistent witness.31

Mitchell claims that the government introduced Chester's prior consistent statements (to the FBI and at Mitchell's first trial) to rehabilitate her in the wake of attacks on her credibility during cross-examination. While the government's motive was to rehabilitate Ms. Chester, we do not agree that any hearsay statements were introduced. Rule 801(c), which defines "hearsay," concerns only "statements," and so the first question to ask is whether the government elicited a statement.

"A `statement' is (1) an oral or written assertion or (2) nonverbal conduct of a person, if it is intended by the person as an assertion." Fed.R.Evid. 801(a). Nonverbal conduct is plainly not at issue. Chester's prior statements may be oral or written assertions, but they were not actually introduced. Testimony about the existence of a statement is not itself a "statement." Furthermore, to the extent that Chester testified that certain matters were discussed on prior occasions, that testimony was not "offered ... to prove the truth of the matter asserted," Fed.R.Evid. 801(c), and thus not inadmissible under Rule 802.32 Thus the District Court committed no error.

Moreover, even if Chester's testimony were hearsay, we would not reverse Mitchell's conviction, because the third prong of the Johnson plain error test is not met. The "substantial right" implicated in erroneous admission of hearsay in a criminal trial is the Sixth Amendment Confrontation Clause. See, e.g., Crawford, 124 S. Ct. at 1374. The Clause has little weight when the declarant is actually on the stand, as was the case here. Moreover, the whole issue was collateral (it went only to credibility), and Mitchell had done a relatively unconvincing job of undermining Ms. Chester's credibility on cross-examination. In our view, rehabilitated or otherwise, the jury would have given the same weight to Ms. Chester's testimony.

The judgment of the District Court will be affirmed.

APPENDIX: Colloquies with the District Court Regarding Admissibility of Mitchell's Proposed Experts.

With the exception of identifying the prosecutor and defense counsel, the following transcripts are verbatim the transcript supplied in this Court. We have not attempted to repunctuate it, but have noted possible errors in transcription or in speaking. What follows is the District Court's colloquy with counsel following its ruling on the admissibility of the government's expert testimony:

THE COURT: Counsel, the matter presently pending before the Court is in reference to the defense motion to exclude the Government's fingerprint identification evidence and based on the Daubert hearing and also Kumho, this court denies the defendant's motion and pursuant thereto, this court is not going to make a determination as to the particular area of scientific knowledge and technical or specialized knowledge. We are going to grant the motion with respect to the expert pursuant to Rule 702 and as stated in Kumho, not only would it be difficult to prove, but almost impossible for a judge to administer evidentiary rules under which a gatekeeper obligation depending upon a distinction between scientific knowledge and technical or other specialized knowledge.

Since there is no clear line dividing the one from the others and no convincing need to make such distinction, therefore, this court does not feel compelled by any case authority to make that distinction in the case before us.

* * *

We find that the Government's expert witness at this juncture appears it's Duane Johnson [sic, "Wilbur Johnson"?], an FBI latent fingerprint examiner who testified first in the previous trial and those other latent fingerprint experts that testified in the Daubert hearing are capable of testifying in these proceedings and in that regard, I am not going to limit the defense from calling latent fingerprint experts to testify as to the ability not to identify or make an identification from the fingerprints and I am also going to allow the defense to call any latent fingerprint expert who indicates that fingerprints are not reliable sources of identification.

Only for that limited purpose and I am going to exclude evidence as to whether or not it's scientific, technical or whatever. It has no relevance before this jury here. The question is whether or not an identification can be made by examination of fingerprints-latent fingerprints-and the record of this case, as far as the Daubert hearing will remain intact with these proceedings and will go with it through the life of this case.

* * *

I believe, ultimately, it will be a factual determination for the jury to make as to whether or not there's been a positive identification pursuant to whatever standards are applicable and make that determination, as opposed to this court taking judicial notice of that.

* * *

In that regard, when I am speaking about the defense experts, out of the three that testified-I say "experts" because they called a paralegal to testify, but out of the three, the only one that appears close, based on the testimony at the Daubert hearing, would be Dr. David A. Stoney and I say "close" because a vast majority of his testimony dealt with the scientific aspect as opposed to the latent fingerprint reliability and his experience from that background.

All right, you can make your decisions and at that point in time that you decide to make or attempt to call a witness, we will have an offer of proof and I will entertain it and make a determination based on the offer of proof as to whether or not the witness will be allowed to testify as with any witness.

* * *

THE PROSECUTION: Just a clarification, your Honor.

You first mentioned that the defense experts-did I understand the court correctly with respect to the sufficiency of the latent fingerprints in this particular case?

THE COURT: Yes.

THE PROSECUTION: Okay and that is likewise —

THE COURT: Such as some of the witnesses that were used to look at these latents throughout the United States.

If they were to call that fingerprint expert and that fingerprint expert says, "There is no way I can make a positive identification from that latent fingerprint," that's relevant for the purpose of these proceedings.

THE PROSECUTION: I wanted to clarify we were talking about these latents versus the issue of latents in general.

THE COURT: No, I am not getting into the issue of latents in general. That's been established.

THE DEFENSE: One quick point of clarification. I take it we would not be permitted to call Professor Starrs?

THE COURT: Looking at his testimony from the Daubert hearing, he would not qualify under my analysis based on Rodriguez?

THE PROSECUTION: The Eleventh Circuit case is U.S. versus Paul.

THE COURT: I am talking about the Third Circuit case, Vasquez. [sic, "Velasquez"?]

THE COURT: Anything further?

THE DEFENSE: No, your Honor, not on this point.

App. 1029a-1034a.

Nothing further appears in the record on the issue of defense experts until the morning of jury voir dire, at which the Court had the following colloquy with counsel:

THE DEFENSE: ... And, in addition, your Honor, I would like to state on the record, to clarify my understanding of this Court's pretrial ruling, I discussed it with the government, I think we are in agreement as to what the Court's ruling was. In some respects it was not clear initially to me. I want, for appellate purposes to put it on the record.

THE COURT: What's that in reference to, what ruling?

THE DEFENSE: Referring to your ruling as to the admissibility or the partial admissibility of the fingerprint examiners, in light of the Daubert hearing, entertained by the Court.

THE COURT: When was the Daubert hearing?

THE DEFENSE: It was over the summer, the exact dates, I don't know. The Court's ruling was announced from the bench on September 13th of last year.

* * *

THE COURT: What specifically did you have problems understanding?

THE DEFENSE: Your Honor, what my understanding of this Court's ruling, the defense may call any witness or examiners which I'm prepared to do, who formed an opinion as to the latent prints at issue. But, I further understood the Court to say, I was precluded from introducing any evidence by individuals who are of the opinion that the fingerprint field is of questionable reliability, given the lack of testing, the reasons that I have articulated at the Daubert proceeding.

THE COURT: Yes.

THE DEFENSE: I would just proffer, your Honor, that I would call the same three people that the Court heard at the hearing, if the Court had so ruled.

* * *

THE PROSECUTION: I want one clarification.

* * *

THE PROSECUTION: You also told them that they could call any qualified expert, meaning in the field of fingerprints that would testify that fingerprints are not reliable sources of identification.

I mean there's a slight difference. I think the Court ruled with respect to two of the witnesses on the 13th, that they would be excluded. You did not preclude Stoney or exclude him in all respects then but you had made a ruling, you didn't-he had not been fleshed out as an expert in fingerprints either. All I'm saying, that the Court let the defense try to find experts in the field that would say that the fingerprints are not reliable sources of identification.

THE COURT: I don't have that transcript before me.

THE PROSECUTION: I can hand up my copy.

* * *

THE COURT: Let me refresh my recollection as to this whole hearing, counsel. I'm somewhat at a disadvantage since I thought this was done. Let me refresh.

Specifically, on page four, I indicated: "I am not going to limit the defense from calling latent fingerprint experts to testify as to the ability not to identify or make an identification from the fingerprints and I am also going to allow the defense to call any latent fingerprint expert who indicates that fingerprints are not reliable sources of identification."

* * *

THE COURT: Then I said: "Only for that limited purpose and I am going to exclude evidence as to whether or not it's scientific, technical or whatever."

* * *

THE DEFENSE: The government before that said on page six, your Honor, in the middle of the page, line 18.

"The Prosecution: Just a clarification, your Honor.

You first mentioned that the defense experts-did I understand the Court correctly with respect to the sufficiency of the latent fingerprints in this particular case?

The Court: Yes.

The Prosecution: Okay and that is likewise —

The Court: Such as some of the witnesses that were used to look at these latents throughout the United States.

If they were to call that fingerprint expert and that fingerprint expert says, there is no way I can make a positive identification from that fingerprint, that's relevant for the purpose of these proceedings."

THE COURT: That's what I said, any latent fingerprint expert, who can look at these prints and say I can't make an identification or I can make an identification.

THE DEFENSE: As to these particular prints at issue, that's it.

THE COURT: That's it, the only thing relevant for these proceedings, right.

THE DEFENSE: Over my objection, the Court ruled.

THE COURT: Based on the facts that I made that ruling.

THE DEFENSE: Yes.

THE COURT: Anything further?

THE PROSECUTION: Just again for clarification, your Honor, not clarification but the statement, so I understand on page four, you also said that they can call any qualified expert in the field that would testify that fingerprints are not reliable sources of information, not limited to those latents, but if they can get a qualified expert in the fingerprint field to come in here to say, well, I'm a qualified expert in fingerprints. Fingerprint identification is not a reliable source of identification, they have the option and the ability to do that?

THE DEFENSE: That's what we would have done with Dr. Stoney, we did at the hearing, that he has the opinion that the field is of questionable reliability.

THE COURT: He is going to say, a scientific and technical determination?

THE DEFENSE: That the Court ruled on.

THE COURT: That the Court ruled on. That's fine, that's complete. But, in that regard, though, if you have a latent fingerprint expert who will testify, an expert or a person in latent fingerprints can't make a positive identification with 10 points, 15 points, 40 points, then you are permitted to-you can call that expert to testify, it doesn't have to do with just his particular points, that one can find but in general, if you have an expert, a latent fingerprint expert that can testify that a person cannot, a person in the field, an expert in the field cannot make an identification, whether it is Mr. Mitchell's fingerprints or anyone else's fingerprints, based on 10, 20, 15, you are permitted to call that expert.

* * *

THE DEFENSE: No one to present the testimony as your Honor outlined.

THE COURT: I don't know that.

THE DEFENSE: I'm representing that.

THE COURT: That's what you are representing to the Court.

THE DEFENSE: There would, yes, sir, there would be Dr. Stoney's testimony, that there is-it is of questionable reliability because there's no testing done in the field. Not to be redundant, similar to what he testified to.

THE COURT: The record will remain as his testimony that you presented at these proceedings. Whether or not you call him in reference to latent fingerprint identification is your call.

THE DEFENSE: Right. That would be similar to the other two people that I would call.

THE COURT: Very well.

THE DEFENSE: Simon, Cummins, Professor Starr.

THE COURT: The other individuals that testified at the Daubert hearing?

In the jargon, artifacts are generally small amounts of dirt or grease that masquerade as parts of the ridge impressions seen in a fingerprint, while distortions are produced by smudging or too much pressure in making the print, which tends to flatten the ridges on the finger and obscure their detail.

We note that these experiments (and, indeed, much of the expertise marshaled both by the government and by Mitchell) required resources and preparation that are far from typical in federal criminal trials.

An analogy may illustrate this biasing effect: Consider a large multicolored pile of crayons produced by mixing several boxes of crayons. If one chooses a dozen "dark" crayons at random, one is more likely to find among those dozen crayons a pair of exactly the same color than one is to find such a pair if one selects a dozen crayons at random from the pile at large.
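The crayon analogy can be made concrete with a short counting sketch. The figures used here (four boxes, 64 colors per box, 16 of those colors counted as "dark") are illustrative assumptions only and do not come from the record; the function name is likewise hypothetical. The sketch computes, for a draw of a dozen crayons without replacement, the chance that at least two share exactly the same color, first for the pile at large and then for the "dark" subset alone.

```python
from math import comb


def prob_same_color_pair(num_colors: int, copies_per_color: int, draws: int) -> float:
    """Chance that `draws` crayons, taken without replacement from a pile holding
    `copies_per_color` crayons of each of `num_colors` colors, include at least
    one pair of exactly the same color."""
    total_ways = comb(num_colors * copies_per_color, draws)
    # Ways to draw crayons of all-different colors: pick the colors,
    # then pick one physical crayon of each chosen color.
    all_distinct_ways = comb(num_colors, draws) * copies_per_color ** draws
    return 1 - all_distinct_ways / total_ways


# Assumed, illustrative pile: 4 boxes mixed together, 64 colors per box,
# 16 of the 64 colors treated as "dark".
BOXES = 4
COLORS_PER_BOX = 64
DARK_COLORS = 16
DRAWS = 12

p_whole_pile = prob_same_color_pair(COLORS_PER_BOX, BOXES, DRAWS)
p_dark_only = prob_same_color_pair(DARK_COLORS, BOXES, DRAWS)

print(f"same-color pair, dozen drawn from whole pile: {p_whole_pile:.3f}")
print(f"same-color pair, dozen drawn from dark pile:  {p_dark_only:.3f}")
```

Under these assumed numbers the restricted "dark" draw yields a matching pair far more often than the unrestricted draw, which is the biasing effect the analogy describes. Different assumed box sizes change the magnitudes but not the direction of the effect.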

We note that the comparisons were run for each print against all 50,000 prints, not against the other 49,999 prints. Thus, every print was assured of having a tautologically perfect match (i.e., itself) that could serve as a baseline for statistical comparisons. This was done to quantify statistically how much better the perfect match was than all other comparisons. The cases in which a print was a strong match for a print other than itself were subsequently discovered to be the product of a double-entry in the database (i.e., a set of prints from the same person had been entered into the database twice). The experimenters testified that the system's ability to catch this unintentional duplication bolstered their confidence in its capabilities.

Meagher explained that the sole exception was caused by a poorly created fingerprint card. On the card in question, the flat impression had strayed out of the region on the card designated for the flat impression, and had left part of a print in the box designated for one of the rolled impressions. Consequently, one of the boxes for a rolled print actually contained a rolled print, plus a fair-sized piece of a flat print of a different finger. As a result, the strong match found by computer was actually a match between the pseudolatent print and the stray portion of the flat print. As with the database error discovered in the first stage of the 50/50 experiment, the experimenters found this mistaken match to be evidence of the robustness of their computer system.

It appears that, in the interest of efficiency, the parties consented to introducing hearsay from the examiners who completed the FBI survey, primarily through Agent Meagher for the government, and through Ms. Peterman for Mitchell.

This point also underpins Dr. Stoney's more general criticism of the discipline of latent fingerprint identification: Dr. Stoney agreed that human friction ridges are unique and permanent, including small areas, App. 914a, but suggested that this alone is unhelpful on the question whether prints are identifiable, because fingerprints are so subject to distortion and the forensic identification process is so flawed, App. 917a-920a.

Dr. Cole noted that both of these first two explanations were well illustrated by the FBI's survey: Agent Meagher followed up with each agency until a match was agreed to, or otherwise identified inexperienced examiners as the source of nonidentifications.

The anonymous note that was the subject of the previous appeal in this case was the critical link: That note connected the robbery getaway car to Mitchell's own car, allowing the FBI to monitor and capture Mitchell so quickly.

The case Mitchell cited in his brief and relied on at oral argument, United States v. Ellis, 121 F.3d 908, 927 (4th Cir. 1997), is inapposite. Ellis applied plenary review not to the admission of expert testimony, but rather to a claim of prosecutorial misconduct where the district court had made no findings of fact. Apart from the fact that the issue in Ellis has strong Constitutional overtones that the Rule 702 issue in this case lacks, this Court does not agree with the Fourth Circuit on this point. See United States v. Ismaili, 828 F.2d 153, 163 (3d Cir. 1987) (reviewing District Court's rejection of a prosecutorial misconduct claim for abuse of discretion).

The rule was subsequently amended, effective December 1, 2000, to codify aspects of Daubert and its progeny. The Advisory Committee's note accompanying that amendment is a useful consolidation of commentary and precedent on the version of Rule 702 that applies in Mitchell's case, and so we will refer to it at points in our opinion.

In applying the teachings of Daubert in In re TMI Litigation, we explained that Rule 702 was addressed to two issues: first, the qualification of the experts themselves, and second, the reliability and helpfulness of their testimony. See In re TMI Litig., 193 F.3d at 664 (citing In re Paoli R.R. Yard PCB Litig., 35 F.3d 717, 749-50 (3d Cir. 1994) (Paoli II)). Daubert addresses the latter. As noted above, the former is not at issue in this appeal, as the District Court qualified all experts on both sides in their proffered areas of expertise, and neither party challenges any of these rulings.

The experiment had its limitations, though. First, the test sought to match fingerprints, not friction skin arrangements on actual fingers. Second, it was only a sample: 50 thousand fingers tested, out of about 60 billion in the world. While this sample size seems quite large, and doubtless would be adequate in many if not most circumstances, we are unsure if it is adequate here. There is limited evidence on the record of why the government's experts chose a 50 thousand fingerprint set, and why they could confidently extrapolate from it. Indeed, there is some suggestion that purely practical technical concerns may have dominated this choice. See infra note 18.

A concrete example may provide some clarity. In this case, Agent Meagher identified fourteen points of Level 2 detail (and unspecified supporting Level 3 detail, which we leave aside for simplicity) that matched Mitchell's right thumbprint to the latent print taken from the gearshift knob. Thus, for purposes of this particular identification, "sufficient quantity and quality of detail" really means "fourteen points of Level 2 detail." The hypothesis that "fourteen points of Level 2 detail is enough to make an identification" is falsifiable because one might be able to show that some latent print matches more than one full-rolled print under the "fourteen points of Level 2 detail" standard.

Actual testing (as opposed to mere testability) is harder to come by, probably because someone seeking to falsify this hypothesis has no a priori reason to choose 14 points instead of 13 or 15 as the standard. Nonetheless, any showing that a more stringent standard (e.g., a 20-point standard) is fallible necessarily implies that the 14-point standard is also fallible.

They also contended that actual tests on a larger data set (i.e., more fingerprints) would have been preferable to statistical extrapolations. However, significantly larger data sets may be computationally intractable: The experiments conducted for this case took on the order of a day to run on the computer. But for larger sets of fingerprints, the number of comparisons goes up as the second power (i.e., the square) of the number of prints in the sample. Thus, a 1 million / 1 million experiment would take 20 × 20 = 400 times longer than a 50 thousand / 50 thousand experiment, or on the order of a year to complete, given the same computing power. An experiment with the FBI's full AFIS database would take millennia.
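The scaling arithmetic in this note can be sketched in a few lines. The sketch assumes, as the note does, that every print is compared against every print in the set (itself included), that run time is proportional to the number of comparisons, and that the 50 thousand / 50 thousand experiment took exactly one day; that one-day figure is a simplifying assumption standing in for "on the order of a day." The function names are hypothetical.

```python
def comparisons(n_prints: int) -> int:
    """All-against-all comparison count: each print is run against every
    print in the set, including itself."""
    return n_prints * n_prints


def relative_runtime(n_new: int, n_base: int) -> float:
    """How many times longer an n_new experiment runs than an n_base one,
    assuming time is proportional to the number of comparisons."""
    return comparisons(n_new) / comparisons(n_base)


BASE_PRINTS = 50_000      # size of the experiment described in the record
BASE_DAYS = 1.0           # assumed: "on the order of a day"
LARGER_PRINTS = 1_000_000

ratio = relative_runtime(LARGER_PRINTS, BASE_PRINTS)
estimated_days = ratio * BASE_DAYS

print(f"comparisons at 50 thousand prints: {comparisons(BASE_PRINTS):,}")
print(f"1 million vs. 50 thousand: {ratio:.0f} times longer")
print(f"estimated run time: about {estimated_days:.0f} days "
      f"({estimated_days / 365:.1f} years)")
```

Because the cost grows with the square of the sample size, a twentyfold increase in prints produces a four-hundredfold increase in run time, which is how a one-day experiment becomes roughly a one-year experiment.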

Moreover, evidence of the false negative rate is often equivocal. While it might suggest a generally error-prone method, it is equally consistent with a very conservative method with a low false positive error rate. That is, a method may be designed to lower its false positive error rate by accepting a large number of false negatives out of an abundance of caution. One very familiar example of such a system is the criminal jury using the "beyond a reasonable doubt" standard: As the adage (attributed to Blackstone) says, "It is better that ten guilty escape [false negatives] than one innocent suffer [a false positive]." The same may be true for latent fingerprint identification: the examiners who declared they could not match the latent prints in the FBI's survey (the examiners responsible for the putative false negatives) may have done so because they would rather commit a likely false negative error than risk a small chance of a false positive identification.

Mitchell's experts respond by denying the existence of a dichotomy between method error rate and practitioner error rate, asserting that both are part of a unitary inquiry. We reject this view as a legal conclusion inconsistent with Paoli II. Paoli II makes clear that error rates and the qualification of the expert are distinct inquiries. 35 F.3d at 742. The corollary to this, however, raises an issue for any given fingerprint expert: His testimony would be more likely to be admitted (because he would be more qualified) if he himself demonstrated a low rate of false positives in his own work and/or on his own proficiency tests. Cf. Calhoun v. Yamaha Motor Corp., 350 F.3d 316, 322 (3d Cir. 2003) (holding that the scope of an expert's testimony was properly circumscribed by the scope of his expertise).

As suggested above, known false positives have been attributed to malice or incompetence on the part of the examiner, and not to a deeper flaw in the method itself. Dr. Cole testified that this "circling the wagons" behavior is yet another occupational norm of a fingerprint identification community bent on preserving the unimpeachability of its methods. But even if every false positive identification signified a problem with the identification method itself (i.e., independent of the examiner), the overall error rate still appears to be microscopic.

Mitchell's counsel came close to inquiring on voir dire about Agent Meagher's results on proficiency examinations administered internally by the FBI, but did not actually ask a specific question. App. 1456a-1457a. The government did ask Agent Johnson about his results on FBI proficiency examinations, but defense counsel objected and the Court sustained the objection on the ground that Johnson had already been qualified as an expert. App. 1652a-1653a. As our discussion in the text suggests, this question was proper, even desirable, and the District Court was wrong to sustain the objection.

Keeping this rationale in mind is helpful, because some non-judicial uses will support the required inference of third-party confidence better than others. For example, no one would argue that the commercial popularity of astrology for non-judicial use makes it fit for admission under Rule 702. This case may provide another example: As we discuss below, the government introduced evidence of the widespread commercial use of biometric identification technology based on fingerprints. It is possible that commercial adoption of the method signals acceptance of its reliability. But, as Mitchell's uncontradicted survey evidence showed, fingerprint identification enjoys a near-mythical reputation for reliability, and so the evidence of commercial adoption is equally consistent with uncritical acceptance of a method that consumers merely believe, but do not know, to be reliable.

The government's experts implicitly acknowledged this, even before the Daubert hearing, in the very design of the 50/50 experiment: The first stage of that experiment was the matching of full-rolled prints to full-rolled prints, but the ultimate aim of the experiment was to test pseudolatent prints against full-rolled prints to better simulate the more demanding exercise of latent fingerprint identification. Of course, as we have noted above, see supra page 227, even this refined experiment used pseudolatents, and thus failed to capture the complexities of matching latent prints marred by distortions and artifacts.

Dr. Budowle testified that current commercial research and development seeks to use as little as 6% of the area of the full print to make an identification. App. 639a. This makes such a technique more akin to latent fingerprint identification, but it still differs in significant ways. First, the fraction of the print will be distortion-free, unlike actual latent prints. Second, the 6% portion is likely to be taken from a portion of the finger with a high areal density of Level 2 detail, a luxury that latent fingerprint examiners do not have.

We also understand the task in disaster-victim identification as being (merely) to individualize one victim out of at most a few thousand victims, while forensic criminal identification seeks to individualize the defendant out of a pool of millions of potential perpetrators. Accordingly, there seems to be less of a threat of a false positive in the context of disaster-victim identification than in forensic criminal identification.

Mitchell bolsters this contention by pointing to a press release issued by the United States Attorney for the Eastern District of Pennsylvania on the day of the first colloquy. With respect to Mitchell's proposed experts, the press release stated:

The Court granted the government's request to exclude the testimony of the defendant's experts James E. Starrs, a Professor at George Washington University Law School, David A. Stoney, Ph. D. of the McCrone Research Institute, Chicago, and Simon A. Cole, Ph.D. Those witnesses testified that fingerprint evidence and comparisons are not scientific evidence under Daubert.

2d Supp.App. 1a. The government counters that this is consistent with its three-tier theory because the release characterizes the ruling as precluding Mitchell's experts from testifying about whether latent fingerprint identification is scientific. Whatever the case, we note that such press releases do not strike us as reflecting good practice.

Mitchell also contends that his reading of the District Court's rulings is correct because of statements made by the District Court as part of its ruling on Mitchell's Fed. R. Crim. P. 33 motion for a new trial. In that order, the District Court explained that, based on its earlier rulings, the NIJ solicitation would not have been admissible because:

[W]e excluded any evidence at trial as to whether or not fingerprint identification technology is reliable pursuant to the Daubert/Kumho standards. We clarified that the only issue for the experts to discuss at the Mitchell trial was whether or not an identification could be made by examination of the specific latent fingerprints and the record of this case.

App. 5a.

We decline to rely on these statements and accept the government's submission that the District Court's statements in its post-trial order are not entitled to weight. The Court was looking back at oral rulings that were over a year old, and made its ruling following a trial at which Mitchell had not, in fact, put on experts to opine that fingerprint identification was not a reliable discipline. And at all events, when the question is (as here) whether a party has preserved the record for appeal, the salient issue is not what the District Court thought it had ruled, but what the state of the record before us is. Thus the post-trial ruling is irrelevant to our discussion.

Rule 201 also provides that a party be "heard as to the propriety of taking judicial notice," Fed.R.Evid. 201(e); Mitchell was heard in the course of the Daubert hearing. Further, the Rule requires that "[i]n a criminal case, the court shall instruct the jury that it may, but is not required to, accept as conclusive any fact judicially noticed," Fed.R.Evid. 201(g), a caveat that the Court included in the jury instructions.

The distinction implied by Rule 201(b)'s use of "fact" can be made clearer by the use of more polarized examples: Matters like "February 7, 1977 was a Monday" (a fact) are suitable for judicial notice, while propositions like "daily exercise reduces the likelihood of heart disease" (a scientific conclusion) are not.

Indeed, you heard, [Ms. Chester] had testified in a prior proceeding. Did you hear counsel take the notes from that and say, well, isn't it true you said something different before? No. I suggest to you that the reason was because she didn't.

Did he take that statement that the agent took from her, the seven page statement and say, now didn't you say something different?

* * *

You didn't hear [defense counsel] try to impeach her with the statement that she had given to the agents back in December of 1991.

In fact, the entire situation is analogous to the typical unremarkable nonhearsay use of out-of-court statements. For example, testimony that "I heard another tenant in my building complain to the landlord about a dangerous condition on the stairs" is admissible to prove that the landlord had notice (but not that the stairs were in a dangerous condition). In that case, testimony that someone spoke to the landlord does not involve any "statement" at all, and the subject matter of the conversation is not "offered ... to prove the truth of the matter asserted," Fed.R.Evid. 801(c).