A personal weblog on issues related to the use of biometrics, in order to promote the effective development & implementation of all Biometric technologies (Fingerprint, Iris, Retina, Voice Recognition, Vein, Hand, Keystroke dynamics, Signature) standards and applications.

Monday, August 09, 2004

THEORY OR REAL-WORLD TESTS: WHICH ROAD SHOULD FUSION BIOMETRICS TAKE?

By Yona Flink

Over the years there have been various discussions on whether combining two or more biometrics (fused or layered) enhances the accuracy of the biometric process for verification and identification. I have been of the opinion that by intelligently layering two different biometrics, the resulting error rates would be reduced. However, on the occasions that I presented such a position, others turned to mathematical proofs and statistical documentation using a paper by Professor Daugman. I have great respect for Professor Daugman's positions, but Professor Daugman's paper is based on a different premise than the one taken by myself and others. The documents below are in the following order:

A brief outline of Professor Daugman's paper.

An email to the BC that I sent a year ago outlining a proposal for layered biometrics.

An article appearing in Wave this month.

Combining Multiple Biometrics
John Daugman, The Computer Laboratory, Cambridge University

Overview

This short note investigates the consequences of combining two or more biometric tests of identity into an "enhanced" test.

There is a common and intuitive assumption that the combination of different tests must improve performance, because "surely more information is better than less information." On the other hand, a different intuition suggests that if a strong test is combined with a weaker test, the resulting decision environment is in a sense averaged, and the combined performance will lie somewhere between that of the two tests conducted individually (and hence will be degraded from the performance that would be obtained by relying solely on the stronger test). There is truth in both intuitions.

The key to resolving the apparent paradox is that when two tests are combined, one of the resulting error rates (False Accept or False Reject rate) becomes better than that of the stronger of the two tests, while the other error rate becomes worse even than that of the weaker of the tests. If the two biometric tests differ significantly in their power, and each operates at its own cross-over point, then combining them gives significantly worse performance than relying solely on the stronger biometric.

Example: combination of two hypothetical biometric tests, one stronger than the other. Suppose weak Biometric 1 operates with both of its error rates equal to 1 in 100, and suppose stronger Biometric 2 operates with both of its error rates equal to 1 in 1,000. Thus if 100,000 verification tests are conducted with impostors and another 100,000 verification tests are conducted with authentics, Biometric 1 would make a total of 2,000 errors, whereas Biometric 2 would make a total of only 200 errors. But what happens if the two biometrics are combined to make an "enhanced" test?

If the "OR" Rule is followed in the same batch of tests, the combined biometric would make 1,099 False Accepts and 1 False Reject, for a total of 1,100 errors. If instead the "AND" Rule is followed, the combined biometric would make 1,099 False Rejects and 1 False Accept, thus again producing a total of 1,100 errors.
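The arithmetic in Daugman's example can be checked with a short script. This is a minimal sketch assuming the two tests are statistically independent, as the example implies; the function name is my own:

```python
# Reproduce Daugman's worked example: two independent biometric tests,
# each operating at its cross-over point, so False Accept rate equals
# False Reject rate (p1 = 1/100 for the weak test, p2 = 1/1000 for the
# strong one), over 100,000 impostor and 100,000 authentic attempts.

def combined_errors(p1, p2, trials=100_000):
    """Expected error counts for the "OR" and "AND" combination rules."""
    # "OR" rule: accept if either test accepts.
    or_fa = (p1 + p2 - p1 * p2) * trials   # impostor passes either test
    or_fr = (p1 * p2) * trials             # authentic rejected by both
    # "AND" rule: accept only if both tests accept.
    and_fa = (p1 * p2) * trials            # impostor passes both tests
    and_fr = (p1 + p2 - p1 * p2) * trials  # authentic rejected by either
    return or_fa, or_fr, and_fa, and_fr

or_fa, or_fr, and_fa, and_fr = combined_errors(0.01, 0.001)
print(round(or_fa), round(or_fr))    # 1099 1 (total 1,100 errors)
print(round(and_fa), round(and_fr))  # 1 1099 (total 1,100 errors)
# The stronger test alone makes 0.001 * 200,000 = 200 errors, so either
# combination rule makes 1,100 / 200 = 5.5 times more errors.
```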
Either method of combining the two biometric tests produces 5.5 times more errors than if the stronger of the two tests had been used alone.

Conclusion: a strong biometric is better alone than in combination with a weaker one... when both are operating at their cross-over points. To reap any benefits from combination, the equations above show that the operating point of the weaker biometric must be shifted to satisfy the following criteria: if the "OR" Rule is to be used, the False Accept rate of the weaker test must be made smaller than twice the cross-over error rate of the stronger test; if the "AND" Rule is to be used, the False Reject rate of the weaker test must be made smaller than twice the cross-over error rate of the stronger test.

The following is a position I took on layered biometrics in an email about a year ago to the List. I do not think that the below disproves Professor Daugman's premise, but only points out that the premise from which I and others have drawn our conclusions is different from that of Professor Daugman.

The idea of layered biometrics has come about because of False Rejection. The layered biometrics issue may therefore be approached from another point of view: not layering the biometrics, but subjectively comparing the statistical results. At first, this may sound very non-scientific, but let's first examine the problem of FR.

Let us take for our examination two widely used biometric technologies for Access Control: Hand Geometry and Face. In our example, both these systems will be used for an access control system. The Hand Geometry reader will be our primary biometric system, with Facial Verification being our secondary system. Hand Geometry has a field-proven EER of 0.2%. For our example, we will set the Hand Geometry reader's security at a threshold level of 60, which gives us a FAR of 0.08%.
What we are saying in essence is that any person who verifies at or below the threshold level of 60 is who he claims to be, and anyone over the threshold level of 60 is an imposter and will not be granted access. If a person verifies at 61, how much more of an imposter is he than the person who verified at a threshold level of 59 or 60? And if that same person verified at 69 or 75, is he more of an imposter than the person who verified at a threshold level of 40? From the standpoint of the set threshold level, anyone above 60 is an imposter and anyone at 60 or below is not an imposter. When we set a threshold level, there is a clear YES or NO and no 'possible'.

In the real world, we know that people may not always verify at the same level day in and day out. Should we reject a legitimate user because the user failed by 0.05% and was verified at 61 instead of the minimum threshold level of 60?

There may be a case for 'parallel biometrics'. In the case of parallel biometrics, we state that any person who does not meet the minimum threshold level will be verified by a second biometric technology. In this case, Facial Verification will be used. Let us assume that two persons, George and Giles, were verified at the Hand Geometry reader: George received a verification level of 65 (FA% 0.11%) and Giles received a verification level of 85 (FA% 0.28%). George came pretty close to our required FA% of 0.08%, but could not be allowed access because he missed the threshold level by 0.03%. Close, but no prize.

Now, George and Giles are given a second chance to prove that they are who they claim. Both George and Giles look at the camera and receive the following verification levels: George is verified at a FA level of 7% and Giles at 3%. In the Facial, Giles came out better than George. If we had been using Facial Verification alone and our access control threshold had been set for 98%, neither George nor Giles would have gained access.
But now we are using Facial Verification in parallel with Hand Geometry. George did better than Giles with Hand Geometry, but Giles did better than George with Facial. Do we deny both access because they did not meet the facial minimum requirements, or should we combine both the Hand Geometry and Facial results and divide by 2, or just toss a coin?

Or should we ask an additional question: what is the probability of an imposter achieving 99.89% accuracy for Hand Geometry and 93% accuracy with Facial Verification, based on George's two templates residing on the biometric database? In other words, what is the statistical possibility of an imposter having facial characteristics that match the facial template on the database by a similarity level of 93%, and that the same imposter has hand geometry that matches that on the database by an accuracy level of 99.89%? Is not that possibility far less than 0.08%, which is our verification threshold on the Hand Geometry reader?

What may be required in order to resolve the issue of falsely rejected legitimate users is an algorithm that is weighted in favor of the primary biometric technology: one that weighs the primary biometric's rejection level against the secondary biometric's evaluation of how accurately the rejected user matches the secondary template, in comparison to the level by which he was rejected by the primary biometric technology. The weighting of each biometric's acceptance/rejection level will be subjective and based on security requirements.
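The question posed above can be sketched in code. This is a hypothetical illustration only, not the author's algorithm: the function name, the FA target, and the exponent weights are my assumptions, and it treats the two biometrics as statistically independent:

```python
# Hypothetical sketch of the 'parallel biometrics' question above:
# how likely is it that an impostor scores this well on BOTH the
# primary (Hand Geometry) and the secondary (Facial) biometric?
# The function name, weights and FA target are illustrative assumptions.

def parallel_decision(primary_fa, secondary_fa, fa_target=0.0008,
                      w_primary=1.0, w_secondary=1.0):
    """Accept a primary-biometric reject if the joint impostor
    probability still beats the primary FA target (0.08%).

    Exponent weights are one hypothetical way to express the subjective
    weighting in favor of the primary biometric; 1.0 means no weighting.
    """
    # Independence assumption: the chance that an impostor matches both
    # templates this well is roughly the product of the two FA rates.
    joint_fa = (primary_fa ** w_primary) * (secondary_fa ** w_secondary)
    return joint_fa < fa_target

# George: Hand Geometry level 65 implies FA 0.11%; Facial FA 7%.
print(parallel_decision(0.0011, 0.07))  # True: 0.0011 * 0.07 < 0.0008
```

Under this sketch George is admitted, because the joint impostor probability (about 0.0077%) is far below the 0.08% FAR we demanded of the Hand Geometry reader alone, which is the intuition the paragraph above argues for.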

The following is an outline of Josef Kittler's paper at the Biometric Conference in Hong Kong. http://www.wave-report.com/other-html-files/currentwave.htm

ICBA 2004, First International Conference on Biometric Authentication
By John Latta
Hong Kong, July 15 - 17, 2004

Challenge of Biometric Fusion

Josef Kittler, University of Surrey, UK, gave a keynote presentation on "Fusion of Intramodal and Multimodal Biometric Experts." It was one of the most interesting of the conference. One example in facial recognition was based on color channels. Three different methods related to the color channels netted TERs (total error rates) of 5.8, 5.8 and 4.8, but when combined using a fusion process, the TER dropped to 1.9. This is an intramodal fusion because the same biometric modality was used, i.e., facial. Another example used face, voice and lips for the biometrics. In this case the HTER (1/2 TER) varied from .74 to 13.3; when all modalities were fused, the HTER dropped to .15. The last example was the fusion of face and voice, with HTERs of 1.8 and 1.23, but the fused HTER was only .28.

Logic draws us to the expectation that the use of more than one and even multiple biometric measures would result in lower error rates. Professor Kittler showed that the real challenge comes in operational environments. In these environments:

Not all sensors can be assumed to collect their respective biometric for every individual in the authentication/identification process,

The potential for fusion is limited to the number of biometrics used at the time of enrollment, and

Some biometrics are of higher reliability than others.

Operational expectations are that the use of biometrics will force the evaluation in the direction of the biometric with the highest confidence. Note that this is the case with Hong Kong Immigration, which has both fingerprints and images. The images are not used as a biometric.
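The reliability-weighted fusion this outline points toward can be sketched as a weighted sum of expert scores. Everything here is an illustrative assumption (the scores, the reliability figures, and the function name), not data from the conference:

```python
# Hypothetical sketch of reliability-weighted score fusion: each biometric
# "expert" produces a match score in [0, 1], and experts are weighted by
# an estimate of their reliability (for example, the inverse of their
# observed error rate). All numbers here are illustrative assumptions.

def fuse_scores(scores, reliabilities):
    """Weighted-sum fusion: weights are the normalized reliabilities."""
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    return sum(w * s for w, s in zip(weights, scores))

# Face, voice and lips experts; voice assumed most reliable here.
fused = fuse_scores([0.82, 0.95, 0.60], [1.0, 3.0, 0.5])
print(round(fused, 3))  # 0.882
```

The design choice mirrors the operational point above: the expert with the highest confidence dominates the fused decision, rather than all experts counting equally.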
In fusion, we would expect that the weights applied to the sensors used for authentication/identification should be based on the reliability of the biometric. The WAVE asked the question: how does one compensate for these issues in operational environments? In response, it was stated that this is one of the issues to be addressed in the R&D of multimodal systems. The promise of fusion also carries with it the need for more research.

What is of interest here is that, in theory, no improvement is possible by layering, combining or fusing two or more biometrics, yet in practice the test results indicate something else.