The big headline for the past several days has been that Thomas J. Owen, an audio expert hired by The Orlando Sentinel, concluded that the person screaming for help on a neighbor’s 911 call was not George Zimmerman. Voice biometric analysis of the scream found only a 48% match, and a much higher match (I’ve seen references to 60% and 90% in different reports) would be needed.

The analysis was done comparing the Zimmerman 911 phone call voice with the voice on the separate 911 call from the neighbor.

Like the original ABC video tape supposedly showing Zimmerman uninjured, Owen’s analysis has caused a firestorm of pontification that Zimmerman’s self-defense argument was a lie and that Zimmerman executed Martin.

Tom Maguire has done an excellent job running down the various uncertainties with this type of voice analysis, as has Jeralyn at Talk Left, a criminal defense attorney who has used Owen as an expert in the past. The point of their numerous posts is not that Owen is wrong, but that caution must be used given the nature of voice biometric analysis, which is not universally accepted and which may be challenged in court as not sufficiently reliable to be admitted in evidence.

Equally important, as Tom points out again today, such analysis normally is done using numerous similar exemplars from the subject, so that a subject saying “help” can be compared to the person on the tape saying “help.” In this case so far, however, there are no such exemplars, as no exemplar of Zimmerman’s voice screaming “help” was taken for comparison with the 911 recording.

I’ll add just one piece of information to this mix, not to prejudge the outcome of either the case or Owen’s analysis, but again to urge media caution.

In Red Ant, L.L.C. v. Sony Music Entertainment, Inc., et al. (Docket No. 97 Civ. 4917), a 1997 case in federal court in the Southern District of New York (Manhattan), Owen was hired as an expert witness by the defendants to identify a voice on a tape. The court ordered the witness, Melvin Britt, to provide Owen with voice exemplars so that Owen could compare Britt speaking certain words with the voice on the tape speaking the same words.

Owen believed that Britt deliberately was not giving accurate voice exemplars so as to stymie Owen’s analysis, and a motion for sanctions was made.

The court rendered a Decision and Order on the motion for sanctions on May 15, 1998. I obtained a copy of the Order via Westlaw. Unfortunately, there is no publicly available copy that I could find (not even on the PACER system, which is not surprising since old docket documents often have not been scanned, although the docket sheet is on the system). Since Westlaw copyrights the way in which it presents cases (but it cannot copyright the text of the case) I can’t post the Westlaw copy. For those of you with access to Westlaw, the cite is 1998 WL 249195 (S.D.N.Y.).

In the Order, the Judge laid out the procedures Owen demanded so that he could perform a proper analysis, including the requirement of at least six similar exemplars. Here’s the pertinent portion (emphasis mine):

Pursuant to my March 24 order, Melvin C. Britt, II (also known as Melieck Britt) appeared in Los Angeles on April 1st and gave voice exemplars over the telephone to Thomas J. Owen, an expert in voice identification retained by the defendants.

2. The session wound up having five parts, and they were all recorded. In the first part, Mr. Britt gave exemplars for about 32 minutes, and did not come close to doing what my order required—“repeat the word in the same manner ( i.e., same emphasis and same melody pattern and same rate of speech) as requested by the expert.” In the second part, the attorneys spoke with me by telephone and I told Mr. Britt that (a) his exemplars were coming nowhere near the melody pattern, (b) I did not think it was very difficult to do what the expert was asking and (c) he should concentrate on trying to mimic the expert’s melody pattern and speed. In the third part, lasting about 5 minutes, Mr. Britt again gave exemplars, on two phrases. In the fourth part, the attorneys again spoke with me, and I said that (a) he was not even close on the second phrase, (b) he gave 3 or 4 good exemplars on the first phrase (in the course of more than 30 repetitions) and (c) I was surprised that Mr. Owen claimed that none of them were useable. I told Mr. Britt that I had heard about his speech impediment (I would not call it a stutter, but a mild tendency to repeat a word occasionally) but that I did not think it was all that hard for him to mimic Mr. Owen. (It seems to be undisputed that Mr. Britt is a singing coach.) Once again, I directed him to (a) listen very closely to what Mr. Owen was saying, and (b) deliberately try to copy his pitch and speed for another half hour. In the fifth and final part of the session, lasting about 34 minutes, Mr. Britt’s performance was substantially improved for the last 26 minutes.

3. On April 21 the defendants moved for sanctions and also for an order directing Mr. Britt to give exemplars in a second session in my presence. Mr. Owen’s April 20 declaration says:

… virtually each and every time I asked Mr. Britt to repeat a particular phrase, Mr. Britt would either speak in an obvious wooden monotone, repeat declarative statements in a question form, alter the beat or pitch of the phrases, vary the speed of the words, change the word coupling used by me or over-articulate particular linguistically significant words.

I have listened to the tape of the session, and Mr. Owen’s statement is quite accurate as to the first 32 minutes. But Mr. Britt’s performance was substantially improved for the last 26 minutes (from 6:56 to 7:22 p.m. New York time).

4. Mr. Owen further writes:

In order to complete a proper voice comparison, I regularly require six useable samples of each of the phrases I use for comparison purposes. Here, however, given Mr. Britt’s obfuscation, I do not have six useable samples of even one such phrase, despite the fact that I attempted to obtain 600 exemplars from Mr. Britt for approximately 18 phrases.

But for my present purposes (deciding whether Mr. Britt willfully disobeyed my orders) I must rely primarily on my non-expert view as to whether he provided exemplars that a reasonable person would consider to be useable. Mr. Britt’s performance was certainly not as natural as his speaking voice at his video deposition. But, in my non-expert view, Mr. Britt provided at least one useable sample in 14 of the 17 phrases tested in the last 26 minutes, namely:

(17) “Troy didn’t write nothing but ah, ah, ah two lyrics on the first verse.”

Approx. 10 repetitions, 0 useable, but see phrases (4) and (5) above.

5. To summarize, in my non-expert view, Mr. Britt provided 4 useable samples on three phrases (including “Trying to get you rich”), plus 5 useable samples on two phrases, plus 6 useable samples on one phrase. In his earlier declaration (dated March 13) Mr. Owen said at ¶ 23 that the International Association for Identification generally recommends that the text be read or recited at least 3 times, and the American Board of Recorded Evidence generally recommends 3 to 6 repetitions. I realize that an expert may need more repetitions when the subject performs poorly. Nevertheless, weighing all of the circumstances, I decline to order sanctions and I decline to order Mr. Britt to give more voice exemplars at this time. I do note, however, that plaintiff’s other arguments seem to me to be unfounded. Moreover, I disagree with the statement of plaintiff’s expert that “it is all but impossible for an individual to disguise his voice or to speak in a consistently unnatural manner for the amount of time Mr. Britt spent in the voice exemplar session.”

6. Mr. Owen will have to make do with the April 1st exemplars, plus any other samples of Mr. Britt’s voice which may be collected. I note that I was given only 5 minutes of excerpts of the videotaped deposition of Mr. Britt, and even those brief excerpts have captured him speaking several of the words which appear in the crucial portions of the disputed tape.

This 1998 case does not prove that Owen is wrong in his current 911 tape analysis. Perhaps new techniques have been developed since then, but based on the information in Tom’s posts, the industry still seems to adhere to the standard practice of comparing numerous voice exemplars to the subject recording.

Caution is urged. The media in particular needs to think and investigate and put new drips of information in context rather than jumping to conclusions.

Comments

I trust one thing more than this VR hokum…the witness who saw GZ yelling for help as he was talking to 911. I think voice biometrics in this case is about as reliable as graphology, lie detectors, and astrology. Even the latest and greatest VRS for personal use isn’t great, and those are used under ideal conditions.

I work on speech compression for a living and have indirectly contributed to the standards bodies on more than one used in cell phones. I’m not going to give a long explanation as to why cell phones mangle the living hell out of the sounds, but believe me they do.
If this gentleman had this much trouble with sounds on a tape, I will say that his difficulties will go up by an order of magnitude working with sound from a cell phone. The fidelity of speech from a cell phone just isn’t anywhere near that coming from a tape.

Very, very true! His difficulties will go up by orderS of magnitude. And then it gets even worse when you view the many, many other uncompensated variables.

This was nothing but a blatant defamation attempt in the court-of-public-opinion, where the standards are so low that the “win” often goes to the one who can best mislead the public. (Related homework assignment with multiple choice question [poll] over at Ann’s place.)

I too worked on cell phones: in order to pack the most calls into the fixed bandwidth available to them, the phone companies compress the heck out of the human voice, removing all kinds of data. And of course those tiny little one-cent microphones on the phone are designed and placed to give as cheap an audio reception as possible, from a voice less than an inch from the microphone (vs. someone screaming from the other side of the phone, yards away).

A good example: take your spouse’s phone, call your phone, put their phone outside on a table, then have your spouse walk some yards away from the phone. Now you go inside the house, hold your phone to your head, and have your spouse scream a sentence. See if you can even identify who it is, let alone understand it. And it’s a voice you’re intimately familiar with.

Did notice this a.m. that Hal Uhrig has joined the legal team. Have seen him many times on TV being interviewed on cases, and it’s a relief because Sonner has never tried a criminal case and seems out of his element on this. I like Uhrig. He stated he’s joining the team because of the negative publicity and trial by mob GZ has gotten. I think if he’s as sharp as I think he is and if there are some sort of charges or a trial, he’ll make mincemeat out of the prosecution’s case.

The problem with Nifong is even after the story fell apart he still sought to prosecute—he did it for political reasons as he was in the midst of his campaign. Durham has a large black population and he needed to be seen as a ‘man of the people’. From what I am getting about the state prosecutor he is being cautious and wants the investigation to be done right. If there are charges, he wants them to be the right ones.
NOW…I want to toss this out there. Many of us have tried to be sympathetic to the parents because their son is dead. OK…that’s fine. But some serious questions are coming up and it’s not looking good for them. Many of us have been asking why they are demanding an arrest when they have to know there’s not probable cause due to evidence and witness statements. The answer is in statute 776.032(2): “A law enforcement agency may use standard procedures for investigating the use of force as described in subsection (1), but the agency may not arrest the person for using force unless it determines that there is probable cause that the force that was used was unlawful.”

This is about money. That’s all they need is an arrest. A conviction doesn’t matter. They can then start the wheels to sue the city, Zimmerman, the HOA, insurance companies, and anyone/thing else. What an interesting development.

Doesn’t Owen have some serious explaining to do? Either he deliberately misled the newspaper – one assumes his sworn testimony about the standard is true – or he was asked to give the answer he gave. He should answer for this.

The stories about this case have frustrated me so much, and seeing this on the professor’s site added to my frustration so much, that I had to register to comment on this blog. First, I really admire this blog and have actually been a very long time reader. My frustration is with the overly sensationalized nature of this case. Truly, this individual’s assertion that voice analysis proves Zimmerman is guilty is wack. As the professor points out, he doesn’t even meet his own criteria to establish a true voice match. Yet the thing that frustrates me is that no one, not even the professor, has acknowledged the true crux of this case. Did Zimmerman have the legal right to lethal self-defense? Lethal self-defense is allowed if escape/avoidance is not feasible and if you have good reason to believe you are in immediate danger of serious bodily harm or death. According to an eyewitness and Zimmerman, Zimmerman was pinned down by Martin, who was banging his head into the pavement. Zimmerman has the corresponding injuries, his claim of Martin being on top of him is supported by an eyewitness, and police state Martin was on his stomach when they arrived, further indicating he was on top of Zimmerman. Voice analysis, Facebook pages, prior bad acts, whether you agree with Zimmerman following Martin: all of it is immaterial. It does not matter. What matters is whether or not there is evidence that Zimmerman had the right to lethal self-defense. The rest of this stuff is noise that may lead to a man losing his life, his reputation, and any financial security based on suppositions, exaggerations, and outright false information. Aren’t we all better than this?
