Science

Voice analysis: an 'objective' diagnostic tool based on flawed algorithms?

A new generation of voice analysis software can tell whether you have a physical illness or even depression. To some it's a useful diagnostic tool. But is this also the next invasion of our privacy?

For decades, speech recognition software has been able to detect spoken words and turn them into text, or respond to commands. But now researchers are delving deeper: they are looking at the voice's acoustic qualities to find out about the medical - or even emotional - state of the speaker.

Although voice analysis isn't new, advances in computational power and the accessibility of digital data over the past decade or so have changed the game for machine voice analysis.

One of the main fields where voice analysis has made concrete progress is medical diagnostics.

Software is being developed that can tell whether a person has a neurological disease, such as Parkinson's, or a psychiatric problem, such as attention deficit disorder.

There's even research into apps that can tell whether you're tired, or in fact depressed.

Some experts say the technology is likely to become pervasive - but not everyone agrees that this is a good thing.

Objective diagnosis

Max Little, a mathematician and MIT research fellow, has been developing voice analysis technology to detect Parkinson's disease.

Little and his team have collected audio samples from people, with and without Parkinson's, saying "aaah."

Using machine-learning algorithms to detect tremor and weakness in the voice, Little has developed a model that can identify the vocal qualities of a person suffering from Parkinson's - with an accuracy of around 99 percent.
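To give a rough sense of what "detecting tremor in the voice" can mean in practice, here is a minimal Python sketch. It is not Little's model: the article does not describe his features or algorithm, so everything here is an assumption made for illustration. The recording is replaced by a synthetic tone (`synth_vowel`), tremor is approximated as slow wobble in loudness (`tremor_index`), and the cut-off in `flag_tremor` is an arbitrary, hypothetical value rather than a clinical threshold.

```python
import math
from statistics import mean, pstdev

SAMPLE_RATE = 8000  # Hz; assumed telephone-quality sampling rate


def synth_vowel(tremor_depth, seconds=1.0, pitch_hz=150.0, tremor_hz=5.0):
    """Synthesize a sustained "aaah"-like tone (hypothetical stand-in for a recording).

    tremor_depth sets how strongly the loudness wobbles at tremor_hz.
    """
    n = int(SAMPLE_RATE * seconds)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        amplitude = 1.0 + tremor_depth * math.sin(2 * math.pi * tremor_hz * t)
        samples.append(amplitude * math.sin(2 * math.pi * pitch_hz * t))
    return samples


def frame_rms(samples, frame_len=160):
    """RMS loudness of consecutive non-overlapping frames (160 samples = 20 ms at 8 kHz)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        frames.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return frames


def tremor_index(samples):
    """Coefficient of variation of frame loudness: higher means a less steady voice."""
    rms = frame_rms(samples)
    return pstdev(rms) / mean(rms)


def flag_tremor(samples, threshold=0.05):
    """Toy decision rule; the threshold is illustrative, not a clinical value."""
    return tremor_index(samples) > threshold


steady = synth_vowel(tremor_depth=0.0)
shaky = synth_vowel(tremor_depth=0.2)
print(flag_tremor(steady), flag_tremor(shaky))
```

A real system would work from recordings of actual speakers, use many more acoustic features, and learn its decision rule from labeled examples rather than relying on a single hand-picked threshold.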

The Parkinson's Voice Initiative is now crunching numbers on the 17,000 audio samples the team has gathered to try to answer the question of whether this technology will work over a mobile phone.

Little describes using the telephone to detect Parkinson's as "technologically convenient," as three-quarters of the world has access to a phone. But another major advantage of using such software to diagnose illness, he says, is that it cuts out human subjectivity.

Tests for Parkinson's and other neurological disorders involve expert clinical opinion. This can lead to slightly different answers every time since the process is based on human judgment.

"The advantage of any kind of algorithmic system like this is that the process is entirely repeatable and entirely objective," Little explains.

Little stresses, however, that the technology "really has to be used in a clinical context, because you'd obviously need to have the access to care."

Brain music

Jörg Langner, a mathematician at the Charité Hospital in Berlin, takes inspiration from musicology. He's developed an analytical model called deep speech pattern analysis.

The model defines six different features of voice: loudness, articulation, tempo, rhythm, melody, and timbre.

"Everything that happens in the brain influences speech production," Langner says. "Because of this, we can trace the clues of what's happening in the brain via speech sound analysis."

Langner's current research on clinical applications focuses on diagnosing Attention Deficit Hyperactivity Disorder, or ADHD, in children.

Based on voice analysis, Langner has found differences between the utterances of children with and without ADHD, including fluctuations in speech loudness and melody.

As with Little's work on Parkinson's, Langner and his team have developed a model for the speech characteristics of children with ADHD. They say a diagnosis can be made using the model.

Langner agrees with Little that the perception of "objectivity" in the process is good. He also says speech analysis should not be used as a standalone diagnostic tool.

"It should be a tool to support a doctor or psychologist," says Langner.

Since speech is produced in the brain, voice analysis could be used to diagnose many psychiatric disorders. Langner's team claims to be able to detect the degree of depression with a very high success rate.

"This may be useful for the prevention of suicide," Langner says.

Pervasive and all-knowing?

But the diagnosis of illness isn't the only field where voice analysis has potential.

Jarek Krajewski, a psychologist at Wuppertal University in Germany, has been researching how voice analysis can be used to detect people's emotional states.

"It could be really basic emotional states such as sadness, anger, or joy," Krajewski says.

He has studied the voices of alert and tired people to detect fatigue.

"Or you could even think of quite abstract states, such as confidence."

In this spectrogram, color indicates the typical lower-frequency sounds and micro-tremor of a depressed voice.

Krajewski says there are hundreds of potential applications, including in business, science, gaming, marketing, healthcare, and dating, to mention a few.

But Krajewski warns the technology also presents a risk.

"We could suffer from a brave new world where emotions are no longer private," he says. The government, insurance companies, and others could monitor our emotions and personalities, Krajewski says, "and come to conclusions which may not be beneficial for the person in question."

Little agrees voice analysis will become more pervasive.

"This sort of technology is going to become a part of what we do on an everyday basis," he says.

And just because something is objective doesn't necessarily mean it's perfect.

"These things are only going to be widely accepted and used correctly if people grasp the fact that just because an algorithm is objective," says Little, "it doesn't mean that the design of the algorithm doesn't have flaws in it."