Perfecting The Power to Talk – The Future of Voice And Speaking

The Medical Futurist 30 October 2018

Talking, conversing, exchanging words: for more than 10 million people, this seemingly simple act cannot be imagined without assistive technologies, such as voice-generating devices, touch screens, or text-to-speech apps. What does the digital future bring for them? How could innovations turn around the translation industry or the medical administration process? Here’s a glimpse into the future of voice and speaking.

Speaking, identity, voice stereotypes

Give me the key! – This simple sentence carries much more information when it’s spoken aloud. A weary Filipino mother could instruct her little child, as she cannot open the door to their home otherwise. A friend could reassure a troubled twenty-something that everything will be alright with her dog: she will take care of little Suzy while her owner is away on holiday. An angry and jealous Dutch woman could demand the key from her boyfriend, who doesn’t want to unlatch the door because another woman is hiding behind it.

Voice carries much more information about the speaker than just content. Tonality, speed, volume, pitch, and the articulation of vowels all matter – they make every single one of us unique and become part of our identities. And like other characteristics of human appearance, voice creates expectations and stereotypes. A strong, muscular man is expected to have a deep, rotund voice, while a small girl a thin, high-pitched one. That’s why jokes playing with voice stereotypes work: as in the animated film Up, where the Alpha dog’s voice machine malfunctions and, instead of his strong, alpha-male barking, he gives instructions in a high-pitched squeak.

In light of the above, the question emerges: what happens to those who lose their ability to speak due to injury or illness? Could modern technology give them back the functionality of speech, along with the lost part of their identity?

As The Guardian reminds us, the 2014 biopic of Hawking’s life, The Theory of Everything, contains a stark reminder of the loss that this technology can bring with it. When Hawking and his first wife, Jane, first hear what will become Hawking’s new voice, they are stunned. After a moment of speechlessness, Jane offers a timid objection: “It’s American.” The moment is played for laughs, but it marks a trauma. From then on, the British Hawking spoke to his audiences in an American accent and a rather robotic voice. Moreover, in contrast with unique human voices, this same mechanical voice was the default choice for little girls and old men worldwide alike – the selection of available voices was simply not that wide.

Alternative communication solutions for people who have trouble speaking

1) Assistive apps

The appearance of smartphones, tablets, and digital touch screens in general allows a more straightforward way of communicating without speaking. The simplest AAC (augmentative and alternative communication) device is a picture board or touch screen that uses pictures or symbols of common items and activities that make up a person’s daily life. For example, autistic children or adults might have great difficulty expressing their needs, but it’s possible to teach them to touch the image of bread if they want to ask for a slice. Many such picture boards can be customized and expanded based on a person’s age, education, occupation, and interests.

The iOS app SayIt! is a simple aid for people who have trouble speaking: they can open the app, type something, and tap the speak button. The program also has a word prediction element similar to autocomplete or autocorrect. The Predictable app offers something similar, using intelligent word prediction to provide word options on a keyboard and limit the amount of typing it takes to write a sentence or phrase to be spoken by the app. On the other end of the spectrum, the Proloquo2Go app offers symbol-based communication – used by over 200,000 people in need.
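To make the idea concrete, here is a minimal sketch of how prefix-based word prediction in such apps might work. The vocabulary and frequency counts below are invented for illustration; real apps learn frequencies from the user’s own typing history.

```python
# Toy prefix-based word prediction, the mechanism behind autocomplete-style
# suggestions in AAC apps. Vocabulary and counts are purely illustrative.

WORD_FREQUENCIES = {
    "water": 120, "want": 95, "walk": 40,
    "help": 150, "hello": 80, "hungry": 60,
}

def predict(prefix, vocabulary=WORD_FREQUENCIES, limit=3):
    """Return up to `limit` completions for `prefix`, most frequent first."""
    matches = [w for w in vocabulary if w.startswith(prefix.lower())]
    return sorted(matches, key=lambda w: -vocabulary[w])[:limit]

print(predict("wa"))          # ['water', 'want', 'walk']
print(predict("h", limit=2))  # ['help', 'hello']
```

A real assistive keyboard would update the frequency table as the user types, so that frequently chosen words rise to the top of the suggestions.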

Regarding assistive communication apps, the future undoubtedly holds artificial intelligence – through the development of natural language processing and text prediction. Experts believe that personalized synthetic speech applications could let users customize and personalize their voice in the future. An example is WaveNet, developed by Google’s DeepMind, an artificial intelligence algorithm that models raw audio waveforms based on real human speech and learns from it to create its own sounds in a variety of voices. Or consider GazeSpeak, an open-source platform that uses artificial intelligence to convert eye movements into speech, making it easier for users with ALS and other disabilities to converse in real time.
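The statistical core of text prediction can be sketched with a toy bigram model: count which word tends to follow which, then suggest the most frequent successor. The miniature corpus below is invented for illustration; production systems use neural language models trained on vastly more data.

```python
# A toy bigram next-word predictor, sketching the statistical idea behind
# text prediction in assistive apps. The corpus is invented and tiny.
from collections import Counter, defaultdict

corpus = (
    "i want water . i want help . i need help now . "
    "please help me . i want to go home ."
).split()

# Count, for each word, how often each other word follows it.
bigrams = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    bigrams[current][nxt] += 1

def next_word(word):
    """Most frequent word observed after `word` in the corpus, or None."""
    counts = bigrams.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(next_word("i"))  # 'want' (seen 3 times, versus 'need' once)
```

Chaining such predictions word by word is, in miniature, how a predictive keyboard proposes the rest of a sentence.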

2) Speech-generating devices

Speech-generating devices go one step further by translating words or pictures into speech. Some models allow users to choose from several different voices, such as male or female, child or adult, and even some regional accents; others enable the person to use their eyes or tilt their head to activate switches, and have personalized features for specific conditions and use cases. Some devices employ a vocabulary of prerecorded words, while others have an unlimited vocabulary, synthesizing speech as words are typed in.

For example, the Pocket Go Talk is a wearable, lightweight, and portable AAC device, also built for tabletop use, with five adjustable scanning speeds. MegaBee is a simple-to-use assisted-writing tablet that was developed in conjunction with patients and is, as a consequence, closely tailored to their needs. For a more particular use case, Enabling Devices’ Tactile Symbol Communicator applies tactile symbols – concrete or abstract representations that can be recognized by touch by visually impaired users.

In addition, several companies aim to combine existing digital technology, including social companion robots, to further improve assistive speech devices. LuxAI, a spin-off of the University of Luxembourg, has developed a social companion robot for children with autism, the QTrobot, which teaches children new skills so they can express their emotions and participate better in social interactions.

Meanwhile, the Boston-based VocalID promises to bring life to robotic machine voices through the power of crowdsourcing and voice-blending technology. The company creates a unique vocal persona for any text-to-speech tool. So far, over 14,000 speakers from over 110 countries have contributed over 6 million sentences to its growing spoken repository, The Human Voicebank.

3) The future: communication through brain-computer interfaces

The idea that humanity will one day be able to give out commands purely by thinking about them might seem as deep in the terrain of science fiction as Marty’s hoverboard from Back to the Future (no, single-wheeled hoverboards don’t count).

However, some researchers have taken on the challenge in large-scale research projects. University of Reading researcher Dr. Kevin Warwick managed to control machines and communicate with others using only his thoughts, via a cutting-edge neural implant. In 1998, Warwick, who earned the nickname “Captain Cyborg” from his colleagues, implanted a transmitter in his arm to control doors and other devices; then in 2002 he had electrodes implanted directly into his nervous system in order to control a wheelchair with his thoughts and to allow a remote robot arm to mimic the actions of his own arm.

As a next, and very brave, step toward helping voiceless patients communicate, Warwick implanted a chip into his wife’s arm to link their nervous systems together through the Internet, creating the world’s first electronic brain-to-brain communication. When she moved her hand three times, he felt three pulses and recognized that his wife was communicating. He is optimistic that mind-to-mind communication will become a commercial reality within the next one or two decades. And when Cathy Hutchinson, paralyzed years earlier by a brainstem stroke, managed to take a drink from a bottle in 2012 by manipulating a robot arm with only her brain and a neural implant, the path became clear for future research.


Speaking 2.0: Voice interface technology and real-time translations

Innovations could lift communication to another level in the future – not only in how they empower people with speech impairments, but also in how people exchange information with individuals from other cultures, or with machines.

Not even sign language is left out of technology’s sight. Researchers at Texas A&M University have developed a wearable device that “translates” sign language into English by sensing the user’s movements. By October 2015, the instrument was already able to recognize some 40 ASL signs with 96 percent accuracy. However, as The Atlantic points out, projects converting sign language – especially ASL gloves – still have strong limitations. Perhaps the future will bring a breakthrough in machines interpreting human movements for sign-language purposes.
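The basic recognition idea behind such wearables can be illustrated with a toy nearest-neighbor classifier: compare a sensor reading against labelled templates and pick the closest one. The feature vectors and sign labels below are entirely hypothetical, not the Texas A&M system’s actual features.

```python
# Toy nearest-neighbor sign recognition. A real device would extract far
# richer features from its motion sensors; these vectors are invented.
import math

TEMPLATES = {
    # (wrist tilt, finger flexion, motion energy) -- hypothetical features
    "HELLO":     (0.9, 0.1, 0.7),
    "THANK-YOU": (0.4, 0.8, 0.3),
    "YES":       (0.1, 0.9, 0.9),
}

def classify(reading):
    """Return the template label nearest (Euclidean distance) to `reading`."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(TEMPLATES, key=lambda sign: dist(TEMPLATES[sign], reading))

print(classify((0.85, 0.15, 0.65)))  # closest to the HELLO template
```

The hard part in practice is not this comparison but the feature extraction: segmenting continuous motion into individual signs and coping with differences between signers, which is exactly where current gloves fall short.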

Nevertheless, machines will not only act as interpreters but also as communication partners. Gartner estimates that 30% of our interactions with technology will be through conversations with smart machines by the end of this year. Now, one in six U.S. adults owns a voice-activated intelligent speaker or device – and Forbes believes that number will continue to rise. In the future, speaking to Siri or Alexa will not only result in turning the lights on and off, but also in making restaurant reservations or dentist appointments. In a couple of years, you might not be sure whether you are talking to an A.I. or a real person.


Medical implications: Is the doctor speaking?

Speech technologies will impact the business world, the private sphere – and the medical community. Chatbots will become the first line of primary care: patients will turn to chatbots for medical advice in simple cases, and they will be sent to a doctor only when the algorithm decides that the individual needs further assistance. This will take some of the burden off the shoulders of busy doctors and nurses while patients are still taken care of.
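At its simplest, this triage decision can be sketched as a rule-based routing step. The symptoms and rules below are invented for illustration only; real medical chatbots rely on validated clinical decision models, not a hard-coded list.

```python
# A toy, rule-based sketch of chatbot triage: route a patient either to
# self-care advice or to a doctor. Symptoms and rules are illustrative,
# not medical guidance.

RED_FLAGS = {"chest pain", "shortness of breath", "severe bleeding"}

def triage(symptoms):
    """Escalate to a doctor if any red-flag symptom appears."""
    if RED_FLAGS & {s.lower() for s in symptoms}:
        return "see a doctor"
    return "self-care advice"

print(triage(["runny nose", "mild cough"]))  # self-care advice
print(triage(["Chest pain"]))                # see a doctor
```

A production system would replace the fixed set with a probabilistic model and err heavily on the side of escalation, since the cost of a missed red flag is far higher than an unnecessary referral.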

Artificial intelligence-powered, speech-based solutions will support diagnostics in many other ways, too. As vocal biomarkers are just as unique as our voice itself, researchers believe they could help in diagnosing certain illnesses. The tech giant IBM is teaming its Watson AI supercomputer with academic researchers to try to predict from speech patterns whether patients are likely to develop a psychotic disorder. The Israeli company Beyond Verbal works on emotion analytics and provides voice-analysis software; it has announced that its algorithms were successful in helping to detect the presence of coronary artery disease (CAD) in a group of patients.
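The first step in any vocal-biomarker pipeline is turning raw audio into measurable features. As a minimal sketch, the snippet below estimates the fundamental frequency of a synthetic tone by counting zero crossings; real systems extract many richer features (jitter, shimmer, spectral measures) from actual recordings.

```python
# Toy acoustic feature extraction: estimate fundamental frequency from
# positive-going zero crossings of a synthetic waveform. Illustrative only.
import math

SAMPLE_RATE = 8000  # samples per second

def synth_tone(freq_hz, seconds=1.0):
    """Generate a pure sine tone as a list of samples."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

def estimate_pitch(samples):
    """Approximate frequency from positive-going zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * SAMPLE_RATE / len(samples)

voice = synth_tone(220.0)            # roughly a typical adult pitch
print(round(estimate_pitch(voice)))  # close to 220 Hz
```

Features like this, tracked over time and combined with dozens of others, are the raw material that machine-learning models mine for the subtle vocal changes associated with disease.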

In addition, voice-assisting technologies could also become a valuable asset in medical administration. Imagine artificial intelligence-based voice tools that record patient visits and make medical notes for doctors without the need to type them into any health record system. How much more time could a medical professional spend with their patients without the need for constant administration? A company called Augmedix utilizes Google Glass to enable physicians to examine their patients while remote medical scribes fill out the electronic medical records based on what they hear and see from the visit. What mind-boggling solutions technology could bring to medical practice, don’t you think?