Text-to-Speech: Upgraded Alexa

One of the most distinctive features of any TTS application is the voice that reads the text aloud. Amazon, one of the largest online retailers in the world, has made extensive use of TTS technology in its own voice assistant, Alexa. The AI-powered assistant is considered one of the smartest and most responsive available, on par with other well-known voice platforms such as Apple’s Siri. On the downside, Alexa is known for its relatively flat, monotonous voice. As Khari Johnson of VentureBeat put it, “Alexa is pretty smart, but no matter what the AI-powered assistant talks about, there’s no getting around its relatively flat and monotone voice.” Because the voices behind most TTS applications on the market still sound somewhat robotic, developers are looking for ways around this limitation by introducing smarter speech synthesis.

In the case of Alexa, Amazon has added Speechcons to the Alexa Skills Kit. Speechcons are special words and phrases that developers mark up in SSML (Speech Synthesis Markup Language) so that Alexa pronounces them more expressively. Johnson notes in his article that the new SSML capabilities enable Alexa to say more than 100 phrases, such as “Bada bing,” “Good grief,” or plain old “Boom,” in order “to add more excitement and emotion to Alexa’s speech.” Amazon’s aim is to give Alexa a more natural voice that sounds less robotic. The article also points to other recent improvements: “In recent months, Amazon has added other features to the Alexa Skills Kit to give developers more built-in tools. Last December, a library of hundreds of commands was added to the Alexa Skills Kit, making it easier for Alexa to talk about things like books, the weather, and local businesses.” Examples of Speechcons can be found on the solutions/Alexa page of developers.amazon.com.
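To make the SSML markup concrete, here is a minimal Python sketch of how a skill might wrap a Speechcon in a spoken reply. It relies on the SSML `say-as` tag with `interpret-as="interjection"` and on the standard `outputSpeech` field of an Alexa skill response. The function names (`speechcon`, `build_ssml`, `build_response`) and the sample sentence are hypothetical, chosen only for illustration, and the sketch does not use any official SDK.

```python
import json
from xml.sax.saxutils import escape


def speechcon(phrase: str) -> str:
    """Wrap a phrase in the SSML tag that marks it as an interjection."""
    # Hypothetical helper: escape() guards against stray &, < or > in the text.
    return f'<say-as interpret-as="interjection">{escape(phrase)}</say-as>'


def build_ssml(interjection: str, follow_up: str) -> str:
    """Combine one Speechcon and a plain sentence into a single SSML document."""
    return f"<speak>{speechcon(interjection)} {escape(follow_up)}</speak>"


def build_response(ssml: str) -> dict:
    """Build a bare-bones skill response body that speaks the given SSML."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": True,
        },
    }


if __name__ == "__main__":
    # Illustrative text only; any phrase from Amazon's Speechcon list could be used.
    ssml = build_ssml("bada bing", "Your order is on its way.")
    print(json.dumps(build_response(ssml), indent=2))
```

Only phrases from Amazon's published Speechcon list get the expressive pronunciation, so a real skill would likely validate the phrase against that list before sending the response.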

The primary objective of the upgrade is to make Alexa’s voice more expressive by adding phrases such as “ahoy”, “argh”, “bah humbug”, and “cowabunga”, among others. These expressive phrases are meant to reduce the monotony of Alexa’s voice and give users a more natural experience with Amazon’s TTS platform. Amazon also intends the upgrade to make Alexa more responsive when talking about Amazon’s products and website features.

Alexa’s upgrades are meant not only to improve the user experience but also to deliver a TTS innovation: an AI voice assistant that is smart, expressive, interactive, and engaging. In coming updates, Amazon is expected to keep adding improvements to Alexa’s skill set, which users can follow on the company’s developer website.