Life is an exercise to express the InExpressible.

I attended a few sessions by Amazon about its Alexa devices during the last few months. Voice represents the next major disruption in computing. These devices provide a VUI (Voice User Interface). We had hands-on sessions and interactive Q&As. In this blog post, let me give an overview and cover some of the relevant URLs for readers of "Express YourSelf !".

Comparison

Here is a quick comparison of all major devices:

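|                 | Echo Dot  | Echo Dot Kids Edition | Echo       | Echo Plus  | Echo Spot   | Echo Show  |
|-----------------|-----------|-----------------------|------------|------------|-------------|------------|
| Price (India)   | Rs. 4,000 | N/A                   | Rs. 10,000 | Rs. 15,000 | Rs. 13,000  | N/A        |
| Price (US)      | $50       | $80                   | $100       | $150       | $130        | $230       |
| US price in Rs. | Rs. 3,341 | Rs. 5,346             | Rs. 6,682  | Rs. 10,023 | Rs. 8,687   | Rs. 15,369 |
| Microphones     | 7         | 7                     | 7          | 7          | 4           | 8          |
| Misc.           |           |                       |            | Smart hub  | 2.5" screen | 7" screen  |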

Apart from the regular devices from Amazon, a few smart cars and smart TVs also have built-in Alexa support. Amazon has also launched the "Alexa 7-Mic Far-Field Dev Kit", hardware that can be made part of any product. One can also add display support, as in the Echo Spot and Echo Show; however, the product then needs to go through a rigorous certification process from Amazon.

Comparison of Mobile App with Alexa Skill

Mobile App ~ Alexa Skill
Mobile App icon ~ Invocation Name
GUI ~ VUI

Many mobile apps, e.g. Ola, Goibibo, Cricinfo and Zomato, also have an Alexa skill.

How it works

Alexa software has two major components:

1. ASK (Alexa Skills Kit) to build a new skill
2. AVS (Alexa Voice Service) to integrate with an RPi (Raspberry Pi) kind of device

The hardware is quite simple, with a microphone array and a speaker. The microphone array is used for noise cancellation.

The spoken sentence is divided into:

1. Wake word
2. Launch word
3. Invocation name. It should be two words.
4. Utterance

Steps

Step 1. The wake word can be Alexa / Computer / Echo / Amazon. It wakes up the device and triggers beamforming to listen.

Step 2. The utterance (captured audio) goes to the cloud.

Step 3. In the cloud, the real magic happens with
3.1 speech processing
3.2 NLP

Step 4. The invocation name is detected. With the invocation name, the execution flow goes to the specific skill. The skill has all the logic and algorithms to further understand the utterance, to access cloud services, databases etc., and finally to build the response. Here the front end is developed and tested with the simulator at developer.amazon.com.

Step 5. As per the trained model, Alexa translates the utterance to an intent. The developer needs to create custom intents, each mapped to a function implementation that provides the response. Alexa also provides standard built-in intents that the developer can implement:
https://developer.amazon.com/docs/custom-skills/standard-built-in-intents.html

There is also a set of built-in intent libraries for various use cases:
https://developer.amazon.com/docs/custom-skills/built-in-intent-library.html
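The mapping from intent to function in step 5 can be sketched in Python roughly as below. This is only a minimal illustration, not the official SDK: the intent name HelloIntent, the handler functions and the spoken texts are made up for the example, while the request and response JSON fields follow the Alexa custom skill interface. The entry point is named lambda_handler on the assumption that the back end runs as a Python function that receives the request as a dictionary.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the JSON envelope that Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


# One function per intent. HelloIntent is a hypothetical custom intent;
# the AMAZON.* names are standard built-in intents.
def handle_hello(intent):
    return build_response("Hello from the sample skill.")


def handle_help(intent):
    return build_response("You can say hello.", end_session=False)


def handle_stop(intent):
    return build_response("Goodbye.")


INTENT_HANDLERS = {
    "HelloIntent": handle_hello,
    "AMAZON.HelpIntent": handle_help,
    "AMAZON.StopIntent": handle_stop,
}


def lambda_handler(event, context):
    """Dispatch an incoming Alexa request to the matching function."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # The user only said the invocation name, without an utterance.
        return build_response("Welcome. Say hello.", end_session=False)
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        handler = INTENT_HANDLERS.get(intent["name"])
        if handler is not None:
            return handler(intent)
        return build_response("Sorry, I did not understand that.")
    # Other request types (e.g. SessionEndedRequest) get an empty response.
    return {"version": "1.0", "response": {}}
```

Many different utterances ("say hello", "greet me" and so on) would be listed under HelloIntent in the interaction model, so the function itself never sees the raw sentence, only the intent name.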

In Alexa terms, a "slot" is like an argument to a function. Alexa has built-in slot types:
https://developer.amazon.com/docs/custom-skills/slot-type-reference.html

There is a many-to-one mapping between utterances and an intent, and a one-to-one mapping between an intent and a function. Likewise, there is a many-to-one mapping between spoken values and a custom slot value, and a one-to-one mapping between a custom slot value and the argument value passed to the function. So one can say "A.C" or "Air Conditioner" and it still maps to the same enumerated value as the argument to the function. Such synonyms are detected using "Entity Resolution".

The back-end function can be implemented at any HTTPS-terminated endpoint or as an AWS Lambda function. For Alexa skills, the AWS Lambda service is, at present, available only in these regions:
1. US East (N. Virginia)
2. EU (Ireland)

A professional skill can use session attributes for a better user experience and also for data analytics.

Step 6. The response can be
6.1 Speech: SSML, local lingo, TTS, audio stream, small mp3 files
6.2 Cards: title, subtitle (skill name), text (content), image. Cards are optional. We can use rich text with different fonts, including Unicode, on a card. It is built using various BodyTemplate and ListTemplate.

The speech output goes to the speaker. The card output goes to
1. Alexa Companion App
2. Echo Spot and
3. Echo Show

One can check the device capability before including a card/video in the response.

An Alexa skill can be built using these pre-built models:

1. Custom: for unique needs
2. Flash Briefing: for RSS feeds
3. Smart Home: for home automation
4. Video: for video applications

Questions - Answers

Let me highlight a few learnings about the Alexa Echo ecosystem and the devices:

* The Alexa companion app can be connected to only one device. So it is not possible to push the same image/content/card to all companion apps running on mobiles using a single Alexa device.
* Amazon allows the same invocation name to be used for multiple skills developed by the same or different people. All such skills can be configured for a given device. However, the skill that is configured last will be the one invoked for the duplicate invocation name.
* It is possible to enable/disable a specific skill on the device using the mobile app.
* It may be possible to develop a smart home device using a Raspberry Pi for a single user, with a skill that is not published. One can use the Smart Home pre-built model. Let the intent invoke code running on the Raspberry Pi that turns home appliances on/off using a GPIO pin and a relay.
* None of the Alexa devices has a built-in battery.
* "Alexa for Business" can have features like allowing access to only a very specific, limited set of skills.
* Alexa does not have any adult content, so parental control is not needed.
* One can change the wake word and replace "Alexa". However, it will still be a female voice only. The Alexa devices do not support responses in a male voice.
* An Alexa device cannot be used for dictation or speech-to-text conversion. One can use the AWS Transcribe service https://aws.amazon.com/transcribe/ for the same.
* One can develop (1) one-shot dialogue and (2) multi-turn dialogue skills.
* To design multi-turn dialogue skills, one can use (1) graph UI or (2) frame UI.
* Alexa can prompt for a missing slot.
* Amazon is coming up with Notifications, which will be triggered by a skill to the Alexa device. However, until the end user asks to get notifications, the Alexa device will not start talking by itself to inform about a notification.
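To make the slot, Entity Resolution and Raspberry Pi points above a little more concrete, here is a rough Python sketch. It assumes a custom skill (not the Smart Home model) with two made-up intents, TurnOnIntent and TurnOffIntent, a made-up slot named appliance, and a made-up table of GPIO pin numbers. The resolutionsPerAuthority structure is the shape in which Alexa reports Entity Resolution results inside a slot. The actual pin switching is left out; on a real Raspberry Pi a GPIO library would drive the relay at that point.

```python
# Hypothetical mapping from canonical appliance ids to Raspberry Pi GPIO pins.
APPLIANCE_PINS = {"AC": 17, "LIGHT": 27}


def resolved_id(slot):
    """Return the canonical id chosen by Entity Resolution, or None."""
    authorities = slot.get("resolutions", {}).get("resolutionsPerAuthority", [])
    for authority in authorities:
        if authority.get("status", {}).get("code") == "ER_SUCCESS_MATCH":
            # The first value is the best match for what the user said.
            return authority["values"][0]["value"]["id"]
    return None


def plan_switch(intent):
    """Map an intent to (gpio_pin, turn_on), or None if the appliance is unknown."""
    turn_on = intent["name"] == "TurnOnIntent"  # hypothetical intent names
    appliance = resolved_id(intent.get("slots", {}).get("appliance", {}))
    pin = APPLIANCE_PINS.get(appliance)
    if pin is None:
        # Missing or unrecognised slot: the skill could ask the user again.
        return None
    return pin, turn_on


# The user said "A.C"; Entity Resolution maps it to the canonical id "AC",
# exactly as it would for "Air Conditioner".
example_intent = {
    "name": "TurnOnIntent",
    "slots": {
        "appliance": {
            "name": "appliance",
            "value": "A.C",
            "resolutions": {
                "resolutionsPerAuthority": [
                    {
                        "status": {"code": "ER_SUCCESS_MATCH"},
                        "values": [{"value": {"name": "air conditioner", "id": "AC"}}],
                    }
                ]
            },
        }
    },
}

print(plan_switch(example_intent))  # (17, True)
```

Because the function only works with the canonical id, new synonyms can be added in the interaction model without touching this code.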
URLs

Now, let's have a look at the important URLs:

alexa.design/guide : Design of Voice Experience
alexa.design/indiacheckin : Join the Amazon developer community and check in for the event in India
alexa.design/india : Details about all meetups, hackathons, webinars, Slack channel etc.
alexa.design/codecademy and alexa.design/training : Online learning resources
alexa.design/factskill and bit.ly/2JWxlY9 : Getting started with skill development
https://github.com/alexa : Alexa public sample code repository
https://developer.amazon.com/en-in/alexa-skills-kit : Getting Started in India

The Alexa response can be further enhanced in the skill using:

1. SSML (Speech Synthesis Markup Language). More details:
https://en.wikipedia.org/wiki/Speech_Synthesis_Markup_Language
https://www.w3.org/TR/speech-synthesis11/
Alexa-specific SSML: https://developer.amazon.com/docs/custom-skills/speech-synthesis-markup-language-ssml-reference.html

2. Speechcon: https://developer.amazon.com/docs/custom-skills/speechcon-reference-interjections-english-india.html
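As a small illustration of SSML and speechcons in a response, the Python sketch below builds a response with SSML speech and a simple card. The function name ssml_response and its parameters are made up for this example; the say-as interjection tag, the break tag and the response fields come from the Alexa documentation linked above. Whether a particular word is a supported speechcon depends on the locale, so the word "wow" here is only an assumption to be checked against the speechcon reference.

```python
from xml.sax.saxutils import escape


def ssml_response(interjection, sentence, card_title, card_text):
    """Build a response with SSML speech and a simple card."""
    ssml = (
        "<speak>"
        # A speechcon is spoken more expressively than normal text.
        f'<say-as interpret-as="interjection">{escape(interjection)}</say-as>'
        '<break time="500ms"/>'
        # Escape &, < and > so that the text stays valid XML.
        f"{escape(sentence)}"
        "</speak>"
    )
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            # The card is optional; it shows up in the companion app.
            "card": {"type": "Simple", "title": card_title, "content": card_text},
            "shouldEndSession": True,
        },
    }


response = ssml_response("wow", "Tom & Jerry is on now.", "Now playing", "Tom & Jerry")
print(response["response"]["outputSpeech"]["ssml"])
```

Only the speech part needs XML escaping; the card text is plain text and is passed through as it is.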