Posts Tagged ‘Echo’

It’s not easy to be a retailer today when more and more people are turning to Amazon for shopping. And why not shop online? Ordering is convenient with features such as ratings. Delivery is fast and cheap, and returns are easy and free – if you are a Prime member! In April 2018 Bezos reported there are more than 100 million Prime members in the world, and the majority of US households are Prime members. Walmart and Google have partnered in an ecommerce play to compete with Amazon, but Walmart is just dancing with the devil. Google will use the partnership to gather data and invest more in their internal ecommerce and shopping experiences. Walmart isn’t relaxing, and is aggressively pursuing ecommerce and AI initiatives through acquisitions and through its Store #8, which acts as an incubator for AI companies and internal initiatives. Question: why does Facebook have a Building 8 and Walmart have a Store 8 for skunkworks projects?

Apple introduced Siri in 2011 and my world changed. I was running Sensory back then as I am today and suddenly every company wanted speech recognition. Sensory was there to sell it! Steve Jobs, a notorious nay-sayer on speech recognition, had finally given speech recognition the thumbs up. Every consumer electronics company noticed and decided the time had come. Sensory’s sales shot up for a few years driven by this sudden confidence in speech recognition as a user interface for consumer electronics…

Here’s the basic motivation that I see in creating Voice Assistants…Build a cross platform user experience that makes it easy for consumers to interact, control and request things through their assistant. This will ease adoption and bring more power to consumers who will use the products more and in doing so create more data for the cloud providers. This “data” will include all sorts of preferences, requests, searches, purchases, and will allow the assistants to learn more and more about the users. The more the assistant knows about any given user, the BETTER the assistant can help the user in providing services such as entertainment and assisting with purchases (e.g. offering special deals on things the consumer might want). Let’s look at each of these in a little more detail:….

I have spoken on a lot of “voice” oriented shows over the years, and it has been disappointing that there hasn’t been more discussion about the competition in the industry and what is driving the huge investments we see today. Because companies like Amazon and Google participate in and sponsor these shows, there is a tendency to avoid the more controversial aspects of the industry. I wrote this blog to share some of my thoughts on what is driving the competition, why the voice assistant space is so strategically important to companies, and some of the challenges resulting from the voice assistant battles…

Amazon, Google, Sonos, and LINE all introduced smart speakers within a few weeks of each other. Here’s my quick take and commentary on those announcements.

Amazon now has the new Echo, the old Echo, the Echo Plus, Spot, Dot, Show, and Look. The company is improving quality, adding incremental features, lowering cost, and seemingly expanding its leadership position. They make great products for consumers, have a very strong ecosystem, and make products that are very tough to compete with, both for their competitors and for their many platform partners that use Alexa.

The hands-free personal assistant that you can wake on voice and talk to naturally has gained significant popularity over the last couple of years. This kind of technology made its debut not all that long ago as a feature of Motorola’s MotoX, a smartphone that had always-listening Moto Voice technology powered by Sensory’s TrulyHandsfree technology. Since then, the always-listening digital assistant quickly spread across mobile phones and PCs from several different brands, making phrases like, “Hey Siri,” “Okay Google,” and, “Hey Cortana,” commonplace.

Then, out of nowhere, Amazon successfully tried its hand at the personal assistant with the Echo, sporting a true natural language voice interface and Alexa cloud-based AI. It was initially marketed for music, but quickly expanded domain coverage to include weather, Q&A, recipes, and the ability to answer common questions. On top of that, Amazon also opened its platform up to third-party developers, allowing them to proliferate the skill sets available on the Alexa platform, with now more than 10,000 skills accessible to users. These skills allow Amazon’s Echo, Tap, and Dot, as well as the several new third-party Alexa-equipped products like Nucleus and Triby, to be used to access and control various IoT functions, from reading heart rates on Fitbits to ordering pizzas and controlling lights within the home.

I was at the Mobile Voice Conference last week and was on a keynote panel with Adam Cheyer (Siri, Viv, etc.) and Phil Gray (Interactions) with Bill Meisel moderating. One of Bill’s questions was about the best speech products, and of course there was a lot of banter about Siri, Cortana, and Voice Actions (or GoogleNow as it’s often referred to). When it was my turn to chime in I spoke about Amazon’s Echo, and heaped lots of praise on it. I had done a bit of testing on it before the conference but I didn’t own one. I decided to buy one from eBay since Amazon didn’t seem to ever get around to selling me one. It arrived yesterday.

Here are some miscellaneous thoughts:

Echo is a fantastic product! Not so much because of what it is today but for the platform it’s creating for tomorrow. I see it as every bit as revolutionary as Siri.

The naming is really confusing. You call it Alexa but the product is Echo. I suspect this isn’t the blunder that Google made (VoiceActions, GoogleNow, GoogleVoice, etc.), but more an indication that they are thinking of Echo as the product and Alexa as the personality, and that new products will ship with the same personality over time. This makes sense!

Setup was really nice and easy, the music content integration/access is awesome, the music quality could be a bit better but is useable; there’s lots of other stuff that normal reviewers will talk about…But I’m not a “normal” reviewer because I have been working with speech recognition consumer electronics for over 20 years, and my kids have grown up using voice products, so I’ll focus on speech…

My 11 year old son, Sam, is pretty used to me bringing home voice products, and is often enthusiastic (he insisted on taking my Vocca voice controlled light to keep in his room earlier this year). Sam watched me unpack it and immediately got the hang of it and used it to get stats on sports figures and play songs he likes. Sam wants one for his birthday! Amazon must have included some kids’ voice modeling in their data because it worked pretty well with his voice (unlike the Xbox when it first shipped, which I found particularly ironic since Xbox was targeting kids).

The Alexa trigger works VERY well. They have implemented beamforming and echo cancellation in a state of the art way. The biggest issue is that it’s a very bandwidth intensive approach and is not low power. Green is in! That could be why it’s plug-in/AC only and not battery powered. Noise near the speaker definitely hurts performance, as does distance, but it absolutely represents a new dimension in voice usability from a distance, and unlike with the Xbox, you can move anywhere around it and aren’t forced to be in a stationary position (thanks to their 7 mics, which surely must be overkill!)

The voice recognition in general is good, but like all of the better engines today (Google, Siri, Cortana, and even Sensory’s TrulyNatural) it needs to get better. We did have a number of problems where Alexa got confused. Also, Alexa doesn’t appear to have memory of past events, which I expect will improve with upgrades. I tried playing the band Cake (a short word, making it more difficult) and it took about 4 attempts until it said “Would you like me to play Cake?” Then I made the mistake of trying “uh-huh” instead of “yes” and I had to start all over again!

My FAVORITE thing about the recognizer is that it does ignore things very nicely. It’s very hard to know when to respond and when not to. The Voice Assistants (Google, Siri, Cortana) seem to always defer to web searches and say things like “It’s too noisy” no matter what I do, and I thought Echo was good at deciding not to respond sometimes.

You need to know who is talking and build models of their voices and remember who they are and what their preferences are. Sensory has the BEST embedded speaker identification/verification engine in the world, and it’s embedded so you don’t need to send a bunch of personal data into the cloud. Check out TrulySecure!

In fact, if you added a camera to Alexa, it too could be used for many vision features, including face authentication.

Make it battery powered and portable! To do this, you’d need an equally good embedded trigger technology that runs at low power – Check out TrulyHandsfree!

If it’s going to be portable, then it needs to work even when not connected to the Internet. For this, you’d need an amazing large vocabulary embedded speech engine. Did I tell you about TrulyNatural?

Of course, the hope is that the product-line will quickly expand and as a result, you will then add various sensors, microphones, cameras, wheels, etc.; and at the same time, you will also want to develop lower cost versions that don’t have all the mics and expensive processing. You are first to market and that’s a big edge. A lot of companies are trying to follow you. You need to expand the product-line quickly, learning from Alexa. Too many big companies have NIH syndrome… don’t be like them! Look for partnering opportunities with 3rd parties who can help your products succeed – Like Sensory! ;-)