Amazon introduces PC designs to integrate Alexa into PCs

Alexa for PCs, announced earlier this year, brings the cloud-based voice service to Windows 10 computers. Today, we introduce Alexa for PC solutions from Original Design Manufacturers (ODMs). Customers use PCs every day for business and entertainment. We believe voice is the next major disruption in the PC category, which is an important part of our “Alexa Everywhere” vision.

Four Windows 10 PCs have been added to our portfolio of qualified ODM solutions integrated with Alexa: An all-in-one desktop from Wistron, and convertible notebooks from Compal, Quanta, and Wistron. All of these pre-tested, final-product designs have been built for a far-field Alexa experience, with Intel CPUs, drivers, wake word engine, and microphone arrays.

I find Amazon devices very off-putting. I know Alexa devices are popular in the US, but does anyone outside of the US use Alexa?

The fact is, in the near future it’ll be hard to buy any sort of mainstream audio speaker without an assistant built in. The cost of the additional hardware must be minimal, and given the advertising opportunities I’d be surprised if Google and Amazon don’t start practically paying manufacturers to include them. Plus, I’m willing to believe people much prefer them to pressing buttons.

On Windows, where it’s competing against Cortana, the usefulness isn’t so clear; I expect the benefit mostly comes from sharing data across devices.

Seriously, why does everything have to have a voice assistant in it now? Since Microsoft rolled out Windows 10, I can’t think of a modern computing device that hasn’t got some sort of voice assistant installed as default. Macs, iOS devices, Windows 10 PCs, Android devices and pretty much anything else that has internet connectivity has one installed out-of-the-box.

With all this talk about privacy and unwanted data leaks (like Cambridge Analytica), I find it amazing how accepting people are of a software suite whose sole purpose is to snoop into your private conversations, looking for a keyword like “Cortana”, “Siri” or “Alexa” in order to then execute a command following that keyword. Even if the device can recognise that keyword without external datacentre help, it’s still going to be doing calls to the “mothercloud” in order to complete most requests. Whilst it’s doing that, there’s the possibility it’s shipping off all your conversations to analyse them for data that can then be used to sell you more stuff. They’ve done it before, and they’ll do it again: https://www.cbsnews.com/news/phone-listening-facebook-google-ads/

> Seriously, why does everything have to have a voice assistant in it now?

Voice driven stuff has been one of the sci-fi pipe dreams from back when computer interfaces were slower to use than voice commands (and from sci-fi writers who didn’t understand how computers worked).

I predict there will be an inflection point, when AI combined with voice input makes modern computers faster than GUIs. We are a bit off from that right now, but it’s coming.

Also, we are not preparing well for machine intelligence. It can solve most of our human problems – but so far we are letting a small number of individuals use it to enrich themselves at the expense of everyone else (just like they did with spreadsheets). We can and should do better.

> I predict there will be an inflection point, when AI combined with voice input makes modern computers faster than GUIs. We are a bit off from that right now, but it’s coming.

I agree about the technical inflection point, but I believe the human element will prevent it from becoming popular, except in limited circumstances.

There will be nothing more annoying than listening to an office / coffee shop / [any public place] filled with people talking to their devices. It will be enough to make even the most pacifist person “go postal”!

And that doesn’t take into account the privacy problems – both the “big brother” angle, and sharing all your info with everyone within earshot.

> Also, we are not preparing well for machine intelligence. It can solve most of our human problems – but so far we are letting a small number of individuals use it to enrich themselves at the expense of everyone else (just like they did with spreadsheets). We can and should do better.

Amen to that! It’s long past time that we start thinking about the societal (not to mention environmental, etc.) impacts of new technologies a lot earlier in the process.

> I predict there will be an inflection point, when AI combined with voice input makes modern computers faster than GUIs. We are a bit off from that right now, but it’s coming.

For a lot of people I think even today’s voice interfaces are faster than GUIs – for them. It’s faster for a lot of people to, say, order something from Amazon by voice than to search and navigate the site.

It’s just that it lacks the capability to keep up with people who, when it comes to keyboard and mouse, naturally find their way to shortcuts or interface quirks.

> sole purpose is to snoop into your private conversations, looking for a keyword like “Cortana”, “Siri” or “Alexa”

Plus there are the recent articles suggesting that these devices might be controlled or hacked covertly using ultrasonics [1].

I have no idea if any of this has been seen “in the wild”, or whether a relatively simple hardware filter could effectively combat it.

I guess that might depend on exactly what “useful” information a far-field mic system can extract from ultrasonic frequencies. If it’s just directional information, then I would presume the high-frequency stuff could be processed and only a “directional” signal be output. Or maybe just make the device so that it would only activate if there were a low-frequency component in ADDITION to any high-frequency component.
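The last idea above (only activate when a low-frequency component accompanies any high-frequency one) can be sketched in a few lines of Python. This is a minimal illustration, not how any real device works: the band limits, the energy floor, and the names `band_energy`, `classify` and `tone` are all assumptions made up for the example, and it uses a naive DFT in place of a proper filter bank.

```python
import cmath
import math

# Assumed band limits, chosen only for illustration.
VOICE_BAND_HZ = (100.0, 4000.0)
ULTRASONIC_BAND_HZ = (20000.0, 40000.0)


def band_energy(samples, sample_rate, lo_hz, hi_hz):
    """Sum the squared DFT magnitudes of the bins between lo_hz and hi_hz."""
    n = len(samples)
    bin_hz = sample_rate / n
    first = max(1, math.ceil(lo_hz / bin_hz))
    last = min(n // 2, math.floor(hi_hz / bin_hz))
    total = 0.0
    for k in range(first, last + 1):
        # Naive DFT of a single bin; slow, but dependency-free.
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        total += abs(coeff) ** 2
    return total / (n * n)


def classify(samples, sample_rate, energy_floor=1e-3):
    """Return 'accept', 'reject' or 'ignore' for one frame of audio.

    'accept': voice-band energy is present (with or without ultrasonics).
    'reject': ultrasonic energy only, i.e. the suspicious case.
    'ignore': nothing above the (assumed) energy floor in either band.
    """
    voice = band_energy(samples, sample_rate, *VOICE_BAND_HZ)
    ultra = band_energy(samples, sample_rate, *ULTRASONIC_BAND_HZ)
    if voice > energy_floor:
        return "accept"
    if ultra > energy_floor:
        return "reject"
    return "ignore"


def tone(freq_hz, sample_rate, n, amplitude=1.0):
    """Generate n samples of a sine wave, used here as test input."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]
```

Feeding it a 1 kHz tone sampled at 96 kHz should give `accept`, while a lone 25 kHz tone should give `reject`. Whether a gate like this would actually stop a real ultrasonic attack is a separate question that the sketch does not answer.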

I don’t think voice input is going to be the next major disruption. I have an Alexa device, a Google Home Mini, and of course Siri on my iDevices. None of these things can perform anything other than basic tasks. They can set a timer, start playing music, tell me the weather, and control certain devices at the most basic level. Useful functions, certainly, particularly the hands-free timers while cooking. However, they don’t work reliably enough to do more than these basic tasks. Half of the time they don’t understand what I say at all, and part of the rest of the time they get it wrong. On top of this, they’re 100% reliant on the internet being live to do anything, even basic tasks like setting a timer or telling me the time.

Voice input works for simple things. Anything more complex is best left to a more suitable interaction method. That doesn’t even take into account what happens when multiple people are in the same area with their own devices; it’s bad enough already when Alexa hears an Alexa advertisement on the radio or TV. The results can be hilarious, but it demonstrates the fundamental problem with voice interaction: there seems to be no way, currently, to filter out what is a legitimate command to the system and what is excess noise. I’ve even had Alexa go into a loop while playing SiriusXM, when an advertisement for the station came on and said ad featured Alexa. Oops!

None of them do anything other than basic tasks simply because that’s what works for their target market. They are trying to target people who don’t want or can’t use more traditional ‘command and control’ style voice interfaces, and such people generally don’t need much in the way of voice interfaces because they only need them to speed up things they do really often that are otherwise trivial tasks.

Put another way, just because that’s how things are right now doesn’t mean that’s the only way they can be. As an example, I’ve got my home server equipped with a couple of nice mics, some decent speakers, and about half a dozen pieces of open source software chained together in such a way that I can verbally query system state (load average, what services are running, what alarms from the various monitoring tools I use are raised, etc.) and trigger things like restarting services or updating commonly changed configuration. This isn’t something that would be useful to most people though, simply because the level of knowledge required to use it effectively is so insanely high and none of the commands are inherently discoverable. It’s also a ‘command and control’ style interface rather than a natural language interface, so issuing commands to it is not too different from programming, which is not exactly what most people want.
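To give a rough idea of what a ‘command and control’ style interface means in practice, here is a minimal sketch in Python. It is not the setup described above (that software isn’t named), and everything in it is hypothetical: the `CommandTable` class, the phrases, and the stub handlers are invented for illustration. It assumes some speech-to-text step has already produced a text transcript.

```python
def normalize(transcript):
    """Lower-case the transcript and collapse runs of whitespace."""
    return " ".join(transcript.lower().split())


class CommandTable:
    """Maps fixed spoken phrases to handlers; no natural-language parsing."""

    def __init__(self):
        self._handlers = {}

    def register(self, phrase, handler):
        self._handlers[normalize(phrase)] = handler

    def dispatch(self, transcript):
        text = normalize(transcript)
        # Try longer phrases first so the most specific command wins.
        for phrase in sorted(self._handlers, key=len, reverse=True):
            if text == phrase or text.startswith(phrase + " "):
                argument = text[len(phrase):].strip()
                return self._handlers[phrase](argument)
        return "unknown command"


# Hypothetical commands; the handlers are stubs that only return text.
table = CommandTable()
table.register("load average", lambda _arg: "load average: (stub, not measured)")
table.register("restart service", lambda name: f"would restart {name}")
```

Saying “restart service nginx” would reach the second handler with `nginx` as its argument, while anything not in the table falls through to “unknown command”. That is the discoverability problem in miniature: you have to already know the exact phrases.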

I think you jumped an extra step. I don’t expect voice assistants to do everything, but there are a few areas where they are weak.

For instance, calendaring. When someone says assistant, it immediately brings to mind an image of a secretary managing a calendar. Amazon’s Alexa can read events from Google calendars, and even multiple calendars from within Google. The Google Home Mini cannot do this. At my last job, I used Gmail at work and had a personal Google calendar. I could share my work calendar with my personal calendar and then connect them to Alexa to find out what was on my work schedule. Google Home can’t do that. The solution from Google when I asked was that I link the personal speaker to my WORK ACCOUNT. This would of course mean that my work account would get a bunch of personal information in it. No thanks.

So while you jumped to complex integration processes, I’m thinking about refining basic functions like handling multiple calendars, reading important emails but not every spam email, knowing that I like to have the lights on at night in certain rooms without me telling it that, and other machine learning / data mining tasks.

Even with the voice training features, occasionally an Alexa or Google Home device thinks it heard a wake word while I’m watching TV.

Then there are availability problems. Echos are nearly always up. Google devices frequently have network issues or simply outages. I had a lot of problems with Google mesh networking using their OnHub routers. I ended up bridging one and using pfSense, which cut down on problems. I couldn’t even get an all-Google environment to work together. Google has a lot to do.

When they work, I feel like Google Home devices are superior except for calendaring. Amazon has reliability.

Siri is the most useless of the group. She can’t do much, and on some devices (Apple TV) practically nothing but tell the time and get the weather.

Cortana is pretty cool on Windows, and does support some calendaring integrations and seems built to be a real assistant. However, the iOS and Android apps aren’t that good and they don’t really support iPads. Microsoft’s big failure is that they didn’t get into the market soon enough and there aren’t many popular speakers with their software.

Point taken. I actually hadn’t known about there being such difficulties with calendar integration, although the issues with multiple Google calendars don’t surprise me; even Google’s tools that have nothing to do with voice control have trouble dealing with multiple calendars sanely.

I’ve got my own issues with them too though. One of the ones that comes particularly to mind is that whenever I try to call a certain person on my contacts list who lives in the same city I do using voice commands, Google instead tries to call a well-known doctor who happens to have the same name as this person, but lives over 1000 km away from me. I find it particularly interesting that it wants to prioritize ‘public figures’ over entries in your contacts list.

The really big one though is that most voice assistants still have serious issues responding to compound queries that would be trivial for a normal ‘assistant’ to respond to, or ones that coordinate data from multiple sources. I also find this particularly interesting given that most of them use Wolfram Alpha as a backend (at least, for factual stuff), which has no issues dealing with such things. At least this has gotten a bit better recently (for example, Alexa can now handle a number of these cases without issue, and Google Assistant has gotten the ability to alias a sequence of actions to a single voice command).

Ironically, I find Siri to be both the dumbest and the smartest. It’s more limited in some ways because Apple mine far less data than Amazon or Google. However, I can actually tell Siri that I have an event at 9:00 on my work calendar, and it will be put there (iOS only, though). It’s the only digital assistant I can use for calendar stuff at the moment, since my work calendar is run through Exchange and the other assistants don’t support that at all. It’s an interesting dynamic: Siri has less integration with third-party services and tracks you far less, and yet is able to handle some compound tasks that Alexa and Google can’t even parse. I’m looking forward to what iOS 12 will bring to Siri, as Apple seems to have a different idea in the works.

Voice is a huge threat to the business model of Google, and less directly to Apple, where screen beauty (for want of a better word) is core to their advantage. I notice Amazon is trying very hard to make Alexa a buy-things-first product vs a help-you-first product on mobile. Point is that there are vested interests keeping voice dumb and they are doing a good job, especially Apple, as in Siri is the dumbest. May be just a conspiracy theory and Apple really can’t do proper AI.

Clearly that happens on the web today, but it’s not been obvious enough to be an emotional trigger for most people.

I agree with Thom that having something listening all the time is uncomfortable – but that may be a personality type – clearly lots of people don’t care – and perhaps may even enjoy it.

People got freaked out by Google glasses – though that was because the people wearing them were impinging on the people who didn’t.

Would a significant number of people be happy to be watched by an assistant at home and have their needs anticipated? (“You appear to be looking for your keys – they are under the newspaper on the kitchen top.”)

In my view giving the device a personality is key to success – like Alexa – then it becomes more like having another person around rather than it being an ever watching eye on the end of an internet tentacle attached to a central brain of a remote corporation.