Just How Dangerous Is Alexa?

Shelly Palmer

1 year ago

The “willing suspension of disbelief” is the idea that we (the audience, readers, viewers, content consumers) are willing to suspend judgment about the implausibility of the narrative for the quality of our own enjoyment. We do it all the time. Two-dimensional video on our screens is smaller than life and flat and not in real time, but we ignore those facts and immerse ourselves in the stories as if they were real.

We have also learned the “conventions” of each medium. While we watch a movie or a video, we don’t yell to the characters on the screen “Duck!” or “Look out!” when something is about to happen to them. We just passively enjoy the show.

The Willing Suspension of Our Privacy

We apply similar concepts to our online lives. Most of us are willing to give up our data (location, viewing, purchasing or search history) for our online enjoyment. We can call this the “willing suspension of our privacy” because if we took a moment to consider what our data is actually being used for, we would refuse to let it happen.

The Willing Suspension of Our Agency

Which brings us to the next level of insanity: the willing suspension of our agency for our own enjoyment. This is past the point of giving up a “reasonable amount” of data or privacy to optimize the capabilities of our digital assistants. Suspension of our agency exposes our normally unmonitored physical activity, innocent mumblings and sequestered conversations. Some people believe this is happening with Alexa, Google Home, Siri and other virtual assistant and IoT systems. It may well be.

First, Let’s Give It a Name

Since we are discussing a combination of automatic speech recognition (ASR) and natural language understanding (NLU) engines that enable a system to instantly recognize and respond to voice requests, for this article, let’s call the interface an intelligent voice control system (IVCS).

How It Works

You activate most commercial IVCSs with a “wake word.” For an Amazon Echo or Echo Dot, you can choose one of three possible wake words: “Alexa” (the default), “Amazon” or “Echo.” Alexa Voice Service, the system that powers the Echo and Alexa, is always listening for its wake word, as are other IVCSs. The exception is when you turn off the microphones (the Echo has seven) and use a mechanical button or remote control to activate the device’s capabilities.

In Amazon’s case, the device keeps approximately 60 seconds of audio in memory for pre-processing so the responses can be situationally aware and “instant.” Amazon says the listening is done locally, on the device, not in the cloud. So technically, the audio does not leave the premises.

Always Listening Does Not Mean Always Transmitting!

Yes, an IVCS is always listening AND recording. Which raises the question, “What does it do with the recordings it does not use?” In Amazon’s case, the official answer is that they are erased as they are replaced with the most current 60 seconds. So while the system locally stores approximately 60 seconds of audio preceding your wake word, it transmits only a “fraction of a second” of audio preceding your wake word, plus your actual query and the system’s response. For Alexa, you can find a record of your query on the Home screen of your Alexa app.
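For readers who think in code, here is a minimal Python sketch of the behavior described above: a rolling buffer that overwrites itself after roughly 60 seconds, and a wake word that triggers transmission of only a fraction of a second of earlier audio plus the query. This is an illustration under stated assumptions, not Amazon’s implementation. The names RollingAudioBuffer and frames_to_transmit, the 100-millisecond “frames” represented as text labels, the half-second pre-roll and the “<silence>” end-of-query marker are all hypothetical.

```python
from collections import deque


class RollingAudioBuffer:
    """Hypothetical on-device buffer that keeps only the most recent audio."""

    def __init__(self, seconds=60, frames_per_second=10):
        self.frames_per_second = frames_per_second
        # A deque with maxlen silently drops the oldest frame once it is full,
        # which models "erased as they are replaced".
        self._frames = deque(maxlen=seconds * frames_per_second)

    def push(self, frame):
        self._frames.append(frame)

    def tail(self, seconds):
        """Return the frames covering the last `seconds` of buffered audio."""
        count = int(seconds * self.frames_per_second)
        return list(self._frames)[-count:] if count > 0 else []

    def __len__(self):
        return len(self._frames)


def frames_to_transmit(stream, wake_word="alexa", pre_roll_seconds=0.5,
                       end_of_query="<silence>", buffer=None):
    """Return the frames that would leave the device (illustrative only)."""
    if buffer is None:
        buffer = RollingAudioBuffer()
    sent = []
    awake = False
    for frame in stream:
        if awake:
            if frame == end_of_query:
                awake = False        # query finished; go back to local listening
            else:
                sent.append(frame)   # the actual query is transmitted
            continue
        if frame == wake_word:
            # Send only a short pre-roll, not the whole 60-second buffer.
            sent.extend(buffer.tail(pre_roll_seconds))
            sent.append(frame)
            awake = True
        else:
            buffer.push(frame)       # stays local and is eventually overwritten
    return sent


if __name__ == "__main__":
    stream = ["tv"] * 700 + ["argument", "footsteps", "alexa",
                             "what", "time", "<silence>", "secret"]
    print(frames_to_transmit(stream))
```

In this sketch, everything that happens before the pre-roll window and after the query ends stays in the local buffer and is eventually overwritten. Whether a real device behaves this way is exactly the question the skeptics are asking.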

More Questions

What happens to the approximately 60 seconds of audio recording preceding a wake word? The one that has a recording of the TV soundtrack, footsteps, the loud argument in the next room, the gunshot, etc.? What happens with that audio? Again, Amazon says it is erased and replaced with the next 60 seconds of audio. Skeptics say if a wake word is detected, the previous 60-ish seconds of audio is put in a database for further IVCS training. If so, could that audio be subpoenaed? Yep! Just like your browser history or phone records. It’s just data. But does it actually exist? Amazon says no. As for other systems? We’ll have to ask.

What About Hackers?

Seven microphones! Could a hacker tap into one or all of them and eavesdrop on me? The official answer is no, and specific technical reasons are cited. However, at The Palmer Group we have several theses for 2017 including, “Anything that can be hacked will be hacked.” Anyone who believes otherwise is simply naïve.

“It’s the Profile, Stupid!”

Data is more powerful in the presence of other data. It is an immutable law of 21st-century living, which in this case means that the most serious threat to each of us is the profile that can be created with the willing suspension of our agency.

Most people have no idea how much information about them is available for sale. The willing suspension of agency has the potential to take us right up to the line that separates where we are now from an Orwellian future. (Many people believe we already live in a surveillance state. We’ll explore this in another article.)

We Must Deal with This Sooner or Later

Alexa is NOT dangerous. The data it collects is NOT dangerous. Nothing about an Amazon Echo is dangerous. It’s awesome. I have one in the kitchen, in the living room, in my home office, and on my night table. It’s an amazing controller, great alarm clock, spectacular Spotify and Amazon Prime interface, an exceptional news and weather reporter, and it does lots of other stuff you can look up online. I love it.

I also love my Google Home. Its ASR/NLU system is second to none! Let’s face it: Google is “the” repository of publicly available knowledge. When I’m on my handheld, I rely on “OK Google,” and while I think Siri is audio impaired and database challenged, sometimes I use it too.

But …

The world will be a very different place when Google, Amazon, Microsoft, Apple and other AI-empowered players have assembled first-party profile data that includes our agency. It will make what they do with our current behavioral profiles look like primitive data processing.

We are predisposed to pay for convenience. We happily do it with cash and with data every day. However, we should not suspend our judgment about the implausibility of this narrative for convenience or for the quality of our enjoyment. Though this is a story we have been told before, there are no conventions of this medium. So let me be the first to scream: “Look out!”