Welcome to the Internet of listening, eavesdropping, spying things

There’s a new frontier for digital privacy: home devices that understand spoken commands. That’s impressive and convenient, but it comes with definite risks, as Rick Falkvinge pointed out earlier this week. The product sites of the main players in the so-called “smart speaker” sector – Amazon, Apple, and Google – offer plenty of upbeat advertising copy about the convenience, but are naturally silent about the potential problems.

“Google Home listens in short (a few seconds) snippets for the hotword. Those snippets are deleted if the hotword is not detected, and none of that information leaves your device until the hotword is heard. When Google Home detects that you’ve said “Ok Google,” the LEDs on top of the device light up to tell you that recording is happening, Google Home records what you say, and sends that recording (including the few-second hotword recording) to Google in order to fulfill your request. You can delete those recordings through My Activity anytime.”
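The behaviour Google describes can be pictured with a short sketch. It is an illustration built only from the quoted statement, not Google's actual implementation: the function `gate_audio`, the callback `contains_hotword`, and both size parameters are hypothetical names, and plain strings stand in for audio snippets.

```python
from collections import deque
from typing import Callable, Iterable, List


def gate_audio(
    chunks: Iterable[str],
    contains_hotword: Callable[[List[str]], bool],
    buffer_chunks: int = 3,
    request_chunks: int = 5,
) -> List[List[str]]:
    """Return the recordings that would leave the device (hypothetical model).

    Assumes request_chunks >= 1.
    """
    # Short rolling buffer: old snippets fall off the end and are discarded.
    buffer: deque = deque(maxlen=buffer_chunks)
    uploads: List[List[str]] = []
    recording = None  # None means "only listening for the hotword"
    remaining = 0

    for chunk in chunks:
        if recording is not None:
            # Hotword already heard: keep recording the spoken request.
            recording.append(chunk)
            remaining -= 1
            if remaining <= 0:
                uploads.append(recording)
                recording = None
            continue

        buffer.append(chunk)
        if contains_hotword(list(buffer)):
            # The upload starts with the buffered hotword snippet itself.
            recording = list(buffer)
            buffer.clear()
            remaining = request_chunks

    if recording is not None:
        # Stream ended mid-request: send what was captured so far.
        uploads.append(recording)
    return uploads


if __name__ == "__main__":
    stream = ["tv", "chat", "ok google", "what's", "the weather", "private", "talk"]
    sent = gate_audio(stream, lambda buf: "ok google" in buf,
                      buffer_chunks=2, request_chunks=2)
    print(sent)
```

In this toy run the upload includes "chat", the snippet that happened to be in the buffer just before the hotword, which is consistent with the statement that the few-second hotword recording is sent along with the request. Everything said outside that window stays on the device, at least as long as the detector and the firmware behave as described.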

“Alexa – the brain behind Echo – is built in the cloud, so it is always getting smarter. The more you use Echo, the more it adapts to your speech patterns, vocabulary, and personal preferences.”

That’s clearly a big advantage for users, because it means that the “brain” behind Echo will improve as advances in hardware and AI are incorporated in Amazon’s cloud-based platform. But it also means that Echo users don’t really know what the system sitting in their home can “hear” or “understand”, since those capabilities are provided elsewhere, and upgrades are outside the customer’s control.

All three devices – Apple’s HomePod, Google’s Home, and Amazon’s Echo – are designed to work with other electronic objects in the home, creating and controlling a complete network of “intelligent” systems. That underlines an important fact: it is not just people who buy one of these “smart speakers” who will be subject to constant eavesdropping by digital devices waiting for the wake-up word. As hardware costs plummet and AI-based software increases in sophistication, voice-recognition systems will start to appear in most “intelligent” domestic devices as standard. Acoustic surveillance will become the norm, and so pervasive that people will forget it is even happening.

“Just like the fingerprints of your hand, you have a voice that is totally unique to you. In our experiment, by recognising the individual characteristics of your voice (tone, modulation, pitch etc), processing that information and then matching it to a sample of your voice stored in the cloud, artificial intelligence software checks that you are who you say you are and then signs you in, without you having to type anything.”
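The matching step described in that passage can be sketched roughly as follows. This is a simplified illustration, not the BBC's method: it assumes each voice has already been reduced to a small list of numbers (a "voice-print"), and the names `cosine_similarity`, `identify_speaker`, and the `threshold` value are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity of two feature vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(
    sample: List[float],
    enrolled: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed cut-off, chosen for illustration only
) -> Optional[str]:
    """Return the enrolled name that best matches the sample, or None."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(sample, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # Made-up voice-prints standing in for samples "stored in the cloud".
    enrolled = {"alice": [1.0, 0.0, 0.2], "bob": [0.0, 1.0, 0.5]}
    print(identify_speaker([0.9, 0.1, 0.2], enrolled))
    print(identify_speaker([0.5, 0.5, 0.0], enrolled))
```

The point of the sketch is that the same comparison that signs a person in also tells the service who is speaking, every time it is run.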

Copyright companies will love this, since using voice-based sign-ons will ensure that subscriptions to music or video streaming services are not handed around, something that is hard to prevent at the moment. Voice sign-ins will also make it easier to prove the origin of particular content. Currently, courts are rightly unwilling to accept that a particular IP address linked to unauthorized copies can be associated with a single person and used to prove guilt of some kind. Voice-prints offer a natural identification system that will be harder to challenge. Always-on voice systems might even usher in pricing based on how many people are present in the room, although the BBC post doesn’t quite frame it that way:

“Just by listening to the voices in the room, your TV could automatically detect when there are multiple people in the living room, and serve up a personalised mix of content relevant to all of you in the room. When your children leave the room to go to bed, BBC iPlayer might hear that the children are no longer there and then suggest a different selection of content for you and your partner. All of this personalisation could happen without anyone having to press a button, sign in and out or change user profiles.”

It’s not just the number of people that will be evident to such systems. Advanced AI-based voice recognition can “understand” the content of conversations, and thus the inter-relationships of the participants. It is only a matter of time before owning intelligent voice-based systems is tantamount to having a spy sitting in every room in the house, constantly listening to everything that is said, and understanding it almost as well as a human.

Once these devices are in place, marketing companies will be very keen to know what people are saying and feeling as they watch TV programs and their advertisements, for example, or eat a meal with brand-name dishes. Assuming that the gathering of this information from voice-enabled TVs, ovens and refrigerators will be subject to privacy laws to some extent, the obvious approach would be to offer incentives – perhaps financial ones – to encourage consumers to share their data. Assurances would be given that it would only be used in an anonymous form, as usual, but there would inevitably be leaks of highly personal information gathered in this way, also as usual.

Even more problematic than commercial snooping of this kind is the threat from intelligence agencies – both domestic and foreign. Home devices with always-on listening capabilities will provide the perfect surveillance tools. Since they are necessarily online – these systems generally work by sending data somewhere, and pulling down system updates periodically – they will also be accessible over the Internet to both state actors and criminal operators. Even if manufacturers are not forced to install backdoors in their products – something that is already an option under UK laws – weaknesses caused by programming flaws will inevitably allow unauthorized access and control.

A future Internet of listening, eavesdropping and spying things will represent a serious threat to privacy. That is not to argue that such systems should not be developed, bought and used. But the rapid development of this field, in terms of both increasingly popular products and innovative research projects, means that we need to start discussing now how the risks can be mitigated. Ideally, that would happen through voluntary technical means, but ultimately it might need legislative action if those prove insufficient.

Glyn Moody is a freelance journalist who writes and speaks about privacy, surveillance, digital rights, open source, copyright, patents and general policy issues involving digital technology. He started covering the business use of the Internet in 1994, and wrote the first mainstream feature about Linux, which appeared in Wired in August 1997. His book, "Rebel Code," is the first and only detailed history of the rise of open source, while his subsequent work, "The Digital Code of Life," explores bioinformatics - the intersection of computing with genomics.