There might not be much evidence for this, but it turns out it isn’t far from the truth. Researchers worldwide have begun developing many kinds of powerful audio-analysis AI that can extract a remarkable amount of information about us from sound alone.

While this technology is only just beginning to emerge in the real world, these growing capabilities – coupled with its 24/7 presence – could have serious implications for our personal privacy.

Rather than analyzing every word people say, much of the listening AI that has been developed can infer a staggering amount of personal information from the sound of our speech alone.

Another AI system developed last year can predict, just by listening to the tone a couple used when speaking to each other, whether or not they will stay together. These are all examples of current AI technology developed in research labs worldwide.

All of these technologies – no matter what they’re trying to learn about you – use machine learning. This involves training an algorithm with large amounts of data that have been labeled to indicate what information they contain.

By processing thousands or millions of recordings, the algorithm gradually begins to infer which characteristics of the data – often just tiny fluctuations in the sound – are associated with which labels.
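The training idea can be sketched with a toy example. The following nearest-centroid classifier is a deliberately simple stand-in for the far more sophisticated models real systems use, and the labeled feature vectors are made up for the demo:

```python
import math

# Toy labeled training data: each item is (feature_vector, label).
# A real system would learn from thousands or millions of recordings;
# these two-dimensional values are invented for illustration.
training_data = [
    ([0.2, 1.1], "label_a"),
    ([0.3, 0.9], "label_a"),
    ([1.8, 0.1], "label_b"),
    ([2.0, 0.2], "label_b"),
]

def train_centroids(data):
    """Average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in data:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    return min(centroids,
               key=lambda label: math.dist(features, centroids[label]))

centroids = train_centroids(training_data)
print(predict(centroids, [0.25, 1.0]))  # prints "label_a"
```

The “learning” here is just averaging; the point is only that, given enough labeled examples, the model associates regions of the feature space with labels, so a new recording can be classified by where its features fall.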

For example, a system used to detect your gender would record speech from your smartphone and process it to extract “features” – a small set of distinct values that compactly represents a much larger speech recording.

Typically, features represent amplitude and frequency information in each successive 20-millisecond period of speech. The way these values fluctuate over time will be slightly different for male and female speech.
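To make this concrete, here is a minimal sketch that computes two simple per-frame features over successive 20-millisecond frames: RMS amplitude and zero-crossing rate, a crude proxy for frequency. The 16 kHz sample rate and the synthetic test tones are assumptions for the demo, not details from real deployed systems:

```python
import math

SAMPLE_RATE = 16_000                 # assumed sample rate in Hz
FRAME = SAMPLE_RATE * 20 // 1000     # samples per 20 ms frame (320)

def frame_features(samples):
    """RMS amplitude and zero-crossing rate for each 20 ms frame."""
    features = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        rms = math.sqrt(sum(s * s for s in frame) / FRAME)
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / FRAME
        features.append((rms, zcr))
    return features

def tone(freq_hz, seconds=0.1):
    """Synthetic sine wave standing in for recorded speech."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n)]

# Two test tones roughly spanning typical lower and higher voice pitches.
low = frame_features(tone(120))
high = frame_features(tone(240))
print(low[0], high[0])  # the higher tone has a higher zero-crossing rate
```

Real systems use richer spectral features than these two numbers, but the principle is the same: each 20 ms of audio is boiled down to a handful of values.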

Machine learning systems will not only look at those features, but also how much, how often, and in which way the features change over time.
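Those temporal dynamics are usually captured as summary statistics over the frame sequence – the average level of a feature, how much it varies, and how quickly it changes from frame to frame. A minimal sketch, using a made-up pitch track rather than real speech:

```python
import statistics

def summarize(track):
    """Summary statistics for one feature over successive frames:
    average level, variability, and mean frame-to-frame change (delta)."""
    deltas = [b - a for a, b in zip(track, track[1:])]
    return {
        "mean": statistics.fmean(track),
        "stdev": statistics.stdev(track),
        "mean_abs_delta": statistics.fmean(abs(d) for d in deltas),
    }

# An invented pitch-like track (Hz per 20 ms frame) with small fluctuations.
pitch_track = [118.0, 121.5, 119.0, 124.0, 122.5, 120.0]
print(summarize(pitch_track))
```

It is fixed-length summaries like this – not the raw waveform – that typically feed the machine learning model, which is why tiny fluctuations in how a feature moves can carry so much information.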

While the recording happens on the smartphone itself, the clips are sent to internet servers, which extract the features, compute their statistics, and handle the machine learning.

AI was first created to perform conceptual tasks normally requiring human intelligence. At the moment, most AI systems perform analysis and understanding tasks, which means they provide information for humans to act on, rather than acting automatically.

For example, audio AI systems for road monitoring can alert traffic controllers to the sound of a vehicle crash, and audio-based medical diagnosis AI would alert a doctor about findings of concern. But a human would still have to make a decision based on the information provided to them by the AI.

But AI technology is changing. Many AI systems are starting to exceed human capabilities, and some devices can even act without human intervention.

While most AI systems today are designed to assist people, in the wrong hands, these technologies could look more like the Thought Police from George Orwell’s 1984.

Audio (and video) surveillance can already detect our actions, but the AI systems we have mentioned are starting to detect what is behind those actions – what we’re thinking, even if we never speak it aloud.

Most tech firms say their devices don’t record us unless we command them to, but there have been examples of Alexa making recordings by mistake.

And researchers have shown that it doesn’t take much to turn your phone into a permanent microphone. It may only be a matter of time before advertisers and scammers start to use this technology to understand exactly how we think, and target our private weaknesses.