Smartphones these days are getting truly smart, with ever more AI-powered services like digital assistants and real-time translation. But typically the neural nets crunching the data for these services are in the cloud, with data from smartphones ferried back and forth.

That’s not ideal, as it requires a lot of communication bandwidth and means potentially sensitive data is being transmitted and stored on servers outside the user’s control. But the huge amounts of energy needed to power the GPUs neural networks run on make it impractical to implement them in devices that run on limited battery power.

Engineers at MIT have now designed a chip that cuts that power consumption by up to 95 percent by dramatically reducing the need to shuttle data back and forth between a chip’s memory and processors.

Neural nets consist of thousands of interconnected artificial neurons arranged in layers. Each neuron receives input from multiple neurons in the layer below it, and if the combined input passes a certain threshold it then transmits an output to multiple neurons above it. The strength of the connection between neurons is governed by a weight, which is set during training.

This means that for every neuron, the chip has to retrieve the input data for a particular connection and the connection weight from memory, multiply them, store the result, and then repeat the process for every input. That requires a lot of data to be moved around, expending a lot of energy.

The new MIT chip does away with that, instead computing all the inputs in parallel within the memory using analog circuits. That significantly reduces the amount of data that needs to be shoved around and results in major energy savings.

The approach requires the weights of the connections to be binary rather than a range of values, but previous theoretical work had suggested this wouldn’t dramatically impact accuracy, and the researchers found the chip’s results were generally within two to three percent of the conventional non-binary neural net running on a standard computer.

This isn’t the first time researchers have created chips that carry out processing in memory to reduce the power consumption of neural nets, but it’s the first time the approach has been used to run powerful convolutional neural networks popular for image-based AI applications.

“The results show impressive specifications for the energy-efficient implementation of convolution operations with memory arrays,” Dario Gil, vice president of artificial intelligence at IBM, said in a statement.

“It certainly will open the possibility to employ more complex convolutional neural networks for image and video classifications in IoT [the internet of things] in the future.”

It’s not just research groups working on this, though. The desire to get AI smarts into devices like smartphones, household appliances, and all kinds of IoT devices is driving the who’s who of Silicon Valley to pile into low-power AI chips.

The big chip companies are also increasingly pivoting towards supporting advanced capabilities like machine learning, which has forced them to make their devices ever more energy-efficient. Earlier this year ARM unveiled two new chips: the Arm Machine Learning processor, aimed at general AI tasks from translation to facial recognition, and the Arm Object Detection processor for detecting things like faces in images.

Qualcomm’s latest mobile chip, the Snapdragon 845, features a GPU and is heavily focused on AI. The company has also released the Snapdragon 820E, which is aimed at drones, robots, and industrial devices.

Going a step further, IBM and Intel are developing neuromorphic chips whose architectures are inspired by the human brain and its incredible energy efficiency. That could theoretically allow IBM’s TrueNorth and Intel’s Loihi to run powerful machine learning on a fraction of the power of conventional chips, though they are both still highly experimental at this stage.

Getting these chips to run neural nets as powerful as those found in cloud services without burning through batteries too quickly will be a big challenge. But at the current pace of innovation, it doesn’t look like it will be too long before you’ll be packing some serious AI power in your pocket.

I am a freelance science and technology writer based in Bangalore, India. My main areas of interest are engineering, computing and biology, with a particular focus on the intersections between the three.