"What we developed was so compelling that our challenge has been much more one of engineering 'How quickly can we get it rolled out everywhere?'," says Sloss.

"If you can save that amount of power, what you want is to grab that gain, and we'll continue to train the model, and continue to probably put more systems under its care because the initial results were just so profound."

Sloss says it won't just be Google clamouring to put its datacenters under the stewardship of AI: the results achieved by self-learning systems are such an unambiguous improvement over manual decision-making that machine-learning systems will rapidly become essential for running large datacenters.

Speaking last year, DeepMind co-founder Demis Hassabis said Google had stepped up its use of AI since then, using a DeepMind system that modelled the running of a datacenter and adjusted 120 variables related to its operation to maximise energy efficiency. When the model's recommendations were applied, the datacenter reduced by 15 percent the overhead in its Power Usage Effectiveness (PUE), a measure that reflects how much of the electricity used by a facility ends up powering the servers, rather than driving associated infrastructure handling cooling and power distribution.
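PUE is the ratio of a facility's total power draw to the power reaching its IT equipment, so 1.0 is the ideal and anything above it is overhead. As a minimal sketch of that arithmetic, using hypothetical figures rather than Google's actual numbers:

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy over IT energy.

    1.0 means every watt reaches the servers; the excess above 1.0 is
    overhead spent on cooling, power distribution and so on.
    """
    return total_facility_kwh / it_equipment_kwh

# Hypothetical figures for illustration only.
before = pue(total_facility_kwh=1120.0, it_equipment_kwh=1000.0)  # 1.12

# A 15 percent reduction in PUE *overhead* trims the part above 1.0:
overhead = before - 1.0
after = 1.0 + overhead * (1 - 0.15)

print(round(before, 3), round(after, 3))  # 1.12 1.102
```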

Andy Lawrence, VP of research for datacenters and critical infrastructure at 451 Research, agrees that Google's experiment with using AI to help run datacenters will eventually become mainstream.

"Google's use of DeepMind to reduce the PUE of its datacenter is an interesting application of AI/machine learning, and clearly points to what will eventually be achievable," he says.

"The long term trend is towards automatic or autonomic management of datacenters using software tools."

However, he says that Google's datacenters are already so efficient that the gains represented only "a datacenter power efficiency improvement from around 86% to 88%".

"Even so, at Google's global scale, that would represent a very significant saving: Google uses over 5 million MWh of electricity a year," he says, adding that the approach could make sense for the largest tech firms, but would require large-scale investment.
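Lawrence's figures can be turned into a rough back-of-envelope estimate. The 86 percent, 88 percent and 5 million MWh values come from his quotes above; treating the IT load as fixed is a simplifying assumption:

```python
# Rough, illustrative arithmetic only -- not Google's actual accounting.
total_mwh = 5_000_000            # annual electricity use, per Lawrence
eff_before, eff_after = 0.86, 0.88

it_mwh = total_mwh * eff_before      # energy that actually reaches the servers
total_after = it_mwh / eff_after     # facility draw for the same IT load
savings_mwh = total_mwh - total_after

print(f"{savings_mwh:,.0f} MWh saved per year")  # 113,636 MWh saved per year
```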

"One of the challenges, even for Google, is that a lot of sensors are required, and these can be expensive to install at scale."

Services are already springing up to support AI-driven management of datacenters, with US company Vigilent applying a learning-based algorithmic approach to optimize cooling for customers in several continents, and in the longer term Lawrence expects to see "AI-based efficiency services to be delivered as a service to datacenters".

The difference in datacenter power usage when Google turned the machine learning recommendations on and off.

Image: Google

'I'm agog at what we've been able to do'

Perhaps the most famous demonstration of the efficacy of DeepMind's machine-learning systems was the recent triumph of its AlphaGo AI over a human grandmaster at Go, an ancient Chinese game whose complexity stumped computers for decades. Go has about 200 possible moves per turn, compared to about 20 in chess. Over the course of a game of Go there are so many possible moves that searching through each of them in advance to identify the best play is too costly from a computational point of view. Instead, AlphaGo was trained to play the game by feeding 30 million moves played by human experts into deep learning neural networks.
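The scale of that search problem can be sketched with the rough branching factors mentioned above; the ten-move lookahead depth is an arbitrary choice for illustration:

```python
# Order-of-magnitude sketch: positions reachable by exhaustive lookahead.
go_branching, chess_branching = 200, 20   # rough per-turn move counts
depth = 10                                # look ahead just ten moves

go_positions = go_branching ** depth        # 200**10, about 1e23
chess_positions = chess_branching ** depth  # 20**10, about 1e13

# At only ten moves deep, Go's tree is already ten billion times larger.
print(go_positions // chess_positions)  # 10000000000
```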

Training these deep learning networks can take a very long time, requiring vast amounts of data to be ingested and iterated over as the system gradually refines its model in order to achieve the best outcome.
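As a toy illustration of that iterative refinement (this sketch assumes nothing about DeepMind's actual systems), a single-weight model can be fitted by repeatedly nudging its parameter against the prediction error:

```python
# Fit a single weight w so that w * x approximates y = 2x, by repeatedly
# passing over the data and taking a small gradient step on squared error.
data = [(x, 2.0 * x) for x in range(1, 6)]
w, lr = 0.0, 0.01

for epoch in range(200):          # many passes ("epochs") over the data
    for x, y in data:
        error = w * x - y         # prediction error on one example
        w -= lr * error * x       # nudge w to shrink that error

print(round(w, 3))  # converges to 2.0
```

Real systems do the same loop over millions of parameters and vast datasets, which is why training time and hardware throughput matter so much.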

"TPUs offer a huge performance advantage over currently available technology," says Sloss.

"Everybody who's working hard on ML [machine learning] at this point is chasing after performance. It gives you a large competitive advantage, because you can get to the point where you have modelled something useful in a fraction of the time that it would otherwise take."

While not making a firm commitment regarding future rollouts of TPUs within Google's datacenters, he says: "I suspect that we will continue to make TPUs more widely available."

Even as an insider at Google, Sloss admits to being surprised at the rate at which machine-learning capabilities are advancing on the back of processors capable of manipulating huge amounts of data in parallel and the availability of enormous training datasets.

"I'm still fairly agog at what we've been able to do collectively with machine learning over the last few years," he says.

"I'm an expert chess player and if you had told me three years ago that the Go champion of the world in 2017 would be a computer, I would have chuckled politely, and yet here we are.

"I'm very interested to see what ML is able to do for the world over the next five years."
