This New Atari-Playing AI Wants to Dethrone DeepMind

Artificial intelligence is not a contact sport. Not yet, at least. Currently, algorithms mostly just compete to win old Atari games, or accomplish historic board gaming feats like owning five human Go champions at once. These are just practice rounds, though, for the way more complicated (and practical) goal of teaching robots how to navigate human environments.

But first, more Atari! Vicarious, an AI company, has developed a new AI that is absolutely slammin' at Breakout, the paddle vs. brick arcade classic. Its AI, called Schema Networks, even succeeds at tweaked versions of the game—for instance, when the paddle is moved closer to the bricks. Vicarious says Schema Networks outperforms AIs that use deep reinforcement learning (currently the dominant paradigm in AI). Some critics aren't convinced, however. They say that in order to truly claim top score, Schema Networks must show its stuff against the world's best game-playing AI.

If you're going by numbers, Vicarious is a power player in the field. The company has raised more than $70 million from private funders. But, aside from a Captcha-busting program it debuted in 2013, Vicarious hasn't made many big AI splashes. Plus, its critics say that Captcha technology doesn't live up to the hype: Vicarious never released any peer-reviewed research on it. In fact, the company's publication record to date is fairly sparse compared to some other AI research groups, and the papers it does publish don't get cited very often by other researchers. Vicarious' skeptics point to that as evidence of the company's history of making claims it can't back up.

Citations, however, are just one way to gauge impact. Vicarious is a private company, under no obligation to share its work. And besides, it's raised money from the likes of Elon Musk, Vinod Khosla, and Mark Zuckerberg—not the dumbest investors, in other words.

So what's really going on here? Ask reps from Vicarious, and they say they aren't interested in competing with DeepMind. Ask the critics, and they point out that the company's recent paper specifically pits Schema Networks against the same class of AIs that DeepMind used to dominate Atari games over the past few years. So whether they admit it or not, they certainly seem to be gunning for the same goal.

A New High Score!

AlphaGo made DeepMind famous. But before the London-based company built the neural network that beat the best living player of the oldest continuously played game in history, it had to master Atari. Games like Breakout are pretty simple for humans to figure out: Move the paddle, bounce the ball, break the bricks. But to a computer, all those shapes and colors are gibberish. DeepMind tackled the problem using an approach called deep reinforcement learning.

As described in a 2013 paper published on the open-access research clearinghouse arXiv, DeepMind's AI experiences the game by getting raw image frames of game play. The AI reads three frames in a row. If the pixels in those three frames depict a ball hitting some bricks, the deep reinforcement learning network uses the points it scores in the game as a feedback mechanism, and rates that series of frames favorably. The AI, of course, can move the paddle left or right, and it can also release the ball. But it doesn't know it can do this. All it knows is that it can issue these three commands, and sometimes one of these commands will correlate with a favorable sequence of frames. Over time, it gets good at the game. To humans, it looks like the technology is learning to move the paddle back and forth, release the ball, bounce the ball, and earn points. It's better than brute force, but it's still nowhere near critical reasoning.
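For readers who want to see that feedback loop in miniature, here is a rough sketch. It is not DeepMind's network: it swaps the deep neural network and raw pixels for a plain lookup table on a made-up one-row "catch" task, and every name in it (`train`, `greedy_action`, the `fire` command, the grid width) is an assumption for illustration. What it keeps is the core idea: the program only knows it has three commands, and points are the sole signal telling it which command was worth issuing in which situation.

```python
import random

# Three opaque commands, as in the article; the learner is never told what they do.
ACTIONS = ("left", "right", "fire")


def train(episodes=5000, width=5, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy task (a stand-in, not an Atari emulator)."""
    rng = random.Random(seed)
    # State = (paddle column, ball column); one learned rating per command.
    q = {(p, b): [0.0] * len(ACTIONS) for p in range(width) for b in range(width)}
    for _ in range(episodes):
        paddle, ball = rng.randrange(width), rng.randrange(width)
        for _ in range(2 * width):
            state = (paddle, ball)
            # Mostly issue the best-rated command, occasionally try one at random.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = q[state].index(max(q[state]))
            name = ACTIONS[action]
            if name == "left":
                paddle = max(0, paddle - 1)
            elif name == "right":
                paddle = min(width - 1, paddle + 1)
            # Points arrive only when "fire" is issued directly under the ball.
            scored = name == "fire" and paddle == ball
            reward = 1.0 if scored else 0.0
            future = 0.0 if scored else gamma * max(q[(paddle, ball)])
            # Nudge the rating toward the reward plus discounted future value.
            q[state][action] += alpha * (reward + future - q[state][action])
            if scored:
                break
    return q


def greedy_action(q, state):
    """Return the name of the command the table currently rates highest."""
    values = q[state]
    return ACTIONS[values.index(max(values))]
```

After `q = train()`, calling `greedy_action(q, (0, 4))` asks what the learner would do with the paddle at the far left and the ball at the far right. Nothing in the table encodes "paddle" or "ball" as concepts, which is the limitation the article goes on to describe.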

It was certainly impressive enough to earn DeepMind some major props from the AI community. Not long after that Atari work came out, Google scooped the company up. Then DeepMind turned its attention to Go—a game much older, and much more complicated, than those arcade classics—and in March 2016, its AlphaGo AI made history by defeating top ranked Go champion Lee Sedol using similar algorithms.

Player 2 Has Entered the Game

AlphaGo's feat of learning is impressive. But it's still far from a human-like intelligence that can generalize concepts from one domain to another. "To have AIs think the way you and I do, they need to move towards models that can reuse concepts, understand cause and effect," says D. Scott Phoenix, a co-founder of Vicarious. The problem with deep reinforcement learning networks, he says, is they are essentially trial and error. They are also limited by the fact that they rate the score from the whole frame of pixels, all at once. That means small tweaks to the operating environment—moving the paddle closer to the bricks, or changing the brightness of the colors on the screen—result in huge learning setbacks. It also means they are always reacting, but they can never set goals, and never plan.

This is not to say a system like that cannot do the unexpected. In game two of AlphaGo's showdown with Lee Sedol last March, the AI performed such a crazy move that the human grand master left the room for 15 minutes afterwards, because he was so flummoxed. But that doesn't mean it was following some elaborate strategy. It just made the move that its neural networks had deduced would be the most rewarding based on what the board looked like.

Vicarious' Schema Networks, on the other hand, think more like humans do—at least according to Phoenix. "It starts much like a child would, doing something and seeing what happens," he says. It learns objects—paddle, ball, brick—and it learns how those objects move and interact with one another. Schema Networks, says Phoenix, calculates probabilities for how the ball will fly off the paddle each time they collide. Based on these probabilities, it moves its paddle to the optimum place. It's not just breaking bricks, it's aiming to clear the level in the most efficient way possible.
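To make the contrast concrete, here is a loose sketch of the "predict, then plan" idea. It is not Vicarious' Schema Networks code, and it skips the hard part (learning the objects and their probabilistic interactions from experience). Instead it assumes a hand-written, deterministic ball model with simple wall bounces, and the function names, coordinates, and bounce rule are all hypothetical. The point is only that a program holding a model of how the ball moves can work out where the paddle should be before the ball arrives.

```python
def predict_landing_x(x, y, vx, vy, width, paddle_y):
    """Roll an assumed ball model forward until the ball reaches the paddle row.

    Assumes the ball moves (vx, vy) per step, reflects off side walls at
    x = 0 and x = width, and that |vx| is no larger than width.
    """
    if vy <= 0:
        raise ValueError("ball must be moving toward the paddle (vy > 0)")
    while y < paddle_y:
        x += vx
        y += vy
        # Reflect off the left or right wall and reverse horizontal direction.
        if x < 0:
            x, vx = -x, -vx
        elif x > width:
            x, vx = 2 * width - x, -vx
    return x


def plan_move(paddle_x, target_x, tolerance=0.5):
    """Pick the command that brings the paddle toward the predicted landing spot."""
    if target_x < paddle_x - tolerance:
        return "left"
    if target_x > paddle_x + tolerance:
        return "right"
    return "stay"
```

Because the plan comes from the model rather than from memorized screens, moving the paddle row (a different `paddle_y`) changes the prediction without any retraining. That is the kind of adaptation Phoenix claims for Schema Networks, though the real system learns its model instead of being handed one.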

In their paper, Phoenix and his co-authors pit Schema Networks against a deep reinforcement learning network in games of Breakout. Not only did Schema get a higher score in the standard game of Breakout, it also adapted way quicker when the Vicarious crew switched up the game's environment. In one scenario, they moved the paddle closer to the bricks. In another, they added an unbreakable obstacle between the paddle and the bricks. They even removed the bricks altogether, and made the paddle juggle three balls at once. In each scenario, Schema Networks outplayed the deep reinforcement learning networks' highest scores.

"The Schema Networks are all about actually learning the concepts of the game," says Phoenix. "What happens when a ball hits a paddle? It learns that concept, and then can generalize to different environments that it was never trained on." This is more akin to how humans learn—we don't figure out how to play every single video game on its own terms, we apply things we've learned from one to another.

Of course, the goal here isn't to create AI power gamers. "Video games are important for teaching AI simply because it’s a series of experiences that are totally digitized," says Chris Nicholson, the CEO and co-founder of Skymind, an AI company. Games offer limited ranges of experiences, along with a simple reward function: points. "I think it is reasonable to say that the intent of winning video games is to move on to more complex visual arenas where robots move the world around them," says Nicholson. Both DeepMind and Vicarious are up front about their robot brain ambitions.

Game Genie

Vicarious' paper was presented today at the 2017 International Conference on Machine Learning in Sydney. Prior to being accepted to the conference, the paper went through peer review. But Nicholson and others who have read the paper still aren't convinced that it describes a truly revolutionary AI. "What I would have liked to have seen in this paper is proof that it can beat more than several versions of Breakout," says Nicholson. What he sees is pretty far from truly general AI. He contrasts this paper with DeepMind's 2013 arXiv paper, which detailed how its AI learned to play seven different Atari games, and its follow-up 2015 paper published in Nature, in which DeepMind's networks tackled more than two dozen arcade classics.

In a blog post accompanying its ICML presentation, Vicarious writes about Schema Networks playing two other games: Space Invaders and a complicated puzzler called Sokoban. The blog post—which is not peer reviewed, by the way—details how Schema Networks bested deep reinforcement learning in those other arenas.

But those arenas aren't the AI thunderdome. Oren Etzioni, the CEO of the Allen Institute for Artificial Intelligence in Seattle, says video games are pretty limited for testing AI with the ambition to power robots. "You observe the entire scene in Atari games. Does the method work in cases that you have partial observation? The answer is highly likely No," he says. "For example, a robot that operates in an apartment does not see the entire apartment." He thinks a far better test would be putting Schema Networks in the complex AI2-THOR simulated indoor environment (http://vuchallenge.org/thor.html) he and his colleagues have developed. More broadly, he says, Schema Networks simply seems impractical, and he criticizes the paper for being filled with unsubstantiated buzzwords like "intuitive physics." "They don't do any physics other than modeling ball collision for that specific game," says Etzioni.

I asked Nicholson, who is also skeptical of Vicarious' claims about Schema Networks, what it would take for him to believe Vicarious is pushing the boundaries of AI. He was blunt: "Here's what I want to see: Beat AlphaGo." Alas, DeepMind announced last week that it is retiring AlphaGo, so the team can move on to bigger challenges. Nicholson still might get his wish, though. DeepMind and Vicarious are both working towards developing AI brains for robots. If their eventual creations ever do meet, expect a full contact confrontation.
