AlphaZero vs Stockfish

The frustrating and scary thing is AlphaZero doesn't tell us how it came up with these insights.

I wonder, is it because AZ has the opportunity to test (almost) every possible scenario, something a human could never do? Did Alpha essentially get to run the largest trial-and-error test in history, over the course of billions or trillions (?) of moves?

I don't really understand the scale at which Alpha is able to simulate chess. "It examined only 60 thousand positions a second"...

There is the math you learn in grade school and high school, and then there is THE MATH that spawns AlphaZero.

Barry Longyear, the author of the novella Enemy Mine, wrote about a future where a supercomputer has a world leader executed.

When the United Nations demands a justification, printouts ten feet high and covering a football field are generated by the machine to explain all the factors that went into its decision.

I often think of algorithms in terms of shortcuts. Where Stockfish calculates and rates 60 million positions a second, AlphaZero looks at only about 60 thousand; it has realized that only about 1/1000th of those possibilities are worth considering.

How much time would it take humans to even review the data supporting these decisions?

It would be interesting, nonetheless, to see the shortcuts that AlphaZero has come up with, even if we lack the time to manually validate them.

It seems like the breakthroughs in thinking are coming from not having the preconceptions and biases that humans have.

It makes me wonder what would happen if we took AlphaZero to a realm where human preconceptions are strong and hugely consequential: politics. Would it be swayed by candidates' emotional claims at rallies?

My professor at Stanford said there is simple mathematics behind judging a PhD dissertation: its value is inversely proportional to the number of pages it contains. The pages increase as the quality of writing decreases; the author fills it with air to cover for not having advanced the science much.

From my admittedly limited understanding of the process, AlphaZero did it in two phases. The first phase consisted of playing a bunch of games against itself, making completely random (but chess-legal) moves. The second phase consisted of letting a machine learning algorithm find patterns in the moves leading to victory, similar to the way neural networks are trained to find images of cats. It ended up with a trained model that doesn't have to brute-force examine all the possible moves (like Stockfish does); it can easily dismiss a vast number of possible moves (none of them ever led to winning games).

But the crucial part in its 'out of this world' style of play was the random nature of the moves in the first phase. It's the reason why it avoids all the common styles of play humans have: humans all learned from other humans. AlphaZero didn't, so it gave every move a fair chance and ended up with lots of winning moves humans would dismiss outright because "that's not the way you play chess".
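To make the two phases described above concrete, here is a toy sketch in Python on tic-tac-toe rather than chess. It only mirrors the simplified picture given in this comment (random self-play, then look for moves that tended to lead to wins). It is not DeepMind's actual training procedure, and a plain win/play tally stands in for the neural network. All function names are made up for illustration.

```python
import random
from collections import defaultdict

# The eight winning lines on a 3x3 board, cells indexed 0..8 row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == ' ']


def random_game(rng):
    """Phase 1: play one game of uniformly random legal moves.

    Returns (history, winner), where history holds one
    (position, player, move) tuple per move played.
    """
    board = [' '] * 9
    history = []
    player = 'X'
    while winner(board) is None and legal_moves(board):
        move = rng.choice(legal_moves(board))
        history.append((''.join(board), player, move))
        board[move] = player
        player = 'O' if player == 'X' else 'X'
    return history, winner(board)


def collect_stats(n_games, seed=0):
    """Phase 2 (tally instead of a neural net): count, for every
    (position, move) pair, how often the player who made the move won."""
    rng = random.Random(seed)
    stats = defaultdict(lambda: [0, 0])  # (position, move) -> [wins, plays]
    for _ in range(n_games):
        history, won_by = random_game(rng)
        for position, player, move in history:
            entry = stats[(position, move)]
            entry[1] += 1
            if won_by == player:
                entry[0] += 1
    return stats


def best_move(stats, board):
    """Pick the legal move with the highest observed win rate."""
    position = ''.join(board)

    def win_rate(move):
        wins, plays = stats.get((position, move), (0, 0))
        return wins / plays if plays else 0.0

    return max(legal_moves(board), key=win_rate)
```

Calling `collect_stats(50000)` and then `best_move(stats, list('XX OO    '))` asks the table what X should do when a winning move is on the board. The point of the sketch is only that nothing here encodes "how you are supposed to play": every move gets tried, and the tally decides.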

That's where we bump against the limitations of this approach. Soccer (or any other real-world game) has immeasurably more play states than chess, Go, or other board games with discrete positions (how many positions are there for a soccer player?). Not only that, but the players' abilities are wildly different (and change over time!). Modelling such games in a fashion that would be usable for AlphaZero would be extremely hard. Besides, you would even have trouble with the first phase of the process: it would be really hard to find two teams willing to play a couple of billion games with each player making totally random moves. :-)

Sounds like a perfectly reasonable human response (I was thinking the exact same thing)...

Maybe AlphaZero would say, "No problem, let's just rule out the 90% of actions that we know won't result in a win," like putting 10 players on one side of the field, etc. It would probably start with the standard framework coaches start with and "learn" from there.

Could AlphaZero "watch" games by analyzing the movements of players in recorded games from the past 10 years? I'm sure they can turn the players' movements into some kind of data that might look a bit like chess, except that every player is a queen that can move in any direction.

Actually, that's an interesting idea. Place GPS devices on the players and the ball, combine that with some player metrics like weight and height into a data stream, and feed that data to AlphaSoccer. In the past that would have been called statistics, and people like Billy Beane became famous for it in Moneyball.
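As a rough illustration of what "turning player movements into data that looks a bit like chess" could mean, here is a minimal Python sketch that snaps tracking samples onto a coarse grid. Everything in it is an assumption made for the example: the 105 m x 68 m pitch, the 21 x 17 grid, and the `TrackingSample`, `to_cell` and `snapshot` names are hypothetical, not any real tracking system's format.

```python
from dataclasses import dataclass

# Assumed pitch dimensions in metres (a common size, not a universal one).
PITCH_LENGTH_M = 105.0
PITCH_WIDTH_M = 68.0


@dataclass(frozen=True)
class TrackingSample:
    """One position reading for a player or the ball (hypothetical format)."""
    timestamp_s: float
    entity_id: str  # e.g. "home_7" or "ball"
    x_m: float      # distance along the pitch length, 0..PITCH_LENGTH_M
    y_m: float      # distance along the pitch width, 0..PITCH_WIDTH_M


def to_cell(sample, cols=21, rows=17):
    """Snap a continuous position to a (col, row) grid cell.

    Readings on or beyond the touchlines are clamped onto the grid.
    """
    col = int(sample.x_m / PITCH_LENGTH_M * cols)
    row = int(sample.y_m / PITCH_WIDTH_M * rows)
    col = max(0, min(col, cols - 1))
    row = max(0, min(row, rows - 1))
    return (col, row)


def snapshot(samples, cols=21, rows=17):
    """Turn the samples of one time step into a board-like mapping
    from entity id to grid cell."""
    return {s.entity_id: to_cell(s, cols, rows) for s in samples}
```

A ball at the centre spot, `TrackingSample(0.0, "ball", 52.5, 34.0)`, lands in the middle cell of the grid. Even at this coarse resolution there are 357 cells for each of the 22 players plus the ball, which gives a feel for how much larger the state space is than a chessboard.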

Baseball has always lent itself to statistics, but if you had player and ball position data in soccer plus AI, whoa.

Yes, it can analyse recordings of existing games, but that would only get us up to AlphaGo, which learned Go by analysing human-played games. The real breakthroughs come from forgoing human experience and practice in a particular game and starting from a tabula rasa.
