When a Poker A.I. Went Crazy with Five-Three Suited (Analysis)

In part 1 of this article series, we talked about the fact that the vast majority of poker players do not bluff often enough.

As a result, their betting actions generally represent “honesty” (value hands) rather than “dishonesty” (bluffs). This led me to introduce what I call the Honesty Principle.

The Honesty Principle: As a whole, the poker community bluffs much less than it should.

On the other side of the spectrum are Artificial Intelligences like Libratus, a Carnegie Mellon University creation that recently managed to demolish some of the top No Limit players in the world. Today we will look at an insane hand Libratus played to see what we can learn from its strategy.

Welcome to the Machine

At the core of Libratus’ strategy was the machine’s ability to close the gap between value bets and bluffs, making it very hard for the human players to guess which was which. The A.I. was much more balanced than the humans. Thus, it was quite difficult for the humans to put the machine on a hand.

Below is an example of the level of sophistication that Libratus brought to the table. In an interview with Doug Polk, Daniel McAulay describes a crazy hand he played against the computer.

The computer had 5♣ 3♣ and Daniel held X♥ Y♥ (Daniel’s specific cards do not matter here). Daniel raised in position, Libratus 3-bet, Daniel 4-bet, and finally Libratus called out of position.

Already, we see some unintuitive play by Libratus. Most humans would fold this every time, or perhaps put in a 5-bet bluff. A call from out of position would seem to be a losing play unless the player making it is capable of some “nasty” bets in the future. And capable Libratus was!

The flop came K♥ Q♥ J♣. Libratus checked and Daniel checked back with his flush draw.

The turn was a third ♥, giving Daniel the flush. Libratus checked, and Daniel checked back again for deception.

The river was a brick (something like the 5♠), and Libratus bet with its measly pair. Daniel raised small to make it look like a bluff, and sure enough Libratus went all-in, turning its bottom pair into a bluff! Daniel called, of course, and won the hand.

Libratus losing the hand is beside the point, however. The line taken by the computer was a borderline insane one that almost no human would ever take, especially a winning player who knows that they are representing a very narrow range, given how passively they played the flop and the turn.

This is exactly the point. A perfectly balanced player can and will show up with anything in any situation, and with optimal frequencies. Please note that balance and optimality are key here. For instance, human players may attempt actions that they perceive as “random”. In reality, however, those actions are usually heavily biased. Our species has not yet figured out a way to find the fine line between the two.

If the above hand seems a bit over your head, you are not alone. The top pros who faced Libratus felt the same way, and they have millions (if not tens of millions) of hands of poker experience between them. I only present it here to make the case that it is borderline impossible for humans to produce this level of perfectly balanced unpredictability: a finely tuned mixture that keeps opponents guessing while making a profit in the process.

We should take a moment to appreciate how hard that is.

For example, it would be very easy for Bob never to bluff, thus being totally honest and predictable all the time. It is equally easy for Bob to constantly bluff, thus again being predictably unpredictable since now opponents will correctly assume his bets to be rather weak. In both cases, Bob’s opponent Alice would know what to expect and thus she can adjust her strategy accordingly (simply by folding a lot against the first version and by fighting back against the second).

What is difficult is for Bob to find the fine line between bluffing and not bluffing, so that Alice no longer has a clear decision. Anything short of that would make him either too honest or too dishonest, both of which could be easily exploited by a very attentive player such as Alice.
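To put some numbers on that fine line, here is a minimal sketch using standard pot-odds arithmetic. It is an illustration only, not anything taken from Libratus: the pot and bet sizes are made-up, the function names `call_ev` and `breakeven_bluff_freq` are hypothetical, and it assumes a single river bet where Alice holds a hand that beats only Bob's bluffs.

```python
def call_ev(pot, bet, bluff_freq):
    """Alice's expected profit from calling, measured against folding (EV 0).

    She wins pot + bet when Bob is bluffing and loses bet when he is not.
    """
    return bluff_freq * (pot + bet) - (1 - bluff_freq) * bet


def breakeven_bluff_freq(pot, bet):
    """Share of bluffs in Bob's betting range that makes calling break even."""
    return bet / (pot + 2 * bet)


if __name__ == "__main__":
    pot, bet = 100, 100  # assumed example: a pot-sized river bet
    print(f"Indifference point: {breakeven_bluff_freq(pot, bet):.1%} bluffs")
    for f in (0.0, 0.10, 1 / 3, 0.60, 1.0):
        ev = call_ev(pot, bet, f)
        # Small tolerance so floating-point noise does not hide the break-even case.
        verdict = "call" if ev > 1e-9 else "fold" if ev < -1e-9 else "indifferent"
        print(f"Bob bluffs {f:4.0%} of the time: EV(call) = {ev:+7.2f} -> {verdict}")
```

With a pot-sized bet, Alice is indifferent only when exactly one third of Bob's bets are bluffs. Below that she should always fold, above it she should always call, which is why both the too-honest and the too-dishonest Bob are easy to play against.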

Good news, bad news…

OK, this is terrible news for Bob, who has neither the time nor the desire to develop such an elaborate and complicated winning strategy. How about Alice? The optimal strategy that Libratus seems to be approximating so effortlessly does not look easy at all. And it is not. Then how could Alice figure it out? Luckily, she does not have to.

Alice does not play poker against Libratus. Nor does Alice play poker against the top players in the world. Alice plays poker against people like Bob and occasionally against other players like herself. All of these people play according to the Honesty Principle, with few if any exceptions.

This very point was made clear in an introductory statement by Mason Malmuth and David Sklansky at the beginning of Matthew Janda’s advanced strategy book, Applications of No Limit Hold’em. The title of their statement was “A cautionary Note about Bluff-catching,” and it was essentially a warning about the consequences of ignoring the Honesty Principle when trying to call big bets in a defensive manner.

What their statement was trying to accomplish was to warn readers that trying to protect themselves against being bluffed is not necessarily the most profitable choice. This is especially true at the beginner and intermediate levels where most players do not bluff nearly as much as they should.

It is possible for Alice to ensure that Bob stays honest by occasionally calling him to keep him on his toes. But if Bob is already bluffing less than he should, each of Alice’s calls would be losing ones in the long run. Not calling him at all would be more profitable to her.

Of course, by never calling Bob’s big bets, Alice opens herself to the possibility that Bob could start taking advantage of her by increasing the frequency of his bluffs. That is exactly the point of Janda, who is trying to solve the game optimally: if Alice stops calling, Bob could theoretically exploit her.
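Bob's side of this can be sketched with the same kind of arithmetic. Again, this is only a rough illustration under assumed numbers: the pot and bet sizes are invented, `bluff_ev` and `min_call_freq` are hypothetical names, and Bob's bluff is assumed to lose every time it gets called.

```python
def bluff_ev(pot, bet, call_freq):
    """Bob's expected profit from a pure bluff.

    He wins the pot when Alice folds and loses his bet when she calls.
    """
    return (1 - call_freq) * pot - call_freq * bet


def min_call_freq(pot, bet):
    """How often Alice must call so that Bob's bluffs merely break even."""
    return pot / (pot + bet)


if __name__ == "__main__":
    pot, bet = 100, 100  # assumed example: a pot-sized river bet
    print(f"Alice must call {min_call_freq(pot, bet):.0%} of the time to stop free bluffs")
    for c in (0.0, 0.25, 0.50, 0.75):
        print(f"Alice calls {c:4.0%}: EV(bluff) = {bluff_ev(pot, bet, c):+7.2f}")
```

If Alice never calls, every bluff simply wins the pot. That is the theoretical hole in her strategy, and it only costs her money against a Bob who actually starts bluffing into it.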

Malmuth and Sklansky’s counterpoint, however, is that most Bobs are not good enough to realize that and thus are unlikely to adjust. I wholeheartedly agree. Bob is not Libratus, and thank goodness for that!

Konstantinos "Duncan" Palamourdas

Duncan is a math professor from UCLA who specializes in the mathematics of poker, as well as in poker education. He currently teaches poker classes at UCLA Extension that always fill up early and have long waitlists. He also authored a poker book - contracted to be published by 2020 - where he uses simple language to scientifically explain how and why money flows from poker amateurs to professionals.