rithm described in the Endgame Solving section takes
this “card removal” factor into account to an extent,
since the equities are computed for each hand against
the distribution the opponent could hold given that
hand; however, this does not fully take into account
the card removal effect. For example, the 3c2c and 3s2c hands would both have the lowest possible equity (only slightly above zero, because of possible ties), and would necessarily be grouped into the same bucket (the worst one) by the end-game information-abstraction algorithm, despite the fact that they have very different card removal properties: on a club-heavy board, for instance, 3c2c blocks more of the opponent's possible flushes than 3s2c does.
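This blocker difference can be made concrete with a small sketch. The board below is hypothetical (a river board containing three clubs), chosen so that club flushes are possible; the script counts how many of the opponent's club-flush combinations remain once our hand and the board are removed from the deck. Both hands have near-zero equity and would land in the same bucket, yet they block different numbers of the opponent's strongest combinations:

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

# Hypothetical river board with three clubs, so club flushes are possible.
BOARD = ["Kc", "9c", "7c", "Ah", "4d"]

def opponent_flush_combos(our_hand):
    """Count opponent hole-card combos that make a club flush,
    after removing our hand and the board from the deck."""
    dead = set(BOARD) | set(our_hand)
    live = [card for card in DECK if card not in dead]
    return sum(1 for a, b in combinations(live, 2)
               if a[1] == "c" and b[1] == "c")

for hand in (["3c", "2c"], ["3s", "2c"]):
    print("".join(hand), "->", opponent_flush_combos(hand), "flush combos")
```

Holding 3c2c leaves only C(8,2) = 28 flush combinations for the opponent, while 3s2c leaves C(9,2) = 36; a bucketing scheme based purely on equity cannot see this distinction.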

Doug Polk said that he thought the river strategy
using the end-game solver overall was the strongest
part of Claudico; however, he thought that utilizing
the large betting sizes without properly accounting
for card removal was actually a significant weakness,
since Claudico would be bluffing with nonoptimal
hands. Our team had come to this conclusion ourselves during the competition as well, and for this reason we removed the large bet sizes from Claudico's end-game solver partway through the competition, since this issue is most problematic for
those bet sizes (for smaller bet sizes, card removal is
still important, but significantly less important since
we are not just trying to “block” the opponent from
having a small number of extremely strong hands,
since he will be calling with many more hands).
Interestingly, Dong Kim told me after the competition that they had conducted an analysis and found that Claudico was actually profiting on the large bet sizes during
the time they were used, despite the theoretical issue
described above. I think everyone agrees that massive
“overbets” are part of full optimal strategies, and are likely underutilized by even the best human players. But card removal is also particularly important for these sizes, and I think that for an agent to use them successfully an improved algorithm for dealing with blockers/card removal would need to be developed, though I am still quite curious how well Claudico would have performed if it had continued with those sizes included in the agent.

Conclusion

It is one thing to evaluate a poker agent against other computer agents, which largely also play static approximations of equilibrium strategies; it is another to compete against the strongest human specialists, who will adapt and attempt to capitalize on even the smallest perceived weaknesses. This was the first time a no-limit Texas hold ’em agent had competed against human players of this caliber, and our team really had no idea what to expect entering the competition, as previously all of our experiments had been against computer agents from the AAAI Annual Computer Poker Competition. Many valuable lessons were learned that will be pivotal in developing improved agents going forward. I have highlighted the two most important avenues for future research. The first is to develop an improved approach for the “off-tree” problem, where a mistake is made due to a misperception of the actual size of the pot after translating an action for the opponent that is not in our action abstraction. I have outlined promising agendas for attacking this problem, including improved action abstraction and translation algorithms, novel approaches for real-time computation that address the portion of the game prior to the final round, and entirely new approaches specifically geared at solving the off-tree problem independently of the other problems. The second is to develop an improved approach for information abstraction that better accounts for card removal/blockers (that is, one that accounts for the fact that having certain cards in our hand modifies the probability of the opponent having certain hands).
This issue is most problematic within the information abstraction algorithm for the end game, where the card removal effect is most significant due to the distributions for us and the opponent being the most well defined (that is, there is no more potential remaining in the hand due to uncertainty of public cards, and this relative certainty will likely cause the distributions to put positive weight on fewer hands), and it limits our ability to utilize large bet sizes, which have been demonstrated to be optimal in certain settings. Of course, it would be beneficial to develop an improved information abstraction algorithm that accomplishes this in the part of the game prior to the end game as well.

At first glance it may appear that these issues are
purely pragmatic and specific to poker. While one of
the main goals is certainly to produce a poker agent
that can beat the strongest humans in two-player no-limit Texas hold ’em, there are deeper theoretical
questions related to each component of the agent
that has been described. Endgame solving has been
proven to have theoretical guarantees in certain
games, while it can lead to strategies with high
exploitability in others (even if the full game has a
single Nash equilibrium and just a single end game is
considered) (Ganzfried and Sandholm 2015). It
would be interesting to prove theoretical bounds on
its performance on interesting game classes, perhaps
classes that include variants of poker. Empirically, the approach appears to be very successful in poker despite its lack of theoretical guarantees. Recently an
approach has been developed for game decomposition that has theoretical guarantees (Burch, Johanson, and Bowling 2014); however, from personal
communication with the authors I have learned that
the approach performs worse empirically than our
approach that does not have a worst-case guarantee.

The main abstraction algorithms that have been
successful in practice are heuristic and have no theoretical guarantees. It is extremely difficult to prove
meaningful guarantees when performing such a large