Complexity of Checkers and Draughts on board sizes 6x6/8x8/10x10/12x12/14x14

The complexity of games is usually expressed by two numbers: the state-space complexity (SSC) and the game-tree complexity (GTC). The SSC is the number of positions that can be reached from the initial position. The GTC is usually estimated by b^d, where b is the branching factor and d is the average depth (game length in ply).

I'm interested in the complexity of checkers on board sizes 6x6 (small) and 8x8 (normal), and the complexity of draughts on board sizes 6x6 (small), 8x8 (Russian/Brazilian), 10x10 (International), 12x12 (Canadian) and 14x14 (South-African, a.k.a. Dumm). So we have 2 sets of rules and 5 board sizes.

Attachment:

5_sizes_dia.png [ 13.33 KiB | Viewed 4369 times ]

State-Space Complexity

In [1] the SSC of 8x8 checkers is calculated, and we can generalize the given formula to:

These are only upper bounds; the real numbers are (much) lower, even if we take into account that we need the number for both colors to move (a factor 2). For 6x6 checkers and 6x6 draughts I calculated the exact numbers, as they can be counted using a simple non-recursive function that traverses the entire game tree, keeping a large array in RAM with 1 bit per position (for both colors to move: 63838220543 x 2 = 127676441086 positions).

Spoiler:

Code:

// assume an array with 1 bit per position (init all zero)// upon completion, this contains the reachability databasefunction countDepthFirstWithMemory(Board board) : long integer count, ply := 0, 0 // ply is only for comment current := new Node(board) // root node (no parent) while current != null do if retrieve(index(current.board)) then // visited before? current := selectNext(current) else // new position store(index(current.board)) count++ if current.isTerminal() then current := selectNext(current) else generateChildren(current) current := current.children[0] ply++ fi fi od return countend

In [2] an estimate is given for the SSC of 10x10 draughts: 10^30 (no calculation given).

Game-Tree Complexity

In [2] estimates are given for the GTC of 8x8 checkers (b^d = 2.8^70 = 10^31) and 10x10 draughts (b^d = 4^90 = 10^54). These are still often cited, see e.g. [4]. However, in [3] the GTC of 8x8 checkers was corrected to b^d = 6.14^50 = 10^40, using several games databases.10x10 draughts needs a similar correction. Using all 500,000+ games from the database Turbo Dambase [5], I find b^d = 8.22^101 = 2.5 x 10^92. So like checkers much higher than previously estimated.

The GTC of the other variants is unknown. To estimate these, we need to know the branching factor and the average game length. These are determined experimentally. To this end, I developed a basic engine (named 'General') that can play all games at a reasonable level. Properties: data structure: single-dimensional array; search: principal variation search, transposition table, history heuristic, repetition check, single-move extension, quiescence search (side captures only), single thread, no forward pruning; evaluation: material (king = 1.5 men for checkers, king = sqrt(N) men for NxN draughts, runaways, outpost safety, balance, center control, mobility, holes, drawish material, trapped kings for checkers, mainly missing element: locks; parameters are hand-tuned; no opening book; no endgame databases; randomized: shuffle root move list before search. Tested for 10x10 draughts, the version with positional evaluation scored 78.5% against the material-only version (158 games, 5 sec/move).The resulting program can beat me, a former 10x10 draughts club player, in all variants. The 10x10 version scores about 1 draw per 20 games against Maximus (equal time), which amounts to expert level.

Before starting the experiment (playing a large amount of games) the following questions have to be answered.1. When do we stop a game that will probably end in a draw?2. How much time (or depth) do we give the program, for games to be representative?3. How many games do we play?

For draughts there is a natural moment to stop a game: if the search score is near zero and we have drawish material: the player who is behind has a king and the other player has 3 or less pieces (10x10, 12x12, 14x14) or 2 or less pieces (6x6, 8x8; exception: 1 king against 3 is also a draw when the king-alone is on the main diagonal - if I am not mistaken).For checkers there is no such moment. 2 kings win against 1, and sometimes also 1 against 1. The solution is to play with a high max. game length, and later truncate the games when a certain number of pieces is left. This corresponds to stopping when a virtual endgame database is reached (and we only want to know the game length, not the result). And this corresponds roughly to human players agreeing to a draw or resigning when they can predict the outcome. For 8x8 checkers the game length corresponds to the game length in human games if a 10-pieces (virtual) database is used - the largest feasible database that actually exists. For 10x10 draughts this is 8-pieces, for 12x12 draughts this would be 7-pieces, and for 14x14 draughts 6-pieces. For 6x6 checkers and 6x6 draughts I generated the 8-pieces database, but truncating games at 8 pieces results in artificially short games so I used 6-pieces in these cases (a bit arbitrary).

It is unclear which time or depth would be appropriate, so I chose to use incremental depth, which also makes it possible to observe the trend. I decided to play 10,000 games at each depth from 0 to at least 7, giving at least 70,000 games per variant. The games were stored, later truncated, duplicate games were removed, then the averages per game and per position were calculated. In total, over 600,000 games were played, of which over 450,000 unique (182 CPU days; 21 calendar days on 3 multi-core computers).

Attachment:

Branching Factor.png [ 18.1 KiB | Viewed 4369 times ]

Each datapoint represents <= 10,000 unique games.

Attachment:

Game Length.png [ 24.61 KiB | Viewed 4369 times ]

The branching factors found for 8x8 checkers and 10x10 draughts are higher than the previously found values for human games (checkers: 6.32 vs. 6.14; draughts: 8.95 vs. 8.22). This is influenced by the value of the evaluation mobility parameter. A lower value for mobility results in a lower branching factor. I tried several values and decided to use the one that resulted in the strongest play. Also the game length for 10x10 draughts is 10 ply more than in human games. But human games contain blunders, losing on time and grandmaster-draws, so the question is which number is more reliable.Anyway, it means that we should take the following results with a grain of salt, they are ballpark figures. For the branching factor I took the average over all games with search depth 3 and higher, for the game length I took the average over all games with the 3 highest search depths. This gives:

6x6 checkers and 6x6 draughts were also weakly solved (both are a draw), by building the 8-pieces WDL (win/draw/loss) databases and by using PNS (proof-number search), without a transposition table to avoid the GHI problem. But even the 4-pieces db's are sufficient for a quick proof (a few seconds). These games were previously solved by Rein Halbersma [6], using alpha-beta and a transposition table but without endgame databases (and disregarding GHI).The 8-pieces DTW (distance-to-win) databases for 6x6 checkers and 6x6 draughts were also built. The longest win in 6x6 checkers is 147 ply (with 68 consecutive king-slide-moves), in 6x6 draughts this is 83 ply.

Conclusions

The following figure shows the results found, compared to some popular games from literature.

Attachment:

Complexity.png [ 22.14 KiB | Viewed 4369 times ]

Concluding, we can say that:1. 6x6 checkers and 6x6 draughts are weakly solved.2. 8x8 checkers is weakly solved [3]. 8x8 draughts is a candidate to weakly solve, and it can be solved much faster than checkers.3. 10x10 draughts is more complex than currently "known" in the literature. The complexity is closer to chess than to checkers. Weakly solving 10x10 draughts is far away.4. 12x12 draughts is more complex than chess, and very tactical (is my impression; I lost all test games).5. 14x14 draughts is more complex than Go, and very strategic (is my impression; I won a few test games).

More complex means a game has a higher GTC, not necessarily it is more difficult. Especially 14x14 draughts will be a tough game for computers. Methods that are known to be highly successful for 8x8 checkers and 10x10 draughts will start to break down. There is virtually no use for an opening book and endgame databases, search depth will be strategically insignificant (compared to game length), and evaluation becomes increasingly difficult (to learn). So maybe AlphaGo-like techniques start to get useful here. And we don't have to get bored in the years to come.

Wow, nice overview Jan-Jaap! I have lots of questions, but first some easy ones.

Does your program generalize to W X H boards? The reason I ask is that in the special case that W = H +/- 1 or H +/- 2, (or vice versa), a 3 kings vs 1 king endgame is generally won instead of drawn. In Russia, there have been 10x8 (10 width) tournaments and here in the Netherlands the late Tjalling Goedemoed has organized some tournaments on 10x11 and 10x12 boards.

Did you actually build databases on a 14x14 board? How long did that take?

Wow, nice overview Jan-Jaap! I have lots of questions, but first some easy ones.

Does your program generalize to W X H boards? The reason I ask is that in the special case that W = H +/- 1 or H +/- 2, (or vice versa), a 3 kings vs 1 king endgame is generally won instead of drawn. In Russia, there have been 10x8 (10 width) tournaments and here in the Netherlands the late Tjalling Goedemoed has organized some tournaments on 10x11 and 10x12 boards.

Did you actually build databases on a 14x14 board? How long did that take?

Thank you. No, my program is written for N x N boards specifically. It is possible to generalize it to W x H, but that may be quite some work. I was vaguely aware that non-square boards may change the draw margin in the endgame, but I didn't have to consider that.

No, I only built the endgame databases for 6x6 checkers and 6x6 draughts (and, just now, for 10x10 draughts). But these were not used in the experiment, as they were part of a separate program ('Hexamus') with the code derived from Maximus. It uses bitboards and is better suited for db generation and solving the games (from a performance point of view).

For the experiment (with program 'General') I needed a stopping criterion for the games, so I pretend to have a database (a "virtual" database) with as many pieces as would be practically feasible for the current game type, and truncate a game when that number of pieces is left on the board. The assumption is that this gives a game length (never mind the game result) which more or less resembles the game length of human games, which are often used in GTC calculations.

Thank you. No, my program is written for N x N boards specifically. It is possible to generalize it to W x H, but that may be quite some work. I was vaguely aware that non-square boards may change the draw margin in the endgame, but I didn't have to consider that.

It is not that hard in my experience to generalize an NxN program to WxH. I only use "naked" width and height in writing position diagrams to stdout, and in some eval patterns (column balance etc.). All other stuff is hidden behind a table that maps square numbers to bit numbers. I assume that you also use such a square mapping for your array-based program, because it removes even/odd rows in move generation and eval. In that square mapping, some stuff depends on width and height. In any case, you have the perft numbers to compare in case you want to add it (and I'd love to have confirmation!).

It is not that hard in my experience to generalize an NxN program to WxH. I only use "naked" width and height in writing position diagrams to stdout, and in some eval patterns (column balance etc.). All other stuff is hidden behind a table that maps square numbers to bit numbers. I assume that you also use such a square mapping for your array-based program, because it removes even/odd rows in move generation and eval. In that square mapping, some stuff depends on width and height. In any case, you have the perft numbers to compare in case you want to add it (and I'd love to have confirmation!).

I was also thinking about GUI, PDN, DXP, etc. But if you only need perft confirmations I can have a look. Do you have a list of what you want?

It is not that hard in my experience to generalize an NxN program to WxH. I only use "naked" width and height in writing position diagrams to stdout, and in some eval patterns (column balance etc.). All other stuff is hidden behind a table that maps square numbers to bit numbers. I assume that you also use such a square mapping for your array-based program, because it removes even/odd rows in move generation and eval. In that square mapping, some stuff depends on width and height. In any case, you have the perft numbers to compare in case you want to add it (and I'd love to have confirmation!).

I was also thinking about GUI, PDN, DXP, etc. But if you only need perft confirmations I can have a look. Do you have a list of what you want?

Here I did 12x10 and 10x12: viewtopic.php?t=4563I am also thinking of doing 3 kings vs 1 king on 10x8 later this year. (I can generate perft now if you want to compare).

BTW, I just did perft for 6x6 (checkers and international) and 14x14 (international) to reasonable depth, so we can compare notes.

We both find 7.71788e+66 instead of 5.0e+66. Can you recompute and confirm?

You are right, thanks for reporting this. I failed to notice overflow (negative numbers) in my binomial table for cases 12x12 and 14x14. I changed the table data type from 64-bit integer to BigInteger, and new results are:

We both find 7.71788e+66 instead of 5.0e+66. Can you recompute and confirm?

You are right, thanks for reporting this. I failed to notice overflow (negative numbers) in my binomial table for cases 12x12 and 14x14. I changed the table data type from 64-bit integer to BigInteger, and new results are:

Good to hear! I did my calculations in an R script, so only floating point result. I used to have a Mathematica equivalent that did the full BigInteger calculations, but I no longer have a license for it. I trust that those final digits are correct

Who is online

Users browsing this forum: No registered users and 2 guests

You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot post attachments in this forum