Pattern databases for the 5x5 sliding puzzle

In 2002, Korf and Felner [1] used pattern databases to optimally solve 50 random instances of the 5x5 sliding puzzle. They used a static 6+6+6+6 partition of the tiles (described below), along with its reflection in the main diagonal. In a 2004 paper, Felner, Korf and Hanan [2] describe in a footnote their handling of the empty tile as 'not trivial': the empty tile was taken into account when precomputing the database, but the tables were then compressed by discarding the information about the empty location. The authors do not provide the distribution of values in the pattern databases, but they do provide maximum values and the number of nodes generated when solving random instances of the 5x5 sliding puzzle.

I could not find the actual distributions online, so they are given below (not yet confirmed independently). The counts are given in terms of three different entities which I'm calling 'states', 'compressed states' and 'buckets'. 'States' are arrangements of seven tiles, including the empty tile; for each of the pattern databases discussed in this post, there are 25 × 24 × 23 × 22 × 21 × 20 × 19 = 2,422,728,000 'states'. 'Compressed states' are arrangements of the six physical tiles included in a pattern database; for each table, there are 25 × 24 × 23 × 22 × 21 × 20 = 127,512,000 'compressed states'. 'Buckets' may be described as equivalence classes of 'states', where two 'states' are considered equivalent iff they are reachable from each other by a sequence of moves none of which affects any physical tile included in the pattern database. Every 'state' maps to exactly one 'bucket', and every 'bucket' maps to exactly one 'compressed state'.
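The three entities can be illustrated with a minimal sketch. It assumes row-major cell numbering 0..24 and represents a 'bucket' as the tile placement together with the set of cells the empty tile can reach without moving a pattern tile; the function names and the example placement are hypothetical and not taken from [1] or [2].

```python
from math import perm

N = 5                # board side
CELLS = N * N        # 25 cells

# Ordered placements of 7 objects (6 pattern tiles + the empty tile).
NUM_STATES = perm(CELLS, 7)
# Ordered placements of the 6 physical pattern tiles only.
NUM_COMPRESSED = perm(CELLS, 6)

def neighbours(cell):
    """Yield the orthogonally adjacent cells of a cell."""
    r, c = divmod(cell, N)
    if r > 0:
        yield cell - N
    if c > 0:
        yield cell - 1
    if c < N - 1:
        yield cell + 1
    if r < N - 1:
        yield cell + N

def bucket_of(tiles, empty):
    """Map a state to its bucket: (tile placement, cells the empty can reach).

    The empty tile may wander over every cell not occupied by a pattern
    tile, so two states share a bucket iff they have the same placement
    and their empty cells lie in the same connected region.
    """
    blocked = set(tiles)
    seen = {empty}
    stack = [empty]
    while stack:
        cell = stack.pop()
        for nxt in neighbours(cell):
            if nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return tuple(tiles), frozenset(seen)

def compressed_of(bucket):
    """Map a bucket to its compressed state by dropping the empty region."""
    return bucket[0]
```

With the hypothetical placement (1, 5, 10, 11, 12, 13), the corner cell 0 is walled off, so this placement alone accounts for two buckets (empty in the corner, empty elsewhere) that share one compressed state.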

I implemented an optimal solver which uses these tables and ran it on 12 of the 50 puzzle instances from [1]. For all 12 of these configurations, the distances and the numbers of generated nodes reported by my implementation match those given in [1]. (In my implementation, the search is terminated when the first solution is found; the moves are tried in the fixed order [up, left, right, down], with the move direction corresponding to the direction of movement of the empty tile; there is no duplicate pruning other than eliminating pairs of consecutive moves which cancel each other. The 12 processed instances are numbered 38, 40, 25, 32, 44, 37, 30, 13, 1, 28, 36, 5 in the paper [1].)
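The search conventions in the parenthesis above (stop at the first solution, try up/left/right/down, prune only the immediate inverse move) can be sketched as a small IDA* loop. This is an illustration only, not my solver: it runs on the 3x3 puzzle, uses Manhattan distance instead of pattern databases, and assumes a goal with the empty tile in the top-left corner and label t at cell t. All names are hypothetical.

```python
N = 3  # 3x3 keeps the demo fast; the post itself is about 5x5
# Moves of the empty tile, tried in the order up, left, right, down.
MOVES = (("u", -N), ("l", -1), ("r", 1), ("d", N))
DELTA = dict(MOVES)
INVERSE = {"u": "d", "d": "u", "l": "r", "r": "l"}

def legal(empty, name):
    """True if the empty tile at cell `empty` can move in direction `name`."""
    r, c = divmod(empty, N)
    return {"u": r > 0, "l": c > 0, "r": c < N - 1, "d": r < N - 1}[name]

def manhattan(state):
    # Assumed goal: label t sits at cell t, the empty (label 0) at cell 0.
    return sum(abs(p // N - t // N) + abs(p % N - t % N)
               for p, t in enumerate(state) if t)

def apply_moves(state, moves):
    """Apply a sequence of empty-tile moves and return the new state."""
    state = list(state)
    empty = state.index(0)
    for name in moves:
        assert legal(empty, name)
        target = empty + DELTA[name]
        state[empty], state[target] = state[target], 0
        empty = target
    return tuple(state)

def ida_star(start):
    """Return (moves, generated) for the first solution found.

    The input must be solvable, otherwise the loop never terminates.
    """
    state, path, generated = list(start), [], 0

    def dfs(empty, g, bound):
        nonlocal generated
        for name, delta in MOVES:
            if not legal(empty, name):
                continue
            if path and INVERSE[name] == path[-1]:
                continue  # would cancel the previous move
            target = empty + delta
            state[empty], state[target] = state[target], 0  # make the move
            generated += 1
            h = manhattan(state)
            path.append(name)
            if g + 1 + h <= bound and (h == 0 or dfs(target, g + 1, bound)):
                return True
            path.pop()
            state[empty], state[target] = 0, state[empty]   # undo the move
        return False

    bound = manhattan(state)
    if bound == 0:
        return [], 0
    while not dfs(state.index(0), 0, bound):
        bound += 2  # with Manhattan distance all f-values share one parity
    return path, generated
```

For example, `ida_star(apply_moves(tuple(range(9)), "rdrd"))` undoes the four-move scramble with the moves u, l, u, l. Whether "generated" is counted exactly as in [1] is an assumption of this sketch.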

For the irregular pattern, there are 55 depth-3 'buckets' but only 54 depth-3 'compressed states'. This is because there are two different depth-3 'buckets' mapping to the same depth-3 'compressed state'.

For the 2x3 pattern, there are 103 depth-3 'buckets' but only 95 depth-3 'compressed states', so 8 "missing" 'compressed states' need to be explained. There are six pairs of depth-3 'buckets' such that two 'buckets' in the same pair map to the same depth-3 'compressed state'. Two other depth-3 'buckets' map to 'compressed states' already reached at depth 1.
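The difference between the two kinds of counts can be reproduced with a breadth-first search over 'buckets', where only moves of pattern tiles add to the depth. The sketch below is a rough illustration under assumptions: the block of cells used as the pattern, the goal position of the empty tile and all function names are hypothetical, so its counts need not equal the 103/95 or 55/54 quoted above.

```python
from collections import Counter, deque

N = 5

def neighbours(cell):
    """Yield the orthogonally adjacent cells of a cell (row-major numbering)."""
    r, c = divmod(cell, N)
    if r > 0:
        yield cell - N
    if c > 0:
        yield cell - 1
    if c < N - 1:
        yield cell + 1
    if r < N - 1:
        yield cell + N

def region(tiles, empty):
    """Cells the empty tile can reach without moving any pattern tile."""
    blocked, seen, stack = set(tiles), {empty}, [empty]
    while stack:
        cell = stack.pop()
        for nxt in neighbours(cell):
            if nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def depth_counts(goal_tiles, goal_empty, max_depth):
    """Count buckets and compressed states per depth, up to max_depth."""
    start = (tuple(goal_tiles), region(goal_tiles, goal_empty))
    depth = {start: 0}
    queue = deque([start])
    while queue:
        bucket = queue.popleft()
        tiles, reach = bucket
        d = depth[bucket]
        if d == max_depth:
            continue
        for i, cell in enumerate(tiles):
            for nxt in neighbours(cell):
                if nxt in reach:  # the empty can get next to this tile
                    moved = tiles[:i] + (nxt,) + tiles[i + 1:]
                    # After the move the empty sits where the tile was.
                    succ = (moved, region(moved, cell))
                    if succ not in depth:
                        depth[succ] = d + 1
                        queue.append(succ)
    buckets = Counter(depth.values())
    # A compressed state keeps the smallest depth over all of its buckets.
    best = {}
    for (tiles, _), d in depth.items():
        best[tiles] = min(d, best.get(tiles, d))
    compressed = Counter(best.values())
    return buckets, compressed

# Hypothetical pattern: a block of six cells in the top-right corner,
# with the empty tile assumed to start in the top-left corner.
buckets, compressed = depth_counts((3, 4, 8, 9, 13, 14), 0, 3)
```

At every depth the bucket count is at least the compressed-state count. A gap appears either when several buckets of one depth share a placement, or when a bucket's placement was already reached at a smaller depth; these are the two effects described above.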

Preliminary results and an update on WD (edited)

Late but hopefully not too late, here are some preliminary results from my experiments. (Nodecounts for DPDB match those given by Korf and Felner, so I believe my implementation of DPDB is correct. Nodecounts for WD, ID, WD+ID seem to match those produced by my slow solver written in 2011 (the new solver is written in another language without looking at the old code), so I believe these are correct too.)

I'm manually recompiling the code for different combinations of heuristics:

ID = the 5x5 version of the ID heuristic. My implementation uses almost no memory and generates 20M+ nodes per second. I implemented only a direct generalization of the InvertDistance heuristic as proposed by Ken'ichiro Takahashi and did not use the additional improvements suggested here.

MD = Manhattan distance.

According to my experiments on the 12 easier of the 50 test instances from (Korf Felner 2002), adding WD and/or ID to DPDB reduces the number of generated nodes by several percent. In two of the 12 cases, my implementation generated approximately 17% fewer nodes when WD and ID were added.
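A common way to "add" one admissible heuristic to another is to take the pointwise maximum, which stays admissible and is never weaker than any single component. The sketch below assumes that combination rule for illustration; the helper name `combine` is hypothetical, and the toy functions merely stand in for DPDB, WD and ID lookups.

```python
def combine(*heuristics):
    """Return a heuristic that is the pointwise maximum of its arguments.

    If every argument is a lower bound on the true distance, so is the
    maximum, and it dominates each individual bound.
    """
    def h(state):
        return max(f(state) for f in heuristics)
    return h

# Toy stand-ins for real heuristic lookups (assumptions, not real tables).
h_a = lambda state: 3
h_b = lambda state: 5
h_combined = combine(h_a, h_b, len)
```

A larger heuristic value can only prune more in IDA*, which is consistent with the nodecount reductions reported above.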

I cannot say much about relative speeds since neither of my implementations is particularly optimized for speed. (I tried to optimize for memory first, to make it possible to build all tables from scratch under a 2 GB memory limit and to have two instances of the solver running at the same time.)

I ran the solver on some non-random instances. The configuration "rotate_180" seems to be a particularly bad case for disjoint pattern databases: ID, using almost no memory, generates fewer nodes to complete depth 140 than DPDB does. On the other hand, combining ID and/or WD with DPDB does reduce nodecounts even on "rotate_180". To complete depth 140, the WD-only solver generates 17M nodes while the (WD+DPDB) solver generates 15M nodes. To complete depth 146, the WD-only solver generates almost 500 billion nodes, while the (WD+DPDB) solver generates about 233 billion nodes.

To me it looks like DPDB is better on random instances (but WD and ID are still helpful), while WD and ID are much better on some non-random instances such as "rotate_180" (but DPDB is still helpful).

The WD ("WalkingDistance") heuristic suggested by Ken'ichiro Takahashi may be a re-discovery of the "X-Y heuristic" (Prieditis 1993). I could not find any earlier mention of the ID ("InvertDistance") heuristic in the literature, though. I am trying to organize the information available to me, but unfortunately I cannot tell how long that will take, so below are links for anyone interested.

Links and references

Armand E. Prieditis (1993) Machine Discovery of Effective Admissible Heuristics. [Describes the "X-Y heuristic" applied to the 3x3 sliding puzzle. As far as I can tell from the description, "X-Y" and "WD" are equivalent.]

Luc Edixhoven (2016) Attacking the n-puzzle using SAT solvers. 3.3. Heuristic functions. [Describes the "X-Y" heuristic and gives numbers of 105 states for the 3x3 version and 24964 states for the 4x4 version; these numbers exactly match corresponding numbers for the "WalkingDistance" heuristic.]
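The state counts quoted in the entry above can be reproduced with a short breadth-first search. The sketch assumes the usual description of WalkingDistance for vertical moves: a state records, for each current row, how many of its tiles belong to each goal row, plus the row of the empty tile. The function name is hypothetical, and the empty tile's goal is assumed to be the last row (by symmetry the count is the same for the first row).

```python
from collections import deque

def wd_table_size(n):
    """Count the row-profile states reachable from the goal of the n x n puzzle."""
    # Goal: row i holds n tiles whose goal row is i, except the empty
    # tile's row (the last one here), which holds n - 1 of them.
    goal = tuple(
        tuple((n if i == j else 0) - (1 if i == j == n - 1 else 0)
              for j in range(n))
        for i in range(n)
    )
    start = (goal, n - 1)
    seen = {start}
    queue = deque([start])
    while queue:
        rows, empty_row = queue.popleft()
        for other in (empty_row - 1, empty_row + 1):
            if not 0 <= other < n:
                continue
            for j in range(n):
                if rows[other][j] == 0:
                    continue  # no tile with goal row j in that row
                new = [list(r) for r in rows]
                # A tile with goal row j slides into the empty tile's row,
                # and the empty tile takes its place in the adjacent row.
                new[other][j] -= 1
                new[empty_row][j] += 1
                succ = (tuple(map(tuple, new)), other)
                if succ not in seen:
                    seen.add(succ)
                    queue.append(succ)
    return len(seen)
```

Calling `wd_table_size(3)` and `wd_table_size(4)` gives the table sizes for the 3x3 and 4x4 versions respectively.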

Solving 4x4 antipodes as 5x5 puzzle instances

Some 4x4 configurations can be solved with fewer moves when extended to 5x5 configurations. The following instance is solvable in 31s* (single-tile moves) if the fringe tiles (rightmost column and bottom row) are fixed, but in 29s* when all tiles can be moved:

It seemed natural to check whether any of the 17 antipodes of the 4x4 puzzle can be solved in fewer than 80 single-tile moves as a 5x5 configuration.
However, according to my solver, all of them remain 80s*, even when extended to 5x5.
I did not check whether there are any additional optimal solutions which temporarily disturb the fringe tiles.
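Extending a 4x4 configuration to 5x5 as described above can be sketched as follows. The labelling convention is an assumption made for illustration: boards are row-major tuples, label 0 is the empty tile, and label t has goal cell t (so the empty tile's goal is the top-left corner, which keeps the fringe in the rightmost column and bottom row). The function name is hypothetical.

```python
def extend_4x4_to_5x5(small):
    """Embed a 4x4 configuration in the top-left corner of a 5x5 board.

    Fringe tiles (rightmost column and bottom row) start on their goal
    cells; 4x4 labels are renamed to the 5x5 labels of the same goal cell.
    """
    big = list(range(25))                 # start from the solved 5x5 board
    for cell, label in enumerate(small):
        r, c = divmod(cell, 4)            # where the tile is now
        gr, gc = divmod(label, 4)         # where the tile belongs
        big[5 * r + c] = 5 * gr + gc
    return tuple(big)

# Example input: the 4x4 board read backwards (not necessarily solvable).
example = extend_4x4_to_5x5(tuple(reversed(range(16))))
```

The 5x5 solver is then free to move the fringe tiles, which is what allows shorter solutions for some instances.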

On these 17 instances of the 5x5 sliding puzzle, the WD-only solver generated roughly 7 to 18 times more nodes than the DPDB-only solver, while the DPDB-only solver generated roughly 1.6 to 2.2 times more nodes than the (DPDB+WD) solver.

(Added 20 Apr 2017) The (WD+ID) solver generated roughly 1.06 to 1.47 times fewer nodes than the WD-only solver.

Answering myself: My 5x5 solver found all optimal solutions to every 4x4 antipode extended to 5x5. Then I used Kociemba's 15 puzzle optimal solver to find all optimal solutions to every 4x4 antipode. The number of solutions was the same in both cases (it ranges between 440 and 4452), so for these 17 instances there are no optimal solutions which touch the fringe tiles.

Nodecounts

The next table gives the number of optimal solutions and the number of generated nodes for each of the 17 4x4 antipodes extended to 5x5. Nodecounts are recorded when the last iteration of IDA* is completed. The WD-only solver took almost two weeks to find all optimal solutions. Although there are pairs of antipodes equivalent under reflection in the main diagonal (e.g. #1 and #16), I decided to actually run all 17 instances. As expected, the number of solutions and the number of generated nodes are the same for equivalent configurations.
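The reflection in the main diagonal mentioned above can be written down explicitly. This sketch assumes row-major boards in which label t has goal cell t and the empty tile (label 0) belongs in the top-left corner, so the goal is its own mirror image; the function name is hypothetical.

```python
def reflect(state, n):
    """Reflect an n x n configuration in the main diagonal.

    Both the cells and the labels are transposed, so the reflected
    board is exactly as far from the goal as the original one.
    """
    def t(x):
        r, c = divmod(x, n)
        return c * n + r                  # swap row and column
    out = [0] * (n * n)
    for cell, label in enumerate(state):
        out[t(cell)] = t(label)
    return tuple(out)

# 3x3 example: the goal after one "right" move of the empty tile ...
after_right = (1, 0, 2, 3, 4, 5, 6, 7, 8)
# ... mirrors to the goal after one "down" move of the empty tile.
mirrored = reflect(after_right, 3)
```

The reflection exchanges right with down and up with left, so a fixed move order such as u/l/r/d treats the two members of a mirror pair differently once the search stops at the first solution.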

The next table gives nodecounts recorded when the first solution is found, for the same 17 instances. Unlike the previous table, this one may give different counts for mirror-symmetric pairs. As before, the move ordering is u/l/r/d.