Main Menu

Solving Sudokus in Java

Sudoku, also spelled Su Doku, is a logic-based placement puzzle.
It consists in placing numbers from 1 to 9 in a 9-by-9 grid made up of
nine 3-by-3 subgrids, called regions or boxes orblocks, starting with various numerals given in some
cells, the givens or clues. It can be described
with a single rule:

Each row, column and region must contain all numbers from
1 to 9.

We can immediately deduce that for each row, column, and region
the values in the cells have to be different. Moreover, this
condition is sufficient; thus, the unique rule could be reformulated
as:

Each row, column and region must contain numbers from 1 to
9 that are all different.

"Solution to the Sudoku puzzle" />Figure 2. Solution to the Sudoku puzzle

The Wikipedia
entry on Sudoku has more information about the puzzle. Although
first published in 1979, Sudoku initially became popular in Japan
in 1986 and attained international popularity in 2005, both in
newspapers (many newspapers publish a daily Sudoku puzzle) and on
the Web. For example, see
"http://sudoku.koalog.com">Koalog's Free Sudoku of the Day to
play Sudoku online.

While Sudoku lovers invent very complex techniques with weird
names (X-wing, Swordfish, Nishio),
to solve Sudokus without guessing, computer programs usually use
brute-force methods. In this article, we will learn how to solve
Sudokus using
"http://en.wikipedia.org/wiki/Constraint_programming">Constraint
Programming in Java. This approach has many advantages:

It can solve any Sudoku.

It is fast.

It is as powerful as the most complex human techniques: most of
the Sudokus can be solved without trial and error.

It is much more robust than the brute
force approach: variants of the problem, with additional
constraints, could be easily solved the same way.

Constraint Programming

Constraint Programming (CP) is a technique of choice for solving
combinatorial problems such as the Sudoku problem.

Firstly, it consists of modelling the problem using mathematical
relations, or constraints. Precisely, the term constraint
denotes the implementation of the mathematical relation. Hence, a
Constraint Satisfaction Problem (CSP) is defined by:

A set of variables.

A set of domains (possible values) for the variables.

a set of constraints/relations on the variables.

Secondly, a strategy for enumerating the different solutions has
to be defined. CP differs from naive enumeration methods by the
fact that constraints contain powerful filtering algorithms
(usually inspired by
"http://en.wikipedia.org/wiki/Operational_research">Operational
Research techniques) that will drastically reduce the search
space by dynamically reducing the domains of the variables during
the enumeration phase (also known as the "search phase").

We will use Koalog
Constraint Solver, a Java library for Constraint Programming,
to write the code. None of the techniques presented in this article
depend on the choice of the solver, and could be implemented
using any commercial or open source solver.

Modelling the Sudoku Problem

Let us first create a problem object that will be the
placeholder for the problem modelization. Using Koalog Constraint
Solver, this can be done by subclassing aBaseProblem:

An important step in Constraint Programming is defining the
problem variables. In our case, we want to find the value for each
cell. Note that other models are possible: for example, we could
define, for each value, its position in each row or column or
block. It is also possible to mix several models. In our case, the
following simple model will be sufficient.

IntegerVariable[][] num = new IntegerVariable[9][9];

where the local variable num will be indexed by
rows, from 0 to 8, and then by columns, from 0 to 8.

We have defined the initial domain of each variable, taking into
account the givens. SparseDomain means that the domain
of each variable will be represented by a set of possible values.
Note that it is also possible to represent a domain by an interval
of the form [min, max] using theMinMaxDomain class.

In each of the nine blocks, we want the values to be different.
This means that we should enforce 36 (the number of distinct pairs
that can be selected from nine values: 9*8/2) inequations between the
nine num variables of each block. Instead, we simply use
a global constraint calledAllDifferent_SPARSE; it is suffixed by_SPARSE (a convention in Koalog Constraint Solver) to
take into account the sparse nature of the domains. These
constraints are added to the problem using its add
method:

We also want the values to be different in each row and in each
column. In order to do that, we can use theSquareMatrix class, which provides utility methods to
access the rows and columns of a square matrix:

The advantage of global constraints over a large number of small
constraints is not only that they make the constraint model easier
to read, but that they are also more powerful, since they usually
incorporate complex filtering algorithms. For example, theAllDifferent_SPARSE constraint uses an2.5 filtering algorithm (based on a maximal
matching algorithm from the graph theory). An interval
version of the same constraint exists. It is calledAllDifferent; it uses a nxlog(n) filtering
algorithm. The AllDifferent constraint is thus faster
but less powerful than the AllDifferent_SPARSE
constraint, which is why we used the latter in our model.

Finally, we define the array of variables that will be used by
the solver to display the solution:

setVariables((IntegerVariable[]) mnum.getElements());

We are done with the modelling, which only consists of 27AllDifferent_SPARSE constraints: nine for the blocks, nine
for the rows, and nine for the columns. The complete model can be
found in the source file, in the
"#resources">Resources section.

Solving the Sudoku Problem

This problem is simple enough to use one of the predefined
solvers available in Koalog Constraint Solver:

Solver solver = new DefaultFFSolver(problem);solver.solve();

where problem is a Sudoku problem instance andDefaultFFSolver is a solver conforming to the
first-fail heuristic. More precisely, this is syntactic sugar for:

Solver solver = new SplitSolver (problem, new SmallestDomainVariableHeuristic(), new IncreasingOrderDomainHeuristic());

where SmallestDomainVariableHeuristic selects the
variable with the smallest domain (with the hope of cutting large
branches from the search tree) andIncreasingOrderDomainHeuristic tries to instantiate
the selected variable by choosing the first value in its domain.
Note that these types of solver and heuristics are very common and
are available in any constraint solver.

Using this very simple model, most of the Sudokus are solved by
deduction (i.e., with no search). For example, here is an extract of
the output produced by Koalog Constraint Solver when solving the
Sudoku given to illustrate the beginning of this article. The test
was performed on a 1.6GHz PC running Linux SuSE 9.3:

A solution was found in 104ms by filtering 160 constraints (a
constraint solver needs to filter each constraint more than once
since the domains of the variables get reduced during the
process).

No trial and error was required.

This solution is unique (i.e., the Sudoku is well-formed).

The proof of this last result took only (109-104=) 5ms!

These results mean that the filtering algorithm, used by the 27AllDifferent_SPARSE constraints, is powerful enough to
deduce the unique solution of the problem.

A small percentage of Sudokus still require somebacktracking (backtracking is an enumeration algorithm),
but this remains very limited. In any case, solutions are found
very quickly (in less than 200ms).

Going Further

Helmut Simonis' article "Sudoku as a Constraint
Problem" presents redundant constraints dedicated to the Sudoku
problem, further reducing the need for backtracking. In particular,
one could take advantage of the LatinSquare constraint,
which will be available in the next release of Koalog Constraint
Solver, version 3.0. In any case, the solving of problems such as the
Sudoku problem is made very easy with the use of Constraint
Programming.

There are two other interesting, but harder, problems related to
Sudokus:

Generating well-formed Sudokus: more
precisely, generating givens that lead to a unique solution, which
is the case of the Sudoku provided as an example at the beginning
of this article.

Explaining Sudokus: more precisely, generating
a human-readable explanation of the resolution of a Sudoku.

Both problems can be addressed with Constraint Programming, but
are out of the scope of this article.