Figure 2: Median solution cost using dynamic backtracking for
solvable (solid lines) and unsolvable (dashed and dotted lines)
problems with number of variables n = 10 (black lines) and n =
20 (gray lines) as a function of number of nogoods divided by
problem size, m/n. All problems were generated using the
``generate-select'' method except for the unsolvable problems
shown by the dotted line, which were generated using the
``hill-climbing'' method. For problems of size 10, each point is
the median of 1000 problems solved 100 times, except for
unsolvable problems generated by ``generate-select'' at m/n = 3
(30 nogoods) and solvable problems at m/n = 14 (140 nogoods),
which are based on 100 problems. For problems of size 20, each
point is the median of 500 problems solved 100 times, except for
unsolvable problems at m/n = 5 (100 nogoods) and solvable
problems at m/n = 12 (240 nogoods), which are based on 15 and 35
problems, respectively. Error bars showing 95% confidence
intervals are included, but in most cases are smaller than the
size of the plotted points.

Figure 2 shows the median dynamic backtracking
solution cost for solvable and unsolvable random CSPs generated as
described above, for problems with number of variables n=10 and
n=20, with domain size three. Except where specified otherwise in
the figure caption, for problems of 10 variables we generated 1000
solvable and 1000 unsolvable problems for each point, and for problems
of 20 variables we generated 500 solvable and 500 unsolvable problems
for each point, using the ``generate-select'' method. We also
generated unsolvable problems of 10 variables with 10 to 70 nogoods
using the ``hill-climbing'' method. We overlap the range of problems
generated by the two methods to show how the different generation
methods affect search cost.

This figure clearly shows the easy-hard-easy pattern of solution cost
for both solvable and unsolvable problems, for both problem sizes.
The two methods of generating unsolvable problems give distinct
curves: the unsolvable problems generated by the ``hill-climbing''
method are harder than those generated by the ``generate-select''
method. Nonetheless, both sets of problems show the same
easy-hard-easy pattern.

Figure 3: Median solution cost for 3-coloring random graphs with
100 nodes as a function of connectivity using backtrack
search with the Brelaz heuristic. The solid and dashed curves
correspond to solvable and unsolvable cases respectively. These
results started with 100,000 random graphs at each value of ,
and additional samples were generated at the extremes to produce at
least 100 samples for each point. For random graphs, the crossover
from mostly solvable to mostly unsolvable occurs around a
connectivity of 4.5. Error bars showing 95% confidence intervals
are included.

Another example with the same behavior is shown in Figure 3
for the median search cost for instances of 3-coloring of random graphs.
In contrast to Figure 2, the solvable and unsolvable
cases have similar median search costs near the peaks. This is because,
as described above, the graph coloring searches for unsolvable cases
used the symmetry with respect to permutations of the colors to avoid
unnecessary search. Without this optimization, the costs for unsolvable
cases would be six times greater than the values shown in the figure.
Similar peaks are seen for other classes of graphs, such as connected ones,
although at somewhat different values of .

These data show that both random CSPs and graph coloring problems
exhibit an easy-hard-easy pattern for solvable and unsolvable problems
considered separately.

A peak in search cost for solvable problems such as we observed has
also been seen extensively in studies of local-repair search methods
and for problems generated with a pre-specified
solution [22, 14, 21]. These search methods
start with some assignment to all of the variables in the problem and
then attempt to adjust them until a solution is found. Generally,
such methods are not systematic searches: they can never determine
that a problem has no solution. Thus empirical studies of these
methods are restricted to consider solvable problems and incidentally
provide a useful examination of the properties of solvable problems.

Furthermore, a study of satisfiability problems with backtracking search is
consistent with a peak in cost for solvable problems [16],
but there were insufficient highly constrained solvable problems to make a
definite conclusion for the behavior with many constraints.

Figure 4: Mean (solid) and median (dashed) number of solutions on a log
scale as a function of the number of binary nogoods, for solvable
problems with 10 variables, 3 values each, based on 1000 problems
generated by the ``generate-select'' method at each multiple of 10
binary nogoods, except for 140 nogoods, which is based on 100
problems. At 0 nogoods there are = 59049 solutions.
Error bars showing 95% confidence intervals are included.

How does the existence of a peak for solvable problems fit with the
explanation given above? Certainly an explanation based on a
transition from solvable to unsolvable problems cannot apply directly
to the class of solvable problems. However, the competition between
increased pruning and decreased number of solutions still applies. As
shown in Figure 4, the number of solutions for
solvable random CSPs of size 10 at first decreases rapidly as
constraints are added but then nears its minimum value of one, giving
a slower decrease. Except for the change in minimum value from 0 to 1
solution, this behavior for the number of solutions is qualitatively
similar to that for the general case including both solvable and
unsolvable problems. The additional constraints continue to increase
the pruning of unproductive search paths. Thus the explanation given
above might continue to apply but now predicts the peak will be at the
point where solutions can drop no further (i.e., one solution) rather
than becoming unsolvable (i.e., zero solutions).

Figure 5: Fraction of problems with at least two solutions
as a function of number of nogoods divided by problem size, for
problems of size 10 (black line) and size 20 (gray line). Data
for problems of size 10 are based on 1000 solvable problems
created by the ``generate-select'' method at each point, except
for 100 solvable problems at m/n = 14 (140 nogoods).
Data for problems of size 20 are based on 500 solvable problems at
each point, except for 20 solvable problems at m/n =
12 (240 nogoods), also created by the ``generate-select'' method.
Error bars showing 95% confidence intervals are included.

Figure 5 evaluates this idea. This figure shows
how the fraction of problems with at least two solutions changes as a
function of the number of nogoods divided by the problem size for
random CSPs with 10 and 20 variables. For problems of size 10, the
second to last solution disappears, on average, between 90 and 100
nogoods: the median number of solutions has dropped to 2 by 90
nogoods, and to 1 by 100 nogoods (Figure 4). The
peak in solution cost for solvable problems is slightly lower than
this, at between 80 and 90 nogoods, close to the crossover point of
Figure 5 where half the solvable problems have only
one solution. This is perhaps close enough to be consistent with the
explanation given above. However, this relationship does not hold for
problems of size 20. For this class of problems, the cost peak of
solvable problems is at around 180 nogoods (m/n=9), whereas the
point at which half the problems have just one solution has still not
been reached by 240 nogoods (m/n=12). At 180 nogoods, the median
number of solutions is 4 (mean is 10.0), and at 240 nogoods, the
median is still 2 (mean is 1.83). This is inconsistent with the
explanation that the cost peak for solvable problems is due to the
increasing effect of pruning given no possible further decrease in
number of solutions.

Since the explanation depending on a change to insolubility does not
apply, and the pruning versus number of solutions explanation does not
fit the data, some other factors must be at work to produce the
easy-hard-easy pattern for solvable problems. We suspect the
explanation is related to the idea of minimal unsolvable subproblems.
A minimal unsolvable subproblem is a subproblem that is unsolvable,
but for which any subset of variables and their associated constraints
is solvable; [9] have referred to this aspect of SAT
problems as the minimal unsatisfiable subset. The idea is that once a
few bad choices have been made initially, such that the remainder of
the problem becomes unsolvable, unsolvability is much harder to
determine for some problems than for others. In particular, the more
variables that are involved in a minimal unsolvable subproblem, the
harder it is to determine that the subproblem is unsolvable. We make
the conjecture that the cost peak for solvable problems is tied to the
average size of the minimal unsolvable subproblem once a choice has
been made that results in the remaining problem being unsolvable.

Figure 6: Median solution cost as a function of number of
nogoods for problems of 10 variables, 3 values each, with exactly
one solution, generated using the ``generate-select'' method (solid
line), and by hill-climbing down to one solution starting from
solvable problems with many solutions produced using
``generate-select'' (dotted line), solved using dynamic backtracking.
Each point is the median of 1000 problems each solved 100 times,
except for hill-climbing generated problems at 25, 30 and 35 nogoods
and ``generate-select'' generated problems at 140 nogoods, of which
there are 100. Error bars showing 95% confidence intervals are
included.

A more interesting case is the behavior of the problems with no
solutions shown in Figures 2 and 3. As
a further example, Figure 6 shows the
solution cost for problems with exactly one solution. This also shows
a peak. These observations on problems with zero or one solution show
that even with the number of solutions held constant, problems exhibit
an easy-hard-easy pattern of solution cost.

According to the explanation of the transition, if the number of
solutions is held constant then the increase in pruning will be the
only factor, giving rise to a monotonic decrease in search cost as
constraints are added. Instead, we see in Figures
2, 3 and 6
that even when the number of solutions is held fixed at zero or one,
there is still a peak in solution cost, and at a smaller number of
nogoods. Thus the existing explanation does not capture the full
range of behaviors. Instead, it appears that there are other factors
at work in producing hard problems. By focusing more closely on these
factors we can hope to gain a better understanding of the structure of
hard problems, which may lead to more precise predictions of search
cost.

Figure 7: Comparison of median solution cost on a log scale
using the same sets
of unsolvable problems for chronological backtracking (black) and
dynamic backtracking (gray). Dotted lines are for problems
generated using the ``hill-climbing'' method, solid lines for the
``generate-select'' method. Each point is the median of 1000
problems each solved 100 times, except for the ``generate-select''
method at 30 nogoods, which is based on 100 problems. Error bars
showing 95% confidence intervals are included, but are smaller than
the size of the plotted points.

We also investigated the effect of algorithm on the pattern of
solution cost in unsolvable problems by repeating the search of random
CSPs using chronological backtrack. A comparison of chronological
backtracking search with our previous dynamic backtrack search results
for unsolvable problems is shown in Figure 7. In
this figure, the curves for dynamic backtracking are the same as those
for the unsolvable problems shown in Figure 2,
except that here the cost curves are shown on a logarithmic scale.
Interestingly, we do not see a peak in search cost for unsolvable
problems using the less sophisticated method of chronological
backtrack.

This observation raises an important point: the easy-hard-easy pattern
is not a universal feature of search algorithms for problems
restricted to a fixed number of solutions. This suggests that the
competition between number of solutions and pruning, when it occurs,
is sufficiently powerful to affect most search algorithms (very simple
methods, such as generate-and-test, do not make use of pruning and
show a monotonic increase in search cost as the number of solutions
decreases), but that only some algorithms are able to exploit the
features of weakly constrained problems with a fixed number of
solutions that make them easy.

In contrast to our observations, a monotonic decrease in cost has been
reported for unsolvable binary random constraint
problems [19] and for unsolvable 3SAT
problems [16]. In the case of 3SAT, the explanation
may well be choice of algorithm. Indeed, [2] recently
found that incorporating conflict-directed backjumping and learning
into the Tableau algorithm made a difference of many orders of
magnitude in problem difficulty specifically for rare, ``exceptionally
hard,'' unsatisfiable problems in the underconstrained region. It
would be interesting to see whether the easy-hard-easy pattern for
unsolvable problems would appear using their algorithm.

With respect to [19]'s observations, the difference may be
due to the range of problems generated, resulting from different
problem generation methods. Smith and Dyer used a method akin to our
``random'' generation method, that is, generating problems without
regard for solvability, then separating out the unsolvable ones. With
this method, the ``hit rate'' for unsolvable problems in the
underconstrained region is very low. It is possible that Smith and
Dyer's data do not extend down to the point at which the cost of
unsolvable problems begins to decrease simply because they stopped
finding unsolvable problems before that point.

There are two possible reasons why we might have found unsolvable
problems using random generation further into the underconstrained
region, where Smith and Dyer did not. One possibility is that since
we were specifically interested in unsolvable problems as far into the
underconstrained region as possible, we may have spent more
computational effort generating in that region. Indeed, at 40
nogoods, unsolvable problems occurred with frequency , and at 30 nogoods, with frequency . At
that rate, problems at 30 nogoods took about six hours apiece to
generate.

A second possibility relates to the details of the generation methods.
In Smith and Dyer's random generation method, every pair of variables
had exactly the same number of inconsistent value pairs between them.
This implies a degree of homogeneity in the distribution of the
nogoods. On the other hand, in our random generation method, each
variable-value pair had an equal probability of being selected as a
nogood, independent of one another. Thus it was at least possible in
our generation method, though still of low likelihood, for the nogoods
to occasionally clump, and to produce an unsolvable problem. This
idea is discussed further in Section 5.

The difference in our observation and [19]'s reinforces an
important point: that a relatively subtle difference in generation
methods can affect the class of problems generated. While the nogoods
will be more or less evenly distributed on average using our
generation method, they will also be clumped with some probability,
whereas with Smith and Dyer's generation method, a homogeneous
distribution over variable pairs is guaranteed. These types of
problems could be different enough to sometimes produce different
behavior.