The additive heuristic does not take reuse of actions (other than the
dummy action a0) into account, so it often overestimates the actual
number of actions needed to complete a plan. The need to take
positive interaction into account in order to obtain a more accurate
heuristic estimate has been recognized in both state space planning
(Nguyen & Kambhampati, 2000; Hoffmann & Nebel, 2001; Refanidis & Vlahavas, 2001)
and plan space planning (Nguyen & Kambhampati, 2001).
For IPC3 we used a slight modification of the additive heuristic to
address the issue of action reuse:

haddr() =

The underlying assumption for this heuristic cost function is that an
open condition
ai that can possibly be resolved by
linking to the effect of an existing action aj will not give rise
to a new action when resolved. This can of course lead to an overly
optimistic estimate of the number of actions required to complete the
plan. The modified heuristic is still not admissible, however, since
the same cost value as before is used for open conditions that cannot
be linked to effects of existing actions. In other words, we only
account for possible reuse of existing actions and not potential
actions.

To illustrate the difference between
hadd() and
haddr() consider a planning domain with two
action schemas A1 and A2, where A1 has no preconditions and
A2 has a single precondition q. Assume that q can only be
achieved through an action instance of A1. The heuristic cost for
the literal q is therefore 1 according to the additive heuristic.
Consider now a plan with two unordered actions a1 and a2
(ai being an instance of action schema Ai) and a single open
condition
a2. We have
hadd() = hadd(q) = 1 corresponding to the addition of a new instance
of action schema A1 to achieve q, but
haddr() = 0 because there is an action (viz. a1) that is not ordered after a2 and has an effect that unifies
with q. Table 1 shows that taking reuse into
account can have a significant impact on planning time in practice.
The modified additive heuristic
haddr clearly
dominates
hadd in the DriverLog and ZenoTravel domains
despite incurring a higher overhead per generated plan. The results
in the Satellite domain are more mixed, with
hadd having a
slight edge overall. We show planning times for the four flaw
selection strategies that were used by VHPOP at IPC3. These
and other novel flaw selection strategies are discussed in detail in
Section 3.2.2.

Table:
Planning times in seconds using different flaw selection strategies for a selection of problems in the DriverLog, ZenoTravel, and Satellite domains, showing the impact of taking reuse into account in the plan ranking heuristic. A dash (-) means that the planner ran out of memory (512Mb).

Problem

MW-Loc

MW-Loc-Conf

LCFR-Loc

LCFR-Loc-Conf

hadd

haddr

hadd

haddr

hadd

haddr

hadd

haddr

DriverLog6

8.65

0.16

4.41

0.13

87.58

2.01

-

1.16

DriverLog7

3.66

0.34

0.63

0.17

21.15

1.28

1.57

0.22

DriverLog8

-

-

110.26

1.48

-

177.27

-

2.05

DriverLog9

-

0.33

-

0.28

-

-

-

-

DriverLog10

4.13

2.11

0.71

0.76

3.79

0.64

1.30

0.83

ZenoTravel6

-

0.93

17.41

2.90

25.09

0.95

11.24

2.82

ZenoTravel7

-

-

-

37.81

-

-

-

33.10

ZenoTravel8

-

15.48

-

37.99

-

-

-

6.45

ZenoTravel9

-

86.21

-

11.53

-

33.37

26.33

9.49

ZenoTravel10

-

26.59

-

21.22

-

21.20

-

18.22

Satellite6

0.36

0.22

0.37

0.24

0.32

0.21

0.40

0.24

Satellite7

0.49

0.37

0.54

0.84

0.55

0.51

0.62

-

Satellite8

1.09

-

1.29

0.84

0.85

0.83

1.25

0.68

Satellite9

2.41

-

2.11

-

1.84

-

2.50

-

Satellite10

1.53

1.12

1.95

1.11

1.50

1.36

2.08

1.37

Hoffmann and Nebel (2001) describe the FF
heuristic that takes positive interaction between actions into account
by extracting a plan from the relaxed planning graph1, and argue that the accounting of action reuse is
one of the contributing factors to FF's performance advantage
over HSP. The FF heuristic can take reuse of
potential actions into account, and not just existing actions as is
the case with our modified additive heuristic. This should result in
a better estimate of actual plan cost, but requires that a plan is
extracted from the relaxed planning graph for every search node, which
could be costly. It would be interesting to see how the FF
heuristic performs if used in a plan space planner.

The heuristic cost function used in REPOP
(Nguyen & Kambhampati, 2001), a heuristic partial
order planner working solely with ground actions, is defined using a
serial planning graph.2 The heuristic is similar in spirit to the FF
heuristic, and can like the FF heuristic take reuse of potential
actions into account. The REPOP heuristic also takes into
account reuse of existing actions, but seemingly without considering
ordering constraints, which is something we do in our modified
additive heuristic. Furthermore, our
haddr
heuristic always takes reuse of any existing actions that achieves a
literal q into account, while the REPOP heuristic only
considers an existing action if it happens to be selected from the
serial planning graph as the the action that achieves q. The
results in Table 2 indicate that the REPOP
heuristic may be less effective than the additive heuristic (with and
without reuse) in certain domains.