LPG-td plan quality

At the 4th International Planning Competition (IPC4, held at ICAPS 2004 in Whistler, Canada), not all planners used the same notion of plan quality. Some planners used the number of plan actions, while others used the plan metric specified in the PDDL2.2 problem description (e.g., the plan makespan for temporal planning problems). LPG-td used the number of plan actions for simple STRIPS problems, and the specified plan metric for all the other variants of the test problems. We believe that, for temporal or numerical domains, the specified plan metric is a much more natural and practically useful measure than the number of plan actions. Overall, however, the solutions produced by LPG-td.quality and especially LPG-td.bestquality are better than those produced by the other IPC4 planners in terms of both the number of actions and the specified plan metric.

At IPC4, some domains had different formalizations (ADL or STRIPS), which the competitors were free to choose for their planner. The official results of IPC4, which are available from the IPC4 website, compared planners that addressed different formalizations of the same domain. Here we do the same, with the sole exception of the temporal variant of the Airport domain, because the problems in this domain have optimal solutions of different quality in the STRIPS and ADL formalizations. We note, however, that the results of our analysis are similar when, for every domain, planners are compared only if they use the same domain formalization.

Comparison of LPG-td.quality and the other planners of IPC4
(plan quality = problem-specified plan metric)

LPG-td.quality versus	Problems Solved by LPG-td/IPC4-Planner/Both	% Better Quality Plans minus % Worse Quality Plans	% Much Better Quality Plans minus % Much Worse Quality Plans
SGPlan	845 / 1090 / 771	57.6%	36.6%
Crickey	845 / 364 / 293	45.4%	36.2%
Downward-diagonally	845 / 380 / 305	49.2%	28.5%
Downward	845 / 360 / 296	41.2%	29.4%
Marvin	845 / 224 / 211	36.5%	28.4%
Yahsp	845 / 255 / 210	71.4%	47.1%
Macro-FF	845 / 189 / 138	84.8%	44.9%
Til-Sapa	845 / 63 / 63	82.5%	3.2%
P-mep	845 / 98 / 91	70.3%	39.6%
Roadmapper	845 / 52 / 51	84.3%	70.6%
FAP	845 / 81 / 28	71.4%	50.0%

Comment: Overall, these results show that, using the problem-specified plan metric (which for STRIPS problems was the "Graphplan plan length", i.e., the number of time steps), LPG-td.quality performs much better than the other planners.
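
The two percentage columns in the table above can be read as a difference of proportions. The following Python sketch shows one way such a figure could be computed. It is not the procedure used for the IPC4 analysis: it assumes that lower metric values are better, and that the percentages are taken over the problems solved by both planners (the reported figures are consistent with this, e.g. 14/28 = 50.0% for FAP). The function name quality_gap, the margin parameter used to model "much better", and the sample data are all hypothetical, since the text does not define the threshold for "much better".

```python
from typing import Dict, Optional


def quality_gap(lpg: Dict[str, float],
                other: Dict[str, float],
                margin: float = 0.0) -> Optional[float]:
    """Return (% better plans) - (% worse plans) for the first planner.

    Only problems solved by both planners are compared.
    Assumption: a lower metric value (e.g. makespan or action count) is better.
    `margin` is a hypothetical relative threshold: 0.0 counts any difference,
    a larger value counts only clearly better or clearly worse plans.
    """
    both = sorted(set(lpg) & set(other))
    if not both:
        return None  # no common problems, so the gap is undefined
    better = sum(1 for p in both if lpg[p] < other[p] * (1.0 - margin))
    worse = sum(1 for p in both if other[p] < lpg[p] * (1.0 - margin))
    return 100.0 * (better - worse) / len(both)


# Made-up plan metrics, keyed by problem name (not IPC4 data).
lpg = {"p01": 10, "p02": 20, "p03": 30, "p04": 15, "p06": 50}
other = {"p01": 12, "p02": 20, "p03": 45, "p05": 40, "p06": 30}

print(quality_gap(lpg, other))              # any difference counts
print(quality_gap(lpg, other, margin=0.2))  # only differences above 20% count
```

A positive value means the first planner produced better plans more often than worse ones on the commonly solved problems; a negative value means the opposite.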

Comparison of LPG-td.quality and the other planners of IPC4
(plan quality = number of plan actions)

LPG-td.quality versus	Problems Solved by LPG-td/IPC4-Planner/Both	% Better Quality Plans minus % Worse Quality Plans	% Much Better Quality Plans minus % Much Worse Quality Plans
SGPlan	845 / 1090 / 771	5.7%	6.1%
Crickey	845 / 364 / 293	40.6%	22.2%
Downward-diagonally	845 / 380 / 305	15.4%	-2.0%
Downward	845 / 360 / 296	6.8%	-0.7%
Yahsp	845 / 255 / 210	30.5%	14.3%
Macro-FF	845 / 189 / 138	22.5%	0.7%
Roadmapper	845 / 52 / 51	56.9%	15.7%
FAP	845 / 81 / 28	39.3%	35

Comment: In this table we consider only the planners that did not attempt to optimize the problem-specified plan metric. As their plan quality metric, these planners used the number of actions in the plan, or they simply returned any solution they could find, with no attempt to optimize plan quality. In terms of number of actions, the Downward planner found a few solutions that are much better than those found by LPG-td.quality; however, when plans with small differences in quality are also included in the comparison, LPG-td.quality in general performed better than Downward.

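Comparison of LPG-td.bestquality and the other planners of IPC4
(plan quality = problem-specified plan metric)

LPG-td.bestquality versus	Problems Solved by LPG-td/IPC4-Planner/Both	% Better Quality Plans minus % Worse Quality Plans	% Much Better Quality Plans minus % Much Worse Quality Plans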
SGPlan	845 / 1090 / 771	65.6%	41.0%
Crickey	845 / 364 / 293	56.3%	43.0%
Downward-diagonally	845 / 380 / 305	61.3%	36.1%
Downward	845 / 360 / 296	55.7%	36.8%
Marvin	845 / 224 / 211	46.0%	36.5%
Yahsp	845 / 255 / 210	84.8%	54.3%
Macro-FF	845 / 189 / 138	87.7%	49.3%
Til-Sapa	845 / 63 / 63	82.5%	3.2%
P-mep	845 / 98 / 91	70.3%	41.8%
Roadmapper	845 / 52 / 51	88.2%	72.5%
FAP	845 / 81 / 28	71.4%	53.6%

Comparison of LPG-td.bestquality and the other planners of IPC4
(plan quality = number of plan actions)

LPG-td.bestquality versus	Problems Solved by LPG-td/IPC4-Planner/Both	% Better Quality Plans minus % Worse Quality Plans	% Much Better Quality Plans minus % Much Worse Quality Plans
SGPlan	845 / 1090 / 771	12.5%	8.9%
Crickey	845 / 364 / 293	51.5%	27.3%
Downward-diagonally	845 / 380 / 305	27.5%	4.6%
Downward	845 / 360 / 296	22.3%	6.4%
Yahsp	845 / 255 / 210	41.9%	20.5%
Macro-FF	845 / 189 / 138	31.9%	4.3%
Roadmapper	845 / 52 / 51	62.7%	19.6%
FAP	845 / 81 / 28	42.9%	35.7%

Comment: In this table we consider only the planners that did not attempt to optimize the problem-specified plan metric. As their plan quality metric, these planners used the number of actions in the plan, or they simply returned any solution they could find, with no attempt to optimize plan quality. Overall, these results show that, in terms of the number of plan actions, LPG-td.bestquality performs much better than the other planners.