Finding Admissible Bounds for Over-subscribed Planning Problems

J. Benton

Menkes van den Briel

Subbarao Kambhampati

Arizona State University

Is this plan “good”?

How good is a given plan?

How to drive a planner to find a good plan?

Related

Admissible heuristics

Need a heuristic schema that admits degrees of relaxation

Helps per-node use

Helps one-shot use

Especially important when quality may vary widely, e.g., when we have many soft goals

Challenges

1. Build a strong admissible heuristic

2. Provide a way to add relaxation for varied use

An integer programming (IP) based heuristic

Use the linear programming (LP) relaxation

PSPUD

Partial Satisfaction Planning with Utility Dependency

Actions have cost; goal sets have utility

utility((at t loc1)) = 10

utility((at p1 loc2)) = 10

utility((at t loc1) & (at p1 loc2)) = 60

S0: (at t loc1)(in p1 t); util(S0): 10; net benefit(S0): 10-0=10

(move t loc2), cost: 20

S1: (at t loc2)(in p1 t); util(S1): 0; sum cost: 20; net benefit(S1): 0-20=-20

(unload p1 loc2), cost: 5

S2: (at t loc2)(at p1 loc2); util(S2): 10; sum cost: 25; net benefit(S2): 10-25=-15

(move t loc1), cost: 20

S3: (at t loc1)(at p1 loc2); util(S3): 10+10+60=80; sum cost: 45; net benefit(S3): 80-45=35
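The net-benefit arithmetic in this example can be reproduced with a short sketch. It assumes the goal dependency is taken over (at t loc1) and (at p1 loc2), which is the reading consistent with util(S3) = 10+10+60; the names `GOAL_UTILITY`, `ACTION_COST`, `utility`, and `net_benefit` are illustrative and not from the slides.

```python
# Goal-set utilities from the example. The dependency is assumed to be over
# (at t loc1) and (at p1 loc2), matching util(S3) = 10 + 10 + 60.
GOAL_UTILITY = {
    frozenset({"(at t loc1)"}): 10,
    frozenset({"(at p1 loc2)"}): 10,
    frozenset({"(at t loc1)", "(at p1 loc2)"}): 60,
}

ACTION_COST = {"(move t loc2)": 20, "(unload p1 loc2)": 5, "(move t loc1)": 20}


def utility(state):
    """Sum the utility of every goal set fully contained in the state."""
    return sum(u for goals, u in GOAL_UTILITY.items() if goals <= state)


def net_benefit(state, plan):
    """Utility of the reached state minus the summed cost of the plan."""
    return utility(state) - sum(ACTION_COST[a] for a in plan)


STATES = [
    frozenset({"(at t loc1)", "(in p1 t)"}),     # S0
    frozenset({"(at t loc2)", "(in p1 t)"}),     # S1
    frozenset({"(at t loc2)", "(at p1 loc2)"}),  # S2
    frozenset({"(at t loc1)", "(at p1 loc2)"}),  # S3
]
PLAN = ["(move t loc2)", "(unload p1 loc2)", "(move t loc1)"]

# Net benefit of stopping after 0, 1, 2, or 3 actions.
benefits = [net_benefit(s, PLAN[:i]) for i, s in enumerate(STATES)]
print(benefits)  # [10, -20, -15, 35]
```

Stopping at S0 already yields a net benefit of 10, so a planner must look past two states with negative net benefit to reach the better value at S3.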

Building a Heuristic

A network flow model on variable transitions

Capture relevant transitions with multi-valued fluents

add prevail constraints

add initial states

add goal states

add cost on actions

add utility on goals

[Figure: transition networks for the truck and package state variables over loc1 and loc2, with edges labeled cost: 20 and cost: 5 and goal values labeled util: 10, util: 10, and util: 60]

Building a Heuristic

[Figure: the truck and package transition networks, annotated with cost: 20, cost: 5, util: 10, and util: 60]

Constraints of this model

1. If an action executes, then all of its effects and prevail conditions must also.

2. If a fact is deleted, then it must be added to re-achieve a value.

3. If a prevail condition is required, then it must be achieved.

4. A goal utility dependency is achieved if its goals are achieved.
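Constraint 2 is a flow-balance condition on each value of each state variable: the initial-state indicator plus the number of transitions that add the value must equal the number of transitions that delete it plus the end-value indicator. A minimal sketch of that check on the truck and package example follows; the tuple encoding of transitions and the function name `flow_balanced` are assumptions made for illustration.

```python
from collections import Counter

# Each executed transition: (state variable, value deleted, value added).
TRANSITIONS = [
    ("truck", "loc1", "loc2"),   # (move t loc2)
    ("package", "t", "loc2"),    # (unload p1 loc2)
    ("truck", "loc2", "loc1"),   # (move t loc1)
]
INITIAL = {"truck": "loc1", "package": "t"}
END = {"truck": "loc1", "package": "loc2"}


def flow_balanced(transitions, initial, end):
    """Check the flow-balance condition for every (variable, value) pair."""
    adds, deletes = Counter(), Counter()
    for var, old, new in transitions:
        deletes[(var, old)] += 1
        adds[(var, new)] += 1
    keys = set(adds) | set(deletes) | set(initial.items()) | set(end.items())
    # Booleans count as 0/1, playing the role of the indicator terms.
    return all(
        (initial.get(var) == val) + adds[(var, val)]
        == deletes[(var, val)] + (end.get(var) == val)
        for var, val in keys
    )


print(flow_balanced(TRANSITIONS, INITIAL, END))      # True
print(flow_balanced(TRANSITIONS[:2], INITIAL, END))  # False
```

The second call drops the final move, so the truck can no longer end at loc1 and the balance fails. The check counts transitions only and ignores their order, which is what makes the model a relaxation.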

Formulation

Variables

action(a) ∈ Z+: The number of times a ∈ A is executed

effect(a,v,e) ∈ Z+: The number of times a transition e in state variable v is caused by action a

prevail(a,v,f) ∈ Z+: The number of times a prevail condition f in state variable v is required by action a

endvalue(v,f) ∈ {0,1}: Equal to 1 if value f is the end value in a state variable v

goaldep(k) ∈ {0,1}: Equal to 1 if a goal dependency is achieved

Parameters

cost(a): the cost of executing action a ∈ A

utility(v,f): the utility of achieving value f in state variable v

utility(k): the utility of achieving goal dependency Gk

1. If an action executes, then all of its effects and prevail conditions must also.

action(a) = Σ_{effects of a in v} effect(a,v,e) + Σ_{prevails of a in v} prevail(a,v,f)

2. If a fact is deleted, then it must be added to re-achieve a value.

1{if f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) = Σ_{effects that delete f} effect(a,v,e) + endvalue(v,f)

3. If a prevail condition is required, then it must be achieved.

1{if f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) ≥ prevail(a,v,f) / M, where M is a sufficiently large constant

4. A goal utility dependency is achieved if its goals are achieved.

goaldep(k) ≥ Σ_{f in dependency k} endvalue(v,f) - (|Gk| - 1)

goaldep(k) ≤ endvalue(v,f) ∀ f in dependency k
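The two inequalities of constraint 4 are the usual linear encoding of a conjunction over 0/1 variables. The sketch below enumerates a two-goal dependency to show that they leave exactly one feasible value for goaldep(k). It assumes the lower bound is read as the sum of end values minus (|Gk| - 1); the function name `feasible_goaldep` is illustrative.

```python
from itertools import product


def feasible_goaldep(end_values):
    """Return the goaldep values in {0, 1} allowed by both inequalities."""
    n = len(end_values)
    return [
        g for g in (0, 1)
        if g >= sum(end_values) - (n - 1)    # forces 1 when all goals hold
        and all(g <= e for e in end_values)  # forces 0 when any goal fails
    ]


# For every 0/1 assignment of a two-goal dependency, exactly one value remains.
table = {ev: feasible_goaldep(ev) for ev in product((0, 1), repeat=2)}
print(table)
# {(0, 0): [0], (0, 1): [0], (1, 0): [0], (1, 1): [1]}
```

With integer end values the pair of constraints pins goaldep(k) to the logical AND of its goals. Once the variables are relaxed to fractions this no longer holds exactly, which is one way an LP value can exceed the IP value.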

Objective Function

Maximize Net Benefit

Σ_{v∈V, f∈Dv} utility(v,f) endvalue(v,f) + Σ_{k∈K} utility(k) goaldep(k) - Σ_{a∈A} cost(a) action(a)
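To make the objective concrete, the following sketch evaluates it for one fixed assignment taken from the truck and package example earlier in the talk. The dictionary layout, the dependency label `"k1"`, and the function name `net_benefit_objective` are assumptions for illustration; a real encoding would hand these terms to an IP or LP solver rather than evaluate them directly.

```python
def net_benefit_objective(action, endvalue, goaldep, cost, utility, dep_utility):
    """Goal utility plus dependency utility minus total action cost."""
    return (
        sum(utility[vf] * x for vf, x in endvalue.items())
        + sum(dep_utility[k] * x for k, x in goaldep.items())
        - sum(cost[a] * n for a, n in action.items())
    )


cost = {"move t loc2": 20, "unload p1 loc2": 5, "move t loc1": 20}
utility = {("truck", "loc1"): 10, ("package", "loc2"): 10}
dep_utility = {"k1": 60}  # hypothetical label for the single goal dependency

# Assignment corresponding to the three-action plan that reaches S3.
action = {"move t loc2": 1, "unload p1 loc2": 1, "move t loc1": 1}
endvalue = {("truck", "loc1"): 1, ("package", "loc2"): 1}
goaldep = {"k1": 1}

value = net_benefit_objective(action, endvalue, goaldep, cost, utility, dep_utility)
print(value)  # 35

# Assignment for the empty plan: the truck simply stays at loc1.
empty = net_benefit_objective(
    {a: 0 for a in cost},
    {("truck", "loc1"): 1, ("package", "loc2"): 0},
    {"k1": 0},
    cost, utility, dep_utility,
)
print(empty)  # 10
```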

Experimental Setup

Three modified IPC 3 domains: zenotravel, satellite, rovers (maximize net benefit)

One IPC 5 domain: Rovers, simple preferences (minimize (goal achievement violations + action cost))

Compared with a cost propagation-based heuristic

Heuristic value at initial state versus optimal plan

Found using a branch and bound search

Maximizing: LP > IP > OPTIMAL

Minimizing: LP < IP < OPTIMAL
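The ordering for maximization follows from the fact that each relaxation enlarges the feasible set, so the optimum can only go up. The sketch below illustrates the LP-versus-IP step on a toy 0/1 knapsack, which is a stand-in and not the planning encoding from this talk; the item data and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ITEMS = [(60, 10), (100, 20), (120, 30)]  # (value, weight), hypothetical data
CAPACITY = 50


def integer_optimum(items, capacity):
    """Best value over all 0/1 selections (brute force)."""
    return max(
        sum(v for (v, _), x in zip(items, pick) if x)
        for pick in product((0, 1), repeat=len(items))
        if sum(w for (_, w), x in zip(items, pick) if x) <= capacity
    )


def lp_relaxation_optimum(items, capacity):
    """Best value when each x may be any fraction in [0, 1].

    For a single knapsack constraint, greedy selection by value/weight
    ratio solves the LP relaxation exactly.
    """
    total, room = Fraction(0), Fraction(capacity)
    for v, w in sorted(items, key=lambda it: Fraction(it[0], it[1]), reverse=True):
        take = min(Fraction(1), room / w)
        total += v * take
        room -= w * take
    return total


ip = integer_optimum(ITEMS, CAPACITY)
lp = lp_relaxation_optimum(ITEMS, CAPACITY)
print(ip, lp)  # 220 240
```

The relaxed optimum (240) is an upper bound on the integer optimum (220). When minimizing, the same argument gives a lower bound instead.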

Results

[Figure: results for the IP and LP heuristics]

Summary

IP gives bound on quality of plan

Doubly relaxed (LP) to provide heuristic for search (Search I Session: Monday at 4:10 pm)

Future Work

Improve encoding (to give better LP values)

Use fluent merging
