Qualitative Numeric Planning Siddharth Srivastava, Shlomo Zilberstein, Neil Immerman University of Massachusetts Amherst Hector Geffner Universitat Pompeu Fabra

Post on 30-Mar-2015

Page 1:

Qualitative Numeric Planning

Siddharth Srivastava, Shlomo Zilberstein, Neil Immerman
University of Massachusetts Amherst

Hector Geffner
Universitat Pompeu Fabra

Page 2:

The Story So Far…

Abacus Programs [Lambek, ’61]

▪ Finite sets of states & registers

▪ Actions with unit increments/decrements

Page 3:

Abacus Programs

The reachability problem for abacus programs as a method for reasoning about cyclic control flows

But reachability is equivalent to the halting problem for Turing machines…

Undecidable

Approach: identify subclasses or less expressive frameworks

… cannot capture TM, but still useful [Srivastava et al., ICAPS-10]

Page 4:

Non-Deterministic (ND) Quantitative Planning Problems

Consider situations where:
▪ Actions increase or decrease numeric variables by unpredictable amounts
▪ Propositional variables can be added

Plans require cyclic control

E.g., delivery problem with unknown:
▪ Fuel
▪ Distances
▪ Quantities of deliverables

Driving will use an unpredictable amount of fuel

Page 5:

Formulation

X: set of non-negative valued variables, O: set of actions, I: initial state, G: goal condition

States: numeric assignments to variables

Action effects:
▪ inc(x): increases value of variable x
▪ dec(x): decreases value of variable x

Actions may have multiple effects

Action preconditions & goal condition: x = 0 or x > 0, for some subset of variables

Lower bound on the amount of change is specific to the execution; it need not be known
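A minimal sketch of how such effects could be simulated, assuming an effect is written as a (variable, sign) pair, that every change is a random amount no smaller than a per-execution lower bound `eps`, and that a decrease never takes a variable below zero. The names `apply_effects`, `eps`, and `max_step` are illustrative assumptions, not part of the slides.

```python
import random

def apply_effects(state, effects, eps, max_step=5.0, rng=random):
    """Apply increase/decrease effects to a state (dict: variable -> value).

    effects: list of (variable, sign) with sign +1 (increase) or -1 (decrease).
    Each change is a random amount in [eps, max_step] (assumed distribution);
    decreases are clipped at zero so values stay non-negative.
    """
    new_state = dict(state)  # leave the caller's state untouched
    for var, sign in effects:
        delta = rng.uniform(eps, max_step)
        if sign > 0:
            new_state[var] += delta
        else:
            new_state[var] = max(0.0, new_state[var] - delta)
    return new_state
```

The planner never sees `eps` or `delta`; only the executor does, which is why a plan cannot count steps in advance.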

Page 6:

Example

A1: <>
A2: <>
Initial state: x=10, y=5; Goal: x=0
No finite acyclic solution!

Solution (intuitive):
repeat (until x=0) {
  repeat (until y=0) { <> }
  <>
}
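The action definitions did not survive in this transcript, so the sketch below assumes two hypothetical actions with the shape the nested loop suggests: an inner action that decreases y, and an outer action that decreases x and increases y. Step sizes are drawn at random above an assumed lower bound `eps`. It illustrates why the loop always ends even though the number of steps cannot be fixed in advance.

```python
import random

def run_nested_loop(x, y, eps=0.5, max_step=3.0, seed=0):
    """Execute the nested-loop policy with hypothetical actions; return (x, y, steps)."""
    rng = random.Random(seed)
    steps = 0
    while x > 0:
        while y > 0:
            # Hypothetical inner action: decrease y by an unpredictable amount.
            y = max(0.0, y - rng.uniform(eps, max_step))
            steps += 1
        # Hypothetical outer action: decrease x and increase y.
        x = max(0.0, x - rng.uniform(eps, max_step))
        y = y + rng.uniform(eps, max_step)
        steps += 1
    return x, y, steps
```

Each pass of the outer loop lowers x by at least `eps`, so x reaches zero after finitely many passes. Different seeds give different step counts, which is why no fixed acyclic action sequence works.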

Page 7:

ND Quantitative Planning Problems: Solutions

Policy: States → Actions

Policy trajectory for :

Solution criterion: every bounded policy trajectory must terminate at a goal state in finitely many steps.

But how do we express policies? Cannot map all possible states (real-valued assignments to variables)

Page 8:

Expressing Solutions: Qualitative Formulation

Capture sets of ND numeric planning problems

Abstract/qualitative states: for each x ∈ X, only record x = 0 or x > 0

Initial state for previous example abstracted to: x > 0, y > 0

Also represents infinitely many other non-zero assignments to x and y

Qualitative states capture sets of concrete states
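The abstraction can be sketched as a small function, assuming a concrete state is a dictionary from variable names to non-negative numbers. The name `abstract` and the frozenset encoding are illustrative choices.

```python
def abstract(state):
    """Map a concrete state (variable -> non-negative number) to a qualitative state.

    The result records, for each variable, only whether its value is positive.
    """
    return frozenset((var, value > 0) for var, value in state.items())
```

Under this encoding, x=10, y=5 and x=0.1, y=3000 map to the same qualitative state, while x=0, y=5 maps to a different one.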

Page 9:

Qualitative Formulation

X: Boolean variables; I: initial state, G: goal condition, O: action operators

State = Boolean assignment to each x ∈ X (x = 0 or x > 0)

Action effects (non-deterministic but finite)

Preconditions & goal condition

Page 10:

Solutions to the Qualitative Problem

Solutions represented as policies over qualitative states

Solution criterion: policy must solve every represented quantitative problem
▪ Termination of all ε-bounded trajectories for all possible problem instantiations
▪ Goal achievement in all possible problem instantiations

Page 11:

All We Need Are Qualitative Policies

A quantitative policy is essentially qualitative iff it maps all states represented by a qualitative state to the same action

Very useful: cannot have explicit policy representations over quantitative states anyway

Theorem

A non-deterministic quantitative planning problem P has a solution policy iff P has a solution policy that is essentially qualitative
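One way to picture an essentially qualitative policy is as a lookup on the abstracted state. The sketch below assumes qualitative states are encoded as frozensets of (variable, is_positive) pairs and that the policy is a plain dictionary; `lift` and the action names in the usage note are hypothetical.

```python
def lift(qualitative_policy):
    """Turn a policy over qualitative states into a policy over concrete states."""
    def quantitative_policy(state):
        # Abstract the concrete state, then look up the action for that abstraction.
        key = frozenset((var, value > 0) for var, value in state.items())
        return qualitative_policy.get(key)  # None where the policy is undefined
    return quantitative_policy
```

By construction, the lifted policy returns the same action for every concrete state that shares an abstraction, for example x=10, y=5 and x=0.01, y=700.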

Page 12:

Solution Policy: Example

[Figure: policy and its transition graph]

Page 13:

Qualitative Solution Tests

Can we tell if a policy is correct without ever having to instantiate the problems?

Define the transition graph for a policy:
▪ Nodes = qualitative states
▪ Edge (s, s′) iff s′ is a possible result of applying the policy's action in s

Two aspects of the solution criteria:
▪ Goal-closed: termination possible only at goal states (traverse the transition graph to check this)
▪ Finiteness of all possible instantiated trajectories: ??
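A sketch of the transition graph and the goal-closed check, under several assumptions: qualitative states are frozensets of (variable, is_positive) pairs, an action is a dictionary from variables to "inc" or "dec", an increase makes a variable positive, a decrease of a positive variable may leave it positive or bring it to zero, and the policy only selects applicable actions (preconditions are omitted for brevity). All function names are illustrative.

```python
from itertools import product

def successors(state, effects):
    """Qualitative successors of `state` (dict: variable -> True if positive)."""
    options = []
    for var, positive in sorted(state.items()):
        kind = effects.get(var)
        if kind == "inc":
            options.append([(var, True)])
        elif kind == "dec" and positive:
            options.append([(var, True), (var, False)])  # non-deterministic outcome
        else:
            options.append([(var, positive)])  # variable unaffected
    return [frozenset(combo) for combo in product(*options)]

def transition_graph(initial, policy, actions):
    """Explore the qualitative states reachable from `initial` under `policy`."""
    start = frozenset(initial.items())
    graph, frontier = {}, [start]
    while frontier:
        node = frontier.pop()
        if node in graph:
            continue
        action = policy.get(node)
        # A node where the policy is undefined is terminal (no outgoing edges).
        graph[node] = successors(dict(node), actions[action]) if action else []
        frontier.extend(graph[node])
    return graph

def is_goal_closed(graph, is_goal):
    """True iff every terminal node of the graph satisfies the goal."""
    return all(is_goal(dict(node)) for node, succ in graph.items() if not succ)
```

The check only traverses the finite graph of qualitative states, so no concrete problem instance is ever instantiated.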

Page 14:

Sieve Algorithm for Determining Termination

For every SCC (strongly connected component):
• Identify edges that cannot be executed infinitely often
• Remove them, signifying the stage when their executions have been exhausted
• Recurse on each resulting SCC
• Finally: terminating iff no SCC is left at the fixed point
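The steps above can be sketched as follows. The slide does not say which edges "cannot be executed infinitely often"; this sketch assumes they are the edges whose action decreases some variable that no edge in the same SCC increases. Edges are (source, target, action) triples, actions map variables to "inc" or "dec", and the simple reachability search is a stand-in for a proper SCC algorithm, chosen for brevity on small graphs.

```python
def reachable(edges, start):
    """Set of nodes reachable from `start` by following one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for (u, v, _a) in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def sieve_terminates(edges, actions):
    """Return True iff repeated edge removal leaves no cycle (assumed removal rule)."""
    edges = set(edges)
    while True:
        reach = {u: reachable(edges, u) for (u, _v, _a) in edges}
        # An edge lies inside an SCC iff its target can reach its source.
        cyclic = {(u, v, a) for (u, v, a) in edges if u in reach.get(v, set())}
        if not cyclic:
            return True  # no SCC left at the fixed point
        removable = set()
        for (u, v, a) in cyclic:
            # Edges of the SCC that contains u.
            scc = {(p, q, b) for (p, q, b) in cyclic
                   if p in reach[u] and u in reach[p]}
            increased = {x for (_p, _q, b) in scc
                         for x, kind in actions[b].items() if kind == "inc"}
            if any(kind == "dec" and x not in increased
                   for x, kind in actions[a].items()):
                removable.add((u, v, a))
        if not removable:
            return False  # an SCC survives with nothing left to remove
        edges -= removable
```

Recomputing the cyclic edges after every removal plays the role of recursing on each resulting SCC.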

Page 15:

Sieve Algorithm: Properties

Completeness: if the sieve algorithm returns non-terminating, an infinite execution is possible
▪ Surprising because of similarity to abacus programs

Theorem

The sieve algorithm for determining termination of a qualitative policy is sound and complete.

Page 16:

A Generate and Test Planner

Enumerate all possible policies (yes, this is impractical in general, but computable!)

Check each for:

1. Goal-closed (any terminal nodes in transition graph must be goal nodes)

2. Termination using sieve algorithm
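A compact, self-contained sketch of such a generate-and-test loop for tiny problems. It rests on assumptions not stated on the slides: states are tuples of (variable, is_positive) pairs, each action is a (preconditions, effects) pair, goal states are treated as terminal, and the termination test removes edges whose action decreases a variable that no edge in the same SCC increases. All names are hypothetical.

```python
from itertools import product

def successors(state, effects):
    """inc makes a variable positive; dec of a positive one may leave it positive or zero."""
    options = []
    for var, positive in state:
        kind = effects.get(var)
        if kind == "inc":
            options.append([(var, True)])
        elif kind == "dec" and positive:
            options.append([(var, True), (var, False)])
        else:
            options.append([(var, positive)])
    return [tuple(combo) for combo in product(*options)]

def reaches(edges, start, target):
    """True if `target` is reachable from `start` via one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for u, v, _ in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return target in seen

def terminates(edges, actions):
    """Sieve-style termination test under the assumed edge-removal rule."""
    edges = set(edges)
    while True:
        cyclic = {e for e in edges if reaches(edges, e[1], e[0])}
        if not cyclic:
            return True
        removable = set()
        for u, v, a in cyclic:
            scc = [b for p, _, b in cyclic
                   if reaches(cyclic, u, p) and reaches(cyclic, p, u)]
            increased = {x for b in scc for x, k in actions[b][1].items() if k == "inc"}
            if any(k == "dec" and x not in increased for x, k in actions[a][1].items()):
                removable.add((u, v, a))
        if not removable:
            return False
        edges -= removable

def generate_and_test(variables, actions, initial, is_goal):
    """Return the first policy that is goal-closed and terminating, else None."""
    states = [tuple(zip(variables, bits))
              for bits in product([True, False], repeat=len(variables))]
    open_states = [s for s in states if not is_goal(dict(s))]
    # Each non-goal state gets either no action (None) or any applicable action.
    choices = [[None] + [name for name, (pre, _) in actions.items()
                         if all(dict(s)[v] == val for v, val in pre.items())]
               for s in open_states]
    for combo in product(*choices):
        policy = dict(zip(open_states, combo))
        edges, seen, frontier, goal_closed = set(), set(), [initial], True
        while frontier and goal_closed:
            s = frontier.pop()
            if s in seen or is_goal(dict(s)):
                continue
            seen.add(s)
            if policy[s] is None:
                goal_closed = False  # execution would stop at a non-goal state
                continue
            for t in successors(s, actions[policy[s]][1]):
                edges.add((s, t, policy[s]))
                frontier.append(t)
        if goal_closed and terminates(edges, actions):
            return {s: a for s, a in policy.items() if s in seen}
    return None
```

The number of candidate policies grows exponentially with the number of qualitative states, which is itself exponential in the number of variables, so this is only practical for very small problems.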

Page 17:

Results

Problems:
▪ Nested variables
▪ Snow plow: using snow blower spills snow onto the driveway
▪ Delivery with fuel, unknown number of objects and truck capacities
▪ Trash-collection

Solution Time (s)

Page 18:

Future Work

Improve generate and test: start with strong cyclic qualitative policies

Introduce constant landmarks/intervals of values

Identify limits of sieve algorithm’s applicability in abacus programs

Page 19:

Conclusions

QNP gives the first framework for planning with loops where termination and correctness are decidable properties:
▪ For any class of loops
▪ Any number of unbounded variables