Qualitative Numeric Planning
Siddharth Srivastava, Shlomo Zilberstein, Neil Immerman
University of Massachusetts Amherst
Hector Geffner
Universitat Pompeu Fabra
The Story So Far…
Finite sets of states & registers
Actions with unit increments/decrements
[Lambek, ‘61]
Abacus Programs
The reachability problem for abacus programs as a method for reasoning about cyclic control flows
But reachability is equivalent to the halting problem for Turing machines…
Undecidable
Approach: identify subclasses or less expressive frameworks
… cannot capture TM, but still useful [Srivastava et al., ICAPS-10]
ND Quantitative Planning Problems
Consider situations where actions increase or decrease numeric variables by unpredictable amounts
▪ Propositional variables can be added
Plans require cyclic control
E.g., delivery problem with unknown
▪ Fuel
▪ Distances
▪ Quantities of deliverables
Driving will use an unpredictable amount of fuel
Formulation
X: set of non-negative valued variables, O: set of actions, I: initial state, G: goal condition
States: numeric assignments to variables
Action effects: each effect either increases or decreases the value of a variable
Actions may have multiple effects
Action preconditions & goal condition: x = 0 or x > 0, for some subset of variables x
Lower bound on the amount of each decrease is specific to the execution; it need not be known
Example
A1: <>
A2: <>
Initial state: x=10, y=5; Goal: x=0
No finite acyclic solution!
Solution (intuitive):
repeat (until x=0) {
    repeat (until y=0) { <> }
    <>
}
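The nested loop above can be tried out on concrete numbers. The following is a minimal sketch, not the authors' code. The action bodies did not survive in this transcript, so the sketch assumes a hypothetical inner action that decreases y and a hypothetical outer action that decreases x and increases y. The names `decrease`, `increase`, `run_nested_loop`, the bound `eps`, and the step size `max_step` are all illustrative assumptions.

```python
import random


def decrease(value, eps, rng):
    """Decrease by an unpredictable amount of at least eps, never below 0."""
    if value <= eps:
        return 0.0
    return max(0.0, value - rng.uniform(eps, value))


def increase(value, rng, max_step=5.0):
    """Increase by an unpredictable positive amount (max_step is an assumption)."""
    return value + rng.uniform(0.1, max_step)


def run_nested_loop(x, y, eps=0.5, seed=0):
    """Run the nested-loop plan and return the final (x, y, steps)."""
    rng = random.Random(seed)
    steps = 0
    while x > 0:                       # repeat (until x=0)
        while y > 0:                   # repeat (until y=0)
            y = decrease(y, eps, rng)  # hypothetical inner action: decrease y
            steps += 1
        x = decrease(x, eps, rng)      # hypothetical outer action: decrease x ...
        y = increase(y, rng)           # ... and increase y
        steps += 1
    return x, y, steps


if __name__ == "__main__":
    print(run_nested_loop(10.0, 5.0))
```

Because every decrease is at least `eps`, both loops finish after finitely many iterations, although the number of steps differs from run to run.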
ND Quantitative Planning Problems: Solutions
Policy: States → Actions
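Policy trajectory for a policy: a sequence of states starting at I in which each state is a possible outcome of applying the policy's action in the previous state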
Solution criterion: Every bounded policy trajectory must terminate at a goal state in finitely many steps.
But how do we express policies?
Cannot map all possible states (real-valued assignments to variables)
Expressing Solutions: Qualitative Formulation
Capture sets of ND numeric planning problems
Abstract/Qualitative states
▪ For each variable x, only record x = 0 or x > 0
▪ Initial state for previous example abstracted to: x > 0, y > 0
▪ Also represents infinitely many other non-zero assignments to x and y
Qualitative states capture sets of concrete states
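The abstraction from concrete to qualitative states can be written in a few lines. This is a minimal sketch; the function name `abstract` and the dictionary representation of states are assumptions made for illustration.

```python
def abstract(state):
    """Map a concrete state {variable: value} to a qualitative state.

    Each variable is recorded only as "= 0" or "> 0".
    """
    return {var: ("> 0" if value > 0 else "= 0") for var, value in state.items()}


if __name__ == "__main__":
    print(abstract({"x": 10, "y": 5}))
    print(abstract({"x": 3.7, "y": 0.01}))
```

Both calls yield the same qualitative state, which illustrates how one qualitative state stands for infinitely many concrete ones.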
Qualitative Formulation
X: Boolean variables; I: initial state, G: goal condition, O: action operators
State = Boolean assignment to each variable in X (x = 0 or x > 0)
Action effects (non-deterministic but finite)
Preconditions & goal condition
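To make "non-deterministic but finite" concrete: decreasing a variable that is currently positive may leave it positive or bring it to zero, while increasing a variable always leaves it positive. The sketch below enumerates the resulting qualitative successors. The labels `"inc"` and `"dec"`, the function name, and the choice to leave a zero variable unchanged under a decrease are assumptions for illustration, not notation taken from the slides.

```python
from itertools import product


def qualitative_successors(state, effects):
    """Enumerate the qualitative outcomes of one action.

    state:   dict variable -> bool (True means the variable is > 0)
    effects: dict variable -> "inc" or "dec" (illustrative labels)
    """
    options = []
    for var, positive in state.items():
        effect = effects.get(var)
        if effect == "inc":
            options.append([True])          # an increased variable is > 0
        elif effect == "dec" and positive:
            options.append([True, False])   # may stay > 0 or reach 0
        else:
            options.append([positive])      # unaffected (assumption for dec at 0)
    variables = list(state)
    return [dict(zip(variables, combo)) for combo in product(*options)]


if __name__ == "__main__":
    start = {"x": True, "y": False}
    for successor in qualitative_successors(start, {"x": "dec", "y": "inc"}):
        print(successor)
```

An action with k decrease effects on positive variables has at most 2^k qualitative outcomes, so the set of outcomes is always finite.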
Solutions to the Qualitative Problem
Solutions represented as policies over qualitative states
Solution criterion: policy must solve every represented quantitative problem
▪ Termination of all ε-bounded trajectories for all possible problem instantiations
▪ Goal achievement in all possible problem instantiations
All We Need Are Qualitative Policies
A quantitative policy is essentially qualitative iff:
▪ Maps all states represented by a qualitative state to the same action
Very useful:
▪ Cannot have explicit policy representations over quantitative states anyway
Theorem
A non-deterministic quantitative planning problem P has a solution policy iff P has a policy that is essentially qualitative
Solution Policy: Example
[Figure: the policy and its transition graph]
Qualitative Solution Tests
Can we tell if a policy is correct without ever having to instantiate the problems?
Define the transition graph for a policy:
▪ Nodes = qualitative states
▪ Edge from s to s' iff s' is a possible result of applying the policy's action in s
Two aspects of the solution criteria:
Goal-closed: termination possible only at goal states
▪ Traverse the transition graph to check this
Finiteness of all possible instantiated trajectories
▪ ??
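The goal-closed test is a plain graph traversal. The sketch below is one way to write it, assuming the transition graph is given as a dictionary from each node to its list of successors and that a node without successors is where the policy stops. The function name, the graph encoding, and the node labels in the usage example are illustrative assumptions.

```python
from collections import deque


def is_goal_closed(graph, initial, is_goal):
    """Return True iff every terminal node reachable from `initial` is a goal.

    graph:   dict node -> list of successor nodes (missing key = no successors)
    is_goal: function node -> bool
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        node = queue.popleft()
        successors = graph.get(node, [])
        if not successors and not is_goal(node):
            return False  # the policy can stop at a non-goal state
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


if __name__ == "__main__":
    graph = {
        "x>0,y>0": ["x>0,y>0", "x>0,y=0"],
        "x>0,y=0": ["x>0,y>0", "x=0,y>0"],
    }
    print(is_goal_closed(graph, "x>0,y>0", lambda s: s.startswith("x=0")))
```

This check says nothing about whether the goal is eventually reached; that is the termination question addressed separately.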
Sieve Algorithm for Determining Termination
For every SCC (strongly connected component):
• Identify edges that cannot be executed infinitely often
• Remove them, signifying the stage when their executions have been exhausted
• Recurse on each resulting SCC
• Finally: terminating iff no SCC is left at the fixed point
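The steps above can be sketched in Python. The slide does not spell out which edges "cannot be executed infinitely often"; this sketch assumes an edge qualifies when it decreases a variable that no edge in the same SCC increases. The edge encoding, the `"inc"`/`"dec"` labels, and the function names are illustrative, and SCCs are found by simple mutual reachability, which is adequate only for small graphs.

```python
def reachable_from(start, adjacency):
    """Return the set of nodes reachable from `start` (including itself)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def sieve_terminates(edges):
    """edges: list of (src, dst, effects); effects maps variable -> "inc"/"dec"."""
    edges = list(edges)
    while True:
        adjacency = {}
        for src, dst, _ in edges:
            adjacency.setdefault(src, set()).add(dst)
        nodes = {n for src, dst, _ in edges for n in (src, dst)}
        reach = {n: reachable_from(n, adjacency) for n in nodes}
        # Keep only edges that lie on a cycle, i.e. inside some SCC.
        edges = [e for e in edges if e[0] in reach[e[1]]]
        if not edges:
            return True  # no SCC left: every trajectory is finite
        # Group the remaining edges by the SCC of their source node.
        components = {}
        for edge in edges:
            src = edge[0]
            scc = frozenset(n for n in reach[src] if src in reach[n])
            components.setdefault(scc, []).append(edge)
        survivors = []
        for scc_edges in components.values():
            increased = {var for _, _, eff in scc_edges
                         for var, kind in eff.items() if kind == "inc"}
            for edge in scc_edges:
                exhausted = any(kind == "dec" and var not in increased
                                for var, kind in edge[2].items())
                if not exhausted:
                    survivors.append(edge)
        if len(survivors) == len(edges):
            return False  # fixed point reached with an SCC still present
        edges = survivors


if __name__ == "__main__":
    loop = [
        ("x>0,y>0", "x>0,y>0", {"y": "dec"}),
        ("x>0,y>0", "x>0,y=0", {"y": "dec"}),
        ("x>0,y=0", "x>0,y>0", {"x": "dec", "y": "inc"}),
        ("x>0,y=0", "x=0,y>0", {"x": "dec", "y": "inc"}),
    ]
    print(sieve_terminates(loop))
```

In the usage example, the first pass removes the edge that decreases x, since nothing in the SCC increases x. The second pass removes the remaining self-loop that decreases y, leaving no SCC.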
Sieve Algorithm: Properties
Completeness: if the Sieve algorithm returns non-terminating, an infinite execution is possible
▪ Surprising because of similarity to abacus programs
Theorem
The sieve algorithm for determining termination of a qualitative policy is sound and complete.
A Generate and Test Planner
Enumerate all possible policies (yes, this is impractical in general! But computable!)
Check for:
1. Goal-closed (any terminal nodes in transition graph must be goal nodes)
2. Termination using sieve algorithm
Results
Problems:
▪ Nested variables
▪ Snow plow: using snow blower spills snow onto the driveway
▪ Delivery with fuel, unknown number of objects and truck capacities
▪ Trash-collection
[Chart: Solution Time (s)]
Future Work
Improve generate and test: Start with strong cyclic qualitative policies
Introduce constant landmarks/intervals of values
Identify limits of sieve algorithm’s applicability in abacus programs
Conclusions
QNP gives the first framework for planning with loops where termination and correctness are decidable properties
▪ For any class of loops
▪ Any number of unbounded variables