TRANSCRIPT
Pat Langley
School of Computing and Informatics, Arizona State University
Tempe, Arizona
Institute for the Study of Learning and Expertise, Palo Alto, California
Robust Reasoning and Learning About Goal-Directed Activities
Thanks to T. Konik, D. Choi, U. Kuter, and D. Nau for their contributions. This talk reports work funded by grants from DARPA, which is not responsible for its contents.
Abductive Plan Understanding
We can state the task of abductive plan understanding as:
Given: A set of generalized conditional hierarchical plans;
Given: A partial sequence of observed actions or events;
Find: An explanation of these events in terms of other agents’ goals and intentions.
We can also state a related task that involves plan learning:
Given: A set of primitive action models (plan operators);
Given: A set of partially observed action/event sequences with associated goals;
Find: A set of generalized conditional hierarchical plans that explain these and future behaviors.
Learning Plan Knowledge from Demonstration
[Figure: LIGHT system overview. An expert supplies demonstration traces (states and actions); background knowledge supplies concept definitions and an action model. LIGHT learns plan knowledge (HTNs), which a reactive executor uses to solve problems given an initial state and goal, with learning invoked when the executor reaches an impasse.]
Inputs to LIGHT: Conceptual Knowledge
Conceptual knowledge is cast as Horn clauses that specify relevant relations in the environment:
– Hierarchically organized in memory
– Divided into primitive and nonprimitive predicates
[Figure: example concepts, including the primitive concept assigned-mission(?patient ?mission) and the nonprimitive concept patient-form-filled(?patient).]
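As a rough illustration of this idea (the Python encoding and the form-complete condition below are assumptions for the example, not LIGHT's actual syntax; the figure above supplies only the two concept names):

from dataclasses import dataclass, field
from typing import List, Tuple

# A literal is a relation name plus its arguments, e.g. ("patient", ("?p",)).
Literal = Tuple[str, Tuple[str, ...]]

@dataclass
class Concept:
    head: Literal                                        # relation the clause defines
    body: List[Literal] = field(default_factory=list)    # conditions; empty for primitives

    @property
    def primitive(self) -> bool:
        # Primitive concepts match percepts directly and have no subrelations.
        return not self.body

# A primitive concept from the figure above.
assigned_mission = Concept(head=("assigned-mission", ("?patient", "?mission")))

# A nonprimitive concept defined in terms of other relations
# (the conditions here are invented purely for illustration).
patient_form_filled = Concept(
    head=("patient-form-filled", ("?patient",)),
    body=[("patient", ("?patient",)), ("form-complete", ("?patient",))],
)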
Inputs to LIGHT: Action Models
[Figure: an example action model. The action get-arrival-time(?patient ?from ?to) has the precondition concepts patient(?p), travel-from(?p ?from), and travel-to(?p ?to), and the effect concept arrival-time(?patient).]
Operators describe low-level actions that agents can execute directly in the environment:
– Preconditions: legal conditions for action execution
– Effects: expected changes when the action is executed
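A minimal sketch of such an operator, using the get-arrival-time example from the figure; the Python field names are assumptions, not LIGHT's syntax, and the precondition variables are normalized to ?patient for readability:

from dataclasses import dataclass
from typing import List, Tuple

Literal = Tuple[str, Tuple[str, ...]]

@dataclass
class Operator:
    name: str
    params: Tuple[str, ...]
    preconditions: List[Literal]   # concepts that must hold before execution
    effects: List[Literal]         # concepts expected to hold after execution

# The action model from the figure: querying a patient's flight arrival time.
get_arrival_time = Operator(
    name="get-arrival-time",
    params=("?patient", "?from", "?to"),
    preconditions=[
        ("patient", ("?patient",)),
        ("travel-from", ("?patient", "?from")),
        ("travel-to", ("?patient", "?to")),
    ],
    effects=[("arrival-time", ("?patient",))],
)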
Inputs to LIGHT: Expert Traces and Goals
Expert demonstration traces record the actions the expert takes and the resulting belief states:
– A state is a set of concept instances
– The goal is a concept instance in the final state
– LIGHT learns generalized skills that achieve similar goals
[Figure: a trace fragment showing the action instance get-arrival-time(P2), the concept instance assigned-flight(P1 M1) in the state, and the goal concept all-patients-arranged.]
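One way to picture such a trace, assuming a simple tuple-based encoding (illustrative only, not LIGHT's input format), built from the instances in the figure:

from typing import FrozenSet, List, Tuple

Instance = Tuple[str, Tuple[str, ...]]   # a concept or action instance
State = FrozenSet[Instance]              # a belief state: a set of concept instances

# Each step pairs the action the expert took with the belief state that resulted.
trace: List[Tuple[Instance, State]] = [
    (("get-arrival-time", ("P2",)),
     frozenset({("assigned-flight", ("P1", "M1"))})),
]
goal: Instance = ("all-patients-arranged", ())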
Outputs of LIGHT: HTN Methods
Methods decompose goals into subgoals: if you have a goal and its precondition is satisfied, then apply its submethods or its operators.
Similar to regular HTNs, but methods are indexed by the goals they achieve.
[Figure: an HTN method links a goal concept to a precondition concept and to subgoals, each handled by another HTN method or by an operator.]
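A sketch of a goal-indexed method along these lines; the Python encoding is an assumption, and the example is loosely modeled on the task network shown later in these slides:

from dataclasses import dataclass
from typing import List, Tuple

Literal = Tuple[str, Tuple[str, ...]]

@dataclass
class Method:
    goal: Literal            # concept the method achieves; methods are indexed by it
    precondition: Literal    # concept that should hold for the method to apply
    subgoals: List[Literal]  # subgoals, each achieved by another method or an operator

# Illustrative method: transfer a patient to a hospital by getting the patient
# assigned to a flight and located at a nearby airport on time.
transfer_method = Method(
    goal=("transfer-hospital", ("?patient", "?hospital")),
    precondition=("close-airport", ("?hospital", "?loc")),
    subgoals=[
        ("assigned", ("?patient", "?flight")),
        ("location", ("?patient", "?loc", "?time")),
    ],
)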
Learning HTNs by Trace Analysis
[Figure: trace analysis by action chaining over the observed concepts and actions.]
Learning HTNs by Trace Analysis
[Figure: trace analysis by concept chaining over the observed concepts and actions.]
Learning HTNs by Trace Analysis
Explanation Structure
[Figure: an instantiated explanation structure spanning time steps 1 through 3. It connects dest-airport(patient1, SFO), arrival-time(NW32, 1pm), scheduled(NW32), assigned(patient1, NW32), location(patient1, SFO, 1pm), flight-available, and close-airport(hospital2, SFO) through the actions query-arrival-time, assign(patient1, NW32), and arrange-ground-transportation(SFO, hospital2, 1pm), in support of the goal transfer-hospital(patient1, hospital2).]
Hierarchical Task Network
[Figure: the generalized hierarchical task network induced from the explanation structure above, with the same relations stated over variables: dest-airport(?patient ?loc), arrival-time(?flight ?time), scheduled(?flight), assigned(?patient ?flight), location(?patient ?loc ?time), flight-available, and close-airport(?hospital ?loc), connected through query-arrival-time, assign(?patient ?flight), and arrange-ground-transportation(?loc ?hospital ?time), in support of the goal transfer-hospital(?patient ?hospital).]
Adapting HTNs to Plan Understanding
HTNs and methods for learning them (like LIGHT) are typically designed for generating and executing plans.
To adapt HTNs to plan understanding, we must revise the framework to support abductive inference when:
actions and events are only partially observed;
some goals and plans are more likely than others;
observations of others’ behaviors are inherently noisy.
These characteristics require extensions to our representation, performance methods, and learning mechanisms.
Markov Task Networks
To this end, we have designed a new representational formalism for plan knowledge – Markov task networks – that includes:
A set of goal-indexed HTN methods, each with:
– a prior probability, P(G_k), on the method M_K's goal
– a conditional probability, P(R_k | M_K), for its precondition
– a conditional probability, P(S_k | M_K), for each subgoal
A set of Horn clauses, each with:
– a prior probability, P(R_q), of the clause C_q's head
– a conditional probability, P(D_k | C_q), for each condition
This framework appears better suited to abductive inference about goal-directed behavior than Markov logic.
Markov Task Networks
A Markov task network is a goal-indexed HTN with probabilities for:
– goals the agent may aim to achieve
– subgoals it may pursue when using a given method
– preconditions that suggest it is using the method
– constraints among the subgoal orders
It also includes probabilistic information about relevant relational concepts.
[Figure: a task network in which each method is annotated with P(Goal), P(Precondition | Method), and P(Subgoal | Method).]
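A compact sketch of this structure, attaching the probabilities above to methods and clauses; the Python encoding is an assumption, and subgoal-ordering constraints are omitted for brevity:

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Literal = Tuple[str, Tuple[str, ...]]

@dataclass
class MTNMethod:
    goal: Literal
    goal_prior: float                   # P(G_k) for the method's goal
    precondition: Literal
    p_precondition: float               # P(R_k | M_K)
    subgoals: List[Literal]
    p_subgoal: Dict[Literal, float] = field(default_factory=dict)    # P(S_k | M_K)

@dataclass
class MTNClause:
    head: Literal
    head_prior: float                   # P(R_q) for the clause's head
    p_condition: Dict[Literal, float] = field(default_factory=dict)  # P(D_k | C_q)

@dataclass
class MarkovTaskNetwork:
    methods: List[MTNMethod] = field(default_factory=list)
    clauses: List[MTNClause] = field(default_factory=list)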
Inference Over Markov Task Networks
We can estimate the posterior probability of each goal G_k in a Markov task network, given a sequence of observed states O, by computing

P'(R_n | O) = P(R_n) · max_{q: head(C_q) = R_n} ∏_k P(D_k | C_q) · P(D_k | O),

where P(D_k | O) = 1 when D_k is a primitive relation that occurs in O.
To obtain actual probabilities, we normalize using the expressions

P(G_k | O) = P'(G_k | O) / ( P'(G_k | O) + P'(¬G_k | O) )
P(R_n | O) = P'(R_n | O) / ( P'(R_n | O) + P'(¬R_n | O) ).

This is a variant on cascaded Bayesian classifiers (Provan et al., 1996).
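The sketch below traces this cascaded computation over the clause part of a network, reusing the literal encoding from the earlier sketches. It embodies several assumptions of my own: the recursion reuses the unnormalized score for nonprimitive conditions, unexplained relations fall back on their priors, cycles in the hierarchy are ignored, and the complementary score P'(¬R | O) needed for normalization is supplied by the caller.

from typing import Dict, FrozenSet, List, Tuple

Literal = Tuple[str, Tuple[str, ...]]
ClauseBodies = Dict[Literal, List[Dict[Literal, float]]]   # head -> candidate bodies

def score(relation: Literal,
          observed: FrozenSet[Literal],
          prior: Dict[Literal, float],
          clauses: ClauseBodies) -> float:
    # Unnormalized P'(R_n | O): prior times the best clause's product of terms.
    if relation in observed:                 # primitive relation that occurs in O
        return 1.0
    bodies = clauses.get(relation, [])
    if not bodies:                           # undefined relation: fall back on its prior
        return prior.get(relation, 0.0)
    best = 0.0
    for body in bodies:                      # max over clauses C_q whose head is R_n
        product = 1.0
        for condition, p_cond in body.items():   # product over conditions D_k
            product *= p_cond * score(condition, observed, prior, clauses)
        best = max(best, product)
    return prior.get(relation, 0.0) * best

def normalize(p_rel: float, p_not_rel: float) -> float:
    # P(R | O) = P'(R | O) / (P'(R | O) + P'(not-R | O)).
    total = p_rel + p_not_rel
    return p_rel / total if total > 0.0 else 0.0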
Learning in Markov Task Networks
Like other probabilistic frameworks, Markov task networks require two forms of learning:
Parameter estimation occurs either:
by simple counting, as in naïve Bayes, in the fully supervised case where all goals/subgoals are given
by expectation maximization in the partly supervised case where only the top-level goal is provided
Structure learning occurs as in LIGHT, except that:
Explanation takes advantage of methods learned earlier
This process finds the most probable account of events
Both forms of learning should be efficient computationally and require few training cases.
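In the fully supervised case, the counting is indeed straightforward; a minimal sketch for one conditional parameter, with Laplace smoothing added as an assumption (the slides say only "simple counting"):

from typing import Iterable, List, Tuple

def estimate_subgoal_probability(
    episodes: Iterable[Tuple[str, List[str]]],   # (method used, subgoals pursued)
    method: str,
    subgoal: str,
    smoothing: float = 1.0,
) -> float:
    # Estimate P(subgoal | method) by counting co-occurrences in labeled traces.
    method_count = 0
    joint_count = 0
    for used_method, subgoals in episodes:
        if used_method == method:
            method_count += 1
            if subgoal in subgoals:
                joint_count += 1
    # Laplace-smoothed relative frequency; the smoothing choice is illustrative.
    return (joint_count + smoothing) / (method_count + 2.0 * smoothing)

# Example: three labeled episodes, two of which used the method "m-transfer".
episodes = [
    ("m-transfer", ["assigned", "location"]),
    ("m-transfer", ["assigned"]),
    ("m-other", ["scheduled"]),
]
print(estimate_subgoal_probability(episodes, "m-transfer", "location"))  # 0.5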
Learning Markov Task Networks by Trace Analysis
Trace analysis proceeds as before, but guided by probabilistic inference that allows for:
• Missing conceptual relations in states
• Missing actions that connect states
When an existing method is used to explain a trace, probabilities are updated accordingly.
Plans for Future Research
To evaluate the framework of Markov task networks, we must:
Implement the performance and learning algorithms
Design tasks in realistic simulators like OneSAF and MadRTS
Use these simulators to generate sequences of observed states
Provide background knowledge about these domains
Measure accuracy of goal inference given handcrafted task networks
Measure ability of learned task networks to produce similar results
Experimental results of this sort will suggest ways to improve our formulation and its methods for inference and learning.
Related Work on Abduction and Learning
Our approach incorporates ideas from a number of traditions:
Hierarchical task networks (Nau et al., 1999; Choi & Langley, 2005)
Logical methods for abductive inference (Ng & Mooney, 1990)
Relational Bayesian classifiers (Flach & Lachiche, 1999)
Cascaded Bayesian classifiers (Provan, Langley, & Binford, 1996)
Explanation-based learning from expert traces (Segre, 1987)
Statistical relational learning (Muggleton, 1996; Domingos, 2004)
However, it adapts and combines them in ways appropriate to the task of abductive plan understanding and learning.
End of Presentation
Hierarchical Concepts
Nonprimitive Concepts

(in-rightmost-lane ?self ?clane)
 :percepts  (self ?self) (segment ?seg) (line ?clane segment ?seg)
 :relations (driving-well-in-segment ?self ?seg ?clane)
            (last-lane ?clane)
            (not (lane-to-right ?clane ?anylane))

(driving-well-in-segment ?self ?seg ?lane)
 :percepts  (self ?self) (segment ?seg) (line ?lane segment ?seg)
 :relations (in-segment ?self ?seg) (in-lane ?self ?lane)
            (aligned-with-lane-in-segment ?self ?seg ?lane)
            (centered-in-lane ?self ?seg ?lane)
            (steering-wheel-straight ?self)

Primitive Concepts

(in-lane ?self ?lane)
 :percepts (self ?self segment ?seg) (line ?lane segment ?seg dist ?dist)
 :tests    (> ?dist -10) (<= ?dist 0)
Hierarchical Methods

Nonprimitive Skills

(in-rightmost-lane ?self ?line)
 :percepts (self ?self) (line ?line)
 :start    (last-lane ?line)
 :subgoals (driving-well-in-segment ?self ?seg ?line)

(driving-well-in-segment ?self ?seg ?line)
 :percepts (segment ?seg) (line ?line) (self ?self)
 :start    (steering-wheel-straight ?self)
 :subgoals (in-segment ?self ?seg)
           (centered-in-lane ?self ?seg ?line)
           (aligned-with-lane-in-segment ?self ?seg ?line)
           (steering-wheel-straight ?self)

Primitive Skill

(in-segment ?self ?endsg)
 :percepts (self ?self speed ?speed)
           (intersection ?int cross ?cross)
           (segment ?endsg street ?cross angle ?angle)
 :start    (in-intersection-for-right-turn ?self ?int)
 :actions  (steer 1)