
Page 1:

Pat Langley

School of Computing and Informatics, Arizona State University

Tempe, Arizona

Institute for the Study of Learning and Expertise, Palo Alto, California

Robust Reasoning and Learning About Goal-Directed Activities

Thanks to T. Konik, D. Choi, U. Kutur, and D. Nau for their contributions. This talk reports work funded by grants from DARPA, which is not responsible for its contents.

Page 2:

Abductive Plan Understanding

We can state the task of abductive plan understanding as:

Given: A set of generalized conditional hierarchical plans;

Given: A partial sequence of observed actions or events;

Find: An explanation of these events in terms of other agents’ goals and intentions.

We can also state a related task that involves plan learning:

Given: A set of primitive action models (plan operators);

Given: A set of partial action/event sequences with associated goals;

Find: A set of generalized conditional hierarchical plans that explain these and future behaviors.
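To make the two task statements concrete, here is a minimal interface sketch in Python; the class and function names (Method, Operator, Explanation, understand, learn) are illustrative assumptions, not identifiers from the talk.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Method:
    """A generalized conditional hierarchical plan (HTN method)."""
    goal: str                  # goal the method achieves
    precondition: List[str]    # conditions under which it applies
    subgoals: List[str]        # ordered subgoals or primitive actions

@dataclass
class Operator:
    """A primitive action model (plan operator)."""
    name: str
    preconditions: List[str]
    effects: List[str]

@dataclass
class Explanation:
    """An abductive account of observed events."""
    goals: List[str]           # inferred goals and intentions of other agents
    structure: List[str]       # method instances that cover the observations

def understand(methods: List[Method], events: List[str]) -> Optional[Explanation]:
    """Plan understanding: explain a partial sequence of observed events."""
    raise NotImplementedError

def learn(operators: List[Operator],
          goal_annotated_traces: List[tuple]) -> List[Method]:
    """Plan learning: induce generalized methods from goal-annotated traces."""
    raise NotImplementedError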

Page 3:

Learning Plan Knowledge from Demonstration

[Diagram: the LIGHT architecture. Inputs are a problem (an initial state and a goal), demonstration traces of states and actions from an expert, and background knowledge in the form of concept definitions and an action model. LIGHT produces learned plan knowledge as HTNs, which a reactive executor uses; learning is invoked when the executor reaches an impasse.]

Page 4:

Inputs to LIGHT: Conceptual Knowledge

Conceptual knowledge is cast as Horn clauses that specify relevant relations in the environment:

– Hierarchically organized in memory
– Divided into primitive and non-primitive predicates

[Diagram: a primitive concept, assigned-mission(?patient ?mission), and a nonprimitive concept, patient-form-filled(?patient).]
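As an illustration of this division, the Python sketch below encodes a few concepts over a belief state of ground facts; the definition of patient-form-filled is hypothetical, since the talk only names the concept.

# A belief state is a set of ground facts (relation name plus arguments).
State = set

def assigned_mission(state: State, patient, mission) -> bool:
    """Primitive concept: holds iff the fact appears in the state."""
    return ("assigned-mission", patient, mission) in state

def has_mission(state: State, patient) -> bool:
    """Hypothetical nonprimitive concept defined over the primitive one."""
    return any(f[0] == "assigned-mission" and f[1] == patient for f in state)

def patient_form_filled(state: State, patient) -> bool:
    """Nonprimitive concept named on the slide, given a hypothetical body:
    the patient has some mission and a completed form."""
    return has_mission(state, patient) and ("form-complete", patient) in state

state = {("assigned-mission", "P1", "M1"), ("form-complete", "P1")}
assert patient_form_filled(state, "P1")

Primitive concepts bottom out in facts drawn directly from the state, while nonprimitive concepts are defined in terms of other concepts, mirroring the Horn-clause hierarchy.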

Page 5:

Inputs to LIGHT: Action Models

Operators describe low-level actions that agents can execute directly in the environment:

– Preconditions: legal conditions for action execution
– Effects: expected changes when the action is executed

[Diagram: the action get-arrival-time(?patient ?from ?to), with precondition concepts patient(?p), travel-from(?p ?from), and travel-to(?p ?to), and the effect concept arrival-time(?patient).]
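A sketch of how such an operator might be represented and applied in Python, using the get-arrival-time example from the slide; the names and the simple ground-matching scheme are assumptions, and delete effects are ignored for brevity.

from dataclasses import dataclass
from typing import List, Set, Tuple

Fact = Tuple[str, ...]

@dataclass
class Operator:
    name: str
    preconditions: List[Fact]   # concepts that must hold before execution
    effects: List[Fact]         # concepts expected to hold afterward

get_arrival_time = Operator(
    name="get-arrival-time",
    preconditions=[("patient", "?p"), ("travel-from", "?p", "?from"),
                   ("travel-to", "?p", "?to")],
    effects=[("arrival-time", "?p")],
)

def applicable(op: Operator, state: Set[Fact], binding: dict) -> bool:
    """Check a ground instantiation of the operator against the state."""
    ground = [tuple(binding.get(t, t) for t in pre) for pre in op.preconditions]
    return all(f in state for f in ground)

def apply(op: Operator, state: Set[Fact], binding: dict) -> Set[Fact]:
    """Return the successor state by adding the ground effects (no deletes)."""
    added = {tuple(binding.get(t, t) for t in eff) for eff in op.effects}
    return state | added

state = {("patient", "P2"), ("travel-from", "P2", "SFO"), ("travel-to", "P2", "BOS")}
b = {"?p": "P2", "?from": "SFO", "?to": "BOS"}
assert applicable(get_arrival_time, state, b)
state = apply(get_arrival_time, state, b)
assert ("arrival-time", "P2") in state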

Page 6:

Inputs to LIGHT: Expert Traces and Goals

Expert demonstration traces record the actions the expert takes and the resulting belief states:

– State: a set of concept instances
– The goal is a concept instance in the final state
– LIGHT learns generalized skills that achieve similar goals

[Diagram: a trace of states and actions, including the action instance get-arrival-time(P2), the concept instance assigned-flight(P1 M1), and the goal concept all-patients-arranged.]
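A corresponding Python sketch of a demonstration trace, reusing instances from the slide's figure; the second action and the intermediate states are hypothetical filler.

from dataclasses import dataclass
from typing import List, Set, Tuple

Fact = Tuple[str, ...]

@dataclass
class Trace:
    states: List[Set[Fact]]   # belief state after each step (concept instances)
    actions: List[Fact]       # action instance executed between states i and i+1
    goal: Fact                # concept instance that holds in the final state

trace = Trace(
    states=[{("assigned-flight", "P1", "M1")},
            {("assigned-flight", "P1", "M1"), ("arrival-time", "P2")},
            {("assigned-flight", "P1", "M1"), ("arrival-time", "P2"),
             ("all-patients-arranged",)}],
    actions=[("get-arrival-time", "P2"),
             ("arrange-remaining", "P2")],   # hypothetical second action
    goal=("all-patients-arranged",),
)
assert trace.goal in trace.states[-1]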

Page 7:

Outputs of LIGHT: HTN Methods

Methods decompose goals into subgoals:

– If you have a goal and its precondition is satisfied, then apply its submethods or its operators.

They are similar to regular HTN methods, but are indexed by the goals they achieve, as in the sketch below.

[Diagram: an HTN method linking a goal concept to a precondition concept, subgoals, and operators.]
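The sketch below shows goal-indexed methods and a naive top-down interpreter in Python that applies a method when its precondition holds; the method content is a toy, propositional rendering of the later medevac example, not output from LIGHT.

from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, ...]

@dataclass
class Method:
    goal: str                  # the goal concept this method achieves (its index)
    precondition: List[Fact]   # concepts that must hold to apply the method
    subgoals: List[str]        # subgoals in order; primitives map to operators

# Methods are indexed by the goals they achieve.
METHODS: Dict[str, List[Method]] = {
    "transfer-hospital": [
        Method(goal="transfer-hospital",
               precondition=[("close-airport",)],
               subgoals=["assigned", "arrange-ground-transportation"]),
    ],
}

# Subgoals achieved directly by operators rather than further methods.
PRIMITIVE = {"assigned", "arrange-ground-transportation"}

def expand(goal: str, state: Set[Fact]) -> List[str]:
    """Expand a goal top-down: if a method's precondition holds, recurse
    on its subgoals; primitive subgoals become executable steps."""
    if goal in PRIMITIVE:
        return [goal]
    for m in METHODS.get(goal, []):
        if all(p in state for p in m.precondition):
            steps: List[str] = []
            for sub in m.subgoals:
                steps += expand(sub, state)
            return steps
    return []  # no applicable method: an impasse

print(expand("transfer-hospital", {("close-airport",)}))
# -> ['assigned', 'arrange-ground-transportation']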

Page 8:

Learning HTNs by Trace Analysis

[Diagram: a demonstration trace laid out as a sequence of concepts (states) and actions.]

Page 9:

Learning HTNs by Trace Analysis: Action Chaining

Page 10:

Learning HTNs by Trace Analysis: Concept Chaining

[Diagram: a demonstration trace laid out as a sequence of concepts (states) and actions.]

Page 11:

Explanation Structure

[Diagram: an explanation structure spanning Time 1, 2, and 3. It connects concept instances such as dest-airport(patient1 SFO), scheduled(NW32), arrival-time(NW32 1pm), flight-available, assigned(patient1 NW32), location(patient1 SFO 1pm), close-airport(hospital2 SFO), and transfer-hospital(patient1 hospital2), through the actions query-arrival-time, assign(patient1 NW32), and arrange-ground-transportation(SFO hospital2 1pm).]

Page 12:

Hierarchical Task Network

[Diagram: the corresponding hierarchical task network, with constants generalized to variables: dest-airport(?patient ?loc), scheduled(?flight), arrival-time(?flight ?time), flight-available, assigned(?patient ?flight), location(?patient ?loc ?time), close-airport(?hospital ?loc), and transfer-hospital(?patient ?hospital), connected through the operators query-arrival-time, assign(?patient ?flight), and arrange-ground-transportation(?loc ?hospital ?time).]

Page 13:

Adapting HTNs to Plan Understanding

HTNs and methods for learning them (like LIGHT) are typically designed for generating and executing plans.

To adapt HTNs to plan understanding, we must revise the framework to support abductive inference when:

actions and events are only partially observed;

some goals and plans are more likely than others;

observations of others’ behaviors are inherently noisy.

These characteristics require extensions to our representation, performance methods, and learning mechanisms.

Page 14:

Markov Task Networks

To this end, we have designed a new representational formalism for plan knowledge – Markov task networks – that includes:

A set of goal-indexed HTN methods, each with:
– a prior probability, P(G_k), on the goal of method M_K
– a conditional probability, P(R_k | M_K), for its precondition
– a conditional probability, P(S_k | M_K), for each subgoal

A set of Horn clauses, each with:
– a prior probability, P(R_q), of the head of clause C_q
– a conditional probability, P(D_k | C_q), for each condition

This framework appears better suited to abductive inference about goal-directed behavior than Markov logic.
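A Python sketch of what this representation might look like as a data structure, with the probabilities above attached; the field names are assumptions rather than the formalism's notation.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MTNMethod:
    """A goal-indexed HTN method with attached probabilities."""
    goal: str
    p_goal: float                   # prior P(G_k) on the method's goal
    p_precondition: Dict[str, float]  # P(R_k | M_K) for its precondition(s)
    p_subgoal: Dict[str, float]       # P(S_k | M_K) for each subgoal

@dataclass
class MTNClause:
    """A Horn clause with attached probabilities."""
    head: str
    p_head: float                   # prior P(R_q) of the clause's head
    p_condition: Dict[str, float]   # P(D_k | C_q) for each condition

@dataclass
class MarkovTaskNetwork:
    methods: List[MTNMethod] = field(default_factory=list)
    clauses: List[MTNClause] = field(default_factory=list)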

Page 15:

Markov Task Networks

A Markov task network is a goal-indexed HTN with probabilities for:

– goals the agent may aim to achieve
– subgoals he may pursue when using a given method
– preconditions that suggest he is using the method
– constraints among the subgoal orders

It also includes probabilistic information about relevant relational concepts.

[Diagram: a goal-indexed HTN annotated with P(Goal), P(Precondition | Method), and P(Subgoal | Method) at the corresponding nodes.]

Page 16:

Inference Over Markov Task Networks

We can estimate the posterior probability of each goal G_k in a Markov task network, given a sequence of observed states O, by computing

    P'(R_n | O) = P(R_n) · Π_{q : head(C_q) = R_n} max_k [ P(D_k | C_q) · P(D_k | O) ]

where P(D_k | O) = 1 when D_k is a primitive relation that occurs in O.

To obtain actual probabilities, we normalize using the expressions:

    P(G_k | O) = P'(G_k | O) / [ P'(G_k | O) + P'(¬G_k | O) ]
    P(R_n | O) = P'(R_n | O) / [ P'(R_n | O) + P'(¬R_n | O) ]

This is a variant on cascaded Bayesian classifiers (Provan et al., 1996).
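A minimal Python sketch of the cascaded computation, using plain dictionaries; the toy numbers, and the stand-in value for P'(¬G | O), are assumptions rather than values from the talk.

from typing import Dict, List

def unnormalized(relation: str,
                 prior: Dict[str, float],                     # P(R_n)
                 clauses: Dict[str, List[Dict[str, float]]],  # head -> [{condition: P(D_k | C_q)}]
                 p_d_given_o: Dict[str, float]) -> float:
    """P'(R | O) = P(R) * product over clauses C_q with head R of
                   max_k P(D_k | C_q) * P(D_k | O)."""
    score = prior[relation]
    for clause in clauses.get(relation, []):
        score *= max(p_dc * p_d_given_o.get(d, 0.0) for d, p_dc in clause.items())
    return score

def normalize(p_pos: float, p_neg: float) -> float:
    """P(G | O) = P'(G | O) / (P'(G | O) + P'(not-G | O))."""
    return p_pos / (p_pos + p_neg)

# Toy numbers (assumptions): one clause whose head is the goal relation
# "transfer-arranged"; "assigned" is a primitive relation that occurs in the
# observations, so P(assigned | O) = 1.
prior = {"transfer-arranged": 0.2}
clauses = {"transfer-arranged": [{"assigned": 0.9, "ground-transport": 0.7}]}
p_d_given_o = {"assigned": 1.0, "ground-transport": 0.0}

p_pos = unnormalized("transfer-arranged", prior, clauses, p_d_given_o)
print(p_pos)                   # 0.2 * max(0.9 * 1.0, 0.7 * 0.0) = 0.18
print(normalize(p_pos, 0.05))  # normalized against an assumed P'(not-G | O) = 0.05

Here the observed primitive relation dominates the max over the clause's conditions, which is what lets partially observed states still raise the goal's posterior.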

Page 17:

Learning in Markov Task Networks

Like other probabilistic frameworks, Markov task networks require two forms of learning:

Parameter estimation occurs either:

by simple counting, as in naïve Bayes, in the fully supervised case where all goals/subgoals are given (sketched below)

by expectation maximization in the partly supervised case where only the top-level goal is provided

Structure learning occurs as in LIGHT, except that:

Explanation takes advantage of methods learned earlier

This process finds the most probable account of events

Both forms of learning should be efficient computationally and require few training cases.
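For the fully supervised case, parameter estimation by counting can be sketched in Python as follows; the Bernoulli treatment of subgoals and the Laplace smoothing constant are assumptions about the details.

from collections import Counter
from typing import Dict, List, Tuple

def estimate_subgoal_probs(annotated: List[Tuple[str, List[str]]],
                           alpha: float = 1.0) -> Dict[Tuple[str, str], float]:
    """Estimate P(Subgoal | Method) by counting with Laplace smoothing,
    treating each subgoal as a Bernoulli event per method use."""
    method_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    subgoals_seen = set()
    for method, subgoals in annotated:
        method_counts[method] += 1
        subgoals_seen.update(subgoals)
        for s in set(subgoals):
            pair_counts[(method, s)] += 1
    return {(m, s): (pair_counts[(m, s)] + alpha) / (method_counts[m] + 2 * alpha)
            for m in method_counts for s in subgoals_seen}

traces = [("transfer-hospital", ["assigned", "ground-transport"]),
          ("transfer-hospital", ["assigned"])]
probs = estimate_subgoal_probs(traces)
print(probs[("transfer-hospital", "assigned")])          # (2 + 1) / (2 + 2) = 0.75
print(probs[("transfer-hospital", "ground-transport")])  # (1 + 1) / (2 + 2) = 0.5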

Page 18:

Learning Markov Task Networks by Trace Analysis

Trace analysis proceeds as before, but guided by probabilistic inference that allows for:

• Missing conceptual relations in states

• Missing actions that connect states

When an existing method is used to explain a trace, probabilities are updated accordingly.

Page 19:

Plans for Future Research

To evaluate the framework of Markov task networks, we must:

Implement the performance and learning algorithms

Design tasks in realistic simulators like OneSAF and MadRTS

Use these simulators to generate sequences of observed states

Provide background knowledge about these domains

Measure accuracy of goal inference given handcrafted task networks

Measure ability of learned task networks to produce similar results

Experimental results of this sort will suggest ways to improve our formulation and its methods for inference and learning.

Page 20:

Related Work on Abduction and Learning

Our approach incorporates ideas from a number of traditions:

Hierarchical task networks (Nau et al., 1999; Choi & Langley, 2005)

Logical methods for abductive inference (Ng & Mooney, 1990)

Relational Bayesian classifiers (Flach & Lachiche, 1999)

Cascaded Bayesian classifiers (Provan, Langley, & Binford, 1996)

Explanation-based learning from expert traces (Segre, 1987)

Statistical relational learning (Muggleton, 1996; Domingos, 2004)

However, it adapts and combines them in ways appropriate to the task of abductive plan understanding and learning.

Page 21:

End of Presentation

Page 22:

Hierarchical Concepts

Nonprimitive Concepts

(in-rightmost-lane ?self ?clane)
 :percepts  (self ?self) (segment ?seg) (line ?clane segment ?seg)
 :relations (driving-well-in-segment ?self ?seg ?clane)
            (last-lane ?clane)
            (not (lane-to-right ?clane ?anylane))

(driving-well-in-segment ?self ?seg ?lane)
 :percepts  (self ?self) (segment ?seg) (line ?lane segment ?seg)
 :relations (in-segment ?self ?seg) (in-lane ?self ?lane)
            (aligned-with-lane-in-segment ?self ?seg ?lane)
            (centered-in-lane ?self ?seg ?lane)
            (steering-wheel-straight ?self)

Primitive Concepts

(in-lane ?self ?lane)
 :percepts (self ?self segment ?seg) (line ?lane segment ?seg dist ?dist)
 :tests    (> ?dist -10) (<= ?dist 0)

Page 23:

Hierarchical Methods

Nonprimitive Skills

(in-rightmost-lane ?self ?line)
 :percepts (self ?self) (line ?line)
 :start    (last-lane ?line)
 :subgoals (driving-well-in-segment ?self ?seg ?line)

(driving-well-in-segment ?self ?seg ?line)
 :percepts (segment ?seg) (line ?line) (self ?self)
 :start    (steering-wheel-straight ?self)
 :subgoals (in-segment ?self ?seg)
           (centered-in-lane ?self ?seg ?line)
           (aligned-with-lane-in-segment ?self ?seg ?line)
           (steering-wheel-straight ?self)

Primitive Skill

(in-segment ?self ?endsg)
 :percepts (self ?self speed ?speed)
           (intersection ?int cross ?cross)
           (segment ?endsg street ?cross angle ?angle)
 :start    (in-intersection-for-right-turn ?self ?int)
 :actions  (steer 1)