Structured Models for Decision Making. Daphne Koller, Stanford University, koller@cs.Stanford.edu. MURI Program on Decision Making under Uncertainty, July 18, 2000

Structured Models for Decision Making

Daphne Koller
Stanford University

koller@cs.Stanford.edu

MURI Program on Decision Making under Uncertainty
July 18, 2000

MURI Kickoff Meeting 7/18/00

Roadmap

[Figure: roadmap grid. Columns: Static, Dynamic, Decision Problem. First row: Bayes Nets, DBNs, Factored MDPs. Second row: PRMs, Dynamic PRMs, Relational MDPs. Annotations: Encapsulation, Reuse; Encapsulation, Approximation; Factored Policy Iteration, Efficient PRM inference.]

Outline

• Probabilistic Relational Models
– Representing complex domains
– Structural uncertainty

• Temporal models

• Decision making

Basic units of knowledge

[Figure labels: entities, properties, relations; attributes]

So what?

• Set of entities and relations between them is determined at BN design time
– structure must be known in advance
– hard to adapt to changes
• BNs for complex domains are large & unstructured, hence very hard to build
• No ability to generalize
– across “similar” individuals
– across related situations

BNs are not suitable for representing complex, structured, flexible domains.

Probabilistic Relational Models

• Combine advantages of predicate logic & BNs:
– natural domain modeling: objects, properties, relations;
– generalization over a variety of situations;
– compact, natural probability models.
• Integrate uncertainty with relational model:
– properties of domain entities can depend on properties of related entities;
– uncertainty over relational structure of domain.

Real-World Case Study

• Example object classes:

– Battalion

– Battery

– Vehicle

– Location

– Weather.

• Example relations:

– At-Location

– Has-Weather

– Sub-battery/In-battalion

– Sub-vehicle/In-battery

Battlefield situation assessment for missile units
• several locations
• many units
• each has detailed model

Scud Battery: Simplified PRM

[Figure: simplified PRM for a Scud battery. Battery labels: Under Fire, At-Location, #(Launcher.status = ok), Next Mission. Launcher labels: Status, Report.]

SCUD Battery Model

Cargo Vehicle Group

Original BN*: SCUD Battery

Disadvantages
• A lot more complex
– must include relevant attributes of related objects
• Hard to transfer information between different BN models

*Built by IET, Inc.

Situation Models

• Complex situations can be described compactly by specifying objects and relations between them
• Class model is instantiated for each object, with probabilistic dependencies induced by relations

[Figure: example situation. Locations: Angel Island, Alcatraz. Battalions: 3rd Scud Battalion, 17th Scud Battalion. Batteries: Scud Battery 1, Scud Battery 2, Scud Battery 3. Launcher: Launcher 1.]

Example reasoning pattern

[Figure: instantiated model for Scud-Battalion-Charlie. Objects: Battery1, Group-TLs, TL1, TL2, Loc. Attributes shown: under_fire, hit, #reported_damaged, damaged, rep_damaged, hide-support (values include heavy, none, good). Probabilities shown: 0.06, 0.44, 0.28, 0.33.]

Inference in PRMs

PRM + situation description induces a BN over attributes

[Figure: situation description (Angel Island, Alcatraz; 3rd Scud Btn, 17th Scud Btn; Scud Bty 1, Scud Bty 2, Scud Bty 3; Launcher 1) and the induced BN with nodes Under Fire, Attack, B1.Launch, B1.Success, B1.L1.Damaged, B1.L1.Report, B1.L2.Damaged, B1.L2.Report, B2.Launch, B2.Success, B2.L1.Damaged, B2.L1.Report, B2.L2.Damaged, B2.L2.Report.]
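As a rough illustration of how a class-level model plus a situation description yields a ground BN, here is a minimal Python sketch. The node names follow the figure (B1.L1.Damaged and so on), but the parent sets are assumptions made up for this example; the slide does not state the actual dependency structure.

```python
def ground_network(situation):
    """Unroll class-level dependency templates into a ground BN.

    situation: dict mapping a battery name to a list of launcher names.
    Returns a dict mapping each ground variable to its list of parents.
    The dependency templates below are hypothetical, chosen only to
    show the unrolling step.
    """
    parents = {"UnderFire": [], "Attack": []}
    for battery, launchers in situation.items():
        # Assumed template: Battery.Launch depends on the global Attack node.
        parents[f"{battery}.Launch"] = ["Attack"]
        damaged_vars = []
        for launcher in launchers:
            damaged = f"{battery}.{launcher}.Damaged"
            # Assumed template: Launcher.Damaged depends on UnderFire.
            parents[damaged] = ["UnderFire"]
            # Assumed template: Launcher.Report depends on Launcher.Damaged.
            parents[f"{battery}.{launcher}.Report"] = [damaged]
            damaged_vars.append(damaged)
        # Assumed template: Battery.Success aggregates over its launchers.
        parents[f"{battery}.Success"] = [f"{battery}.Launch"] + damaged_vars
    return parents


bn = ground_network({"B1": ["L1", "L2"], "B2": ["L1", "L2"]})
for variable, pa in bn.items():
    print(variable, "<-", pa)
```

The same three templates are applied to every battery and launcher, which is the sense in which the class model is written once and instantiated per object.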

Exploit Structure for Inference

• Encapsulation: objects interact in limited ways
– Inference can be encapsulated within objects, with “communication” limited to interfaces
• Reuse: objects from same class have same model
– Inference from one can be reused for others
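The reuse idea can be sketched with plain memoization: a per-class computation is done once and its result shared by every object of that class. The example below computes the distribution of an aggregate like #(Launcher.status = ok) under the simplifying assumption, introduced here only for illustration, that launchers are independent with a common probability of being ok. The function name and the numbers are hypothetical.

```python
from functools import lru_cache
from math import comb

evaluations = {"count": 0}  # counts how often the class-level computation runs


@lru_cache(maxsize=None)
def count_ok_distribution(n_launchers, p_ok):
    """Distribution over the number of ok launchers in one battery.

    Assumes independent launchers with the same p_ok (a simplification).
    Cached, so batteries with the same class-level inputs share the result.
    """
    evaluations["count"] += 1
    return tuple(
        comb(n_launchers, k) * p_ok**k * (1.0 - p_ok) ** (n_launchers - k)
        for k in range(n_launchers + 1)
    )


batteries = {"Battery1": 3, "Battery2": 3, "Battery3": 3}
distributions = {
    name: count_ok_distribution(n, 0.9) for name, n in batteries.items()
}
print(evaluations["count"])  # the class-level computation ran once
```

Real PRM inference caches much richer intermediate results than a binomial table, but the pattern is the same: key the cache on the class and its interface inputs, not on the individual object.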

Effects of exploiting structure

[Figure: running time in seconds (0 to 6000) versus #vehicles of each type / battery (1 to 10), for three methods: flat BN, no reuse, with reuse.]

Extension: Structural Uncertainty

• Uncertainty about model structure:
– Set of objects: is that radar signal from a tank?
– Relations between objects: location of SCUD-Battalion-C
• Task 1: Seamless integration with probabilistic model
– structural variables can depend on other variables.
• Task 2: Efficient inference
– Use approximate inference to simplify model
  • variational methods to summarize multiple potential influences
  • MCMC for traversing possible relationships
– Use structured inference (encapsulation/reuse) on simplified model

Outline

• Probabilistic Relational Models

• Temporal models
– Structured belief-state tracking
– Dynamic PRMs: time, events and actions

• Decision making

Dynamic Bayesian Nets

[Figure: DBN unrolled over time slices t, t+1, t+2, ... with nodes Velocity, Position, Action and observation Observed_pos in each slice.]

$$P(\mathit{State}^{(t+1)} \mid \mathit{State}^{(t)}) = P(A^{(t+1)} \mid A^{(t)})\; P(V^{(t+1)} \mid V^{(t)}, A^{(t)})\; P(L^{(t+1)} \mid L^{(t)}, V^{(t)})$$

(A = Action, V = Velocity, L = Position)

• Compact representation of system dynamics
– discrete, continuous, hybrid
• Generalization of Kalman filters

Tracking System State

• In discrete/hybrid systems, belief state representation is exponential in # of state variables

• In hybrid systems, # of distinct hypotheses grows exponentially over time

Task: Maintain belief state, the distribution over the current state given evidence so far

[Figure: DBN unrolled over time slices t, t+1, t+2, ... with nodes Velocity, Position, Action and observations Observed_pos(t), Observed_pos(t+1).]
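For a small discrete system, maintaining the belief state is the familiar predict-then-update recursion. The sketch below is a minimal exact filter over an explicitly enumerated state space; the two-state model and its numbers are hypothetical and serve only to make the recursion concrete. It is this explicit enumeration that becomes exponential in the number of state variables.

```python
def predict(belief, transition):
    """Push the belief through the dynamics: b'(s2) = sum_s b(s) P(s2 | s)."""
    predicted = {}
    for state, prob in belief.items():
        for next_state, trans_prob in transition[state].items():
            predicted[next_state] = predicted.get(next_state, 0.0) + prob * trans_prob
    return predicted


def update(belief, likelihood):
    """Condition on evidence: b(s) is proportional to b'(s) P(obs | s)."""
    unnormalized = {s: p * likelihood.get(s, 0.0) for s, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state model: a component is either ok or failed.
transition = {
    "ok": {"ok": 0.95, "failed": 0.05},
    "failed": {"failed": 1.0},
}
alarm_likelihood = {"ok": 0.1, "failed": 0.8}  # P(alarm | state), assumed

belief = {"ok": 1.0, "failed": 0.0}
belief = predict(belief, transition)
belief = update(belief, alarm_likelihood)
print(belief)
```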

Approximate Tracking

• Decompose belief state along “subsystem lines”
– Maintain belief state as product of marginals
• In hybrid systems, keep mixture of hypotheses for every subsystem
– Merge hypotheses associated with similar density

[Figure: node labels H, X, D (subscripts garbled in extraction); a marginal with True = 0.7, False = 0.3.]
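The product-of-marginals approximation can be sketched as a projection step: compute each subsystem's marginal from the joint belief, then treat the subsystems as independent. The joint distribution below is made up for illustration, and a practical tracker would keep only the marginals instead of multiplying them back out as done here for checking.

```python
from itertools import product


def marginals(joint, n_vars):
    """Marginal distribution of each variable from a joint over tuples."""
    result = [dict() for _ in range(n_vars)]
    for assignment, prob in joint.items():
        for i, value in enumerate(assignment):
            result[i][value] = result[i].get(value, 0.0) + prob
    return result


def product_of_marginals(margs):
    """Rebuild an approximate joint as the product of the marginals."""
    approx = {}
    for combo in product(*(m.items() for m in margs)):
        prob = 1.0
        for _, value_prob in combo:
            prob *= value_prob
        approx[tuple(value for value, _ in combo)] = prob
    return approx


# Hypothetical joint belief over two binary subsystems.
joint = {
    (True, True): 0.5,
    (True, False): 0.2,
    (False, True): 0.1,
    (False, False): 0.2,
}
margs = marginals(joint, 2)
approx = product_of_marginals(margs)
print(margs)
print(approx)
```

The approximate joint keeps every single-subsystem marginal exact and discards only the correlations between subsystems.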

Case Study: Diagnosis & Tracking for Five-Tank System

• State space per time slice
– eleven-dimensional continuous space
– 227 discrete failure modes

[Figure: five-tank system; observables labeled F1o, F23, F5o.]

The doomsday scenario

[Figure: time series for C12, C45, C23 over steps 0 to 50 (vertical axis 0 to 2). Annotations: measurement errors on F23, F5o; negative drift; bursts.]

Algorithm Performance

[Figure: two panels plotting C12, C45, P5 over steps 0 to 50 (vertical axis 0.4 to 2); one panel is labeled Omniscient Kalman Filter.]

Dynamic PRMs

• Goal: Model complex structured systems
– that evolve over time
– where agents take compound structured actions
& construct effective scalable inference algorithm
• Easy part: Add time relation to PRMs
– Allows notion of current and previous state
– Maintains notions of structured objects and relations
• Challenges:
– Appropriate representation for actions, events
– Modeling changes in domain structure (objects, relations)
– Effective inference that exploits structure

Dynamic PRMs: Event Models

• Events can be triggered by external events
– an agent’s action
or by system dynamics
– e.g., a unit reaches its destination
• Events can influence the system structure
– discrete change in continuous dynamics
  • truck velocity goes to 0 when destination is reached
– modification of relational structure
  • aircraft taking off is no longer on aircraft carrier
– creation / deletion of objects
  • units entering/leaving battlespace

Events: Discrete points where the system undergoes a discontinuous change

Dynamic PRMs: Adding Actions

• Use relational / hierarchical action representation
– class hierarchy for Move action
– an instantiation of a particular action is related to object moving, road taken, origin, destination
• Actions can depend on and influence attributes of related objects
– duration of Move action may depend on road condition, influence status of moving objects
• Actions are like events, can change domain structure
• Complex actions can be composed of simpler ones:
– Effects of complex action derived from that of subactions

Inference in Dynamic Systems

• Main tasks:
– situation monitoring
– prediction
• Goal: Exploit structure as we did in PRMs
• First step: Encapsulation
– Exploit structure of weakly interacting subsystems
– Applied successfully to Dynamic Bayesian Nets

Tracking in Dynamic PRMs

• Use relational structure to guide belief state approximation
– direct dependencies only between related objects
• Deal with dynamic structure:
– relations and even domain objects change over time
– want to adjust our approximation to context
– structural uncertainty critical
• Event-driven tracking
– no reason to use fine-grained model of “boring bits”
– but “fast forward” requires ability to propagate dynamics over variable-length segments

Outline

• Probabilistic Relational Models

• Temporal models

• Decision making
– Planning in factored MDPs
– Planning in relational MDPs

What is a Markov Decision Process?

• An MDP is a controlled dynamic process
• Stochastic transition between states
• Actions affect system dynamics
• Rewards or costs are associated with states
• Objective: Drive process to regions of high reward
– MDP solutions are policies
– Policies assign an action to every state

MDP Policies & Value Functions

Suppose an expert told you the “value” of each state:

V(s1) = 10, V(s2) = 5

[Figure: two actions with stochastic outcomes over s1 and s2. Action 1: probabilities 0.5 and 0.5. Action 2: probabilities 0.7 and 0.3.]

Greedy Policy Construction

Pick action with highest expected future value:

$$\pi_{\mathrm{greedy}(V)}(s) = \arg\max_a \Big[ R(s) + \sum_{s'} P(s' \mid s, a)\, V(s') \Big]$$

The sum is an expectation over next-state values.
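A minimal sketch of greedy action selection, using the expert values V(s1) = 10 and V(s2) = 5 and the outcome probabilities from the preceding slide. Two things are assumptions made for this example: the current state (called s0 here, with reward 0) and the reading that Action 2 reaches s1 with probability 0.7 and s2 with probability 0.3.

```python
def greedy_action(state, actions, reward, transition, value):
    """Return the action maximizing R(s) + sum over s' of P(s' | s, a) V(s')."""

    def backup(action):
        expected_next = sum(
            prob * value[next_state]
            for next_state, prob in transition[(state, action)].items()
        )
        return reward[state] + expected_next

    return max(actions, key=backup)


value = {"s1": 10.0, "s2": 5.0}   # values supplied by the "expert"
reward = {"s0": 0.0}              # hypothetical current state and reward
transition = {
    ("s0", "action1"): {"s1": 0.5, "s2": 0.5},
    ("s0", "action2"): {"s1": 0.7, "s2": 0.3},  # assumed assignment of 0.7 / 0.3
}

best = greedy_action("s0", ["action1", "action2"], reward, transition, value)
print(best)
```

Under these assumptions Action 1 has expected next-state value 0.5 × 10 + 0.5 × 5 = 7.5 and Action 2 has 0.7 × 10 + 0.3 × 5 = 8.5, so the greedy choice is Action 2.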

Bootstrapping: Policy Iteration

Idea: Greedy selection is useful even with suboptimal V

Guess V
Repeat until policy doesn’t change:
  π = greedy(V)
  V = value of acting on π

Guaranteed to find globally optimal policy if V is defined over explicit states, i.e., if the representation of V is exponential in the number of state variables

Exploit Structure with Factored Policy Iteration
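The loop above can be sketched for a small explicit-state MDP as follows. The discount factor, the iterative policy evaluation with a fixed number of sweeps, the iteration cap, and the two-state example are all assumptions added for this illustration; the slides specify none of them.

```python
def q_value(state, action, v, reward, transition, gamma):
    """One-step lookahead: R(s) + gamma * sum over s' of P(s' | s, a) V(s')."""
    return reward[state] + gamma * sum(
        prob * v[nxt] for nxt, prob in transition[(state, action)].items()
    )


def greedy(v, states, actions, reward, transition, gamma):
    """Greedy policy with respect to the value function v."""
    return {
        s: max(actions, key=lambda a: q_value(s, a, v, reward, transition, gamma))
        for s in states
    }


def evaluate(policy, states, reward, transition, gamma, sweeps=500):
    """Approximate the value of acting on a fixed policy by repeated backups."""
    v = {s: 0.0 for s in states}
    for _ in range(sweeps):
        v = {s: q_value(s, policy[s], v, reward, transition, gamma) for s in states}
    return v


def policy_iteration(states, actions, reward, transition, gamma=0.9, max_iters=100):
    v = {s: 0.0 for s in states}  # "Guess V"
    policy = None
    for _ in range(max_iters):
        new_policy = greedy(v, states, actions, reward, transition, gamma)
        if new_policy == policy:  # policy did not change: stop
            break
        policy = new_policy
        v = evaluate(policy, states, reward, transition, gamma)
    return policy, v


# Hypothetical MDP: s1 is rewarding, s2 is not; actions succeed with prob 0.9.
states = ["s1", "s2"]
actions = ["stay", "switch"]
reward = {"s1": 1.0, "s2": 0.0}
transition = {
    ("s1", "stay"): {"s1": 0.9, "s2": 0.1},
    ("s1", "switch"): {"s2": 0.9, "s1": 0.1},
    ("s2", "stay"): {"s2": 0.9, "s1": 0.1},
    ("s2", "switch"): {"s1": 0.9, "s2": 0.1},
}
policy, v = policy_iteration(states, actions, reward, transition)
print(policy)
```

Here V is a table with one entry per state, which is exactly what stops scaling when the state is described by many variables.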

Factored MDPs: DBNs + Rewards

[Figure: two-slice DBN over X, Y, Z (time t to t+1) with reward nodes R1 and R2.]

• Rewards have small sets of parent variables too
• Total reward adds sub-rewards: R = R1 + R2

Linearly Decomposable Value Functions

Approximate high-dimensional value function with combination of lower-dimensional functions

Motivation: Multi-attribute utility theory (Keeney & Raiffa)

Note: Overlapping is allowed!

Decomposable Value Functions

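• Each basis function h_i is the status of some small part(s) of a complex system
– status of a machine
– inventory of a store
– status of a subgoal

Linear combination of restricted-domain functions:

$$\tilde{V}(s) = \sum_i w_i\, h_i(s), \qquad \text{i.e. } \tilde{V} = A w$$

where column i of A holds h_i evaluated at every state.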
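A minimal sketch of such a value function: each basis function declares the small set of state variables it reads (its scope), and the value of a state is the weighted sum of the basis functions. The variable names, basis functions, and weights are hypothetical; note that the second and third scopes overlap, which the previous slide explicitly allows.

```python
def make_value_function(basis, weights):
    """Build V(s) = sum_i w_i * h_i(s) from restricted-scope basis functions.

    basis: list of (scope, fn) pairs, where scope is a tuple of variable
    names and fn takes the values of exactly those variables.
    """

    def value(state):
        return sum(
            w * fn(*(state[var] for var in scope))
            for (scope, fn), w in zip(basis, weights)
        )

    return value


# Hypothetical basis functions over a factory-style state.
basis = [
    ((), lambda: 1.0),                                      # constant term
    (("machine1",), lambda m: 1.0 if m == "ok" else 0.0),   # status of a machine
    (
        ("machine1", "machine2"),                           # overlapping scope
        lambda a, b: 1.0 if a == "ok" and b == "ok" else 0.0,
    ),
]
weights = [2.0, 3.0, 1.5]

v = make_value_function(basis, weights)
state = {"machine1": "ok", "machine2": "failed", "inventory": 7}
print(v(state))
```

Only k weights are stored, however many joint states the variables can take.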

Exploiting Structure

Key operation: backprojection of a basis function through a DBN transition

[Figure: two-slice DBN over X, Y, Z; a basis function h1 = f(z) and its backprojection P h1 = f(y, z), each drawn as a small tree over the variables.]

Structure allows us to consider operations over small subsets of variables, not the entire state space.
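Backprojection can be sketched directly: if h1 depends only on the next-slice value of Z, and Z at time t+1 has parents Y and Z at time t, then the expected value of h1 after one step is a function of (y, z) alone. The conditional probabilities below are made up for illustration, and the parent set {Y, Z} is taken from the figure's f(y, z).

```python
from itertools import product


def backproject(h, cpd, parent_assignments):
    """Compute g(pa) = sum over z' of P(z' | pa) * h(z') for each parent assignment.

    h: dict mapping a next-slice value z' to a number.
    cpd: dict mapping a parent assignment to a dict {z': probability}.
    """
    return {
        pa: sum(prob * h[z_next] for z_next, prob in cpd[pa].items())
        for pa in parent_assignments
    }


# Basis function over the next-slice value of Z (hypothetical indicator).
h1 = {True: 1.0, False: 0.0}

# Hypothetical CPD P(Z' | Y, Z); X is not a parent and never appears.
cpd_z = {
    (True, True): {True: 0.9, False: 0.1},
    (True, False): {True: 0.6, False: 0.4},
    (False, True): {True: 0.5, False: 0.5},
    (False, False): {True: 0.1, False: 0.9},
}

parent_assignments = list(product([True, False], repeat=2))
g = backproject(h1, cpd_z, parent_assignments)
print(g)
```

The result has four entries, one per (y, z) pair, instead of one per joint state of X, Y, Z.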

Policy Format

Factored value functions, compact action effect descriptions

[Figure: trees over x, y, z for Action 1 and Action 2, with leaf values +8, +12, +11, +1, +4, +7.]

Sorted result values form a decision list:

If … then action 1 else if … then action 2 else if … then action 1

[Conditions over x, y, z are not recoverable from the extraction.]

Factored Policy Iteration: Summary

Guess V
π = greedy(V)
V = value of acting on π

• Structure induces decision-list policy
• Key operations isomorphic to BN inference

• Time per iteration reduced from O((2^n)^3) to O(C_b k^3)
• C_b = cost of Bayes net inference (function of structure)
• k = number of basis functions (k << 2^n)

Run Times

Note: Nearly optimal policy found in all cases ( 6).

[Figure: CPU seconds and number of states (vertical axis 0 to 70000) versus number of state variables (4 to 16); series: States, Seconds, 3n^3.]

Planning in Relational MDPs

• Replace DBN transition model with dynamic PRM
• Generalize factored policy iteration
– Define basis functions via relational formulas:
    […] x: if tank(x) […] closeto(x, base) then 10, else -5
– Replace BN inference with PRM inference as key step
• Exploit hierarchical structure of complex actions by encapsulating decision making along hierarchy
• Potential benefits:
– Tractable approximate planning in relational domains
– Unification of classical and stochastic planning

Conclusions: Past & Present

• PRMs compactly represent complex systems with multiple interacting objects:
– coherent (probabilistic) semantics;
– structured representation: modularity & reuse.
• Scalable inference that exploits structure
• Tracking algorithms for DBNs that exploit system decomposition
• Planning algorithms in MDPs that exploit structure of system and of value functions

Theme: Representation & inference scale up, if we exploit structure

Conclusions: Future

• Better inference for densely connected PRMs
• Extending PRMs with time, events, actions
• Exploit structure for inference in dynamic PRMs:
– system decomposition into subsystems
– relational context
– varying time granularity
• Planning in dynamic PRMs:
– extend factored policy iteration to PRMs
– exploit hierarchical action decomposition

Acknowledgements

• Students & postdocs (grouped on the slide under the labels PhD students, Postdocs, Ugrad)
– Nir Friedman (Hebrew U.)
– Dirk Ormoneit
– Ron Parr (Duke)
– Xavier Boyen
– Urszula Chajewska
– Lise Getoor
– Carlos Guestrin
– Uri Lerner
– Uri Nodelman
– Avi Pfeffer (Harvard)
– Eran Segal
– Benjamin Taskar
– Simon Tong
– Brian Milch (Berkeley)
– Ken Takusagawa (MIT)

• Support:
– PECASE Award via ONR YIP
– DARPA’s HPKB Program
– MURI Program “Integrated Approach to Intelligent Systems”
– Sloan Faculty Fellowship
– DARPA’s IA Program under subcontract to SRI International
– DARPA’s DMIF Program under subcontract to IET Inc.
– ONR grant

http://robotics.stanford.edu/~koller/