Dynamic Bayesian Networks CSE 473


Page 1:

Dynamic Bayesian Networks

CSE 473

Page 2:

473 Topics

Agency

Problem Spaces

Search

Knowledge Representation

Reinforcement Learning

Inference

Planning

Learning

Logic

Probability

Page 3:

Last Time

• Basic notions
• Bayesian networks
• Statistical learning
  – Parameter learning (MAP, ML, Bayesian learning)
  – Naïve Bayes
  – Structure learning
  – Expectation Maximization (EM)
• Dynamic Bayesian networks (DBNs)
• Markov decision processes (MDPs)

Page 4:

Generative Planning

Input
• Description of (the initial state of) the world (in some KR)
• Description of the goal (in some KR)
• Description of the available actions (in some KR)

Output
• A controller, e.g.:
  – a sequence of actions
  – a plan with loops and conditionals
  – a policy f: states → actions
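As a rough illustration (my sketch, not from the slides), the three controller forms can be written down as Python values; the state and action names here are hypothetical.

from typing import Callable, Dict, List, Tuple

State = str   # hypothetical: states named by strings
Action = str  # hypothetical: actions named by strings

# 1. Sequence of actions: a fixed, open-loop plan
plan: List[Action] = ["pick_up_A", "move", "deliver_coffee"]

# 2. Plan with conditionals: branch on a boolean observation
ConditionalPlan = Tuple[Action, Dict[bool, tuple]]
conditional: ConditionalPlan = (
    "move",
    {True: ("deliver_coffee", {}), False: ("buy_coffee", {})},
)

# 3. Policy: a closed-loop mapping f: states -> actions
policy: Callable[[State], Action] = lambda s: "move" if s == "has_coffee" else "buy_coffee"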

Page 5:

Simplifying Assumptions

[Diagram: an agent embedded in its Environment, receiving Percepts and emitting Actions, asking "What action next?"]

• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Perfect vs. Noisy

Page 6:

[Figure: "Classical Planning" assumes a static, deterministic, fully observable, instantaneous, propositional world. Relaxing each assumption leads to a different technique: Dynamic → Replanning / Situated Plans; Durative → Temporal Reasoning; Continuous → Numeric Constraint Reasoning (LP/ILP); Stochastic → Contingent/Conformant Plans, Interleaved Execution, MDP Policies, POMDP Policies; Partially Observable → Contingent/Conformant Plans, Interleaved Execution, Semi-MDP Policies.]

Page 7:

Uncertainty

• Disjunctive: the next state could be any one of a set of states.

• Probabilistic: the next state is given by a probability distribution over the set of states.

Which is a more general model?

Page 8:

STRIPS Action Schemata

• Instead of defining ground actions pickup-A, pickup-B, …, define a schema:

(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
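To make grounding concrete, here is a minimal sketch (mine, not the course’s code) of instantiating the pick-up schema over a set of blocks; the Python representation is hypothetical.

from typing import FrozenSet, List, NamedTuple

class GroundAction(NamedTuple):
    name: str
    precondition: FrozenSet[str]  # literals that must hold before execution
    add: FrozenSet[str]           # literals the action makes true
    delete: FrozenSet[str]        # literals the action makes false

def ground_pick_up(blocks: List[str]) -> List[GroundAction]:
    """Instantiate the pick-up schema once per block (binding ?ob1 := b)."""
    return [
        GroundAction(
            name=f"pick-up-{b}",
            precondition=frozenset({f"clear_{b}", f"on-table_{b}", "arm-empty"}),
            add=frozenset({f"holding_{b}"}),
            delete=frozenset({f"clear_{b}", f"on-table_{b}", "arm-empty"}),
        )
        for b in blocks
    ]

print(ground_pick_up(["A", "B"]))  # yields pick-up-A and pick-up-B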

Page 9:

Nondeterministic “STRIPS”?

[Figure: action pick_up_A with preconditions arm_empty, clear_A, and on_table_A; a "?" marks a nondeterministic branch to the effects +arm_empty, −on_table_A, −holding_A.]

Page 10:

Probabilistic STRIPS?

[Figure: the same pick_up_A diagram, but the branch now carries a probability, P < 0.9, to the effects +arm_empty, −on_table_A, −holding_A.]

But what does this all mean anyway?

Page 11:

Defn: Markov Model

Q: set of states

init prob distribution

A: transition probability distribution

Page 12:

Probability Distribution, A

• Forward causality: the probability of s_t does not depend directly on the values of future states.

• The probability of the new state could depend on the entire history of states visited:
  Pr(s_t | s_{t-1}, s_{t-2}, …, s_0)

• Markovian assumption:
  Pr(s_t | s_{t-1}, s_{t-2}, …, s_0) = Pr(s_t | s_{t-1})

• Stationary model assumption:
  Pr(s_t | s_{t-1}) = Pr(s_k | s_{k-1}) for all k.
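A minimal sketch (mine) of what these two assumptions buy: simulating the chain never needs the history, only the current state and one fixed transition table. The states and probabilities here are made up.

import random

# Hypothetical 3-state chain; each row is Pr(next | current) and sums to 1.
TRANSITIONS = {
    "s0": {"s0": 0.1, "s1": 0.6, "s2": 0.3},
    "s1": {"s0": 0.4, "s1": 0.4, "s2": 0.2},
    "s2": {"s0": 0.0, "s1": 0.5, "s2": 0.5},
}

def step(state: str) -> str:
    """Markovian: depends only on `state`; stationary: same table at every t."""
    successors = list(TRANSITIONS[state])
    weights = list(TRANSITIONS[state].values())
    return random.choices(successors, weights=weights)[0]

state = "s0"  # in general, drawn from the init distribution
trajectory = [state]
for _ in range(10):
    state = step(state)  # Pr(s_t | s_{t-1}); no history needed
    trajectory.append(state)
print(trajectory)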

Page 13:

Representing A

Q: set of states
init prob distribution
A: transition probabilities. How can we represent these?

[Figure: the states s0 … s6 drawn as a graph, next to a matrix indexed by s0 … s6 on both axes; the entry p12 is the probability of transitioning from s1 to s2.]

∑ = ? Each row of the matrix must sum to 1: from any state, some next state always results.
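In code, the matrix view and the row-sum check look like this (a sketch with a hypothetical 3-state chain, not the robot domain):

import numpy as np

# A[i, j] = Pr(next = s_j | current = s_i), for a hypothetical 3-state chain
A = np.array([
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
    [0.0, 0.5, 0.5],
])

assert np.allclose(A.sum(axis=1), 1.0)  # every row sums to 1

init = np.array([1.0, 0.0, 0.0])  # start in s0 with certainty
after_two = init @ A @ A          # push the state distribution forward 2 steps
print(after_two)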

Page 14:

Factoring Q

• Represent Q simply as a set of states?

• Is there internal structure?
  – Consider a robot domain.
  – What is the state space?

[Figure: the states s0 … s6 drawn as an unstructured graph.]

Page 15:

A Factored domain

• Variables:
  – has_user_coffee (huc)
  – has_robot_coffee (hrc)
  – robot_is_wet (w)
  – has_robot_umbrella (u)
  – raining (r)
  – robot_in_office (o)

• How many states? 2^6 = 64
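To make the count concrete (my sketch), enumerating every assignment to the six boolean variables:

from itertools import product

VARS = ["huc", "hrc", "w", "u", "r", "o"]

states = [dict(zip(VARS, vals)) for vals in product([False, True], repeat=len(VARS))]
print(len(states))  # 2**6 == 64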

Page 16:

Representing Compactly

Q: set of states
init prob distribution. How can we represent this efficiently?

[Figure: the states s0 … s6 on the left; on the right, a small Bayes net over r, u, hrc, and w.]

With a Bayes net (of course!)
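As a minimal sketch (assuming a made-up factorization, not necessarily the one in the figure), a Bayes net stores the initial distribution as a product of small factors instead of one number per state:

# Hypothetical factorization: Pr(r, u, hrc, w) = Pr(r) Pr(u | r) Pr(hrc) Pr(w | r, u)
P_r = {True: 0.3, False: 0.7}            # Pr(r)
P_u_given_r = {True: 0.9, False: 0.1}    # Pr(u = True | r)
P_hrc = {True: 0.5, False: 0.5}          # Pr(hrc)
P_w_given_ru = {(True, True): 0.1, (True, False): 0.8,
                (False, True): 0.0, (False, False): 0.0}  # Pr(w = True | r, u)

def init_prob(r: bool, u: bool, hrc: bool, w: bool) -> float:
    """Probability of one initial state, multiplied out of the factors."""
    pu = P_u_given_r[r] if u else 1 - P_u_given_r[r]
    pw = P_w_given_ru[(r, u)] if w else 1 - P_w_given_ru[(r, u)]
    return P_r[r] * pu * P_hrc[hrc] * pw

print(init_prob(True, True, False, False))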

Page 17:

Representing A Compactly

Q: set of states
init prob distribution
A: transition probabilities

[Figure: the transition matrix again, but now indexed by all 64 factored states on each axis; p12 is one entry.]

How big is the matrix version of A? 64 × 64 = 4096 entries.

Page 18:

Dynamic Bayesian Network

[2-layer DBN: the variables huc, hrc, w, u, r, and o at time T, each with a copy at time T+1; arcs run from time-T parents to time-T+1 children.]

Values required per time-T+1 variable: huc = 8, hrc = 4, w = 16, u = 4, r = 2, o = 2.

Total values required to represent the transition probability table = 8 + 4 + 16 + 4 + 2 + 2 = 36, vs. the 4096 required for a complete state-to-state probability table.
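The per-variable counts follow from each variable’s parents in the 2-layer net: a binary variable with k binary parents needs one number, Pr(var = True | parents), per parent assignment, i.e. 2^k values. A sketch (with parent counts guessed to reproduce the slide’s numbers; the true parent sets are the arcs in the figure):

# Hypothetical parent counts, chosen so that 2**k matches the slide's numbers
parents = {"huc": 3, "hrc": 2, "w": 4, "u": 2, "r": 1, "o": 1}

cpt_sizes = {var: 2 ** k for var, k in parents.items()}
print(cpt_sizes)                # {'huc': 8, 'hrc': 4, 'w': 16, 'u': 4, 'r': 2, 'o': 2}
print(sum(cpt_sizes.values()))  # 36 values in the factored representation
print((2 ** 6) ** 2)            # 4096 in the flat 64 x 64 matrix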

Page 19:

Dynamic Bayesian Network

[The same 2-layer DBN over huc, hrc, w, u, r, and o, from time T to T+1.]

Also known as a Factored Markov Model

Defined formally as:
• a set of random variables
• a BN for the initial state
• a 2-layer DBN for transitions

Page 20:

This is Really a “Schema”

[The 2-layer DBN (T → T+1) acts as a template: "unrolling" it copies the variables once per time step.]

Page 21:

Semantics: A DBN = A Normal Bayesian Network

[Figure: the DBN unrolled into an ordinary BN: the variables huc, hrc, w, u, r, and o copied at time slices 0, 1, 2, and 3, with the 2-layer arcs repeated between consecutive slices.]
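Unrolling is pure bookkeeping: copy each variable once per slice and repeat the 2-layer arcs between consecutive slices. A sketch (mine; the template edge list is hypothetical, not the figure’s exact arcs):

from typing import List, Tuple

# Hypothetical 2-layer arcs: (parent at time t, child at time t+1)
TEMPLATE_EDGES = [("r", "r"), ("r", "w"), ("u", "w"),
                  ("o", "o"), ("hrc", "hrc"), ("hrc", "huc")]

def unroll(horizon: int) -> List[Tuple[str, str]]:
    """Instantiate the 2-layer DBN schema as a flat BN over `horizon` steps."""
    edges = []
    for t in range(horizon):
        for parent, child in TEMPLATE_EDGES:
            edges.append((f"{parent}_{t}", f"{child}_{t + 1}"))
    return edges

print(unroll(3))  # an ordinary BN over time slices 0, 1, 2, 3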

Page 22:

Multiple Actions

• Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)

• Actions: buy_coffee, deliver_coffee, get_umbrella, move

• We need an extra variable
• (Alternatively, a separate 2-layer DBN per action)

Page 23:

Actions in DBN

[The 2-layer DBN with an added action variable a at time T, feeding into the time-T+1 variables.]

Page 24:

What can we do with DBNs?

• State estimation
• Planning
  – if you add rewards (next class)

Page 25:

State Estimation

[Figure: the state variables Huc, Hrc, U, W, R, O; the agent executes Move, then BuyC, and must estimate the resulting unknown ("?") states.]

Page 26:

Noisy Observations

[The 2-layer DBN (T → T+1) over huc, hrc, w, u, r, and o (the true state variables, which are unobserved), extended with a noisy sensor reading So of o, plus an action variable a that is known with certainty.]

Sensor model:
  Pr(So = T | o = T) = 0.9
  Pr(So = T | o = F) = 0.2
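In code (a sketch; the 0.9 / 0.2 values are the slide’s sensor table), the observation model is just the likelihood of a reading given each value of the hidden variable; this is the quantity the particle filter below weights by:

# Pr(So = True | o), from the sensor table above
P_SO_TRUE_GIVEN_O = {True: 0.9, False: 0.2}

def obs_likelihood(so_reading: bool, o: bool) -> float:
    """Likelihood of the noisy reading So given the true (hidden) value of o."""
    p_true = P_SO_TRUE_GIVEN_O[o]
    return p_true if so_reading else 1 - p_true

print(obs_likelihood(True, True), obs_likelihood(True, False))  # 0.9 0.2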

Page 27:

State Estimation

[Figure: as before, the agent executes Move, then BuyC, and estimates the unknown ("?") states over Huc, Hrc, U, W, R, O, now given the noisy readings So = T, …]

Page 28:

Maybe we don’t know the action!

[The same 2-layer DBN with the noisy sensor So (Pr(So = T | o = T) = 0.9, Pr(So = T | o = F) = 0.2), but now the action variable a is unobserved too (e.g. it is another agent’s action).]

Page 29:

State Estimation

[Figure: now both the states and the actions are unknown ("? ?") over Huc, Hrc, U, W, R, O, given the noisy readings So = T, …]

Page 30:

Inference

• Expand the DBN (schema) into a BN
  – Use inference as shown last week
  – Problem? The unrolled network grows with every time step, so exact inference quickly becomes intractable.

• Particle Filters

Page 31:

Particle Filters

• Create 1000s of “particles”
  – Each one has a “copy” of each random variable
  – These have concrete values
  – The distribution is approximated by the whole set

• An approximate technique, but fast!

• Simulate
  – At each time step, iterate per particle:
    • Use the 2-layer DBN + a random number generator to predict the next value for each random variable
  – Resample!

Page 32:

Resampling

• Simulate
  – At each time step, iterate per particle:
    • Use the 2-layer DBN + a random number generator to predict the next value for each random variable
  – Resample:
    • Given the observations (if any) at the new time point,
    • calculate the likelihood of each particle, and
    • create a new population of particles by sampling from the old population, proportionally to likelihood (a sketch follows).
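Putting the two slides together, a minimal self-contained sketch (mine, not the course’s code) of one particle-filter step for a single hidden boolean variable o, using the 0.9 / 0.2 sensor model from above; the transition probabilities are made up:

import random

N = 1000  # number of particles

P_O_NEXT = {True: 0.8, False: 0.3}  # hypothetical transition model: Pr(o' = True | o)
P_SO = {True: 0.9, False: 0.2}      # sensor model from the slides: Pr(So = True | o)

def particle_filter_step(particles, so_reading):
    # 1. Simulate: push each particle through the transition model.
    predicted = [random.random() < P_O_NEXT[p] for p in particles]
    # 2. Weight: likelihood of the observation for each particle.
    weights = [P_SO[p] if so_reading else 1 - P_SO[p] for p in predicted]
    # 3. Resample: draw a new population proportionally to the weights.
    return random.choices(predicted, weights=weights, k=len(predicted))

particles = [random.random() < 0.5 for _ in range(N)]  # initial belief: uniform over o
for reading in [True, True, False]:                    # a made-up observation stream
    particles = particle_filter_step(particles, reading)
    print("Pr(o = True) ~", sum(particles) / len(particles))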