Connecting Learning and Logic
Eyal Amir, Cambridge Present., May 2006

Eyal Amir
U. of Illinois, Urbana-Champaign
Joint work with: Dafna Shahaf, Allen Chang

Problem: Learn Actions’ Effects
• Given: a sequence of observations over time
  – Example: Action a was executed
  – Example: State feature f has value T
• Want: an estimate of actions’ effect model
  – Example: a is executable if the state satisfies some property
  – Example: under condition _, a has effect _

Example: Light Switch
Time  Action  Observe (after action)
              Posn.  Bulb  Switch
0             E            ~up
1     go-W    ~E     ~on
2     sw-up   ~E     ~on           FAIL
3     go-E    E            ~up
4     sw-up   E            up
5     go-W    ~E     on
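
The trace above can be written down as plain data. The following is a minimal sketch, assuming a hypothetical `Step` record (not from the talk) in which unobserved features are simply absent from the observation. The readings of the feature names (E = agent is in the east room, on = bulb is lit, up = switch is up) are also assumptions based on the column headers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One row of the light-switch trace (hypothetical record type)."""
    action: Optional[str]                       # None for the initial observation
    observed: Dict[str, bool] = field(default_factory=dict)
    failed: bool = False                        # True when the row is marked FAIL


# Assumed readings: E = agent in east room, on = bulb lit, up = switch up.
TRACE: List[Step] = [
    Step(None,    {"E": True,  "up": False}),
    Step("go-W",  {"E": False, "on": False}),
    Step("sw-up", {"E": False, "on": False}, failed=True),
    Step("go-E",  {"E": True,  "up": False}),
    Step("sw-up", {"E": True,  "up": True}),
    Step("go-W",  {"E": False, "on": True}),
]


def never_fully_observed(trace: List[Step], features: List[str]) -> bool:
    """Return True if no single step observes every feature at once."""
    return all(any(f not in step.observed for f in features) for step in trace)
```

In this trace no step observes position, bulb, and switch together, which is the partial observability the talk is concerned with.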

Example: Light Switch
[Figure: two world states, each drawn as a west room and an east room. State 1: ~up ∧ ~on ∧ E. State 2: up ∧ on ∧ E.]
• Flipping the switch changes world state
• We do not observe the state fully

Motivation: Exploration Agents
• Exploring partially observable domains
  – Interfaces to new software
  – Game-playing/companion agents
  – Robots exploring buildings, cities, planets
  – Agents acting in the WWW
• Difficulties:
  – No knowledge of actions’ effects a priori
  – Many features
  – Partially observable domain

Rest of This Talk
1. Actions in partially observed domains
2. Efficient learning algorithms
3. Related Work & Conclusions
4. [Theory behind Algorithms]

Learning Transition Models
• Learning: Update knowledge of the transition relation and state of the world
[Figure: actions a1 to a4 move the world through states s1 to s4 according to the transition relation, while knowledge states k1 to k4 track the transition knowledge.]

Action Model: <State,Transition> Set
[Figure: a set of <state, transition> pairs, with states numbered 1 to 3 and transition relations T1 to T3.]
Problem: n world features → 2^(2^n) transitions
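
To get a feel for the growth quoted above, the expression 2^(2^n) can be evaluated for a few values of n. This is a small illustrative sketch; the helper name is made up here, and it reports the number of decimal digits rather than the number itself.

```python
import math


def digits_of_double_exponential(n: int) -> int:
    """Number of decimal digits of 2^(2^n), without building the integer."""
    return math.floor((2 ** n) * math.log10(2)) + 1


# 19 is the smallest fluent count that appears in the talk's experiments.
for n in (3, 5, 10, 19):
    print(n, digits_of_double_exponential(n))
```

Even at n = 19 the count has well over a hundred thousand decimal digits, which is why an explicit set of pairs is not a usable representation.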

Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

Compact Encoding (Sometimes)
• Transition Belief State = a logical formula (transition relation and state)
• Observation = logical state formulae
• Actions = propositional symbols assert effect rules
  – “sw-up causes on ∧ up if E”
  – “go-W keeps up” (= “go-W causes up if up” …)
  – Prop symbol: go-W^≈_up, sw-up^on_E, sw-up^up_E

Updating the Status of “Locked”
[Figure, built up over three slides (Time 0, Time t, Time t+1): a DAG over the leaves tr1, tr2 and init_locked. The rules shown are “PressB causes ¬locked if locked” and “PressB causes locked if ¬locked”. The node expl(0) is present at time 0; expl(t) and then expl(t+1) are added on top of it, connected through ¬ nodes.]
At time t+1 the explanation node expl(t+1) covers two cases:
  – “locked” holds because PressB caused it
  – “locked” holds because PressB did not change it

Algorithm: Update of a DAG
1. Given: action a, observation o, transition-belief formula φ_t
2. for each fluent f,
   a. kb := kb ∧ logic formula “a is executable”
   b. expl'_f := logical formula for the possible explanations for f’s value after action a
   c. replace every fluent g in expl'_f with a pointer to expl_g
   d. update expl_f := expl'_f
3. φ_{t+1} is result of 2 together with o
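
The steps above can be sketched for the “Locked” example. This is a simplified illustration under several assumptions, not the talk's implementation: there is a single fluent (`locked`) and a single action (PressB), the executability step 2a is omitted, the binding of the names `tr1` and `tr2` to the two rules is assumed, and models are found by brute-force enumeration rather than a DAG-DPLL SAT solver.

```python
from itertools import product


# Formula nodes are nested tuples. A new explanation refers to the previous
# one by reference, so each step adds a constant number of new nodes.
def var(name): return ("var", name)
def neg(x): return ("not", x)
def conj(a, b): return ("and", a, b)
def disj(a, b): return ("or", a, b)


TR1 = var("tr1")  # assumed to stand for "PressB causes ¬locked if locked"
TR2 = var("tr2")  # assumed to stand for "PressB causes locked if ¬locked"


def evaluate(node, assignment, cache):
    """Evaluate a DAG node, memoising shared sub-DAGs by object identity."""
    key = id(node)
    if key not in cache:
        kind = node[0]
        if kind == "var":
            cache[key] = assignment[node[1]]
        elif kind == "not":
            cache[key] = not evaluate(node[1], assignment, cache)
        elif kind == "and":
            cache[key] = (evaluate(node[1], assignment, cache)
                          and evaluate(node[2], assignment, cache))
        else:  # "or"
            cache[key] = (evaluate(node[1], assignment, cache)
                          or evaluate(node[2], assignment, cache))
    return cache[key]


class LockedDag:
    def __init__(self):
        self.expl = var("init_locked")  # expl(0)
        self.kb = []                    # observations, stored as DAG nodes

    def press_b(self, observed_locked=None):
        old = self.expl
        caused = conj(TR2, neg(old))         # PressB caused "locked"
        unchanged = conj(old, neg(TR1))      # PressB did not change "locked"
        self.expl = disj(caused, unchanged)  # expl(t+1) points at expl(t)
        if observed_locked is not None:
            self.kb.append(self.expl if observed_locked else neg(self.expl))

    def models(self):
        """Brute-force all assignments consistent with the observations."""
        names = ("init_locked", "tr1", "tr2")
        found = []
        for values in product((False, True), repeat=len(names)):
            assignment = dict(zip(names, values))
            cache = {}
            if all(evaluate(c, assignment, cache) for c in self.kb):
                found.append(assignment)
        return found
```

For example, pressing the button twice and observing `locked` and then `¬locked` (`press_b(True)` followed by `press_b(False)`) leaves a single consistent model: `init_locked` false, with both `tr1` and `tr2` true.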

Fast Update: DAG Action Model
• DAG-update algorithm takes constant time (using hash table) to update formula
• Algorithm is exact
• Result DAG has size O(T·n^k + |φ_0|)
  – T steps, n features, k features in action preconditions
  – Still only n features/variables
• Use φ_t with a DAG-DPLL SAT-solver

Experiments: DAG Update
[Figure: “DAG SLAF: Time for Step”. x-axis: Number of Steps (0 to 5000); y-axis: Time for Step (msec), 0 to 1.2; series: 19 Fluents, 55 Fluents, 109 Fluents.]

Experiments: DAG Update
[Figure: “DAG SLAF: Formula Size”. x-axis: Number of Steps (0 to 5000); y-axis: Formula Size (nodes/10^4), 0 to 1000; series: 19 Fluents, 55 Fluents, 109 Fluents.]

Experiments: DAG Queries
[Figure: “Inference on the DAG”. x-axis: Number of Steps (0 to 4000); y-axis: Time (sec/1000), 0 to 2000; series: Find a Model, False Rule, True Rule, False Fluent, True Fluent.]

Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

Distribution for Some Actions
Project[a](φ ∨ ψ) ≡ Project[a](φ) ∨ Project[a](ψ)
Project[a](φ ∧ ψ) ≡ Project[a](φ) ∧ Project[a](ψ)
Project[a](¬φ) ≡ ¬Project[a](φ) ∧ Project[a](TRUE)
• Compute update for literals in the formula separately, and combine the results
• Known Success/Failure
• 1:1 Actions

Actions that map states 1:1
• Reason for distribution over ∧
[Figure: Project[a](φ ∧ ψ) compared with Project[a](φ) ∧ Project[a](ψ), once for a 1:1 action and once for a non-1:1 action.]
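
The role of the 1:1 condition can be seen on plain sets of states. The sketch below is a simplification of the talk's setting: it fixes one known deterministic action (a dict from state to state) and ignores the transition-relation part of the belief state. A formula is identified with the set of states that satisfy it, so conjunction becomes intersection. All names are illustrative.

```python
def project(action, states):
    """Image of a set of states under a deterministic action (dict state -> state)."""
    return {action[s] for s in states}


ONE_TO_ONE = {0: 1, 1: 2, 2: 0}      # a permutation of three states
NOT_ONE_TO_ONE = {0: 0, 1: 0, 2: 2}  # states 0 and 1 collapse into 0

PHI = {0, 2}  # states satisfying φ
PSI = {1, 2}  # states satisfying ψ


def distributes_over_and(action):
    """Does projecting φ ∧ ψ equal the conjunction of the projections?"""
    return project(action, PHI & PSI) == project(action, PHI) & project(action, PSI)


def projection_is_contained(action):
    """The projection of φ ∧ ψ is always inside the conjunction of projections."""
    return project(action, PHI & PSI) <= project(action, PHI) & project(action, PSI)
```

With the permutation the two sides coincide. With the collapsing action, projecting the parts separately also admits state 0, which neither side of the conjunction reaches jointly, so the separate update is a strict over-approximation.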

Algorithm: Factored Learning
• Given: action a, observation o, transition-belief formula φ_t
1. Precompute update for every literal
2. Decompose φ_t recursively, update every literal separately, and combine the results
3. Conjoin the result of 2. with o, producing φ_{t+1}
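
The recursion in steps 1 to 3 can be sketched as follows. This shows only the shape of the decomposition, under assumptions: formulas are nested tuples in negation normal form, and the per-literal table `literal_update` is a placeholder supplied by the caller, since its actual contents depend on the action definitions and are not spelled out on this slide.

```python
def factored_update(formula, literal_update):
    """Push the update through "and"/"or" nodes down to the literals.

    Formulas are nested tuples in negation normal form:
    ("lit", name), ("and", f1, f2, ...) or ("or", f1, f2, ...).
    """
    kind = formula[0]
    if kind == "lit":
        return literal_update[formula[1]]  # step 1: precomputed per literal
    parts = tuple(factored_update(sub, literal_update) for sub in formula[1:])
    return (kind,) + parts                 # step 2: recombine the results


def factored_step(formula, literal_update, observation):
    """Step 3: conjoin the updated formula with the observation."""
    return ("and", factored_update(formula, literal_update), observation)


# Placeholder table: each literal maps to an opaque "updated" formula.
TABLE = {"E": ("lit", "E'"), "~up": ("lit", "~up'")}
PHI_T = ("and", ("lit", "E"), ("lit", "~up"))
PHI_NEXT = factored_step(PHI_T, TABLE, ("lit", "o"))
```

Each node of the formula is visited once, so one step is linear in the size of the formula, given the precomputed table.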

Fast Update of Action Model
• Factored Learning algorithm takes time O(|φ_t|) to update formula
• Algorithm is exact when
  – We know that actions are 1:1 mappings between states
  – Actions’ effects are always the same
• Otherwise, approximate result: includes exact action model, but also others
• Resulting representation is flat (CNF)

Compact Flat Representation: How?
• Keep some property invariant, e.g.,
  – k-CNF (CNF with k literals per clause)
  – #clauses bounded
• Factored Learning: compact repn. if
  – We know if action succeeded, or
  – Action failure leaves affected propositions in a specified nondeterministic state, or
  – Approximate: We discard large clauses (allows more states)
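
The last point, that discarding large clauses allows more states, can be checked on a toy CNF. This is an illustrative sketch with made-up names; clauses are sets of signed integers in the DIMACS style, and models are enumerated by brute force.

```python
from itertools import product


def satisfies(assignment, cnf):
    """assignment maps variable index -> bool; literals are signed ints."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)


def models(cnf, n_vars):
    """All satisfying assignments, as tuples of booleans for variables 1..n_vars."""
    result = set()
    for values in product((False, True), repeat=n_vars):
        assignment = dict(enumerate(values, start=1))
        if satisfies(assignment, cnf):
            result.add(values)
    return result


def discard_large_clauses(cnf, k):
    """Keep only clauses with at most k literals (a weaker formula)."""
    return [clause for clause in cnf if len(clause) <= k]


# (x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ x2)
CNF = [frozenset({1, 2, 3}), frozenset({-1, 2})]
EXACT = models(CNF, 3)
APPROX = models(discard_large_clauses(CNF, 2), 3)
```

Here the exact formula has 5 models and the truncated one has 6: every exact model survives, and the all-false assignment is admitted in addition.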

Compact Representation in CNF
• Action affects and depends on ≤k features ⇒ |φ_{t+1}| ≤ |φ_t|·n^(k(k+1))
• Actions always have same effect ⇒ |φ_{t+1}| ≤ O(t·n)
  – If also every feature observed every ≤k steps ⇒ |φ_{t+1}| ≤ O(n^(k+1))
  – If (instead) the number of actions ≤k ⇒ |φ_{t+1}| ≤ O(n·2^(k log k))

Experiments: Factored Learning
[Figure: “Time per Step”. x-axis: number of steps; y-axis: time per step (millisec), 0 to 10.]

Summary
• Learning of effects and preconditions of actions in partially observable domains
• Showed in this talk:
  – Exact DAG update for any action
  – Exact CNF update, if actions 1:1 or w/o conditional effects
  – Can update model efficiently without increase in #variables in belief state
  – Compact representation
• Adventure games, virtual worlds, robots

Innovation in this Research
• First scalable learning algorithm for partially observable dynamic domains
• Algorithm (DAG)
  – Always exact and optimal
  – Takes constant update time
• Algorithm (Factored)
  – Exact for actions that always have same effect
  – Takes polynomial update time
• Can solve problems with n>1000 domain features (>2^1000 states)

Current Approaches and Work
• Reinforcement Learning & HMMs
  – [Chrisman ’92], [McCallum ’95], [Boyen & Koller ’98], [Murphy et al. ’00], [Kearns, Mansour, Ng ’00]
  – Maintain probability distribution over current state
  – Problem: Exact solution intractable for domains of high (>100) dimensionality
  – Problem: Approximate solutions have unbounded errors, or make strong mixing assumptions
• Learning AI-Planning operators
  – [Wang ’95], [Benson ’95], [Pasula et al. ’04], …
  – Problem: Assume fully observable domain

Open Problems
• Efficient inference with learned formula
• Compact, efficient stochastic learning
• Average case of formula size?
• Dynamic observation models, filtering in expanding worlds
• Software: http://www.cs.uiuc.edu/~eyal

Acknowledgements
• Dafna Shahaf
• Megan Nance
• Brian Hlubocky
• Allen Chang
• … and the rest of my incredible group of students

THE END

Talk Outline
1. Actions in partially observed domains
2. Representation and update of models
3. Efficient algorithms
4. Related Work & Conclusions

Example: Light Switch
• Initial belief state (time 0) = set of pairs:
  { <E,~on,~up>, <E,on,~up> } x all transition rels.
  Space = O(2^(2^n))
• New encoding: E ∧ ~up
  Space = 2
• Question: how to update new representation?

Updating Action Model
• Transition belief state represented by φ
• Action-Definition(a)_{t,t+1} ≡
  ⋀_f ((a_t ∧ (a^f ∨ (a^f_f ∧ f_t))) → f_{t+1}) ∧
  ⋀_f ((a_t ∧ f_{t+1}) → (a^f ∨ (a^f_f ∧ f_t)))
  (effect axioms + explanation closure)
• Update: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}

Example Update: Light Switch
• Transition belief state: φ_t = E_t ∧ ~up_t
• Project[sw-on](φ_t) =
  (E_{t+1} ↔ sw-on^E_E ∨ sw-on^E) ∧
  (~up_{t+1} ↔ sw-on^{~up}_{~up} ∨ sw-on^{~up}) ∧
  …
• Update: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}

Updating Action Model
• Transition belief state represented by φ
• φ_{t+1} = Update[o](Project[a](φ_t))
• Actions: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}
• Observations: Update[o](φ) = φ ∧ o
Theorem: formula filtering equivalent to <transition,state>-set semantics

Larger Picture: An Exploration Agent
[Figure: agent architecture with the components World Model, Interface Module, Learning Module, Filtering Module, Decision Making Module, Knowledge Base, and Commonsense extraction.]

Example: Light Switch
• Initial belief state (time 0) = set of pairs:
  { <E,~on,~up>, <E,on,~up> } x all transition rels.
• Apply action a = go-W
• Resulting belief state (after action)
  { <E,~on,~up> } x { transitions map to same state }
  { <E,on,~up> } x { transitions map to same state }
  { <~E,~on,~up> } x { transitions set position to ~E }
  …
• Observe: ~E, ~on

Experiments w/DAG-Update
[Figure: “DAG SLAF: Time to Find a Model”. x-axis: Number of Steps (0 to 6000); y-axis: Total Time (msec), 0 to 900.]

Some Learned Rules
• Pickup(b1) causes Holding(b1)
• Stack(b3,b5) causes On(b3,b5)
• Pickup() does not cause Arm-Empty
• Move(room1,room4) causes At(book5,room4) if In-Briefcase(book5)
• Move(room1,room4) does not cause At(book5,room4) if ¬In-Briefcase(book5)

Approximate Learning
• Always: result of Factored-Learning(φ_t) includes exact action model
• Same compactness results apply
• Approximation decreases size: Discard clauses with more than k literals (allows more action models), |φ_t| = O(n^k)

More in the Paper
• Algorithm that uses deduction to update the model representation exactly, always
• Arbitrary preconditions and conditional effects
• Formal justification of algorithms and complexity results

Experiments
[Figure: “Time per Step”. x-axis: number of steps; y-axis: time per step (millisec), 0 to 10.]

DAG-SLAF: The Algorithm
Input: a formula φ, an action-observation sequence <a_i, o_i>, i=1..t
Initialize:
  for each fluent f, expl_f := init_f
  kb := φ, where each f is replaced by init_f
Process Sequence:
  for i=1..t do Update-Belief(a_i, o_i)
  return kb ∧ base ∧ (f ↔ expl_f)


Current Game + Translation
• LambdaMOO
  – MUD code base
  – Uses database to store game world
  – Emphasis on player-world interaction
  – Powerful in-game programming language
• Game sends agents logical description of world


Example: Light Switch

Time   Action   Observe (after action)
                Posn.   Bulb    Switch
0               E               ~up
1      go-W     ~E      ~on
2      sw-up    ~E      ~on     FAIL
3      go-E     E               ~up
4      sw-up    E               up
5      go-W     ~E      on


Current Approaches and Work
• Reinforcement Learning & HMMs
  – [Chrisman ’92], [McCallum ’95], [Boyen & Koller ’98], [Murphy et al. ’00], [Kearns, Mansour, Ng ’00]
  – Maintain probability distribution over current state
  – Problem: Exact solution intractable for domains of high (>100) dimensionality
  – Problem: Approximate solutions have unbounded errors, or make strong mixing assumptions
• Learning AI-Planning operators
  – [Wang ’95], [Benson ’95], [Pasula et al. ’04], …
  – Problem: Assume fully observable domain


Present Contribution
• Identify (useful) sufficient conditions for efficient computation of action models
  – Actions map states 1:1
  – Deterministic actions with limited effect
• Polynomial-time algorithms for exact update of action model
• (Useful) sufficient conditions for compact representation of action model


Effect of Concept Expansion
[Figure: x-axis: Concepts (1 to 10); y-axis: Relations (0 to 1400); series: Total Relations, Relations per retrieved concept]


Text Translation to Logic
• Difficulties:
  – Counterintuitive LambdaMOO actions
  – Enumerate observations for action result

1. Create small game world
2. Predicates needed to describe world
3. Define STRIPS-like actions required to interact with game world
4. MUD outputs logical descriptions of world


Map Propositions to Semantics
• E.g., assume that every action a is non-failing, deterministic, non-conditional
  – For every domain description D,
  – T_D = Rules_D ∧ ⋀_f (a^{≈f} ∨ a^f ∨ a^{-f})
  – Rules_D = { a^f_g | “a causes f if g” ∈ D }


Mapping Theory to Semantics
• Every set of state-transition pairs represented using a logical formula
• Theorem: Every consistent propositional theory maps to a set of <transition,state> pairs and vice versa
• Have formulas for deterministic actions
  – Conditional effect, sometimes inexecutable
  – Non-conditional, sometimes inexecutable
  – Non-conditional, always executable



Distribution Properties
Project[a](φ ∨ ψ) ≡ Project[a](φ) ∨ Project[a](ψ)
Project[a](φ ∧ ψ) ⇒ Project[a](φ) ∧ Project[a](ψ)
Project[a](¬φ) ⇐ ¬Project[a](φ) ∧ Project[a](TRUE)

• Filtering a DNF belief state by factoring
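A small brute-force check, assuming Project[a] is read as the image of a belief state (a set of world states) under a deterministic action a; this reading, the two-fluent domain, and all names below are assumptions made for illustration. It tests three things over every pair of belief states: projection distributes exactly over union (which is what lets a DNF belief state be filtered disjunct by disjunct), the projection of an intersection is contained in the intersection of projections, and every reachable state outside Project[a](B) lies in the projection of B's complement.

```python
from itertools import chain, combinations, product

# World states are truth assignments to two fluents (p, q); hypothetical domain.
STATES = frozenset(product([False, True], repeat=2))

def action(state):
    """Deterministic, always-executable action that sets p to False (not 1:1)."""
    p, q = state
    return (False, q)

def project(belief):
    """Image of a belief state under the action."""
    return frozenset(action(s) for s in belief)

def all_beliefs(states):
    """Every subset of the state space, as frozensets."""
    items = sorted(states)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

beliefs = all_beliefs(STATES)
reachable = project(STATES)  # plays the role of Project[a](TRUE)

# Disjunction: exact.
or_is_exact = all(project(b1 | b2) == project(b1) | project(b2)
                  for b1 in beliefs for b2 in beliefs)
# Conjunction: sound (containment), but not exact for this non-1:1 action.
and_is_sound = all(project(b1 & b2) <= project(b1) & project(b2)
                   for b1 in beliefs for b2 in beliefs)
and_is_exact = all(project(b1 & b2) == project(b1) & project(b2)
                   for b1 in beliefs for b2 in beliefs)
# Negation: reachable states outside project(b) come from the complement of b.
not_is_sound = all((reachable - project(b)) <= project(STATES - b)
                   for b in beliefs)
```

The conjunction case fails to be exact here because the action merges states; for an action that maps states 1:1 the containment becomes an equality.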


Knowledge About Actions
• Goal 1: conclude that one of sw-up, go-W, go-E causes on
• Goal 2: show that sw-up is possible only when E is true (with some assumptions)
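Goal 1 can be illustrated with a simple elimination heuristic over the light-switch observation log. This is not the talk's algorithm, only a sketch under two assumptions: the bulb changes only as an effect of an executed action, and FAIL at time 2 means that sw-up attempt had no effect. The log encoding and the name `candidate_causes` are hypothetical.

```python
# Observation log from the light-switch example; None means "not observed".
# Each entry: (time, action, failed, bulb_on)
LOG = [
    (0, None,    False, None),
    (1, "go-W",  False, False),
    (2, "sw-up", True,  False),
    (3, "go-E",  False, None),
    (4, "sw-up", False, None),
    (5, "go-W",  False, True),
]

def candidate_causes(log):
    """Successful actions executed after the bulb was last seen off
    and no later than the first time it was then seen on."""
    last_off = max(t for t, _, _, on in log if on is False)
    first_on = min(t for t, _, _, on in log if on is True and t > last_off)
    return {a for t, a, failed, _ in log
            if last_off < t <= first_on and a is not None and not failed}

candidates = candidate_causes(LOG)
```

Under these assumptions the candidates are exactly the three actions named in Goal 1; telling them apart needs further observations or knowledge about the actions.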


Challenges (disregarding NLP for now):
• Partially observable world
  – E.g., cannot see the light bulb in east room
• Incomplete knowledge about state of the world and effects of actions
  – E.g., do not know the effect/preconditions of flipping switch
• ...


Relating Effect Propositions
• Action-Definition(a)_{t,t+1}
  ⋀_f [ ((a_t ∧ (a^f ∨ (a^f_f ∧ f_t))) → f_{t+1})
        ∧ ((a_t ∧ f_{t+1}) → (a^f ∨ (a^f_f ∧ f_t))) ]


Example: Moving Items
• From here: Assume that actions are non-conditional, always succeed
• Initial belief state:
  set of { at(r,closet,room) } x all transition rels.
• Apply action a = move(r,closet,room)_t
• Resulting belief state
  { at(r,closet,room) } x { a^{at(r,closet,room)}_{at(r,closet,room)} … }
  { -at(r,closet,room) } x { a^{-at(r,closet,room)}_{at(r,closet,room)} .. }
  …


Example: Moving Items
• Initial belief state:
  { at(r,closet,room) } x { a^{at(r,closet,room)}_{at(r,closet,room)} … }
  { -at(r,closet,room) } x { a^{-at(r,closet,room)}_{at(r,closet,room)} .. }
  …
• Apply observation -at(r,closet,room)_t
• Resulting belief state
  { -at(r,closet,room) } x { a^{-at(r,closet,room)}_{at(r,closet,room)} .. }
  …
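The two update steps in this example (apply an action, then apply an observation) can be sketched by brute force over pairs of a world state and a candidate action model. The sketch assumes a single fluent f standing for at(r,closet,room) and three hypothetical non-conditional candidate models for the action; it does not use the slide's effect propositions and is not the compact logical representation the talk is about.

```python
from itertools import product

# Candidate effect models of action a on one fluent f (illustrative labels).
MODELS = {
    "a causes f":  lambda f: True,
    "a causes -f": lambda f: False,
    "a keeps f":   lambda f: f,
}

def initial_belief(possible_values):
    """All (state, model) pairs consistent with the initial knowledge of f."""
    return {(f, m) for f, m in product(possible_values, MODELS)}

def apply_action(belief):
    """Progress each pair by applying its own candidate model to its state."""
    return {(MODELS[m](f), m) for f, m in belief}

def observe(belief, value):
    """Keep only pairs whose state agrees with the observed value of f."""
    return {(f, m) for f, m in belief if f == value}

belief0 = initial_belief([True])    # f known true, action model unknown
belief1 = apply_action(belief0)     # after executing a
belief2 = observe(belief1, False)   # after observing -f
surviving_models = {m for _, m in belief2}
```

After the observation only the model in which the action makes f false survives, mirroring the single remaining set in the resulting belief state.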


Filtering Stochastic Processes
• Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN over variables s1, s2, s3, s4, s5]


Filtering Stochastic Processes
• Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN with remaining variables s4, s5]
O(2^n) space, O(2^{2n}) time


Filtering Stochastic Processes
• Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^{2n}) time
• Kalman Filter: Gaussian belief state and linear transition model
[Figure: network over variables s1, s2, s3, s4, s5]


Filtering Stochastic Processes
• Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^{2n}) time
• Kalman Filter: Gaussian belief state and linear transition model
[Figure: network with remaining variables s4, s5]
O(n^2) space, O(n^3) time


Example: Moving Items
• Initial belief state:
  set of all world states
• Apply action
  move(r,closet,room)_t
• Resulting belief state
  all states that satisfy ¬in(r,closet)_{t+1}
• Reason:
  – If initially ¬in(r,closet), then still ¬in(r,closet)
  – If initially in(r,closet), then now ¬in(r,closet)


Example: Filtering a Literal
• Initial knowledge (time t):
  in(r,closet)
• Apply move(r,closet,room)
  Preconds: in(r,closet) ∧ ¬locked(closet)
  Effects: in(r,room) ∧ ¬in(r,closet)
• Resulting knowledge (time t+1):
  in(r,room) in(r,closet)
  locked(closet) in(r,closet)


Example: Filtering a Formula
• Initial knowledge (time t):
  in(broom,closet) locked(closet)
• Apply move(r,closet,room)
  Preconds: in(r,closet) ∧ ¬locked(closet)
  Effects: in(r,room) ∧ ¬in(r,closet)
• Resulting knowledge (time t+1):
  in(r,room) in(r,closet) locked(closet)


Brief Outline of Future Effort
• Filtering FOL representations
• Compact reductions from FOL to Propositional Logic
• More classes of filtering that maintain compact representation
• Learning world models in partially observable domains
• Stochastic filtering


Open Problems
• More families of actions/observations
  – Stochastic conditions on observations
  – Different data structures (BDDs? Horn?)
• Compact, efficient stochastic filtering
• Average case?
• Relational / first-order filtering
• Dynamic observation models, filtering in expanding worlds


STRIPS-Filter: Experimental Results
[A. & Russell ’03]
