TRANSCRIPT
playing catch

catch procedure
- position to intersect ball
- succeeds when tail surrounds ball
  (tail surrounds ball: IR sensors in tail are low; ball predictions are active)
- terminates when ball position unknown or unreachable
  (ball unreachable: no procedure reliably leads to ball predictions)

throw procedure
- flick tail in appropriate direction and velocity
- succeeds when ball rolls towards partner
- terminates when ball not near tail
  (not near: tail IR sensors not predicting success; ball predictions not active)
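The success and termination conditions above can be sketched as predicates over current sensor readings and learned predictions. Everything in this sketch is an illustrative assumption: the Percepts fields, the thresholds, and the function names are not the Critterbot's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Percepts:
    """Assumed bundle of current readings; field names are illustrative only."""
    tail_ir: float            # infrared reading from the tail sensors
    ball_prediction: float    # learned prediction that the ball is present
    ball_reachable: bool      # does some procedure reliably lead to ball predictions?

IR_LOW = 0.2              # assumed threshold: low IR means the tail surrounds something
PREDICTION_ACTIVE = 0.5   # assumed threshold for an "active" prediction

def catch_succeeds(p: Percepts) -> bool:
    """Tail surrounds ball: IR sensors in tail are low, ball predictions are active."""
    return p.tail_ir < IR_LOW and p.ball_prediction > PREDICTION_ACTIVE

def catch_terminates(p: Percepts) -> bool:
    """Ball position unknown or unreachable."""
    return not p.ball_reachable

def throw_terminates(p: Percepts) -> bool:
    """Ball not near tail: ball predictions are no longer active."""
    return p.ball_prediction <= PREDICTION_ACTIVE
```

The point of phrasing the conditions this way is that each one is verifiable against sensorimotor data: success and termination are statements about signals the agent actually receives, not about an external description of the world.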
ball (object): predicting interactions; rolls when I bump into it; looks green; temporal coherence to predictions
partner: another agent; predictions about interactions
play: reward for learning; less chance of negative reward than competition
Bridging the Implementation Gap: From Sensorimotor Experience to Conceptual Knowledge
Anna Koop, Leah Hackman, Richard S. Sutton
Verifiable knowledge is ABOUT sensorimotor data
The RL Perspective
[Diagram: agent and environment, coupled by motor signals (m) and sensor signals (s)]
The reinforcement learning agent is an input-output system, interacting with an environment that is only accessible via sensation and motor signals.
... m_{t-2} s_{t-2} m_{t-1} s_{t-1} | m_t s_t | m_{t+1} s_{t+1} m_{t+2} s_{t+2} ...
(past | current | future)
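The input-output view above can be written as a minimal interaction loop: at each timestep the agent emits a motor signal and receives a sensor signal, and experience is just the resulting stream. Both functions below are placeholder stubs standing in for the real agent and environment.

```python
import random

def environment(motor):
    """Placeholder environment: maps a motor signal to the next sensor signal."""
    return 0.5 * motor + random.random()

def agent(sensor):
    """Placeholder agent: maps the current sensor signal to a motor signal."""
    return 1.0 if sensor < 1.0 else -1.0

sensor = 0.0
history = []  # experience: the stream of (m_t, s_{t+1}) pairs
for t in range(5):
    motor = agent(sensor)        # m_t, sent to the environment
    sensor = environment(motor)  # s_{t+1}, received back
    history.append((motor, sensor))
```

Nothing outside this loop is accessible to the agent: any knowledge it can verify must be stated in terms of the `history` stream.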
The Gap

sensorimotor data        conceptual knowledge
temporal, dynamic        atemporal, (more) static
individual, subjective   shareable, objective
detailed, situated       abstract, general
Verifiable Signals
At every timestep the agent receives a sensor signal and sends a motor signal. Experience is made up of past, present, and potential future sensorimotor data.
Historic: h_t, constructed from past data x_{t-n} ... x_t
Predictive: p_t, a statement about future data x_t ... x_{t+n}
Compositional: c_t, built from other constructed signals x_t
We can construct signals that abstract over time and data that are still verifiable statements about sensorimotor experience.
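One plausible reading of the three signal types is sketched below: an exponentially weighted trace as a historic signal, a TD(0)-style scalar estimate as a predictive signal, and a conjunction of constructed signals as a compositional one. These particular constructions are assumptions for illustration, not the authors' implementation.

```python
def historic_trace(xs, decay=0.9):
    """Historic signal h_t: exponentially weighted trace of past data x_{t-n}..x_t."""
    h, out = 0.0, []
    for x in xs:
        h = decay * h + (1 - decay) * x
        out.append(h)
    return out

def predictive_signal(xs, alpha=0.1, gamma=0.9):
    """Predictive signal p_t: TD(0)-style online estimate of the discounted sum
    of future data (a single scalar estimate, for simplicity)."""
    v = 0.0
    for t in range(len(xs) - 1):
        # the update target x_{t+1} + gamma * v is checkable against future data
        v += alpha * (xs[t + 1] + gamma * v - v)
    return v

def compositional_signal(h, p, h_thresh=0.5, p_thresh=0.5):
    """Compositional signal c_t: a new binary signal built from other
    constructed signals rather than from raw sensor data."""
    return h > h_thresh and p > p_thresh
```

Each of these remains verifiable: the historic signal is a deterministic function of past data, and the predictive signal's errors can be measured against data that actually arrives later.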
Different Views
We see the Critterbot playing catch with a ball.
The Critterbot sees various sensor and motor signals.