RL and deep learning - mlss2014.com

RL and deep learning Nando de Freitas

Google’s neural net learns just by watching YouTube videos

Place cells in the hippocampus

[Denil et al 2012]

Hierarchical reinforcement learning

High-level model-based learning for deciding when to navigate, park, pick up and drop off passengers.

Mid-level active path learning for navigating a topological map.

Low-level active policy optimizer to learn control of continuous non-linear vehicle dynamics.

Active Path Finding in Middle Level

The Navigate policy generates a sequence of waypoints on a topological map to navigate from a location to a destination.
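To make the idea of a waypoint sequence on a topological map concrete, here is a minimal sketch. It is not the active path learning method from the slides: it swaps in plain Dijkstra shortest-path search as a stand-in, and the map `TOY_MAP`, its node names, its edge costs, and the function `waypoint_sequence` are all hypothetical.

```python
import heapq

def waypoint_sequence(graph, start, goal):
    """Return the cheapest list of waypoints from start to goal, or None.

    graph maps each node to a dict {neighbour: edge_cost} (assumed format).
    Plain Dijkstra search, used here only as an illustrative stand-in.
    """
    frontier = [(0.0, start, [start])]   # (cost so far, node, path taken)
    best = {start: 0.0}                  # cheapest known cost per node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if cost > best.get(node, float("inf")):
            continue                     # stale queue entry, skip it
        for nxt, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None

# Hypothetical topological map: nodes are places, edge weights are travel costs.
TOY_MAP = {
    "depot": {"junction_a": 2.0, "junction_b": 5.0},
    "junction_a": {"junction_b": 1.0, "station": 6.0},
    "junction_b": {"station": 2.0},
    "station": {},
}
```

With this toy map, `waypoint_sequence(TOY_MAP, "depot", "station")` returns `["depot", "junction_a", "junction_b", "station"]`. A learned Navigate policy would replace the fixed edge costs with estimates gathered from experience.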

Low-Level: Trajectory following

[Figure: vehicle trajectory-following diagram; labels: Vx, Vy, Yerr, err]

TORCS: 3D game engine that implements complex vehicle dynamics complete with manual and automatic transmission, engine, clutch, tire, and suspension models.
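As a rough illustration of what trajectory following involves, the sketch below computes two tracking errors for a vehicle relative to a straight path segment and turns them into a steering command. Several things here are assumptions rather than content from the slides: that the second error term in the figure is a heading error, the function names `tracking_errors` and `steering_command`, the gains `k_y` and `k_h`, and the sign convention (positive steering means turning left). The slides describe a learned low-level policy; this hand-written proportional controller is only a stand-in. It does not use TORCS.

```python
import math

def tracking_errors(x, y, heading, p0, p1):
    """Lateral and heading error of a vehicle relative to the segment p0 -> p1.

    Assumes p0 != p1 and angles in radians.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    seg_len = math.hypot(dx, dy)
    # Signed lateral offset: positive when the vehicle is left of the path.
    y_err = (dx * (y - p0[1]) - dy * (x - p0[0])) / seg_len
    # Heading error wrapped to [-pi, pi).
    path_heading = math.atan2(dy, dx)
    h_err = (heading - path_heading + math.pi) % (2 * math.pi) - math.pi
    return y_err, h_err

def steering_command(y_err, h_err, k_y=0.5, k_h=1.0, limit=1.0):
    """Proportional steering that pushes both errors toward zero, clipped to +/- limit."""
    raw = -(k_y * y_err + k_h * h_err)
    return max(-limit, min(limit, raw))
```

For a vehicle at (3, 1) with heading 0.1 rad on a path along the x-axis, the lateral error is 1.0, the heading error is about 0.1, and the command is about -0.6, i.e. steer back to the right.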

Bayesian optimization was used to find the low-level neural net policy and the value function for the upper levels.
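The Bayesian optimization loop mentioned above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the setup from the slides: it tunes a single scalar parameter on [0, 1] instead of neural net weights, uses a Gaussian process with a squared-exponential kernel and the expected-improvement acquisition function, and the objective `toy_return` is a hypothetical stand-in for the return of a policy. It assumes NumPy is installed.

```python
import math
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """GP posterior mean and std at x_query (zero prior mean, unit prior variance)."""
    k = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf(x_obs, x_query)
    solve = np.linalg.solve(k, k_s)              # K^{-1} K_s
    mean = solve.T @ y_obs
    var = 1.0 - np.sum(k_s * solve, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the best value seen so far (maximisation)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def bayes_opt(objective, n_init=3, n_iter=12, seed=0):
    """Maximise objective on [0, 1] by evaluating the EI maximiser on a fixed grid."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)
    x_obs = rng.uniform(0.0, 1.0, size=n_init)   # random initial design
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        x_next = grid[np.argmax(expected_improvement(mean, std, y_obs.max()))]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]

def toy_return(theta):
    """Hypothetical stand-in for the return of a policy with one parameter."""
    return -(theta - 0.7) ** 2
```

Calling `bayes_opt(toy_return)` returns the best parameter found and its value. The appeal for policy search is that each evaluation (one simulated run) is expensive, so the surrogate model is used to choose where to evaluate next.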

DeepMind approach

DeepMind approach

DeepMind approach

Imitation learning & mirror neurons

Imitation learning for Atari

[Dejan, Miroslav, 2014]

Imitation learning for Atari

[Dejan Markovikj, Miroslav Bogdanovic, Misha Denil, NdF 2014]

Inverse RL… or teaching DeepMind how to play Atari

Next lecture: scalable learning

Thank you