
Introduction to Deep Reinforcement Learning

Shenglin Zhao Department of Computer Science & Engineering

The Chinese University of Hong Kong


Outline

• Background
• Deep Learning
• Reinforcement Learning
• Deep Reinforcement Learning
• Conclusion


Milestone Issues

• NIPS 2013, DeepMind, Playing Atari with Deep Reinforcement Learning, https://arxiv.org/abs/1312.5602

• Nature cover paper 2015, DeepMind, Human-level control through deep reinforcement learning, www.nature.com/articles/nature14236

• Nature cover paper 2016, DeepMind, Mastering the game of Go with deep neural networks and tree search, www.nature.com/articles/nature16961


Reinforcement Learning in a nutshell

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Deep Learning in a nutshell

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Deep Reinforcement Learning: AI = RL + DL

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Examples of Deep RL@DeepMind

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Deep Learning = Learning Representations/Features

• Traditional Model

• Deep Learning

http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf


Deep Representations

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Deep Neural Network

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Example-CNN

http://xrds.acm.org/blog/2016/06/convolutional-neural-networks-cnns-illustrated-explanation/

Convolutional Operator

Discrete Form

Matrix Element-wise Multiplication

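$$
\sum_{i,j}\left(
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\odot
\begin{bmatrix} 2 & 2 & 7 \\ 5 & 0 & 7 \\ 1 & 2 & 1 \end{bmatrix}
\right)_{ij} = 7
$$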
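The element-wise multiplication above can be checked in a few lines of Python. This is a minimal sketch: the names `multiply_and_sum`, `patch`, and `kernel` are illustrative, and which of the two matrices is the image patch and which is the filter is an assumption.

```python
def multiply_and_sum(patch, kernel):
    """Multiply two equally sized matrices element by element and sum the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )

# The two 3x3 matrices from the slide (the patch/filter roles are assumed).
kernel = [[1, 2, 0],
          [0, 1, 0],
          [0, 0, 1]]
patch = [[2, 2, 7],
         [5, 0, 7],
         [1, 2, 1]]

result = multiply_and_sum(patch, kernel)
print(result)  # 7
```

A convolutional layer repeats this step at every position of the filter over the image, producing one output number per position.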


Example-CNN

https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/

[Figure panels: pixel representation of filter; visualization]


Example-CNN

https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/

[Figure panels: original image; what we find]


Example-CNN

http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf

Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier


Reinforcement Learning

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Agent and Environment

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

Agent

Environment


State

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Major Components

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Policy

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Value Function

Bellman equation

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf
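For reference, the Bellman equation for the action-value function can be written in its standard form as follows. The notation is an assumption based on common usage: $\pi$ is the policy, $r$ the immediate reward, $\gamma \in [0, 1]$ the discount factor, and $s'$, $a'$ the next state and action.

$$
Q^{\pi}(s, a) = \mathbb{E}_{s', a'}\left[\, r + \gamma\, Q^{\pi}(s', a') \;\middle|\; s, a \,\right]
$$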


Optimal Case

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


Approaches to Reinforcement Learning

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf


RL Example

• Assumption
  • Suppose we have 5 rooms in a building (numbered 0 to 4) connected by doors
  • The outside of the building can be thought of as one big room (5)
• Target
  • Put an agent in any room, and from that room, go outside the building

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


RL Example


Q-learning for the RL Problem

• Each step yields a reward; the goal is to reach the state with the highest reward.
• Terms: state = room, action = move to a connected room, reward = 0 or 100

RL Problem → Markov Decision Process

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


Q-learning for the RL Problem

[Figure panels: reward table; Q-value table]


Q-learning for the RL Problem

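• The Q-table is the brain of our agent, representing the memory of what the agent has learned through experience.
• The agent starts out knowing nothing, so the matrix Q is initialized to zero.
• Simple transition rule of Q-learning:

$$Q(s, a) = R(s, a) + \gamma \max_{a'} Q(s', a')$$

where $s'$ is the state reached by taking action $a$ in state $s$, $R$ is the reward table, and $\gamma$ is the discount factor (0.8 in the worked example).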

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


Q-learning for the RL Problem

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


Q-learning for the RL Problem

• Example
  • Initial state: room 1, action: move to 5

Q(1, 5) = 100 + 0.8 * 0 = 100

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


Q-learning for the RL Problem

• Example
  • Initial state: room 3, action: move to 1

Q(3, 1) = 0 + 0.8 * 100 = 80

http://mnemstudio.org/path-finding-q-learning-tutorial.htm
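The Q-learning steps for the room example can be sketched in Python. This is a minimal sketch, not the tutorial's own code: the door layout in `R` follows the cited mnemstudio tutorial and is an assumption here, since the floor plan is not part of the text, and the names `q_update`, `train`, and `greedy_path` are illustrative.

```python
import random

GAMMA = 0.8  # discount factor used in the worked example
GOAL = 5     # "room" 5 is the outside of the building

# Reward table R[state][action]; -1 marks "no door between these rooms".
# Layout assumed from the cited tutorial: doors 0-4, 1-3, 1-5, 2-3, 3-4, 4-5.
R = [
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
]

def valid_actions(state):
    """Rooms that can be reached directly from `state`."""
    return [a for a, r in enumerate(R[state]) if r >= 0]

def q_update(Q, state, action):
    """Q(s, a) = R(s, a) + gamma * max_a' Q(s', a'); moving to a room makes it s'."""
    next_state = action
    best_next = max(Q[next_state][a] for a in valid_actions(next_state))
    Q[state][action] = R[state][action] + GAMMA * best_next
    return Q[state][action]

def train(episodes, seed=0):
    """Random exploration: each episode starts in a random room and ends outside."""
    rng = random.Random(seed)
    Q = [[0.0] * 6 for _ in range(6)]
    for _ in range(episodes):
        state = rng.randrange(6)
        while True:
            action = rng.choice(valid_actions(state))
            q_update(Q, state, action)
            state = action
            if state == GOAL:
                break
    return Q

def greedy_path(Q, start):
    """Follow the highest Q-value from `start` until the goal (with a step cap)."""
    path = [start]
    while path[-1] != GOAL and len(path) <= 6:
        state = path[-1]
        path.append(max(valid_actions(state), key=lambda a: Q[state][a]))
    return path

# The two worked steps, starting from an all-zero Q-table.
Q = [[0.0] * 6 for _ in range(6)]
print(q_update(Q, 1, 5))  # 100.0
print(q_update(Q, 3, 1))  # 80.0

# Full training run, then read off a route from room 2 to the outside.
Q_trained = train(episodes=1000)
print(greedy_path(Q_trained, 2))
```

After training, following the largest Q-value in each row gives a shortest route to the outside. From room 2 two routes tie (via room 1 or via room 4), so the printed path depends on the random seed.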


Q-learning for the RL Problem

http://mnemstudio.org/path-finding-q-learning-tutorial.htm


Deep Reinforcement Learning

http://videolectures.net/rldm2015_silver_reinforcement_learning/

Deep Learning

Reinforcement Learning

Deep Reinforcement Learning


Deep Q-learning

Deep Q-Network (DQN)


DQN Architecture

[Figure panels: naive formulation of deep Q-network; DQN in DeepMind paper]

https://www.nervanasys.com/demystifying-deep-reinforcement-learning/


DQN Architecture

https://www.nervanasys.com/demystifying-deep-reinforcement-learning/


DQN

[Figure panels: loss function; Q-table update algorithm]

https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
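A common form of the DQN loss is the squared difference between the target r + γ · max Q(s', a') and the current prediction Q(s, a). The sketch below computes it for a single transition. It is a minimal illustration under assumptions: the function names and all numbers are hypothetical, and plain Python lists stand in for network outputs.

```python
def td_target(reward, next_q_values, gamma, done=False):
    """Target r + gamma * max_a' Q(s', a'); just r when the episode has ended."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def squared_td_loss(q_value, target):
    """Half the squared difference between the target and the predicted Q-value."""
    return 0.5 * (target - q_value) ** 2

# Hypothetical numbers for one transition (s, a, r, s').
q_sa = 1.0                # current estimate Q(s, a)
next_q = [0.5, 2.0, 1.5]  # current estimates Q(s', a') for every action a'
target = td_target(reward=1.0, next_q_values=next_q, gamma=0.9)
loss = squared_td_loss(q_sa, target)
print(target, loss)
```

In a real DQN the gradient of this loss with respect to the network weights drives the update, and the target is treated as a constant.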


Exploration-Exploitation
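One common way to balance exploration and exploitation is ε-greedy action selection: with probability ε take a random action, otherwise take the action with the highest Q-value. The sketch below is a minimal version; the function name and the example Q-values are hypothetical.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

q_values = [0.1, 0.7, 0.3]
print(epsilon_greedy(q_values, epsilon=0.0))  # 1
```

A typical schedule starts with a large ε and lowers it over time, so the agent explores early and exploits what it has learned later.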


Value Iteration
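Value iteration repeatedly applies the Bellman optimality backup V(s) ← max over a of Σ p · (r + γ V(s')) until the values stop changing. The sketch below runs it on a hypothetical three-state chain. The MDP, its encoding as `P[state][action] = [(probability, next_state, reward), ...]`, and the choice γ = 0.5 are assumptions made for illustration.

```python
def value_iteration(P, gamma, tol=1e-8):
    """Sweep over all states, backing up the best action value, until convergence."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            if not actions:  # terminal state: its value stays 0
                continue
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Hypothetical chain A -> B -> goal, with reward 1 for reaching the goal.
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"stay": [(1.0, "B", 0.0)], "go": [(1.0, "goal", 1.0)]},
    "goal": {},
}

V = value_iteration(P, gamma=0.5)
print(V)  # {'A': 0.5, 'B': 1.0, 'goal': 0.0}
```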


Policy Iteration
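Policy iteration alternates two steps: evaluate the current policy, then improve it by acting greedily with respect to the evaluated values. The sketch below is a minimal version on a hypothetical three-state chain. The MDP, its encoding as `P[state][action] = [(probability, next_state, reward), ...]`, and the choice γ = 0.5 are assumptions made for illustration.

```python
def action_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

def policy_iteration(P, gamma, tol=1e-8):
    # Start from an arbitrary policy: the first listed action in each state.
    policy = {s: next(iter(actions)) for s, actions in P.items() if actions}
    V = {s: 0.0 for s in P}
    while True:
        # Policy evaluation: iterate until V matches the current policy.
        while True:
            delta = 0.0
            for s, a in policy.items():
                v = action_value(P, V, s, a, gamma)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for s in policy:
            best = max(P[s], key=lambda a: action_value(P, V, s, a, gamma))
            if action_value(P, V, s, best, gamma) > action_value(P, V, s, policy[s], gamma) + tol:
                policy[s] = best
                stable = False
        if stable:
            return policy, V

# Hypothetical chain A -> B -> goal, with reward 1 for reaching the goal.
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"stay": [(1.0, "B", 0.0)], "go": [(1.0, "goal", 1.0)]},
    "goal": {},
}

policy, V = policy_iteration(P, gamma=0.5)
print(policy)  # {'A': 'go', 'B': 'go'}
print(V)       # {'A': 0.5, 'B': 1.0, 'goal': 0.0}
```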


Stability Issues with Deep RL

http://videolectures.net/rldm2015_silver_reinforcement_learning/


DQN

http://videolectures.net/rldm2015_silver_reinforcement_learning/


Experience Replay

http://videolectures.net/rldm2015_silver_reinforcement_learning/
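Experience replay stores past transitions and trains on random samples from them, which breaks the correlation between consecutive steps. The sketch below is a minimal buffer; the class name `ReplayBuffer`, its methods, and the tiny capacity are hypothetical choices for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a random minibatch without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)

print(len(buffer))  # 3
batch = buffer.sample(2)
```

With a capacity of 3 and five pushes, only the three most recent transitions remain available for sampling.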


Fixed Target Q-Network

http://videolectures.net/rldm2015_silver_reinforcement_learning/
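With a fixed target Q-network, the bootstrap target is computed from a frozen copy of the parameters, and the copy is refreshed only every few updates. The sketch below shows the mechanism with a table standing in for the network. `TabularQ`, `train_step`, `SYNC_EVERY`, and all numbers are hypothetical and chosen so the arithmetic is easy to follow.

```python
import copy

class TabularQ:
    """Stand-in for a Q-network: a table indexed by [state][action]."""
    def __init__(self, n_states, n_actions):
        self.q = [[0.0] * n_actions for _ in range(n_states)]

def train_step(online, target, transition, gamma=0.9, lr=0.5):
    s, a, r, s2, done = transition
    # The target value comes from the frozen copy, not from the table being updated.
    y = r if done else r + gamma * max(target.q[s2])
    online.q[s][a] += lr * (y - online.q[s][a])

online = TabularQ(n_states=2, n_actions=2)
target = copy.deepcopy(online)
SYNC_EVERY = 3  # hypothetical number of updates between target refreshes

transition = (0, 1, 1.0, 0, False)  # (s, a, r, s', done)
history = []
for step in range(1, 4):
    train_step(online, target, transition)
    if step % SYNC_EVERY == 0:
        target = copy.deepcopy(online)  # refresh the frozen copy
    history.append((online.q[0][1], target.q[0][1]))

print(history)  # [(0.5, 0.0), (0.75, 0.0), (0.875, 0.875)]
```

The online estimate changes on every step, while the target copy stays at its old value until the third step, when it is synchronised.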


Reward/Value Range

http://videolectures.net/rldm2015_silver_reinforcement_learning/


DQN Results in Atari


DQN Atari Demo


Conclusion


Demo


References

• http://karpathy.github.io/2016/05/31/rl/
• https://gym.openai.com/docs/rl
• https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
• http://mnemstudio.org/path-finding-q-learning-tutorial.htm
• http://videolectures.net/rldm2015_silver_reinforcement_learning/
• http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf