Chapter 1, Brushing Up on Reinforcement Learning Concepts

Post on 12-May-2020


[Figure labels recovered from the slides: Your goal, The Cliff, You're here, +5, Local Maximum, f(x), Global]

Chapter 1, Brushing Up on Reinforcement Learning Concepts


Chapter 2, Getting Started with the Q-Learning Algorithm


Chapter 3, Setting Up Your First Environment with OpenAI Gym


Chapter 4, Teaching a Smartcab to Drive Using Q-Learning


Chapter 5, Building Q-Networks with TensorFlow


Chapter 6, Digging Deeper into Deep Q-Networks with Keras and TensorFlow


Chapter 7, Decoupling Exploration and Exploitation in Multi-Armed Bandits


Chapter 8, Further Q-Learning Research and Future Projects
