Lab 3: Dummy Q-learning (table)

Reinforcement Learning with TensorFlow & OpenAI Gym

Sung Kim <[email protected]>

Learning Q(s, a): Table

Initial Q values are 0

[Figure: Q-table with all 64 entries (16 states x 4 actions) set to 0]

Learning Q(s, a) Table (with many trials)

Initial Q values are 0

Learning Q(s, a) Table: one success!

Initial Q values are 0

[Figure: Q-table after one successful trial; eight entries are shown as 1]


Dummy Q-learning algorithm
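
A minimal sketch of the update such an algorithm would apply at every step, assuming the "dummy" variant uses neither a learning rate nor a discount factor (the closing slide defers discounted future reward to the next lab). The symbols s, a, r, s' (state, action, immediate reward, next state) are assumed notation, not taken from the slide:

```latex
\hat{Q}(s, a) \leftarrow r + \max_{a'} \hat{Q}(s', a')
```

Under that assumption, the loop is: start with every \hat{Q}(s, a) at 0, observe the current state s, pick an action a, receive r and s', apply the update above, then continue from s'.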

Code: setup

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap

# https://gist.github.com/stober/1943451
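
The gist referenced above is presumably a random argmax helper. With a Q-table that starts at all zeros, a plain argmax would always return the first action, so ties need to be broken at random. The following is a sketch under that assumption, written with the standard library only; the name `rargmax` and its `rng` parameter are illustrative, not taken from the slide.

```python
import random


def rargmax(values, rng=random):
    """Return the index of a maximal entry, breaking ties uniformly at random."""
    best = max(values)
    # Collect every index that shares the maximum value.
    candidates = [i for i, v in enumerate(values) if v == best]
    return rng.choice(candidates)


print(rargmax([0, 0, 1, 0]))  # a unique maximum is always picked: prints 2
print(rargmax([0, 0, 0, 0]))  # all tied: prints one of 0, 1, 2, 3
```

Passing a seeded `random.Random` instance as `rng` makes the tie-breaking reproducible.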

Code: (dummy) Q-learning

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap
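
A rough, self-contained sketch of what a dummy Q-learning loop could look like. To stay runnable without Gym, it assumes a hand-written deterministic 4x4 grid (`GRID` and `step` are hypothetical stand-ins for the environment, laid out like the usual FrozenLake map), picks actions with a random-tie-breaking argmax, and applies an update with no learning rate and no discount factor. The episode count and seed are arbitrary choices.

```python
import random

# Hypothetical stand-in for a deterministic 4x4 FrozenLake-style map:
# S = start, F = frozen (safe), H = hole, G = goal.
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N_STATES, N_ACTIONS = 16, 4  # actions: 0=LEFT, 1=DOWN, 2=RIGHT, 3=UP
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def step(state, action):
    """Deterministic transition; returns (new_state, reward, done)."""
    row, col = divmod(state, 4)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), 3)  # moving into a wall leaves the agent in place
    col = min(max(col + d_col, 0), 3)
    cell = GRID[row][col]
    return row * 4 + col, (1.0 if cell == "G" else 0.0), cell in "HG"


def rargmax(values, rng):
    """Index of a maximal entry, ties broken uniformly at random."""
    best = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == best])


def train(num_episodes=2000, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # initial Q values are 0
    rewards = []
    for _ in range(num_episodes):
        state, done, total = 0, False, 0.0
        while not done:
            action = rargmax(Q[state], rng)
            new_state, reward, done = step(state, action)
            # Dummy update: immediate reward plus the best value of the next state.
            Q[state][action] = reward + max(Q[new_state])
            total += reward
            state = new_state
        rewards.append(total)
    return Q, rewards


Q, rewards = train()
print("Success rate:", sum(rewards) / len(rewards))
```

Because the map is deterministic and rewards are 0 or 1, every table entry in this sketch stays at 0 or 1, and once the start state has a nonzero entry the agent repeats the same successful path.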

Code: result reporting

Success rate: 0.95
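
A success rate like the one above is presumably the fraction of episodes that reached the goal. A small sketch, assuming each episode's total reward is 1 on success and 0 otherwise; the function name and the sample list are hypothetical.

```python
def success_rate(rewards_per_episode):
    """Fraction of episodes that ended with reward 1 (goal reached)."""
    if not rewards_per_episode:
        raise ValueError("need at least one episode")
    return sum(rewards_per_episode) / len(rewards_per_episode)


# Hypothetical log: 6 successes out of 8 episodes.
example_rewards = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print("Success rate:", success_rate(example_rewards))  # Success rate: 0.75
```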

Q = np.zeros([env.observation_space.n, env.action_space.n])

print(Q)

  LEFT DOWN RIGHT UP
[[ 0.  0.  1.  0.]
 [ 0.  0.  1.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  1.  0.]
 [ 0.  0.  0.  0.]]
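
One way to read the printed table is to follow the highest-valued action from the start state. The sketch below copies the values shown above and assumes the 16 states are numbered row by row on a 4x4 grid (an assumption about the environment, not something the table itself states); `greedy_path` is a hypothetical helper.

```python
ACTIONS = ["LEFT", "DOWN", "RIGHT", "UP"]
MOVES = {"LEFT": (0, -1), "DOWN": (1, 0), "RIGHT": (0, 1), "UP": (-1, 0)}

# Q-table values as printed: one row per state, one column per action.
Q = [
    [0, 0, 1, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0],
]


def greedy_path(q_table, start=0, size=4):
    """Follow the best action from `start` until reaching an all-zero row."""
    path, state = [start], start
    # The length guard stops the walk if a table happens to contain a cycle.
    while max(q_table[state]) > 0 and len(path) <= size * size:
        action = ACTIONS[q_table[state].index(max(q_table[state]))]
        d_row, d_col = MOVES[action]
        row, col = divmod(state, size)
        row = min(max(row + d_row, 0), size - 1)  # stay on the grid
        col = min(max(col + d_col, 0), size - 1)
        state = row * size + col
        path.append(state)
    return path


print(greedy_path(Q))  # [0, 1, 2, 6, 10, 14, 15]
```

Under the row-by-row numbering, this path goes right twice, down three times, then right once more into state 15.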

Next

Exploit & exploration and discounted future reward