lab 3: dummy q-learning (table) - github pages · pdf filelab 3: dummy q-learning (table)...

Lab 3: Dummy Q-learning (table)

Reinforcement Learning with TensorFlow&OpenAI GymSung Kim <hunkim+ml@gmail.com>

Learning Q(s, a): Tableinitial Q values are 0

Learning Q(s, a) Table (with many trials)initial Q values are 0

Learning Q(s, a) Table: one success!initial Q values are 0

Learning Q(s, a) Table: one success!

Dummy Q-learning algorithm

Code: setup

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap

# https://gist.github.com/stober/1943451

Code: (dummy) Q-learning

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap

Code: result reporting

Success rate: 0.95

Q = np.zeros([env.observation_space.n, env.action_space.n])

LEFT DOWN RIGHT UP [[ 0. 0. 1. 0.] [ 0. 0. 1. 0.] [ 0. 1. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 1. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 1. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 0. 0.] [ 0. 0. 1. 0.] [ 0. 0. 0. 0.]]

print(Q)

Exploit&exploration and

discounted future reward

lab 3: dummy q-learning (table) - github pages · pdf filelab 3: dummy q-learning (table)...

Documents

large-scale study of curiosity-driven learninglarge-scale...

lecture 1: introduction - github pages 1: introduction...

adversarial approaches to bayesian learning and bayesian...

lab 2: playing openai gym games - github...

lab 4: q-learning (table) - github pages · lab 4:...

jonas schneider, head of engineering for robotics, openai

working in openai environments designing your own ·...

openai gym radionica - csnedelja.mg.edu.rs · openai gym...

arxiv:1810.12894v1 [cs.lg] 30 oct 2018 · exploration by...

lab 5: windy frozen lake nondeterministic world! · lab 5:...

geometry-aware neural...

dlddo: deep learning to detect dummy operations

can openai codex and other large language models help us

the state of physical attacks on deep learning systems ·...

ian goodfellow, openai research scientist guest lecture for...

long-term planning and situational awareness in openai...

extending the openai gym for robotics: a toolkit for...

10-703 deep rl and controls openai gym recitation api basic...

multiple linear regression (dummy variable treatment) · in...

lab 4: q-learning (table) - github...