lab 6-2: q network for cart pole - github pageshunkim.github.io/ml/rl/rl06-l2.pdf · 2017-10-02 ·...

Post on 06-Jun-2020

Lab 6-2: Q Network for Cart Pole

Reinforcement Learning with TensorFlow & OpenAI Gym

Sung Kim <hunkim+ml@gmail.com>

Cart Pole

https://gym.openai.com/docs

Random trials
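
A random-trial baseline can be sketched as below. To keep the block runnable without Gym installed, it uses `MiniCartPole`, a hypothetical stand-in that follows the usual cart-pole dynamics and pays a reward of 1 per step; with Gym available, an environment from `gym.make('CartPole-v0')` would take its place. The physical constants and the 200-step cap are assumptions, not values taken from the slides.

```python
import math
import random


class MiniCartPole:
    """Hypothetical stand-in for a Gym cart-pole environment (assumed constants)."""

    GRAVITY, MASS_CART, MASS_POLE = 9.8, 1.0, 0.1
    HALF_LENGTH, FORCE, DT = 0.5, 10.0, 0.02
    X_LIMIT, THETA_LIMIT = 2.4, 12 * math.pi / 180

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.state = None

    def reset(self):
        # Start near the upright position with small random offsets.
        self.state = [self.rng.uniform(-0.05, 0.05) for _ in range(4)]
        return list(self.state)

    def step(self, action):
        x, x_dot, theta, theta_dot = self.state
        force = self.FORCE if action == 1 else -self.FORCE
        total_mass = self.MASS_CART + self.MASS_POLE
        pole_ml = self.MASS_POLE * self.HALF_LENGTH
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        temp = (force + pole_ml * theta_dot ** 2 * sin_t) / total_mass
        theta_acc = (self.GRAVITY * sin_t - cos_t * temp) / (
            self.HALF_LENGTH * (4.0 / 3.0 - self.MASS_POLE * cos_t ** 2 / total_mass))
        x_acc = temp - pole_ml * theta_acc * cos_t / total_mass
        # Euler integration over one time step.
        x += self.DT * x_dot
        x_dot += self.DT * x_acc
        theta += self.DT * theta_dot
        theta_dot += self.DT * theta_acc
        self.state = [x, x_dot, theta, theta_dot]
        done = abs(x) > self.X_LIMIT or abs(theta) > self.THETA_LIMIT
        return list(self.state), 1.0, done  # reward of 1 for every step taken


def random_trial(env, rng, max_steps=200):
    """Run one episode with uniformly random actions; return the total reward."""
    env.reset()
    total = 0.0
    for _ in range(max_steps):
        _, reward, done = env.step(rng.choice([0, 1]))
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    env = MiniCartPole(seed=0)
    print([random_trial(env, rng) for _ in range(5)])
```

Because every step pays a reward of 1, the total reward of an episode is simply the number of steps taken before the pole falls or the cart leaves the track.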

Rewards

Cart Pole Q-network

[Diagram: (1) state s as input, (2) Ws as output]
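
Read as a computation, the network takes a state s (1) and produces Ws (2), one Q-value per action. A minimal sketch, assuming a 4-dimensional state, 2 actions, and a hypothetical 4x2 weight matrix filled with made-up values:

```python
def q_values(state, W):
    """Linear Q-network: returns Ws, one Q-value per action (column of W)."""
    n_actions = len(W[0])
    return [sum(state[i] * W[i][a] for i in range(len(state)))
            for a in range(n_actions)]


def greedy_action(q):
    """Index of the largest Q-value."""
    return max(range(len(q)), key=lambda a: q[a])


# Hypothetical numbers, for illustration only.
state = [0.0, 0.5, 0.1, -0.2]   # cart position, cart velocity, pole angle, pole angular velocity
W = [[0.1, -0.1],
     [0.2,  0.4],
     [-0.3, 0.3],
     [0.5, -0.5]]

q = q_values(state, W)
print(q, greedy_action(q))
```

With a single weight matrix and no hidden layer, each Q-value is just a weighted sum of the four state components.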

Q-Network training (Network construction)

Q-Network training (linear regression)

$y = r + \gamma \max Q(s')$

$\mathrm{cost}(W) = (Ws - y)^2$
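
As a worked example of the target and the cost, the sketch below computes y for one transition and takes one gradient step on the squared error for the action actually taken. The discount factor, learning rate, and all numbers are assumptions chosen for illustration, and the terminal-state rule (the target is the reward alone) is a common convention rather than something stated on the slide.

```python
def q_values(state, W):
    """Linear Q-network output Ws: one Q-value per action."""
    return [sum(s_i * row[a] for s_i, row in zip(state, W))
            for a in range(len(W[0]))]


def td_target(reward, next_state, W, gamma, done):
    """y = r + gamma * max Q(s'); assumed to be just r when the episode has ended."""
    if done:
        return reward
    return reward + gamma * max(q_values(next_state, W))


def sgd_step(W, state, action, y, lr):
    """One gradient step on cost(W) = (Ws - y)^2 for the chosen action only."""
    error = q_values(state, W)[action] - y
    cost = error ** 2
    for i, s_i in enumerate(state):
        # d cost / d W[i][action] = 2 * (Ws - y) * s_i, with y held fixed
        W[i][action] -= lr * 2.0 * error * s_i
    return cost


# Hypothetical transition and parameters.
W = [[0.0, 0.0] for _ in range(4)]
s, a, r = [0.0, 0.5, 0.1, -0.2], 1, 1.0
s_next = [0.01, 0.6, 0.09, -0.3]
y = td_target(r, s_next, W, gamma=0.9, done=False)
before = sgd_step(W, s, a, y, lr=0.1)
after = (q_values(s, W)[a] - y) ** 2
print(y, before, after)
```

Treating y as a fixed label is what makes each update look like a step of ordinary linear regression: only the prediction Ws is differentiated, not the target.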

Code: Network and setup

Code: Training
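
A minimal sketch of what such a training loop might look like, assuming online updates of a linear Q-network toward y = r + γ max Q(s'). The function `cartpole_step` is a hypothetical stand-in for a Gym environment's `step`, and the exploration schedule, learning rate, discount factor, and terminal-state target are all assumptions rather than the slide's code.

```python
import math
import random


def cartpole_step(state, action):
    """Assumed cart-pole dynamics (stand-in for env.step); returns (state, reward, done)."""
    x, x_dot, th, th_dot = state
    force = 10.0 if action == 1 else -10.0
    sin_t, cos_t = math.sin(th), math.cos(th)
    temp = (force + 0.05 * th_dot ** 2 * sin_t) / 1.1
    th_acc = (9.8 * sin_t - cos_t * temp) / (0.5 * (4.0 / 3.0 - 0.1 * cos_t ** 2 / 1.1))
    x_acc = temp - 0.05 * th_acc * cos_t / 1.1
    nxt = [x + 0.02 * x_dot, x_dot + 0.02 * x_acc,
           th + 0.02 * th_dot, th_dot + 0.02 * th_acc]
    done = abs(nxt[0]) > 2.4 or abs(nxt[2]) > 12 * math.pi / 180
    return nxt, 1.0, done


def q_values(state, W):
    """Linear Q-network output Ws for the two actions."""
    return [sum(s_i * row[a] for s_i, row in zip(state, W)) for a in range(2)]


def train(num_episodes=50, gamma=0.9, lr=0.01, max_steps=200, seed=0):
    """Online Q-learning with a linear Q-network; returns (W, steps per episode)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.01, 0.01) for _ in range(2)] for _ in range(4)]
    history = []
    for episode in range(num_episodes):
        epsilon = 1.0 / (episode / 10 + 1)   # assumed decaying exploration rate
        state = [rng.uniform(-0.05, 0.05) for _ in range(4)]
        steps = 0
        for _ in range(max_steps):
            q = q_values(state, W)
            if rng.random() < epsilon:
                action = rng.randrange(2)                 # explore
            else:
                action = 0 if q[0] >= q[1] else 1         # exploit
            next_state, reward, done = cartpole_step(state, action)
            # Target: y = r + gamma * max Q(s'), or just r at the end of an episode.
            y = reward if done else reward + gamma * max(q_values(next_state, W))
            error = q[action] - y
            for i, s_i in enumerate(state):
                W[i][action] -= lr * 2.0 * error * s_i    # gradient of (Ws - y)^2
            state = next_state
            steps += 1
            if done:
                break
        history.append(steps)
    return W, history


if __name__ == "__main__":
    _, history = train()
    print(history)
```

The sketch is meant to show the shape of the loop (choose an action, observe the transition, build the target, update W), not to reproduce any particular score.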

Code: apply

Results: really poor!

Why does it not work? Too shallow?

Exercise

• Why does it not work?

• Hint: DQN
