Lab 3: Dummy Q-learning (table) Reinforcement Learning with TensorFlow&OpenAI Gym Sung Kim <[email protected]>

Post on 10-Feb-2018

Page 1

Lab 3: Dummy Q-learning (table)

Reinforcement Learning with TensorFlow&OpenAI Gym
Sung Kim <[email protected]>

Page 2

Learning Q(s, a): Table

initial Q values are 0

[Figure: grid world with every Q value shown as 0]

Page 3

Learning Q(s, a) Table (with many trials)

initial Q values are 0

Page 4

Learning Q(s, a) Table: one success!

initial Q values are 0

[Figure: grid world after one successful episode, with some Q values now shown as 1]

Page 5

Learning Q(s, a) Table: one success!

[Figure: grid world after one successful episode, with some Q values now shown as 1]

Page 6

Dummy Q-learning algorithm

Page 7

Dummy Q-learning algorithm
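The algorithm text on these slides did not survive extraction. As a hedged sketch, the "dummy" update is commonly written as Q(s, a) ← r + max over a' of Q(s', a'), with no discount and no learning rate. The function name `dummy_q_update`, the 16-state table, and the state numbers used below are assumptions for illustration, not taken from the slide.

```python
# Sketch of a dummy Q-learning update (no discount, no learning rate).
# Assumption: 16 states, 4 actions ordered LEFT, DOWN, RIGHT, UP.
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3


def dummy_q_update(Q, state, action, reward, next_state):
    """Set Q[state][action] to reward + the best Q value of next_state."""
    Q[state][action] = reward + max(Q[next_state])
    return Q[state][action]


# All Q values start at 0.
Q = [[0.0] * 4 for _ in range(16)]

# Assumed transition: RIGHT from state 14 reaches the goal (state 15), reward 1.
dummy_q_update(Q, 14, RIGHT, 1.0, 15)

# Assumed transition: DOWN from state 10 lands on state 14 with reward 0,
# but inherits the value already stored for state 14.
dummy_q_update(Q, 10, DOWN, 0.0, 14)

print(Q[14], Q[10])
```

Applied step by step like this, a single success lets the value 1 spread backward from the goal toward the start, which is what the "one success!" slides illustrate.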

Page 8

Code: setup

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap

# https://gist.github.com/stober/1943451
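The setup code on this slide was an image and is not in the transcript. Since every Q value starts at 0, a plain argmax would always pick the first action, so a setup like this plausibly needs an argmax that breaks ties at random. The sketch below is an assumption about that helper, written with the standard library only; the name `rargmax` is illustrative.

```python
import random


def rargmax(values, rng=random):
    """Return the index of a maximum value, choosing randomly among ties."""
    best = max(values)
    candidates = [i for i, v in enumerate(values) if v == best]
    return rng.choice(candidates)


# With a unique maximum the result is deterministic.
print(rargmax([0.0, 0.0, 1.0, 0.0]))

# With an all-zero row any of the four actions may come back.
picks = {rargmax([0.0, 0.0, 0.0, 0.0]) for _ in range(200)}
print(sorted(picks))
```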

Page 9

Code: (dummy) Q-learning

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap
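The training code on this slide is also missing from the transcript. The following is a minimal sketch of what a dummy Q-learning loop could look like, not the slide's code. To stay self-contained it replaces the Gym environment with a hand-written deterministic 4x4 grid; the map layout, the `step` function, the episode count, and the seed are all assumptions.

```python
import random

# Hand-written stand-in for a deterministic FrozenLake-style grid (assumption).
# S = start, F = frozen, H = hole, G = goal, read row by row.
GRID = "SFFF" "FHFH" "FFFH" "HFFG"
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # LEFT, DOWN, RIGHT, UP


def step(state, action):
    """Return (next_state, reward, done) for one deterministic move."""
    row, col = divmod(state, 4)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), 3)  # a move off the grid is clamped at the edge
    col = min(max(col + d_col, 0), 3)
    next_state = row * 4 + col
    tile = GRID[next_state]
    return next_state, float(tile == "G"), tile in "HG"


def rargmax(values, rng):
    """Argmax with random tie-breaking."""
    best = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == best])


def train(num_episodes=2000, max_steps=10_000, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * 4 for _ in range(16)]  # initial Q values are 0
    rewards = []
    for _ in range(num_episodes):
        state, total = 0, 0.0
        for _ in range(max_steps):
            action = rargmax(Q[state], rng)
            next_state, reward, done = step(state, action)
            # Dummy update: no discount, no learning rate.
            Q[state][action] = reward + max(Q[next_state])
            total += reward
            state = next_state
            if done:
                break
        rewards.append(total)
    return Q, rewards


Q, rewards = train()
print("Success rate:", sum(rewards) / len(rewards))
```

Early episodes wander at random and mostly fall into holes. Once the start state has a nonzero Q value, the agent repeats the same successful path every time.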

Page 10

Code: result reporting

Success rate: 0.95
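The reporting code is not in the transcript. One plausible way to arrive at a figure like this is to divide the number of successful episodes by the total number of episodes. The sketch below assumes each episode's total reward is 1 on success and 0 otherwise; `success_rate` and the sample reward list are hypothetical.

```python
def success_rate(episode_rewards):
    """Fraction of episodes that ended with reward 1."""
    if not episode_rewards:
        raise ValueError("need at least one episode")
    return sum(episode_rewards) / len(episode_rewards)


# Hypothetical run: 20 episodes, of which only the first one failed.
rewards = [0.0] + [1.0] * 19
print("Success rate: " + str(success_rate(rewards)))
```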

Page 11

Q = np.zeros([env.observation_space.n, env.action_space.n])

   LEFT DOWN RIGHT UP
[[ 0. 0. 1. 0.]
 [ 0. 0. 1. 0.]
 [ 0. 1. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 1. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 1. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 1. 0.]
 [ 0. 0. 0. 0.]]

print(Q)
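To read the printed table, follow the action with the largest Q value from each state. The sketch below copies the 16 rows shown on this slide and traces that greedy path. It assumes states are numbered row by row on a 4-wide grid, so RIGHT adds 1 and DOWN adds 4; the helper `greedy_path` is illustrative.

```python
ACTIONS = ["LEFT", "DOWN", "RIGHT", "UP"]
# Assumption: states are numbered row by row on a grid 4 cells wide.
MOVES = {"LEFT": -1, "DOWN": 4, "RIGHT": 1, "UP": -4}

# Q table as printed on the slide (columns: LEFT, DOWN, RIGHT, UP).
Q = [
    [0, 0, 1, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0],
]


def greedy_path(Q, start=0):
    """Follow the highest-valued action until a state with no learned value."""
    path = [start]
    state = start
    # The length check guards against cycles in a malformed table.
    while max(Q[state]) > 0 and len(path) <= len(Q):
        action = ACTIONS[Q[state].index(max(Q[state]))]
        state += MOVES[action]
        path.append(state)
    return path


print(greedy_path(Q))  # [0, 1, 2, 6, 10, 14, 15]
```

Only the states on this one path have nonzero entries. The dummy agent never explores alternatives once it has found a route that works.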

Page 12

Next

Exploit&exploration and discounted future reward
