deep reinforcement learning

Deep RL Abecon 20.05.2016

Uploaded by: leszek-rybicki


TRANSCRIPT

Page 1: Deep Reinforcement Learning

Deep RL, Abecon, 20.05.2016

Page 2: Deep Reinforcement Learning

RL-type Problems

• games: chess, Go, Space Invaders

• balancing a unicycle

• investing in the stock market

• running a business

• making fast food

• life…!

Page 3: Deep Reinforcement Learning

Markov Decision Process

• S - set of states

• A - set of actions (or actions for state)

• P(s, s’ | a) - probability of transitioning from state s to s’ under action a

• R(s, s’ | a) - reward received for transitioning from s to s’ under action a

• ∈ [0, 1] - discount factor

Page 4: Deep Reinforcement Learning
Page 5: Deep Reinforcement Learning

The GOAL: maximize the total discounted reward.
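The formula itself was an image on the original slide; written out (assuming the usual convention that r_{t+1} is the reward received after step t), the return to maximize is:

```latex
G = \sum_{t=0}^{\infty} \gamma^{t} \, r_{t+1}, \qquad \gamma \in [0, 1]
```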

Page 6: Deep Reinforcement Learning

γ: discount factor

γ → 0: instant gratification

γ → 1: patience

Page 7: Deep Reinforcement Learning

Value Functions

http://cs.stanford.edu/people/karpathy/reinforcejs/gridworld_dp.html
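The gridworld demo linked above computes value functions by dynamic programming. A minimal value-iteration sketch on a toy 2-state, 2-action MDP (all numbers invented; the repeated Bellman optimality backup is the same idea the demo runs on its grid):

```javascript
// Value iteration on a toy MDP (numbers invented for illustration).
var P = [[[0.9, 0.1], [0.0, 1.0]],   // P[s][a][s'] transition probabilities
         [[1.0, 0.0], [0.0, 1.0]]];
var R = [[0, 0], [1, 0]];            // R[s][a] immediate reward
var gamma = 0.9;

var V = [0, 0];                      // initial value estimates
for (var iter = 0; iter < 200; iter++) {
  // One synchronous sweep: V(s) <- max_a [ R(s,a) + gamma * E[V(s')] ]
  V = V.map(function(_, s) {
    var qs = P[s].map(function(probs, a) {
      var expected = probs.reduce(function(acc, p, s2) {
        return acc + p * V[s2];      // expectation over next states
      }, 0);
      return R[s][a] + gamma * expected;
    });
    return Math.max.apply(null, qs); // greedy over actions
  });
}
console.log(V);                      // converged optimal state values
```

Because the backup is a γ-contraction, 200 sweeps are far more than enough for this toy problem to converge.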

Page 10: Deep Reinforcement Learning

Reinforce.js

http://cs.stanford.edu/people/karpathy/reinforcejs/

// create an environment object
var env = {};
env.getNumStates = function() { return 8; }
env.getMaxNumActions = function() { return 4; }

// create the DQN agent
var spec = { alpha: 0.01 }
agent = new RL.DQNAgent(env, spec);

setInterval(function(){ // start the learning loop
  var action = agent.act(s); // s is an array of length 8
  agent.learn(reward);
}, 0);

Page 11: Deep Reinforcement Learning

2013: Deep RL

http://arxiv.org/abs/1312.5602

Page 12: Deep Reinforcement Learning

2014: Google buys DeepMind

Page 13: Deep Reinforcement Learning

2015: AlphaGO

Page 14: Deep Reinforcement Learning

Deep Q-Learning

1. Do a feedforward pass for the current state s to get predicted Q-values for all actions.

2. Do a feedforward pass for the next state s’ and calculate the maximum over all network outputs, max_a’ Q(s’, a’).

3. Set the Q-value target for the taken action to r + γ max_a’ Q(s’, a’) (the max calculated in step 2). For all other actions, set the Q-value target to the value originally returned in step 1, making the error 0 for those outputs.

4. Update the weights using backpropagation.

http://www.nervanasys.com/demystifying-deep-reinforcement-learning/
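The target construction in the four steps above can be sketched in code. Here qForward is a hypothetical stand-in for the network's feedforward pass (a simple lookup with invented Q-values); the actual network and the backpropagation of step 4 are out of scope:

```javascript
// Sketch of the Deep Q-Learning target construction (steps 1-3 above).
// qForward stands in for the network's feedforward pass: here just a
// lookup over invented Q-values, purely for illustration.
var Q = {s: [0.2, 0.5], sNext: [1.0, 0.3]};
function qForward(state) { return Q[state].slice(); }

var gamma = 0.9;
var reward = 1.0;
var actionTaken = 0;

// Step 1: predicted Q-values for the current state s
var qPred = qForward("s");
// Step 2: maximum over the network outputs for the next state s'
var maxNext = Math.max.apply(null, qForward("sNext"));
// Step 3: the target equals the prediction for every action except the
// one taken, so the error is 0 for those outputs
var target = qPred.slice();
target[actionTaken] = reward + gamma * maxNext;
// Step 4 (not shown): backpropagate the error (target - qPred)
console.log(target); // [1.9, 0.5]
```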

Page 15: Deep Reinforcement Learning

Deep Q-Learning

http://www.nervanasys.com/demystifying-deep-reinforcement-learning/

Page 16: Deep Reinforcement Learning

https://www.youtube.com/watch?v=32y3_iyHpBc

http://gabrielecirulli.github.io/2048/

Page 17: Deep Reinforcement Learning

Asynchronous Gradient Descent

http://arxiv.org/abs/1602.01783

Page 18: Deep Reinforcement Learning

http://www.rethinkrobotics.com/baxter/