SAMSUNG OPEN SOURCE CONFERENCE 2019
SOSCON
Unity ML-Agents
Development of AI Agents Using Unity ML-Agents
Hanyang University | Automotive Engineering | Kyushik Min
2019.10.17
Contents
01 Self Introduction
02 Reinforcement Learning
03 Unity ML-Agents
04 Development Case using Unity ML-Agents
Self Introduction
Kyushik Min
Ph.D. Candidate, Department of Automotive Engineering, Hanyang University
• Research Topics
- Self Driving Car, Driver Assistance System, Artificial Intelligence, Reinforcement Learning
• Career
- Unity Masters
- Manager of Reinforcement Learning Korea (Facebook Page)
- Creative Application Award Winner at ML-Agents Challenge
- Doing Research and writing papers using ML-Agents
- Conducting numerous seminars and lectures on ML-Agents
Reinforcement Learning
Machine Learning
• Machine Learning
- A research area that gives computers the ability to learn without being explicitly programmed
• Types of Machine Learning
- Supervised Learning, Unsupervised Learning, Reinforcement Learning
Reinforcement Learning
• Learning by Reward
• Gain varied experience through trial and error
• Learn to choose actions in a way that maximizes rewards
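The three points above can be illustrated with a very small sketch. It is not from the talk: it assumes a toy multi-armed bandit with made-up reward probabilities and uses epsilon-greedy action selection, one of the simplest ways to combine trial and error with reward maximization. The function name `run_bandit` and all parameter values are hypothetical.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    q_values = [0.0] * n_actions   # estimated reward of each action
    counts = [0] * n_actions       # how often each action was tried

    for _ in range(steps):
        # Trial and error: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q_values[a])

        # The (hypothetical) environment answers with a reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental average: move the estimate toward the observed reward.
        counts[action] += 1
        q_values[action] += (reward - q_values[action]) / counts[action]

    return q_values, counts

q_values, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(q_values)), key=lambda a: q_values[a])
print("estimated values:", q_values)
print("most valuable action:", best_action)
```

After enough steps the agent spends most of its choices on the action with the highest reward probability, which is the behaviour the bullets describe.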
• Training Process of Reinforcement Learning
AlphaGo (2016)
AlphaGo Zero (2017)
OpenAI Five (2018)
AlphaStar (2019)
Locomotion
DeepMimic
Reinforcement Learning Korea
https://www.facebook.com/groups/ReinforcementLearningKR/
https://github.com/reinforcement-learning-kr/how_to_study_rl
Unity ML-Agents
Reinforcement Learning
• Training Process of Reinforcement Learning
• Agent
- Deep Q Network
- Rainbow DQN
- Deep Deterministic Policy Gradient
- Trust Region Policy Optimization
- Proximal Policy Optimization
• Environment
- OpenAI Gym
- Atari
- Super Mario
- MuJoCo
- Malmo
If you are using an existing environment…
• It is difficult to modify the environment
• The environment you need may not exist
Need to create an RL environment
• Concerns of people who study RL
• Create an environment for testing RL
Unity
• Game engine that provides the development environment for 2D & 3D video games
• Also applied to various industries such as 3D animation, architectural visualization, VR
• Over 45% of the game engine market, over 5 million registered developers
Agent and Environment
• Agent → Environment: Action
• Environment → Agent: State, Reward
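The exchange on this slide (the agent sends an action, the environment returns a state and a reward) can be sketched as a plain loop. The `LineWorld` environment, its `reset`/`step` methods, and the reward values below are hypothetical and only serve to make the loop runnable; they are not part of the talk or of any library.

```python
class LineWorld:
    """Hypothetical toy environment: walk from position 0 to position `goal`."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position                      # initial state

    def step(self, action):
        # action: 0 = move left, 1 = move right
        self.position += 1 if action == 1 else -1
        self.position = max(self.position, 0)
        done = self.position == self.goal
        reward = 1.0 if done else -0.1            # small cost for every extra step
        return self.position, reward, done        # state, reward, terminal flag


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the summed reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                    # agent -> environment: action
        state, reward, done = env.step(action)    # environment -> agent: state, reward
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    """A fixed policy standing in for a learned agent."""
    return 1


episode_return = run_episode(LineWorld(goal=3), always_right)
print("episode return:", episode_return)
```

A learning algorithm would replace `always_right` with a policy that is updated from the observed rewards; the loop itself stays the same.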
Unity ML-Agents
First released on 2017.09.19; version at the time of this talk: Beta 0.10.0
• API to simplify configuration for RL in Unity environments
• Communication between Python and Unity environment (State, Action, Reward)
• Consists of Agent, Brain, Academy
- Agent: code for the agent; defines its observations, actions, and rewards
- Brain: determines how agents are controlled (Player, Heuristic, Learning)
- Academy: manages the brains and holds the environment-wide settings
• Single Agent
• Multi Agent
• Adversarial Agents
• Imitation Learning
Development Case using Unity ML-Agents
Machine Learning Camp Jeju 2017
Project Proposal
• Various Advanced Driver Assistance Systems (ADAS), including lane keeping and lane change, are already commercialized
• Autonomous driving is possible by combining ADAS functions
• The challenge is to decide which ADAS function should control the vehicle in each state
Figure: Lane Keeping, Lane Change, Cruise Control
• With RL, it is hard to predict which action the agent will select
• Collision avoidance systems such as AEB and lane change prevention are applied
• Sensors used in autonomous vehicles, such as cameras, LIDAR, and RADAR, are used
• However, no existing simulator satisfied these conditions
Simulator
Simulator (Observations)
Simulator (Actions)
• Keep Current State
• Acceleration
• Deceleration
• Lane Change (Right)
• Lane Change (Left)
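A discrete action set like this is typically exposed to the RL algorithm as integer indices. The sketch below shows one way to do that; the index assigned to each action and the names `DrivingAction` and `greedy_action` are assumptions for illustration, since the slide only names the five actions.

```python
from enum import IntEnum


class DrivingAction(IntEnum):
    # The numeric indices are an assumption; the slide gives only the names.
    KEEP_CURRENT_STATE = 0
    ACCELERATION = 1
    DECELERATION = 2
    LANE_CHANGE_RIGHT = 3
    LANE_CHANGE_LEFT = 4


def greedy_action(q_values):
    """Pick the action with the highest estimated value (one value per action)."""
    if len(q_values) != len(DrivingAction):
        raise ValueError("expected one value per action")
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return DrivingAction(best_index)


chosen = greedy_action([0.1, 0.4, -0.2, 0.9, 0.3])
print(chosen.name)
```

Using an enum keeps the network output (an index) and the simulator command (a named manoeuvre) in one place, so a change in the action set only has to be made once.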
Simulator (Reward)
Network Architecture
Communication between Python & Unity
Socket Communication
• Vehicle Simulator → RL Algorithm: Camera Image, Sensor Data, Reward, Terminal
• RL Algorithm → Vehicle Simulator: Action
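The talk does not show the message format that was used over the socket. As a rough sketch only, the exchange could look like the following, assuming JSON messages with a 4-byte length prefix; the field names (`sensor`, `reward`, `terminal`, `action`) are hypothetical, and the camera image is left out to keep the example short. A local socket pair stands in for the Unity simulator and the Python RL code.

```python
import json
import socket
import struct


def send_message(sock, payload):
    """Send a dict as length-prefixed JSON (4-byte big-endian length header)."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def recv_exact(sock, n_bytes):
    """Read exactly n_bytes, since recv() may return fewer bytes than requested."""
    chunks = []
    remaining = n_bytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed JSON message and decode it into a dict."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Local demonstration: a connected socket pair stands in for Unity <-> Python.
simulator_end, algorithm_end = socket.socketpair()
send_message(simulator_end, {"sensor": [12.5, 30.0], "reward": 0.1, "terminal": False})
step = recv_message(algorithm_end)      # RL side receives state, reward, terminal
send_message(algorithm_end, {"action": 3})
reply = recv_message(simulator_end)     # simulator side receives the chosen action
simulator_end.close()
algorithm_end.close()
print(step, reply)
```

The length prefix is one common way to keep message boundaries intact on a stream socket, which matters when both sides run at different speeds.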
Result
Communication between Python & Unity
• Implemented with socket communication, but it was unstable and had many bugs
- Communication interruptions
- A lot of coding was required for small changes in the environment
- Synchronization problems between messages
- Speed differences between the Unity and Python code
• Spent about 1 to 2 months trying to solve these problems
- About 70% of all problems were solved
- A release on GitHub was planned
ML-Agents released (2017.09.19)
ML-Agents Challenge
• Applied ML-Agents to the environment created at Jeju Camp
• Simplified the environment (static obstacles)
2018 IEEE IV Conference
Vehicle Simulator Unity ML-agents
| Input Configuration | Average Speed (km/h) | Average # of Lane Changes | Average # of Overtakes |
| --- | --- | --- | --- |
| Camera Only | 71.0776 | 15 | 35.2667 |
| LIDAR Only | 71.3758 | 14.2667 | 38.0667 |
| Multi-Input | 75.0212 | 19.4 | 44.8 |
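To read the table more easily, the gain of the Multi-Input configuration over each single-sensor configuration can be computed directly from the reported numbers. The helper name `relative_gain` is hypothetical; the values are copied from the table above and nothing else is assumed.

```python
# Values copied from the table above.
results = {
    "Camera Only": {"speed": 71.0776, "lane_changes": 15.0, "overtakes": 35.2667},
    "LIDAR Only": {"speed": 71.3758, "lane_changes": 14.2667, "overtakes": 38.0667},
    "Multi-Input": {"speed": 75.0212, "lane_changes": 19.4, "overtakes": 44.8},
}


def relative_gain(metric, baseline, candidate="Multi-Input"):
    """Percentage change of `candidate` over `baseline` for one metric."""
    base = results[baseline][metric]
    return 100.0 * (results[candidate][metric] - base) / base


for baseline in ("Camera Only", "LIDAR Only"):
    gains = {m: round(relative_gain(m, baseline), 1) for m in ("speed", "overtakes")}
    print(baseline, gains)
```

In relative terms the speed gain is a few percent, while the gain in overtakes is considerably larger.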
IEEE T-IV
• The driving situation is a stochastic environment
• Even the same action in the same state can lead to different results!
• General RL: predicts the value as a single scalar
• Distributional RL
- Predicts the value the agent will receive in the future as a probability distribution
- Better performance in stochastic environments
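The difference between a scalar value and a value distribution can be seen in a small, made-up example. The return samples below are hypothetical and chosen so that both actions have the same expected return; the function names are illustrative only.

```python
from statistics import mean

# Hypothetical sampled returns for one state and two actions.
returns = {
    "keep_lane": [1.0, 1.0, 1.0, 1.0],        # always the same outcome
    "change_lane": [3.0, 3.0, 3.0, -5.0],     # usually better, occasionally very bad
}


def scalar_value(samples):
    """What a scalar value function estimates: the expected return."""
    return mean(samples)


def quantile_values(samples, n_quantiles=4):
    """A crude distributional summary: the sample at each quantile midpoint."""
    ordered = sorted(samples)
    n = len(ordered)
    values = []
    for i in range(n_quantiles):
        tau = (2 * i + 1) / (2 * n_quantiles)   # quantile midpoint
        values.append(ordered[min(int(tau * n), n - 1)])
    return values


for action, samples in returns.items():
    print(action, scalar_value(samples), quantile_values(samples))
```

Both actions look identical to a scalar estimate, while the quantile view exposes the rare bad outcome of the second action, which is the kind of information a distributional method can use in a stochastic environment.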
• Simulation Environment
- Add fog and sensor noise to verify the robustness of Distributional RL
- Sensor noise equation: $d = d + \alpha \cdot \mathrm{Random.Range}(-d, d)$ ($\alpha$: noise weight)
- Sensor noise and fog are added only when validating the algorithm after training
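The sensor noise equation can be written out as a short function. This is a sketch in Python rather than the Unity code used in the talk: `random.uniform` stands in for Unity's `Random.Range`, and the function name `add_sensor_noise` and the example values are assumptions.

```python
import random


def add_sensor_noise(distance, alpha, rng=random):
    """Return d + alpha * U(-d, d), following the slide's noise equation.

    random.uniform stands in for Unity's Random.Range; alpha is the noise weight.
    """
    return distance + alpha * rng.uniform(-distance, distance)


rng = random.Random(0)                       # seeded only to make the demo repeatable
noisy = [add_sensor_noise(10.0, 0.2, rng) for _ in range(5)]
print(noisy)
```

Because the noise range scales with the measured value, a reading `d` stays within `d * (1 - alpha)` and `d * (1 + alpha)`, so distant objects receive proportionally larger absolute noise.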
• Network Architecture
- Use QR-DQN, one of the Distributional RL algorithms
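The network diagram itself is not recoverable from the transcript, but the loss that QR-DQN trains its quantile outputs with can be sketched. The following is a plain-Python version of the quantile Huber loss for a single state-action pair, written from the published QR-DQN formulation; it is not the author's implementation, and the function names and example numbers are illustrative.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic near zero, linear for |u| > kappa."""
    if abs(u) <= kappa:
        return 0.5 * u * u
    return kappa * (abs(u) - 0.5 * kappa)


def quantile_huber_loss(predicted_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss for one state-action pair.

    predicted_quantiles: the N quantile values output by the network.
    target_samples: samples of the target return distribution.
    """
    n = len(predicted_quantiles)
    loss = 0.0
    for i, theta in enumerate(predicted_quantiles):
        tau = (2 * i + 1) / (2 * n)              # quantile midpoint for output i
        for target in target_samples:
            u = target - theta                   # error between target and estimate
            # Asymmetric weight: under- and over-estimation are penalized differently.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            loss += weight * huber(u, kappa)
    return loss / len(target_samples)            # mean over targets, sum over quantiles


example_loss = quantile_huber_loss([0.0, 0.0], [1.0])
print(example_loss)
```

The asymmetric weight is what pushes each output toward a different quantile of the return distribution instead of all outputs collapsing to the mean.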
• Result
- High average speed with minimal unnecessary lane changes
IEEE Transactions on Intelligent Vehicles, Vol. 4, No. 3, Sep 2019
GitHub
• The following items are uploaded to GitHub!
- RL Algorithms
- Built Unity Environment
- Unity Files
https://github.com/MLJejuCamp2017/DRL_based_SelfDrivingCarControl
2019 IEEE ISPACS
• Multi-Agent Traffic Control Environment
- It is difficult to move to the desired lane in complex road situations
- Overall traffic may slow down during lane changes
- Lane changes in complex situations can lead to accidents
• Goal
- Control multiple vehicles at the same time to move a specific vehicle to the target lane!
- Minimize the reduction in overall vehicle speed
=> Multi-Agent Reinforcement Learning
• Predator-Prey Environment
- Predators learn to hunt the prey, and prey learn to run away from the predators
• Zombie Defense Environment
• Multi-Agent Traffic Control Environment
ISPACS 2019, Beitou, Taipei, Dec. 3-6, 2019
Extra Research
Conclusion
• Are you curious about reinforcement learning? Come to RLKorea!
• Unity ML-Agents makes it easy to create RL environments!
- Easy environment creation using Unity
- Stable communication between Unity environment and Python code
- Support for creating a variety of RL environments (Multi-Agent, Curriculum,…)
• Carried out various studies using ML-Agents
- Created a variety of game and vehicle environments
- Verified RL performance using ML-Agents
THANK YOU