Paper introduction: Value Iteration Networks (teamLab study session)

Value Iteration Networks. Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine and Pieter Abbeel @ UC Berkeley, NIPS 2016. 2017/03/09

Upload: ryo-yamamoto

Post on 19-Mar-2017


TRANSCRIPT

Value Iteration Networks
Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine and Pieter Abbeel @ UC Berkeley
NIPS 2016
2017/03/09

2

Q
https://www.slideshare.net/yamaryox/20160421-70945023

3

Q Q

Q

Q

4

Qreactive

5

CNN

6

Value Iteration Networks

7

s x, y s (s) a r

( (s), a )

8

(s)

maxmax...

Value Iteration Value Iteration
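As a reminder of what value iteration computes, here is a minimal tabular sketch on a small deterministic grid world. It is an illustration only, not code from the paper or the slides. The 4-connected moves, the step reward of -1, the absorbing goal, and the names `value_iteration`, `grid`, and `goal` are all assumptions made for this example. Each sweep applies the backup V(s) <- max_a [ r(s, a) + gamma * V(s') ].

```python
# Minimal tabular value iteration on a deterministic grid world.
# Assumptions (not from the slides): 4-connected moves, step reward -1,
# absorbing goal with value 0, discount gamma = 1.0.

def value_iteration(grid, goal, gamma=1.0, iters=50):
    """grid[r][c] == 1 marks an obstacle; returns the value table V."""
    rows, cols = len(grid), len(grid[0])
    V = [[0.0] * cols for _ in range(rows)]
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(iters):
        new_V = [row[:] for row in V]
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal or grid[r][c] == 1:
                    continue  # goal is absorbing; obstacles are never occupied
                best = float("-inf")
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    # Moving into a wall or an obstacle leaves the agent in place.
                    if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                        nr, nc = r, c
                    best = max(best, -1.0 + gamma * V[nr][nc])
                new_V[r][c] = best
        V = new_V
    return V


# Usage: an empty 3x3 map with the goal in the bottom-right corner.
V = value_iteration([[0, 0, 0], [0, 0, 0], [0, 0, 0]], goal=(2, 2))
for row in V:
    print(row)
```

Under these assumptions, the value of a reachable cell converges to minus its shortest-path distance to the goal, so acting greedily with respect to V follows a shortest path.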

Value Iteration Networks

10

(s)

maxmax...

(s)

maxmax...

CNN CNN

Value Iteration Networks

(s) CNN

Conv

max & softmax
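The "Conv" and "max" steps above can be sketched as one recurrent block: each action gets its own convolution kernel over the reward map and the current value map, and the new value map is the maximum over the action channels. The sketch below is a simplified pure-Python illustration, not the authors' implementation. The kernels are hand-set rather than learned. The names `conv2d_same`, `vi_module`, `one_hot_kernel`, and `softmax_policy` are hypothetical. The softmax is applied directly to the Q-values at the agent's position, which is an assumption standing in for the learned policy layer.

```python
import math


def conv2d_same(img, kernel):
    """Cross-correlation with zero padding; output has the same size as img."""
    rows, cols = len(img), len(img[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + k][dc + k] * img[rr][cc]
            out[r][c] = acc
    return out


def vi_module(reward, kernels_r, kernels_v, K):
    """Run K rounds of 'conv per action, then max over actions'.

    kernels_r[a] and kernels_v[a] are the (here hand-set) kernels applied to
    the reward map and the value map for action a.
    Returns the final value map and the last per-action Q maps.
    """
    rows, cols = len(reward), len(reward[0])
    value = [[0.0] * cols for _ in range(rows)]
    q = []
    for _ in range(K):
        q = []
        for w_r, w_v in zip(kernels_r, kernels_v):
            q_r = conv2d_same(reward, w_r)
            q_v = conv2d_same(value, w_v)
            q.append([[q_r[r][c] + q_v[r][c] for c in range(cols)]
                      for r in range(rows)])
        # Max over the action channel gives the next value map.
        value = [[max(q_a[r][c] for q_a in q) for c in range(cols)]
                 for r in range(rows)]
    return value, q


def one_hot_kernel(dr, dc, weight):
    """Hypothetical helper: 3x3 kernel with one non-zero entry at offset (dr, dc)."""
    kernel = [[0.0] * 3 for _ in range(3)]
    kernel[dr + 1][dc + 1] = weight
    return kernel


def softmax_policy(q, pos):
    """Softmax over the Q-values at the agent's position (simplified read-out)."""
    r, c = pos
    logits = [q_a[r][c] for q_a in q]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: a 1x3 corridor with reward at the right end and two actions
# (index 0 = look right, index 1 = look left), discount 0.5 baked into the kernels.
center = one_hot_kernel(0, 0, 1.0)
right = one_hot_kernel(0, 1, 0.5)
left = one_hot_kernel(0, -1, 0.5)
value, q = vi_module([[0.0, 0.0, 1.0]], [center, center], [right, left], K=3)
print(value)
print(softmax_policy(q, (0, 1)))
```

Because every step is a convolution, a max, or a softmax, the whole block is differentiable almost everywhere, which is what allows the kernels to be trained by back-propagation in a real implementation.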

13

Value Iteration Networks

CNN Back-Propagation

14

15

Grid-World

Mars Rover Navigation

Continuous Control

Grid-World: 8x8, 16x16, 28x28; 3x3 CNN; conv1=3x3x150, conv2=3x3x1; 10, 20, 36; 5000; 7

CNN FCN (NN)

17

Grid-World

18

Grid-World VIN

19

Grid-World

20

Grid-World

21

Mars Rover Navigation: 128x128; 108; CNN; 16x16; Conv(5x5x6), MaxPool(4x4), Conv(3x3x12), MaxPool(2x2), Conv(3x3x150), Conv(3x3x1); 10,000; 7

22

Mars Rover Navigation

23

Mars Rover Navigation
VIN: 84.8%
CNN: 90.3%

VIN

24

Continuous Control: (x, y, vx, vy); 16x16; 3x3; NN; CNN Conv1(3x3x150), Conv2(3x3x1); 20040

25

Continuous Control

(s) CNN

Conv

max (5x5) & x 3

26

Continuous Control CNN

27

Continuous Control

28

29

Value Iteration Networks End-to-End

CNN

30

Web

VIN

31