Visual Perception for Autonomous Driving on the NVIDIA DrivePX2 and using SYNTHIA

http://adas.cvc.uab.es/elektra/
http://www.synthia-dataset.net

Dr. Juan C. Moure
Dr. Antonio Espinosa

http://grupsderecerca.uab.cat/hpca4se/en/content/gpu

Our Background & Current Research Work

Computer Architecture Group:
• GPU acceleration: Bioinformatics, CV, Image Compression

Computer Vision Group:
• CV Algorithms + Deep Learning for Camera-based ADAS

GOAL: Camera-based Perception for Autonomous Driving
• Robotized car
• GPU-accelerated algorithms
• Deep Learning & Simulation Infrastructure (SYNTHIA)

Elektra Car + DrivePX2

Overview of Presentation

GPU Accelerated Perception
• Depth Computation
• Semantic & Slanted stixels (Collaboration with Daimler)
• Speed up MAP estimation problem solved by DP using CNNs

SYNTHIA toolkit
• New datasets, new ground-truth data, LIDARs …

Stereo Vision for Depth Computation

Disparity: distance between the same point in the left & right images. Higher disparity = objects are closer.

10 meters
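The inverse relation between disparity and distance can be made concrete with the standard rectified-stereo formula Z = f · B / d. The sketch below is only an illustration: the focal length (1000 px) and baseline (0.25 m) are hypothetical values, not the parameters of the Elektra car's cameras.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig (assumption): f = 1000 px, B = 0.25 m.
near = depth_from_disparity(25.0, 1000.0, 0.25)  # larger disparity
far = depth_from_disparity(5.0, 1000.0, 0.25)    # smaller disparity
```

With these assumed parameters, a disparity of 25 px corresponds to 10 meters, while 5 px corresponds to 50 meters.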

SemiGlobal Matching (SGM) on GPU: Parallelism

…Matching Cost

Smoothed Cost

Large Grain Parallelism

Medium Grain Parallelism

Fine Grain Parallelism

[Hernández ICCS‐2016]
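As a rough illustration of where the parallelism comes from, the following sketch smooths the matching cost along a single left-to-right path using the standard SGM recurrence. It is plain sequential Python, not the GPU implementation of [Hernández ICCS‐2016]; the function name and the penalties `p1` and `p2` are assumptions chosen for the example.

```python
def aggregate_path(cost_row, p1, p2):
    """Smooth matching costs along one left-to-right path.

    cost_row[x][d] is the matching cost of pixel x at disparity d.
    p1 penalizes a disparity change of 1, p2 any larger change.
    """
    n_disp = len(cost_row[0])
    prev = list(cost_row[0])          # first pixel: no predecessor
    out = [prev]
    for x in range(1, len(cost_row)):  # sequential along the path
        prev_min = min(prev)
        cur = []
        for d in range(n_disp):        # independent for every disparity
            best = min(
                prev[d],
                prev[d - 1] + p1 if d > 0 else float("inf"),
                prev[d + 1] + p1 if d + 1 < n_disp else float("inf"),
                prev_min + p2,
            )
            # Subtract prev_min to keep values bounded along the path.
            cur.append(cost_row[x][d] + best - prev_min)
        out.append(cur)
        prev = cur
    return out


smoothed = aggregate_path([[0, 5, 5], [5, 0, 5], [5, 5, 0]], p1=1, p2=4)
```

The loop over `x` carries a dependency from pixel to pixel, whereas the loop over `d`, and separate paths and image rows, are independent of each other; that independent work is what a GPU can spread across threads.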

SGM on GPU: Results

[Figure: bar chart of performance (frames/second, fps) for image sizes 960x360, 1280x480 and 1920x720 on Tegra X1 (DrivePX) and Tegra Parker (DrivePX2), with the real-time level marked]

Configuration: maximum disparity = image height / 4; SGM with 4 path directions

Tegra Parker improves performance ≈ 4x vs Tegra X1:
• 3.5x higher effective memory bandwidth
• Higher execution overlap among kernels

Stixel World: Compact representation of the world

[Figure: Stereo Images → Stereo Disparity → Stixels; columns segmented into Sky, Obj. and Grnd stixels, with horizon and slope marked]

Stixel = Stick + Pixel

Fixed‐width, variable number of stixels per column

First proposed by a research group at Daimler [Pfeiffer BMVC‐2011]
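One possible way to hold this representation in memory is sketched below. The field names and the coverage check are illustrative assumptions, not the data layout used in the cited work; the point is only that each fixed-width column stores a variable-length list of vertical segments.

```python
from dataclasses import dataclass


@dataclass
class Stixel:
    column: int       # index of the fixed-width image column
    top_row: int      # first image row covered (inclusive)
    bottom_row: int   # last image row covered (inclusive)
    label: str        # e.g. "sky", "object", "ground"
    disparity: float  # representative disparity of the segment


def column_is_covered(stixels, height):
    """Check that the stixels tile rows 0..height-1 with no gap or overlap."""
    expected_top = 0
    for s in sorted(stixels, key=lambda s: s.top_row):
        if s.top_row != expected_top or s.bottom_row < s.top_row:
            return False
        expected_top = s.bottom_row + 1
    return expected_top == height


# A hypothetical 360-row column described by only three stixels.
column = [
    Stixel(0, 0, 99, "sky", 0.0),
    Stixel(0, 100, 219, "object", 12.5),
    Stixel(0, 220, 359, "ground", 20.0),
]
```

In this made-up column, three records replace 360 per-pixel disparity values, which is the sense in which the representation is compact.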

Semantic Stixels: Unified approach

[Figure: Stereo Images → Stereo Disparity → Semantic Stixels; stixels labeled Buil., Sky, Ped., Side and Road, with horizon and slope marked]

Semantic segmentation [Schneider IV‐2016]

Enhanced model: Slanted Stixels

• MAP estimation problem joining semantic & depth Bayesian model (converted to energy minimization)
• Stixel disparity model includes slant b
• Redefine energy function (log-likelihood)
• Enforces prior assumptions: no sky below horizon, objects stand on road

[Figure: Stixels vs. Slanted Stixels]

[Hernández BMVC‐2017] Best Industrial Paper
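To give an intuition for the slant term, the sketch below assumes that the disparity inside one stixel varies linearly with the image row v, d(v) = b · v + a, and recovers b and a with an ordinary least-squares fit. The offset name `a`, the function name and the fitting procedure are assumptions for illustration; they are not necessarily the formulation or the inference used in [Hernández BMVC‐2017].

```python
def fit_slanted_disparity(rows, disparities):
    """Least-squares fit of d(v) = b * v + a over one stixel's rows."""
    n = len(rows)
    mean_v = sum(rows) / n
    mean_d = sum(disparities) / n
    var_v = sum((v - mean_v) ** 2 for v in rows)
    if var_v == 0:
        # A single row cannot determine a slant; fall back to b = 0.
        return 0.0, mean_d
    cov = sum((v - mean_v) * (d - mean_d) for v, d in zip(rows, disparities))
    b = cov / var_v
    a = mean_d - b * mean_v
    return b, a


rows = [200, 201, 202, 203]
slanted = fit_slanted_disparity(rows, [10.0, 10.5, 11.0, 11.5])  # sloped surface
flat = fit_slanted_disparity(rows, [10.0, 10.0, 10.0, 10.0])     # constant disparity
```

A constant-disparity stixel is the special case b = 0, so a model with a slant parameter can represent both upright objects and sloped road surfaces.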

New SYNTHIA-San Francisco dataset

• SF city designed with SYNTHIA toolkit
• 2224 photorealistic images featuring slanted roads, with pixel‐level depth & semantic ground truth
• Very expensive to generate equivalent real‐data images

Results: Quantitative & Visual

[Figure: Left Image | Original Stixels | Slanted Stixels, with 3D representation]

Accuracy Results on SYNTHIA‐SF

• Disparity Error (%): from 30.9 to 12.9
• IoU (%): from 46 to 48.5
• Accuracy remained the same for other datasets

Computation Complexity: Dynamic Programming

Work complexity (per column): O(h²), h = image height

[Figure: Semantic Segmentation and Disparity Image inputs; one column of pixel-size cells labeled Ground, Object, Sky]

• Each column processed independently
• Dynamic programming strategy for efficient evaluation of all the possible configurations
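The quadratic work per column can be seen in a generic segmentation DP such as the sketch below. It uses a deliberately simplified cost (squared deviation from the segment mean plus a fixed per-segment `penalty`), which is an assumption for illustration and not the Stixel energy function; only the structure of the recurrence is the point.

```python
def segment_column(values, penalty):
    """Split a column into contiguous segments with minimum total cost.

    Returns (total_cost, segments), each segment as (start, end), end exclusive.
    """
    h = len(values)
    # Prefix sums make every segment cost an O(1) lookup.
    pre = [0.0] * (h + 1)
    pre2 = [0.0] * (h + 1)
    for i, v in enumerate(values):
        pre[i + 1] = pre[i] + v
        pre2[i + 1] = pre2[i] + v * v

    def seg_cost(j, i):
        s = pre[i] - pre[j]
        return (pre2[i] - pre2[j]) - s * s / (i - j) + penalty

    best = [0.0] + [float("inf")] * h   # best[i]: optimal cost of rows 0..i-1
    back = [0] * (h + 1)                # start of the last segment ending at i
    for i in range(1, h + 1):           # h steps ...
        for j in range(i):              # ... each trying up to h cut positions
            c = best[j] + seg_cost(j, i)
            if c < best[i]:
                best[i], back[i] = c, j

    segments, i = [], h
    while i > 0:                        # follow back-pointers from the bottom
        segments.append((back[i], i))
        i = back[i]
    return best[h], segments[::-1]


column = [1.0] * 4 + [9.0] * 3 + [4.0] * 5
cost, segments = segment_column(column, penalty=1.0)
```

The two nested loops evaluate every (start, end) pair once, so the number of segment evaluations grows as h², while different columns never interact and can be solved in parallel.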

Stixel (DP) Algorithm on GPU: Parallelism

• Large-grain parallelism
• Medium- and fine-grain parallelism
• Sequential operation with decreasing parallelism

[Figure: Stereo Disparity image assigned to CTAs; h × h work per column processed in step 1, step 2, step 3, ..., step h]

Performance Results

Performance (frames/second, fps)

Original Stixel Model:

Image size    DrivePX    DrivePX2
960x360       53         369
1280x480      24         164
1920x720      7          49

Slanted + Semantic Stixel Model (includes time for semantic inference):

Image size    DrivePX    DrivePX2
960x360       17         107
1280x480      8          49
1920x720      3          17

• Real‐time performance on DrivePX2 for all image sizes ( ≈6x‐7x on DrivePX2 vs DrivePX )
• Complex Stixel Model: 60‐70% of time for Stixel algorithm + 30‐40% of time for semantic inference


Improving Computation Complexity: Pre-segmentation

• Infer possible Stixel cuts (pre‐segmentation) from inputs
• Avoid checking all possible Stixel combinations
• Naïve pre-segmentation: O(h)
• Accuracy degrades 10-20% when using pre-segmentation

Work complexity (per column): O(h’×h’), h’ << h
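A minimal sketch of the idea, under simplifying assumptions: a single O(h) scan proposes cut positions wherever neighbouring values jump by more than a `threshold`, and the dynamic program then only considers those h’ positions. The threshold rule, the function names and the squared-deviation cost are hypothetical stand-ins, not the pre-segmentation or energy function of the actual system.

```python
def candidate_cuts(values, threshold):
    """Naive O(h) pre-segmentation: cut where neighbours differ strongly."""
    cuts = [0]
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) > threshold:
            cuts.append(i)
    cuts.append(len(values))
    return cuts


def segment_with_cuts(values, cuts, penalty):
    """DP restricted to candidate cuts: O(h'^2) segment evaluations.

    values must be non-empty; cuts must be increasing, from 0 to len(values).
    """
    def seg_cost(a, b):
        seg = values[a:b]
        mean = sum(seg) / len(seg)
        return sum((v - mean) ** 2 for v in seg) + penalty

    n = len(cuts)
    best = [0.0] + [float("inf")] * (n - 1)
    back = [0] * n
    for i in range(1, n):
        for j in range(i):
            c = best[j] + seg_cost(cuts[j], cuts[i])
            if c < best[i]:
                best[i], back[i] = c, j

    segments, i = [], n - 1
    while i > 0:
        segments.append((cuts[back[i]], cuts[i]))
        i = back[i]
    return best[n - 1], segments[::-1]


column = [1.0] * 4 + [9.0] * 3 + [4.0] * 5
cuts = candidate_cuts(column, threshold=2.0)
cost, segments = segment_with_cuts(column, cuts, penalty=1.0)
```

In this toy column, 12 rows collapse to 4 candidate positions, so the DP tries 6 segment pairs instead of 78. The price is that a cut missed by the scan can never be recovered by the DP, which is one plausible reading of why a naive pre-segmentation costs accuracy.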

Pre-segmentation using a DNN

• Infer possible Stixel cuts from inputs by using general data relations (among columns)
• Now accuracy improves slightly when using pre-segmentation

[Figure: DNN-based pre-segmentation, h → h’]

Improved Performance Results

Performance (frames/second, fps)

Slanted + Semantic Stixel Model (includes time for semantic inference):

Image size    DrivePX    DrivePX2
960x360       17         107
1280x480      8          49
1920x720      3          17

+ Pre‐segmentation:

Image size    DrivePX    DrivePX2
960x360       37         193
1280x480      17         87
1920x720      7          35

• Improves performance on both DrivePX and DrivePX2 ( ≈2x )
• Now 15-30% of time for Stixel algorithm + 70-85% of time for semantic inference
• Inference time increase almost negligible ( <10% )
• Most of the CNN for pre‐segmentation is shared with CNN for semantic segmentation


SYNTHIA Dataset Toolkit

Image generator of precisely annotated data for training DNNs on Autonomous Driving tasks

Ground truth data:
• RGB & per pixel: depth, semantic class, optical flow, 3D bounding boxes
• Fully compatible with Cityscapes classes

Generation of LIDAR data

Problem customization: Synthia‐SanFrancisco

www.synthia-dataset.net

Summary: Real sequence video

Thank you

Autonomous University of Barcelona

Dr. Juan C. Moure

http://grupsderecerca.uab.cat/hpca4se/en/content/gpu
