Visual Perception for Autonomous Driving on the NVIDIA DrivePX2 and using SYNTHIA
http://adas.cvc.uab.es/elektra/
http://www.synthia-dataset.net
Dr. Juan C. Moure
Dr. Antonio Espinosa
http://grupsderecerca.uab.cat/hpca4se/en/content/gpu
Our Background & Current Research Work
Computer Architecture Group: GPU acceleration for Bioinformatics, CV, Image Compression
Computer Vision Group: CV Algorithms + Deep Learning for Camera-based ADAS
GOAL: Camera-based Perception for Autonomous Driving
• Robotized car
• GPU-accelerated algorithms
• Deep Learning & Simulation Infrastructure (SYNTHIA)
Elektra Car + DrivePX2
Overview of Presentation
GPU Accelerated Perception
• Depth Computation
• Semantic & Slanted stixels (Collaboration with Daimler)
• Speed up MAP estimation problem solved by DP using CNNs
SYNTHIA toolkit
• New datasets, new ground-truth data, LIDARs …
Stereo Vision for Depth Computation
Disparity: distance between the same point in the left & right images
Higher disparity = objects are closer
10 meters
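The inverse relation between disparity and distance can be sketched with the usual rectified-stereo formula Z = f · B / d. This is a minimal illustration, not the presenters' code: the focal length and baseline below are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: the point is at infinity
    return focal_px * baseline_m / disparity_px


# Hypothetical rig (assumption, not from the slides): 1000 px focal length, 0.25 m baseline.
FOCAL_PX = 1000.0
BASELINE_M = 0.25

for d in (5.0, 25.0, 50.0):
    # Larger disparity gives a smaller depth, i.e. a closer object.
    print(f"disparity {d:5.1f} px -> depth {depth_from_disparity(d, FOCAL_PX, BASELINE_M):6.1f} m")
```

With these assumed parameters, doubling the disparity halves the estimated depth.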
SemiGlobal Matching (SGM) on GPU: Parallelism
…Matching Cost
Smoothed Cost
Large Grain Parallelism
Medium Grain Parallelism
Fine Grain Parallelism
[Hernández ICCS‐2016]
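The step from matching cost to smoothed cost can be sketched with the standard SGM path recurrence, shown here for a single image row and a single left-to-right path. This is a CPU sketch in NumPy under assumed penalty values `p1` and `p2`; it is not the GPU implementation from [Hernández ICCS‐2016], and it does not show how that work maps the three parallelism levels onto the hardware.

```python
import numpy as np


def aggregate_left_to_right(cost, p1=1.0, p2=8.0):
    """Smooth matching costs along one path (left to right) of one image row.

    cost: array of shape (width, num_disparities) with per-pixel matching costs.
    p1, p2: small / large disparity-change penalties (hypothetical values).
    """
    width, _ = cost.shape
    smoothed = np.empty_like(cost, dtype=np.float64)
    smoothed[0] = cost[0]
    for x in range(1, width):  # sequential along the path
        prev = smoothed[x - 1]
        prev_min = prev.min()
        # Candidates: same disparity, d-1 and d+1 (penalty p1), any other disparity (penalty p2).
        from_minus = np.concatenate(([np.inf], prev[:-1])) + p1
        from_plus = np.concatenate((prev[1:], [np.inf])) + p1
        best = np.minimum(np.minimum(prev, from_minus),
                          np.minimum(from_plus, prev_min + p2))
        # Subtracting prev_min keeps the values bounded along the path.
        smoothed[x] = cost[x] + best - prev_min
    return smoothed


row_cost = np.array([[0.0, 9.0, 9.0],
                     [9.0, 0.0, 9.0]])
print(aggregate_left_to_right(row_cost))
```

Each row and each path direction is an independent recurrence, and all disparities of a pixel are updated together, which is where a GPU version can find work to run in parallel.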
SGM on GPU: Results
Chart: Performance (frames/second, fps), y-axis 0 to 150, for image sizes 960x360, 1280x480 and 1920x720 on Tegra X1 (DrivePX) and Tegra Parker (DrivePX2), with the real-time level marked
Maximum disparity = image height / 4; SGM with 4 path directions
Tegra Parker improves performance ≈ 4x vs Tegra X1:
• 3.5x higher effective memory bandwidth
• Higher execution overlap among kernels
Stixel World: Compact representation of the world
Figure: Stereo Images, Stereo Disparity, Stixels; stixel classes Obj., Sky, Grnd; annotations: slope, horizon
Stixel = Stick + Pixel
Fixed‐width, variable number of stixels per column
First proposed by a research group at Daimler [Pfeiffer BMVC‐2011]
Semantic Stixels: Unified approach
Figure: Stereo Images, Stereo Disparity, Semantic Stixels; stixel classes Buil., Sky, Ped., Side, Road; annotations: slope, horizon
Semantic segmentation [Schneider IV‐2016]
Enhanced model: Slanted Stixels
• MAP estimation problem joining the semantic & depth Bayesian model (converted to energy minimization)
• Stixel disparity model includes slant b
• Redefine energy function (log-likelihood)
• Enforces prior assumptions: no sky below horizon, objects stand on road
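The conversion from MAP estimation to energy minimization mentioned above follows the generic Bayesian identity below. The symbols are assumptions made for this sketch: s stands for a stixel segmentation of a column and m for the measurements (disparity and semantic scores). The slide's own notation did not survive, so this is not the authors' exact formula.

```latex
\hat{s} \;=\; \arg\max_{s}\; p(s \mid m)
        \;=\; \arg\max_{s}\; p(m \mid s)\, p(s)
        \;=\; \arg\min_{s}\; \underbrace{\bigl[\, -\log p(m \mid s) \;-\; \log p(s) \,\bigr]}_{E(s,\,m)}
```

The data term −log p(m | s) scores how well the stixels explain the measurements, and the prior term −log p(s) is where assumptions such as "no sky below horizon" can be encoded as high or infinite cost.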
Stixels
Slanted Stixels
[Hernández BMVC‐2017] Best Industrial Paper
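To illustrate why a slant term matters on steep roads, the sketch below fits one stixel's disparities with a constant model and with a linear model d(v) = b · v + a, where v is the image row. The linear form and the synthetic data are assumptions made for illustration; the actual model and priors are those of [Hernández BMVC‐2017].

```python
import numpy as np


def fit_constant(disparities):
    """Constant model: a single disparity value for the whole stixel."""
    a = float(disparities.mean())
    rmse = float(np.sqrt(np.mean((disparities - a) ** 2)))
    return a, rmse


def fit_slanted(rows, disparities):
    """Slanted model (assumed linear form): d(v) = b * v + a."""
    b, a = np.polyfit(rows, disparities, deg=1)  # highest-degree coefficient first
    rmse = float(np.sqrt(np.mean((disparities - (b * rows + a)) ** 2)))
    return (float(a), float(b)), rmse


# Synthetic stixel on a sloped surface (hypothetical numbers).
rows = np.arange(100, 140, dtype=np.float64)
disparities = 0.2 * rows + 3.0

print("constant model RMSE:", fit_constant(disparities)[1])
print("slanted  model RMSE:", fit_slanted(rows, disparities)[1])
```

On this synthetic segment the constant model leaves a residual of a few pixels while the slanted model fits it almost exactly, which is the effect the slant parameter b is meant to capture.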
New SYNTHIA-San Francisco dataset
• SF city designed with SYNTHIA toolkit
• 2224 photorealistic images featuring slanted roads, with pixel‐level depth & semantic ground truth
• Very expensive to generate equivalent real‐data images
Results: Quantitative & Visual
Left Image Original Stixels Slanted Stixels
3D representation
Accuracy Results on SYNTHIA‐SF
• Disparity Error (%): from 30.9 to 12.9
• IoU (%): from 46 to 48.5
• Accuracy remained the same for other datasets
Computation Complexity: Dynamic Programming
Work Complexity (per column): O(h²), h = image height
Semantic Segmentation
Disparity Image Ground Object Sky
Pixel Size
Each column processed independently
Dynamic programming strategy for efficient evaluation of all the possible configurations
h
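The quadratic per-column work can be seen in a toy dynamic program that splits one column into contiguous segments. This is a sketch under simplifying assumptions: the segment cost (sum of squared deviations from the segment mean) and the fixed `cut_penalty` are stand-ins for the real data and prior terms of the stixel model, which are not reproduced here.

```python
def segment_column(values, cut_penalty=1.0):
    """Split one column into contiguous segments by dynamic programming.

    Returns (total_cost, [(top, bottom_exclusive), ...]).
    Segment cost and cut_penalty are illustrative stand-ins, not the stixel energy.
    """
    h = len(values)
    # Prefix sums make any segment cost available in O(1).
    s1 = [0.0] * (h + 1)
    s2 = [0.0] * (h + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def seg_cost(i, j):  # rows i .. j-1
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * h  # best[j]: optimal cost of rows 0 .. j-1
    prev = [0] * (h + 1)
    for j in range(1, h + 1):          # h end positions ...
        for i in range(j):             # ... times up to h start positions: O(h^2)
            c = best[i] + seg_cost(i, j) + cut_penalty
            if c < best[j]:
                best[j], prev[j] = c, i

    # Backtrack from the bottom of the column to recover the segments.
    segments, j = [], h
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    segments.reverse()
    return best[h], segments


print(segment_column([5.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0]))
```

Because columns do not depend on each other, every column can run this recurrence concurrently, while the outer loop over end positions stays sequential within a column.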
Stixel (DP) Algorithm on GPU: Parallelism
Stereo Disparity
Large Grain parallelism
Medium and Fine Grain parallelism
Sequential Operation with Decreasing Parallelism
Figure labels: CTA ··· CTA; h, h; step 1, step 2, step 3, ….. step h
Performance Results
Original Stixel Model
Chart: Performance (frames/second, fps) on Tegra X1 (DrivePX) and Tegra Parker (DrivePX2) for image sizes 960x360, 1280x480 and 1920x720; bar-label digits: 53 369 24 164 749
Slanted + Semantic Stixel Model (includes time for semantic inference)
Chart: same axes and image sizes; bar-label digits: 17 107 849 3 170
• Real‐time performance on DrivePX2 for all image sizes (≈6x‐7x on DrivePX2 vs DrivePX)
• Complex Stixel Model: 60‐70% of time for Stixel algorithm + 30‐40% of time for semantic inference
Improving Computation Complexity: Pre-segmentation
Semantic Segmentation
Disparity Image Ground Object Sky
• Infer possible Stixel cuts (pre‐segmentation) from inputs
• Avoid checking all possible Stixel combinations
NAÏVE Pre-segmentation: O(h)
Accuracy degrades 10-20% when using pre-segmentation
Figure labels: h, h’
Work Complexity (per column): O(h’×h’), h’ << h
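A naive pre-segmentation can be sketched as a single O(h) scan that keeps a row as a candidate stixel boundary only where the inputs change. The rule used here (semantic label change or a disparity jump above a threshold) and the `jump` value are assumptions made for illustration, not the rule used in the presented system.

```python
def candidate_cuts(disparity, labels, jump=1.0):
    """One O(h) scan over a column: keep row v as a possible stixel boundary
    only if the semantic label changes or the disparity jumps by more than `jump`."""
    h = len(disparity)
    cuts = [0]
    for v in range(1, h):
        if labels[v] != labels[v - 1] or abs(disparity[v] - disparity[v - 1]) > jump:
            cuts.append(v)
    cuts.append(h)
    return cuts


def count_dp_pairs(num_boundaries):
    """Number of (start, end) boundary pairs a column-wise DP has to evaluate."""
    n = num_boundaries - 1
    return n * (n + 1) // 2


disparity = [2.0, 2.0, 2.0, 9.0, 9.0, 9.0, 4.0, 4.0]
labels = ["sky", "sky", "sky", "obj", "obj", "obj", "ground", "ground"]

cuts = candidate_cuts(disparity, labels)
print("candidate boundaries:", cuts)
print("pairs without pre-segmentation:", count_dp_pairs(len(disparity) + 1))
print("pairs with pre-segmentation:   ", count_dp_pairs(len(cuts)))
```

The dynamic program then only considers boundaries from the candidate list, so its work grows with h’² instead of h². A boundary missed by the scan can never be recovered later, which is one plausible reason for the accuracy loss reported for the naive variant.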
Pre-segmentation using a DNN
Semantic Segmentation
Disparity Image Ground Object Sky
• Infer possible Stixel cuts from inputs by using general data relations (among columns)
Figure labels: h, h’
DNN-based Pre-segmentation
Now accuracy improves slightly when using pre-segmentation
Improved Performance Results
+ Pre‐segmentation
Chart: Performance (frames/second, fps) on Tegra X1 (DrivePX) and Tegra Parker (DrivePX2) for image sizes 960x360, 1280x480 and 1920x720; bar-label digits: 37 193 17 87 735
Slanted + Semantic Stixel Model (includes time for semantic inference)
Chart: same axes and image sizes; bar-label digits: 17 107 849 3 170
• Improves performance on both DrivePX and DrivePX2 (≈2x)
• Now 15-30% of time for Stixel algorithm + 70-85% of time for semantic inference
• Inference time increase is almost negligible (<10%)
• Most of the CNN for pre‐segmentation is shared with the CNN for semantic segmentation
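The reported time shares can be cross-checked with a little arithmetic. The sketch below takes mid-range values from the slides as assumptions (65% of frame time in the Stixel algorithm before pre-segmentation, an overall 2x speedup, and inference time growing by the 10% upper bound) and derives the Stixel share afterwards. These inputs are illustrative picks, not measurements.

```python
def stixel_share_after(stixel_frac_before, overall_speedup, inference_growth):
    """Return (stixel share of frame time after, speedup of the stixel stage alone).

    Frame time before pre-segmentation is normalised to 1.0.
    """
    inference_before = 1.0 - stixel_frac_before
    total_after = 1.0 / overall_speedup
    inference_after = inference_before * (1.0 + inference_growth)
    stixel_after = total_after - inference_after
    return stixel_after / total_after, stixel_frac_before / stixel_after


# Assumed mid-range inputs: 65% stixel time, 2x overall, +10% inference time.
share, stage_speedup = stixel_share_after(0.65, 2.0, 0.10)
print(f"stixel share after pre-segmentation: {share:.0%}")
print(f"speedup of the stixel stage alone:   {stage_speedup:.1f}x")
```

Under these assumptions the Stixel stage ends up inside the 15-30% range quoted on the slide, and semantic inference becomes the dominant cost.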
SYNTHIA Dataset Toolkit
Image generator of precisely annotated data for training DNNs on Autonomous Driving tasks
Ground truth data:
• RGB & per pixel: depth, semantic class, optical flow, 3D bounding boxes
• Fully compatible with Cityscapes classes
Generation of LIDAR data
Problem Customization: Synthia‐SanFrancisco
www.synthia-dataset.net
Summary: Real sequence video
Thank you
Autonomous University of Barcelona
Dr. Juan C. Moure
http://grupsderecerca.uab.cat/hpca4se/en/content/gpu