SUPERCOMPUTING IN A CAR

Carlo Nardone, Senior Solution Architect EMEA Enterprise

ISC 2016

Post on 17-Jan-2017

THE WORLD LEADER IN VISUAL COMPUTING

ENTERPRISE | AUTO | GAMING | DATA CENTER | PRO VISUALIZATION

IN THE BEGINNING

SIMULATION MEANS BETTER PRODUCTS, FASTER

ACTUAL CRASH | SIMULATED CRASH

THE SELF DRIVING REVOLUTION

Safer Driving | New Mobility Services | Urban Redesign

AUTONOMOUS DRIVING IS HARD

2016: AN AMAZING YEAR FOR SELF-DRIVING CARS

Uber Enters the Race

Toyota Invests $1B in AI Lab

Volvo Drive Me on Public Roads in 2017

NHTSA: Computer Counts as Driver

Tesla Model 3: 300K pre-orders

Audi, BMW, Daimler Buy HERE

Tesla Model S Autopilot

Baidu Enters the Race

Honda, Nissan, Toyota Team Up

GM Buys Cruise

DEEP LEARNING FOR SELF-DRIVING CARS

NVIDIA PILOTNET VIDEO

Paper: http://arxiv.org/abs/1604.07316

THE BIG BANG IN MACHINE LEARNING

DNN | GPU | BIG DATA

“The GPU is the workhorse of modern A.I.”

WHAT IS DEEP LEARNING?

[Image: “Volvo XC90”]

Image source: “Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks” ICML 2009 & Comm. ACM 2011. Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Ng.

TRAINING VS INFERENCE

DEEP LEARNING EVERYWHERE

NVIDIA Titan X

NVIDIA Jetson

NVIDIA Tesla

NVIDIA DRIVE PX

NVIDIA DGX-1

WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER

170 TFLOPS FP16

8x Tesla P100 16GB

NVLink Hybrid Cube Mesh

Accelerates Major AI Frameworks

Dual Xeon

7 TB SSD Deep Learning Cache

Dual 10GbE, Quad IB 100Gb

3RU – 3200W
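As a rough sanity check on the figures above, the aggregate FP16 number can be split across the eight GPUs and related to the power budget. This is a minimal sketch that uses only the numbers quoted on this slide; the variable names are illustrative, and peak figures like these say nothing about sustained throughput on a real workload.

```python
# Figures quoted on the DGX-1 slide (peak values, not measured throughput).
total_fp16_tflops = 170.0   # aggregate FP16 throughput of the system
num_gpus = 8                # 8x Tesla P100
gpu_memory_gb = 16          # memory per GPU
power_watts = 3200.0        # system power budget

# Derived quantities: simple arithmetic on the slide's numbers.
per_gpu_tflops = total_fp16_tflops / num_gpus
total_gpu_memory_gb = num_gpus * gpu_memory_gb
gflops_per_watt = total_fp16_tflops * 1000.0 / power_watts

print(f"FP16 per GPU:      {per_gpu_tflops:.2f} TFLOPS")
print(f"Total GPU memory:  {total_gpu_memory_gb} GB")
print(f"Peak FP16 / watt:  {gflops_per_watt:.1f} GFLOPS/W")
```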

NVIDIA END-TO-END AUTONOMOUS DRIVING PLATFORM

NVIDIA DRIVE PX 2 | NVIDIA DGX-1

NVIDIA DRIVENET

DRIVEWORKS

Perception | Localization | Planning | Visualization

NVIDIA DRIVE PX 2

World’s First AI Supercomputer for Self-Driving Cars

12 CPU cores | Pascal GPU | 8 TFLOPS | 24 DL TOPS | 16nm FF | 250W | Liquid Cooled

SELF DRIVING COMPUTER I/O

NVIDIA Drive PX 2: 70 Gbps aggregate I/O

[Diagram: two TEGRA processors with connections to SMART CAMERAS, CAMERAS, LIDAR, RADAR, DISPLAY, DATA LOGGING, and DRIVE TRAIN / POWER TRAIN. Link labels: Ethernet, GMSL, FlexRay/Ethernet, CANBus, LVDS, USB/PCIE]

DRIVE™ PX 2

COMPUTATION ENGINES

24 DL TOPS, 8 TFLOPS, high performance CPU/GPU complex

NVIDIA DRIVE PX SW STACK

A full stack of rich software components

GPU INFERENCE ENGINE

High-performance framework makes it easy to develop GPU-accelerated inference

Production deployment solution for deep learning inference

Optimized inference for a given trained neural network and target GPU

Solutions for Hyperscale, ADAS, Embedded

Supports deployment of 32-bit or 16-bit inference

Maximum Performance for Deep Learning Inference

developer.nvidia.com/gpu-inference-engine
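The choice between 32-bit and 16-bit inference is a trade between storage (and bandwidth) and numerical precision. The sketch below is not the GPU Inference Engine API; it assumes the two modes correspond to IEEE 754 single and half precision, and it uses Python's standard `struct` module to round-trip one hypothetical weight value through each format.

```python
import struct

def round_trip(value: float, fmt: str) -> float:
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

weight = 0.1234567  # hypothetical network weight, chosen for illustration

as_fp32 = round_trip(weight, "<f")   # 32-bit single precision
as_fp16 = round_trip(weight, "<e")   # 16-bit half precision

fp32_bytes = struct.calcsize("<f")
fp16_bytes = struct.calcsize("<e")

print(f"FP32: {fp32_bytes} bytes, stored as {as_fp32!r}")
print(f"FP16: {fp16_bytes} bytes, stored as {as_fp16!r}")
print(f"FP32 abs error: {abs(as_fp32 - weight):.3e}")
print(f"FP16 abs error: {abs(as_fp16 - weight):.3e}")
```

Half precision halves the bytes per value at the cost of a larger rounding error; whether that error matters depends on the network and has to be checked on validation data.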

GPU Inference Engine for Automotive

Pedestrian Detection | Lane Tracking | Traffic Sign Recognition

NVIDIA DRIVE PX 2

ACTIVE LEARNING

Data Scientist | Vehicle

DIGITS / Tesla - Train | Drive PX - Deploy

Network | Solver | Dashboard

Model | Classification | Detection | Segmentation

A COMPLETE DEEP LEARNING PLATFORM

MANAGE | TRAIN | DEPLOY

DIGITS: MANAGE / AUGMENT | TRAIN | TEST

GPU INFERENCE ENGINE: DATA CENTER | AUTOMOTIVE | EMBEDDED

NVIDIA DRIVE™ PX 2 Selected by Volvo on Journey Towards a Crash-Free Future

WORLD’S FIRST AUTONOMOUS CAR RACE

10 teams, 20 identical cars | DRIVE PX 2 as “brain” in every car | 2016/17 Formula E season

THANK YOU!

+39 335 5828197

www.nvidia.com/drive

GTC EUROPE

Sep 28-29, 2016 | Amsterdam

www.gputechconf.eu #GTC16

DEEP LEARNING & ARTIFICIAL INTELLIGENCE | SELF-DRIVING CARS | VIRTUAL REALITY & AUGMENTED REALITY | SUPERCOMPUTING & HPC

GTC Europe is a two-day conference designed to expose the innovative ways developers, businesses and academics are using parallel computing to transform our world.

2 Days | 800 Attendees | 50+ Exhibitors | 50+ Speakers | 15+ Tracks | 15+ Workshops | 1-to-1 Meetings

BACKUP SLIDES

MANY THINGS TO LEARN

THE BASIC SELF-DRIVING LOOP

SENSE | PERCEIVE | LOCALIZE | MAP | PLAN | CONTROL
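The loop on this slide can be sketched as a sequence of function calls repeated every time step. The example below is a toy one-dimensional illustration, not NVIDIA's software: every function, constant, and threshold in it (the `MAP` dictionary, the obstacle position, the 25 m safety gap, the acceleration limit) is a hypothetical stand-in chosen so the loop is runnable.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    position_m: float = 0.0
    speed_mps: float = 0.0

MAP = {"speed_limit_mps": 10.0}   # hypothetical map: one speed limit
OBSTACLE_M = 60.0                 # hypothetical stopped obstacle in the lane

def sense(vehicle):
    """SENSE: raw readings, here taken directly from the toy world."""
    return {"odometer_m": vehicle.position_m,
            "range_m": OBSTACLE_M - vehicle.position_m}

def perceive(readings):
    """PERCEIVE: turn raw readings into a scene; here a single gap."""
    return {"gap_m": readings["range_m"]}

def localize(readings, road_map):
    """LOCALIZE: place the vehicle on the MAP and look up local rules."""
    return {"position_m": readings["odometer_m"],
            "speed_limit_mps": road_map["speed_limit_mps"]}

def plan(scene, location, safe_gap_m=25.0):
    """PLAN: pick a target speed; stop once the gap is too small."""
    if scene["gap_m"] <= safe_gap_m:
        return 0.0
    return location["speed_limit_mps"]

def control(vehicle, target_speed_mps, dt_s=0.1, max_accel=3.0):
    """CONTROL: move speed toward the target with bounded acceleration."""
    delta = target_speed_mps - vehicle.speed_mps
    step = max(-max_accel * dt_s, min(max_accel * dt_s, delta))
    vehicle.speed_mps += step
    vehicle.position_m += vehicle.speed_mps * dt_s

def drive(steps=300):
    vehicle = Vehicle()
    for _ in range(steps):
        readings = sense(vehicle)
        scene = perceive(readings)
        location = localize(readings, MAP)
        target = plan(scene, location)
        control(vehicle, target)
    return vehicle

final = drive()
print(f"stopped at {final.position_m:.1f} m, speed {final.speed_mps:.1f} m/s")
```

In a real vehicle each stage is a substantial subsystem, and perception in particular is where the deep neural networks discussed in this talk run.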

INTERFACES

70 Gigabits per second of I/O

Sensor Fusion Interfaces: GMSL Camera, CAN, GbE, BroadR-Reach, FlexRay, LIN, GPIO

Displays and Cockpit Computer Interfaces: HDMI, FPD-Link III and GMSL

Development and Debug Interfaces: HDMI, GbE, 10GbE, USB 3, USB 2 (UART/debug), JTAG

Auto Grade connectors | Debug/Lab interfaces

GPU INFERENCE ENGINE

Optimizations

• Fuse network layers

• Eliminate concatenation layers

• Kernel specialization

• Auto-tuning for target platform

• Select optimal tensor layout

• Batch size tuning

TRAINED NEURAL NETWORK

OPTIMIZED INFERENCE RUNTIME

developer.nvidia.com/gpu-inference-engine
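To make "fuse network layers" concrete, the sketch below folds a per-output scale-and-shift layer into the preceding fully connected layer and applies the ReLU in the same pass. It is a generic illustration in plain Python, not the GPU Inference Engine's implementation; the layer choice, function names, and numbers are assumptions made for the example.

```python
def linear(x, weights, bias):
    """Fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def scale_shift(y, scale, shift):
    """Per-output affine layer (the inference-time form of batch norm)."""
    return [s * v + t for v, s, t in zip(y, scale, shift)]

def relu(y):
    return [max(0.0, v) for v in y]

def unfused(x, weights, bias, scale, shift):
    """Three separate layers, three passes over the activations."""
    return relu(scale_shift(linear(x, weights, bias), scale, shift))

def fuse(weights, bias, scale, shift):
    """Fold scale/shift into the weights once, ahead of inference."""
    fused_w = [[s * w for w in row] for row, s in zip(weights, scale)]
    fused_b = [s * b + t for b, s, t in zip(bias, scale, shift)]
    return fused_w, fused_b

def fused(x, fused_w, fused_b):
    """One pass: matrix-vector product, bias, and ReLU together."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(fused_w, fused_b)]

# Hypothetical two-input, two-output layer.
weights = [[1.0, -2.0], [0.5, 0.25]]
bias = [0.5, -1.0]
scale = [2.0, -1.0]
shift = [0.0, 3.0]
x = [1.0, 2.0]

reference = unfused(x, weights, bias, scale, shift)
fused_w, fused_b = fuse(weights, bias, scale, shift)
result = fused(x, fused_w, fused_b)
print(reference, result)
```

The fused version produces the same outputs while reading and writing the activations once instead of three times, which is the kind of saving layer fusion aims for on a GPU.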

OPEN PLATFORM FOR ALL DEVELOPERS

AUTOMOTIVE PARTNERS

Self Driving Vehicles

“Using NVIDIA DIGITS deep learning platform, in less than four hours we achieved over 96% accuracy using Ruhr University Bochum’s traffic sign database. While others invested years of development to achieve similar levels of perception with classical computer vision algorithms, we have been able to do it at the speed of light.”

Matthias Rudolph, Director of Architecture, Driver Assistance Systems, Audi

“Deep learning on NVIDIA DIGITS has allowed for a 30x enhancement in training pedestrian detection algorithms, which are being further tested and developed as we move them onto the NVIDIA DRIVE PX.”

Dragos Maciuca, Technical Director, Ford Research and Innovation Center

DGX-1 DEEP LEARNING SUPERCOMPUTER

DGX-1 GPU CLUSTER

Two fully connected quads, connected at corners

160GB/s per GPU bidirectional to Peers

Load/store access to Peer Memory

Full atomics to Peer GPUs

High speed copy engines for bulk data copy

PCIe to/from CPU
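The topology described above ("two fully connected quads, connected at corners") can be written down as a small graph and checked against the 160 GB/s figure. This is a sketch under stated assumptions: the GPU numbering (0 to 3 in one quad, 4 to 7 in the other) is hypothetical, and the 40 GB/s bidirectional bandwidth per NVLink link is the first-generation NVLink figure, which does not appear on the slide.

```python
from itertools import combinations

NUM_GPUS = 8
LINK_GB_PER_S = 40  # assumption: bidirectional bandwidth of one NVLink link

def hybrid_cube_mesh():
    """Return the set of GPU-to-GPU links as (low_id, high_id) pairs."""
    links = set()
    for quad in (range(0, 4), range(4, 8)):
        links.update(combinations(quad, 2))   # each quad is fully connected
    for gpu in range(4):
        links.add((gpu, gpu + 4))             # quads joined corner to corner
    return links

links = hybrid_cube_mesh()

# Direct peers of each GPU, derived from the link set.
peers = {
    gpu: sorted(b if a == gpu else a for a, b in links if gpu in (a, b))
    for gpu in range(NUM_GPUS)
}
per_gpu_bandwidth = {gpu: len(p) * LINK_GB_PER_S for gpu, p in peers.items()}

for gpu in range(NUM_GPUS):
    print(f"GPU {gpu}: peers {peers[gpu]}, {per_gpu_bandwidth[gpu]} GB/s")
```

With four links per GPU the total comes to 160 GB/s, matching the slide. Note that not every pair is directly linked in this layout: GPU 0 and GPU 5, for example, are two hops apart.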