redefining the future of computing · 2.4tb/sec to solve the largest of ai and hpc problems....

1
© 2018 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Tesla, and NVLink are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. RECORD PERFORMANCE The HGX-2 platform is powered by NVIDIA NVSwitch which enables every GPU to communicate with every other GPU at full bandwidth of 2.4TB/sec to solve the largest of AI and HPC problems. REDEFINING THE FUTURE OF COMPUTING HGX-2 multi-precision computing platform allows high-precision calculations using FP64 and FP32 for scientific computing and simulations, while also enabling FP16 and Int8 for AI training and inference. This unprecedented versatility provides unique flexibility to support the future of computing. 0.5TB Aggregate High-Bandwidth GPU Memory EXPLOSION OF NETWORK COMPLEXITY AI models are becoming increasingly complex and diverse, from translating languages to autonomous driving. Solving these models requires massive compute capability, large memory, and extremely fast connections between the GPUs. 24X Higher GPU-to-GPU Bandwidth * * Compared to two HGX-1-based servers connected with 4x IB ports 2 PFLOPS Total Compute NVIDIA HGX-2 FUSING HPC AND AI COMPUTING INTO A UNIFIED ARCHITECTURE EMPOWERING THE DATA CENTER ECOSYSTEM NVIDIA works with a wide range of partners to deliver the ideal AI and HPC solution. With HGX-2, they can now integrate a state-of-the-art platform into their servers to advance the data center ecosystem. Convolutional Networks Sequence & Attention Networks Generative Adversarial Networks y1 y2 </s> GPU8 GPU3 GPU2 GPU1 </s> y1 y3 x3 x2 </s> GPU2 GPU2 GPU1 GPU3 GPU8 Reinforcement Learning New Species 9X9 9X9 20 6 Wij = [8x16] 16 DigitalCaps 10 10 ||L2|| 32 256 ReLU Conv1 PrimaryCaps 8 ...... Encoder/Decoder Concat ReLu Dropout BatchNorm Pooling LSTM WaveNet GRU CTC Beam Search Attention 3D-GAN Attention Speech Enhancement MedGAN ConditionalGAN DQN Simulation DDPG Mixture of Experts Neural Collaborative Filtering Block Sparse LSTM SEE HOW HGX-2 CAN ACCELERATE YOUR AI AND HPC WORKLOADS. www.nvidia.com/hgx AI Training HGX-2 Replaces 300 CPU-Only Server Nodes Workload: ResNet50, 90 epochs to solution CPU server: dual-socket Intel Xeon Gold 6140 HPC HGX-2 Replaces 60 CPU-Only Server Nodes Workload: MILC (particle physics HPC application) CPU server: dual-socket Intel Xeon Gold 6140 100 200 300 0 Dual-Socket CPU 1 HGX-2 20 40 60 0 Dual-Socket CPU HGX-2 60x 300x 1 NVIDIA NVSwitches Direct GPU-to-GPU Connection Between All 16 GPUs 12 NVIDIA ® Tesla ® V100 GPUs 0.5TB Memory 16

Upload: others

Post on 12-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: REDEFINING THE FUTURE OF COMPUTING · 2.4TB/sec to solve the largest of AI and HPC problems. REDEFINING THE FUTURE OF COMPUTING HGX-2 multi-precision computing platform allows high-precision

© 2018 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Tesla, and NVLink are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

RECORD PERFORMANCEThe HGX-2 platform is powered by NVIDIA NVSwitch™ which enables every GPU to communicate with every other GPU at full bandwidth of

2.4TB/sec to solve the largest of AI and HPC problems.

REDEFINING THE FUTURE OF COMPUTING HGX-2 multi-precision computing platform allows high-precision calculations

using FP64 and FP32 for scientific computing and simulations, while also enabling FP16 and Int8 for AI training and inference. This unprecedented versatility provides unique flexibility to support the future of computing.

0.5TBAggregate High-Bandwidth

GPU Memory

EXPLOSION OF NETWORK COMPLEXITYAI models are becoming increasingly complex and diverse, from translating

languages to autonomous driving. Solving these models requires massive compute capability, large memory, and extremely fast connections between the GPUs.

24XHigher GPU-to-GPU

Bandwidth*

* Compared to two HGX-1-based servers connected with 4x IB ports

2 PFLOPSTotal Compute

NVIDIA HGX-2FUSING HPC AND AI COMPUTING INTO A UNIFIED ARCHITECTURE

EMPOWERING THE DATA CENTER ECOSYSTEMNVIDIA works with a wide range of partners to deliver the ideal AI and HPC

solution. With HGX-2, they can now integrate a state-of-the-art platform into their servers to advance the data center ecosystem.

Convolutional Networks Sequence & Attention Networks Generative Adversarial Networksy1 y2 </s>

GPU8

GPU3

GPU2

GPU1

</s> y1 y3x3 x2 </s>

GPU2

GPU2

GPU1

GPU3

GPU8

Reinforcement Learning New Species

9X9 9X9

20

6

Wij = [8x16]

16

DigitalCaps

1010

||L2||

32

256 ReLU Conv1

PrimaryCaps

8

......

Encoder/Decoder

Concat

ReLu

Dropout

BatchNorm

Pooling

LSTM

WaveNet

GRU

CTC

Beam Search

Attention

3D-GAN

Attention Speech Enhancement

MedGAN ConditionalGAN

DQN Simulation

DDPG

Mixture of Experts Neural Collaborative Filtering

Block Sparse LSTM

SEE HOW HGX-2 CAN ACCELERATEYOUR AI AND HPC WORKLOADS.

www.nvidia.com/hgx

AI TrainingHGX-2 Replaces

300 CPU-Only Server NodesWorkload: ResNet50, 90 epochs to solution

CPU server: dual-socket Intel Xeon Gold 6140

HPCHGX-2 Replaces

60 CPU-Only Server NodesWorkload: MILC (particle physics HPC application)

CPU server: dual-socket Intel Xeon Gold 6140

100

200

300

0 Dual-SocketCPU

1

HGX-2

20

40

60

0 Dual-SocketCPU

HGX-2

60x300x

1

NVIDIA NVSwitchesDirect GPU-to-GPU Connection

Between All 16 GPUs

12

NVIDIA® Tesla® V100 GPUs0.5TB Memory

16