
Page 1: Solutions for Scalable HPC

Scot Schultz, Director HPC/Technical Computing

HPC Advisory Council – Stanford Conference | Feb 2014

Solutions for Scalable HPC

Page 2: Solutions for Scalable HPC

Leading Supplier of End-to-End Interconnect Solutions

Comprehensive End-to-End Software Accelerators and Management:
• Acceleration: MXM – Mellanox Messaging Acceleration; FCA – Fabric Collectives Acceleration
• Management: UFM – Unified Fabric Management
• Storage and Data: VSA – Storage Accelerator (iSCSI); UDA – Unstructured Data Accelerator

Comprehensive End-to-End InfiniBand and Ethernet Portfolio:
Host/Fabric Software, ICs, Switches/Gateways, Adapter Cards, Cables/Modules, Metro / WAN

Page 3: Solutions for Scalable HPC

Business Success Depends on Mellanox

• Real-time fraud detection: 13 million financial transactions per day, 4 billion database inserts
• Reacting to customers’ needs in real time: 235 supermarkets across 8 US states; data queries reduced from 20 minutes to 20 seconds; accuracy, detail, fast response
• Microsoft Bing Maps: 10X higher performance, 50% CAPEX reduction
• Web 2.0 application, Tier-1 Fortune 100 company: 97% reduction in database recovery time, from 7 days to 4 hours

Page 4: Solutions for Scalable HPC

InfiniBand Enables Lowest Application Cost in the Cloud (Examples)

• Microsoft Windows Azure: 90.2% cloud efficiency, 33% lower cost per application
• Cloud application performance improved up to 10X
• 3x increase in VMs per physical server; consolidation of network and storage I/O
• 32% lower cost per application, 694% higher network performance

Page 7: Solutions for Scalable HPC

Paving The Road for 100Gb/s and Beyond

Recent acquisitions are part of Mellanox’s strategy to make 100Gb/s deployments as easy as 10Gb/s.

Copper (Passive, Active), Optical Cables (VCSEL), Silicon Photonics

Page 8: Solutions for Scalable HPC

The Never Ending Race for Higher Performance

(Timeline, 2000–2020: the 2003 TOP500 Virginia Tech (Apple) system ranked 3rd; “Roadrunner”, Mellanox connected, reached the first Petaflop; the trajectory continues toward Exaflop-class mega supercomputers. Application areas: bioscience, defense, research, space, oil & gas, weather, automotive, multimedia.)

Page 9: Solutions for Scalable HPC

Mellanox InfiniBand Paves the Road to Exascale Computing

Accelerating Half of the World’s Petascale Systems

Mellanox Connected Petascale System Examples

Page 10: Solutions for Scalable HPC

FDR InfiniBand Delivers Highest Return on Investment

(Application performance charts; higher is better.)

Source: HPC Advisory Council

Page 11: Solutions for Scalable HPC

Fastest Performing FDR Solution

Connect-IB™

Page 12: Solutions for Scalable HPC

Mellanox Connect-IB – The World’s Fastest Adapter

The 7th generation of Mellanox interconnect adapters

World’s first 100Gb/s interconnect adapter (dual-port FDR 56Gb/s InfiniBand)

Delivers 137 million messages per second – 4X higher than competition

Supports the new innovative InfiniBand scalable transport – Dynamically Connected Transport

Page 13: Solutions for Scalable HPC

Connect-IB Provides Highest Server and Storage Throughput

Source: Prof. DK Panda

Configurations compared: Connect-IB FDR (dual port), ConnectX-3 FDR, ConnectX-2 QDR, and competition (InfiniBand). Higher is better.

Performance Leadership

Page 14: Solutions for Scalable HPC

Commercial HPC Software Package

Mellanox Scalable HPC Toolkit

Page 15: Solutions for Scalable HPC

Mellanox ScalableHPC Toolkit – Commercial HPC Software

HPC communication libraries:
• MPI based on Open MPI
• SHMEM/PGAS based on OpenSHMEM
• UPC based on Berkeley UPC

CORE-Direct:
• US Department of Energy (DOE) funded project – ORNL and Mellanox
• Adapter-based hardware offloading for collective operations (see the sketch after this list)
• Includes floating-point capability on the adapter for data reductions
• CORE-Direct API is exposed through the Mellanox drivers

Communication accelerators:
• MXM – scalable, high-performance point-to-point messaging
• FCA – collectives acceleration

Tools:
• IPM – Integrated Performance Monitoring
• Profiling tools
• Benchmarks

Supported OS: Linux, with MLNX_OFED or Community OFED (distro)

Supported protocols: InfiniBand, Ethernet-TCP, RoCE, PSM (Intel)

Beta available in Q1’14
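The sketch below is a minimal illustration, not the CORE-Direct API itself (which, per the list above, is exposed through the Mellanox drivers): it uses a standard MPI-3 non-blocking collective, MPI_Iallreduce, which is the kind of operation adapter-based collective offload is meant to progress in hardware while the CPU keeps computing. The build and launch commands in the comment are assumptions about how such a test would be run.

    /* Minimal sketch: a non-blocking MPI-3 collective overlapped with local work.
     * Assumed build/run: mpicc iallreduce.c -o iallreduce && mpirun -np 4 ./iallreduce */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double local = (double)rank;   /* each rank contributes its rank number */
        double sum   = 0.0;
        MPI_Request req;

        /* Start the reduction; it can progress in the background. */
        MPI_Iallreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);

        /* ... independent computation would overlap with the collective here ... */

        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* make sure the result is available */

        if (rank == 0)
            printf("sum of ranks = %g\n", sum);

        MPI_Finalize();
        return 0;
    }

Hardware-offloaded collectives target exactly this pattern: the more of the reduction the adapter can progress on its own, the more of the "independent computation" window the application actually gets back.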

Page 16: Solutions for Scalable HPC

Mellanox PeerDirect™

Native support for peer-to-peer communications between Mellanox HCA adapters and third-party devices

Page 17: Solutions for Scalable HPC

PeerDirect™

PeerDirect is natively supported by Mellanox OFED 2.1 and later distributions

Supports peer-to-peer communications between Mellanox adapters and third-party devices

No unnecessary system memory copies or CPU overhead
• Copies data directly to/from system devices…
• No longer needs a host buffer for each device…
• No longer needs to share a host buffer either…

Supports NVIDIA® GPUDirect RDMA as a separate plug-in (see the sketch after this slide)

Provides support for the Intel Xeon Phi MPSS communication stack directly within MLNX_OFED 2.1

Support for the RoCE protocol over Mellanox VPI

Supported with all Mellanox ConnectX-3 and Connect-IB adapters

(Diagram: two hosts, each with CPU, chipset, and a third-party vendor device; data moves directly between the devices across the fabric, bypassing host memory.)
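To make the data path above concrete, here is a minimal sketch of what PeerDirect with GPUDirect RDMA enables at the application level: handing GPU device pointers straight to MPI. It assumes a CUDA-aware MPI library (for example MVAPICH2 with GPUDirect RDMA, as benchmarked on the next slide, or Open MPI built with CUDA support) and at least two ranks; the buffer size and tag are illustrative.

    /* Illustrative sketch, assuming a CUDA-aware MPI: MPI send/recv are given
     * GPU device pointers directly.  With PeerDirect / GPUDirect RDMA the HCA
     * can move the data to and from GPU memory without staging it in a host buffer. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;                               /* 1 Mi floats per message */
        float *gpu_buf = NULL;
        cudaMalloc((void **)&gpu_buf, n * sizeof(float));    /* device memory, not host memory */
        cudaMemset(gpu_buf, 0, n * sizeof(float));           /* give the buffer defined contents */

        if (rank == 0)
            MPI_Send(gpu_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(gpu_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        cudaFree(gpu_buf);
        MPI_Finalize();
        return 0;
    }

Without GPUDirect RDMA the same code still runs, but the MPI library has to pipeline the transfer through host memory, which is exactly the copy overhead the bullets above describe removing.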

Page 18: Solutions for Scalable HPC

Performance of MVAPICH2 with GPUDirect RDMA

GPU-GPU internode MPI latency (1 byte to 4 KB messages, 1-Rail vs. 1-Rail-GDR): 67% lower latency, 5.49 usec. Lower is better.

GPU-GPU internode MPI bandwidth (1 byte to 4 KB messages, 1-Rail vs. 1-Rail-GDR): 5X increase in throughput. Higher is better.

Source: Prof. DK Panda

Page 19: Solutions for Scalable HPC

Mellanox PeerDirect™ with NVIDIA GPUDirect RDMA

HOOMD-blue is a general-purpose molecular dynamics simulation code accelerated on GPUs

GPUDirect RDMA allows direct peer-to-peer GPU communications over InfiniBand:
• Unlocks performance between GPU and InfiniBand
• Provides a significant decrease in GPU-GPU communication latency
• Provides complete CPU offload of all GPU communications across the network

Demonstrated up to 102% performance improvement with large particle counts (benchmark chart: 21% and 102%)

Page 20: Solutions for Scalable HPC

Long Haul VPI Solutions

MetroX

Page 21: Solutions for Scalable HPC

Extending High-Speed Connectivity and RDMA into Metro / WAN

RDMA connectivity over InfiniBand / Ethernet

From 10 to 80 Kilometers

Mega Data Centers, Mega Clouds, Disaster Recovery

“A common problem is the time cost of moving data between data centers, which can slow computations and delay results. Mellanox's MetroX lets us unify systems across campus, and maintain the high-speed access our researchers need, regardless of the physical location of their work.”

– Mike Shuey, Purdue University

Page 22: Solutions for Scalable HPC

MetroDX and MetroX Features

              TX6000                    TX6100                    TX6240                    TX6280
Distance      1 km                      10 km                     40 km                     80 km
Throughput    640 Gb/s                  240 Gb/s                  80 Gb/s                   40 Gb/s
Port density  16p FDR10 long haul,      6p 40 Gb/s long haul,     2p 10/40 Gb/s long haul,  1p 10/40 Gb/s long haul,
              16p FDR downlink          6p 56 Gb/s downlink       2p 56 Gb/s downlink       1p 56 Gb/s downlink
Latency       200 ns + 5 us/km (fiber)  200 ns + 5 us/km (fiber)  700 ns + 5 us/km (fiber)  700 ns + 5 us/km (fiber)
Power         ~200 W                    ~200 W                    ~280 W                    ~280 W
QoS           One data VL + VL15        One data VL + VL15        One data VL + VL15        One data VL + VL15
Space         1RU                       1RU                       2RU                       2RU
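The latency row follows a simple formula: a fixed switch latency plus roughly 5 us of fiber propagation delay per kilometer. The sketch below is an illustrative back-of-the-envelope calculation using the table's own figures; the function name and the choice of models are just for the example.

    /* One-way long-haul link latency: fixed switch latency plus ~5 us of
     * fiber propagation delay per kilometer (figures from the table above). */
    #include <stdio.h>

    static double link_latency_us(double switch_ns, double distance_km)
    {
        return switch_ns / 1000.0 + 5.0 * distance_km;   /* ns -> us, plus 5 us per km */
    }

    int main(void)
    {
        printf("TX6100 @ 10 km: %.1f us\n", link_latency_us(200.0, 10.0));  /* ~50.2 us  */
        printf("TX6280 @ 80 km: %.1f us\n", link_latency_us(700.0, 80.0));  /* ~400.7 us */
        return 0;
    }

In other words, at metro distances the fiber itself dominates: the 80 km link spends about 400 us in flight against well under a microsecond in the switch.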

Page 23: Solutions for Scalable HPC

The Only Provider of End-to-End 40/56Gb/s Solutions

From Data Center to Metro and WAN

x86-, ARM- and Power-based Compute and Storage Platforms

The Interconnect Provider for 10Gb/s and Beyond

Comprehensive End-to-End InfiniBand and Ethernet Portfolio:
Host/Fabric Software, ICs, Switches/Gateways, Adapter Cards, Cables/Modules, Metro / WAN

Page 24: Solutions for Scalable HPC

Thank You