
Maximize Cluster Performance and Productivity
Mellanox Technologies

Gilad Shainer, shainer@mellanox.com
October 2007

2 Mellanox Technologies

Mellanox Technologies: applications, hardware OEMs, end-users
• High-Performance Computing
• Enterprise Data Centers
• Embedded

InfiniBand and Ethernet
• Servers and blades
• Switches
• Storage
• Embedded

Interconnect: A Competitive Advantage

3 Mellanox Technologies

Connecting The Most Powerful Clusters

• 4,500 server nodes
• 1,300 server nodes
• 1,280 server nodes
• 2,300 server nodes
• 1,400 server nodes
• 96 server nodes

4 Mellanox Technologies

But Not Only The Most Powerful Clusters

Sikorsky CH-53K program
• 24 nodes, dual-core CPUs
• Reducing simulation duration from 4 days to several hours

Personal supercomputing (4-5 nodes)
• Wolfram Air Pollution Simulation
• Maximum utilization and efficiency

[Chart: Time to Compute 10 Time Steps (seconds) on 8, 16 and 32 cores; HP and TYAN systems]

5 Mellanox Technologies

InfiniBand For Clustering: An Industry Megatrend

Proprietary systems
• Expensive
• Not flexible

Commodity clusters
• Commodity components
• Very flexible

InfiniBand clusters
• Maximum performance
• Scalability, large-scale clusters
• Multiple I/O traffic types
• Flexible and easy to manage

[Diagram: Mellanox InfiniBand connecting commodity servers and off-the-shelf storage]

6 Mellanox Technologies

InfiniBand For I/O Growth Demands

Multi-core CPUs mandating 10Gb/s+ connectivity

• Multi-core CPUs: more applications per server and I/O
• SAN adoption, shared resources: more traffic per I/O with server I/O consolidation
• I/O capacity per server dictated by the most demanding apps
• Latency race: more applications demand real-time response

Result: 10Gb/s+ connectivity for each server (a rough arithmetic sketch follows below)

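To make the 10Gb/s+ figure concrete, here is a rough back-of-the-envelope sketch in C. The per-core traffic rates are purely illustrative assumptions, not measurements from this presentation.

/* Illustrative only: estimate aggregate server I/O demand from assumed
 * per-core traffic rates (hypothetical placeholders). */
#include <stdio.h>

int main(void)
{
    int cores = 8;                       /* e.g. a dual-socket, quad-core server */
    double app_gbps_per_core = 1.0;      /* assumed application/network traffic per core */
    double storage_gbps_per_core = 0.5;  /* assumed consolidated storage traffic per core */

    double total = cores * (app_gbps_per_core + storage_gbps_per_core);
    printf("Estimated aggregate I/O demand: %.1f Gb/s "
           "(vs. 1 Gb/s for a single GigE link)\n", total);
    return 0;
}

Under these assumptions an 8-core server already needs on the order of 12 Gb/s of I/O, which is why the slide argues for 10Gb/s+ connectivity per server.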

7 Mellanox Technologies

InfiniBand For Virtualization

Hardware-based I/O virtualization
• Multiple VMs, multiple traffic types per VM

Supports current and future servers
• AMD and Intel IOV, PCI-SIG IOV

Better resource utilization
• Frees up the CPU through hypervisor offload
• Enables significantly more VMs per CPU

Native OS performance
• VMs enjoy native InfiniBand performance

[Diagram: ConnectX InfiniBand HCA with DMA remapping to memory, serving multiple virtual machines (each with its own HCA driver) on a physical server; VMM offload functions reside in the Virtual Machine Monitor]

8 Mellanox Technologies

TOP500 Interconnect Trends

Explosive growth of InfiniBand 20Gb/s
• 55 clusters = 42% of the InfiniBand clusters

[Chart: Top500 Interconnect Penetration. Number of clusters vs. year since introduction (years 1 through 8) for InfiniBand and Ethernet. Ethernet year 1 = June 1996; InfiniBand year 1 = June 2003]

InfiniBand adoption is faster than Ethernet adoption was.

[Chart: InfiniBand in the Top500. Number of clusters, IB SDR vs. IB DDR, from June 2005 through June 2007. Highlighted systems: LANL, Sandia, Aeronautical Systems Center]

[Chart: Top500 Interconnect Trends. Number of clusters for InfiniBand, Myrinet and GigE from June 2005 through June 2007; June 2007 counts: InfiniBand 132, Myrinet 47, GigE 207]


Growth rate: InfiniBand +230%, GigE -19%

9 Mellanox Technologies

InfiniBand Value Proposition

Automotive
• 3X simulation efficiency increase
• Car crash simulations eliminate the need for physical testing

Oil and Gas
• Reduces reservoir modeling simulation runtime by up to 55%
• Improves interpretation and modeling accuracy

Fluid Dynamics
• 3X performance improvement and near-linear scaling
• Intensive simulation for computer-aided engineering

Digital Media
• 5X the bandwidth for real-time color grading
• High-resolution commercials and feature films

Electronic Design Automation
• 4X improvement for photomask manufacturing (CATS)

Computational Science
• Monte Carlo simulation, astronomy, bioinformatics, chemistry and drug research
• Accelerates parallel execution of matrix operations

* Partial list

10 Mellanox Technologies

Mellanox Product Roadmap

[Roadmap diagram, 2000 through 2007: 1st, 2nd, 3rd and 4th generation adapter and switch silicon. Port speeds progress from two 10Gb/s ports, through one or two 10/20Gb/s InfiniBand ports, to two 10/20/40Gb/s InfiniBand or two 1/10Gb/s Ethernet ports on the 4th-generation PCI Express 2.0 adapter; total-bandwidth figures shown are 40Gb/s, 160Gb/s and 480/960Gb/s. Adapter card products based on adapter silicon are not shown]

11 Mellanox Technologies

Leading InfiniBand and 10GigE Adapters

Single-chip server and storage adapters
• Optimized cost, power, footprint, reliability

Highest-performing InfiniBand adapters
• 20Gb/s (40Gb/s in 2008), 1us application latency

Highest-performing 10GigE NICs
• 17.6Gb/s throughput, < 7us application latency
• Supports OpenFabrics RDMA software stacks (see the device-query sketch below)

Additional capabilities
• Multi-core CPU optimized
• Virtualization acceleration
• Combination IB/Ethernet in 4Q07
• First adapters to support PCI Express 2.0

Product family
• ConnectX IB InfiniBand HCA: 10 and 20Gb/s InfiniBand
• ConnectX EN Ethernet MAC/NIC: 10 Gigabit Ethernet (copper)
• ConnectX MP Multi-Protocol Adapter
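The OpenFabrics (libibverbs) user-space API is the common way applications talk to adapters like these. As a minimal sketch, not taken from this presentation, the following C program enumerates the RDMA devices on a host and prints a few reported capabilities; it assumes libibverbs is installed and is linked with -libverbs.

/* Minimal sketch: list RDMA devices via the OpenFabrics libibverbs API. */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **devices = ibv_get_device_list(&num_devices);
    if (!devices) {
        perror("ibv_get_device_list");
        return 1;
    }

    for (int i = 0; i < num_devices; ++i) {
        struct ibv_context *ctx = ibv_open_device(devices[i]);
        if (!ctx)
            continue;

        struct ibv_device_attr attr;
        if (ibv_query_device(ctx, &attr) == 0)
            printf("%s: %d physical port(s), max %d QPs, max %d CQs\n",
                   ibv_get_device_name(devices[i]),
                   attr.phys_port_cnt, attr.max_qp, attr.max_cq);

        ibv_close_device(ctx);
    }

    ibv_free_device_list(devices);
    return 0;
}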

12 Mellanox Technologies

ConnectX Multi-Core Performance

[Chart: ConnectX MPI latency, multi-core scaling. Latency (usec) vs. number of CPU cores, 1 through 8; optimizations ongoing]

[Chart: Bandwidth (MB/s) vs. message size, 2 bytes through 1,048,576 bytes, for IB SDR PCIe Gen1, IB DDR PCIe Gen1 and IB DDR PCIe Gen2. Additional labels on the slide: IPoIB-CM, ConnectX IB (SDR, DDR PCIe Gen1, DDR PCIe Gen2), Sockets (TCP), MPI (message passing)]
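Latency figures like these are typically produced with a simple MPI ping-pong microbenchmark. The sketch below is a generic illustration, not the benchmark behind these charts; it assumes an MPI implementation running over the InfiniBand fabric and reports the average one-way time for a small message between two ranks.

/* Generic MPI ping-pong latency sketch (illustrative only).
 * Assumed usage: mpicc pingpong.c -o pingpong && mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>

#define ITERS 10000
#define MSG_SIZE 8   /* bytes; small message to expose latency */

int main(int argc, char **argv)
{
    int rank;
    char buf[MSG_SIZE] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double start = MPI_Wtime();

    for (int i = 0; i < ITERS; ++i) {
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double elapsed = MPI_Wtime() - start;
    if (rank == 0)
        printf("Average one-way latency: %.2f usec\n",
               elapsed / (2.0 * ITERS) * 1e6);

    MPI_Finalize();
    return 0;
}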

13 Mellanox Technologies

Superior Application Productivity

[Chart: InfiniBand scalability, car2car benchmark. Performance vs. number of nodes: 1, 2, 4, 8, 16]

[Chart: PAM-CRASH. Elapsed time (sec) on 16, 32 and 64 CPUs, GigE vs. InfiniBand]

[Chart: FLUENT 6.3Beta, FL5L3 case. Fluent performance rating vs. number of cores, Mellanox InfiniBand vs. Gigabit Ethernet]

HP C-Class Blade System with Mellanox 20Gb/s InfiniBand I/O

14 Mellanox Technologies

Superior Application Performance

[Chart: Mellanox InfiniBand vs. QLogic InfiniPath, neon_refined_revised. Normalized runtime on 2, 4 and 8 nodes; lower is better]

[Chart: Vector Distribution Benchmark. Time (sec), GigE vs. InfiniBand; lower is better; a 57% figure is annotated on the chart]

[Chart: Time to Compute 10 Time Steps on 8, 16 and 32 cores]

Systems: Intel Woodcrest 3.0GHz. Interconnects: Mellanox InfiniBand SDR, QLogic InfiniPath.

Mellanox InfiniBand provides full CPU offload (see the overlap sketch below):
• Transport offload
• RDMA
• Robustness
• Scalability
• Efficiency

The CPU runs the application, not the network.
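One practical consequence of full transport offload is that communication can progress while the host CPU computes. The sketch below is a generic illustration of that overlap pattern with non-blocking MPI calls; it is not code from this presentation, and do_local_work() is a hypothetical placeholder for the application's computation.

/* Generic sketch: overlapping communication and computation with non-blocking MPI.
 * With an offloaded interconnect, the exchange progresses in hardware while
 * do_local_work() (hypothetical placeholder) keeps the CPU busy. */
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)

static double do_local_work(const double *data, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += data[i] * 0.5;   /* stand-in for real computation */
    return sum;
}

int main(int argc, char **argv)
{
    static double send_buf[N], recv_buf[N], work[N];
    int rank, size;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;          /* simple ring exchange */
    int left  = (rank - 1 + size) % size;

    /* Post the exchange, then compute while the HCA moves the data. */
    MPI_Irecv(recv_buf, N, MPI_DOUBLE, left, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(send_buf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    double partial = do_local_work(work, N);

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    printf("rank %d: local result %.1f, exchange complete\n", rank, partial);

    MPI_Finalize();
    return 0;
}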

15 Mellanox Technologies

Mellanox Cluster Center

http://www.mellanox.com/applications/clustercenter.php

Neptune cluster
• 32 nodes
• Dual-core AMD Opteron CPUs

Helios cluster
• 32 nodes
• Quad-core Intel Clovertown CPUs

Vulcan cluster (coming soon)
• 32 nodes
• Quad-core AMD Barcelona CPUs

Utilizing a “Fat Tree” network architecture (CBB); see the sizing sketch below
• Non-blocking switch topology
• Non-blocking bandwidth

ConnectX InfiniBand 20Gb/s

InfiniBand-based storage
• NFS over RDMA, SRP
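For readers unfamiliar with the topology: a two-level non-blocking fat tree (constant bisectional bandwidth) built from n-port switches can attach up to n*n/2 hosts, using n leaf switches (half their ports down, half up) and n/2 spine switches. The short C sketch below just works that arithmetic; the 24-port radix is an illustrative assumption, not a statement about the Cluster Center's actual switches.

/* Illustrative arithmetic for a two-level non-blocking (CBB) fat tree
 * built from n-port switches: n leaves, n/2 spines, n*n/2 hosts. */
#include <stdio.h>

int main(void)
{
    int ports = 24;                 /* assumed switch radix, illustrative only */
    int leaves = ports;             /* each leaf: ports/2 down to hosts, ports/2 up */
    int spines = ports / 2;         /* each spine links once to every leaf */
    int max_hosts = ports * ports / 2;

    printf("%d-port switches: %d leaves + %d spines support up to %d hosts\n",
           ports, leaves, spines, max_hosts);
    return 0;
}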

16 Mellanox Technologies

Summary

Market-wide adoption of InfiniBand
• Servers/blades, storage and switch systems
• Data centers, high-performance computing, embedded
• Performance, price, power, reliability, efficiency, scalability

4th-generation adapter with connectivity to both InfiniBand and Ethernet
• Market-leading performance, capabilities and flexibility

Driving key trends in the market
• Clustering/blades, low latency, I/O consolidation, multi-core, virtualization

[Diagram: Converged architecture with ConnectX for I/O consolidation. A single InfiniBand or Ethernet fabric from the servers reaches InfiniBand storage directly, and Fibre Channel or iSCSI storage through a low-cost bridge]

Thank You
