
Deploying 10/40G InfiniBand Applications over the WAN

Eric Dube ([email protected])

Senior Product Manager of Systems

November 2011

Overview

■ About Bay
• Founded in 2000 to provide high performance networking solutions
  – Silicon Engineering & Headquarters: San Jose, CA
  – Systems Engineering & Business Development: Germantown, MD; MA
■ Corporate Focus
• Development of complex integrated circuits that are applied to high performance packet processing and optical transport applications in support of our systems
• Systems that deliver high performance protocol-agnostic encryption adaptation, protocol inter-working, and WAN acceleration for government agencies and commercial enterprises

Wide Area Networking Challenges

Wide Area Networks can often be difficult to deploy for many popular compute and storage applications. Common issues include:

■ Maintaining link utilization over extended distances
■ Providing congestion control and avoiding packet loss
■ Fair sharing of link bandwidth across multiple, concurrent applications
■ TCP/IP acknowledgement delays that grow with distance, degrading performance for most applications (a rough bandwidth-delay calculation follows below)
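To make the distance effect concrete, here is a minimal sketch (not from the original presentation) that computes the bandwidth-delay product for an assumed 10 Gb/s link over an 8,000 km path; the link speed, distance, and ~5 µs/km fiber propagation figure are illustrative assumptions.

```c
/* Bandwidth-delay product: how much data must be "in flight" to keep a
 * long-haul link full.  All values below are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    double bandwidth_bps   = 10e9;    /* assumed 10 Gb/s WAN link */
    double distance_km     = 8000.0;  /* assumed one-way fiber distance */
    double fiber_us_per_km = 5.0;     /* ~5 us/km propagation in fiber */

    double rtt_s     = 2.0 * distance_km * fiber_us_per_km * 1e-6;
    double bdp_bytes = bandwidth_bps / 8.0 * rtt_s;

    printf("RTT             : %.1f ms\n", rtt_s * 1e3);
    printf("BDP (in flight) : %.1f MB\n", bdp_bytes / 1e6);

    /* A TCP sender can only keep min(cwnd, rwnd) bytes outstanding, so any
     * window smaller than the BDP leaves the link idle part of the time,
     * and every loss forces a slow climb back toward that window. */
    return 0;
}
```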

Benefits of using InfiniBand / RDMA for Wide Area Network Applications

■ Improved performance
• Enables application and storage acceleration through RDMA
■ Increased efficiency
• Provides maximum link utilization over the WAN with fair sharing of resources between applications
■ Minimal latency
• Adds very little latency to the overall WAN connection for latency-sensitive applications
■ Cost savings
• Expands existing WAN link capacity and offloads CPU for application processing (saving on both hardware processing and network bandwidth)
■ Seamless implementation
• Transparent application interoperability with existing and new applications and storage solutions

Challenges of Extending InfiniBand Globally

[Diagram: InfiniBand extended across a campus, metro, or wide area network, from 1 to 1000's of kilometers]

■ The need to extend InfiniBand between data centers is essential for providing disaster recovery, multi-site backups, and real-time data access solutions.
■ While InfiniBand's crediting mechanism is an excellent and reliable way to provide flow control, existing InfiniBand LAN hardware doesn't provide enough port buffering for deployment beyond a single site.
• A reduction in sustained bandwidth starts occurring at 500-600 meters or less (depending on the data rate) due to inadequate port buffering if the number of virtual lanes isn't reduced.
• Even with the minimum number of virtual lanes configured, not enough packets can be kept in flight on the wire due to the port buffer credit starvation that occurs over extended distances, such as beyond 4 kilometers (a rough buffering estimate follows below).
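As a rough illustration of credit starvation, the sketch below estimates how much data must be in flight to keep a 4X QDR link busy at various distances. The ~32 Gb/s effective data rate, ~5 µs/km propagation, and the 128 KB per-port buffer are assumptions chosen for illustration, not the specification of any particular switch.

```c
/* How much per-port buffering InfiniBand's credit-based flow control needs
 * to keep a link busy over distance.  The buffer size is an illustrative
 * assumption, not any specific switch's specification. */
#include <stdio.h>

int main(void)
{
    double data_rate_bps   = 32e9;   /* 4X QDR: 40 Gb/s signalling, ~32 Gb/s data */
    double fiber_us_per_km = 5.0;    /* ~5 us/km propagation in fiber */
    double port_buffer_kb  = 128.0;  /* assumed per-port buffer (illustrative) */

    for (double km = 0.5; km <= 16.0; km *= 2.0) {
        double rtt_s   = 2.0 * km * fiber_us_per_km * 1e-6;
        double need_kb = data_rate_bps / 8.0 * rtt_s / 1024.0;
        printf("%5.1f km: need %7.1f KB in flight %s\n",
               km, need_kb,
               need_kb > port_buffer_kb ? "(exceeds assumed buffer)" : "");
    }
    /* Once the required in-flight data exceeds the credits the receiving
     * port can advertise, the sender stalls and sustained bandwidth drops,
     * which is why deeper WAN-side buffering is needed for extension. */
    return 0;
}
```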

Welcome to IBEx WAN Acceleration Solutions

IBEx: Intelligent Bandwidth Exchange

The IBEx product family enables wide area networking acceleration using RDMA over InfiniBand for compute and storage applications to any point on the globe (up to 15,000 km and beyond).

Highlights:
■ Improves link utilization with 80-99% efficiency
■ Supports 4X InfiniBand QDR connectivity today with future software upgradability to FDR10 for up to 40 Gbps data rates
■ Provides lossless communication and true QoS capabilities for workflows
■ Flexible 10/40G WAN connectivity options over SONET/SDH, ITU-T G.709 OTN, and Ethernet

IBEx InfiniBand Support

■ The IBEx InfiniBand product family supports:
• All native InfiniBand protocols
  – IPoIB, SDP, RDS, MPI, uDAPL, iSER, SRP, IB Verbs Layer, etc.
  – Supports RDMA data transfers over the WAN for applications
• Connectivity for all InfiniBand data rates
  – SDR, DDR, QDR, FDR10*
• Standard QSFP InfiniBand interface
  – Accepts both active optical and passive copper cabling in addition to optical transceivers
• Operates as a typical InfiniBand switch device
  – Appears as a 2-port switch in the InfiniBand fabric
• True 10/40G WAN-side data rates for extending native InfiniBand
  – Provides 10/40G actual data rate throughput for InfiniBand extension with QDR and FDR10* connectivity

* Future support for FDR10 data rates through IBEx system software upgrade
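Because the extension platform appears as an ordinary switch in the fabric, host applications keep using the standard verbs API unchanged. The sketch below simply queries the first local HCA port with libibverbs, the way any verbs application might; it is a generic illustration, not Bay-specific code.

```c
/* Query the local HCA port the way any verbs application would.  From the
 * host's point of view the extension platform is just another switch in
 * the fabric, so nothing on the application side changes.
 * Build with: cc query_port.c -libverbs */
#include <infiniband/verbs.h>
#include <stdio.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no InfiniBand devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_port_attr attr;

    if (ctx && ibv_query_port(ctx, 1, &attr) == 0)
        printf("%s port 1: state=%d lid=0x%x active_width=%d\n",
               ibv_get_device_name(devs[0]),
               (int)attr.state, (unsigned)attr.lid, (int)attr.active_width);

    if (ctx)
        ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```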

IBEx Platform: Typical Data Center Deployment Diagrams

■ Point-to-Point Campus / Metro Area Network Deployment (1 - 250 km)
• Customer premise: applications, servers, and storage connect to the InfiniBand LAN switching infrastructure, which feeds the IBEx extension platform (with optional optical amplification).
• Carrier / local fiber network: 10/40G dark fiber across the campus / metro area network to the remote site.

■ DWDM Metro / Wide Area Network Deployment (1 - 15,000+ km)
• Customer premise: applications, servers, and storage connect to the InfiniBand LAN switching infrastructure, which feeds the IBEx extension platform (with optional optical amplification) and a DWDM optical transport platform.
• Carrier network: a 10/40G wavelength across the metro / wide area network to the remote site.

Need for Distributed InfiniBand Applications and Multi-site Deployments

■ Distributed Healthcare Applications
• High resolution patient imaging sharing between offices
■ Financial Services
• Disaster recovery solution for low latency trading and market data feed applications
■ High Performance Computing (HPC)
• Clustered applications and cloud computing
• Post-processing and visualization
■ Global File Systems & Storage
• High performance / high volume data sharing and storage virtualization between sites
■ Clustered Databases and Warehouses
• Multi-site failover and data mirroring
• Real-time local access and information sharing
■ Content Distribution
• Global distribution for thousands of HD videos over a single connection

Network Protocol Efficiencies

■ Network Protocols:
• TCP/IP
  – Typically implemented as a software protocol stack
  – TCP is subject to significant "saw-tooth" performance effects upon any loss, with a slow ramp back to nominal utilization
  – Conversion to UDP, using TCP spoofing techniques, helps performance but loses all notion of congestion management, reliable transport, and in-order delivery
  – TCP/IP utilization significantly degrades with multiple sessions and any congestion, due to its reactive congestion control
• RDMA over InfiniBand (IB)
  – RDMA (Remote Direct Memory Access) is a hardware transfer, initiated by the software application, from local memory, across the network, to the remote server or mass storage system
  – InfiniBand is lossless, with end-to-end proactive flow control and reliable, in-order delivery
  – With InfiniBand extension, InfiniBand can run on nearly any optical or traffic-engineered network, achieving 90%+ efficiency of the available bandwidth
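As a rough illustration of how an application exercises an RDMA-capable fabric without touching the sockets stack, the following generic MPI point-to-point probe measures bandwidth between two ranks; over InfiniBand an RDMA-capable MPI moves the buffer with RDMA rather than per-packet CPU copies. The message size and iteration count are arbitrary choices, and this is not the benchmark behind the results quoted in this presentation.

```c
/* Generic MPI point-to-point bandwidth probe (illustrative sketch).
 * Run with two ranks, e.g.: mpirun -np 2 ./bw_probe */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_BYTES (4 * 1024 * 1024)   /* 4 MB message (arbitrary) */
#define ITERS     100

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) fprintf(stderr, "run with at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    char *buf = malloc(MSG_BYTES);
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int i = 0; i < ITERS; i++) {
        if (rank == 0)
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    double elapsed = MPI_Wtime() - t0;
    if (rank == 0)   /* rough one-way throughput estimate */
        printf("~%.1f MB/s\n", (double)MSG_BYTES * ITERS / elapsed / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}
```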

Large Data JCTD Protocol Performance Comparison

[Charts: typical RDMA/IB vs. TCP/IP/Ethernet performance on a 1 Gbps circuit (~8,000 miles, ~13,000 fiber miles), and typical RDMA/IB performance on an 8 Gbps circuit (~1,200 miles, ~2,000 fiber miles)]

■ RDMA over IB provides very efficient use of available bandwidth with near linear scaling
• RDMA/IB performance ≥ 80%
• TCP/IP performance ≤ 40%
• RDMA/IB CPU usage estimated 4x less
■ InfiniBand connection is lossless with nearly perfect fair sharing of bandwidth across multiple, concurrent data flows

* Slide content and performance data obtained from Large Data JCTD public presentation

Orange / ESnet / Bay Microsystems: 40G IB Extension over SONET OC-768

■ Application testing performed in July/August 2011 at Brookhaven National Laboratory as part of the ESnet ANI Testbed project
■ Obtained 96% efficiency of useable bandwidth through concurrent streaming of RDMA applications
■ Utilized a SONET OC-768 (40G) WAN circuit spanning 370 km from Upton, NY to Long Island and back

[Diagram: ESnet ANI Testbed — applications, servers, and storage at AOFA and BNL, each behind an IBEx G40 and an Infinera optical transport platform, connected by a 370 km fiber loop carrying the SONET OC-768 service across the NY / Long Island metro area network]

Orange / ESnet / Bay Microsystems: ANI Testbed Performance Data

[Chart: bidirectional maximum bandwidth (RC) in MB/sec (y-axis, 0 to 9,000) versus transmit queue depth (1 to 1024), plotted for message sizes from 64 bytes to 524,288 bytes]

SC11: Orange / ESnet / Bay Microsystems
World's First Long Distance 40G RDMA over InfiniBand Data Transfer Demonstration

■ Native 4X InfiniBand QDR is extended over a 40G Ethernet / 100G MPLS network circuit provided by ESnet
■ ~7,000 fiber miles loop!

[Map: demonstration circuit spanning Salt Lake City, Seattle, and Chicago]

SC11: Booth Demonstrations

■ Remote Visualization Demonstration
• Visualization accessing remote data, leaving the dataset intact at the remote node
■ Uncompressed Parallel HD Video Streaming over Distance
• Transfers parallel, independent streams consuming full wire bandwidth
■ High Performance "Big Data" File Transfers
• Demonstrates high bandwidth transfers over long-haul wide area networks
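For a back-of-the-envelope sense of the parallel HD streaming demonstration, the sketch below estimates how many uncompressed 1080p streams fit on a 40G WAN link. The frame geometry, frame rate, and 24 bits per pixel are assumptions for illustration, not the demo's actual stream parameters.

```c
/* Rough bandwidth math for uncompressed HD video streams over a 40G WAN
 * link.  Frame geometry and rates are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    double width = 1920, height = 1080;   /* assumed 1080p frame */
    double fps = 30.0;                     /* assumed frame rate */
    double bits_per_pixel = 24.0;          /* uncompressed RGB */
    double wan_bps = 40e9;                 /* 40G WAN link */

    double stream_bps = width * height * fps * bits_per_pixel;
    printf("One uncompressed stream : %.2f Gb/s\n", stream_bps / 1e9);
    printf("Parallel streams on 40G : %.0f\n", wan_bps / stream_bps);
    return 0;
}
```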

IBEx M40 4X InfiniBand QDR / FDR10 Extension / 40G WAN Acceleration Platform

■ IBEx M40 Main Features
• 40G InfiniBand extension platform providing connectivity for:
  – 4X InfiniBand QDR / FDR10 (up to 40Gbps) [1 x QSFP], 10G Ethernet [2 x SFP+], and 1G Ethernet [2 x SFP]
• Provides 40G WAN extension over:
  – 40G Ethernet (40GBase-SR4/LR4), IPv4/IPv6, or dark fiber
• Enhanced internal port buffering and flow control capabilities enabling global InfiniBand extension at full line rate from 1 - 15,000 km
• Easy-to-use secure graphical user interface [HTTPS] and command line interface [SSH]
• Compact, low power (<150 watts), 1U 19-inch rack mountable chassis
  – Redundant, hot-swappable (dual-input) power supplies and fans

[Front panel: serial console, management Ethernet, 2 x 1G Ethernet, 4X IB QDR, 1 x 40G Ethernet, 2 x 10G Ethernet]

IBEx G40 4X InfiniBand QDR / FDR10 Extension / 40G WAN Acceleration Platform

■ IBEx G40 Main Features
• 40G InfiniBand extension platform providing connectivity for:
  – 4X InfiniBand QDR / FDR10 (up to 40Gbps) [1 x QSFP], 10G Ethernet [2 x SFP+], and 1G Ethernet [2 x SFP]
• Provides 40G WAN extension over:
  – SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, or dark fiber
• Enhanced internal port buffering and flow control capabilities enabling global InfiniBand extension at full line rate from 1 - 15,000 km
• Easy-to-use secure graphical user interface [HTTPS] and command line interface [SSH]
• Compact, low power (<150 watts), 1U 19-inch rack mountable chassis
  – Redundant, hot-swappable (dual-input) power supplies and fans

[Front panel: serial console, management Ethernet, 2 x 1G Ethernet, 4X IB QDR, 2 x 10G Ethernet, 40G WAN (SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, dark fiber)]

IBEx M10/G10/M20/G20 4X InfiniBand QDR Extension / 10G WAN Acceleration Platforms

■ Single 10G (actual data rate) InfiniBand over the WAN via SONET OC-192/SDH STM-64, ITU-T G.709 OTU2, or 10G Ethernet
■ Dual 10G (actual data rate) InfiniBand over the WAN for site-to-site link redundancy or multi-site connectivity configurations

For more information please contact:

Eric Dube

Senior Product Manager of Systems

Bay Microsystems, Inc.

Phone: (301) 944-8149

Email: [email protected]

http://www.baymicrosystems.com

IBEx G40 Platform Connectivity Diagram

[Diagram: two IBEx G40 platforms connected over the 40G WAN (LC fiber) via SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, WDM, or dark fiber. Each unit exposes a 4X InfiniBand QDR port (QSFP Port 1), 10G Ethernet LAN ports (2 x SFP+ transceivers), 1G Ethernet LAN ports (SFP transceivers), a serial console (RJ45), and management Ethernet (RJ45).]

4X InfiniBand QDR and 1/10G Ethernet LAN connections are encapsulated over the 40G WAN link.