smb advanced networking - mellanox technologies · smb advanced networking for fault tolerance and...


Post on 22-Jun-2020


SMB Advanced Networking for Fault Tolerance and Performance

Jose Barreto

Principal Program Managers

Microsoft Corporation


Agenda

• SMB Remote File Storage for Server Apps

• SMB Direct (SMB over RDMA)

• SMB Multichannel

• SMB Scale-Out


Remote File Storage for Server Apps

• What is it?
  – Server applications storing their data files on SMB file shares (UNC paths)
  – Examples:
    • Hyper-V: Virtual Hard Disks (VHD), configuration files
    • SQL Server: Database and log files

• What is the value?
  – Easier provisioning – shares instead of LUNs
  – Easier management – shares instead of LUNs
  – Flexibility – dynamic server relocation
  – Leverage network investments – no need for specialized storage networking infrastructure or knowledge
  – Lower cost – Acquisition and Operation cost

• First class storage
  – Item by item, a storage solution that can match the capabilities of traditional block solutions

[Diagram: Hyper-V, SQL Server, IIS, and VDI Desktop workloads; two File Servers; Shared Storage]

SMB Direct (SMB over RDMA)

• New class of SMB file storage for the Enterprise
  – Minimal CPU utilization for file storage processing
  – Low latency and ability to leverage high speed NICs
  – Fibre Channel-equivalent solution at a lower cost

• Traditional advantages of SMB file storage
  – Easy to provision, manage and migrate
  – Leverages converged network
  – No application change or administrator configuration

• Required hardware
  – RDMA-capable network interface (R-NIC)
  – Support for iWARP, InfiniBand and RoCE

[Diagram: File Client (Application, SMB Client, R-NIC) and File Server (SMB Server, NTFS, SCSI, Disk, R-NIC), split into User and Kernel layers and joined by a network with RDMA support]

What is RDMA?

• Remote Direct Memory Access Protocol
  – Accelerated IO delivery model which works by allowing application software to bypass most layers of software and communicate directly with the hardware

• RDMA benefits
  – Low latency
  – High throughput
  – Zero copy capability
  – OS / Stack bypass

• RDMA Hardware Technologies
  – InfiniBand
  – iWARP: RDMA over TCP/IP
  – RoCE: RDMA over Converged Ethernet

[Diagram: Client and File Server, each with SMB Client or SMB Server, Memory, SMB Direct, NDKPI, and an RDMA NIC, exchanging data by RDMA over Ethernet or InfiniBand]

SMB over TCP and RDMA

1. Application (Hyper-V, SQL Server) does not need to change.

2. SMB client makes the decision to use SMB Direct at run time

3. NDKPI provides a much thinner layer than TCP/IP

4. Remote Direct Memory Access performed by the network interfaces.
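The run-time decision in point 2 can be pictured with a minimal Python sketch. The `Interface` class, the `choose_transport` function, and the returned strings are hypothetical names chosen for illustration and are not part of any SMB API. The rule itself (SMB Direct only when both ends have an RDMA-capable interface, TCP/IP otherwise) is a simplifying assumption about what the client checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """One network interface as the SMB client sees it (hypothetical model)."""
    name: str
    rdma_capable: bool


def choose_transport(client_nic: Interface, server_nic: Interface) -> str:
    """Pick a transport for one client/server interface pair.

    Assumption: SMB Direct needs an RDMA-capable interface on both ends;
    any other pairing falls back to ordinary TCP/IP. The application is
    never involved in the choice.
    """
    if client_nic.rdma_capable and server_nic.rdma_capable:
        return "SMB Direct (RDMA via NDKPI)"
    return "SMB over TCP/IP"


if __name__ == "__main__":
    rnic = Interface("R-NIC", rdma_capable=True)
    nic = Interface("NIC", rdma_capable=False)
    print(choose_transport(rnic, rnic))  # both ends RDMA-capable
    print(choose_transport(rnic, nic))   # server side lacks RDMA
```

Because the choice is made below the file API, the same application code runs unchanged in both cases, which is point 1 above.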

[Diagram: Client and File Server stacks. The Application calls an unchanged API in user mode; in kernel mode the SMB Client and SMB Server use either TCP/IP with a NIC or SMB Direct with NDKPI and an RDMA NIC, over Ethernet and/or InfiniBand. Callouts 1 to 4 mark the points listed above.]

Comparing RDMA Technologies

Type (Cards*): Non-RDMA Ethernet (wide variety of NICs)
Pros:
• TCP/IP-based protocol
• Works with any Ethernet switch
• Wide variety of vendors and models
• Support for in-box NIC teaming (LBFO)
Cons:
• Currently limited to 10Gbps per NIC port
• Higher CPU utilization under load
• Higher latency

Pros shared by the RDMA types below (vertical label on the slide):
• Lower CPU utilization under load
• Lower latency

Type (Cards*): iWARP (Intel NE020*)
Pros:
• TCP/IP-based protocol
• Works with any 10GbE switch
• RDMA traffic routable
Cons:
• Currently limited to 10Gbps per NIC port*

Type (Cards*): RoCE (Mellanox ConnectX-2, Mellanox ConnectX-3*)
Pros:
• Ethernet-based protocol
• Works with high-end 10GbE/40GbE switches
• Offers up to 40Gbps per NIC port today*
Cons:
• RDMA not routable via existing IP infrastructure
• Requires DCB switch with Priority Flow Control (PFC)

Type (Cards*): InfiniBand (Mellanox ConnectX-2, Mellanox ConnectX-3*)
Pros:
• Offers up to 54Gbps per NIC port today*
• Switches typically less expensive per port than 10GbE switches*
• Switches offer 10GbE or 40GbE uplinks
• Commonly used in HPC environments
Cons:
• Not an Ethernet-based protocol
• RDMA not routable via existing IP infrastructure
• Requires InfiniBand switches
• Requires a subnet manager (on the switch or the host)

* This is current as of the release of Windows Server “8” beta. Information on this slide is subject to change as technologies evolve and new cards become available.


SMB Direct Performance

Workload | IO Size | IOPS | Bandwidth | Latency

Large IOs, high throughput (SQL Server DW) | 512 KB | 4,210 | 2.21 GB/s | 4.41 ms

Typical application server (SQL Server OLTP) | 8 KB | 214,000 | 1.75 GB/s | 870 µs

Small IOs, high IOPS (not typical, benchmark only) | 1 KB | 294,000 | 0.30 GB/s | 305 µs

Preliminary results based on Windows Server “8” beta
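The bandwidth column follows from the other two: bandwidth is IOPS multiplied by IO size. The Python sketch below recomputes it for each row. It assumes 1 KB = 1024 bytes and 1 GB = 10^9 bytes, a unit convention that is not stated on the slide but is the one that reproduces the reported figures. The function and variable names are illustrative only.

```python
def bandwidth_gb_per_s(iops: int, io_size_kb: int) -> float:
    """Bandwidth in GB/s, assuming 1 KB = 1024 bytes and 1 GB = 1e9 bytes."""
    bytes_per_second = iops * io_size_kb * 1024
    return bytes_per_second / 1e9


# (workload, IO size in KB, IOPS, bandwidth reported on the slide in GB/s)
ROWS = [
    ("SQL Server DW", 512, 4_210, 2.21),
    ("SQL Server OLTP", 8, 214_000, 1.75),
    ("Small IOs (benchmark only)", 1, 294_000, 0.30),
]

if __name__ == "__main__":
    for name, size_kb, iops, reported in ROWS:
        computed = bandwidth_gb_per_s(iops, size_kb)
        print(f"{name}: computed {computed:.2f} GB/s, reported {reported:.2f} GB/s")
```

The rows show the usual trade-off: large IOs maximize bandwidth at modest IOPS, while small IOs maximize IOPS at low bandwidth.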

SMB Multichannel

Full Throughput
• Bandwidth aggregation with multiple NICs
• Multiple CPU cores engaged when using Receive Side Scaling (RSS)

Automatic Failover
• SMB Multichannel implements end-to-end failure detection
• Leverages NIC teaming (LBFO) if present, but does not require it

Automatic Configuration
• SMB detects and uses multiple network paths

Sample Configurations
• Single 10GbE RSS-capable NIC
• Multiple 1GbE NICs
• Multiple 10GbE in LBFO team
• Multiple RDMA NICs

[Diagrams: one per sample configuration, each showing an SMB Client and an SMB Server connected through NICs and switches (10GbE, 1GbE, 10GbE in LBFO teams, and 10GbE/IB)]

SMB Multichannel – Single 10GbE NIC

1 session, without Multichannel
• Can’t use full 10Gbps
  – Only one TCP/IP connection
  – Only one CPU core engaged

1 session, with Multichannel
• Full 10Gbps available
  – Multiple TCP/IP connections
  – Receive Side Scaling (RSS) helps distribute load across CPU cores
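Why the connection count matters can be shown with a small Python sketch. RSS assigns each TCP connection to a CPU core, so a single connection keeps a single core busy on the receive side. The `rss_core` function below is a deliberately simplified stand-in: real NICs hash the address and port tuple (typically with a Toeplitz hash), whereas this sketch just takes the source port modulo the core count. The port numbers are made up for illustration.

```python
from collections import Counter


def rss_core(src_port: int, n_cores: int) -> int:
    """Map a connection to a core. Illustrative stand-in for the NIC's RSS hash."""
    return src_port % n_cores


def cores_engaged(src_ports: list[int], n_cores: int = 4) -> Counter:
    """Count how many connections land on each core."""
    return Counter(rss_core(port, n_cores) for port in src_ports)


if __name__ == "__main__":
    # Without Multichannel: one TCP connection for the session.
    without_mc = cores_engaged([50000])
    # With Multichannel: several TCP connections over the same NIC.
    with_mc = cores_engaged([50000, 50001, 50002, 50003])
    print("cores busy without Multichannel:", len(without_mc))
    print("cores busy with Multichannel:", len(with_mc))
```

With one connection, every packet hashes to the same core no matter how many cores exist; opening several connections is what gives RSS something to spread.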

[Diagrams for “1 session, without Multichannel” and “1 session, with Multichannel”: an SMB Client and an SMB Server, each with one 10GbE NIC, connected through a 10GbE switch, alongside a chart of CPU utilization per core (Core 1 to Core 4)]

SMB Multichannel – Multiple NICs

1 session, without Multichannel
• No automatic failover
• Can’t use full bandwidth
  – Only one NIC engaged
  – Only one CPU core engaged

1 session, with Multichannel
• Automatic NIC failover
• Combined NIC bandwidth available
  – Multiple NICs engaged
  – Multiple CPU cores engaged
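The two properties in the “with Multichannel” column, combined bandwidth and automatic failover, can be modeled in a few lines of Python. This is a toy model under stated assumptions: the `MultichannelSession` class, its methods, and the NIC names are hypothetical, and real failure detection and retry behavior are not modeled.

```python
class MultichannelSession:
    """Toy model of one SMB session spread over several NICs (hypothetical)."""

    def __init__(self, nic_speeds_gbps: dict[str, float]) -> None:
        # One channel per NIC; the value is that NIC's link speed in Gbps.
        self.channels = dict(nic_speeds_gbps)

    def fail(self, nic: str) -> None:
        """Simulate a NIC failure by dropping its channel."""
        self.channels.pop(nic, None)

    @property
    def available_gbps(self) -> float:
        """Aggregate link speed of the channels that are still up."""
        return sum(self.channels.values())

    @property
    def connected(self) -> bool:
        """The session survives while at least one channel remains."""
        return bool(self.channels)


if __name__ == "__main__":
    session = MultichannelSession({"NIC1": 10.0, "NIC2": 10.0})
    print(session.available_gbps, session.connected)
    session.fail("NIC1")
    print(session.available_gbps, session.connected)
```

A session built with a single NIC in this model loses connectivity on the first failure, which corresponds to the “without Multichannel” column.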

[Diagrams for both cases: two client/server pairs (SMB Client 1 with SMB Server 1, and SMB Client 2 with SMB Server 2), one pair connected by two 10GbE NICs per host through two 10GbE switches and the other by two 1GbE NICs per host through two 1GbE switches]

SMB Multichannel Performance

• Preliminary results using four 10GbE NICs simultaneously

• Linear bandwidth scaling
  – 1 NIC – 1150 MB/sec
  – 2 NICs – 2330 MB/sec
  – 3 NICs – 3320 MB/sec
  – 4 NICs – 4300 MB/sec

• Leverages NIC support for RSS (Receive Side Scaling) to engage multiple CPU cores per NIC

• Bandwidth for small IOs is bottlenecked on CPU
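How close the reported figures are to ideal linear scaling can be checked with a short Python calculation. The throughput numbers are the ones on this slide. The efficiency metric (measured throughput divided by N times the single-NIC throughput) and the names below are choices made for this sketch.

```python
# Throughput reported on the slide, in MB/sec, keyed by number of 10GbE NICs.
MEASURED_MB_PER_S = {1: 1150, 2: 2330, 3: 3320, 4: 4300}


def scaling_efficiency(results: dict[int, float]) -> dict[int, float]:
    """Measured throughput as a fraction of N times the single-NIC result."""
    base = results[1]
    return {n: mbps / (n * base) for n, mbps in results.items()}


if __name__ == "__main__":
    for n, eff in scaling_efficiency(MEASURED_MB_PER_S).items():
        print(f"{n} NIC(s): {eff:.1%} of ideal linear scaling")
```

By this measure the four-NIC result is roughly 93% of four times the single-NIC figure.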

[Chart: SMB Client Interface Scaling - Throughput. MB/sec (0 to 4500) versus I/O Size, with series for 1 x 10GbE, 2 x 10GbE, 3 x 10GbE, and 4 x 10GbE]

http://go.microsoft.com/fwlink/p/?LinkId=227841

Preliminary results based on Windows Server “8” Developer Preview

SMB Multichannel + LBFO

1 session, with LBFO, no Multichannel (MC)
• Automatic NIC failover
• Can’t use full bandwidth
  – Only one NIC engaged
  – Only one CPU core engaged

1 session, with LBFO and Multichannel (MC)
• Automatic NIC failover (faster with LBFO)
• Combined NIC bandwidth available
  – Multiple NICs engaged
  – Multiple CPU cores engaged

[Diagrams for both cases: two client/server pairs (SMB Client 1 with SMB Server 1, and SMB Client 2 with SMB Server 2), each host teaming two NICs with LBFO; one pair uses 10GbE NICs and switches and the other uses 1GbE NICs and switches]

SMB Direct and SMB Multichannel

1 session, without Multichannel
• No automatic failover
• Can’t use full bandwidth
  – Only one NIC engaged
  – RDMA capability not used

1 session, with Multichannel
• Automatic NIC failover
• Combined NIC bandwidth available
  – Multiple NICs engaged
  – Multiple RDMA connections

[Diagrams for both cases: two client/server pairs (SMB Client 1 with SMB Server 1, and SMB Client 2 with SMB Server 2), one pair connected by two 10GbE R-NICs per host through two 10GbE switches and the other by two 32GbIB R-NICs per host through two 32GbIB switches]

Troubleshooting SMB Multichannel

• PowerShell

– Get-NetAdapter

– Get-SmbServerNetworkInterface

– Get-SmbClientNetworkInterface

– Get-SmbMultichannelConnection

• Event Log

– Application and Services Log, Microsoft, Windows, SMB Client

• Performance Counters

– SMB2 Client Shares

SMB Scale-Out

Historical: Windows Server 2008 R2

Active-Passive Single File Server
• 1 logical file server
• 1 virtual IP address
• Active/Passive
• \\FSA\Share1, \\FSA\Share2
• Single name
• Simple
• Easy to manage

[Diagram: File Server Cluster with one Active and one Passive node, Name=FSA IP=10.1.1.3; the Client resolves FSA=10.1.1.3]

Active-Passive Multiple File Servers
• 2+ logical file servers
• 2+ virtual IP addresses
• Access to disparate shares through different nodes
• \\FSA\Share1, \\FSB\Share1
• Leverage investment
• More complex to manage
• Multiple names

[Diagram: File Server Cluster with one node Active for FSA (Name=FSA IP=10.1.1.3) and one node Active for FSB (Name=FSB IP=10.1.1.4); the Client resolves FSA=10.1.1.3 and FSB=10.1.1.4]

File Server for scale-out application data

New in Windows Server “8”

• Targeted for server app storage

– Example: Hyper-V and SQL Server

– Increase available bandwidth by adding cluster nodes

• Key capabilities:

– Active/Active file shares

– Fault tolerance with zero downtime

– Fast failure recovery

– CHKDSK with zero downtime

– Support for app consistent snapshots

– Support for RDMA enabled networks

– Optimization for server apps

– Simple management

[Diagram labels: Hyper-V Cluster (Up to 64 nodes); Data Center Network (Ethernet, InfiniBand or combination); File Server Cluster (Up to 4 nodes); Single Logical File Server (\\FS\Share); Single File System Namespace; Cluster Shared Volumes]

Scale-Out File Server

• New File Server Type
  – File Server for scale-out application data
  – Manage all nodes as a single file share service

• Leverages:
  – Cluster Shared Volumes (CSV)
    • Single File System Namespace – no drive letters
    • CSV volumes are online on all cluster nodes
  – Distributed Network Name (DNN)
    • Manages DNS registration and deregistration of node IP addresses
    • Round Robin DNS to distribute clients

• Requirements:
  – Windows Failover Cluster with CSV
  – Both server application and file server cluster must be running SMB 2.2
  – SMB1 and earlier clients cannot connect to scale-out file shares
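The effect of round robin DNS on client placement can be sketched in Python. This is a simplification: a real DNS server rotates the order of the address list it returns, while the sketch hands each new client the next node address in turn. The function name, client names, and node addresses are hypothetical.

```python
from itertools import cycle


def assign_clients(clients: list[str], node_ips: list[str]) -> dict[str, str]:
    """Give each client the next node address in rotation (round robin)."""
    rotation = cycle(node_ips)
    return {client: next(rotation) for client in clients}


if __name__ == "__main__":
    # Hypothetical addresses registered under one Distributed Network Name.
    nodes = ["10.1.1.3", "10.1.1.4"]
    clients = ["hyperv-1", "hyperv-2", "hyperv-3", "hyperv-4"]
    for client, ip in assign_clients(clients, nodes).items():
        print(f"{client} -> {ip}")
```

Because every node serves the same shares through CSV, any address works for any client; the rotation only spreads the load.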


Putting it all together

1. SMB Direct
  – High throughput with low CPU utilization and low latency

2. SMB Multichannel
  – Load balance with multiple interfaces
  – Failover with multiple interfaces

3. SMB Transparent Failover
  – Zero downtime for planned/unplanned events

4. SMB Scale-Out
  – Active/active file shares across cluster nodes

5. Cluster Shared Volumes (CSV)
  – SMB used for inter-node traffic

6. SMB PowerShell
  – Management of File Shares
  – Enabling and disabling SMB features

7. SMB Performance Counters
  – Provide insight into storage performance
  – Equivalent to disk counters

8. SMB Eventing

[Diagram: Hyper-V Parent 1 through Hyper-V Parent N, each hosting a Child with Config and VHD Disk, connect through NIC1 and NIC2 and through Switch 1 and Switch 2 to File Server 1 and File Server 2; each file server exposes a Share on CSV backed by Shared SAS Storage disks; an Administrator is also shown. Callouts 1 to 8 correspond to the numbered items above.]

Thank you!
