new storage infrastructure with flash and nvmeover fabrics · 2017-11-27 · brocade storage...
Post on 21-Jun-2020
New Storage Infrastructure with Flash and NVMe over Fabrics
All or some of the products detailed in this presentation may still be under development and certain specifications, including but not limited to, release dates, prices, and product features, may change. The products may not function as intended and a production version of the products may never be released. Even if a production version is released, it may be materially different from the pre-release version discussed in this presentation.
Nothing in this presentation shall be deemed to create a warranty of any kind, either express or implied, statutory or otherwise, including but not limited to, any implied warranties of merchantability, fitness for a particular purpose, or non-infringement of third-party rights with respect to any products and services referenced herein.
Brocade, the B-wing symbol, and MyBrocade are registered trademarks of Brocade Communications Systems, Inc., in the United States and in other countries. Other brands, product names, or service names mentioned of Brocade Communications Systems, Inc. are listed at www.brocade.com/en/legal/brocade-Legal-intellectual-property/brocade-legal-trademarks.html. Other marks may belong to third parties.
Legal Disclaimer
© 2017 BROCADE COMMUNICATIONS SYSTEMS, INC. 2
Agenda
• NVMe intro and overview
• NVMe over fabrics
• Brocade storage networking portfolio
[Chart: Worldwide External ESS Revenue by Storage Array Type, 2015-2020 ($B). Series: All-flash arrays (AFAs), Hybrid flash arrays (HFAs), HDD-only. Y-axis: $0 to $14; x-axis: 2015 to 2020]
Flash Driving Need for Faster Connectivity
Total CAGR 0.3% (2015-2020)
By 2020, 77% of the market will be flash-based (AFA and HFA)
Source: IDC Worldwide External Enterprise Storage Systems Forecast Update, 2016–2020, Oct. 2016, US41864216
All HDD -13.1%
AFA 23.7%
HFA 1.7%
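To put the growth percentages above in perspective: a compound annual growth rate r sustained for n years multiplies revenue by (1 + r)^n. The sketch below is illustrative only. It assumes the per-type percentages are CAGRs like the total, and that 2015-2020 counts as five annual steps (the slide does not say so); `growth_factor` is a hypothetical helper name.

```python
def growth_factor(cagr_percent: float, years: int) -> float:
    """Cumulative revenue multiplier implied by a compound annual growth rate."""
    return (1.0 + cagr_percent / 100.0) ** years

# Percentages quoted on the slide (IDC forecast, 2015-2020).
SLIDE_CAGR = {"AFA": 23.7, "HFA": 1.7, "All HDD": -13.1, "Total": 0.3}
YEARS = 5  # assumption: 2015 -> 2020 treated as five compounding periods

factors = {name: growth_factor(rate, YEARS) for name, rate in SLIDE_CAGR.items()}

for name, factor in factors.items():
    print(f"{name}: revenue changes by a factor of about {factor:.2f} over {YEARS} years")
```

Under these assumptions the AFA segment nearly triples while the HDD-only segment roughly halves, even though the total market is almost flat.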
Growth Outlook for All Flash Arrays
Why do we need NVMe?
• SATA 150 – 300 – 600 MB/s
• SAS 300 – 600 – 1,200 MB/s
• PCIe – 4,000 MB/s
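As a rough feel for what these interface ceilings mean, the sketch below computes how long a bulk transfer would take at each peak rate. It assumes the top figure for each interface is sustained and ignores protocol overhead; the 1 TB payload and the helper name `seconds_to_transfer` are arbitrary choices for illustration.

```python
# Peak interface rates from the slide, in MB/s (latest generation of each).
PEAK_MB_PER_S = {"SATA": 600, "SAS": 1200, "PCIe": 4000}

PAYLOAD_MB = 1_000_000  # 1 TB in decimal megabytes (illustrative payload)

def seconds_to_transfer(size_mb: float, rate_mb_per_s: float) -> float:
    """Ideal transfer time: size divided by sustained rate, no overhead."""
    return size_mb / rate_mb_per_s

times = {name: seconds_to_transfer(PAYLOAD_MB, rate)
         for name, rate in PEAK_MB_PER_S.items()}

for name, secs in times.items():
    print(f"{name}: about {secs / 60:.1f} minutes for 1 TB")
```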
Storage TB / Rpm
Interface Speed
Flash forward
NVMe
HTTP://EMOTIMO.COM/WP-CONTENT/UPLOADS/2015/12/SPEED.JPG
more
for flash & SSD
A new interface
• NVMe stands for Non-Volatile Memory Express
• A new interface to the controller for NVM, providing commands to the storage device
PCIe bus
NVMe Subsystem
What is NVMe?
HTTP://WWW.NVMEXPRESS.ORG/WP-CONTENT/UPLOADS/NVME_OVERVIEW.PDF
NVM Express™ Organisation
http://www.nvmexpress.org/
NVM Express, Inc.
Includes more than 75 firms from across the industry
Promoter Group
Led by 13 elected companies
NVMe Standards overview
Standard   Initial release   Current release
NVMe       v1.0 (03/2011)    v1.3 (05/2017)
NVMe-oF    v1.0 (06/2016)
NVMe-MI    v1.0 (11/2015)    v1.0a (04/2017)
But what are the benefits of NVMe?
Benefits of NVMe over SCSI: Stack efficiency
Number of commands
SCSI: 400 commands
NVMe: 13 commands
Benefits of NVMe over SCSI: Stack efficiency
What matters is not just IOPS and bandwidth …
MEASUREMENTS TAKEN ON INTEL CORE I5-2500K 3.3GHZ 6MB L3 CACHE QUAD-CORE DESKTOP PROCESSOR USING LINUX REDHAT EL.6.0 2.6.32-71 KERNEL USING FIO WITH RAW IO. TESTING AND MEASUREMENT BY INTEL
• CPU cycles are expensive
• Each CPU cycle required for an IO adds latency
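The link between CPU cycles and latency is simple arithmetic: cycles divided by clock frequency. The sketch below uses the 3.3 GHz clock of the test CPU named in the measurement note; the per-IO cycle counts are hypothetical placeholders, not Intel's measured values.

```python
CLOCK_HZ = 3.3e9  # 3.3 GHz, the clock of the CPU named in the measurement note

def cycles_to_microseconds(cycles: int, clock_hz: float = CLOCK_HZ) -> float:
    """Time spent executing `cycles` CPU cycles, in microseconds."""
    return cycles / clock_hz * 1e6

# Hypothetical per-IO software stack costs, chosen only to show the scaling.
for label, cycles in (("heavier stack", 33_000), ("lighter stack", 11_000)):
    print(f"{label}: {cycles} cycles = {cycles_to_microseconds(cycles):.2f} us per IO")
```

Every cycle removed from the IO path comes straight off the per-IO latency, which is why a shorter command path matters even when bandwidth is unchanged.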
Benefits of NVMe over SCSI: Parallelism
SCSI: 32 commands, 1 queue
NVMe: 64000 commands, 64000 queues
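The parallelism gap follows from multiplying queue count by queue depth. This sketch takes the slide's round figures at face value and assumes the command counts are per queue; `QueueModel` is an illustrative name, not part of any NVMe or SCSI API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueueModel:
    """Queue count and per-queue depth of a storage command interface."""
    name: str
    queues: int
    commands_per_queue: int

    @property
    def max_outstanding(self) -> int:
        # Upper bound on commands in flight: every queue filled to its depth.
        return self.queues * self.commands_per_queue

# Round numbers as quoted on the slide.
SCSI = QueueModel("SCSI", queues=1, commands_per_queue=32)
NVME = QueueModel("NVMe", queues=64_000, commands_per_queue=64_000)

for model in (SCSI, NVME):
    print(f"{model.name}: up to {model.max_outstanding:,} outstanding commands")
```

In practice a host uses far fewer queues than the maximum, often about one per CPU core, so the figure is a ceiling rather than a typical operating point.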
But when can I get it for the data center?
Three phases of NVMe
Rack of Servers
PCI Expander
NVMe flash modules can plug directly into servers
Or they can plug into PCIe expansion slots
Phase 1: Basic (PCIe-based) NVMe in servers (last few years)
But …
Direct-attached storage: 13%
External storage: 87%
Three phases of NVMe
Phase 1: Basic (PCIe-based) NVMe in servers (last few years)
• NVMe flash modules can plug directly into servers
• Or they can plug into PCIe expansion slots
[Diagram: Rack of Servers with PCI Expander]
Phase 2: Basic NVMe in storage backend (starting to ship)
[Diagram: Server, FC, Flash Array; labels: SCSI, FC, PCIe, SAS, SATA]
Phase 3: NVMe over Fabrics (demoing now)
[Diagram: Server, FC, Flash Array; labels: NVMe, FC, PCIe, SAS, SATA]
NVMe over Fabrics
But why is it important to choose the right fabric?
Imagine having a Formula One race car, but no race track …
Picture by POSITIVE IMPULSE® * © OSB GmbH www.positive-impulse.com
Enterprise NVMe Requires a Network
• Fibre Channel is the transport for the vast majority of today’s all flash arrays
• RoCEv2, iWARP and InfiniBand are RDMA-based but not compatible with each other
• FCoE leverages FC-NVMe and requires a DCB network
• NVMe over TCP (iNVMe) proposed by Intel, Facebook, etc.
[Diagram: NVMe Server Software; Server Transport Abstraction; transports: Fibre Channel, Infiniband, FCoE, RoCEv2, iWARP, iNVMe; Storage Transport Abstraction; NVMe SSDs]
“Storage performance bottlenecks are moving out of arrays and into the storage network”
https://www.gartner.com/doc/reprints?id=1-3CQPTF5&ct=160726&st=sb
Gartner: “The Future of Storage Protocols,” Valdis Filks, Stanley Zaffos, 29 June 2016
NVMe over Fabrics – Enterprise Options
Protocol                   Latency   Scalable   Enterprise Footprint
Fibre Channel              Lower     Yes        Dominant Storage Fabric
RoCEv2                     Low       Yes        Negligible
iWARP                      Med       Yes        Negligible
FCoE                       Low       Med        Limited
NVMe over TCP/IP (iNVMe)   High      Limited    None
Infiniband                 Lowest    Limited    Almost none
[Diagram: NVMe Universe, comprising Fibre Channel, FCoE, IB, iNVMe, RoCE v2, iWARP]
NVMe over Fibre Channel Fabrics
NVMe over FC Product Status
Brocade and the Ecosystem
• Servers/HBA vendors:
– Supported on GEN6 capable HBAs from QLogic (Cavium) and Emulex (Broadcom)
• NVMe architected storage arrays:
– On the roadmap for several AFA vendor products
– External transport for NVMe adoption in 2018
• Brocade FC Products:
– Currently shipping GEN5 and GEN6 products are capable of switching NVMe over FC traffic alongside FC traffic that carries SCSI data
– Enhancements in GEN6 products provide visibility/monitoring/analytics into FC frames that carry NVMe
Investment Protection: No Rip Out and Replace
• NVMe and SCSI coexist in the same server and SAN
• Dynamically migrate to NVMe on demand
• Transition applications and infrastructure at your own pace
• Existing GEN5 and 6 SANs can run NVMe without code changes or disruptions
[Diagram: SCSI and NVMe traffic sharing the same SAN; labels: Emulex HBAs, GEN5 HBAs, SCSI, NVMe; legend: NVMe Traffic, SCSI Traffic]
NVMe - Takeaways
Why Fibre Channel for NVMe?
• Dedicated Storage Network
• Run NVMe and SCSI Side-by-Side
• Robust and battle-hardened discovery and name service
• Zoning and Security
• Integrated Qualification and Support
NVMe over Fibre Channel For Dummies book
http://media.wiley.com/assets/7359/40/9781119399711.pdf
Brocade Storage Networking Product Portfolio
Generations of Fibre Channel
Generation   Electrical / Optical Module   Encoding   Availability
1st Gen      1GFC / GBIC / SFP             8b/10b     1997
2nd Gen      2GFC / SFP                    8b/10b     2001
3rd Gen      4GFC / SFP                    8b/10b     2006
4th Gen      8GFC / SFP+                   8b/10b     2008
5th Gen      16GFC / SFP+                  64b/66b    2011
6th Gen      32GFC / SFP+                  64b/66b    2015
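The encoding entries in the table above determine how much of the line rate carries payload: 8b/10b sends 10 line bits for every 8 data bits, and 64b/66b sends 66 for every 64. A small sketch of that ratio follows; `encoding_efficiency` is an illustrative helper, and actual per-generation line rates are not taken from the slide.

```python
from fractions import Fraction

def encoding_efficiency(data_bits: int, line_bits: int) -> Fraction:
    """Fraction of transmitted bits that are payload for an Xb/Yb line code."""
    return Fraction(data_bits, line_bits)

ENCODINGS = {
    "8b/10b": encoding_efficiency(8, 10),    # 1st to 4th Gen in the table
    "64b/66b": encoding_efficiency(64, 66),  # 5th and 6th Gen in the table
}

for name, eff in ENCODINGS.items():
    overhead = 1 - eff
    print(f"{name}: {float(eff):.2%} payload, {float(overhead):.2%} coding overhead")
```

Moving from 8b/10b to 64b/66b cuts coding overhead from 20% to about 3%, which is part of how later generations raise usable throughput.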
Brocade Storage Networking Product Portfolio
Management and Orchestration Tools: Brocade Network Advisor, OpenStack, VMware, REST API
Brocade 6510 Switch
Brocade FX8–24 Extension Blade
Brocade DCX 8510 Backbones
Brocade Gen 5 Blade Server Switches
Brocade 6505 Switch
Brocade 6520 Switch
Brocade FC16–32, –48, –64 Port Blades
Brocade 7840 Switch
Brocade Analytics Monitoring Platform
Brocade G620 Switch
Brocade X6 Directors
Brocade SX6 Extension Blade
Brocade FC32–48 Port Blade
Brocade G610 Switch
Brocade Integrated Network Sensors
[Diagram: End-to-End Monitoring between Server and Storage Array using AMP and Fabric Vision; areas: Infrastructure Health, Infrastructure Performance, Infrastructure Availability, Application Performance, Infrastructure utilization, Application baselining]
An Appliance for Fibre Channel Network Telemetry
Brocade Analytics Monitoring Platform
Gain deeper insight into end-to-end application performance
Optimize infrastructure without downtime
Automate monitoring and alerting of abnormal behaviors
Quickly pinpoint problems, uncover issues before users are affected
AMP supports the move from reactive to proactive
Analytics Monitoring Platform Metrics
Real-time and historical metrics
• Read/write latency
• Read/write IOPS transfer rate (avg/max)
• Other SCSI command latency stats (reserve, release, inquiry, test unit ready; aggregate or individual)
• Pending/outstanding IOs (indicator of queue depth)
• IO size (avg/max)
• Protocol error stat tracking:
– I/O aborts, timeouts, check conditions
• Fabric latency
• AMP tracks metrics for all flows at 10 s/5 m/all resolution
• Historical metric retention via Brocade Network Advisor (5-minute granularity)

Metrics by data size:
• <8k
• 8k–64k
• 64k–512k
• >512k
• All sizes
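Grouping metrics by data size amounts to bucketing each IO by its transfer length. The sketch below shows one way to do that; it is not AMP's implementation. It assumes "k" means KiB and that boundary values fall into the buckets as coded, neither of which the slide specifies; `size_bucket` and the sample sizes are hypothetical.

```python
from collections import Counter

KIB = 1024  # assumption: the slide's "k" is taken as 1024 bytes

def size_bucket(io_bytes: int) -> str:
    """Map an IO transfer size to one of the slide's data-size bins."""
    if io_bytes < 8 * KIB:
        return "<8k"
    if io_bytes < 64 * KIB:
        return "8k-64k"
    if io_bytes <= 512 * KIB:
        return "64k-512k"
    return ">512k"

# Hypothetical IO sizes observed on one flow, in bytes.
sample_ios = [4096, 8192, 65536, 524288, 1048576]

histogram = Counter(size_bucket(size) for size in sample_ios)
print(dict(histogram))
```

The "All sizes" row is then simply the sum over the four buckets.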
“Fibre Channel will remain as the data center storage protocol of choice for the next decade.”
The full report is available on brocade.com
Gartner: “The Future of Storage Protocols,” Valdis Filks, Stanley Zaffos, 29 June 2016
Industry view on NVMe
Evaluator Group Report
Report available on request
Summary
Analysis
NVMe - Takeaways
Flash Changes Everything
• NVMe offers a lower-latency storage protocol
• NVMe over Fabrics support will enable networked solutions
• Initially offered in low latency JBOFs
• Then, integrated into Enterprise-class arrays

Fibre Channel is the Natural Evolutionary Path
• NVMe fabrics require: Lossless, Low latency, Secure, Scalable & Proven
• Fibre Channel offers all of the above for both legacy FCP (SCSI) and FC-NVMe

GEN6 Fibre Channel is the Right Choice
• 55% lower stack latency
• Investment protection
• Extends FC technology
• Concurrent SCSI & NVMe
• Reduces investment and adoption risk
• Known vendor support models
• Guaranteed interoperability
NVMe over Fabrics
Stack Architecture
• NVMe over Fabrics describes how to transport the NVMe interface across several scalable fabrics
• NVMe over Fabrics defined two types of fabrics for NVMe transport: Fibre Channel and RDMA
Enterprise NVMe Requires a Network
• NVMe over Fabrics describes how to transport the NVMe interface across several scalable fabrics
• NVMe over Fabrics initially defined two types of fabrics for NVMe transport: Fibre Channel and RDMA
NVMe-oF: RoCEv2
RoCEv2 - RDMA over Converged Ethernet v2
• Promoted by Mellanox
• RDMA-based
• Not compatible with other Ethernet options
• Lossless Ethernet required
– Explicit Congestion Notification (ECN)
• RDMA + IP + lossless Ethernet (DCB) layer adds complexity
• Ethernet / IP strength as “best effort, commodity, internet scale” network does not make it ideal for datacenter storage use case
NVMe-oF: iWARP
iWARP - Internet Wide Area RDMA Protocol
• Promoted by Intel (so far)
• RDMA-based
• Not compatible with other Ethernet options
• Lossless Ethernet not required but recommended (TCP will resend packets)
• RDMA + IP + lossless Ethernet (DCB) layer adds complexity
• Ethernet / IP strength as “best effort, commodity, internet scale” network does not make it ideal for datacenter storage use case
NVMe-oF: Infiniband
• Promoted by Mellanox
• RDMA-based (initial RDMA)
• Very rarely deployed; special use cases for high performance computing or server-to-server/cluster communication
• No broad vendor or interop support
NVMe-oF: FCoE
• Promoted by Cisco
• Leverages FC-NVMe
• Requires a DCB network (lossless), which is not very common (except TOR/HC solutions)
• Not compatible with other Ethernet options
• Lossless Ethernet (DCB) layer adds complexity
NVMe-oF: iNVMe
iNVMe - NVMe over TCP
• Promoted by Facebook, Dell EMC, Intel
• Not RDMA-based
• Not yet part of the NVMe-oF standard
– To be expected in the next release (2018/19)
• Approach like iSCSI: connectivity for commodity devices to existing infrastructure
• Leverages software implementation of NVMe
• Limited in scalability and performance
NVMe-oF: Fibre Channel
• Promoted by Brocade, Cisco
• Not RDMA-based
• Fibre Channel was already a lossless, topology-agnostic fabric; NVMe is just a new “upper layer protocol”… RDMA not needed
• Natural extension to leverage fabrics for shared storage arrays
– FC being the predominant fabric for storage (70% of flash storage is FC)
• Will run on existing GEN5/6 fabrics
NVMe over Fabrics protocol stacks
• Fibre Channel was already a lossless, topology-agnostic fabric, NVMe is just a new “upper layer protocol”… RDMA not needed
• RDMA + IP + lossless Ethernet (DCB) layer adds complexity to RoCEv2 & iWARP options
• Ethernet / IP strength as “best effort, commodity, internet scale” network does not make it ideal for a premium datacenter scale use case like storage
[Diagram: protocol stacks, listed top to bottom]
FC: FC-NVMe / FC-FS / Encoding / FC Physical
RoCEv2: NVMe over RDMA / RDMA Verbs / IB Transport (with ECN) / UDP / IP (w ECN) / Ethernet DCB (w PFC, ETS, DCBx) / Ethernet Physical
iWARP: NVMe over RDMA / RDMA Verbs / DDP protocol / MPA / TCP (with ECN) / IP (w ECN) / Ethernet DCB (w PFC, ETS, DCBx) / Ethernet Physical
NVMe over Fabrics
NVMe stack efficiency
Source: QLogic
FCP Protocol Transactions
• FCP Transactions look similar to RDMA
– For Read: FCP_DATA from Target
– For Write: Transfer Ready and then DATA to Target
[Diagram: IO Read and IO Write exchanges between Initiator and Target; SCSI Initiator and SCSI Target, each with an FCP I/F Model]
Source: QLogic
NVMe-oF Protocol Transactions
• NVMe-oF over RDMA protocol transactions
– RDMA Write
– RDMA Read with RDMA Read Response
[Diagram: IO Read and IO Write exchanges between Initiator and Target; NVMe-oF Initiator and NVMe-oF Target, each with an NVMe-oF I/F Model]
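The claim that FCP transactions look similar to RDMA can be checked by writing both exchanges as message lists and comparing their direction patterns. This is a simplified sketch, not a protocol trace. The slides name only FCP_DATA, Transfer Ready, RDMA Write and RDMA Read with RDMA Read Response; the command and response messages (FCP_CMND, FCP_RSP, command and response capsules) are filled in from general knowledge of FCP and NVMe over RDMA and should be treated as assumptions, as should the single data phase per IO.

```python
# Each message is (sender, receiver, name); I = initiator, T = target.
FCP = {
    "read": [("I", "T", "FCP_CMND"),
             ("T", "I", "FCP_DATA"),
             ("T", "I", "FCP_RSP")],
    "write": [("I", "T", "FCP_CMND"),
              ("T", "I", "FCP_XFER_RDY"),   # Transfer Ready
              ("I", "T", "FCP_DATA"),
              ("T", "I", "FCP_RSP")],
}

NVME_RDMA = {
    "read": [("I", "T", "Command capsule"),
             ("T", "I", "RDMA Write (data)"),
             ("T", "I", "Response capsule")],
    "write": [("I", "T", "Command capsule"),
              ("T", "I", "RDMA Read (request)"),
              ("I", "T", "RDMA Read Response (data)"),
              ("T", "I", "Response capsule")],
}

def direction_pattern(messages):
    """Reduce an exchange to its sequence of (sender, receiver) pairs."""
    return [(src, dst) for src, dst, _ in messages]

for op in ("read", "write"):
    same = direction_pattern(FCP[op]) == direction_pattern(NVME_RDMA[op])
    print(f"IO {op}: FCP and NVMe-oF/RDMA share the same message pattern: {same}")
```

In this simplified model both transports use three messages for a read and four for a write, with the target driving the data phase in each case.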
Benefits of NVMe over SCSI: Parallelism
More and deeper queues: up to 64K queues, each 64K commands
• The increase in IOPS will be as important as reduced latency:
• Queuing is integrated into the protocol rather than layered on top (as with SCSI)
– Reduces server-side context switching, eliminating vast wasted clock cycles