1
Lenovo Distributed Storage Solution for Ceph
2016 Lenovo Unclassified. All rights reserved.
Lenovo Solutions Development
November 2016, Salt Lake City
2
Agenda
• Enterprise Storage – current state and future
• What is Ceph?
• Lenovo Portfolio for SAP HANA
• Lenovo Value Proposition
• Lenovo Architectures
• Summary
3
Welcome to the age of endless data growth
• 2016: ~59 Exabytes of external storage sales (IDC forecast)
  – 22 EB for FibreChannel
  – 18 EB for Network Attached (NAS filer)
  – 9 EB for iSCSI
  – 8 EB for Direct Attached
  … sold by the storage companies we all know
• Data created every day: ~2.5 Exabytes
4
The commoditization of Enterprise storage
External storage systems: traditional storage vendors are losing market share to ODM Direct vendors (ODM = Original Design Manufacturers). Source: IDC
Overall storage market: IDC March press release, covering FY2015:
"The enterprise storage market closed out 2015 on a slight downturn, as spending on traditional external arrays continues to decline," said Liz Conner, Research Manager, Storage Systems. "Over the past year, end user focus has shifted towards server-based storage, software-defined storage, and cloud-based storage. As a result, traditional enterprise storage vendors are forced to revamp and update their product portfolios to meet these shifting demands."
5
Storage Types
[Diagram] Three storage types compared:
• Files (NAS): a directory tree rooted at a root dir, with files (/a/b, /c, /d/e, /f/g, …), each a sequence of bytes 0..n; clients read/write byte ranges within a file.
• Blocks (SAN): volumes (volume 1 … volume m) of fixed-size blocks 0..n (block = 4K bytes); clients read/write block ranges.
• Objects: a flat pool of objects (object 1 … object m), each carrying attribute metadata (attr-1=val-1, attr-2=val-2, …); clients read/write entire objects.
6
Evolution of Storage Topologies
[Diagram] Three topologies:
• Internal (DAS): each application server has its own internal storage – hard to provision the right amount of storage, and it doesn't scale.
• Shared, networked (some is SDS): application servers connect via Ethernet / FibreChannel to networked storage appliances.
• Hyper-converged (mostly SDS, virtualized): the application runs directly on storage-rich servers connected via Ethernet; each node is both application and storage server.
7
Survey: Legacy vs Emerging Providers
Source: Tintri State of Storage Survey
https://www.tintri.com/news/tintri-state-storage-survey-reveals-biggest-pain-points-finds-buying-criteria-mismatched-today
8
Object-based storage is the future for Exascale storage:
"The future of storage is software based."
"FOBS [file- and object-based storage] solutions are much more versatile and will quickly outpace more rigid, hardware-based options."
"Scale-up solutions, including unitary file servers and scale-up appliances and gateways, will fall on hard times throughout the forecast period […] and will experience only sluggish growth through 2016 before beginning to decline in 2017."
Ashish Nadkarni, IDC Storage Systems Research Director (2013)
9
SDS market outlook
Source: IT Brand Pulse
10
Market share trends per storage type
[Chart] Market share, 2014 → 2016 → 2026:
• Server-attached storage (HDD, flash, NVRAM) – growth driven by virtualization, SDS, and cloud storage; typified by Hadoop, Ceph, Nutanix.
• Networked, file-based storage (including SDS) – growth driven by automatically generated, unstructured data; typified by EMC, HP, NetApp.
11
New storage technologies IT professionals expect to evaluate or deploy in 2015
Source: IT Brand Pulse
12
Why software-defined storage?
• Cost savings & flexibility:
  – Avoid the large markup storage vendors put on hardware
  – Share hardware resources between storage and application; higher utilization gets more work out of less hardware
  – More customer flexibility in choosing the best hardware for their needs
• Disadvantages when done by yourself (and how this solution addresses them):
  – Customer is responsible for selecting and installing hardware, and may not provision adequately for the needs of the software → Lenovo Solution Architecture & Support
  – Customer is responsible for debugging problems and then working with the server, storage, OS, or networking vendor → single point of contact via the SAP OSS ticketing system
  – Storage vendor has to be prepared to support its software on almost any reasonable hardware → SUSE and Lenovo have defined a portfolio using only a few building blocks
13
What is Ceph?
14
Looking under the hood of Ceph
[Diagram] Software layer: RADOS, a software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage servers, specifically tuned to run the SAP HANA workload.
Hardware layer: Server 1, Server 2, Server 3, … Server X.
Interfaces towards clients:
• rbd (rados block device)
• cephfs (distributed POSIX file system)
• API: REST gateway to the object store (S3/Swift compatible)
15
The heart of Ceph: RADOS
RADOS elements:
• OSDs
  – Each corresponds to one storage device in Linux (JBOD, RAID (HDD or SSD), NVMe)
  – Store objects physically (see next slide)
  – Act as fully autonomous devices to provide linear scalability and no single point of failure (SPOF)
• Monitors
  – Manage cluster membership and cluster state; form a quorum of cluster nodes
[Diagram] Server 1, Server 2, Server 3, … Server X, each feeding its OSDs (OSD = single Object Storage Device) into RADOS.
16
The heart of Ceph: RADOS (cont)
• RADOS provides a hash-based placement algorithm (called CRUSH) – Foundation for linear scalability (each client can compute placement independently)
– Direct client to server data path
– Distributes data randomly among OSDs within the cluster
– Allows data placement rules and constraints
• Stores content of rbd images in 4 MB flat files
• Work is spread across all spindles in a cluster (much better utilization than traditional RAID arrays on Enterprise storage systems)
• A server or disk failure is not fatal – one or more remaining OSDs still hold the data
17
Ceph Data Placement
• Default configuration for redundancy: size=3
• Copies are taken from the primary replica
• Data location determined by CRUSH map
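The placement idea behind CRUSH can be sketched with a toy stand-in. This is not the real CRUSH algorithm (which also honors placement rules, hierarchy, and failure domains); it only illustrates the key property: a stable hash of the object name lets every client compute the same set of OSDs independently, with no central lookup table.

```python
import hashlib

def place(object_name: str, num_osds: int, replicas: int = 3) -> list[int]:
    """Toy CRUSH stand-in: deterministically map an object to `replicas`
    distinct OSDs (assumes num_osds >= replicas). Because the mapping is
    a pure function of the name, every client computes the same placement
    with no metadata server in the data path."""
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    osds: list[int] = []
    i = 0
    while len(osds) < replicas:
        candidate = (h + i) % num_osds
        if candidate not in osds:
            osds.append(candidate)
        i += 1
    return osds

# Two independent clients agree on the placement without talking to each other:
assert place("rbd_data.1234.0000", 12) == place("rbd_data.1234.0000", 12)
```

The determinism is what gives the linear scalability and direct client-to-server data path mentioned above.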
18
Use cases (1/4) – Storage Snapshotting
• A snapshot is a read-only copy of the rbd image state at a particular point in time
• Snapshotting is built in automatically, with no extra license fee as on Enterprise Storage systems
• Revert-to-snapshot is supported
• Usage (very popular):
  – Backup
  – Fetch test data
  – Save state before a HANA table modification
[Diagram] rbd with XFS: a snapshot is triggered in the SDS software and executed by all OSDs in parallel.
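The snapshot semantics above can be sketched with a toy copy-on-write image. The `ToyImage` class is hypothetical and is not the rbd API; it only models the slide's point that a snapshot is a read-only, point-in-time view that shares data with the live image rather than copying it.

```python
class ToyImage:
    """Toy copy-on-write image: a snapshot freezes the block map at a
    point in time and shares the (unmodified) block data with the live
    image, so taking one is cheap."""
    def __init__(self):
        self.blocks = {}      # block number -> bytes (live image)
        self.snapshots = {}   # snapshot name -> frozen block map

    def write(self, blkno, data):
        self.blocks[blkno] = data

    def snapshot(self, name):
        # Shallow copy: the map is frozen, the block data is shared.
        self.snapshots[name] = dict(self.blocks)

    def rollback(self, name):
        # "Revert to snapshot": restore the frozen state.
        self.blocks = dict(self.snapshots[name])

img = ToyImage()
img.write(0, b"before")
img.snapshot("pre-change")        # e.g. before a HANA table modification
img.write(0, b"after")
img.rollback("pre-change")
assert img.blocks[0] == b"before"
```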
19
Use cases (2/4) – Storage mirroring
• For synchronous operation, just increase the number of data copies
• For asynchronous operation, there is capability on rbd image level
[Diagram] Replication from a local site to a remote site happens on pool level, affecting all rbd images in this pool. The primary site uses the RBD journaling image feature to ensure crash-consistency; the remote site pulls the journal from time to time and applies it locally.
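The journal-pull scheme can be sketched as follows. The `Primary` and `Remote` classes are hypothetical illustrations of the idea, not the rbd-mirror implementation: the primary appends every write to an ordered journal before applying it, and the remote periodically pulls and replays the journal, so it always holds a crash-consistent (if slightly stale) copy.

```python
class Primary:
    """Primary site: journal every write, then apply it locally."""
    def __init__(self):
        self.image = {}       # block number -> bytes
        self.journal = []     # ordered list of (block number, bytes)

    def write(self, blkno, data):
        self.journal.append((blkno, data))  # journal first: crash-consistent
        self.image[blkno] = data

class Remote:
    """Remote site: pull the journal from time to time and replay it."""
    def __init__(self):
        self.image = {}
        self.applied = 0      # journal position already replayed

    def sync(self, primary):
        for blkno, data in primary.journal[self.applied:]:
            self.image[blkno] = data
        self.applied = len(primary.journal)

p, r = Primary(), Remote()
p.write(0, b"a"); p.write(1, b"b")
r.sync(p)                     # asynchronous: runs on the remote's schedule
assert r.image == p.image
```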
20
Use cases (3/4) – Security features
• Security comes at no extra license fee, it is built in:
  – Encryption of data at rest (at the OSD level)
  – Checksumming of data at rest, plus scrubbing (verifying that data copies match, etc.)
"[In addition to making multiple copies of objects,] Ceph ensures data integrity by scrubbing placement groups. Ceph scrubbing is analogous to fsck on the object storage layer. Ceph generates a catalog of all objects and compares each primary object and its replicas to ensure that no objects are missing or mismatched."
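The catalog-and-compare step can be sketched like this. It is a simplification (real deep scrub works per placement group and also verifies object metadata): the primary copy's checksum is compared against every replica, flagging silent corruption.

```python
import hashlib

def scrub(primary: bytes, replicas: list[bytes]) -> bool:
    """Deep-scrub sketch: checksum the primary copy and verify every
    replica matches, so mismatched or bit-rotted copies are detected."""
    digest = hashlib.sha256(primary).hexdigest()
    return all(hashlib.sha256(rep).hexdigest() == digest for rep in replicas)

assert scrub(b"object", [b"object", b"object"])      # healthy size=3 set
assert not scrub(b"object", [b"object", b"corrupt"]) # one replica damaged
```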
21
Use cases (4/4) – Efficient redundancy for scale-out env.
• Traditional RAID gets very inefficient when scaling big:
  – Lots of capacity is lost to overhead
  – RAID over dozens of disks is not recommended
  – During a rebuild, only a fraction of the overall spindles is involved
  – A rebuild puts heavy load on the storage subsystem (guess at what point HDDs fail …)
• New method: erasure coding
  – Each object is stored as K+M chunks: K data chunks plus M coding chunks; it can sustain the loss of any M chunks
  – Example: K+M = 4+2 sustains the loss of two devices
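The efficiency gain of erasure coding over plain replication is simple arithmetic; a minimal sketch:

```python
def replication_efficiency(copies: int) -> float:
    """Usable fraction of raw capacity when storing N full copies."""
    return 1 / copies

def ec_efficiency(k: int, m: int) -> float:
    """Usable fraction with K data chunks + M coding chunks;
    survives the loss of any M chunks."""
    return k / (k + m)

# size=3 replication keeps ~33% of raw capacity usable and survives
# 2 failures; the slide's 4+2 example keeps ~67% usable and also
# survives 2 failures.
assert ec_efficiency(4, 2) > replication_efficiency(3)
```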
22
Lenovo Portfolio for SAP HANA
Going into the details …
23
Lenovo Storage Solution for SAP HANA – Portfolio
High Availability thru redundant arrays · Scalability thru additional disks or arrays beyond the defined node limitations · Unlimited growth thru scale-out and replication.
Each SES array has a minimum of three x3650 M5 servers, which is the required SUSE minimum for a SES cluster. All solutions are available today under CA with Lenovo development support.

                     Solution H (HDD)             Solution F (Flash)      Solution C (Capacity)
Server               x3650 M5, 8871-AC1           x3650 M5, 8871-AC1      x3650 M5, 8871-AC1
Nodes per array      up to 4                      up to 12                up to 16
CPU                  2x E5-2630v4 (min.)          2x E5-2690v4            2x E5-2690v4
Memory               256 GB DDR4                  256 GB DDR4             256 GB DDR4
Drives               12x 1.2TB HDD 2.5" with      5x 3.84TB SSD 2.5"      12x (max. 36x) 2-10TB HDD 3.5" with
                     FlashCache (6x 400GB SSD)    (max. 24)               FlashCache (6x 400GB SSD, max. 24)
Software             XFS on SUSE Enterprise Storage (all three models)
Data/log capacity    10.8 TB                      max. 92.1 TB            max. 288 TB
Network              4x 40GbE + 4x 1GigE (all three models)
Internal upgradability: +6 SSD (8 nodes)
24
Certified SAP HANA Hardware Directory
25
High-level view
Server with local storage devices (HDD, SSD, NVMe, ..)
plus software to turn it into SDS
26
Lenovo Storage Solution for SAP HANA
CephFS
27
Lenovo Value Proposition
28
Why Open-Source Storage Software?
Mostly due to the commoditization of storage:
• Repeating what happened with Unix (Linux) and C compilers (gcc).
• Adherence to standards (NFS, CIFS, iSCSI, FCoE):
  – Makes it feasible to clone the functionality – not a moving target
  – Makes different implementations interchangeable, i.e. commodities
• No one company can compete with the productivity and innovation of an active world-wide community of open-source developers.
• Cost: the price of any commoditized software eventually approaches $0.
• The final trigger for adoption is when a paid support model emerges from a trusted source (e.g. Red Hat or SUSE).
• Gartner predicts that 20% of storage will be open source as early as 2017.
29
Why Ceph as an engineered solution?
• Cost savings & flexibility:
  – Avoid the large markup storage vendors put on hardware
  – Share hardware resources between storage and application; higher utilization gets more work out of less hardware
  – More customer flexibility in choosing the best hardware for their needs
• Disadvantages when done by yourself (and how this solution addresses them):
  – Customer is responsible for selecting and installing hardware, and may not provision adequately for the needs of the software → Lenovo Solution Architecture & Support
  – Customer is responsible for debugging problems and then working with the server, storage, OS, or networking vendor → single point of contact for support
  – Storage vendor has to be prepared to support its software on almost any reasonable hardware → SUSE and Lenovo have defined a portfolio using only a few building blocks
30
Lenovo Storage Solution for Ceph
Simple – based on proven x3650 M5 server technology
Seamless – providing block storage, configured e.g. with XFS
Software Defined – based on SUSE Enterprise Storage
Safe – High Availability thru replication (r2, sync or async)
Scalable – scale up and/or scale out
Superior Technology – scalable Flash acceleration, redundant 40GbE, RDMA, etc.; replication, snapshotting, encryption, Object Storage, etc.
Suitable Models – HANA (Entry), Flash (Performance), Capacity (Density)
SAP HANA ready – certified Enterprise Storage for up to 64 nodes
31
Lenovo Storage Solution – Engineered end-to-end (1/2)
• Detailed workload analysis, example: Ceph internal behaviour for O_DIRECT 16k blocks
32
Lenovo Storage Solution – Engineered end-to-end (2/2)
• Detailed workload analysis, example: CPU bottleneck analysis (“How many cores do I really need?”)
[Charts] Results shown for 4K random writes, 64K random writes, and 1M streaming.
33
Lenovo Architectures
34
Ceph for SAP HANA: Details
All servers run SUSE Enterprise Storage (SES).
35
SUSE Enterprise Storage for SAP HANA
Each HANA server has:
• two rbd images (data and log, GB .. TB), which are XFS-formatted
• access to a CephFS distributed file system (HANA traces, …)
36
Example: Ceph for NeXtScale (42U rack)
• Design criteria: $/TB – big-data, read-intensive workloads (genomics, bio, imaging)
• Up to 2 PB net usable in a single rack (10+2 EC), plus 15 TB Flash cache
• Net capacity (TB) varies with the redundancy level (number of coding chunks)
• Basic building block: 6U / 420 TB raw – 3.5" SATA for capacity, plus a Flash write cache with automatic promote/demote based on age, access frequency, and percentage full
• Performance and capacity layers can be scaled independently
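The rack-level figure is just raw capacity scaled by K/(K+M). A sketch of the arithmetic, assuming six of the rack's 6U building blocks hold storage with the remaining units left for networking (an assumption not stated on the slide):

```python
def net_capacity_tb(raw_tb: float, k: int, m: int) -> float:
    """Net usable capacity under K+M erasure coding: only K of every
    K+M chunks hold data, the rest is redundancy overhead."""
    return raw_tb * k / (k + m)

# 6 building blocks x 420 TB raw each, with the slide's 10+2 EC profile:
raw = 6 * 420                          # 2520 TB raw
net = net_capacity_tb(raw, 10, 2)      # 2100 TB, i.e. roughly 2 PB net
```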
37
Summary
• The storage market is changing rapidly
  – Trend towards software-defined, better-utilized storage (think of Uber or Airbnb)
  – External storage systems are in trouble
• Don't pay expensive license fees for storage features that you can get for free
• Avoid vendor lock-in – free your storage
• Lenovo, in collaboration with SUSE, has pioneered a new software-defined storage solution for use with SAP HANA
  – Get ready for the future of storage
  – Come and talk to us at the booth