ecs/cloud object storage - devops day
TRANSCRIPT
1© Copyright 2015 EMC Corporation. All rights reserved.
OBJECT STORAGE
TECHNICAL DISCUSSION
DATA GROWTH IS BREAKING TRADITIONAL STORAGE
SCALE-OUT FILE & OBJECT STORAGE CONTINUES TO GROW
• Overly complex – Multiple data protection schemes, protocols, management tools
• Can’t economically scale – Inefficient, high overhead, especially at geo-scale
• Not cloud-ready – Not architecturally suited and no self-service
Source: IDC EMC Digital Universe Study 2014
Today’s Storage Infrastructure
[Diagram: separate BLOCK and FILE storage silos]
WHY OBJECT?
Object Storage Characteristics
Linear Scalability: Scales to billions of objects
No Locking: No lock on write or create operations
Geo-scale: Geo-replicated and distributed
Support for large files: Object sizes are in TBs
Web friendly: Firewall friendly, HTTP and REST accessible
Metadata and extensibility: Objects can be extended with multiple policies (immutability, retention, etc.)
An Object platform offers …
A flat namespace of millions of buckets
Buckets that scale to billions of objects
Geo distribution, protection, access
User meta data as a first-class entity
Snapshot consistency semantics
Multi-tenant access & metering
Multiple data access methods including via REST/HTTP (S3, Hadoop, CAS, Atmos & Swift)
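As a rough illustration of REST/HTTP access with user metadata, the sketch below builds (but does not send) an S3-style PUT request using only the Python standard library. It assumes path-style addressing and the S3 convention of carrying user metadata in `x-amz-meta-*` headers; the bucket name, object key, and metadata values are hypothetical, and request signing/authentication is deliberately left out.

```python
from urllib.request import Request


def build_put_object(endpoint, bucket, key, body, user_metadata):
    """Build (without sending) an S3-style PUT request with user metadata."""
    headers = {"Content-Length": str(len(body))}
    for name, value in user_metadata.items():
        # S3 convention: user metadata travels as x-amz-meta-<name> headers.
        headers[f"x-amz-meta-{name}"] = value
    url = f"{endpoint}/{bucket}/{key}"  # path-style addressing (assumption)
    return Request(url, data=body, headers=headers, method="PUT")


# Hypothetical bucket, key and metadata; no network traffic happens here.
req = build_put_object(
    "https://accesspoint.yourcompany.com",
    "videos",
    "cam01/clip.mp4",
    b"example payload",
    {"retention": "7y", "site": "memphis"},
)
print(req.get_method(), req.full_url)
print(dict(req.header_items()))
```

A real client would add authentication headers before sending; an S3 SDK normally handles that step.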
File and Object Storage Comparison
File | Object
Writing to a file requires an exclusive lock | Objects support multiple writers, no locking
Limit on number of files in a directory | Objects are limitless in size (1 MB to TBs) and scale across multiple files
File meta-data is fixed by the file system, no user meta-data | Objects support extensible meta-data
Large files are hard to seek | Objects can be viewed with no limitation
File create operations require an exclusive lock on the directory | No locking required to create objects
CIFS/NFS access is not Web or firewall-friendly; relies on file/folder access control and session-based authentication | Easy, fine-grained authentication and access control (per object); HTTP, REST-based access
Object Use Cases are Expanding
• Existing Use Cases:
– Scalable content store for cloud-based applications/services
– Scalable content storage for vertical applications
– Tape rationalization/elimination
• Emerging Use Cases:
– Storage for Big Data/Hadoop
– NAS replacement/augmentation
– Public IaaS alternative
  • Migrate to alternative providers or “in-sourcing”
Applications, analytics and data growth drive Object
Source: 451 Research
MODERN APPS ARE BREAKING TRADITIONAL STORAGE
NOT DESIGNED FOR CLOUD AND BIG DATA APPLICATIONS
• Architecture is too complex – Locking, replication, High Availability and geo-distribution are complex
• Not Web or firewall friendly – Distributed (WAN) access is complex
• Storage silos impede development – Different hardware for every data type and access protocol
USE CASES
GLOBAL CONTENT REPOSITORY
ON-PREMISE UNSTRUCTURED STORAGE PLATFORM
PROBLEM
• Can’t cost-effectively manage or scale storage to support explosive growth in unstructured content.
• Traditional storage not suited for new Web, mobile and cloud applications.
• Difficult and costly to manage data lifecycle and retention policies across archive silos and sites
VALUE
• Reduce complexity and cost – one globally accessible, geo-efficient archive that serves multiple applications and content types at lower cost than public cloud.
• Anywhere data access – All data globally accessible by Web, mobile and cloud apps.
• Enterprise-grade data protection – Efficient geo-protection and policy-based retention for basic compliance and governance.
[Diagram: sites in U.K., L.A. and Memphis behind https://accesspoint.yourcompany.com; workloads: Applications, Tiering, Archiving, Backup]
MODERN APPLICATION PLATFORM
EFFICIENT GEO-CAPABLE STORAGE & ANYWHERE ACCESS
PROBLEM
• Traditional storage architecture not optimized for multi-site, mobile access to content
• Writing to multiple file systems and proprietary APIs complicates development
• Can’t access or process large data sets
VALUE
• Anywhere access - Provides anywhere access to geo-replicated content
• Simpler, faster development - Supports multiple industry standard APIs/protocols and anywhere access with strong consistency
• Unmatched access and efficiency - Geo-protection, active-active architecture optimizes both access and storage efficiency for Big Data – large and small files
[Diagram: sites in U.K., L.A. and Memphis behind https://accesspoint.yourcompany.com]
GEO-SCALE BIG DATA ANALYTICS
EFFICIENT GEO-SCALE STORAGE & GLOBAL BIG DATA ANALYTICS
PROBLEM
• Large (and growing) data volumes lead to exponential storage costs
• Traditional Hadoop replication leads to unmanageable DC footprint with data growth
• Always have to move data to the analytics cluster
VALUE
• Cost Efficient Storage
• HDFS Archive – Bring state of the art patented technology to provide highly dense storage for Hadoop
• Global Analytics – Bring analytics to geo-distributed data and archives
[Diagram: ANALYTICS across sites in U.K., L.A. and New York behind https://accesspoint.yourcompany.com]
COLD ARCHIVE
COST EFFECTIVE LONG TERM RETENTION
PROBLEM
• Unstructured data growth - Reclaim costly Tier 1 storage
• Current solutions aren’t scalable or cost efficient
• Instant access to cold-stored data is required
• “No Public Cloud” policy - Data needs to be on-premises
VALUE
• Costs less than public cloud - Provides on-premises security
[Diagram: Video, Unstructured Data, Sensory Data and Images archived over LAN/WAN to sites in U.K., L.A. and Memphis]
‘IOT’ CLOUD STORAGE PLATFORM
‘INTERNET OF THINGS’ – SENSORY & TELEMETRY DATA COLLECTION
PROBLEM
• Need cost effective solution to store huge amounts of unstructured data generated by IOT and sensors
• “No Public Cloud” policy - Data needs to be on-premises
• Data collection via modern cloud applications requires compatibility with APIs like S3 and OpenStack
• Analytics workflow is slow, expensive and complicated using Hadoop direct attach or public cloud storage
VALUE
• Cost per GB is less than public clouds
• Provide high availability with on-premises security
• Compatible with S3, OpenStack, and other popular APIs
• HDFS compatible and enables a streamlined Hadoop workflow for “data in place” analytics
EMC & OBJECT STORAGE
Hyper Scale - Scales Out to Billions of Objects
“Public Cloud-Like” – Secure Access, Anytime, Anywhere
Comprehensive Multi-Tenant Management
Active/Active Geo-Distributed Architecture
Multiple Protocol Support – REST and HDFS Ready
Compelling Economics –Appliance or SW Only (DIY)
ECS HYPERSCALE CLOUD STORAGE
CUSTOMERS CAN LEVERAGE COMMODITY PLATFORMS
SOFTWARE-DEFINED STORAGE
Software-Defined Storage
Commodity Platforms
COMMODITY HARDWARE VALUE PROPOSITION
• Utilize standardized, open technologies and mass market components
• Individual components provide lower performance, reliability, etc.
• At sufficient scale, with the right software, the component pool provides superior characteristics
CHOICE AND FLEXIBILITY
ECS SOFTWARE: Enterprise & SPs; Object & HDFS; DIY Commodity
ECS APPLIANCE: Enterprise & SPs; Object & HDFS; integrated appliance
ViPR DATA SERVICES: Enterprise & SPs; Object & HDFS; File-based Arrays
EMC OBJECT ECOSYSTEM
Enterprise Information Archiving
Enterprise Content Management
Analytics
Cloud Gateways
Migration
Sync & Share
CLOUD BOOST
Analytics
Protocols CAS
Learn | Try | Develop | Collaborate
Explore how-to videos, helpful guides, and training
Download ViPR FREE with no time limit – for non-production use
Access SDKs, FAQs, forums, technical documentation, sample apps, and more
Ask the experts, talk to peers, share ideas and experiences
www.emc.com/viprcommunity
JOIN THE ViPR COMMUNITY…
ECS TECHNICAL DETAIL
• ECS STORAGE ENGINE
• WRITE PATH, READ PATH, BOX CARTING
• ECS GEO-CAPABILITIES
ECS ARCHITECTURE OVERVIEW
• Multi-head access - ability to access the same data concurrently through multiple access protocols.
• Provides High Availability and Scalability.
• Manages transactions and persistent data.
• Protects data against failures, corruption and disasters
[Diagram: Object, HDFS and NFS* heads on top of the Storage Engine, running on ECS Appliance, Commodity hardware, or EMC and 3rd party file arrays]
COMPREHENSIVE DATA ACCESS
COMPATIBILITY WITH COMMON INDUSTRY APIs
• Simultaneous access to underlying data through multiple interfaces
– Object, HDFS, File (future)
• HDFS compatible with Cloudera, Hortonworks, Pivotal etc.
• Support for S3, Swift, Atmos and Centera CAS object APIs
• Extensions to APIs
– Byte-Range updates, Atomic appends, Rich ACLs etc.
DESIGN PRINCIPLE: LAYERED ARCHITECTURE
Limitless Scale: Each layer is independently scalable, highly available, and has no single point of failure.
Scale-Out Architecture: Scale by adding more nodes, no special nodes or roles
Global Namespace: Any node has a full system view of data and meta-data
[Diagram: OBJECT and HDFS heads on each node, above the Storage Engine, the Persistence Layer and JBODs]
TRANSACTION FLOW (WRITE)
1. Create Object request (name, data, metadata) arrives at Node 1
2. Write of data and metadata in chunk (Node 1, Node 4, Node 5). All three copies are written in parallel. Write is successful only if all copies ack
3. Index update (name, location) to the owner partition (Node 2)
4. Journal write
5. Back to client
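The write flow can be sketched as a toy in-memory simulation. This is an illustration of the five steps only, not ECS code: the `Node` class, its fields, and the `create_object` helper are hypothetical names, and each "chunk" is just a byte array.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical storage node holding one append-only chunk replica."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.chunk = bytearray()  # append-only chunk replica
        self.index = {}           # object name -> (replica offsets, length)
        self.journal = []         # record of index updates

    def append(self, payload):
        """Append payload to the chunk; return its offset, or None on failure."""
        if not self.healthy:
            return None           # no ack from this replica
        offset = len(self.chunk)
        self.chunk += payload
        return offset


def create_object(name, data, metadata, replicas, owner):
    """Steps 2-5: parallel replica write, index update, journal write, ack."""
    payload = metadata + data
    # Step 2: write all copies in parallel; every replica must ack.
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        offsets = list(pool.map(lambda node: node.append(payload), replicas))
    if any(offset is None for offset in offsets):
        return False              # write fails unless all copies ack
    location = ({n.name: o for n, o in zip(replicas, offsets)}, len(payload))
    owner.index[name] = location            # Step 3: index update on owner
    owner.journal.append((name, location))  # Step 4: journal write
    return True                             # Step 5: back to client


n1, n2, n4, n5 = Node("node1"), Node("node2"), Node("node4"), Node("node5")
print(create_object("photo.jpg", b"pixels", b"meta", [n1, n4, n5], n2))
```

The sketch does not clean up bytes already appended by healthy replicas when another replica fails; a real system has to account for such partial writes.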
ERASURE CODING
Data is written into chunks - 3 copies
Erasure coding begins as the chunks are shipped
Once EC completes, the data becomes fully protected and the 3 copies are deleted
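The slides do not name the coding scheme, so the sketch below uses the simplest possible stand-in, a single XOR parity fragment, to show the idea of replacing full copies with data fragments plus redundancy. It tolerates the loss of only one fragment and should not be read as the code ECS actually uses; `encode` and `recover` are hypothetical helper names.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(chunk, k):
    """Split chunk into k equal data fragments plus one XOR parity fragment."""
    size = -(-len(chunk) // k)                # ceiling division
    padded = chunk.ljust(size * k, b"\0")     # zero-pad to a multiple of k
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments, lost):
    """Rebuild the fragment at index `lost` by XOR-ing all the others."""
    survivors = [f for i, f in enumerate(fragments) if i != lost]
    return reduce(xor_bytes, survivors)


frags = encode(b"object storage chunk", 4)
print(recover(frags, 2) == frags[2])
# Space used: (k + 1) / k = 1.25x of the data here, versus 3x for three copies.
```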
GARBAGE COLLECTION
In append-only systems, updates and deletes leave files with blocks of data that are no longer used.
Garbage collection is done at the chunk level.
Unused chunks are reclaimed by a background task.
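A minimal sketch of chunk-level reclamation, assuming a simple bookkeeping scheme in which each chunk remembers which live objects still reference it. The `ChunkStore` class and its methods are hypothetical; the slides do not describe how ECS tracks chunk usage.

```python
class ChunkStore:
    """Toy bookkeeping for chunk-level garbage collection."""

    def __init__(self):
        self.index = {}  # object name -> id of the chunk holding its data
        self.live = {}   # chunk id -> names of objects still stored in it

    def put(self, name, chunk_id):
        """Record a create or update; an update supersedes the old version."""
        self.delete(name)
        self.index[name] = chunk_id
        self.live.setdefault(chunk_id, set()).add(name)

    def delete(self, name):
        """Drop the reference; the bytes stay in the append-only chunk."""
        old = self.index.pop(name, None)
        if old is not None:
            self.live[old].discard(name)

    def collect(self):
        """Background task: reclaim chunks that no live object references."""
        dead = [cid for cid, names in self.live.items() if not names]
        for cid in dead:
            del self.live[cid]
        return dead


store = ChunkStore()
store.put("a", 1)
store.put("b", 1)
store.put("c", 2)
store.put("a", 3)   # update: the new version of "a" lands in chunk 3
store.delete("b")   # chunk 1 now holds no live data
reclaimed = store.collect()
print(reclaimed)
```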
TRANSACTION FLOW (READ)
1. Read Object request (Node 1)
2. Get Location (Node 2)
3. Read data
4. Send data back
BOX CARTING: CHUNK WRITE
Node
Buffered Writer
Acks (PARALLEL SYNC WRITE)
DATA WRITTEN IN APPEND ONLY CHUNKS
• Data is written in an append-only pattern.
• No data is overwritten or modified.
• No locking required for I/O.
• No cache invalidation required.
• Journaling, snapshot and versioning natively built-in
• ECS stores all types of data and index in “chunks”
• Chunks are
– Logical containers of contiguous space (128MB)
– Written in an append-only pattern
• All data protection operations are done on chunks
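The append-only pattern described above can be sketched as a small in-memory log. `ChunkLog` is a hypothetical name; the 128 MB default mirrors the chunk size on the slide, and the usage example shrinks it to 8 bytes so that chunk rollover is visible. Objects larger than a chunk are out of scope for this sketch.

```python
CHUNK_SIZE = 128 * 1024 * 1024  # 128 MB, as stated on the slide


class ChunkLog:
    """Toy append-only store: nothing is overwritten, updates add versions."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = [bytearray()]
        self.versions = {}  # object name -> list of (chunk no, offset, length)

    def write(self, name, data):
        """Append data as a new version of `name`; older versions stay put."""
        if len(data) > self.chunk_size:
            raise ValueError("objects spanning chunks are not modelled here")
        if len(self.chunks[-1]) + len(data) > self.chunk_size:
            self.chunks.append(bytearray())  # current chunk is full: open a new one
        chunk_no = len(self.chunks) - 1
        offset = len(self.chunks[chunk_no])
        self.chunks[chunk_no] += data
        self.versions.setdefault(name, []).append((chunk_no, offset, len(data)))

    def read(self, name, version=-1):
        """Read a version of `name` (latest by default)."""
        chunk_no, offset, length = self.versions[name][version]
        return bytes(self.chunks[chunk_no][offset:offset + length])


log = ChunkLog(chunk_size=8)  # tiny chunks, for demonstration only
log.write("a", b"v1")
log.write("a", b"v2!")        # an update is just another append
log.write("b", b"12345")      # does not fit in chunk 0, so chunk 1 is opened
print(log.read("a"), log.read("a", 0), len(log.chunks))
```

Because existing bytes never change, readers need no locks and cached data never has to be invalidated, which is the property the bullets above rely on.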
ECS STORAGE ENGINE: KEY BENEFITS
• All nodes can process write requests for the same object simultaneously, and write to different sets of disks.
• Throughput takes advantage of all spindles and NICs in cluster.
• Payloads from multiple small objects are aggregated in memory and written in a single disk write
• Efficient storage for both small and large data
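The aggregation of small payloads (box carting) can be sketched with a buffered writer that batches pending objects and flushes them in one write. `BoxCartWriter`, its flush threshold, and the byte-array "disk" are assumptions made for illustration; the slides do not give the real buffering policy.

```python
class BoxCartWriter:
    """Toy buffered writer: many small payloads, one disk write per batch."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.pending = []        # (name, payload) pairs waiting in memory
        self.pending_bytes = 0
        self.disk = bytearray()  # stand-in for the chunk on disk
        self.disk_writes = 0
        self.locations = {}      # object name -> (offset, length)

    def write(self, name, payload):
        """Queue a payload; flush once enough bytes have accumulated."""
        self.pending.append((name, payload))
        self.pending_bytes += len(payload)
        if self.pending_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write every pending payload to disk as a single append."""
        if not self.pending:
            return
        base = len(self.disk)
        batch = bytearray()
        for name, payload in self.pending:
            self.locations[name] = (base + len(batch), len(payload))
            batch += payload
        self.disk += batch       # one disk write for the whole batch
        self.disk_writes += 1
        self.pending.clear()
        self.pending_bytes = 0


writer = BoxCartWriter(flush_threshold=10)
for i in range(5):
    writer.write(f"obj{i}", b"abc")
writer.flush()                   # flush the remainder
print(writer.disk_writes, writer.locations["obj4"])
```

In this sketch a client ack would only be safe after `flush` has run for the batch containing its object.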
Unstructured configurations– Object & HDFS
Available in multiple capacities within a rack
Clustering across racks scales to 100s of PBs
HARDWARE CONFIGURATIONS
ANALYTICS
SAMPLE HADOOP WORKFLOW
HDFS
Analytical Models (Hive, HAWQ)
Data Visualizations (Tableau)
Variety of Data Sources
Data Cleansing
Ingest
Data Scientists
Store
Analyze
Surface
HADOOP PROCESSING MODEL
SHARED STORAGE MODEL
[Diagram: storage platforms VNX, VMAX, Isilon, 3rd Party and Commodity]
• Enables common Data Lake for LOB application storage and analytics
• Scale compute & storage independently
• Multiple distributions, clusters connect to the same data
ECS
CHALLENGES WITH HDFS
• HDFS not Enterprise-Grade
– Requires three full copies of data, no erasure coding
– No Geo-distribution, Limited DR, Multi-tenancy
– Inefficient for handling small files
• High Availability Still In Progress
– No Active-Active Failover even with the secondary NN
• DAS architecture not suitable for some customers
– Lack of Enterprise Data Governance Features
HDFS DATA SERVICE OVERVIEW
• Addresses limitations of off-the-shelf HDFS
• Brings HDFS to existing storage hardware
• Enables HDFS/Object/File scenarios
• Flexible software model allows for future co-location of compute and storage
HDFS DATA SERVICE OVERVIEW
• API head
– Custom client/server protocol optimized for high scale
– Uses the same unstructured storage engine as ECS/ViPR Object data service
• Client library over the HDFS API
– Provides a viprfs:// drop-in replacement for HDFS 2.0
– Can be seamlessly added to existing Hadoop distributions
• Implemented as a Hadoop Compatible Filesystem (HCFS)
– Supports HDFS 2.0 and 2.2
HDFS ARCHITECTURE
[Diagram: Client, NAME NODE, SEC NAME NODE and RM / AsM in front of Commodity Compute & Storage nodes; each node runs a Node Manager and MapReduce Tasks next to a local Data Store]
ViPR/ECS HDFS ARCHITECTURE
[Diagram: Client, NAME NODE, SEC NAME NODE and RM/AsM; each compute node runs a Node Manager and MapReduce Tasks, and each task reaches storage through a ViPR/ECS Client]
HDFS – ECS APPLIANCE DEPLOYMENT
[Diagram: Customer’s Hadoop Compute Cluster performs Data Read/Write against Object/HDFS nodes; three ViPR Controller VMs]
HDFS DATA SERVICE/ECS ARCHITECTURE
[Diagram: Customer’s Hadoop Compute Cluster performs Data Read/Write via ViPR HDFS against a ViPR Data Service Node, where the HDFS API Head and S3 API Head sit on the ECS Storage Engine]
HDFS DATA SERVICE VALUE PROPOSITION
• High Availability Built-In, No SPOF
• Avoids multiple copies of data
• Erasure Coding Support
• Geo-Distributed Across Sites
• Multi-tenancy, Metering, Chargeback
• Allows Byte-Range Updates Through S3 Interface
• ViPR Controller aids in Management & Monitoring
UPCOMING ECS INTEGRATIONS
ISILON CLOUDPOOLS
SMART TIERING TO OBJECT STORES
Key Features
• Stub to Cloud of choice
• Extending SmartPools workflow to CloudPools
• Ability to send encrypted data to the cloud
• Compression for efficient transport
• Simple policy based management
Benefits
• Combine file & object store benefits
• Use stubs to optimize local storage space, with offsite archive protection
• Seamless placement and availability of data per policy
• One Accessible namespace
[Diagram: Clients (SMB | NFS | REST | HDFS | SWIFT) access OneFS; SmartPools -> CloudPools tiers data to a Service Provider, Public Cloud or ECS]
EMC CLOUDBOOST
LONG TERM RETENTION TO THE CLOUD
Key Features
• Long-term retention to ECS for NetWorker, Avamar, NetBackup
• Inline variable de-duplication and compression
• Data encrypted in-flight and at rest
• Cloud choice: private/public clouds
• Appliance cache for ROBO
• Capacity of up to 6PB logical per appliance
• Central management via a cloud portal
Benefits
• Lower storage cost per TB
• Efficiency: Reduced network and storage consumption
• Lower risk, operational overhead than tape.
• Airtight security
[Diagram: Desktops/Laptops, Files (NAS/NDMP), VMware & Hyper-V, Databases, Email, Applications and ROBO sites, protected by NetWorker, Avamar, NetBackup, connect over LAN to the CloudBoost appliance and over LAN/WAN to the cloud]
WRITE PATH, READ PATH, BOX CARTING
TRANSACTION FLOW (WRITE)
1. Create Object request (name, data, metadata) arrives at Node 1
2. Write of data and metadata in chunk (Node 1, Node 4, Node 5). All three copies are written in parallel. Write is successful only if all copies ack.
3. Index update (name, location) to the owner partition (Node 2).
4. Journal write
5. Back to client
ERASURE CODING
Data is written into chunks - 3 copies
Once a chunk fills to 128 MB, erasure coding starts
Once erasure coding completes and the data is protected, the 3 copies are deleted.
GARBAGE COLLECTION
In append-only systems, updates and deletes leave files with blocks of data that are no longer used.
Garbage collection is done at the chunk level.
Unused chunks are reclaimed by a background task.
TRANSACTION FLOW (READ)
1. Read Object request (Node 1)
2. Get Location (Node 2)
3. Read data
4. Send data back
BOX CARTING: CHUNK WRITE
Node
Buffered Writer
Acks (PARALLEL SYNC WRITE)
ECS GEO-CAPABILITIES
• ECS GEO-STORAGE OVERVIEW
• DATA PROTECTION
• GLOBAL DATA ACCESS
ECS GEO-STORAGE OVERVIEW
• Data Protection
– Protection against data center failure
– Seamless failover and recovery
• Global Data Access
– Global namespace
– Ability to read/write data from any site
• Optimized Storage
– Low storage overhead
– WAN Optimization
• Applicable to all unstructured Storage Engine based heads
– Object, HDFS
– File when available
DATA PROTECTION
INDUSTRY SOLUTION: MIRROR COPY
• Mirrored copy in a backup site
• Benefit: Achieves Local reconstruction on hardware failure
• Shortcoming: Storage overhead -> 2.66x
Primary Secondary
INDUSTRY SOLUTION: DISTRIBUTED ERASURE CODING
• Distributing fragments across sites
• Benefit: Achieves low Storage Overhead ~ 1.6x
• Shortcoming: Disk/Node failure requires fragments to be fetched over the WAN.
Site 1 Site 2
Site 3 Site 4
ECS MODEL: BEST OF BOTH WORLDS
• Achieves low Storage Overhead ~ 1.8x
• Local hardware failure recovery requires no WAN traffic.
• Handles local hardware and full data center failures
– Disk, Node, Rack, Data Center are failure domains
Site 1 Site 2
Site 3 Site 4
GLOBAL DATA ACCESS
INDUSTRY SOLUTION: SEGREGATED NAMESPACE
• Customers are asked to pick a location for each bucket.
• Shortcoming: sites are vertical silos, unaware of each other’s namespaces.
Site 1 Site 2
app app
Bucket B Bucket A
INDUSTRY SOLUTION: MULTI-ACCESS WITH EVENTUAL CONSISTENCY
• Global Namespace with read only replicas.
• Replicas have eventual consistency
• Shortcomings: Difficult to write applications against eventual consistency models
Site 1 Site 2
app
Bucket A
app
read only
ECS: MULTI-ACCESS WITH STRONG CONSISTENCY
• Global Namespace: buckets stretch across sites
• Global Access: Any data can be read from and written to any site
• Strongly consistent: Always returns the latest version without requiring synchronous writes.
Site 1 Site 2
Bucket A
app app app
OPTIMIZED STORAGE
• Low Storage Overhead: ~1.8x replication overhead across 4 sites
• WAN Optimization: All node and disk failures are repaired within the site, without any WAN traffic.
STORAGE OVERHEAD
# of Data Centers    Overhead
1                    1.33x
2                    2.67x
3                    2.00x
4                    1.77x
5                    1.67x
6                    1.60x
7                    1.55x
8                    1.52x
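The slide does not say how these figures are derived. They agree, to within 0.01, with a simple model in which every site carries the single-site overhead of 4/3 (1.33x) and, for N ≥ 2 sites, geo protection multiplies it by N/(N−1). The sketch below encodes that assumed model and compares it with the quoted values; `geo_overhead` is a hypothetical helper, not a formula taken from the deck.

```python
LOCAL_OVERHEAD = 4 / 3  # single-site overhead, quoted as 1.33x in the table


def geo_overhead(sites):
    """Assumed model: 4/3 locally, times N/(N-1) once there are N >= 2 sites."""
    if sites < 1:
        raise ValueError("need at least one site")
    if sites == 1:
        return LOCAL_OVERHEAD
    return LOCAL_OVERHEAD * sites / (sites - 1)


# Values quoted in the table above.
QUOTED = {1: 1.33, 2: 2.67, 3: 2.00, 4: 1.77, 5: 1.67, 6: 1.60, 7: 1.55, 8: 1.52}

for sites, quoted in QUOTED.items():
    print(f"{sites} sites: model {geo_overhead(sites):.3f}x, table {quoted:.2f}x")
```

Under this model the overhead peaks at two sites (where the second site effectively holds a full mirror) and falls toward 1.33x as more sites are added.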
ECS GEO KEY DIFFERENTIATORS
• Tolerates one site disaster along with up to 2 node failures in all the rest of the sites.
• Component failures are recovered using fragments from local site without WAN traffic
• Geo-efficient (~1.8 copies across 4 sites) without WAN read/write penalties
ECS APPLIANCE
Not Contradictory!
Components are Commodity
x86 Servers
Ethernet Networking
SATA Disk Drives
Innovation in how they’re put together to enable reliability, availability, and serviceability!
COMMODITY INNOVATION
ECS APPLIANCE CHARACTERISTICS
• Use COTS Components
– Economies of scale
• Density Optimized
– Up to 72TB Raw / Rack Unit
– Saves Power/GB, Real Estate costs, etc.
• Labor Optimized
– Manage the cluster, not the devices
– Maximize Serviceability
• Protection Efficiency
– Geo-efficient storage