storage for hpc, hpda and machine learning (ml) · · 2017-07-17storage for hpc, hpda and machine...
TRANSCRIPT
Storage for HPC, HPDA and
Machine Learning (ML)
Frank Kraemer, IBM Systems Architect
mailto:[email protected]
IBM Data Management for Autonomous Driving (AD)
▪ significantly increase development
efficiency by reducing manual efforts for
video tagging, eliminated wasted time for
data search and manual data copy/move
processes and by automating workflows
▪ significantly increase test through-put,
allowing you to run more test cases in
less time, therefore increasing time-to-
market as well as the quality of your camera
and ADAS products
▪ to reduce IT costs for local storage
hardware by globally centralizing data
▪ increase the entire flexibility through the
ability to move work-load from one place
to another
▪ guarantee long-year data verifiability and
recoverability of test data via archiving
Introducing IBM Spectrum Scale
Remove data-related bottlenecks
Demonstrated 400 GB/s throughput, building to 2.5TB/s
Local caching for Read and Write
Enable global collaboration
Data Lake serving HDFS, files & object across sites
Multi-cluster configurations; Sync & Async
Optimize cost and performance
Up to 90% cost savings & 6x flash acceleration
Transparently Tier to Cloud
Ensure data availability, integrity and security
End-to-end checksum, Spectrum Scale RAID, NIST/FIPS certification
Compression, Encryption, Audit Logging
Highly scalable high-performance
unified storage for files and objects with integrated analytics
The History of Spectrum Scale
* Gartner, Magic Quadrant for Distributed File Systems and Object Storage, 20 October 2016, Document No. G00307798
Infrastructure requirement IBM answer
Scalability Parallel File System
Flexibility Software Defined Storage
Agility Unified Storage
New gen workloads HDFS connector
Performance Parallel File system
Cloud OpenStack integration & Transparent Cloud Tiering
Object Unified Storage
Global AFM for multi-cluster storage
IBM Storage for unstructured data
Spectrum Scale: The flexible cognitive Storage Solution
Client workstations
Users and applications
HPC & AI
Computefarm
Big DataAnalytics
Global name space
IBM Spectrum ScaleAutomated data placement and data migration
SMB/CIFSNFSPOSIX HDFS Controller
Disk Tape Storage RichServers
Flash
On/Off Premise
OpenStack
Cinder Swift
Glance Manila
TransparentCloud Tiering
▪Site B
▪Site A
▪Site C
Cloud Data Sharing
Users and applications
Object
Container
Docker
Kubernetes
text
Highly Available Write Cache (HAWC)• Improves performance of small synchronous writes• Small synch writes are written to the log. As log fills, rewrite to home
Local Read Only Cache (LROC)• Extend the page pool memory to include local DAS/SSD for read caching
Compression• Compress what makes sense & extends to cache
Quality of Service• Throttle background functions such as rebuild or async replication• Set by flexible policy, such as day-of-week and time-of-day
Distributed and flash accelerated metadata• Metadata includes directories, inodes, indirect blocks
Lift data to the highest tiers based on the file’s “heat”• Automate workload pipelines with Spectrum Computing LSF
IBM Spectrum Scale performance featuresPerformance Leadership for large and small files
Advanced File Management (AFM)Tie together multiple clusters to serve users across the globe
Spans geographic distance and unreliable networks • Caches local ‘copies’ of data distributed to one or more
Spectrum Scale clusters
• Low latency ‘local’ read and write performance
• As data is written or modified at one location, all other locations see that same data
• Efficient data transfers over wide area network (WAN)
Speeds data access to collaborators and resources around the world• Unifies heterogeneous remote storage
Asynchronous DR is a special case of AFM • Bidirectional awareness for Fail-over & Fail-back with data integrity
• Recovery Point Objectives for volume & application consistency
HPC Performance with simplified user accessTransparent Tiering & Data Migration
|
1
0
Analyze and Archive In-Place
▪ Enterprise HPC with Flash for performance
▪ Network Shared Disk for modular scaling
▪ Tier data based upon policy, users actions or workflow
▪ Lower economics with tape, object, or cloud
▪ Data always available to end-users
▪ Auto-migrate to higher tiers
▪ Full data, not stubs
▪ Global namespace extends across physical storage and
multiple sites
▪ General High capacity tiered NAS with fast data
ingest/retention/share and long term retention
▪ Deployed today in multiple clients
System pool
(Flash)
Gold pool
(Disk)
Tape Library
NFS/SMB
Cluster
2nd Site
text
Unified File & Object + HDFSStore everywhere. Run anywhere.
Challenge: Object storage for data & cloud• Seamless scaling• RESTful data access• Object metadata replaces hierarchy
IBM Spectrum Scale Swift & S3• High-performance for object• Native OpenStack Swift support w/ S3• File or object in; Object or file out• Enterprise data protection & Features
Full OpenStack cloud support • Cinder (block), Manilla (file), Swift (object)
text
Analytics without complexityStore everywhere. Run anywhere.
Challenge: Separate storage systems for ingest, analysis, results
• HDFS requires locality aware storage (namenode)• Data transfer slows time to results• Different frameworks & analytics tools use
data differently
HDFS Transparency• Map/Reduce on shared, or shared nothing storage• No waiting for data transfer between storage systems• Immediately share results• Single ‘Data Lake’ for all applications• Enterprise data management• Archive and Analysis in-place
Ingest
ObjectFile
Direct Access
POSIX
Raw Data Analysis
Comparing HDFS v. IBM Spectrum Scale
DASS Initial Serial Performance
• Compute the average temperature for
every grid point (x, y, and z)
• Vary by the total number of years
• MERRA Monthly Means (Reanalysis)
• Comparison of serial c-code to
MapReduce code
• Comparison of traditional HDFS
(Hadoop) where data is sequenced
(modified) with GPFS where data is
native NetCDF (unmodified, copy)
• Using unmodified data in GPFS with
MapReduce is the fastest
• Only showing GPFS results to
compare against HDFS
Preliminary Results
• Presented
SC16 U/G
• 20 nodes
(compute & storage)
• Cloudera
Map/Reduce
http://files.gpfsug.org/presentations/2016/SC16/06_-_Carrie_Spear_-_Spectrum_Sclale_and_HDFS.pdf
text
IBM Spectrum Scale: Transparent Cloud TieringSingle namespace and control of data placement for hybrid cloud
Intelligent data placement
• On or off-premises objects
Policy driven tiering• Managed data placement or migration
of cold data
Automated data movement• Recall on user demand
IBM Spectrum Scale• High-performance
• Single namespace
• Unified file, object and HDFS
• Encrypted
• Secure data in cloud
text
IBM Spectrum Scale: Cloud Data SharingPolicy-driven data movement for hybrid cloud
Managed data sharing
• Policy driven replication and
synchronization
• Granular control: Type, action,
metadata or heat
Bridging cloud and file
• Storage-to-storage
• Data and metadata
Automated data movement
• Secure, reliable connection
• High-speed and scalable
• Clustered configurations IBM Spectrum Scale• High-performance file, object and HDFS
• Clustered, tiered and scalable
• Bridge legacy applications and new
workloads
Cloud storage• Cloud native applications
• Dev/Ops development
• New workloads
IBM Spectrum ScaleFeatures and Benefits
Simplified, self-tuning options
New GUI & health monitoring
Unified File, Object & HDFS
Distributed metadata &
high-speed scanning
QoS management
1 Billion Files &
yottabytes of data
Multi-cluster and system
management integration
with IBM Spectrum Control
Advanced routing with latency
awareness
Read or Write Caching
Active File Management for
WAN deployments
File Placement Optimization
End-to-end data integrity
Snapshots
Sync or Async DR
zLinux support
Tier seamlessly
Incorporate and share flash
Policy driven compression
Data protection with erasure
code and replication
Native Encryption and Secure
Erase compliance
Target object store and cloud
Leading performance for
Backup and Archive
Heterogeneous commodity
storage: Flash, disk & tape
Software, appliance or Cloud
Data driven migration
to practically any target
File/Object In/Out with
OpenStack SWIFT & S3
Transparent native HDFS
Integration with cloud
Storage management at scale
Store everywhere. Run anywhere.
Improve dataeconomics
Software DefinedOpen Platform
17
ESS New Generation
Performance and Capacity
Announced on April 11, 2017
New all Flash options in Q3
Spectrum
Scale
ESS
New! Model GL2S: 2 Enclosures, 14U
166 NL-SAS, 2 SSD
New! Model GL4S: 4 Enclosures, 24U
334 NL-SAS, 2 SSD
New! Model GL6S: 6 Enclosures, 34U
502 NL-SAS, 2 SSD
ESS 5U84
Storage
ESS 5U84
Storage
Max: .9PB raw Max: 1.6PB raw Max: 1.8PB raw Max: 3.3PB raw Max: 2.8PB raw Max: 5PB raw
Model GL2: 2 Enclosures, 12U
116 NL-SAS, 2 SSD
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
Model GL6:6 Enclosures, 28U
348 NL-SAS, 2 SSD
Model GL4: 4 Enclosures, 20U
232 NL-SAS, 2 SSD
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
ESS
5U84
Storage
36 GB/s
25 GB/s
17 GB/s
12 GB/s
8 GB/s
24 GB/s
18
ESS All Flash ESS: All Flash Performance *NEW*
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
EXP3524
8
9
16
17
Model GS1S
24 SSD
EXP3524
8
9
16
17
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
EXP3524
8
9
16
17
Model GS2S
48 SSD Drives
EXP3524
8
9
16
17
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
System x3650 M40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
EXP3524
8
9
16
17
EXP3524
8
9
16
17
EXP3524
8
9
16
17
Model GS4S
96 SSD Drives
All Flash Speed
368 Raw TB
14 GB/sec736 Raw TB
26 GB/sec
1472 Raw TB
40 GB/sec
Capacity calculations are
based on 15.36 TB SSD’s
Performance Numbers are
based on 4MB blocksize and
100% Reads without any
cache hits. Writes are typically
20% less than Read
Performance
All numbers are based on 100
Gb EDR (4 ports connected
per ESS Node)
IBM Software-Defined Storage Portfolio
Flash
AnyStorage
Cloud Services
Family of Storage Management
and Optimization Software
Private, Publicor Hybrid Cloud
Storage Rich Servers
Secure Efficient HybridCloud
High-Performance
Simplified copy data management that can
increase business velocity and efficiencyIBM Spectrum CDM
High-performance, highly scalable
hybrid cloud storage for unstructured dataIBM Spectrum Scale
Highly flexible, scale-out enterprise block
storage for hybrid clouds that deploys in minutesIBM Spectrum Accelerate
Long term retention for active archive data that lowers costs up to 90% by delivering a fast tape file system
IBM Spectrum Archive
Virtualization and optimization of of hybrid cloud block environments that helps improve flexibility and stores up to 5x more data
IBM Spectrum Virtualize
Optimized hybrid cloud data protection that can simplify restores and reduce backup costs by up to 53 percent
IBM Spectrum Protect
Hybrid cloud storage and data management that helps optimize applications and reduce costs by up to 73%
IBM Spectrum Control
Flexible and economical scalable hybrid cloud object storage with geo-dispersed enterprise availability and security
IBM Cloud Object Storage