ibm storage strategy highlights - oljedirektoratet · ibm storage strategy highlights alternative...
© 2017 IBM Corporation
IBM Storage Strategy Highlights
Alternative storage onboard vessels and datacenters
Dr. Robert Haas, IBM CTO Storage Europe
August 24th, 2017
Outline
1. Flash and Beyond: leveraging NVMe, and optimizing the full software stack
2. Supporting New Workloads: Spectrum Scale and our Object offerings
3. Object Storage on Tape
• Extending OpenStack Swift for High Latency Media: expose high-latency media to object-storage based apps
• Providing OpenLTFS as open-source entry point towards Spectrum Archive Enterprise Edition
4. Cognitive Storage: A quantum step beyond EasyTier
IBM Storage & SDI
IBM Storage Awards and Recognition
• IBM Storage (2013, 2015 & 2016): IBM Malaysia Customer Care Award Winner
• IBM Storage, Singapore Readers Choice Awards: Storage Hardware (2015 & 2016), Enterprise Disk Systems (2013)
• IBM Storwize V7000F (2016): Product of the Year Finalist, All-flash Systems
• IBM FlashSystem A9000 (2016): Product of the Year Finalist, All-flash Systems
• IBM FlashSystem A9000R (2016): Product of the Year Finalist, All-flash Systems
• IBM FlashSystem 900 (2016): Best of Show, Most Innovative Flash Memory Enterprise Business Application
• IBM FlashSystem 9000 Family (2017): iF International Design Award
• IBM FlashSystem (2015 - 2016): A Major Player in All-Flash Array by IDC MarketScape [4]
• IBM FlashSystem (2014 - 2016): A Leader, Magic Quadrant for Solid State Arrays [1]
• IBM Storwize, XIV, DS8000 (2013 - 2016): A Leader, Magic Quadrant for General-Purpose Arrays [1]
• IBM Spectrum Storage (2014 - 2016): #1 Software-Defined Storage Controller Software [2]
• IBM Spectrum Protect (2011 - 2016): A Leader, Magic Quadrant for Data Center Backup & Recovery Software [1]
• IBM Spectrum Scale & IBM Cloud Object Storage (2016): A Leader, Magic Quadrant for Distributed File Systems & Object Storage [1]
• IBM Spectrum Scale (2016): Top File Management Storage Software Package
• IBM (2016): Top File Management Storage Software Supplier
• IBM Spectrum Virtualize Software (2016): Winner, Virtualization Category, Tech Innovator Awards
• IBM Spectrum Virtualize Software-Defined Flash Storage (2016): Top 10 Coolest Flash Storage and SSD Products
• IBM Cloud Object Storage (2016): Market Leader, Scale-Out Object Storage Software
• IBM Cloud Object Storage (2016): Highest Scores for Analytics, Archiving & Cloud Storage Use Cases, Critical Capabilities for Object Storage [1]
• IBM Cloud Object Storage (2016): A Leader in Object-Based Storage by IDC MarketScape [5]
• IBM Tape Library (2014 - 2016): Market Leader
• IBM Tape (2006 - 2016): #1 Branded Tape Revenue [3]
[1] Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and is used herein with permission. All rights reserved.
[2] IDC WW Quarterly Storage Software Qview 4Q, 2016.
[3] IDC Custom Tape Report, 2H’16
[4] IDC MarketScape WW All-Flash Array 2015-2016 #US40721815 Dec. 2015
[5] IDC MarketScape WW Object-Based Storage 2016 #US41918416 Dec. 2016
Broadest Storage and Software-Defined Portfolio
IBM Storage Solutions: IBM Flash Solutions make fast storage simple
IBM All Flash
• IBM FlashSystem A9000
• IBM FlashSystem A9000R
• IBM FlashSystem V9000
• IBM FlashSystem 900
• IBM DS8888
• IBM Storwize V7000F/V5030F
• IBM DeepFlash 150
IBM Hybrid Storage
• IBM DS8884/DS8886
• IBM XIV Storage System
• IBM Storwize V7000/V7000U/V5030/V5020/V5010
IBM Elastic Storage Server
IBM Software Defined Storage: Software-defined storage to speed innovation and hybrid cloud
• IBM Spectrum Storage Suite
• IBM Spectrum Control
• IBM Spectrum Protect
• IBM Spectrum Accelerate
• IBM Spectrum Archive
• IBM Spectrum Scale
• IBM Spectrum Virtualize
• IBM Spectrum Copy Data Mgmt
IBM Cloud Object Storage
IBM Converged Infrastructure: Faster applications, faster time to benefits, easy, efficient and versatile, certified and tested for you
VersaStack
• IBM FlashSystem V9000
• IBM FlashSystem A9000
• IBM FlashSystem 900
• IBM Storwize V7000/V7000F/V7000U
• IBM Storwize V5030F/V5030/V5020/V5010
• IBM SAN Volume Controller
IBM PurePower
• IBM Storwize V7000
IBM Software Defined Computing: Defining a new generation of software-defined computing infrastructure
• IBM Spectrum Symphony
• IBM Spectrum LSF
• IBM Spectrum Conductor
IBM Business Continuity & Connectivity: Tape Storage for data protection and long term retention. Storage Networking for increased performance, security and flexibility
IBM Tape & Virtual Tape Systems
• TS7700, TS7760
• Tape Libraries
• LTO7 and enterprise tape drives
• ProtecTIER Deduplication
IBM Storage Networking (SAN)
• Directors
• Switches
• Specialty Switches
Flash and Beyond
Why NVMe will unlock the potential of Flash, driving the need to optimize the full software stack
Major Shifts in the Storage Industry with Flash
• Primary Storage:
  • Flash adoption is accelerating and starting to replace hybrid for many new installations
• Secondary Storage and unstructured data
  • Flash density, management and environmental factors driving flash into this market
• NVMe brings a true flash optimized interface to the masses
  • SCSI was invented for HDDs (although modified in recent years for Flash)
• Storage class memories about to burst on the scene
• New applications driving new workloads, Cognitive, Real Time analytics – all have new constraints on the storage
Flash Adoption is Accelerating
3D has allowed the scaling of flash to hyper accelerate!
• Flash media cost improving at 25-30% CAGR
• 3D is hyper accelerating this trend
• Capacities going through the roof
• Even QLC going to happen
• Allows for low latency and high IOPS
• Fast enough to handle Data Reduction built in
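As a worked illustration of the 25-30% CAGR figure above, the following sketch compounds a fixed annual cost decline over several years. It assumes that "improving at 25-30% CAGR" means the cost per unit of capacity drops by that fraction each year; the normalized starting cost of 1.0, the five-year horizon, and the function names are illustrative assumptions, not figures from the presentation.

```python
import math


def projected_cost(start_cost: float, annual_decline: float, years: float) -> float:
    """Cost after `years` of compounding at a fixed annual decline rate."""
    if not 0.0 <= annual_decline < 1.0:
        raise ValueError("annual_decline must be in [0, 1)")
    return start_cost * (1.0 - annual_decline) ** years


def years_to_halve(annual_decline: float) -> float:
    """Years needed for the cost to drop to half at a fixed annual decline rate."""
    if not 0.0 < annual_decline < 1.0:
        raise ValueError("annual_decline must be in (0, 1)")
    return math.log(0.5) / math.log(1.0 - annual_decline)


if __name__ == "__main__":
    for rate in (0.25, 0.30):  # the two ends of the range quoted on the slide
        cost_after_5 = projected_cost(1.0, rate, 5)  # 1.0 = normalized starting cost (assumed)
        print(f"{rate:.0%}/year: cost after 5 years = {cost_after_5:.3f}, "
              f"halves every {years_to_halve(rate):.2f} years")
```

Under these assumptions the cost halves roughly every two to two and a half years, which is the kind of compounding that makes all-flash configurations viable for new installations.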
Flash technology can be used in many forms …
IBM Systems Flash Storage Offerings
• All-Flash Array (AFA): All-Custom Flash Hardware (CFH) or All-SSD
• Hybrid-Flash Array (HFA): Mixed (HDD/SSD/CFH)
CFH defines an architecture that uses optimized flash modules to provide better performance and lower latency than SSDs. Examples of CFH are:
• High-Performance Flash Enclosure Gen2
• FlashSystem MicroLatency Module
All-flash arrays are storage solutions that use only flash media (CFH or SSDs), designed to deliver maximum performance for applications and workloads where speed is critical.
Hybrid-flash arrays are storage solutions that support a mix of HDDs, SSDs and CFH, designed to provide a balance between performance, capacity and cost for a variety of workloads.
IBM AFA Portfolio
• V5000 / V7000 – AFA SSD based
• V9000 – AFA CFH based
• A9000 – AFA CFH based
• DS8880 – AFA CFH based
Source: IDC's Worldwide Flash in the Datacenter Taxonomy, 2016
NVMe Performance Potential
• Intel presented this chart at Flash Memory Summit 2016 showing how the latency of storage devices is rapidly decreasing, leading to the need to decrease software and networking latency with higher-speed networks (like 25GbE) and RDMA
https://www.mellanox.com/blog/2016/12/storage-predictions-for-2017/
Hardware Performance Does Not Trickle Up the Stack
HW: 1000x higher throughput, lower latency!
Only 2x better application performance!
[Figure: software stack between applications and hardware]
Applications: “Traditional” Batch Analytics based on MapReduce (IBM BigInsights Hadoop MR); Next-generation Analytics: Interactive, Streaming, Graph, ML (Spark, Shark, GraphX, MLLib)
System Software: Hadoop FS (HDFS) API, Distributed File Storage, Java Runtime, Operating System, Hypervisor
Hardware: DRAM, Flash, Infiniband
Accelerating Spark analytics performance on modern networks and storage hardware
Crail plugs into the system without changes to Spark or to Spark applications.
Crail plugins instruct Spark when and how to use the Crail storage tier – no user involvement required.
[Figure: Spark input and output use the durable storage tier (reliability, high availability, fault tolerance): IBM Spectrum Scale / Apache HDFS. In-memory processing and memory extension use the performance storage tier (performance-critical, short-lived data): Crail using DAS 3DXP & Flash, or Crail using shared, TOR 3DXP & Flash.]
Crail: https://crail.io
For the New Generation Workloads
Spectrum Scale and Spectrum Archive, and Object offerings
Spectrum Scale: Redefining unified storage
[Figure: users and applications (client workstations, compute farm, traditional applications) access a single name space powered by Spectrum Scale.]
• Access interfaces: SMB, NFS, POSIX, Transparent HDFS
• OpenStack: Cinder, Swift, Glance, Manila
• Object: Swift, S3
• Storage tiers: Flash, Disk, Tape, Shared Nothing Cluster, Off Premise
IBM Spectrum Scale for common workloads
• Big Data Analytics: Archive and analyze in place. Hadoop Transparency.
• Content Repository: Seamless growth. Unified file and object.
• Private Cloud: Data management at scale. Integrated with OpenStack.
• Compute Clusters: Scalable performance & throughput. Advanced routing and caching.
FLAPE (Flash + Tape): Best of Both Worlds
Flash for outstanding performance, micro-latency, and enterprise reliability
Tape for cost-effectiveness, high capacity, scalability, low power consumption & footprint
« HOT » data
« COLD » data
A cloud storage for unstructured data
IBM Spectrum Archive: Tape Integration into Spectrum Scale
• Spectrum Scale plus Spectrum Archive - Changing the economics of storage with low cost file system based storage
• Seamlessly incorporates tape storage to keep data online at much lower costs
  • Data still listed in directories
  • Once data is accessed it is moved to disk
  • Other than longer access times, users have no idea data is stored on tape
• Spectrum Archive and Spectrum Protect (TSM) HSM are complementary products
  • TSM provides enterprise class HSM and backup functions for many environments
  • Spectrum Archive provides tape tier for Spectrum Scale on Linux
• Typical use-cases for Spectrum Archive
  • Long-time archival of scientific datasets, seismic data, etc.
[Figure: a single name space (Spectrum Scale) shared by CIO, Finance and Engineering. Tier 1: Flash (Gold Pool). Tier 2: Disk (Silver Pool). Tier 3: Tape (LTFS) through Spectrum Archive.]
Object Storage
• Object Stores for:
• Enterprise Clouds, and Cloud Storage Services
• Geographically shared storage
• Active Archive and Cold Storage
• Becoming the ‘third storage model’ next to file and block – and likely the largest
[Figure: object storage use cases: Active Archive, Enterprise Collaboration, Business Continuity, Content Repository, Storage as Service, Back-up]
Cloudifying our Tape Storage
Extending OpenStack Swift for High Latency Media
Providing OpenLTFS as open-source entry point towards Spectrum Archive Enterprise Edition
Tape in the Cloud Today
CSP (Intel Super 7)    Tape in Cloud    Implementation
Anonymized             Y                1200 LTO tape drives installed
Anonymized             Y                1300 Jag tape drives by end of 2017
Anonymized             Y                10s of thousands of LTO tape drives installed
Anonymized             N                New promising engagement
Anonymized             N                Had tried tape in past, re-evaluating
Anonymized             N                Had tried tape in past, re-evaluating
Anonymized             N                Investigating
How to build a Zeta Byte of storage on a budget
• Aaron Ogus, Microsoft Azure
• Data@Scale Seattle - June, 2015
How Google Backs Up the Internet
• Raymond Blum, Google Site Reliability
• Fujifilm Global IT Exec Summit - September, 2014
Use Cases:
• Cold Archive
• Backup/Restore
Value Prop:
• Low Cost Storage
• “Air Gap” / offline copy
OpenStack Swift Object Storage on Tape: SwiftHLM
• Augment cloud object storage with a low-cost, cold storage tier
  • Tape, optical, MAID
  • Archive/backup use cases
• Reduced cost
  • E.g. tape up to 6x cheaper than disk (current HW/media specs)
  • Future projections in favor of tape
• Reduced availability
  • Minutes, 10s of minutes, or hours (depending on use case and SLA)
[Figure: a client application uses the standard API (REST) of an OpenStack Swift cluster. Primary storage (HDD) is highly available; archival storage (high-latency, low-cost media) is low-cost. Data moves between the two through archive and restore.]
SwiftHLM (High-Latency Media) Description
• Swift API extension for HLM archiving:
• Migrate (Disk -> High-Latency Media, async)
• Recall (High-Latency Media -> Disk, async)
• Query status for Object (sync)
• Query status for Request (sync)
• Object and container level operations
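The four operations listed above can be illustrated with a small in-memory model. This is a sketch only and does not use or reproduce the actual SwiftHLM interface: the class `ToyHlmStore`, its method names, and the `process_pending` step that stands in for the backend finishing its work are all hypothetical, and container-level operations are not modelled. It shows the split the slide describes: migrate and recall return immediately with a request handle (asynchronous), while the two status queries answer right away (synchronous).

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class Location(Enum):
    DISK = "disk"
    HLM = "high-latency media"


@dataclass
class ToyHlmStore:
    """In-memory stand-in for an object store with an HLM tier (hypothetical)."""
    objects: dict = field(default_factory=dict)   # object name -> Location
    pending: dict = field(default_factory=dict)   # request id -> (operation, object name)
    completed: set = field(default_factory=set)   # ids of finished requests
    _ids: count = field(default_factory=count)    # request id generator

    def put(self, name: str) -> None:
        """Store a new object; new objects start on disk."""
        self.objects[name] = Location.DISK

    def _submit(self, operation: str, name: str) -> int:
        if name not in self.objects:
            raise KeyError(name)
        request_id = next(self._ids)
        self.pending[request_id] = (operation, name)
        return request_id  # returned immediately: the operation itself is asynchronous

    def migrate(self, name: str) -> int:
        """Queue a move from disk to high-latency media."""
        return self._submit("migrate", name)

    def recall(self, name: str) -> int:
        """Queue a move from high-latency media back to disk."""
        return self._submit("recall", name)

    def object_status(self, name: str) -> Location:
        """Synchronous query: where is the object right now?"""
        return self.objects[name]

    def request_status(self, request_id: int) -> str:
        """Synchronous query: has the queued request finished?"""
        if request_id in self.pending:
            return "pending"
        if request_id in self.completed:
            return "done"
        raise KeyError(request_id)

    def process_pending(self) -> None:
        """Simulate the backend completing every queued request."""
        for request_id, (operation, name) in list(self.pending.items()):
            self.objects[name] = Location.HLM if operation == "migrate" else Location.DISK
            self.completed.add(request_id)
        self.pending.clear()
```

A caller would `put` an object, call `migrate`, poll `request_status` until it reports "done", and then see `object_status` report the high-latency tier; `recall` reverses the move.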
[Figure: components shown are the Swift API, the Swift API extension for archiving, the CLI, Swift with the SwiftHLM middleware, extended attributes, the HLM backend, a POSIX file system, a disk cache, and tape / MAID / optical disc media.]
Generic interface for HLM backends, suitable for e.g.:
– IBM Spectrum Archive (LTFS Enterprise Edition)
– IBM Spectrum Protect (TSM/HSM)
– BDT Tape Library Connector (open source)
Available as open-source at https://github.com/ibm-research/swifthlm
Open LTFS Approach Developed with FUSE
[Figure: comparison of two stacks, plus a client layer.]
• Spectrum Archive EE (proprietary): File Interface; GPFS multi-node; DMAPI; TSM-HSM; MMM of LTFS EE: Cluster-wide Request and Tape Mgmt; LTFS LE+; FUSE + Tape
• Open Source (New Development): File Interface; FUSE FS single node; Fuse Connector; Request and Tape Mgmt; Data Cache (Linux File System); LTFS LE; small library or a partition of a big library
• Client: Swift (IceTier) multi-node
• Consider BDT open source collaboration
OpenLTFS Demonstrated Functions
• Migrate 1 million files in a system with a single tape drive (completes in less than 10 min)
• Migration states of files (resident, pre-migrated, migrated)
• UI for tracking progress
• Recall
  • Selective recall: the user explicitly requires recall of a number of files
  • Transparent recall: the user reads a migrated file and OpenLTFS transparently recalls it
• Migration to 2 tape cartridges using two drives
• Replication to two cartridges
• Co-location of files in cartridges
• File path as extended attribute in LTFS LE
[Figure, two panels ("Migrate 1m files" and "Selective & Transparent Recall"): the OpenLTFS overlay at /mnt/lxfs.managed sits on top of XFS at /mnt/lxfs and LTFS LE at /mnt/ltfs.]
Open-sourcing expected in 2H2017
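The three migration states named above (resident, pre-migrated, migrated) can be read as a small state machine. The sketch below is an assumption-laden illustration, not OpenLTFS code: the state meanings follow common HSM usage (resident = on disk only, pre-migrated = on disk and on tape, migrated = on tape only), and the event names and the `read` helper are hypothetical.

```python
from enum import Enum


class State(Enum):
    RESIDENT = "resident"          # assumed: data on disk only
    PREMIGRATED = "pre-migrated"   # assumed: data on disk and on tape
    MIGRATED = "migrated"          # assumed: data on tape only


# (current state, event) -> next state; event names are illustrative
TRANSITIONS = {
    (State.RESIDENT, "copy_to_tape"): State.PREMIGRATED,
    (State.PREMIGRATED, "release_disk_copy"): State.MIGRATED,
    (State.MIGRATED, "recall"): State.PREMIGRATED,
    (State.PREMIGRATED, "modify"): State.RESIDENT,
}


def apply(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.value}") from None


def read(state: State) -> State:
    """Model of transparent recall: reading a migrated file recalls it first."""
    return apply(state, "recall") if state is State.MIGRATED else state
```

Selective recall corresponds to calling `apply(state, "recall")` explicitly for a chosen set of files, whereas transparent recall is the implicit path through `read`.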
Cognitive Storage
Quantum Step in Managing Very Large Storage Infrastructures
Beyond Data Ocean: MetaOcean to Collect and Leverage Metadata
• Automatically index and catalog files and objects from Cloud Object Storage, Spectrum Scale, and Spectrum Archive
• Open, pluggable architecture enables easy integration of new source systems to capture metadata
• Provide rich search API as foundation for advanced data services
• Trigger actions based on metadata using MetaOcean policy engine
• Index and leverage custom metadata via tagging and policy based extraction
[Figure: Cloud Object Storage, Spectrum Scale and Spectrum Archive send events through a Queueing API into a queue. Consumers receive events and index content into the MetaOcean index; a custom consumer extracts custom metadata from objects (e.g., Microsoft Office, Watson APIs). A Search REST API serves the Search / Analysis GUI, Cognitive Data Services and the Visualization Dashboard. The MetaOcean Policy Engine triggers actions based on metadata.]
Cognitive Storage Quantum Leap
• Dealing with phenomenal data growth requires new approaches: Cognitive storage revolutionizes data and infrastructure management
• Need to predict value of data
  • Value of data is related to the insight that can be gained – easily reproducible data has little value
  • By tuning data storage policies based on an assessment of data value, cognitive storage allows optimal allocation of resources for value preservation
  • For a given storage budget, much more insight can be obtained using cognitive storage than by simply treating data based only on its size
• Need to predict accesses to data
  • Even without any access history
Cognitive Storage Case Study: Optimizing Risk of Loss of Value against Storage Overhead
• Example showing 32x lower risk of loss of value at only a 27% higher cost, compared to a simple two-copy replication scheme
[Figure: loss of value (~ risk) plotted against storage overhead (~ cost); the actual cognitive storage solution is marked with the 32x and 27% annotations.]
Cognitive Storage Case Study: Hiding Latency of Tape with Predictive Prefetch
• Predict future accesses to data in archive based on metadata of data accessed in the recent history
• Prefetch data most likely to be accessed to a disk cache before the requests are made
• Conventional algorithms such as FIFO and LRU always incur a first cache miss; predictive prefetching prevents this first cache miss as well as further cache misses by predicting future accesses based on metadata
• For the ASTRON staging server, initial results show that disk cache hit rates can be doubled compared to conventional caching algorithms such as FIFO and LRU
[Figure: cache hits thanks to data prefetching based on two different access prediction methods, compared with cache hits for LRU and FIFO]
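The effect of metadata-driven prefetching on a disk cache can be sketched with a toy simulation. This is not the prediction method used in the case study: it assumes a deliberately simple rule (after an object is accessed, prefetch every object that shares its metadata tag), so unlike the approach described above it still takes the first miss in each group. The trace, the tags, and the names `LruCache` and `hit_rate` are hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def __contains__(self, key) -> bool:
        return key in self._items

    def touch(self, key) -> None:
        """Insert key or mark it most recently used, evicting the oldest if full."""
        self._items[key] = True
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)


def hit_rate(trace, tags, capacity, prefetch):
    """Fraction of accesses in `trace` served from cache.

    trace: list of object names in access order
    tags: object name -> metadata tag (illustrative stand-in for real metadata)
    prefetch: if True, load all objects sharing the accessed object's tag
    """
    members = {}
    for name, tag in tags.items():
        members.setdefault(tag, []).append(name)
    cache = LruCache(capacity)
    hits = 0
    for name in trace:
        if name in cache:
            hits += 1
        cache.touch(name)
        if prefetch:
            for sibling in members[tags[name]]:
                if sibling not in cache:
                    cache.touch(sibling)  # staged from tape before it is requested
    return hits / len(trace)


if __name__ == "__main__":
    tags = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
    trace = ["a1", "a2", "a3", "b1", "b2", "b3"]
    print("LRU only:      ", hit_rate(trace, tags, capacity=4, prefetch=False))
    print("LRU + prefetch:", hit_rate(trace, tags, capacity=4, prefetch=True))
```

In this trace no object is requested twice, so plain LRU never hits, while the tag-based prefetch turns every access after the first in each group into a hit. Real workloads and real predictors will behave differently; the sketch only shows why recency-based policies alone cannot help with first accesses.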
Summary
1. NVMe adoption fueled by decreasing Flash costs; requires new approaches across the full software stack
2. New Workloads with a mix of: Spectrum Scale, Archive, and Object
3. Tape Ready for Object Storage
• High Latency Media extensions for object-storage based apps
• OpenLTFS as open-source entry point towards Spectrum Archive Enterprise Edition
4. Cognitive Storage: A quantum step beyond EasyTier
Notices and Disclaimers
Copyright © 2017 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
Notices and Disclaimers Con’t.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.