© 2016 IBM Corporation
Has Your Data Gone Rogue?
Using IBM Flash and solutions to obtain enhanced business insights
Tony Pearson, IBM
Master Inventor and Senior IT Architect
Client 1: Rebel Alliance
Descriptive Analytics: What is happening?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What might happen next?
Prescriptive Analytics: What actions should we take?
Structured, Repeatable, Linear
OLAP cube
Unstructured, Exploratory, Iterative
Rebels are Inquisitive!
Reports Visualization and Discovery
Hadoop / Spark
Data warehousing
Stream Computing
Integration and Governance
Text Analytics
Business Analyst
Data Scientist
Analyze data
“It’s no longer hard to find the answer to a given question; the hard part is finding the right question. And as questions evolve, we gain better insight into our ecosystem and our business.”
-- Kevin Weil, Lead Analyst at Twitter
Clients are facing explosive growth in Unstructured Data, which is exactly why Analytics is so critical
[Chart: Unstructured Data versus Structured Data, in exabytes, 2009 through 2017. Source: IDC]
Unstructured data growth of 60–80% per year creates web-scale storage needs
Problem – Traditional Legacy Storage Designed for Transactional, Structured Data
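To make the 60–80% annual growth figure concrete, here is a minimal sketch of compound growth. The 1 PB starting capacity and the five-year horizon are assumptions chosen for illustration; they do not come from the IDC chart.

```python
def project_capacity(start_pb, annual_growth, years):
    """Return the capacity at the end of each year under compound growth."""
    return [start_pb * (1 + annual_growth) ** y for y in range(1, years + 1)]


# Assumed starting point: 1 PB of unstructured data, projected over 5 years.
for rate in (0.60, 0.80):
    sizes = project_capacity(1.0, rate, 5)
    print(f"{rate:.0%} per year: " + ", ".join(f"{s:.1f} PB" for s in sizes))
```

Under these assumptions the same data set grows roughly 10 to 19 times in five years, which is the scale problem the slide is pointing at.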
IBM Systems Storage Portfolio
Flash for all primary storage workloads
IBM FlashCore™ Technology Optimized
FlashSystem 900: All flash array for application acceleration
• Extreme performance
• Targeting database acceleration & Spectrum Storage booster
FlashSystem V9000: All flash array, virtualizing the hybrid Data Center
• Best performance with storage services & selectable data reduction
• Targeting database/analytics workloads
FlashSystem A9000: All flash array for cloud service providers
• Best performance with full time data reduction
• Targeting VDI and VMware
FlashSystem A9000R: All flash array for large deployments
• Best performance with full time data reduction
• Targeting mixed workloads
DS8880: High End Server, Mainframe, Power
• Extreme reliability and replication
• Available in All Flash & Hybrid configurations
XIV Gen3: High End Capacity Optimized
Storwize V7000, V7000F: Mid-Range
Storwize V5000, V5000F: Entry / Mid-Range
SVC
DeepFlash Elastic Storage Server (ESS): Big Data Flash
New Class of Flash: Big Data Flash
Scalable capacity and performance at low price points for big data
Big Data Attributes
• Petabyte scale of unstructured data and growing rapidly
• Performance can lead to business results, faster time to insights
• Often do not benefit from data reduction technology, already compressed files
• Written once but read often: video and images
Source: IDC, 2015
“
• Performance consistently better than that of the best HDDs today
• Cost comparable to that of performance-optimized HDDs
• Flash media that leverages flash economics
• Systems implementations that support massive scalability and meet enterprise requirements
• Targeted primarily at big data and secondary storage environments
”
Source: IDC, 2015
Comparison: HDD, DeepFlash, Conventional Flash
• Price: HDD $; DeepFlash $$; Conventional Flash $$$
• Performance: HDD 10s of milliseconds; DeepFlash sub-millisecond; Conventional Flash microseconds
• Attributes (DeepFlash): high ingest rate, low change rate, high read rate
• Attributes (Conventional Flash): extremely latency sensitive, can justify price premium
• Typical use cases (DeepFlash): Big Data analytics (ex: video, health care data), Hadoop, Spark
• Typical use cases (Conventional Flash): VDI, Server Virtualization, Database and Application Acceleration
Not conventional Flash, a new class of Flash: Big Data Flash
Scalable capacity and performance at low price points for big data
• Performance consistently better than that of the best HDDs
• Cost comparable to that of performance-optimized HDDs
• Systems implementations that support massive scalability and meet enterprise requirements
• IBM InfoSphere BigInsights is a 100% standard Hadoop distribution
• By default, open source components are always deployed
• Elect to use proprietary capabilities depending on your needs
• In some cases, proprietary capabilities offer significant benefits
Open standards first, but with freedom of choice
HDFS
YARN
HIVE
MapReduce
PIG
Spectrum Scale
Platform Symphony
Big SQL
Adaptive MapReduce
BigSheets
Share data with non-Hadoop applications and simplify data management
Re-use existing tools and expertise, avoid additional development costs
Boost performance, support time-critical workloads, do more with less
True multi-tenancy to boost service levels and avoid duplication on infrastructure
Simplify access for end-users, minimize software development
Hadoop Analytics – HDFS vs IBM Spectrum Scale™
HDFS: Save Results, Discard Rest
IBM HDFS Transparency Connector allows HDFS-based programs to process data without application changes (100% compatible)
IBM Spectrum Scale
Application data stored on IBM Spectrum Scale is readily available for analytics
Save Results
JFS2
NTFS
EXT4
Data Sources: mashup of structured and unstructured data from a variety of sources
Actionable Insights: provides answers to the Who, What, Where, When, Why and How
Business Intelligence & Predictive Analytics
> Competitive Advantages
> New Threats and Fraud
> Changing Needs and Forecasting
> And More!
Elastic Storage Server (ESS) with Spectrum Scale
5146-GLx models: GL2, GL4, GL6
60-drive 4U drawers
• SSD and Nearline HDD
5146-GSx models: GS1, GS2, GS4, GS6
24-drive 2U drawers
• All SSD
• SSD and 10K HDD
DeepFlash ESS (5147-GFx)
64-drive 3U drawers
• Pre-loaded with 32 drives
• All SSD (8 TB)
[Diagram labels: NSD Client, TCP/IP or RDMA, Elastic Storage Server, IBM POWER8 servers, Twin-tailed]
New Big Data alternative: instead of HDD, use Big Data Flash
For clients who value application response time and/or throughput per rack unit
Improve application response time by 8X
Improve throughput/rack unit by 2.8X
Improve MTBF
Improve power & cooling costs by 30%-50%
8X faster response time and same throughput as the HDD version
Move from this Big Data HDD configuration: File Server with Hard Drives, 28U, 25 GB/s
To this Big Data Flash configuration: File Server with All Flash, 10U, 25 GB/s
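The 2.8X throughput per rack unit figure follows directly from the two configurations on this slide (25 GB/s in 28U versus 25 GB/s in 10U). A minimal sketch of the arithmetic, with a helper name chosen here purely for illustration:

```python
def throughput_per_rack_unit(gb_per_sec, rack_units):
    """Return throughput divided by rack space, in GB/s per rack unit (U)."""
    return gb_per_sec / rack_units


hdd_density = throughput_per_rack_unit(25, 28)    # Big Data HDD configuration
flash_density = throughput_per_rack_unit(25, 10)  # Big Data Flash configuration
improvement = flash_density / hdd_density

print(f"HDD:   {hdd_density:.2f} GB/s per U")
print(f"Flash: {flash_density:.2f} GB/s per U")
print(f"Improvement: {improvement:.1f}X")
```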
IBM DeepFlash 150 storage enclosure
Introducing IBM DeepFlash™ Elastic Storage Server
8X faster response time, 8X lower latency compared to HDD version*
ESS GF1: 1 Flash Enclosure, 7U
• 180 TB of usable Flash
• Max Read 13.6 GB/sec; Max Write 9.3 GB/sec
ESS GF2: 2 Flash Enclosures, 10U
• 360 TB of usable Flash
• Max Read 26.6 GB/sec; Max Write 16.6 GB/sec
Components: Spectrum Scale I/O server (POWER8), DeepFlash JBOF
*based on SPEC SFS results
Data Protection Schemes
Tolerate 1 drive failure
• RAID-1 / RAID-10: K pieces → 2 x K slices (2.0X)
• RAID-5: K pieces → K + 1 slices (1.2X)
Tolerate 2 drive failures
• Triplication: K pieces → 3 x K slices (3.0X)
• RAID-6: K pieces → K + 2 slices (1.5X)
Tolerate “M” failures
• Erasure Coding: K pieces → K + M = N slices (1.3X)
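The overhead multipliers on this slide are the ratio of slices stored to pieces of user data. The sketch below reproduces them; the stripe widths (5+1 for RAID-5, 4+2 for RAID-6, 10+3 for erasure coding) are assumptions picked because they yield the quoted 1.2X, 1.5X and 1.3X, and the slide does not state which widths it used.

```python
def storage_overhead(k, extra=0, copies=1):
    """Raw capacity consumed per unit of user data.

    k pieces of data are stored as (copies * k + extra) slices.
    """
    return (copies * k + extra) / k


# Stripe widths below are illustrative assumptions, not taken from the slide.
schemes = {
    "RAID-1 / RAID-10": storage_overhead(k=1, copies=2),
    "RAID-5 (5+1)": storage_overhead(k=5, extra=1),
    "Triplication": storage_overhead(k=1, copies=3),
    "RAID-6 (4+2)": storage_overhead(k=4, extra=2),
    "Erasure coding (10+3)": storage_overhead(k=10, extra=3),
}

for name, factor in schemes.items():
    print(f"{name}: {factor:.1f}X")
```

Wider stripes lower the overhead for the same number of tolerated failures, which is why erasure coding can tolerate more failures than RAID-6 at a lower capacity cost.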
Share-Nothing versus Shared-Disk Deployments
Share-Nothing: each node holds Data, Copy, Copy
• Many solutions keep 3 replicas of the data
• Need more compute? Add another node!
• Need more storage capacity? Add another node!
Shared-Disk: Data and Parity, reached over TCP/IP or RDMA
• Elastic Storage Server reduces storage to one copy of the data with Erasure Coding
• Scale compute and storage capacity separately
3x versus 1.3x
Introducing Spectrum Control Storage Insights…
• Convergence of analytics, cloud, and data management
• Designed to
  - Reduce storage costs, without the traditional up-front investments
  - Enable actionable visibility within minutes
  - Provide rapid insights to critical assets
• Deployed instantly from the cloud
  - Understand the storage environment and its usage
  - Monitor capacity and performance
  - Reclaim allocated, but unused space
  - Optimize data placement with advanced analytics
IBM is the only major storage vendor with a cloud-based SaaS offering for Storage Management
Client 2: Galactic Empire
Our major project is behind schedule!
A major test is imminent!
Too many clones!
How do we keep these plans secret?
IBM FlashSystem Models
Tier 0 – Lean & Mean
FlashSystem 900, optimized for:
• Application Boost
Tier 1 – Robust functionality
FlashSystem V9000, optimized for:
• Traditional SAN
• Databases
• Automated Tiering
• Virtualize almost 400 vendor devices
FlashSystem A9000 and A9000R, optimized for:
• Cloud / Multi-tenancy
• Virtual Desktop Infrastructure (VDI)
• Virtual Machines
• VMware, Hyper-V, etc.
[Chart: Copy Data Growth. Axes: Storage Growth versus Time. Labels: Primary Data (~35% YoY), 50+ Copies]
Source: IDC, The Copy Data Problem: An Order of Magnitude Analysis, doc #239875
Copy Data will be a $51B problem by 2018
• Consumes as much as 60% of disk capacity
• Drives 65% of Storage Software and 85% of the Storage Hardware spending
• Almost all copies sit idle
Copy Data Mgmt Gap
Geometric Copy Data Growth
Linear Data Growth 1
Resilient workload (Disk Backup) 23
Non-prod workload (Test/Dev or DevOps) 6
Resilient workload (Mirror) 1
Compliance workload (Archive) 1
Big Data workload (Analytics) ?
Primary Data 1
Today’s IT Challenge: Too many clones!
Your Infrastructure
IBM Storwize V7000, V5000, V3000
IBM Spectrum Copy Data Management
Software-Defined Copy Data Management Platform
Catalog
• Discover
• Search
Automate
• SLA compliance
• Policy-based
Transform
• Cloud integrated
• DevOps enabled
LEVERAGE
Use Cases
Protection and Disaster Recovery
Hybrid Cloud
Applications
IBM FlashSystem A9000
IBM FlashSystem A9000R
IBM FlashSystem V9000
Also supports: SAN Volume Controller, Spectrum Virtualize, Spectrum Accelerate, XIV Storage Arrays, VersaStack, EMC VNX and Unity, NetApp
DevOps, Test/Dev
Automated Copy Management
IT Modernization through “In Place” Copy Data Management
Security Strength is based on Algorithm and Number of Bits in Key
AES | RSA   | ECC | Years
-   | 1024  | 160 | 10^6
-   | 2048  | 224 | 10^9
128 | 3072  | 256 | 10^15
192 | 7680  | 384 | 10^33
256 | 15360 | 512 | 10^51
Symmetric Key (AES 256)
• Same key is used to encrypt/decrypt
• Fast, ideal for large amounts of data
• Must keep the key secret
Asymmetric Key (RSA 2048)
Encryption “Public” Key, Decryption “Private” Key
• Pairs of different keys are used to encrypt & decrypt data
• Encrypt with “Public” key; it may be distributed widely without fear of compromise
• Decrypt with “Private” key; must keep this key secret
AES – Advanced Encryption Standard
RSA – Rivest Shamir Adleman
ECC – Elliptic Curve Cryptography
Two-Tier Encryption Scheme
Problem: Realtors, landlords, and apartment managers must carry hundreds of keys, one unique to each dwelling unit
Solution: All units have their unique key kept inside a locked box hanging on the door knob. Realtors, landlords, and apartment managers carry a single master key that opens every lockbox
Encryption: Each flash, disk, or tape is assigned a unique symmetric “Data Key”. The data key itself is encrypted or “wrapped” with the master “encryption key”
Decryption: The data key is decrypted with the master “decryption key”. The unique data key for this flash, disk, or tape is then used to read and write contents
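The lockbox analogy can be sketched in a few lines. This is a toy illustration only: `toy_cipher` is a hypothetical XOR keystream built on SHA-256 that stands in for AES-256 and is NOT secure, the master key here is symmetric for brevity (elsewhere the deck describes an asymmetric RSA 2048-bit master key pair), and the device names are made up.

```python
import hashlib
import secrets


def toy_cipher(key, data):
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for AES-256, NOT secure. Applying it twice with the
    same key returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# One master key: the key that opens every lockbox.
master_key = secrets.token_bytes(32)

# Each device gets its own random data key, kept only in wrapped form.
devices = {}
for name in ("flash-A", "disk-B"):
    data_key = secrets.token_bytes(32)
    devices[name] = {
        "wrapped_key": toy_cipher(master_key, data_key),
        "contents": toy_cipher(data_key, f"secret plans on {name}".encode()),
    }


def read_device(name, master):
    """Unwrap the device's data key with the master key, then decrypt."""
    entry = devices[name]
    data_key = toy_cipher(master, entry["wrapped_key"])
    return toy_cipher(data_key, entry["contents"])


print(read_device("flash-A", master_key))
```

The design point is that only the small wrapped data key depends on the master key, so the master key can be changed by re-wrapping the data keys without re-encrypting the stored data.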
How Encryption Keys are used in different IBM storage devices
External Master Key: Asymmetric keys (RSA 2048-bit) stored in volatile memory. Needed only for:
• System power-on
• System restart / firmware update
• User-initiated re-key operation
• Tape mount
1 key pair per system (FlashSystem 900; XIV, DS8000; Spectrum Virtualize)
1 key pair per cartridge (Enterprise Tape)
Internal Data Key: Symmetric key (AES 256-bit), randomly generated, encrypted by the master key and stored on the storage media, used for high-speed read/write activity
• FlashSystem 900: 1 key per self-encrypting flash card
• XIV, DS8000: 1 key per self-encrypting drive (SED)
• Spectrum Virtualize: 1 key per storage pool
• Enterprise Tape: 1 key per cartridge
IBM Security Key Lifecycle Manager (SKLM)
External Master Key: Asymmetric keys (RSA 2048-bit) stored in volatile memory, only needed for:
• System power-on
• System restarts (such as firmware upgrades)
• Re-key operations
[Diagram labels: Security Admin, Storage Admin, SKLM keystore, lockbox, secure communication]
Device requests key from IBM SKLM, SKLM sends master key to device
Storage admin requests USB thumb drive from Security team, inserts into device
Or just leave USB thumb drive in device all the time
Where is Encryption Performed?
• IBM SKLM supports flash, disk and tape storage
• Spectrum Virtualize supports either USB or IBM SKLM
• Encrypted storage pools can mix devices
IBM Spectrum Virtualize™: SVC, Storwize, FlashSystem V9000, VersaStack
• SAS (internal storage, expansion drawers): SAS controller uses HW chip
• CPU: uses AES-NI instructions
• SAN: FlashSystem 900; XIV, DS8000, FlashSystem A9000/R; non-encrypting storage
• Smart enough not to “double encrypt”
Tape: TS1120, LTO4 and newer
Motivations for Data-at-Rest Encryption
Without encryption
• Broken drives: “90% of drives returned had readable data” -- Seagate. Physically destroy drives, or do not return them to the manufacturer
• Decommission: Hire storage vendor to securely erase drives, using Department of Defense (DoD) method of multiple over-writes
• Mandate: Fail government or corporate compliance audits
• Theft: Declare data breach. Pay for credit monitoring for all affected clients and employees
Encryption, USB drive left in device
• Broken drives: Return broken drives to manufacturer for warranty replacement
• Decommission: Overwrite or erase decryption keys → data is “cryptographically erased”
• Mandate: Remove USB drives before auditors or inspectors arrive!
Encryption, lockbox or SKLM server
• Mandate: Pass audits
• Theft: No breach if thieves do not have access to decryption keys
Galactic Empire
• Project is behind schedule, and a major test is imminent
  - IBM FlashSystem
  - IBM Spectrum Copy Data Management
• Need to protect secret plans
  - IBM Security Key Lifecycle Manager
Rebel Alliance
• Reckless, aggressive, and undisciplined
• Rebels are inquisitive!
  - IBM DeepFlash ESS
  - IBM Spectrum Control Storage Insights
And now… enjoy the movie…
May the Force be with us!
About the Speaker
Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line. Tony joined IBM Corporation in
1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings on storage topics
covering the entire IBM Storage product line, IBM Spectrum Storage software products, and topics related to Cloud Computing,
Analytics and Cognitive Solutions. He interacts with clients, speaks at conferences and events, and leads client workshops to
help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization
solutions.
Tony writes the “Inside System Storage” blog, which is read by thousands of clients, IBM sales reps and IBM Business Partners
every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and #1
most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage: Volume
I through V.
Over the past years, Tony has worked in development, marketing and consulting for various storage hardware and software
products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in Electrical
Engineering, both from the University of Arizona. Tony holds 19 patents for inventions on storage hardware and software
products.
9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
Tony Pearson
Master Inventor
Senior IT Architect
IBM Storage
The Right Flash for the Right Workload
Business Critical Storage: IBM DS8888
Key Attributes
• z/OS Support
• High Performance
• Highest Availability
• z/OS (GDPS)
• Power HA
• Power i HA
• Three-site/Four-site
• Six 9’s Reliability
• Enterprise Scalability
Typical Workloads, Applications & Use Cases
• High-availability/Low RTO applications
• High-performance OLTP
• Real time analytics
• High-performance data warehouse
Virtual Storage Infrastructure: FlashSystem V9000, Storwize V7000F, Storwize V5000F
Key Attributes
• Heterogeneous Enterprise-class Data Services
• Dynamic Data Migration
• Multi-Vendor Management
• Data Reduction (Compression)
• Multi-site active-active
Typical Workloads, Applications & Use Cases: Traditional structured workloads that require block storage
• Systems of Record
• OLTP
• Data Warehousing w/ Oracle, DB2, SQL Server, MySQL, SAP, SAS
• Analytics
Grid Scale Cloud Storage: FlashSystem A9000, FlashSystem A9000R
Key Attributes
• Cloud-optimized (QOS, Multi-Tenancy)
• Predictable High Performance with Data Reduction Technologies (including deduplication)
• Ease-of-management
Typical Workloads, Applications & Use Cases: Large-scale distributed block workloads & applications
• VDI
• SAP (Oracle)
• Exchange
• VMware / KVM server environments
• CSPs (Mixed workloads, Multi-tenancy)
• Hybrid cloud architectures
Big Data Storage: IBM DeepFlash ESS w/ IBM Spectrum Scale
Key Attributes
• Multi-protocol support
• Policy-driven tiering
• Single namespace data ocean
• High-performance file storage
• High bandwidth
Typical Workloads, Applications & Use Cases: Distributed file/object
• Hadoop (M/R)
• Media Streaming / Video
• SAS
• Spark (In-Memory)
• HPC
• Content Repositories
• High-performance backup target
• NAS filer consolidation
Spectrum Control ‘ice breaker’ Assets
IBM Spectrum Control on IBM Cloud Marketplace
http://www.ibm.com/marketplace/cloud/analytics-driven-data-management/us/en-us
Email: [email protected]
Twitter: twitter.com/az990tony
Blog: ibm.co/Pearson
Books: www.lulu.com/spotlight/990_tony
IBM Expert Network on Slideshare: www.slideshare.net/az990tony
Facebook: www.facebook.com/tony.pearson.16121
Linkedin: https://www.linkedin.com/in/az990tony
Additional Resources from Tony Pearson
IBM Tucson Executive Briefing Center
• Tucson, Arizona is home for storage hardware and software design and development
• IBM Tucson Executive Briefing Center offers:
• Technology briefings
• Product demonstrations
• Solution workshops
• Take a video tour!
• http://youtu.be/CXrpoCZAazg
Trademarks and Other Disclaimers
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
STAR WARS ROGUE ONE is a trademark of Lucasfilm Ltd. LLC.
Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind
The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.
Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography.
Photographs shown may be engineering prototypes. Changes may be incorporated in production models.
© IBM Corporation 2016. All rights reserved. References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml. ZSP03490-USEN-00