Has Your Data Gone Rogue?

© 2016 IBM Corporation Has Your Data Gone Rogue? Using IBM Flash and solutions to obtain enhanced business insights Tony Pearson, IBM Master Inventor and Senior IT Architect

Upload: tony-pearson

Post on 12-Jan-2017


TRANSCRIPT

Page 1: Has Your Data Gone Rogue?

© 2016 IBM Corporation

Has Your Data Gone Rogue? Using IBM Flash and solutions to obtain enhanced business insights

Tony Pearson, IBM Master Inventor and Senior IT Architect

Page 2: Has Your Data Gone Rogue?

Client 1: Rebel Alliance

Descriptive Analytics: What is happening?

Diagnostic Analytics: Why did it happen?

Predictive Analytics: What might happen next?

Prescriptive Analytics: What actions should we take?

Page 3: Has Your Data Gone Rogue?

Rebels are Inquisitive!

Business Analyst: Structured, Repeatable, Linear

Data Scientist: Unstructured, Exploratory, Iterative

Analyze data:
• Reports
• OLAP cube
• Visualization and Discovery
• Hadoop / Spark
• Data warehousing
• Stream Computing
• Integration and Governance
• Text Analytics

“It’s no longer hard to find the answer to a given question; the hard part is finding the right question. And as questions evolve, we gain better insight into our ecosystem and our business.”

-- Kevin Weil, Lead Analyst at Twitter

Page 4: Has Your Data Gone Rogue?

Clients are facing explosive growth in Unstructured Data, which is exactly why Analytics is so critical

[Chart: Unstructured Data versus Structured Data, in exabytes (scale 0 to 120), 2009 through 2017. Source: IDC]

Unstructured data growth of 60–80% per year creates Web-scale storage needs

Problem – Traditional Legacy Storage Designed for Transactional, Structured Data
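
To make the 60–80% per year figure concrete, the following is a minimal sketch of compound growth. The starting capacity of 1.0 and the five-year horizon are illustrative assumptions, not values from the slide.

```python
def project_capacity(start: float, annual_growth: float, years: int) -> list:
    """Return capacity for year 0..years under compound annual growth.

    annual_growth is a fraction, e.g. 0.60 for 60% per year.
    """
    return [start * (1.0 + annual_growth) ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Hypothetical starting point of 1.0 capacity units; only the growth
    # rates (60% and 80%) come from the slide.
    for rate in (0.60, 0.80):
        series = project_capacity(1.0, rate, 5)
        print(f"{rate:.0%} per year: " + ", ".join(f"{v:.1f}" for v in series))
```

Under these assumptions, capacity grows roughly tenfold in five years at 60% per year and close to nineteenfold at 80% per year, which is the scale pressure the slide attributes to unstructured data.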

Page 5: Has Your Data Gone Rogue?

IBM Systems Storage Portfolio: Flash for all primary storage workloads

IBM FlashCore™ Technology Optimized:

FlashSystem 900: All flash array for application acceleration
• Extreme performance
• Targeting database acceleration & Spectrum Storage booster

FlashSystem V9000: All flash array, virtualizing the hybrid Data Center
• Best performance with storage services & selectable data reduction
• Targeting database / analytics workloads

FlashSystem A9000: All flash array for cloud service providers
• Best performance with full time data reduction
• Targeting VDI and VMware

FlashSystem A9000R: All flash array for large deployments
• Best performance with full time data reduction
• Targeting mixed workloads

High End Server (Mainframe, Power): DS8880
• Extreme reliability and replication
• Available in All Flash & Hybrid configurations

High End Capacity Optimized: XIV Gen3

Mid-Range: Storwize V7000, V7000F

Entry / Mid-Range: Storwize V5000, V5000F

SVC

Big Data Flash: DeepFlash Elastic Storage Server (ESS)

Page 6: Has Your Data Gone Rogue?

New Class of Flash: Big Data Flash
Scalable capacity and performance at low price points for big data

Big Data Attributes (Source: IDC, 2015):
• Petabyte scale of unstructured data and growing rapidly
• Written once but read often: video and images
• Often do not benefit from data reduction technology, already compressed files
• Performance can lead to business results, faster time to insights

Big Data Flash (Source: IDC, 2015):
• Performance consistently better than that of the best HDDs today
• Cost comparable to that of performance optimized HDDs
• Flash media that leverages flash economics
• Systems implementations that support massive scalability and meet enterprise requirements
• Targeted primarily at big data and secondary storage environments

Page 7: Has Your Data Gone Rogue?

Not conventional Flash, a new class of Flash: Big Data Flash

Scalable capacity and performance at low price points for big data

• Performance consistently better than that of the best HDDs
• Cost comparable to that of performance-optimized HDDs
• Systems implementations that support massive scalability and meet enterprise requirements

Comparison of HDD, DeepFlash, and Conventional Flash:

Price
• HDD: $
• DeepFlash: $$
• Conventional Flash: $$$

Performance
• HDD: 10’s of milliseconds
• DeepFlash: sub-milliseconds
• Conventional Flash: microseconds

Attributes
• DeepFlash: high ingest rate, low change rate, high read rate
• Conventional Flash: extremely latency sensitive, can justify price premium

Typical use cases
• DeepFlash: Big Data analytics (ex: video, health care data), Hadoop, Spark
• Conventional Flash: VDI, Server Virtualization, Database and Application Acceleration

Page 8: Has Your Data Gone Rogue?

Open standards first, but with freedom of choice

• IBM InfoSphere BigInsights is a 100% standard Hadoop distribution
• By default, open source components are always deployed
• Elect to use proprietary capabilities depending on your needs
• In some cases, proprietary capabilities offer significant benefits

HDFS

YARN

HIVE

MapReduce

PIG

Spectrum Scale

Platform Symphony

Big SQL

Adaptive MapReduce

BigSheets

Share data with non-Hadoop applications and simplify data management

Re-use existing tools and expertise, Avoid additional development costs

Boost performance, support time-critical workloads, do more with less

True multi-tenancy to boost service levels and avoid duplication on infrastructure

Simplify access for end-users, minimize software development

Page 9: Has Your Data Gone Rogue?

Hadoop Analytics – HDFS vs IBM Spectrum Scale™

Data Sources: mashup of structured and unstructured data from a variety of sources

JFS2, NTFS, EXT4

HDFS: Save Results, Discard Rest

IBM Spectrum Scale: Application data stored on IBM Spectrum Scale is readily available for analytics. Save Results

IBM HDFS Transparency Connector allows HDFS-based programs to process data without application changes (100% compatible)

Actionable Insights: Provides answers to the Who, What, Where, When, Why and How

Business Intelligence & Predictive Analytics
> Competitive Advantages
> New Threats and Fraud
> Changing Needs and Forecasting
> And More!

Page 10: Has Your Data Gone Rogue?

Elastic Storage Server (ESS) with Spectrum Scale

• NSD Client
• TCP/IP or RDMA
• Elastic Storage Server: IBM POWER8 servers, twin-tailed

5146-GLx models (GL2, GL4, GL6): 60-drive 4U drawers
• SSD and Nearline HDD

5146-GSx models (GS1, GS2, GS4, GS6): 24-drive 2U drawers
• All SSD
• SSD and 10K HDD

DeepFlash ESS (5147-GFx): 64-drive 3U drawers
• Pre-loaded with 32 drives
• All SSD (8 TB)

Page 11: Has Your Data Gone Rogue?

New Big Data alternative: instead of HDD, use Big Data Flash
For clients who value application response time and/or throughput per rack unit

• Improve application response time by 8X
• Improve throughput/rack unit by 2.8X
• Improve MTBF
• Improve power & cooling costs by 30%-50%

Move from this Big Data HDD configuration: File Server with Hard Drives, 28U, 25 GB/s

To this Big Data Flash configuration: File Server with All Flash, 10U, 25 GB/s

8X faster response time and same throughput as the HDD version
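
The 2.8X throughput-per-rack-unit claim follows directly from the two configurations on this slide (28U versus 10U, both at 25 GB/s). A minimal sketch of the arithmetic, with a hypothetical helper name chosen for illustration:

```python
def throughput_per_rack_unit(gb_per_sec: float, rack_units: int) -> float:
    """Throughput density: GB/s delivered per rack unit (U) occupied."""
    if rack_units <= 0:
        raise ValueError("rack_units must be positive")
    return gb_per_sec / rack_units


if __name__ == "__main__":
    hdd = throughput_per_rack_unit(25.0, 28)    # Big Data HDD configuration
    flash = throughput_per_rack_unit(25.0, 10)  # Big Data Flash configuration
    print(f"HDD:   {hdd:.2f} GB/s per U")
    print(f"Flash: {flash:.2f} GB/s per U")
    print(f"Improvement: {flash / hdd:.1f}X")
```

Because throughput is the same in both configurations, the improvement reduces to the ratio of rack space, 28U divided by 10U.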

Page 12: Has Your Data Gone Rogue?

IBM DeepFlash 150 storage enclosure

Page 13: Has Your Data Gone Rogue?

Introducing IBM DeepFlash™ Elastic Storage Server
8X faster response time, 8X lower latency compared to HDD version*

Components: Spectrum Scale I/O server (POWER8), DeepFlash JBOF

ESS GF1:
• 1 Flash Enclosure, 7U
• 180 TB of usable Flash
• Max Read 13.6 GB/sec; Max Write 9.3 GB/sec

ESS GF2:
• 2 Enclosures, 10U
• 360 TB of usable Flash
• Max Read 26.6 GB/sec; Max Write 16.6 GB/sec

*based on SPEC SFS results

Page 14: Has Your Data Gone Rogue?

Data Protection Schemes

Tolerate 1 drive failure:
• RAID-1 / RAID-10: K pieces → 2 x K slices (2.0X)
• RAID-5: K pieces → K + 1 slices (1.2X)

Tolerate 2 drive failures:
• Triplication: K pieces → 3 x K slices (3.0X)
• RAID-6: K pieces → K + 2 slices (1.5X)

Tolerate “M” failures:
• Erasure Coding: K pieces → K+M = N slices (1.3X)
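
As an illustration of the "K pieces to K + 1 slices" idea, here is a minimal sketch of single XOR parity, the simplest case that tolerates one lost slice. It is not the erasure code used by any particular IBM product; the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_slices(pieces):
    """Turn K equal-length data pieces into K + 1 slices (data plus one parity)."""
    if len({len(p) for p in pieces}) != 1:
        raise ValueError("need at least one piece, all of the same length")
    parity = reduce(xor_bytes, pieces)
    return list(pieces) + [parity]


def rebuild(slices, lost_index):
    """Recompute the slice at lost_index by XOR-ing all surviving slices."""
    survivors = [s for i, s in enumerate(slices) if i != lost_index]
    return reduce(xor_bytes, survivors)


if __name__ == "__main__":
    slices = make_slices([b"ABCD", b"EFGH", b"IJKL"])  # K = 3 pieces, 4 slices
    recovered = rebuild(slices, 1)                     # pretend slice 1 was lost
    print(recovered == slices[1])
```

The XOR of all K + 1 slices is zero, so any single missing slice equals the XOR of the others. Tolerating two or more failures requires additional, independent parity slices, which is what RAID-6 and general K+M erasure codes provide.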

Page 15: Has Your Data Gone Rogue?

Share-Nothing versus Shared-Disk Deployments

Share-nothing:
• Many solutions keep 3 replicas of the data
• Need more compute? Add another node!
• Need more storage capacity? Add another node!

Shared-disk (TCP/IP or RDMA):
• Elastic Storage Server reduces storage to one copy of the data with Erasure Coding
• Scale compute and storage capacity separately

3x versus 1.3x
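
The "3x versus 1.3x" comparison is a ratio of raw capacity to usable capacity. The sketch below computes that ratio as (K + M) / K and applies it to a capacity target. The 10+3 layout is an assumption chosen only because it reproduces the 1.3X figure; the slides do not state which K and M are used. The 360 TB target is borrowed from the GF2 usable capacity quoted earlier, purely as an example.

```python
def overhead_factor(data_pieces: int, redundancy_pieces: int) -> float:
    """Raw-to-usable capacity ratio for a K + M layout: (K + M) / K."""
    if data_pieces <= 0 or redundancy_pieces < 0:
        raise ValueError("need K > 0 and M >= 0")
    return (data_pieces + redundancy_pieces) / data_pieces


def raw_capacity_needed(usable_tb: float, factor: float) -> float:
    """Raw capacity required to store usable_tb at a given overhead factor."""
    return usable_tb * factor


if __name__ == "__main__":
    replicas = overhead_factor(1, 2)   # three full copies: 3.0X
    erasure = overhead_factor(10, 3)   # assumed 10+3 layout: 1.3X
    for name, factor in (("3 replicas", replicas), ("erasure coding", erasure)):
        raw = raw_capacity_needed(360.0, factor)
        print(f"{name}: {factor:.1f}X, {raw:.0f} TB raw for 360 TB usable")
```

Triplication is modeled here as K = 1 with two extra copies, so the same formula covers both replication and parity-based schemes.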

Page 16: Has Your Data Gone Rogue?

Introducing Spectrum Control Storage Insights…

• Convergence of analytics, cloud, and data management
• Designed to:
  - Reduce storage costs, without the traditional up-front investments
  - Enable actionable visibility within minutes
  - Provide rapid insights to critical assets

Deployed instantly from the cloud:
  - Understand the storage environment and its usage
  - Monitor capacity and performance
  - Reclaim allocated, but unused space
  - Optimize data placement with advanced analytics

IBM is the only major storage vendor with a cloud-based SaaS offering for Storage Management

Page 17: Has Your Data Gone Rogue?

Client 2: Galactic Empire

Our major project is behind schedule!

A major test is imminent!

Too many clones!

How do we keep these plans secret?

Page 18: Has Your Data Gone Rogue?

IBM FlashSystem Models

Tier 0 – Lean & Mean

FlashSystem 900, optimized for:
• Application Boost

Tier 1 – Robust functionality

FlashSystem V9000, optimized for:
• Traditional SAN
• Databases
• Automated Tiering
• Virtualize almost 400 vendor devices

FlashSystem A9000 and A9000R, optimized for:
• Cloud / Multi-tenancy
• Virtual Desktop Infrastructure (VDI)
• Virtual Machines (VMware, HyperV, etc.)

Page 19: Has Your Data Gone Rogue?

Today’s IT Challenge: Too many clones!

Source: IDC, The Copy Data Problem: An Order of Magnitude Analysis, doc #239875

Copy Data will be a $51B problem by 2018
• Consumes as much as 60% of disk capacity
• Drives 65% of Storage Software and 85% of the Storage Hardware spending
• Almost all copies sit idle

[Chart: Copy Data Growth, storage growth over time. Labels: Primary Data (~35% YoY), Linear Data Growth, Geometric Copy Data Growth, 50+ Copies, Copy Data Mgmt Gap]

Workloads and the figure shown next to each on the chart:
• Resilient workload (Disk Backup): 23
• Non prod workload (Test/Dev or DevOps): 6
• Resilient workload (Mirror): 1
• Compliance workload (Archive): 1
• Big Data workload (Analytics): ?
• Primary Data: 1

Page 20: Has Your Data Gone Rogue?

IT Modernization through “In Place” Copy Data Management

IBM Spectrum Copy Data Management: Software-Defined Copy Data Management Platform
• Catalog: Discover, Search
• Automate: SLA compliance, Policy-based
• Transform: Cloud integrated, DevOps enabled

Use Cases:
• Automated Copy Management
• Protection and Disaster Recovery
• DevOps, Test/Dev
• Hybrid Cloud

Applications

Leverage Your Infrastructure:
• IBM Storwize V7000, V5000, V3000
• IBM FlashSystem A9000
• IBM FlashSystem A9000R
• IBM FlashSystem V9000

Also supports: SAN Volume Controller, Spectrum Virtualize, Spectrum Accelerate, XIV Storage Arrays, VersaStack, EMC VNX and Unity, NetApp

Page 21: Has Your Data Gone Rogue?

Security Strength is based on Algorithm and Number of Bits in Key

AES    RSA      ECC    Years
-      1024     160    10^6
-      2048     224    10^9
128    3072     256    10^15
192    7680     384    10^33
256    15360    512    10^51

Symmetric Key (AES 256)
• Same key is used to encrypt/decrypt
• Fast, ideal for large amounts of data
• Must keep the key secret

Asymmetric Key (RSA 2048)
• Pairs of different keys are used to encrypt & decrypt data
• Encrypt with “Public” key; it may be distributed widely without fear of compromise
• Decrypt with “Private” key; must keep this key secret

AES – Advanced Encryption Standard
RSA – Rivest Shamir Adleman
ECC – Elliptic Curve Cryptography

Page 22: Has Your Data Gone Rogue?

Two-Tier Encryption Scheme

Problem: Realtors, Landlords, and Apartment managers must carry hundreds of keys, one unique to each dwelling unit

Solution: All units have their unique key kept inside a locked box hanging on the door knob. Realtors, Landlords, and Apartment managers carry a single master key that opens every lockbox

Encryption: Each flash, disk, or tape is assigned a unique symmetric “Data Key”. The data key itself is encrypted or “wrapped” with the master “encryption key”

Decryption: The data key is decrypted with the master “decryption key”. The unique data key for this flash, disk, or tape is then used to read and write contents
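
The lockbox idea can be sketched in a few lines. This is a toy illustration under loud assumptions: the cipher below is a SHA-256 keystream XOR that stands in for AES-256 and is NOT secure, and a single symmetric master key stands in for the asymmetric master key pair described in the slides. All function names are hypothetical.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter. Not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for a real cipher: returns a 16-byte nonce plus XOR-ed body."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def provision_media(master_key: bytes) -> bytes:
    """Generate a random data key and return it wrapped by the master key.

    Only the wrapped form would be stored alongside the media.
    """
    data_key = secrets.token_bytes(32)
    return toy_encrypt(master_key, data_key)


def write_block(master_key: bytes, wrapped_key: bytes, plaintext: bytes) -> bytes:
    """Unwrap the data key with the master key, then encrypt the contents."""
    data_key = toy_decrypt(master_key, wrapped_key)
    return toy_encrypt(data_key, plaintext)


def read_block(master_key: bytes, wrapped_key: bytes, ciphertext: bytes) -> bytes:
    """Unwrap the data key with the master key, then decrypt the contents."""
    data_key = toy_decrypt(master_key, wrapped_key)
    return toy_decrypt(data_key, ciphertext)


if __name__ == "__main__":
    master = secrets.token_bytes(32)   # the one key the "realtor" carries
    wrapped = provision_media(master)  # the "lockbox" stored with the media
    stored = write_block(master, wrapped, b"secret plans")
    print(read_block(master, wrapped, stored))
```

The point of the design is that the master key is only needed to unwrap the data key, while bulk reads and writes use the fast symmetric data key. A real implementation would use a vetted cryptographic library rather than anything resembling the toy cipher here.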

Page 23: Has Your Data Gone Rogue?

How Encryption Keys are used in different IBM storage devices

External Master Key: Asymmetric keys (RSA 2048-bit) stored in volatile memory

Flash and disk: 1 key pair per system. Needed only for:
• System power-on
• System restart / firmware update
• User-initiated re-key operation

Tape: 1 key pair per cartridge. Needed only for:
• Tape mount

Internal Data Key: Symmetric key (AES 256-bit) randomly generated, encrypted by master key and stored on the storage media, used for high-speed read/write activity
• FlashSystem 900: 1 key per self-encrypting flash card
• XIV, DS8000: 1 key per self-encrypting drive (SED)
• Spectrum Virtualize: 1 key per storage pool
• Enterprise Tape: 1 key per cartridge

Page 24: Has Your Data Gone Rogue?

IBM Security Key Lifecycle Manager (SKLM)

External Master Key: Asymmetric keys (RSA 2048-bit) stored in volatile memory, only needed for:
• System power-on
• System restarts (such as firmware upgrades)
• Re-key operations

SKLM keystore (Security Admin): Device requests key from IBM SKLM over secure communication, SKLM sends master key to device

Lockbox (Storage Admin): Storage admin requests USB thumb drive from Security team, inserts into device

Or just leave USB thumb drive in device all the time

Page 25: Has Your Data Gone Rogue?

Where is Encryption Performed?

• IBM SKLM supports flash, disk and tape storage
• Spectrum Virtualize supports either USB or IBM SKLM
• Encrypted storage pools can mix devices

IBM Spectrum Virtualize™ (SVC, Storwize, FlashSystem V9000, VersaStack):
• SAS (internal storage, expansion drawers): SAS controller uses HW chip
• SAN (non-encrypting storage): CPU uses AES-NI instructions
• Smart enough not to “double encrypt”

Other devices shown:
• FlashSystem 900
• XIV, DS8000, FlashSystem A9000/R
• TS1120, LTO4 and newer

Page 26: Has Your Data Gone Rogue?

Motivations for Data-at-Rest Encryption

Without encryption:
• Broken drives: “90% of drives returned had readable data” -- Seagate. Physically destroy drive, or do not return them to manufacturer
• Decommission: Hire storage vendor to securely erase drives, using Department of Defense (DoD) method of multiple over-writes
• Mandate: Fail government or corporate compliance audits
• Theft: Declare data breach. Pay for credit monitoring for all affected clients and employees

Encryption, USB drive left in device:
• Broken drives: Return broken drives to manufacturer for warranty replacement
• Decommission: Overwrite or erase decryption keys → data is “cryptographically erased”
• Mandate: Remove USB drives before auditors or inspectors arrive!

Encryption, Lockbox or SKLM server:
• Mandate: Pass audits
• Theft: No breach if thieves do not have access to decryption keys

Page 27: Has Your Data Gone Rogue?

Galactic Empire
• Project is behind schedule, and a major test is imminent
  - IBM FlashSystem
  - IBM Spectrum Copy Data Management
• Need to protect secret plans
  - IBM Security Key Lifecycle Manager

Rebel Alliance
• Reckless, aggressive, and undisciplined
• Rebels are inquisitive!
  - IBM DeepFlash ESS
  - IBM Spectrum Control Storage Insights

Page 28: Has Your Data Gone Rogue?

And now… enjoy the movie…

May the Force be with us!

Page 29: Has Your Data Gone Rogue?

About the Speaker

Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line. Tony joined IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings on storage topics covering the entire IBM Storage product line, IBM Spectrum Storage software products, and topics related to Cloud Computing, Analytics and Cognitive Solutions. He interacts with clients, speaks at conferences and events, and leads client workshops to help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization solutions.

Tony writes the “Inside System Storage” blog, which is read by thousands of clients, IBM sales reps and IBM Business Partners every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and #1 most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage: Volumes I through V.

Over the past years, Tony has worked in development, marketing and consulting for various storage hardware and software products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in Electrical Engineering, both from the University of Arizona. Tony holds 19 patents for inventions on storage hardware and software products.

Tony Pearson
Master Inventor
Senior IT Architect
IBM Storage

9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
[email protected]

Page 30: Has Your Data Gone Rogue?

The Right Flash for the Right Workload

Business Critical Storage: IBM DS8888
Key Attributes:
• z/OS Support
• High Performance
• Highest Availability
• z/OS (GDPS)
• Power HA
• Power i HA
• Three-site/Four-site
• Six 9’s Reliability
• Enterprise Scalability
Typical Workloads, Applications & Use Cases:
• High-availability/Low RTO applications
• High-performance OLTP
• Real time analytics
• High-performance data warehouse

Virtual Storage Infrastructure: FlashSystem V9000, Storwize V7000F, Storwize V5000F
Key Attributes:
• Heterogeneous Enterprise-class Data Services
• Dynamic Data Migration
• Multi-Vendor Management
• Data Reduction (Compression)
• Multi-site active-active
Typical Workloads, Applications & Use Cases:
• Traditional structured workloads requiring block storage
• Systems of Record
• OLTP
• Data Warehousing w/ Oracle, DB2, SQL Server, MySQL, SAP, SAS
• Analytics

Grid Scale Cloud Storage: FlashSystem A9000, FlashSystem A9000R
Key Attributes:
• Cloud-optimized (QOS, Multi-Tenancy)
• Predictable High Performance with Data Reduction Technologies (including deduplication)
• Ease-of-management
Typical Workloads, Applications & Use Cases:
• Large-scale distributed block workloads & applications
• VDI
• SAP (Oracle)
• Exchange
• VMware / KVM server environments
• CSPs (Mixed workloads, Multi-tenancy)
• Hybrid cloud architectures

Big Data Storage: IBM DeepFlash ESS w/ IBM Spectrum Scale
Key Attributes:
• Multi-protocol support
• Policy-driven tiering
• Single namespace data ocean
• High-performance file storage
• High bandwidth
Typical Workloads, Applications & Use Cases:
• Distributed file/object
• Hadoop (M/R)
• Media Streaming / Video
• SAS
• Spark (In-Memory)
• HPC
• Content Repositories
• High-performance backup target
• NAS filer consolidation

Page 31: Has Your Data Gone Rogue?

Spectrum Control ‘ice breaker’ Assets

Page 32: Has Your Data Gone Rogue?

IBM Spectrum Control on IBM Cloud Marketplace
http://www.ibm.com/marketplace/cloud/analytics-driven-data-management/us/en-us

Page 33: Has Your Data Gone Rogue?

Additional Resources from Tony Pearson

Email: [email protected]

Twitter: twitter.com/az990tony

Blog: ibm.co/Pearson

Books: www.lulu.com/spotlight/990_tony

IBM Expert Network on Slideshare: www.slideshare.net/az990tony

Facebook: www.facebook.com/tony.pearson.16121

LinkedIn: https://www.linkedin.com/in/az990tony

Page 34: Has Your Data Gone Rogue?

IBM Tucson Executive Briefing Center

• Tucson, Arizona is home for storage hardware and software design and development
• IBM Tucson Executive Briefing Center offers:
  - Technology briefings
  - Product demonstrations
  - Solution workshops
• Take a video tour!
  - http://youtu.be/CXrpoCZAazg

Page 35: Has Your Data Gone Rogue?

Trademarks and Other Disclaimers

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.

STAR WARS ROGUE ONE is a trademark of Lucasfilm Ltd. LLC.

Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind.

The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.

Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.

Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography.

Photographs shown may be engineering prototypes. Changes may be incorporated in production models.

© IBM Corporation 2016. All rights reserved. References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml. ZSP03490-USEN-00