ilm tutorial - tiered storage

EDUCATION ILM: Tiered Services & The Need For Classification Edgar StPierre, EMC 2 SNW San Diego April 2007

Post on 24-Jan-2017


Page 1: ILM Tutorial - Tiered Storage

EDUCATION

ILM: Tiered Services & The Need For Classification

Edgar StPierre, EMC2

SNW San Diego, April 2007

Page 2: ILM Tutorial - Tiered Storage

ILM: Tiered Services & The Need For Classification
© 2007 Storage Networking Industry Association. All Rights Reserved.

SNIA Legal Notice

• The material contained in this tutorial is copyrighted by the SNIA.

• Member companies and individuals may use this material in presentations and literature under the following conditions:
  – Any slide or slides used must be reproduced without modification
  – The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations.

• This presentation is a project of the SNIA Education Committee.

Page 3: ILM Tutorial - Tiered Storage

About SNIA and the DMF

About the Storage Networking Industry Association (SNIA)
• SNIA’s primary goal is to ensure that storage networks become complete and trusted solutions across the IT community
• For additional information about SNIA see www.snia.org
• SNIA’s “Dictionary of Storage Networking Terminology” is online at www.snia.org/dictionary

About the Data Management Forum (DMF)
• Founded in 2004, the Data Management Forum is a sub-group of SNIA specializing in data management and protection throughout the lifecycle of information.
• More information about the DMF including resources on data and information lifecycle management can be found at www.snia-dmf.org

Page 4: ILM Tutorial - Tiered Storage

Agenda

Introduction
– What is ILM
– Tiers of service with service level management
– Classification

Section 1 – Tiers of Service
– What are tiers of service?
– What technologies compose a tier?
– What tiers do you need in your Data Center?
– What is the business value of tiered storage?

Section 2 – Data Classification
– Why classify data?
– Using Information to drive classification
– Aligning Classified Data with Available Resources

Section 3 – Implementation Considerations
– Unstructured – Files in file systems
– Semi-structured – Email messages in Email systems
– Structured – Database records

Section 4 – Automated Classification

Page 5: ILM Tutorial - Tiered Storage

Abstract

Tiered Services & The Need For Classification

Establishing an ILM strategy is a high priority for many organizations as they struggle to deal with data growth and regulatory compliance. The definition of tiered services is an important part of that ILM strategy. Two keys to a successful tiered service implementation are:

Resource classification - organizing your storage-related services into tiers that will meet the requirements of your business.

Data Classification - if you don't know what you have or how valuable it is, it’s very hard to decide what data should be placed on a particular service tier.

In this tutorial we will look at how to think about organizing your storage-related service tiers based on your needs, and at data classification techniques and emerging technologies that can help you manage the placement of data across different tiers of service, or the movement of data across storage within the same service tier.

Page 6: ILM Tutorial - Tiered Storage

Information Growth

• Save everything
  – 161 Exabytes in 2006
  – 988 Exabytes in 2010
• Source: IDC “The Expanding Digital Universe”
• Major new drivers
  – Digital images, voice and TV
• Growth for both size and units
• Beyond “size”:
  – 70% created by individuals
  – Organizations responsible for managing 85% of:
    • Security
    • Privacy
    • Reliability
    • Compliance
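The two IDC figures on this slide imply a compound annual growth rate, which is easy to work out as a sanity check. The snippet below is a minimal sketch; the function name is illustrative, and the four-year span is simply 2010 minus 2006 as quoted on the slide.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Figures from the slide: 161 exabytes in 2006, 988 exabytes in 2010.
rate = compound_annual_growth_rate(161, 988, 2010 - 2006)
print(f"Implied growth: {rate:.1%} per year")
```

The result is a little over 57% per year, which means the volume of stored information grows by more than half every year over that period.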

Page 7: ILM Tutorial - Tiered Storage

Increased Requirements on IT for Service Delivery

• Senior IT Management:
  – Optimize storage costs with a flat budget
• Legal Counsel:
  – Need to find the right information at the right time for legal and patent issues
• Security Officer:
  – Manage enterprise-wide information security and risk
• Compliance Officer / Records Information Manager (RIM):
  – Comply with government and corporate regulations for retention and access
• Chief Risk Officer:
  – Addressing corporate risk management
• Business users:
  – The average knowledge worker spends six hours per week searching for information*

*Sources: IDC; Kahn/Blair

Page 8: ILM Tutorial - Tiered Storage

Mitigation: ILM and Tiers of Service

Key characteristics of ILM:
– Standard configurations to form “tiers of service” (Service Catalog)
– Data classification to organize data based on information requirements

[Diagram: Information Requirements, Service Catalog Capabilities, and Standard Configurations (Gold, Silver, Bronze, DR), linked by the relationships “Defines”, “Guides”, and “Align”.]

Page 9: ILM Tutorial - Tiered Storage

Section 1 - Tiers of Service

• What are Tiers of Service?
• What technologies compose a “Tier”?
• What do you need in your Data Center?
• What is the Business Value of Tiered Services?

Page 10: ILM Tutorial - Tiered Storage

Service Catalog – a customer sample

Primary Storage

Alignment attribute      Specification                                Tier 1    Tier 2          Tier 3          Tier 4
Availability             Maximum unplanned downtime per year (mins)   < 26.5    < 26.5          < 52.5          < 263
Guaranteed Performance   Performance throughput per port (I/O sec)    5,000+    3,500 – 5,000   1,500 – 3,500   500 – 1,500
                         Response time (ms)                           < 8ms     7-14ms          12-30ms         12-30ms

Secondary Storage (three tier values given per row)

Alignment attribute      Specification                Values by tier
Performance              Response time                < 1 second    < 1 second    < 24 hours
                         Throughput                   <= 300 Mbps   <= 700 Mbps   <= 280 Mbps
Availability             Maximum downtime (year)      < 5.25 mins   < 52.56 mins  < 175.2 hours
Retention & Disposition  Retention period             < 30 years    < 10 years    < 3 years
                         Data shredding compliance    Yes           No            No
Accessibility            Read access frequency        < Hourly      > Hourly      Daily
Data Integrity           Guarantee of authenticity    Yes           No            No
Offsite                  Recovery point objective     < 1 minute    < 28 hours    < 38 hours

Operational Recovery (OR)

Alignment attribute             Specification                       Tier 1                  Tier 2                  Tier 3                     Tier 4
Recovery Granularity            Recovery granularity                Complete app. restore   Complete app. restore   File or filesys. restore   File or filesys. restore
Recovery Point Objective (RPO)  Amount of data loss                 1 hour                  24 hours                24 hours                   30 days
Recovery Time Objective (RTO)   Time to restore data                < 30 minutes            < 30 minutes            7 GB/minute                .5 GB/minute
Recoverability                  Ability to recover backed up data   100%                    100%                    98%                        95%
Retention period                Time data is retained               2 hours                 24 hours                3 Weeks                    15 months

Disaster Recovery (DR)

Alignment attribute             Specification          Tier 1      Tier 2       Tier 3        Tier 4
Recovery Point Objective (RPO)  Amount of data loss    0 minutes   < 4 hours    24-48 hours   24-48 hours
Recovery Time Objective (RTO)   Time to restore data   < 2 hours   < 12 hours   < 48 hours    < 72 hours

Courtesy of EMC2 Consulting
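The downtime limits in this sample catalog are easier to read as availability percentages. The sketch below converts an availability target into a yearly downtime budget. Pairing the Secondary Storage limits (5.25 minutes, 52.56 minutes, 175.2 hours) with 99.999%, 99.99%, and 98% is an inference from the arithmetic, not something the slide states, and the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability target."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


for target in (99.999, 99.99, 98.0):
    minutes = max_downtime_minutes(target)
    print(f"{target}% -> {minutes:.2f} min/year ({minutes / 60:.1f} hours)")
```

Stating a tier as "four nines" or as "under 52.56 minutes of downtime a year" is the same commitment. The minutes form is usually the easier one to negotiate with application owners.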

Page 11: ILM Tutorial - Tiered Storage

Storage – which technology to use?

Storage technology recommendations today are obsolete tomorrow
– Technology driven and constantly changing
– Some capabilities may be critical to decision-making
– Best practices have significant influence on success

Your data and lifecycle requirements & your existing capabilities must drive
– How many stages of storage & what technologies to choose

Page 12: ILM Tutorial - Tiered Storage

Data Protection Technologies

Technology recommendations today are still obsolete tomorrow
– Technology driven and constantly changing
– Best practices have significant influence on success

• Critical best practice strategy:
  – Separate your requirements for copies:
    • OR vs. DR vs. Archive
• Operational Recovery (OR):
  – Save the company $ with fast recovery from logical errors
  – Multiple layers of OR
  – When to fall back to DR copies
• Disaster Recovery (DR):
  – Is your business continuity plan in place?
  – Leverage OR copies for DR
• Archive:
  – Not all “archive” definitions are the same!
    • Data moved to secondary or tertiary online storage
    • Offline/offsite copy of data
    • A copy of data on backup tapes
  – What are your stakeholders’ requirements for “archive” access and protection?

From the SNIA Online Dictionary for “Archive”: A collection of data that is maintained as a long-term record of a business, application, or information state. Archives are typically kept for auditing, regulatory, analysis or reference purposes rather than for application or data recovery. http://www.snia.org/education/dictionary/a/#archive

Page 13: ILM Tutorial - Tiered Storage

What tiers do you need

• Start from your requirements!
  – Bring in ILM Professional Services to facilitate
  – Include all the information stakeholders
• Determine how many tiers of service you need
  – The fewer the better!
  – Some services are specific to your needs
• Then determine solutions to deliver
  – Including how many stages in each data lifecycle
  – Requirements for storage and protection at each stage
  – Solutions to manage the data stage transition
• Iterate and negotiate
  – Pilot the process with one application
  – Add applications with subsequent iterations

Page 14: ILM Tutorial - Tiered Storage

Business value of tiered storage

• Improve efficiency by aligning data with the most appropriate resources
  – Delay or reduce purchase of high end storage
  – Delete data if and when appropriate
• Improve service delivery by
  – Managing to requirements
    • No more nor less than what’s needed
  – Focusing on fewer configurations
• Improve scalability of personnel
• Establish a baseline for measurement

Page 15: ILM Tutorial - Tiered Storage

Section 2 - Data Classification

• Why classify data?
• Using information to drive data classification
• Aligning classified data with available resources

Page 16: ILM Tutorial - Tiered Storage

Information Classification: Gathering requirements from stakeholders

• Corporate information is simply data to the data center
  – Data is what I.T. manages: files, volumes, bits and bytes
  – Information is data with context: decisions are based on information
  – Use a collaborative process to identify information service requirements
    • Use these requirements to define an SLA
• Line of Business (LOB) information stakeholders:
  – Application performance, availability, recoverability, …
  – Staff response time, asset reporting, …
  – Cost
• Corporate information stakeholders:
  – Security officer: Secret, confidential, proprietary, …
  – Records Manager: retention time, …
  – Compliance officer: authorization, retention, …
• This process enables the IT Organization to:
  – Create a service catalog
  – Create data classification policies
  – Match data with appropriate resources based on its service requirements

[Diagram: stakeholders (App Owner, DBA, Business Process Analyst, Records Manager, Security Officer, Legal, Data Admin, IT Admins, IT Architect) arranged around “Information Classification”, with the labels “Data”, “Requirements”, and “SLA”.]
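Matching data with appropriate resources, the last outcome listed on this slide, can be pictured as a lookup against the service catalog. The sketch below uses the Primary Storage figures from the customer sample catalog on Page 10. The class names, the reduction of each range to a single bound, and the "least capable tier that still qualifies" rule are all assumptions made for illustration, not a prescribed method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_downtime_min: float  # unplanned downtime per year, in minutes
    min_iops: int            # lower bound of throughput per port
    max_response_ms: float   # upper bound of response time


@dataclass(frozen=True)
class Requirement:
    max_downtime_min: float
    min_iops: int
    max_response_ms: float


# Primary Storage values from the sample catalog (ranges reduced to one bound).
CATALOG = [
    Tier("Tier 1", 26.5, 5000, 8),
    Tier("Tier 2", 26.5, 3500, 14),
    Tier("Tier 3", 52.5, 1500, 30),
    Tier("Tier 4", 263, 500, 30),
]


def lowest_sufficient_tier(req: Requirement) -> Optional[Tier]:
    """Return the least capable tier that still meets every requirement."""
    for tier in reversed(CATALOG):  # try Tier 4 first, Tier 1 last
        if (tier.max_downtime_min <= req.max_downtime_min
                and tier.min_iops >= req.min_iops
                and tier.max_response_ms <= req.max_response_ms):
            return tier
    return None  # nothing in the catalog satisfies the request


print(lowest_sufficient_tier(Requirement(60, 1000, 30)))
```

Starting from the bottom of the catalog reflects the "no more nor less than what's needed" goal. A `None` result signals a requirement the catalog cannot meet, which is a prompt to renegotiate either the requirement or the catalog.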

Page 17: ILM Tutorial - Tiered Storage

Classifying Data

• Benefits it will provide:
  – Identify alignment of IT with business priorities
  – Identify gaps in alignment, utilization & management
  – Identify and organize data for regulatory compliance
• Benefits it could provide:
  – Reduced footprint
  – Improved environmental resource utilization
  – Data that is deleted is no longer an expense and no longer a liability
• Simplifies introduction of ITIL practices

Page 18: ILM Tutorial - Tiered Storage

How is Data Classified?

Data Classification methods:
– Classify by business process or application
  • All data assigned same classification
  • Simple; good start; a first approximation
  • Net effect: ranking of applications to service tiers
– Classify by metadata
  • Time last accessed, owner, file name, path, etc.
  • Useful for aligning data to tiers of service
    – E.g., the CEO’s email receives different service than yours
  • Or for placement of data to appropriate stage within a service tier
    – E.g., Hierarchical Storage Management (HSM) for a file server
– Classify by content
  • Content-driven alignment of data to service level requirements
  • Added value for Business Intelligence, Compliance & eDiscovery
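The three methods on this slide can be layered in one classifier. The following is a rough sketch under stated assumptions: the `FileInfo` record, the keyword list, the 180-day staleness threshold, the owner name, and the application-to-class mapping are all hypothetical. The Gold, Silver, and Bronze names are borrowed from the service catalog diagram earlier in the tutorial.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    path: str
    owner: str
    application: str
    last_accessed: datetime
    text: str = ""  # extracted content, if any


# Hypothetical rules; a real deployment would derive these from stakeholder requirements.
SENSITIVE_WORDS = ("confidential", "contract")
APPLICATION_CLASS = {"erp": "gold", "email": "silver"}
STALE_AFTER = timedelta(days=180)


def classify(item: FileInfo, now: datetime) -> str:
    """Return a service class, trying content, then metadata, then application."""
    # 1. Content: anything that looks sensitive goes to the compliance class.
    if any(word in item.text.lower() for word in SENSITIVE_WORDS):
        return "compliance"
    # 2. Metadata: owner and last-access time.
    if item.owner == "ceo":
        return "gold"
    if now - item.last_accessed > STALE_AFTER:
        return "bronze"
    # 3. Application: the first-approximation ranking.
    return APPLICATION_CLASS.get(item.application, "bronze")
```

The ordering is a design choice. Content rules come first here because a compliance obligation normally overrides cost-driven placement, while the application default acts as the fallback "first approximation" the slide describes.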

Page 19: ILM Tutorial - Tiered Storage

Section 3 – Classification Implementation Considerations

Three primary data types
• Unstructured – Files in file systems
• Semi-structured – Email messages in Email systems
• Structured – Database records

Each type presents different challenges
– Each type requires a different form of virtualization

Page 20: ILM Tutorial - Tiered Storage

Unstructured Data (Files)

It’s more than just HSM… It’s file virtualization!
• Direct access: Network or NAS
• File redirect: typically NAS
• Traditional migrate/recall

Policy-driven movement:
• Internal or external policy engines
• Age/owner type metadata
• File content

Movement can be
1. Tier-to-tier (bi-directional)
2. To compliance archive
3. To “trash”
4. To/from SAN or NAS
5. To offsite archive vaults

[Diagram: File System Access reaching Tier 1, Tier 2, Tier 3, and Tier 4 through direct access, redirect, and migrate/recall, with a Policy Engine and File system 2 / File system 3.]

Caveat: be sure your data protection solution is integrated with your storage tiers!
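An age-based placement rule, the simplest form of the policy-driven movement described on this slide, can be sketched with nothing more than file timestamps. The thresholds and destination names below are invented for illustration, and the code only decides where a file should go; it does not move anything.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400

# Hypothetical policy: (minimum days since last access, destination), oldest first.
AGE_POLICY = [
    (365, "trash"),
    (180, "tier 3"),
    (30, "tier 2"),
    (0, "tier 1"),
]


def days_since_access(path: Path, now: Optional[float] = None) -> float:
    """Age of the file in days, based on its last-access timestamp."""
    now = time.time() if now is None else now
    return (now - path.stat().st_atime) / SECONDS_PER_DAY


def destination_for(path: Path, now: Optional[float] = None) -> str:
    """Pick the first policy row whose age threshold the file has reached."""
    age = days_since_access(path, now)
    for min_days, destination in AGE_POLICY:
        if age >= min_days:
            return destination
    return "tier 1"  # clock skew can make age negative; treat as fresh
```

Last-access time is only as reliable as the file system that records it. Volumes mounted with access-time updates disabled will report stale values, so a production policy engine would likely combine it with modification time or its own access log.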

Page 21: ILM Tutorial - Tiered Storage

Semi-Structured Data (Email)

Selection criteria include
• To/From type metadata
• Age & access metadata
• Content, attachments

Actions might include
1. Create single instance store & migrate attachments
2. Send appropriate email to compliance archive
3. Delete email upon expiration

[Diagram: typical email deployment, showing Email Server, MAPI, Policy Engine, Tier 1, Tier 2, Tier 3, and Trash.]
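The single instance store mentioned among the actions on this slide keeps one physical copy of an attachment no matter how many messages carry it. One common way to do this is to key each attachment by a hash of its content. The sketch below assumes that approach and uses an in-memory dictionary in place of real tiered storage; the class and method names are illustrative.

```python
import hashlib


class SingleInstanceStore:
    """Keep one copy of each distinct attachment, keyed by content hash."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store `data` if unseen and return its SHA-256 digest as a reference."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # no-op when already present
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch the attachment a message refers to."""
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


store = SingleInstanceStore()
ref_a = store.put(b"Q1 budget spreadsheet")
ref_b = store.put(b"Q1 budget spreadsheet")  # same file sent to a second recipient
print(ref_a == ref_b, len(store))
```

Each message then holds only the short reference, so the bulky attachment can be migrated to a lower tier once without touching the mailboxes that point at it.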

Page 22: ILM Tutorial - Tiered Storage

Structured Data (RDBMS)

Represents significant challenges because:
• Content & organization is a function of the application
• Each application is different
• Must maintain transactional integrity

Three basic approaches…

[Diagram: three configurations, each with an Application above a “Transparency + Policy” layer. Two of them sit over RDBMS (Tier 1) and RDBMS (Tier 2); the third sits over a single RDBMS.]

Page 23: ILM Tutorial - Tiered Storage

Structured Data Archiving

Represents significant challenges because:
• Conversion of records to files in long term re-usable format
• Allow for import to current and future releases of RDBMS
• Allow for import to different database technologies/products

Caveat: Archiving loses database transparency

[Diagram: Application above a “Transparency + Policy” layer and an RDBMS, with Tier 1, Tier 2, and Tier 3 storage.]
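Converting records to files in a re-usable format can be as plain as exporting old rows to CSV, which most database products can import. The sketch below uses Python's built-in `sqlite3` and `csv` modules as a stand-in for a production RDBMS. The `orders` table, its columns, and the date cutoff are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def archive_old_orders(conn: sqlite3.Connection, cutoff_date: str, out_path: Path) -> int:
    """Write orders older than `cutoff_date` to CSV, then delete them.

    Dates are ISO strings (YYYY-MM-DD), so string comparison orders them correctly.
    Returns the number of archived rows.
    """
    cursor = conn.execute(
        "SELECT id, customer, order_date, total FROM orders "
        "WHERE order_date < ? ORDER BY id",
        (cutoff_date,),
    )
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()

    # Write the archive file first so a failed export never loses data.
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)  # header row keeps the file self-describing
        writer.writerows(rows)

    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff_date,))
    return len(rows)
```

This also illustrates the caveat on the slide: once the rows are deleted, the application can no longer reach them with a query, so any later access has to go through a separate import or search path.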

Page 24: ILM Tutorial - Tiered Storage

Tools

Emerging product areas…

• Automated data classification products
  – Create classification schemes
  – Based on metadata, content or both
• Policy engine
  – Service Level Management
  – Data lifecycle management
  – Managing data based on its classification
• Catalog for metadata & content
  – Extracted and maintained externally
  – Independently or as extensions of file system metadata
• Leverage data movement capabilities
  – OS, HSM, Backup, etc.
  – In band and out of band
  – Interfaces to archive devices to set retention, etc.

Page 25: ILM Tutorial - Tiered Storage

Issues to consider with automated data classification

• Metadata repository
• Scalability
• Automation of data movement
• Defining policies – information classification

Page 26: ILM Tutorial - Tiered Storage

Continue Your SNIA Education Experience At SNW

• Attend Hands-On Labs in:
  – Data Classification: Key to Service Level Management
  – Data Security and Protection: Data Assurance Solutions to Meet Corporate Requirements
  – IP Storage: iSCSI, Your IP SAN
  – Storage Management: Manage Storage or Be Managed By It
  – Storage Virtualization: Increasing Productivity
  – Zero to SAN: Fibre Channel Connectivity in No Time

Sessions begin Monday afternoon, April 16 and continue through Wednesday, April 18. All sessions in Emma/Maggie/Annie, 3rd Floor of the Hyatt Manchester. Registration at the SNW Registration area.

Page 27: ILM Tutorial - Tiered Storage

DMF’s ILM Framework for the Datacenter

[Diagram: ILM framework with the following labeled elements: Business Framework, Business Process, Define, Requirements, Business Requirements, Goals Management, Information, Applications, ILM Framework, Policies / Instrumentation / Filters, Information Management Services, Data Management Services, IT Infrastructure, Network Infrastructure, Compute Infrastructure, Storage Infrastructure.]

For more information on SNIA’s Data Management Forum (DMF) visit the DMF website at http://www.snia-dmf.org

At SNW, see:
• The Data Management Solutions Center in the Show Exhibit area
• Information Classification: The Cornerstone to Information Management
• The Secret Sauce of ILM - The Professional ILM Assessment
• Data Classification Hands On Lab

Page 28: ILM Tutorial - Tiered Storage

Please send any comments on this tutorial to SNIA: [email protected]

Many thanks to the following individuals for their contributions to this tutorial:

Edgar StPierre, Nik Simpson, Bob Rogers, Bill Pierce