cost effective backup with deduplication · 2014-12-03 · reduces backup storage by up to 50x and...
TRANSCRIPT
© Copyright 2009 EMC Corporation. All rights reserved.
Cost Effective Backup with Deduplication
Agenda
Today’s Backup Challenges
Benefits of Deduplication
Source and Target Deduplication
Introduction to EMC Backup Solutions
– Avamar, Disk Library, and NetWorker
Question and Answer
Data growth is unavoidable
Exponential growth in backup
– Typically represents a factor of 4-30x plus production capacity
– Daily, weekly, and monthly full backups kept for months or years
New requirements to keep more data for longer periods
– Costs for management, media, and offsite storage multiply
24x7 data center reality
– No good time to run backups
– Bandwidth limitations
– Virtualization drives consolidation
[Chart: Amount of digital information created and replicated each year, 1996 to 2011. Digital information growth ≈60% CAGR; labeled values: 173 billion gigabytes and 1,773 billion gigabytes (1.773 zettabytes).]
Source: IDC White Paper, “The Diverse and Exploding Digital Universe”, March 2008, sponsored by EMC
Today’s Backup Challenges
BENEFITS OF DEDUPLICATION
“The process of detecting and identifying the unique data segments within a given set of information, enabling the elimination of redundancy when stored or moved.”
[Figure: Data Sets 1, 2, and 3 before and after deduplication. Before: total segments = 39. After: unique segments = 6.]
EMC’s Definition of Deduplication
BENEFITS OF DEDUPLICATION
Only unique data segments are backed up
First instance (May 2007): segments A, B, C, D
– Unique data stored on disk, available for immediate recovery
Duplicate instance (May 2007): segments A, B, C, D
– Data already backed up, so only a unique ID pointer is stored (20 bytes)
Modified instance (June 2008): segments B, C, D, E
– E: new data segment identified and backed up
Data Deduplication: How it Works
BENEFITS OF DEDUPLICATION
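The first, duplicate, and modified instances on this slide can be sketched as a small content-addressed store. This is a minimal illustration only, not EMC's implementation: the `DedupStore` class is hypothetical, segments are given as ready-made byte strings, and SHA-1 is assumed as the fingerprint simply because its digest is 20 bytes, the pointer size quoted on the slide.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store that keeps one copy per unique segment."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes

    def backup(self, segments):
        """Store unseen segments; return (recipe of pointers, new segment count)."""
        recipe, new_count = [], 0
        for seg in segments:
            # Assumption: SHA-1 fingerprint, which is 20 bytes long.
            fp = hashlib.sha1(seg).digest()
            if fp not in self.segments:
                self.segments[fp] = seg
                new_count += 1
            recipe.append(fp)  # duplicates cost only a pointer
        return recipe, new_count

    def restore(self, recipe):
        """Rebuild a backup from its list of pointers."""
        return [self.segments[fp] for fp in recipe]


store = DedupStore()
first, n_first = store.backup([b"A", b"B", b"C", b"D"])
duplicate, n_dup = store.backup([b"A", b"B", b"C", b"D"])
modified, n_mod = store.backup([b"B", b"C", b"D", b"E"])

print(n_first, n_dup, n_mod)   # 4 0 1
print(len(store.segments))     # 5
```

Twelve segments are backed up in total, but only five are stored; every backup remains fully restorable from its list of pointers.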
DEDUPLICATION AT SOURCE
– Client software agents identify repeated sub-file data segments at the source
– Only new, unique segments are transferred across the network and stored to disk
– Shorter backup window, reduces daily impact on physical/virtual infrastructure
DEDUPLICATION AT TARGET
– Backup application sends native data to a target storage device
– Data is deduplicated once it reaches the target, during or after the backup
– Found in VTLs or LAN B2D appliances
– Transparency to backup application offers users a “plug and play” experience
Where Can Data Deduplication Occur?
SOURCE AND TARGET DEDUPLICATION
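The practical difference between the two placements is what crosses the network. The sketch below counts bytes sent by the client in each case. It is a simplified model under stated assumptions: fixed 4 KB segments, SHA-1 fingerprints, a hypothetical one-round "send the fingerprint, then the segment if unknown" exchange, and function names invented for illustration.

```python
import hashlib


def fingerprint(segment: bytes) -> bytes:
    # Assumption: SHA-1 is used as the segment fingerprint.
    return hashlib.sha1(segment).digest()


def target_side_backup(segments, server_index):
    """Client sends every segment; the target drops duplicates on arrival."""
    sent = 0
    for seg in segments:
        sent += len(seg)
        server_index.setdefault(fingerprint(seg), seg)
    return sent


def source_side_backup(segments, server_index):
    """Client sends a fingerprint first, and the segment only if it is new."""
    sent = 0
    for seg in segments:
        fp = fingerprint(seg)
        sent += len(fp)
        if fp not in server_index:
            sent += len(seg)
            server_index[fp] = seg
    return sent


# Hypothetical data: 100 segments on day one, 2 of them changed on day two.
monday = [bytes([i]) * 4096 for i in range(100)]
tuesday = monday[:98] + [b"x" * 4096, b"y" * 4096]

target_index, source_index = {}, {}
target_side_backup(monday, target_index)
source_side_backup(monday, source_index)

target_bytes = target_side_backup(tuesday, target_index)
source_bytes = source_side_backup(tuesday, source_index)
print(target_bytes, source_bytes)  # 409600 10192
```

Both approaches end up storing the same 102 unique segments; only the source-side variant reduces the traffic of the second backup.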
IMMEDIATE DEDUPLICATION AT SOURCE
Immediate at the source: before data is sent across the network
– Data is deduplicated at source (client)
– Ideal for slow, congested infrastructure (e.g. remote offices, VMware)
– Leverages existing network links and infrastructure for fast, daily full backups
IMMEDIATE OR SCHEDULED DEDUPLICATION
Immediate: while the backup is running
– Content is deduplicated while backup happens
– Ideal for when the backup window is not a limiting design factor, and for optimizing capacity
Scheduled: after some or all backup is complete
– Content stored in original format, dedupe later
– Well-suited for optimal performance in tight backup windows
When Can Data Deduplication Occur?
SOURCE AND TARGET DEDUPLICATION
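The immediate and scheduled options trade capacity against ingest work during the backup window. A rough sketch of that trade-off follows; the `TargetAppliance` class and its method names are hypothetical and do not describe any EMC product interface.

```python
import hashlib


class TargetAppliance:
    """Toy backup target with an 'immediate' or 'scheduled' dedupe policy."""

    def __init__(self, policy):
        if policy not in ("immediate", "scheduled"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.unique = {}    # fingerprint -> deduplicated segment
        self.landing = []   # segments still held in original format

    def _dedupe(self, seg):
        self.unique.setdefault(hashlib.sha1(seg).digest(), seg)

    def ingest(self, seg):
        if self.policy == "immediate":
            self._dedupe(seg)          # hashing happens inside the backup window
        else:
            self.landing.append(seg)   # fast write now, dedupe later

    def run_scheduled_dedupe(self):
        """Process everything that was landed in original format."""
        while self.landing:
            self._dedupe(self.landing.pop())

    def bytes_on_disk(self):
        return sum(map(len, self.unique.values())) + sum(map(len, self.landing))


backup = [b"A" * 1024, b"B" * 1024] * 5   # 10 segments, 2 unique

inline = TargetAppliance("immediate")
post = TargetAppliance("scheduled")
for seg in backup:
    inline.ingest(seg)
    post.ingest(seg)

inline_bytes = inline.bytes_on_disk()
landed_bytes = post.bytes_on_disk()
post.run_scheduled_dedupe()
final_bytes = post.bytes_on_disk()
print(inline_bytes, landed_bytes, final_bytes)  # 2048 10240 2048
```

The scheduled policy temporarily needs room for the full, undeduplicated backup, which is the price of keeping hashing out of the backup window.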
These factors apply to all backup deduplication technologies
Data deduplication performance is tied to a number of factors; even small variations can have a significant impact
Factors that Impact Data Deduplication Ratios
Type of data
– Duplication in user-generated data is greater than from natural sources
– Encrypted and compressed data are not ideal candidates for dedupe
– More user created content = higher deduplication ratio
Data change rate
– Small data change rates = more duplicate data in subsequent backups
– Less change = higher deduplication ratio
Retention policy
– Longer retention increases chances data will be repeatedly backed up
– Longer retention policy = higher deduplication ratio
Ratio of full backups to incremental backups
– More full backups increase the amount of data being repeatedly backed up
– More full backups = higher deduplication ratio
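The direction of these effects can be checked with a back-of-the-envelope model. The function below is an illustrative simplification, not a formula from EMC: it assumes equal-sized full backups, a constant fraction of new unique data per backup (`change_rate`), incrementals that contain only that new data, and no compression.

```python
def dedup_ratio(backups, change_rate, fulls=None):
    """Logical data backed up divided by unique data stored.

    Sizes are expressed as multiples of one full backup.
    backups:     number of backups retained
    change_rate: fraction of new unique data per backup (assumed constant)
    fulls:       how many retained backups are fulls (default: all of them)
    """
    fulls = backups if fulls is None else fulls
    incrementals = backups - fulls
    logical = fulls + incrementals * change_rate
    stored = 1 + (backups - 1) * change_rate
    return logical / stored


print(round(dedup_ratio(30, 0.02), 2))           # 18.99  baseline: 30 daily fulls
print(round(dedup_ratio(30, 0.10), 2))           # 7.69   more change, lower ratio
print(round(dedup_ratio(90, 0.02), 2))           # 32.37  longer retention, higher ratio
print(round(dedup_ratio(30, 0.02, fulls=4), 2))  # 2.86   fewer fulls, lower ratio
```

Data type is not modeled here; in this sketch it would show up as a different effective change rate or as duplication within a single backup.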
SOURCE AND TARGET DEDUPLICATION
OFFSITE REPLICATION WITHOUT DEDUPLICATION
Without deduplication
– No reduction in local backup storage
– No reduction in replication time or bandwidth
– No reduction in offsite storage
REPLICATE AFTER DEDUPLICATION
Leveraging deduplication
– Reduced local backup storage
– Reduced replication time and bandwidth
– Reduced offsite storage
[Figure: Primary Site replicating to Remote Site, without and with backup deduplication]
Data Deduplication Impact
Remote Replication and Bandwidth Requirements
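The bandwidth effect is simple arithmetic: replication time scales with the amount of data that has to cross the link. The figures below (a 1,000 GB nightly backup, a 20:1 reduction, a 100 Mb/s WAN link) are hypothetical values chosen for illustration, and the calculation ignores protocol overhead.

```python
def replication_hours(data_gb, link_mbps):
    """Hours needed to move data_gb (decimal gigabytes) over a link_mbps link."""
    megabits = data_gb * 8 * 1000   # 1 GB = 8,000 megabits
    return megabits / link_mbps / 3600


nightly_backup_gb = 1000    # hypothetical
dedup_ratio = 20            # hypothetical 20:1 reduction
link_mbps = 100             # hypothetical WAN link

without_dedup = replication_hours(nightly_backup_gb, link_mbps)
with_dedup = replication_hours(nightly_backup_gb / dedup_ratio, link_mbps)
print(round(without_dedup, 1), round(with_dedup, 1))  # 22.2 1.1
```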
SOURCE AND TARGET DEDUPLICATION
Lowers infrastructure costs
– Reduces backup infrastructure requirements
– Reduced power, cooling, and floor space
Enables longer backup retention periods
– Less data is easier and less costly to manage
– Meets regulatory requirements
Improves data protection
– Daily full backup now achievable
– Disk-based backup also speeds restore times
Improved security
– Disk eliminates risks of lost tapes
Value of Data Deduplication for Backup-to-Disk
SOURCE AND TARGET DEDUPLICATION
Disk Library Family
– Backup-to-disk solution, now with the power of policy-based data deduplication
– Works with existing backup applications and infrastructure
– Flexible solutions from small to large environments
– High performance, direct tape creation, and HA architecture
EMC Avamar
– Complete backup and recovery solution
– Dedupes at the source and globally
– Single step recovery
– Integrated HA (RAIN)
– Flexible deployment (e.g. Data Store, virtual appliance, SW only)
EMC NetWorker
– Industry-leading backup and recovery software
– Integration with both Avamar and Disk Library
EMC Data Deduplication Backup Solutions
EMC BACKUP SOLUTIONS
• Full-featured backup solution
– Software and hardware with data deduplication
• Source-based, global data deduplication
– Reduces data at source (client)
– Reduces data globally (at backend disk)
• Fast, daily full backups
– Up to 10x faster daily full backups
– Leverages existing infrastructure
• Integrated high availability and reliability
– RAIN for high availability and fault tolerance
– Avamar server and data recoverability verified daily
• Flexible deployment options
– Avamar software
– Avamar Data Store
– Avamar Virtual Edition for VMware environments
Avamar Data Store: scalable, turnkey solution for small offices to datacenters
EMC Avamar
Complete Backup and Recovery Solution
EMC BACKUP SOLUTIONS
EMC Avamar: Real-World Results
Data Type | Amount of Primary Data Backed Up | Amount of Data Moved Daily | Daily De-duplication Ratio
Windows file systems | 3,573 GB | 6.1 GB | 586:1
Mix of Windows, Linux, and UNIX file systems | 5,097 GB | 11.7 GB | 436:1
Engineering files on NAS (NDMP backups) | 3,265 GB | 24.2 GB | 135:1
Mix of 20 percent databases, 80 percent file systems (Windows and UNIX) | 9,583 GB | 80.0 GB | 120:1
Mix of Linux file systems and databases | 7,831 GB | 104.2 GB | 75:1
Source: EMC
Avamar Daily Full Backups vs. Traditional Daily Full Backups
While results will vary by data type and mix, Avamar can dramatically improve backup performance and efficiency!
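The daily de-duplication ratios in the table appear to be the amount of primary data backed up divided by the amount of data moved daily. That reading is an inference from the published figures, not something the slide states; the short check below recomputes each ratio from the table values.

```python
# (data type, primary data backed up in GB, data moved daily in GB, stated ratio)
rows = [
    ("Windows file systems", 3573, 6.1, 586),
    ("Mix of Windows, Linux, and UNIX file systems", 5097, 11.7, 436),
    ("Engineering files on NAS (NDMP backups)", 3265, 24.2, 135),
    ("Mix of 20 percent databases, 80 percent file systems", 9583, 80.0, 120),
    ("Mix of Linux file systems and databases", 7831, 104.2, 75),
]

computed = [round(primary / moved) for _, primary, moved, _ in rows]
stated = [ratio for *_, ratio in rows]

for (name, *_), ratio in zip(rows, computed):
    print(f"{name}: {ratio}:1")

print(computed == stated)  # True
```

The ratio describes a single day's data movement, so it is not directly comparable to a storage reduction figure accumulated over a retention period.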
Avamar Success Story: Corporate Express
Before Avamar
– Storage demands were rapidly increasing
– Tape library was reaching slot capacity and upgrading was not ideal due to age and maintenance costs
– Needed to control costs and simplify data management
– Backup and disaster recovery was time consuming
With Avamar
– Reduced stored data by more than 50%, from 92 TB to 44 TB
– Achieved significant financial savings
– Enabled disk-based backups to be completed in 30 minutes, compared to 6 hours in the past for tape
– Reduced restoration times for business-critical data from 24 hours to minutes
EMC BACKUP SOLUTIONS
Time Shortened, Costs Reduced for Remote Office/Branch Office Backup
“We were blown away by the simplicity of the management interface and the comprehensive capabilities offered by Avamar. After carrying out a proof of concept, we clearly understood the benefits Avamar would bring to our business.”
Mark Jones, Technology Infrastructure Manager
• Virtual tape libraries and LAN backup-to-disk platforms
• Policy-based deduplication
• IP or SAN connectivity
• IP replication of deduplicated content
• Industry-proven CLARiiON back-end
– High performance
– 5-9s high reliability
Disk Library Family
– Up to 8 TB/hr performance
– 4–674 TB scalability
– Hardware compression
– Energy-efficiency options
– Consolidated media management
– IP or SAN connectivity
– IP replication
Data Deduplication Capabilities for All Platforms
EMC Disk Library Family
EMC BACKUP SOLUTIONS
• DL3D 1500
– 4–36 TB capacity
– Up to 720 GB/hour performance (SAN)
• DL3D 3000
– 8–148 TB capacity
– Up to 1.44 TB/hour performance (SAN)
• Policy-based data deduplication
– Select ‘Immediate’ or ‘Scheduled’ deduplication
– Optimize for storage utilization or for backup performance
• Replication of deduplicated content for HA
– Up to 10 sources to one target
– Data encryption (128-bit AES) with ability to turn on/off
DL3D 3000
– 8 Gigabit Ethernet ports for CIFS/NFS
– 4 Fibre Channel SAN ports (VTL)
– 4 TB upgrades
– 3-year Enhanced warranty
DL3D 1500
– 6 Gigabit Ethernet ports for CIFS/NFS
– 2 Fibre Channel SAN ports (VTL)
– 4 TB upgrades
– 3-year Enhanced warranty
New LAN-based backup-to-disk platforms with Data Deduplication
DL3D 1500 and DL3D 3000
EMC BACKUP SOLUTIONS
• Based on proven CLARiiON CX3-80 arrays
– Single or dual engine systems
• Over a PB usable compressed capacity
– 1 TB SATA drives; up to 930 drives
• Enhanced system throughput
– Hardware compression
– First and only end-to-end 4 Gb/s solution
• Policy-based data deduplication
– Optimize performance; reduce storage and replication costs
• Energy-efficient
– Automatic drive spin-down and low-power drives
DL4000 Series
Industry’s only virtual tape library, built from the ground up with 4 Gb/s components
Industry’s Most Popular SAN VTL: Now with Deduplication
DL4000 Series
EMC BACKUP SOLUTIONS
Disk Library Family
Success Story: Oil & Gas
EMC BACKUP SOLUTIONS
Time Shortened for Backup and Restore
Before Disk Library
– Not meeting backup windows
– Needed to speed restores
With Disk Library
– Provided flexibility and control to increase performance
– Increased overall performance to meet backup windows
– Provided simplicity, reliability, and more efficient management
– Generated significant cost savings
Oil & Gas
Disk Library 3000 policy-based deduplication provides the flexibility and control to optimize ingest performance and overall performance
EMC NetWorker
Centralized control of traditional and next-generation backup
– Combining today’s technologies with tomorrow’s in a common framework
Industry-leading global data deduplication
– Reduces backup storage by up to 50x and data moved by up to 500x; ideal for VMware environments
Broad backup to disk
– Disk library integration, replication, snapshot management, continuous data protection, and NAS backup to disk
Enterprise performance
– Secure backups and reliable recoveries
Better recoverability from tape backups
– Future-proofed Open Tape Format with better recoverability from damaged tape media
Complete Backup and Recovery from EMC
EMC BACKUP SOLUTIONS
EMC NetWorker and Deduplication
– NetWorker client and Management Console communicate with Avamar
– Avamar appears as a NetWorker dedupe node enabled via client properties
– NetWorker manages metadata and data sent to the dedupe node
[Diagram labels: NetWorker Clients; NetWorker Server and Management Console; Avamar Storage Node; Server and Storage; Disk, VTL, Tape; Dedupe Node]
NetWorker and Avamar Integration
EMC BACKUP SOLUTIONS
• Integrated deduplication
– Select source or target based on need
– Optimize dedupe for the greatest benefit
• Source using NetWorker client, integrated with Avamar
– Managed via NetWorker for client config, schedules and policies, monitoring and reporting, full indexing, etc.
• Target using EMC Disk Libraries
– DL 1500/3000 for LAN backup-to-disk or VTL
– DL 4000 and optional policy-based deduplication
Keep your backup infrastructure running smoothly with EMC Data Protection Advisor
[Diagram labels: NetWorker Clients; NetWorker; Avamar Data Store; DL3D 1500/3000; DL 4000; Disk; Tape]
EMC NetWorker and Deduplication
Manage Source and Target Data Deduplication
EMC BACKUP SOLUTIONS
NetWorker Success Story: Retail
EMC BACKUP SOLUTIONS
NetWorker integrated with Avamar provides an efficient solution for the centralized management of data deduplication and backup
Retail
Centralized Backup Management, Increased Efficiencies, Reduced Costs
Before NetWorker
– 85% of the environment was virtualized on VMware
– Restores unreliable and difficult to manage
With NetWorker
– Provided integration with Avamar to deduplicate the VMware environment, reducing the size of file system backups
– Offered centralized backup management
– Saved money by increasing FTE efficiency
• Depends on:
– Application and data type
– Service-level requirements
– Current backup challenges and environment
• EMC tools are available to help you understand the benefits of each solution
– Deduplication analyzer tools
– TCO tools
– Backup, e-mail, and file system assessments
Let Us Help You Determine the Right Solution
Which EMC Deduplication is Right for You?
EMC BACKUP SOLUTIONS
• Comprehensive, integrated set of deduplication solutions
– Avamar, Disk Library, NetWorker…
– Saves money and drives efficiencies throughout backup recovery lifecycle
• Only vendor that can deliver a deduplication solution for any customer need
– From refresh-to-redesign of the backup and recovery infrastructure
– Tailored to the size of your company, specific need, and budget
Talk to your EMC or partner representative to share your backup requirements
Deduplication is a Differentiator
Why EMC for Backup-to-Disk with Deduplication
EMC BACKUP SOLUTIONS