1© Copyright 2009 EMC Corporation. All rights reserved. Joe Staiber
EMC Next Generation Backup & Data De-Duplication
High Level Overview and Strategy
Joe Staiber
EMC Corporation
Data De-Duplication Product Manager
Backup, Recovery and Archive Division
Typical Issues with Traditional Backup
Long Backup Windows
Affecting Production
Tape Cost
License Cost
Cost to use Disk Technology
Client Licensing
VMWare Resources
VCB Infrastructure
Tape Drive Failure
Tape Read/Write Errors
Tape Drive Maintenance
Intraday Restore needs
Retention
Backup Servers
Server Cost / Licensing
VM Guest Proliferation
Off-site storage
Iron Mountain / Transport
Tape Rotation and Changes
Restore times
Restore complexity
Multiple Solutions
Remote office backup
DR / Business Continuity
GROWTH / TIME
What is Data De-Duplication? – An Analogy
How many times does the word “THE” appear in a sentence, a chapter, an entire book, a library?
Data is not unlike words in print, only instead of words, data uses strings of 1’s and 0’s.
A book may contain 4 million words but only 200,000 different words, so 3.8 million of them are repeats, some repeated hundreds or thousands of times.
The amount of de-duplication possible in your data center is in line with these numbers… It’s staggering.
“Would you rather copy 200,000 or 4 million words every day?”
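The arithmetic behind the analogy can be checked with a short sketch. The helper name `dedup_summary` and the sample sentence are illustrative assumptions; the book figures are the ones quoted on the slide.

```python
from collections import Counter

def dedup_summary(words):
    """Return (total words, unique words, total/unique ratio)."""
    counts = Counter(words)
    total = sum(counts.values())
    unique = len(counts)
    return total, unique, total / unique

# Tiny illustration: each repeated word only needs to be stored once.
sample = "the cat sat on the mat and the dog sat on the rug".split()
total, unique, ratio = dedup_summary(sample)
print(f"{total} words, {unique} unique")

# The slide's book: 4 million words, 200,000 of them distinct.
book_total, book_unique = 4_000_000, 200_000
repeats = book_total - book_unique        # words that are repeats
book_ratio = book_total / book_unique     # copy 1/20th of the words
print(repeats, book_ratio)
```

Copying only the distinct words means moving one twentieth of the text, which is the point of the question on the slide.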
How it Works
Simple Example of Global, Source Data De-duplication
[Diagram]
Data Center, First Instance (segments A B C D): only unique data segments are backed up to the De-duplication Server (stored backup data)
Remote Site 1, Duplicate Instance: data already backed up, so only unique IDs stored (20 byte pointers)
Remote Site 2, Modified Instance: new data segment (E) identified and backed up
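The three cases in the diagram can be sketched in a few lines. This is a toy model, not Avamar's implementation: the class and function names are hypothetical, SHA-1 is an assumption chosen because its digest happens to be 20 bytes (matching the "20 byte pointers" on the slide), and the choice of which segment Remote Site 2 modified is made up for illustration.

```python
import hashlib

class DedupServer:
    """Toy global store: keeps each unique segment once, keyed by its ID."""
    def __init__(self):
        self.segments = {}

    def has(self, segment_id):
        return segment_id in self.segments

    def store(self, segment_id, data):
        self.segments[segment_id] = data

def backup(server, segments):
    """Source-side backup: hash locally, send only what the server lacks.

    Returns (recipe, bytes_sent). The recipe is the ordered list of
    20-byte IDs needed to rebuild the data later.
    """
    recipe, bytes_sent = [], 0
    for data in segments:
        segment_id = hashlib.sha1(data).digest()   # 20-byte ID (assumed hash)
        if server.has(segment_id):
            bytes_sent += len(segment_id)          # pointer only
        else:
            server.store(segment_id, data)
            bytes_sent += len(segment_id) + len(data)
        recipe.append(segment_id)
    return recipe, bytes_sent

def restore(server, recipe):
    """Rebuild the original segments from a recipe of IDs."""
    return [server.segments[segment_id] for segment_id in recipe]

server = DedupServer()
A, B, C, D, E = (bytes([x]) * 1024 for x in b"ABCDE")   # five 1 KB segments

_, sent_datacenter = backup(server, [A, B, C, D])    # first instance
_, sent_site1 = backup(server, [A, B, C, D])         # duplicate instance
recipe2, sent_site2 = backup(server, [A, B, C, E])   # modified instance
print(sent_datacenter, sent_site1, sent_site2)
```

The duplicate site sends only pointers, the modified site sends pointers plus one new segment, and the server ends up holding five segments for three backups.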
Where can De-Duplication Occur?
IT’S NOT JUST IN BACKUP!!!!!!
De-Dup is theoretically possible ANYWHERE
But it comes with a price…. Processing, latency, bandwidth, and most importantly TIME
Who does the actual processing?
Storage Array?
Software?
Backup Server?
Tape Device / VTL?
[Diagram: Backup Server, De-duplication Device]
De-Duplication Concepts: Prominent Use Cases
Backup – address significant inefficiency & cost due to redundant data
– Integrated end-to-end backup software stack
– B2D H/W Target component for incumbent backup environments
Archive Applications and Platforms – efficient retention over time
– Low cost, “acceptable performance” secondary storage for mid-term retention, where regulatory compliance is not required
– As an efficiency feature in compliant archive (e.g. Centera)
Primary Storage - “Capacity Optimized” ILM tier
– Block and file for tier 2 applications
– Different performance and cost characteristics
Replication – Save bandwidth & time by moving less data
– Inherent in most storage use case solutions
– Also found in WAAS/WAFS solutions
Where is De-Duplication being applied today?
De-Duplication in PRIMARY Storage will Change the GAME !!!!
[Product portfolio graphic]
Symmetrix: DMX-4 and DMX-3, DMX-4 950
CLARiiON: CX3 UltraScale Series, AX150 (Fiber Channel and iSCSI)
Celerra: NSX, NS80, NS40, NS20, NS40G, NS80G
EMC Centera: Gen 4 LP Node
EMC Disk Library: DL210, DL4x00, DL6000
Invista
Connectrix
New: Flash Drives
Technologies like Flash drives and NAS subfile de-dup are HERE.
Different Vendors De-Dup in Different Places
Let’s look at the vendors who play in this equation
What happens when the backup application does the de-dup? (such as Commvault)
– Do we need DD or Exagrid to do it again? No, we don’t
What happens when the primary SAN does it? (NetApp & EMC Celerra)
– Do we need Commvault or DD to do it again? No, we don’t
And if they did, they would have to “un-dedup” (rehydrate) the data to even be able to read it!!!
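A minimal sketch of why a second de-dup pass adds nothing and why a downstream product must rehydrate first. The functions `dedup` and `rehydrate` are hypothetical names, and SHA-1 segment IDs are an assumption; no vendor's actual format is modeled here.

```python
import hashlib

def dedup(segments):
    """Return (store, recipe): unique segments by ID plus the ordered ID list."""
    store, recipe = {}, []
    for segment in segments:
        segment_id = hashlib.sha1(segment).digest()
        store.setdefault(segment_id, segment)   # keep the first copy only
        recipe.append(segment_id)
    return store, recipe

def rehydrate(store, recipe):
    """Expand a recipe back into the full, readable sequence of segments."""
    return [store[segment_id] for segment_id in recipe]

original = [b"A" * 512, b"B" * 512, b"A" * 512, b"A" * 512]

# First pass (say, at the primary array): 4 segments shrink to 2 unique ones.
store, recipe = dedup(original)

# Second pass over the already-unique segments finds nothing left to remove.
store_again, _ = dedup(list(store.values()))
print(len(original), len(store), len(store_again))

# Anything that needs the real data must follow the recipe back to full size.
readable = rehydrate(store, recipe)
```

The second product sees only unique segments and pointers, so it can neither shrink the data further nor interpret it without the first product's recipe.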
[Diagram: where each class of vendor de-dups]
Primary Storage (SAN/NAS): EMC, HP, NetApp, IBM, etc.
Backup Application (Data De-dupe): Symantec, Commvault, etc.
Target Device (Data De-dupe): Data Domain, Exagrid, Quantum, etc.
SOFTWARE BASED DE-DUP: Commvault / PureDisk
TARGET BASED DE-DUP: Data Domain / Exagrid
BUYER BEWARE!!!!!
EMC IS THE ONLY COMPANY THAT MANUFACTURES PRODUCTS IN EVERY SECTION OF THE DE-DUPLICATION MARKET
EMC is ready and able to leverage de-duplication across the spectrum
What happens to vendors like Data Domain and Commvault, when the data is already de-duplicated???
Other vendors see De-Dup as a product, not a technology…
[Diagram: Data De-dupe at all three tiers: Primary Storage (SAN/NAS), Backup Application, Target Device]
What is Most Impactful to You TODAY?
Backup is still the best and most efficient application for De-Dup today
It is proven and available
It is out of the production window
There are several ways to De-Duplicate data in a backup environment
But first, let’s define the backup challenge we all are facing…
[Diagram: SOURCE DE-DUPLICATION vs. TARGET DE-DUPLICATION, showing six nodes labeled B, a Backup Server, and a De-duplication Device]
Backup De-duplication – Media Impact
Avamar makes backup to disk more economical
Traditional Backup vs. EMC Avamar: Total Cumulative Storage - 10TB MS Office file data
(Traditional = weekly fulls + daily incrementals; EMC Avamar = daily fulls)
Cumulative TB by week:
Week   Traditional Backup   Traditional Backup w/Compression (2:1)   EMC Avamar
1      16                   8                                        3.2
2      32                   16                                       3.4
3      48                   24                                       3.6
4      64                   32                                       3.8
5      80                   40                                       4.0
6      96                   48                                       4.2
7      112                  56                                       4.4
8      128                  64                                       4.7
9      144                  72                                       4.9
10     160                  80                                       5.1
11     176                  88                                       5.3
12     192                  96                                       5.5
[Second chart: Traditional Backup v. EMC Avamar, Cumulative Media Required at 4 weeks and 8 weeks]
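The traditional-backup series in the chart (16 TB added every week) is consistent with a simple model: one 10 TB full plus six daily incrementals at a 10 percent daily change rate. That change rate and the six-incrementals-per-week schedule are assumptions here (the 10 percent figure matches the footnote on the Avamar Data Store slide); the function names are hypothetical, and the Avamar figure is simply read off the chart rather than modeled.

```python
def weekly_traditional_tb(primary_tb, daily_change=0.10, incrementals_per_week=6):
    """One weekly full plus daily incrementals, in TB."""
    return primary_tb + incrementals_per_week * daily_change * primary_tb

def cumulative_tb(weekly_tb, weeks, compression=1.0):
    """Cumulative storage after each week, optionally compressed."""
    return [round(weekly_tb * week / compression, 1) for week in range(1, weeks + 1)]

weekly = weekly_traditional_tb(10)                       # 10 TB of primary data
traditional = cumulative_tb(weekly, 12)
compressed = cumulative_tb(weekly, 12, compression=2.0)  # 2:1 compression

avamar_week_12 = 5.5   # value read off the chart, not derived here
ratio_week_12 = round(traditional[-1] / avamar_week_12, 1)
print(traditional[-1], compressed[-1], ratio_week_12)
```

Under these assumptions the model reproduces the 192 TB and 96 TB endpoints, and the week-12 gap between traditional backup and Avamar works out to roughly 35 to 1.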
The Avalanche
Would you rather stop the avalanche here?
Or here?
The Goal is to De-Duplicate as close to the SOURCE as possible
The Power of Avamar and De-Duplication
What has Avamar resulted in for Customers:
70 Hour backup down to 4 Hours
400 servers backed up in 5 hours over T1 or less bandwidth
Eliminated Tape
Eliminated 40 backup servers
Improved Restore times
Centralized all Backup Operations
300GB of backup stored in 10GB
99.8% de-dup rate in Windows
99% de-dup rate in SQL
10x Faster Backups
500:1 reduction in network bandwidth
50:1 reduction in backup infrastructure
Elimination of off-site tape storage
BC & Disaster Recovery:
Before:
Primary Data Storage: 50TB
Daily Cumulative: 8TB
Weekly Cumulatives: 48TB
Weekly Full Backups: 50TB
Total: 98TB
After (95+% Less):
Primary Data Storage: 50TB
Axion Daily Snapups: .5TB
Axion Weekly Snapups: 3.5TB
Weekly Full Backups: N/A
Total: 3.5TB
70 hour staged full backup window reduced to 4 hours
Cost-effective replication to two sites
“Avamar has a game changing solution. Through their innovative technology, we have been able to rethink our backup, recovery and replication infrastructure, providing Morgan Stanley with better local and remote recovery at a greatly reduced TCO.”
—Guy Chiarello, CTO/CIO, Morgan Stanley
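The "95+% Less" label follows from the two totals on the slide. A quick check, assuming the 98TB total is weekly cumulatives plus the weekly full and the 3.5TB total is seven daily snapups of .5TB (the helper name is hypothetical):

```python
def percent_less(before_tb, after_tb):
    """Percentage reduction going from before_tb to after_tb."""
    return (1 - after_tb / before_tb) * 100

traditional_weekly_tb = 48 + 50   # weekly cumulatives + weekly full backups
avamar_weekly_tb = 7 * 0.5        # assumed: seven daily snapups of .5 TB

reduction = round(percent_less(traditional_weekly_tb, avamar_weekly_tb), 1)
print(f"{reduction}% less")
```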
Expected Results for Manufacturing Co.
Backup Exec: Current Full Backups 5 TB; Backup Window 28 hrs; Media Used (1 year) = 106 TB
Avamar: Current Full Backups 5 TB; Backup Window 1.2 Hours; Media Used (1 year) = 6 TB
28 hour staged full backup window reduced to 1.2 Hours
Cost savings estimated at $23,851 for 3 Years
New Functionality, Centralized, Faster Backups, Streamlined
Avamar starts at $17k and goes up from there based on Capacity and Retention periods
Avamar Customers (Notable)
Verizon Wireless
Home Depot
The Limited
Ann Taylor
GE
Cardinal Health
Nationwide
Pepsi
AT&T
VMWare
Nexon
Travelers
Corporate Express
Wellesley College
Sterilite
CRI Technologies
Danvers Bank
CISCO
Arizona Dept of Education
PPG
Medco
Kelley Drye & Warren
Brooks Automation
Chrysler
Morriston Forester
Lexis Nexis
Kroger
Baker & McKenzie
Rob Roy
Reckitt Benckiser
Steamship Authority
La Quinta
21st Century
Oklahoma Turnpike
Churchill Downs
New Albany
Bank of New York
Dell
City of Kirkland
Univ of CA
Kiewit
Komatsu
Iowa Dept of Transportation
Duoline
Citizens Bank
Nodaway Bank
Plymouth Rock
Chadwick Martin
Auto Owners
Mile High Banks
Farallon
Eastern Regional Avamar Installed Customers (Commercial)
Montgomery County Public Schools Howard County Public Schools Country Meadow Associates Arraya Solutions IPR Evolve IP HydroGeneLogic American Healthways NetTelCos Expedient ADLCM Kirklands Retail Restaurant Services Inc SEA Medical Center DCH Informed Medical Orange Lake Resorts Welbro Construction FCCJ Seminole Community College Avocent
Debartolo Properties First Bank GPX Leesburg Regional Hospital Northside Hospital Lithonia Lighting Manatee County Palm Beach County Parker Hudson Rainer & Dobbs LLC Miles & Stockbridge Reynolds Smith and Hills Sarasota County Clerk Satilla Regional Medical Southern Bone & Joint Success For All CGI Mecklenburg County Barlowworld ABNB Federal Credit Union Wunderlich Microstrategy
Where is Avamar MOST common
Avamar is used in nearly every industry
Every type of infrastructure
Across most platforms
Its biggest Value comes in areas where backup time / bandwidth are limited:
Remote Offices / Branch Offices
Data Centers / Enterprise Backup Management
VMWare & File Systems
NAS
Remote Office Backup Via WAN
Without Avamar Clients
Challenges: WAN blockage, Poor reliability, Decentralized, Untrained IT staff
Target approach requires hardware at every site
With Avamar Clients
Advantages: Automated, Encrypted, Centralized, Outstanding ROI
[Diagram: remote sites (Data De-dupe) connected over the WAN to the De-dupe Server at the Central Data Center]
Real Example from Avamar: MD Public School System (WAN)
WR777-DATA1 (T1) Day 1 Day 2 Day 3 2-11-08 2-13-08
23.5 GB 23.5 GB 23.6 GB 23.6 GB 23.7 GB
10.59 GB 70 MB 118 MB 70 MB 47 MB
54.9% 99.7% 99.5% 99.7% 99.8%
18h:51m 53m 55m 1h:4m 50m
WR777-APPS1 (T1) Day 1 Day 2 Day 3 2-11-08 2-13-08
24.8 GB 24.8 GB 24.8 GB 24.8 GB 24.8 GB
8.28 GB 12 MB 24 MB 24 MB 12 MB
66.6% 99.9% 99.9% 99.9% 99.9%
15h:7m 7m 7m 7m 6m
OL502-DATA1 (T1) Day 1 Day 2 Day 3 2-11-08 2-13-08
20.8 GB 20.7 GB 20.7 GB 20.8 GB 20.8 GB
6.32 GB 62 MB 82 MB 104 MB 62 MB
69.6% 99.7% 99.4% 99.5% 99.7%
21h:46m 52m 52m 55m 55m
OL502-APPS1 (T1) Day 1 Day 2 Day 3 2-11-08 2-13-08
20.7 GB 20.7 GB 20.7 GB 20.7 GB 20.7 GB
5.23 GB 20 MB 20 MB 41 MB 20 MB
74.7% 99.9% 99.9% 99.8% 99.9%
15h:45m 7m 7m 7m 6m
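The percentage rows in these tables appear to be the share of each backup that did not have to cross the WAN. The sketch below assumes the four rows per server are total data protected, data actually sent, commonality, and elapsed time (the slide does not label them), and uses the nominal T1 line rate of 1.544 Mbit/s to estimate a lower bound on transfer time. Function names are hypothetical.

```python
T1_BITS_PER_SECOND = 1_544_000   # nominal T1 line rate

def commonality(total_gb, sent_gb):
    """Fraction of the backup that was already on the server."""
    return 1 - sent_gb / total_gb

def min_transfer_hours(sent_gb, bits_per_second=T1_BITS_PER_SECOND):
    """Lower bound on wire time, ignoring protocol overhead and scan time."""
    return sent_gb * 1024**3 * 8 / bits_per_second / 3600

# WR777-DATA1, Day 1 and Day 2: (total GB, GB sent)
day1 = (23.5, 10.59)
day2 = (23.5, 70 / 1024)   # 70 MB expressed in GB

day1_pct = round(commonality(*day1) * 100, 1)
day2_pct = round(commonality(*day2) * 100, 1)
day1_floor_hours = min_transfer_hours(day1[1])
print(day1_pct, day2_pct, round(day1_floor_hours, 1))
```

Under that reading the computed percentages match the 54.9% and 99.7% shown for this server, and the Day 1 wire-time floor sits below the reported 18h:51m, as it should for an initial backup over a T1.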
Virtualization Creates New Backup Challenges
OLD PARADIGM
Low overall utilization and plenty of bandwidth for backup
NEW PARADIGM
High overall server utilization, but low bandwidth for backup
Up to 95% reduction in data moved
Up to 90% reduction in backup times
Up to 50% reduction in disk impact
Up to 95% reduction in NIC usage
Up to 80% reduction in CPU usage
Up to 50% reduction in memory usage
All backups stored as “virtual full backups,” ready for immediate restore
Maintain effective consolidation ratios without over-taxing CPU utilization
Traditional moves ~200% weekly
Avamar moves ~2% weekly
Backup Built for VMware Infrastructure
Avamar Efficiently Protects Virtual Machines
AVAMAR SERVER BACKUP SOLUTIONS
AVAMAR CLIENT BACKUP SOLUTIONS
Guest VCB
Service Console
Avamar Software Avamar Virtual Edition
Avamar Data Store
EMC Avamar Solutions for VMware Infrastructure
Flexible, Fast, Efficient and Reliable Backup and Recovery
[Chart: Traditional Incremental + Full Backups (Full, Incrementals) vs. Avamar: Efficient Full Backups]
• Avamar reduces backup times by up to 90% weekly
• CPU utilization slightly higher during backup operation (~15%)
• Reduced time = weekly CPU utilization reduced by up to 85%
• Avamar backups set in “nice mode” or low priority: minimizes CPU contention
Total CPU Utilization by Event (Time Elapsed)
Lightweight Agents / Reduced CPU Utilization
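The link between a shorter backup and lower weekly CPU use can be made explicit with a simple duration-times-load model. This is only an approximation for illustration (the function name and the model itself are assumptions, not the method behind the slide's figures).

```python
def weekly_cpu_reduction(time_reduction, cpu_increase):
    """Reduction in total backup CPU-time when backups run for less time
    but at a somewhat higher CPU utilization while they run."""
    remaining = (1 - time_reduction) * (1 + cpu_increase)
    return 1 - remaining

# 90% less backup time, ~15% higher utilization during the backup
reduction = weekly_cpu_reduction(0.90, 0.15)
print(f"{reduction:.1%}")
```

With the slide's inputs the model gives a figure slightly above the quoted "up to 85%", so the slide's number is, if anything, the more conservative of the two.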
Avamar Data Store
– Multi-node configuration starts at 4 TB and scales to support up to 32 TB licensable de-duplicated capacity
– Equivalent of up to 1.1 PB of cumulative traditional disk or tape backup storage*
– Backup media requirement reduced 20–40 times
– High availability and reliability with RAIN architecture, RAID, daily integrity checks, and redundant power
Avamar Data Store, Single Node
– Supports 1 TB and 2 TB licensable de-duplicated storage capacity configurations
– Equivalent of up to 70 TB of cumulative traditional disk or tape backup storage*
– Designed for easy deployment at remote offices
– Offers fast, local recovery without dependence on a WAN connection
*Note: Equivalent traditional backup capacity assumptions: 100 percent MS Office file data, weekly full and daily incremental backups, no compression, 10 percent daily change rate, 90-day retention
EMC Avamar Data Store Gen 2
SUSTAINABLE GRID (RAIN) TECHNOLOGY
Avamar’s Major Competitive Differentiations
Who is REALLY less expensive?
Symantec PureDisk
– Software Only
– Purchase Hardware: Additional $$
– Must have NetBackup and license Agents: Additional $$
– Not Pure Source De-Dup – occurs at Media Server
– Requires HW and SW at Remote offices: Additional $$
– Does NOT significantly improve Backup times
– Requires separate backup servers: Additional $$
Data Domain
– Hardware Only
– Purchase Software: Additional $$
– Must License each Agent: Additional $$
– De-Dup Starts over with each additional box – Target Only
– Requires HW and SW at Remote offices: Additional $$
– Does NOT significantly improve Backup times
– Requires separate backup servers: Additional $$
Exagrid
– Hardware Only
– Purchase Software: Additional $$
– Must License each Agent: Additional $$
– De-Dup Starts over with each additional box – Target Only
– Requires HW and SW at Remote offices: Additional $$
– Does NOT significantly improve Backup times
– Requires separate backup servers: Additional $$
Commvault
– Software Only
– Purchase Hardware: Additional $$
– Must License each Agent: Additional $$
– Not Pure Source De-Dup – occurs at Media Server
– Requires HW and SW at Remote offices: Additional $$
– Does NOT significantly improve Backup times
– Requires separate backup servers: Additional $$
AVAMAR
– HW & SW
– All Components Included
– All Agents included
– Grid Architecture allows for true Global De-Dup – scalable by adding nodes
– No Hardware or Software required to backup remote offices
– SIGNIFICANTLY improves Backup times
– No additional servers required
Why an Integrated Solution?
HARDWARE ONLY SOLUTIONS: (Data Domain / Exagrid)
– As software is now performing the De-Duplication, the hardware de-duplication is NO LONGER required (This is now the case with Symantec, Commvault v8, and Avamar)
SOFTWARE ONLY SOLUTIONS: (Symantec, Commvault, etc)
– As primary storage arrays begin to utilize Data De-Duplication technologies, the Backup Software is not aware and its value diminishes if the data is already in a De-Duplicated state. Re-Hydration would be required. (This is already the case with NetApp and Celerra NAS based De-Dup and there are more to come)
EMC is the ONLY vendor in the De-Duplication space that manufactures Primary Storage, Backup Software and Backup Hardware.
– Regardless of where the de-duplication occurs, EMC is ready and capable to leverage and optimize it.
EMC|Avamar is the only vendor to utilize variable length segments when de-duplicating data. It will ALWAYS store less, send less and backup faster!
“What vendor do you want to make a strategic investment in?”
Ask these vendors what their strategy is, as data is already de-duplicated before it gets to their product…
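Why variable length segments matter can be shown with a toy comparison. The sketch below is not Avamar's algorithm: it is a generic content-defined chunker written for illustration, and the function names, window size, divisor, and use of SHA-1 are all assumptions. It inserts a single byte at the front of some data and measures how many segments each method can still recognize as already stored.

```python
import hashlib
import random

def chunk_id(chunk):
    return hashlib.sha1(chunk).digest()

def fixed_chunks(data, size=64):
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data, window=8, divisor=64):
    """Cut wherever a fingerprint of the last `window` bytes hits a target.

    Boundaries depend only on nearby content, so they move along with the
    data when bytes are inserted or removed earlier in the stream.
    """
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        digest = hashlib.sha1(data[end - window:end]).digest()
        if int.from_bytes(digest[:4], "big") % divisor == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])   # trailing remainder
    return chunks

def shared_fraction(old_chunks, new_chunks):
    """Share of the new chunks whose IDs are already known."""
    known = {chunk_id(c) for c in old_chunks}
    return sum(chunk_id(c) in known for c in new_chunks) / len(new_chunks)

rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20_000))
modified = b"X" + original   # one byte inserted at the front

fixed_shared = shared_fraction(fixed_chunks(original), fixed_chunks(modified))
variable_shared = shared_fraction(variable_chunks(original), variable_chunks(modified))
print(f"fixed: {fixed_shared:.1%}, variable: {variable_shared:.1%}")
```

With fixed-size segments the one-byte insert shifts every boundary, so almost nothing matches; with content-defined boundaries only the segment containing the insert changes.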
The Economics of Backup & Recovery
TOTAL COST (Traditional Backup Solution) = $166,100
TOTAL COST (Avamar) = $90,000
Avamar Example
$90,000 Investment
3 years all inclusive (HW, SW, Maint)
No recurring tape spend
All client software/agents incl.
Offsite replication included
All data retained on disk and all media included for the 3 years of retention
20% growth rate of data was factored into the system
Backup window reduced by 300%
Restore times improved
Time to first byte of restore within minutes
Traditional Backup Solution
$35,000 Investment
1 year included HW and Maint
No software, use existing. $9,000 per year in maint
$3,500 per year for new clients
$9,000 for VCB SW (vRanger) + a server
$2,700 per year for additional media
$11,500 per year in offsite costs
No data growth factored in
New media / upgrade required year 2 at $18,000. New server too?
HW maint of $12,000 per year in years 2 and 3
No significant backup improvements
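The traditional total on this slide can be reproduced from its line items. The sketch assumes each "per year" item recurs for all three years and that hardware maintenance is charged in each of years 2 and 3; the dictionary labels are paraphrases of the slide's items, not official cost categories.

```python
YEARS = 3

# Three-year cost of the traditional solution, built from the slide's items.
traditional = {
    "initial investment (HW + year-1 maint)": 35_000,
    "software maintenance": 9_000 * YEARS,
    "new client licenses": 3_500 * YEARS,
    "VCB software (vRanger)": 9_000,
    "additional media": 2_700 * YEARS,
    "offsite costs": 11_500 * YEARS,
    "new media / upgrade in year 2": 18_000,
    "HW maintenance, years 2 and 3": 12_000 * 2,
}

traditional_total = sum(traditional.values())
avamar_total = 90_000   # all-inclusive 3-year figure from the slide
difference = traditional_total - avamar_total
print(traditional_total, avamar_total, difference)
```

Under these assumptions the items add up to the traditional total shown on the slide. The sum leaves out the extra server mentioned alongside the VCB software and the possible new server in year 2, since the slide gives no price for either.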
Avamar SOLVES issues
Long Backup Windows
Affecting Production
Tape Cost
License Cost
Cost to use Disk Technology
Client Licensing
VMWare Resources
VCB Infrastructure
Tape Drive Failure
Tape Read/Write Errors
Tape Drive Maintenance
Intraday Restore needs
Backup Servers
Server Cost / Licensing
VM Guest Proliferation
Off-site storage
Iron Mountain / Transport
Tape Rotation and Changes
Restore times
Restore complexity
Multiple Solutions
Remote office backup
DR / Business Continuity
GROWTH / TIME
Intuitive, Policy-Based Management Console
In Summary
File System and VMWare benefits of Source De-Dup alone justify the investment
You can start SMALL with Avamar (single use) and grow it easily into a fully integrated enterprise solution
Source Based De-Duplication makes the Difference; beware of the values of a Target Based De-Dup
Competitive Solutions around De-Dup have value, but understand the differences. They are Band-Aids, not long-term solutions
EMC has the ONLY broad based De-Dup strategy that will grow and continue to add value as De-Dup stretches into new areas
Next Steps
Live Demos provided every FRIDAY at 11am EST
– Performed by an Avamar Engineer
– Live via Web
– Ask questions, see the product in action
Solution Sizing
– How much data is transferred in a full backup today
– % of data is FS/Exchange/DB/Images/VMWare
– Retention periods on disk
– Replication?
Avamar Virtual Demo
Configuration / Pricing / Cost Justifications
Commonality Analysis
Proof of Concept / Evaluations
where PRIMARY information lives
where TIERED information lives
where VIRTUAL information lives
where BACKUP information lives
where REPLICATED information lives
where DE-DUPLICATED data lives
where ARCHIVED data lives
where YOUR information should live