Hosted by
Case Study - Storage Consolidation
Steve Curry
Yahoo Inc.
About Yahoo!
Quick Stats
• 300+ million registered users
• 2 billion page requests per day
• 25 countries, 14 languages
• 500TB data on disk
• 1PB data on tape
Yahoo! Storage Operations
Responsibilities
• All US storage administration
• Data archiving / backups
• US/Global storage architecture / standards
• 2nd tier support for global operations
• Tool development
• 24/7 global issue/outage response
• Reporting
Case Study #1 – Y! Photos/Briefcase
Case Study #1
• Online photo album
• Online file storage
Legacy Architecture
• Cheap… *repeat* cheap JBODs
• One host supporting each JBOD array
• A/B mirror for redundancy
• FreeBSD OS
• 150TB of content
• Custom apps
…Legacy Architecture
Advantages
• Low cost hardware
• Extremely distributed
Disadvantages
• Not very scalable
• Management headache
• No longer meets reliability requirements
…Legacy Architecture
Management Issues
• Management is per host (over 160 storage hosts)
• Synchronous mirror between A/B pair
• No “Hot-Swap” support
• Single spindle performance
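As a rough illustration of what a synchronous A/B mirror implies, here is a minimal Python sketch. It assumes a write is acknowledged only after both hosts in the pair hold the data. The `MirrorPair` class and its in-memory dictionaries are hypothetical stand-ins and do not represent the custom apps mentioned in the slides.

```python
class MirrorPair:
    """Hypothetical model of an A/B mirrored pair of storage hosts."""

    def __init__(self):
        # In-memory dicts stand in for the JBODs behind host A and host B.
        self.host_a = {}
        self.host_b = {}

    def write(self, key, data):
        # Synchronous mirroring (assumed behavior): the write is applied to
        # both hosts before it is acknowledged to the caller.
        self.host_a[key] = data
        self.host_b[key] = data
        return True

    def read(self, key):
        # Either copy can serve a read; fall back to B if A lacks the key.
        return self.host_a.get(key, self.host_b.get(key))
```

Under this model every write costs two host operations, and with management done per host, each of the 160+ hosts has to be looked after individually.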
[Diagram: 15 HP ProLiant DL580 G2 servers]
This… × 12! Single tier, single-spindle performance.
Consolidation Plan
NAS or SAN?
Requirements
• Reliability
• Scalability
• Reduce management overhead
Considerations
• Current hardware investment
• Application support
Network Attached Storage Solution
Management
• Filers are heavily deployed
• Smart appliance
• Suite of tools already developed for filers
Advantages
• RAID redundancy
• Multi-spindle performance
• Takes advantage of existing hardware
• Ease of application port
…Network Attached Storage Solution
Disadvantages
• Initial cost of deployment (cutover, SCSI vs. IDE)
• Lots of JBODs to get rid of! ;-)
New Architecture
• NAS solution
• FreeBSD app servers
• Load balanced
• 10 storage hosts
• Point-in-time snapshots
• Dedicated SAN backup fabric
• Distributed-farm model
[Diagram: HP ProLiant DL360 G3 servers; label: TCP/IP / NFS Traffic]
Simple 2-tier model. Scalable, redundant, multi-spindle RAID performance, hot-swap support.
Consolidation Wins!
• Cost considerations
• Performance
• Backups
• Management
• High availability
• Hot swap
Case Study #2 - Data Mining
Case Study #2
• Global data mining
• Global log collection
Current Architecture
• DAS-attached arrays
• Custom scripts
• Stacker-type tape libraries
• Single-tier disk storage
Management Issues
• Large storage host count
• Many small tape libraries
• No redundancy
• Does not scale for future requirements
Storage Requirements
• High write performance
• Data growth: 2TB per day!!
• Store data on disk for 30 days
• Archive to tape
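The growth and retention figures above imply a minimum disk footprint, which the following Python sketch works out. The two constants come from the slide. The `headroom` parameter is a hypothetical safety margin added for illustration and does not appear in the presentation.

```python
TB_PER_DAY = 2        # daily data growth, from the slide
RETENTION_DAYS = 30   # days of data kept on disk, from the slide


def disk_capacity_tb(tb_per_day, retention_days, headroom=0.0):
    """Return the disk capacity (in TB) needed to hold the retention window.

    headroom is a hypothetical safety margin: 0.25 means 25% extra space.
    """
    return tb_per_day * retention_days * (1 + headroom)


print(disk_capacity_tb(TB_PER_DAY, RETENTION_DAYS))  # 60.0
```

If growth stayed constant at this rate, about 2TB would also age out of the 30-day window and move to tape each day.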
Consolidation Considerations
• Reduce host management
• Create a multi-tier storage architecture
• Consolidate to one large tape library
• Increase write performance
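One way to picture a multi-tier policy is as a placement rule keyed on data age. The Python sketch below assumes a simple cutoff at the 30-day disk retention stated in the requirements. The `storage_tier` function and the tier names are hypothetical, and the slides do not say where the boundary between primary and nearline disk falls, so the sketch treats disk as a single tier.

```python
from datetime import date, timedelta

DISK_RETENTION = timedelta(days=30)  # "store data on disk for 30 days"


def storage_tier(created, today):
    """Return the tier for data created on `created` (hypothetical policy)."""
    age = today - created
    # Data inside the retention window stays on disk; older data is archived.
    return "disk" if age <= DISK_RETENTION else "tape"
```

For example, `storage_tier(date(2000, 1, 1), date(2000, 1, 15))` returns `"disk"`, while the same data checked on `date(2000, 3, 1)` returns `"tape"`.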
• Common Y! model
• Multi-tier storage
• Scalable
[Diagram: HP ProLiant DL360 G3 servers; labels: Primary Storage, TCP/IP / iSCSI Traffic, 2nd Tier Nearline Storage, Storage Fabric]