U.S. Department of the Interior
U.S. Geological Survey
Contractor for the USGS at the EROS Data Center
EDC CR1 Storage Architecture
August 2003
Ken Gacke, Systems Engineer
(605) [email protected]

Storage Architecture Decisions
Evaluated and recommended through engineering white papers and weighted decision matrices
Requirements Factors
  Reliability – Data Preservation
  Performance – Data Access
  Cost – $/GB, Engineering Support, O&M
  Scalability – Data Growth, Multi-mission, etc.
  Compatibility with current Architecture
Program/Project selects best solution
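
A weighted decision matrix of the kind mentioned on this slide can be sketched roughly as follows. The factor names follow the slide, but the weights, the 1-to-5 scores, and the option names are hypothetical values chosen only for illustration; they are not taken from the EDC white papers.

```python
# Hypothetical weights per requirement factor (must sum to 1.0).
WEIGHTS = {
    "reliability": 0.30,
    "performance": 0.25,
    "cost": 0.20,
    "scalability": 0.15,
    "compatibility": 0.10,
}

# Hypothetical scores on a 1 (poor) to 5 (excellent) scale.
SCORES = {
    "option_a": {"reliability": 3, "performance": 5, "cost": 4,
                 "scalability": 2, "compatibility": 4},
    "option_b": {"reliability": 4, "performance": 2, "cost": 4,
                 "scalability": 3, "compatibility": 4},
    "option_c": {"reliability": 5, "performance": 4, "cost": 2,
                 "scalability": 5, "compatibility": 3},
}


def weighted_score(scores, weights):
    """Sum of weight * score over every requirement factor."""
    return sum(weights[factor] * scores[factor] for factor in weights)


def rank(options, weights):
    """Return (option, total) pairs sorted from best to worst total."""
    totals = {name: weighted_score(s, weights) for name, s in options.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for name, total in rank(SCORES, WEIGHTS):
    print(f"{name}: {total:.2f}")
```

The matrix only ranks the candidates; as the slide notes, the Program/Project still makes the final selection.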

Storage Technologies
Online Storage Characteristics
  Immediate Data Access
  Server Limitations
    Number of I/O slots
    System Bandwidth
  Cost is Linear
    High Performance RAID – $30/GB using 146GB drives
    Low Cost RAID – $5/GB using ATA or IDE Drives
    Non RAID – Less than $5/GB using 146GB drives
  Facility Costs
    Disk drives are always powered up
    Increased cooling requirements
  Life cycle of 3 to 4 years
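
Because online storage cost is described as linear, purchase cost can be estimated as capacity times the per-gigabyte price. The sketch below uses the two exact $/GB figures quoted on the slide; the decimal conversion (1 TB = 1000 GB) and the function name are assumptions, and facility costs and the 3 to 4 year replacement cycle are not modeled.

```python
# Per-gigabyte prices quoted on the slide (2003 figures).
COST_PER_GB = {
    "high_performance_raid": 30.0,
    "low_cost_raid": 5.0,
}

GB_PER_TB = 1000  # assumption: decimal terabytes


def purchase_cost(capacity_tb, tier):
    """Linear cost model: capacity in GB times the price per GB."""
    return capacity_tb * GB_PER_TB * COST_PER_GB[tier]


for tb in (5, 10, 20):
    print(tb, "TB:", purchase_cost(tb, "high_performance_raid"),
          purchase_cost(tb, "low_cost_raid"))
```

Doubling the capacity doubles the cost in this model, which is what "linear" implies.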

Storage Technologies
Online Storage
  Direct Attach Storage (DAS)
    Storage directly attached to server
  Network Attach Storage (NAS)
    TCP/IP access to storage typically with CIFS and NFS access
  Storage Area Network (SAN)
    Dedicated high speed network connecting storage devices
    Storage devices disassociated from server

Storage Technologies
Direct Attach Online Storage
  Disk is direct attached to single server
  System Configuration
    SCSI or Fibre Channel RAID
    Fibre Channel devices are typically SAN ready
    Just a Bunch of Disks (JBOD)
    Redundant Array of Independent Disks (RAID)
  High Performance on the local server
  Manageability
    Simple Configuration
    Resource reallocation requires physical move of controllers and disk

Storage Technologies
Direct Attach Online Storage
  Advantages
    High performance on local server
    Good for image processing and database applications
  Disadvantages
    Data sharing limited to slower network performance
    Difficult to reallocate resources to other servers

Storage Technologies
[Diagram: Direct Attached. Host A, Host B, and Host C, each with its own File System. Labels: 100Mb Network (FTP/NFS), 100MB FC]
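
The gap between the two link labels in the direct-attached diagram can be illustrated with a rough transfer-time estimate. This sketch assumes nominal line rates with no protocol overhead, reads "100Mb Network" as 100 megabits per second and "100MB FC" as 100 megabytes per second, and uses decimal units; none of these assumptions is stated on the slide.

```python
# Nominal rates in megabytes per second (assumed readings of the labels).
NETWORK_MB_PER_S = 100 / 8   # 100 Mb/s network = 12.5 MB/s
FC_MB_PER_S = 100.0          # "100MB FC" read as 100 MB/s


def transfer_seconds(size_gb, rate_mb_per_s):
    """Time to move size_gb gigabytes at a sustained rate, decimal units."""
    return size_gb * 1000 / rate_mb_per_s


size_gb = 10
print("network:", transfer_seconds(size_gb, NETWORK_MB_PER_S), "s")
print("fibre channel:", transfer_seconds(size_gb, FC_MB_PER_S), "s")
```

Under these assumptions, sharing data between hosts over the network is about eight times slower than local access to the attached disk.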

Storage Technologies
NAS Online Storage
  Disk attached on server accessible over TCP/IP Network
  System Configuration
    Fibre Channel RAID Configurations
    Switched Network Environment
  Performance
    Network Switches and/or dedicated network topologies
  Reliability
    NAS Server performs a single function thereby reducing faults
    RAID, Mirror, Snapshot capabilities
  Easy to Manage

Storage Technologies
Network Attach Online Storage
  Advantages
    Easy to share files among servers
    Network Storage supports NFS and CIFS
    Servers can use existing network infrastructure
    Good for small file sharing such as office automation
    Availability of fault protection such as snapshot and mirroring
  Disadvantages
    Slower performance due to TCP/IP overhead
    Increases network load
    Backup/Restore to tape may be difficult and/or slow
    Does not integrate with nearline storage

Storage Technologies
[Diagram: Network Attached. Host A, Host B, and Host C, each with a File System, Share Files through a NAS Server. Label: 1Gb Network (NFS/CIFS)]

Storage Technologies
SAN Online Storage
  Disk attached within Fabric Network
  System Configuration
    Fibre Channel RAID Configurations
  Scalable High Performance
  High Reliability with redundant paths
  Manageability
    Configuration becomes more complex
    Logical reallocation of resources

Storage Technologies
[Diagram: Redundancy SAN Configuration. Host A, Host B, Host C. Labels: 100Mb Network, two FibreSwitch units, (DMF)]

Storage Technologies
SAN Online Storage Architecture
  Disk Farm
    Multiple servers share large disk farm
    Server mounts unique file systems
  Clustered File Systems
    Multiple servers share a single file system
    Software Required – Vendor solutions include
      SGI CXFS
      ADIC StorNext File System
      Tivoli SANergy

Storage Technologies
[Diagram: Disk Farm SAN Configuration. Host A, Host B, Host C. Labels: 100Mb Network, FibreSwitch, Logical reallocation of disk]

Storage Technologies
[Diagram: Cluster SAN Configuration. Host A, Host B, Host C. Labels: 100Mb Network, FibreSwitch, CXFS, Clustered File System]

Storage Technologies
SAN Risks
  Cost is higher than DAS/NAS
  Technology Maturity
    Solutions are typically vendor specific
    Application software dependencies
  Infrastructure Support
    Complexity of Architecture
    Management of SAN Resources
    Sharing of storage resources across multiple Programs/Projects

Storage Technologies
SAN Benefits
  Administration flexibility
    Logically move disk space among servers
    Large capacity drives can be sliced into smaller file systems
    Scales better than direct attach
    Integrate within nearline configuration
  Data Reliability
    Storage disassociated from the server
    Fault Tolerant with Redundant Paths
  Increase Resource Utilization
    Reduce the number of FTP network transfers
    Logically allocate space among servers
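
The idea of logically moving disk space among servers can be pictured with a toy bookkeeping model: a shared pool tracks how much capacity each host holds, and reallocation changes the bookkeeping rather than moving hardware. The `DiskFarm` class and host names below are hypothetical and do not represent any SAN vendor's management interface.

```python
class DiskFarm:
    """Toy model of a shared SAN disk pool with per-host allocations."""

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.allocations = {}  # host name -> allocated GB

    @property
    def free_gb(self):
        """Capacity in the pool not yet assigned to any host."""
        return self.total_gb - sum(self.allocations.values())

    def allocate(self, host, size_gb):
        """Assign unallocated pool capacity to a host."""
        if size_gb <= 0:
            raise ValueError("size must be positive")
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        self.allocations[host] = self.allocations.get(host, 0) + size_gb

    def reallocate(self, src, dst, size_gb):
        """Move capacity from one host to another without touching hardware."""
        if size_gb <= 0:
            raise ValueError("size must be positive")
        if self.allocations.get(src, 0) < size_gb:
            raise ValueError("source host does not hold that much capacity")
        self.allocations[src] -= size_gb
        self.allocations[dst] = self.allocations.get(dst, 0) + size_gb


farm = DiskFarm(total_gb=1000)
farm.allocate("host_a", 600)
farm.allocate("host_b", 300)
farm.reallocate("host_a", "host_b", 200)
print(farm.allocations, farm.free_gb)
```

With direct attached storage the same change would require physically moving controllers and disk, which is the contrast the slides draw.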

Storage Technologies
[Diagram: SAN with Nearline Configuration. Host A, Host B, Host C. Labels: 1Gb Network, FibreSwitch, CXFS, DMF/CXFS, Clustered File System, Tape Library]

Online/Nearline Cost Comparison
[Chart: 5yr Cost (1000s), axis 0 to 4000, by capacity (5TB, 10TB, 20TB, 40TB, 80TB). Series: Perf RAID, Bulk RAID, PH 9840C, PH 9940B]
Use of Existing Infrastructure (CR1 Silo)

Storage Technologies
Bulk RAID Storage Considerations
  Manageability
    Server connectivity constraints
    Many “islands” of storage
    Multiple storage management utilities
    Multiple vendor maintenance contracts
  Data Reliability
    Loss of online file system requires full restore from backup
    On average, could restore one to two terabytes per day
  Performance
    Multiple user access will reduce performance
  Life Cycle
    Disk storage life cycle shorter than tape technologies
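
The restore figure on this slide translates directly into an outage estimate for a lost file system. The sketch below uses the slide's one to two terabytes per day as the slow and fast bounds; the function names and the 10 TB example size are hypothetical.

```python
def restore_days(filesystem_tb, restore_tb_per_day):
    """Days needed for a full restore at a sustained daily rate."""
    return filesystem_tb / restore_tb_per_day


def restore_window(filesystem_tb, slow_tb_per_day=1.0, fast_tb_per_day=2.0):
    """Best-case and worst-case restore time in days, as a (best, worst) pair."""
    return (restore_days(filesystem_tb, fast_tb_per_day),
            restore_days(filesystem_tb, slow_tb_per_day))


best, worst = restore_window(10)
print(f"10 TB file system: {best} to {worst} days to restore")
```

At these rates a large bulk RAID file system stays unavailable for days after a loss, which is why the slide lists restore time under data reliability.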

Storage Technologies
SAN Nearline Storage
  Data Access
    Data stored on infinite file system
    Immediate access to data residing on disk cache
    Delayed access for data retrieved from tape
  Access via LAN using FTP/NFS
  Access via SAN Clustered File System
    SGI DMF/CXFS Server
    SGI, SUN, Linux, NT clients
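
The difference between immediate access from disk cache and delayed access from tape can be shown with a toy model of a nearline store. This is only an illustration: the `NearlineStore` class is hypothetical, eviction here is simple first-in first-out by staging order, and nothing in it reflects how DMF itself chooses files to migrate or recall.

```python
from dataclasses import dataclass, field


@dataclass
class NearlineStore:
    """Toy nearline store: every file lives on tape, some also in disk cache."""
    cache_capacity_gb: float
    on_tape: dict = field(default_factory=dict)   # file name -> size in GB
    in_cache: dict = field(default_factory=dict)  # file name -> size in GB

    def archive(self, name, size_gb):
        """Write a file: it is copied to tape and left in the disk cache."""
        self.on_tape[name] = size_gb
        self._stage(name, size_gb)

    def read(self, name):
        """Return 'cache' for immediate access, 'tape' if a recall was needed."""
        if name in self.in_cache:
            return "cache"
        size_gb = self.on_tape[name]  # KeyError if the file was never archived
        self._stage(name, size_gb)
        return "tape"

    def _stage(self, name, size_gb):
        """Place a file in cache, evicting the oldest staged files to make room."""
        if size_gb > self.cache_capacity_gb:
            raise ValueError("file is larger than the disk cache")
        self.in_cache.pop(name, None)
        while sum(self.in_cache.values()) + size_gb > self.cache_capacity_gb:
            oldest = next(iter(self.in_cache))
            del self.in_cache[oldest]
        self.in_cache[name] = size_gb


store = NearlineStore(cache_capacity_gb=100)
store.archive("scene_a", 60)
store.archive("scene_b", 60)   # evicts scene_a from the cache
print(store.read("scene_b"))   # still on disk cache
print(store.read("scene_a"))   # must be recalled from tape
```

From the client's point of view the file system appears larger than the disk behind it; only the access latency reveals whether a file was on disk or tape.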

Storage Technologies
SAN Cluster Proposal
  Mass Storage System & Product Distribution System (PDS)
  Limit Exposure to Risk
    Servers are homogeneous
    Implement with a single dataset
    Data is file oriented
    Data is currently transferred via FTP
  Anticipated Benefits
    Improved performance
    Reduce total disk capacity requirements
    Experience for future storage solutions

Current DMF/SAN Configuration
[Diagram: DMF Server, Product Distribution, CXFS SAN Storage. Link labels: 1Gb Fibre, 2Gb Fibre]
Tape Drives
  8x9840
  2x9940
Disk Cache
  /dmf/edc 68GB
  /dmf/doqq 547GB
  /dmf/guo 50GB
  /dmf/pds 223GB
  /dmf/pdsc 547GB
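
For a quick sense of scale, the disk cache sizes listed on this slide can be totaled and expressed as shares of the whole. The sizes are copied from the slide; the helper names are hypothetical.

```python
# Disk cache file systems and sizes in GB, as listed on the slide.
DISK_CACHE_GB = {
    "/dmf/edc": 68,
    "/dmf/doqq": 547,
    "/dmf/guo": 50,
    "/dmf/pds": 223,
    "/dmf/pdsc": 547,
}


def total_cache_gb(cache):
    """Total disk cache across all file systems."""
    return sum(cache.values())


def cache_shares(cache):
    """Fraction of the total cache held by each file system."""
    total = total_cache_gb(cache)
    return {path: size / total for path, size in cache.items()}


print(total_cache_gb(DISK_CACHE_GB), "GB total")
for path, share in cache_shares(DISK_CACHE_GB).items():
    print(f"{path}: {share:.1%}")
```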

CR1 Mass Storage System
[Chart: Nearline Data Storage. Y axis: Terabytes Stored, 0 to 48. X axis: Dec-93 to Dec-02]

CR1 Mass Storage System
[Chart: Nearline Data Storage by Data Type. Y axis: Terabytes Stored, 0 to 24. X axis: Dec-93 to Dec-03. Series: General, Archive, Ortho, PDS]

CR1 Mass Storage System
[Chart: Nearline Data Storage. Y axis: Terabytes Stored, 0 to 100. X axis: Dec-93 to Dec-04. Series: General, Archive, Ortho, PDS, Total]

CR1 Mass Storage
[Chart: Nearline Monthly Average Data Archive/Retrieve. Y axis: Terabyte Per Month, 0 to 9. X axis: 1993 to 2003. Series: Data Archived, Data Retrieved]

CR1 Mass Storage
[Chart: Nearline Average Transfer Rate. Y axis: MB/Sec, 0 to 8. X axis: 1994 to 2003. Series: Data Archived, Data Retrieved]

CR1 Mass Storage
[Chart: Largest Single Day Data Transfers. Y axis: Gigabyte, 0 to 1400. X axis: 1996, 1999, 2002, 2003. Series: Data Archived, Data Retrieved. Annotation: Av 12.1MB/sec]
Description
  1996 – 3490, pre DOQQ
  1999 – D-3, DOQQ
  2002 – 9840, DOQQ
  2003 – 9840/9940, UA/AVHRR
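
The "Av 12.1MB/sec" annotation can be related to the gigabyte scale of the chart by converting a sustained rate into a daily volume. The slide does not say which bar or time window the average refers to, so the sketch below assumes a rate sustained over a full 24 hours and decimal units (1 GB = 1000 MB).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gb(rate_mb_per_s):
    """Gigabytes moved in 24 hours at a sustained rate, decimal units."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1000


print(f"{daily_volume_gb(12.1):.0f} GB per day at 12.1 MB/sec")
```

Under that assumption, 12.1 MB/sec sustained for a day comes to a little over one terabyte.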

CR1 DMF FY04 Budget
Description                         Estimated Cost
StorageTek Maintenance                  $41,000.00
SGI Maintenance (O300, DMF/SAN)         $22,000.00
Sun Maintenance                          $1,300.00
ITS Charges (Labor, Legato)             $20,000.00
Infrastructure Upgrades                 $41,700.00
Project Staff                           $64,000.00
Total                                  $190,000.00
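
The budget line items can be checked against the stated total with a short script. The amounts are copied from the slide; using `Decimal` for currency and the variable names are choices made for this sketch.

```python
from decimal import Decimal

# FY04 line items as listed on the slide, in US dollars.
BUDGET = {
    "StorageTek Maintenance": Decimal("41000.00"),
    "SGI Maintenance (O300, DMF/SAN)": Decimal("22000.00"),
    "Sun Maintenance": Decimal("1300.00"),
    "ITS Charges (Labor, Legato)": Decimal("20000.00"),
    "Infrastructure Upgrades": Decimal("41700.00"),
    "Project Staff": Decimal("64000.00"),
}

STATED_TOTAL = Decimal("190000.00")

computed_total = sum(BUDGET.values(), Decimal("0"))
print(f"computed ${computed_total:,.2f}, stated ${STATED_TOTAL:,.2f}")
```

The line items sum to the $190,000.00 shown on the slide.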

Storage Technologies
Multi Tiered Storage Vision
  Online
    Supported Configurations
      DAS – Local processing such as image processing
      NAS – Data sharing such as office automation
      SAN – Production processing such as product generation
    Data accessed frequently
  Nearline
    Integrated within SAN
    Scalable for large datasets and less frequently accessed data
    Multiple Copies and/or Offsite Storage

Storage Technologies
SAN – Final Thoughts
  SAN Technology Maturity
    SAN solution should be from a single vendor
  Program/Project SAN solution benefits
    + Decrease storage requirements
    + Increase performance
    + Increase reliability
    + Increase flexibility of resource allocations
    - Increase cost (hardware/software)
    - Increase configuration complexity