Mass Storage & Information Retrieval

Paul J Mazzotte, Union University, April 02, 2004


Page 1: Mass Storage & Information Retrieval

Mass Storage & Information Retrieval

Paul J Mazzotte, Union University

April 02, 2004

Page 2: Mass Storage & Information Retrieval

Agenda

Background
– RAID and JBOD
– SCSI and FC

Storage Paradigms
– DAS (Direct Attached Storage)
– NAS (Network Attached Storage)
– SAN (Storage Area Networks)
– Performance and Cost – NAS vs SAN

Storage and Backup
– Backup Software
– Tape Technologies
– DAS and Backup
– SAN and Backup

What’s Next

Page 3: Mass Storage & Information Retrieval

Background

Page 4: Mass Storage & Information Retrieval

RAID and JBOD

Page 5: Mass Storage & Information Retrieval

RAID and JBOD

JBOD: “Just a Bunch Of Disks”
– Drives independently attached to the I/O channel
– Scalable, but requires the server to manage multiple volumes
– Does not provide protection in case of failure

RAID: “Redundant Array of Inexpensive Disks”
– Fault-tolerant grouping of disks that the server sees as a single volume
– Combination of parity-checking, mirroring, and striping
– Self-contained, manageable unit of storage

Page 6: Mass Storage & Information Retrieval

RAID

Multiple RAID levels to choose from:
– 0, 1, 2, 3, 4, 5, 6, 10

Each level has certain inherent advantages and disadvantages.
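One of the trade-offs between levels is disk overhead. The sketch below estimates usable capacity for a few common levels using their conventional definitions (two-way mirroring for RAID 1 and 10, one parity drive's worth for RAID 5, two for RAID 6). The function name, the minimum-drive table, and the formulas are assumptions for illustration, not taken from the slides.

```python
# Conventional minimum drive counts, assumed here for illustration.
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level: str, drives: int, drive_gb: float) -> float:
    """Return usable capacity in GB for an array of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        data_drives = drives          # striping only, no redundancy
    elif level in ("1", "10"):
        data_drives = drives / 2      # every block is stored twice
    elif level == "5":
        data_drives = drives - 1      # one drive's worth of parity
    else:  # level == "6"
        data_drives = drives - 2      # two drives' worth of parity
    return data_drives * drive_gb


for lvl in ("0", "1", "5", "6", "10"):
    n = max(MIN_DRIVES[lvl], 4)
    print(f"RAID {lvl}: {n} x 100 GB -> {usable_capacity(lvl, n, 100):.0f} GB usable")
```

With four 100 GB drives, mirroring keeps half the raw capacity while single parity keeps three quarters, which is why mirroring is usually described as having the highest disk overhead.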

Page 7: Mass Storage & Information Retrieval

RAID Levels

RAID 0

• Data is subdivided and each division is written to a different disk drive.

• Advantages – Performance when multiple controllers are used

• Disadvantages – Not a true RAID (no redundancy)

• Minimum 2 drives

RAID 1

• Data is written to two different drives.

• Advantages – 100% redundant; 1 write, 2 reads possible

• Disadvantages – Highest disk overhead

• Minimum 2 drives
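The difference between subdividing data across drives and writing it to two drives can be sketched in a few lines. This is a toy model under stated assumptions: the 4-byte chunk size and the function names are made up for illustration, and real controllers work on disk blocks rather than Python byte strings.

```python
def stripe(data: bytes, num_disks: int, chunk: int = 4) -> list[bytes]:
    """Striping: deal fixed-size chunks round-robin across the disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), chunk):
        disks[(offset // chunk) % num_disks] += data[offset:offset + chunk]
    return [bytes(d) for d in disks]


def mirror(data: bytes, num_disks: int = 2) -> list[bytes]:
    """Mirroring: every disk holds a full copy of the data."""
    return [data for _ in range(num_disks)]


payload = b"ABCDEFGHIJKLMNOP"
striped = stripe(payload, num_disks=2)
mirrored = mirror(payload)

print(striped)   # each disk holds half of the payload
print(mirrored)  # each disk holds all of the payload
```

Losing one striped disk loses half of every file, while losing one mirrored disk loses nothing; the price is that the mirror stores twice as many bytes as the payload.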

Page 8: Mass Storage & Information Retrieval

RAID Levels

RAID 5

• Each entire data block is written on a data disk; parity for blocks in the same rank is generated on writes, recorded in a distributed location, and checked on reads.

• Advantage – High read, medium write performance

• Disadvantages – Rebuild time (compared to RAID 1)

• Minimum 3 drives

RAID 3

• The data block is subdivided ("striped") and written on the data disks. The stripe parity is generated on writes, recorded on the parity disk, and checked on reads.

• Advantage – Medium read, high write performance

• Disadvantages – Rebuild time (compared to RAID 1)

• Minimum 3 drives
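Parity in these levels is conventionally a byte-wise XOR across the data blocks of a stripe, which is also why a rebuild has to read every surviving drive. The following is a minimal sketch of that idea; the block contents and the helper name are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe, plus the parity generated on write.
data_blocks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = xor_blocks(data_blocks)

# Simulate losing the second drive and rebuilding its block from the
# surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

The same XOR serves both a dedicated parity disk and distributed parity; the two layouts differ only in which drive stores the parity block for a given stripe.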

Page 9: Mass Storage & Information Retrieval

SCSI and FC

Page 10: Mass Storage & Information Retrieval

SCSI

Version             Databus           Speed                   Cable
1 (1986)            8 bit             5 MB/s (slow)           6 meters
2 (1994)            8 bit (narrow)    10 MB/s (fast)          25 meters
                    16 bit (wide)     20 MB/s                 25 meters
3 [Ultra] (1995)    8 bit             20 MB/s (fast-20)       25 meters
                    16 bit            40 MB/s                 25 meters
[Ultra-2] (1998)    8 bit             40 MB/s (fast-40)       25 meters
                    16 bit            80 MB/s                 25 meters
[Ultra-3] (1999)    8 bit             80 MB/s (fast-80)       25 meters
                    16 bit            160 MB/s (ultra-160)    25 meters
[Ultra-4] (2003)    8 bit             160 MB/s (fast-160)     25 meters
                    16 bit            320 MB/s (ultra-320)    25 meters
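The speeds in the table follow a simple pattern: bytes moved per transfer times transfers per second. The sketch below reproduces a few rows, reading "fast-N" as N million transfers per second; that reading and the function name are assumptions made for illustration.

```python
def bus_bandwidth_mb_s(bus_width_bits: int, mega_transfers_per_s: int) -> int:
    """Peak bandwidth: bytes per transfer times million transfers per second."""
    return (bus_width_bits // 8) * mega_transfers_per_s


# (bus width in bits, million transfers per second) for a few table rows.
examples = {
    "SCSI-1, 8 bit": (8, 5),
    "Fast-20, 8 bit": (8, 20),
    "Fast-20, 16 bit": (16, 20),
    "Fast-80, 16 bit (ultra-160)": (16, 80),
    "Fast-160, 16 bit (ultra-320)": (16, 160),
}

for name, (width, rate) in examples.items():
    print(f"{name}: {bus_bandwidth_mb_s(width, rate)} MB/s")
```

Doubling either the bus width or the transfer rate doubles the peak figure, which is how each generation in the table reaches twice the speed of the previous one.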

Page 11: Mass Storage & Information Retrieval

Fibre Channel

Topologies (diagram; each link is labeled 200 MB):

Point-to-Point

Arbitrated Loop

Switched Fabric

Page 12: Mass Storage & Information Retrieval

SCSI and FC

                      Fibre Channel     Fibre Channel AL    Parallel SCSI
Connections           16 Million        126                 15
Distance              10 km             10 km               25 m
Bandwidth             200 MB/s          200 MB/s            320 MB/s
Bandwidth is          Per connection    Shared              Shared
Hot Plug              Yes               Yes                 No
Multiple Protocols    Yes               Yes                 No

Page 13: Mass Storage & Information Retrieval

NOT SCSI vs FC

Fibre Channel protocol stack (diagram):

ULP (Upper Level Protocol): SCSI-3, IP, ATM
FC-4: SCSI-3 Command Set Mapping; IPI-3 Command Set Mapping (IPI-3 STD); FC-LE (FC Link Encapsulation); FC-ATM
FC-3: Common Services
FC-2: Framing Protocol
FC-1: Encode / Decode (8B/10B Encoding)
FC-0: Physical Variant (Copper, Optical)

FC-0 through FC-2: Fibre Channel Physical & Signaling Interface (FC-PH, FC-PH2, FC-PH3)
Arbitrated loop: FC-AL, FC-AL-2
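The 8B/10B encoding named at the FC-1 layer sends 10 line bits for every 8 data bits, so a fifth of the raw line rate is encoding overhead. The sketch below works through that arithmetic. The 2.125 Gbaud line rate is an assumption (it is the signalling rate commonly quoted for 2 Gbit/s Fibre Channel and is not stated on the slide), and the function name is illustrative.

```python
def payload_mb_s(line_rate_gbaud: float) -> float:
    """Data rate in MB/s left after 8B/10B encoding overhead."""
    data_bits_per_s = line_rate_gbaud * 1e9 * 8 / 10  # 8 data bits per 10 line bits
    return data_bits_per_s / 8 / 1e6                  # bits -> bytes -> megabytes


# Assumed line rate for 2 Gbit/s Fibre Channel; not taken from the slide.
rate = payload_mb_s(2.125)
print(f"{rate:.1f} MB/s before framing overhead")
```

Under that assumption the result is a little above 200 MB/s; frame headers and inter-frame gaps plausibly account for the rest of the gap to the nominal 200 MB/s figure.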

Page 14: Mass Storage & Information Retrieval

Storage

Page 15: Mass Storage & Information Retrieval

DAS, NAS, and SAN

Page 16: Mass Storage & Information Retrieval

DAS

Diagram labels: Client Workstations; LAN; File I/O (NFS/CIFS); File Server(s); Application Server(s); Block I/O (SCSI/FC-AL)

Definition: DAS is composed of multiple storage disks or disk array units that are directly attached to a general purpose server.

Page 17: Mass Storage & Information Retrieval

DAS Issues

Proliferation of “server and storage islands” which causes a large management burden

File Sharing Issues

Page 18: Mass Storage & Information Retrieval

NAS

Diagram labels: Client Workstations; LAN; File I/O (NFS/CIFS); NAS Servers (filers)

Definition: NAS is a special-purpose storage system that directly attaches to the LAN and responds to file I/O requests coming across the LAN from a device.

Page 19: Mass Storage & Information Retrieval

Same as DAS – Not Exactly

Tuned Network Operating System (NOS)

Supports Multiple Protocols (NFS, CIFS, NCP)

Page 20: Mass Storage & Information Retrieval

Does NAS Solve DAS Issues?

Simplify Management – Yes (for the most part)
– Allows storage to be consolidated, but only up to the size of the NAS box (~5 to 15 TB)

File Sharing – Yes
– “True NAS” servers will have support for multiple protocols.

Page 21: Mass Storage & Information Retrieval

NAS Issue

Performance
– Network bandwidth / Network traffic
– Protocol inefficiencies

Page 22: Mass Storage & Information Retrieval

SAN

Diagram labels: Client Management Station; LAN; Application Servers; Management Server(s); FC Network; Block I/O (FC); Disk

Definition: SAN is a high-speed network dedicated to interfacing storage subsystems to servers.

Page 23: Mass Storage & Information Retrieval

Zoning (1 of 2)

Zoning arranges FC-connected devices into logical groups

Diagram: nodes attached to an FC switch network, grouped into Zone X and Zone Y

Page 24: Mass Storage & Information Retrieval

Zoning (2 of 2)

Operation
– Zone members “see” only other members of the zone
– Zones are configured dynamically
– Devices can be members of more than one zone
– Switched fabric zoning can take place at the port or device level

Benefits
– Secured device access
– Allows operating system co-existence
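The visibility rule for zones can be modeled as set membership: a device sees the union of every zone it belongs to, minus itself. The sketch below is a toy model only; the zone and device names are hypothetical and no real switch API is implied.

```python
# Hypothetical zone table: zone name -> member devices.
zones = {
    "Zone X": {"host_a", "array_1"},
    "Zone Y": {"host_b", "array_1", "tape_1"},
}


def visible_to(device: str, zone_table: dict[str, set[str]]) -> set[str]:
    """Return every device that shares at least one zone with `device`."""
    seen: set[str] = set()
    for members in zone_table.values():
        if device in members:
            seen |= members
    seen.discard(device)
    return seen


print(sorted(visible_to("host_a", zones)))   # only its own zone's members
print(sorted(visible_to("array_1", zones)))  # member of both zones
```

Here `array_1` belongs to both zones and so is reachable from both hosts, while `host_a` and `host_b` never see each other, which is what lets servers running different operating systems share one fabric.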

Page 25: Mass Storage & Information Retrieval

Does SAN Solve DAS Issues?

Simplify Management – Yes
– Allows storage to be consolidated (seen as one big island instead of a couple of large islands like NAS)

File Sharing – Not Yet
– Still waiting for the development of a CFS.

Page 26: Mass Storage & Information Retrieval

SAN and NAS Recap

SAN
– Local storage access
– Private net for storage
– Storage protocols
– Centralized management
– Good for hosting large databases

NAS
– Remote file access
– Shares user net
– Network protocols
– “Centralized” management
– Good for file sharing (“home directories”)

Page 27: Mass Storage & Information Retrieval

SAN/NAS Performance

SPEC

Page 28: Mass Storage & Information Retrieval

SAN/NAS Cost

Cost per MB

“The Storage Report - Customer Perspectives & Industry Evolution - 19 June 2001” by Merrill Lynch & Co. and McKinsey & Company, Page 48, Chart 51

3 Year TCO (cents per MB) for 2 TB

Page 29: Mass Storage & Information Retrieval

SAN/NAS Cost

Platform (Type): Netapp FAS960 (NAS); Compaq EVA (SAN)

Columns: Cents per MB (2.5 TB); Cents per MB (5 TB); Cents per MB (12 TB)

Values: 7.2 ($176,722); 9.1 ($228,261); 4.1 ($206,836); 5.5 ($275,266); N/A; 3.4 ($406,880)

Note: SAN costs include two 16-port switches but no cabling.

Page 30: Mass Storage & Information Retrieval

SAN/NAS Business Trend

“SNIA Presentation - 19 May 1999” by Nick Allen of Gartner Group

Page 31: Mass Storage & Information Retrieval

SAN/NAS Business Trend

Chart: annual vendor revenue ($B, axis 0 to 16) for DAS and SAN, with SAN % on a second axis (0% to 90%), for 2000 through 2006.

Source: “Worldwide external RAID controller-based storage forecast, 2000-2006”, Gartner, August 2002

Page 32: Mass Storage & Information Retrieval

Backup

Page 33: Mass Storage & Information Retrieval

Backup Software (Mid-Range)

Legato (Networker)

Veritas (Netbackup)

IBM (Tivoli)

Page 34: Mass Storage & Information Retrieval

Mid-Range Tape Technologies

                                   AIT-3      SuperDLT     LTO-1        Mammoth-2
Manufacturer                       Sony       Quantum      IBM/S/HP     Exabyte
Release                            Q4 2001    Q1 2001      Q3 2000      Q1 2000
Technology                         Helical    Linear       Linear       Helical
Native Capacity (GB)               100        110          100          60
Compressed Capacity (GB)           260        220          200          150
Native Transfer Rate (MB/s)        12         11           15           12
Compressed Transfer Rate (MB/s)    31         22           30           30
12 Hr Window Trans Rate (GB)       518.4      475.2        648.0        518.4
MTBF (Hours)                       400,000    250,000      250,000      300,000
Head Life (Hours)                  50,000     30,000       30,000       50,000
Media Life (Avg Passes)            30,000     1,000,000    1,000,000    20,000
Media Price per Cartridge          $135       $134         $110         $89
Price per GB (Native)              $1.35      $1.22        $1.10        $1.48
Drive Price                        $?,?00     $4,400       $4,300       $4,000
SCSI                               LVD        LVD/HVD      LVD/HVD      LVD/HVD
Fibre Channel                      NO         NO           YES          YES

The announced road maps are as follows:
[Note: Year (Native Capacity, Compressed Capacity, Native Transfer Rate, Compressed Transfer Rate)]

Mammoth (M3, M4, M5): 2003 (120, 300, 20, 50); 2004 (200, 500, 30, 75); 2005 (400, 1000, 60, 150)
LTO (LTO-2, LTO-3, LTO-4): 2003 (200, 400, 30, 60); 2004 (400, 800, 60, 120); 2006 (800, 1600, 120, 240)
AIT (AIT-4, AIT-5, AIT-6): 2003 (200, 520, 24, 62); 2005 (400, 1040, 48, 124); 2007 (800, 2080, 96, 248)
DLT (SDLT-2, SDLT-3): 2003 (220, 440, 22, 44); 2005 (500, 1000, 44, 88); 200? (???, ????, ??, ???)
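Two rows of the tape comparison are derived from the others: the 12-hour window figure is the native transfer rate sustained for 12 hours, and the price per GB is the cartridge price divided by native capacity. The sketch below recomputes both from the table's own inputs, assuming decimal units (1 GB = 1000 MB); the dictionary layout and function names are illustrative.

```python
# Inputs copied from the tape comparison table.
drives = {
    "AIT-3":     {"native_gb": 100, "native_mb_s": 12, "media_usd": 135},
    "SuperDLT":  {"native_gb": 110, "native_mb_s": 11, "media_usd": 134},
    "LTO-1":     {"native_gb": 100, "native_mb_s": 15, "media_usd": 110},
    "Mammoth-2": {"native_gb": 60,  "native_mb_s": 12, "media_usd": 89},
}


def window_gb(rate_mb_s: float, hours: float = 12) -> float:
    """GB moved in a backup window at a sustained native rate (1 GB = 1000 MB)."""
    return rate_mb_s * hours * 3600 / 1000


def price_per_gb(media_usd: float, native_gb: float) -> float:
    """Cartridge price divided by native capacity, rounded to cents."""
    return round(media_usd / native_gb, 2)


for name, d in drives.items():
    gb = window_gb(d["native_mb_s"])
    usd = price_per_gb(d["media_usd"], d["native_gb"])
    print(f"{name}: {gb:.1f} GB per 12 h window, ${usd:.2f} per GB")
```

The window figure assumes a single drive streaming at its native rate for the whole window, so it is an upper bound; a server that cannot feed the drive fast enough will fall short of it.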

Page 35: Mass Storage & Information Retrieval

DAS and Backup

Diagram labels: LAN; Backup Client Nodes; Small Servers / Desktops; More Servers; Backup Servers; Jukebox; Jukebox

Page 36: Mass Storage & Information Retrieval

SAN and Backup

Diagram labels: LAN; Gigabit; FC Network; Servers (Oracle, Mail, etc); Server Nodes; NAS Nodes; Netapp Filers; SAN Disk Array(s); Backup Server; Tape Library; Files to Backup; Backup File Index; Disk Blocks; From

Page 37: Mass Storage & Information Retrieval

What’s Next

Page 38: Mass Storage & Information Retrieval

In The Near Future

Storage
– iSCSI

Backup
– Disk to Disk Backup

Page 39: Mass Storage & Information Retrieval

Review

RAID and JBOD

SCSI and FC

NAS and SAN

Backup

Page 40: Mass Storage & Information Retrieval

The End