The GSI Mass Storage System
TAB GridKa, FZ Karlsruhe
Sep. 4, 2002
Horst Göringer, GSI Darmstadt
FZ Karlsruhe Sep. 4, 2002 [email protected] 2
Mass Storage @ GSI: Overview
• recent history
• current (old) system: functionality, structure, usage
• new system: the requirements, structure, status
• outlook
Mass Storage @ GSI: History
• till 1995: ATL Memorex 5400, capacity 1000 GByte, IBM 3480 cartridges, IBM HSM (MVS)
• 1995: IBM 3494 ATL, ADSM (AIX)
  ADSM interface not acceptable:
  - no cross-platform support for clients (AIX, VMS)
  - ADSM "node-based"
  - no guaranteed availability for staged files
  - ADSM commands not so easy to use
  => use ADSM API for GSI mass storage system (1996)
• 2000: broad discussions on future system
• 2001: decision to enhance GSI mass storage system
Mass Storage @ GSI: Server

current server hardware:
• IBM RS6000 H50, 2 processors, 1 GB memory
• Gigabit Ethernet (max rate ~21 MByte/s)
• IBM 3494 Automatic Tape Library
• 8 IBM 3590E tape drives (capacity 20-40 GByte uncompressed)
• current ATL capacity 60 TByte (currently used: 27 TByte experiment data, 12 TByte backup)
• ~350 GByte internal staging disks

current server software:
• AIX V4.3
• Tivoli Storage Manager Server V4.1.3
Mass Storage @ GSI: Logical Structure
Mass Storage @ GSI: Functionality

1. command interface:
   archive, retrieve, query, stage, delete, pool_query, ws_query
2. RFIO API:
   functions available for open, close, read, seek, ...
   ROOT@GSI: RFIO client accesses GSI mass storage
   write functions: till end of 2002
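The RFIO f-style calls mirror POSIX stdio. A minimal read sketch, with one assumption for self-containment: since the CERN RFIO library is not presumed installed, the `rfio_f...` names are mapped onto their local stdio equivalents (the real code links against librfio and passes the file representation in the mass storage system as path):

```c
/* Sketch of the RFIO f-style read sequence. ASSUMPTION: librfio is not
 * available here, so rfio_fopen/rfio_fread/rfio_fclose are mapped to
 * their stdio equivalents, which share the same signatures. */
#include <stdio.h>

#define rfio_fopen  fopen
#define rfio_fread  fread
#define rfio_fclose fclose

/* read up to max bytes of a (remote) file into buf; returns the number
 * of bytes read, or -1 if the file could not be opened */
long rfio_read_file(const char *path, char *buf, long max)
{
    FILE *f = rfio_fopen(path, "rb");
    long n;

    if (f == NULL)
        return -1;
    n = (long)rfio_fread(buf, 1, (size_t)max, f);
    rfio_fclose(f);
    return n;
}
```

With the real library, `path` would be the archive/path/file name in the GSI name space rather than a local file.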
Mass Storage @ GSI: Functionality
• identical client interface on all GSI platforms
(Linux, AIX, VMS)
• unique name space
• security policy
• client tape support (ANSI Label)
• server log files for
- error reports
- statistical analysis
• GSI software: C with sockets
• integrated in AliEn
Mass Storage @ GSI: Stage Pool Manager
administers several Stage Pools with different attributes:
• file life time
• max space
• user access

currently active pools:
• RetrievePool: no guaranteed life time
• StagePool: min life time guaranteed

future pools:
• ArchivePool
• pools dedicated to user groups
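The life-time attributes above determine when a disk clean process may remove a file. A hypothetical sketch of that rule (function and constant names are illustrative, not the GSI code; the 3-day minimum is the current StagePool setting):

```c
/* ASSUMPTION: illustrative names only. RetrievePool files have no
 * guaranteed life time; StagePool files are protected for a minimum
 * life time (currently 3 days). */

enum pool_type { RETRIEVE_POOL, STAGE_POOL };

#define STAGE_MIN_LIFETIME_DAYS 3

/* returns 1 if a disk clean process may remove the file, 0 otherwise */
int evictable(enum pool_type pool, int file_age_days)
{
    if (pool == RETRIEVE_POOL)
        return 1;                 /* no guaranteed life time */
    /* StagePool: protected until the minimum life time has passed */
    return file_age_days >= STAGE_MIN_LIFETIME_DAYS;
}
```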
Mass Storage @ GSI: Stage Pool Manager
• administers an additional stage meta data DB
• locks each access to a pool
• handles disk clean requests from different sources:
  - from a process serving a user
  - from a watch daemon
• initiates and controls disk clean processes
Mass Storage @ GSI: Upgrade Requirements
scalable system needed (data capacity and max data rate)

1. higher bandwidth => several data movers
   o each with access to each tape device and robot
   o each with own disk pools
2. one master administering the complete meta data DB
3. hardware independence

This means:
• fully parallel data streams
• separation of control flow and data flow
Mass Storage @ GSI: Upgrade Requirements
enabling technologies:
Storage Area Network
Tivoli Storage Manager (successor of ADSM)
Mass Storage @ GSI: New Structure
Mass Storage @ GSI: New Hardware
new hardware:
• tape robot StorageTek L700 (max 68 TByte)
• 8 IBM 3850 Ultrium LTO tape drives (capacity 100 GByte uncompressed)
• 2 TSM servers (Intel PC, fail-safe Windows 2000 cluster)
• 4 (8) data movers (Intel PC, Windows 2000)
• SAN components: Brocade switch 4100 (16 ports, 1 Gbit/s each)

purpose:
• verification of the new concept
• hardware tests: SAN, ATL, tape drives, tape volumes
• later: new backup system for user data
Mass Storage @ GSI: Status
hardware, TSM/Storage Agent: seems to work (tests still running)
new GSI software (for Unix) nearly ready
• command client
• RFIO client (read only)
• server package (master and slaves on data movers)
• stage pool manager (master and slaves on data movers)
• in Oct 2002: to be used for production with current AIX server
Mass Storage @ GSI: Current Plans
in 2003: DAQ connection to mass storage
• n event builders will write in parallel via RFIO to dedicated archive disk pools
• enhanced performance and stability requirements

in 2003/2004: new ATL (several 100 TByte) to fulfill the requirements of the next years
Mass Storage @ GSI: Outlook
the yearly increment of experiment data grows rapidly:
• an order of magnitude in the next years
• after 2006: Alice experiment running, "Future Project" of GSI

=> the mass storage system must be scalable in both storage capacity and data rates

the system must be flexible to follow the development of new hardware

Our new concept fulfills these requirements!
• TSM is a powerful storage manager satisfying our needs now and in the near future
• high flexibility with the GSI-made user interface
Mass Storage @ GSI
Appendix
More Details
Mass Storage @ GSI: DAQ Connection
Mass Storage @ GSI : the new System
not only an upgrade - entrance into new hardware and a new platform!
• new server platform Windows 2000
• new tape robot
• new tape drives and media
• new network

more work necessary due to:
• missing practice
• unknown problems
• lower quality of tape drives and media
• presumably more operation failures

=> costs reduced by cheaper components, but more manpower necessary (in development and operation)
however, we have many options for the future!
Mass Storage @ GSI : the new System
SW enhancements and adaptions (cont'd):
• adaption of the adsmcli server to the new concept => tsmcli
  - division of functionality into several processes
  - code restructuring and adaptions
  - communication between processes
  - data mover selection (load balancing)
• enhancement of the disk pool manager
  - subpools on each data mover => n slave disk pool managers
  - communication master - slaves
• enhancement of the metadata database
  - subpool selection
  - DAQ pool handling
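The load-balancing step above can be sketched as a simple selection: the master picks the least-loaded data mover for the next transfer. Function name and the scalar load metric are assumptions for illustration, not the actual tsmcli implementation:

```c
/* ASSUMPTION: illustrative sketch of data mover selection by load
 * balancing; a real implementation would use a richer load metric. */

/* returns the index of the least-loaded data mover, or -1 if n <= 0 */
int select_data_mover(const long load[], int n)
{
    int i, best = 0;

    if (n <= 0)
        return -1;
    for (i = 1; i < n; i++)
        if (load[i] < load[best])   /* keep the first minimum found */
            best = i;
    return best;
}
```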
Mass Storage @ GSI : the new System
Potential risks:
• new server platform Windows 2000
• new tape robot
• new tape drives and media
• new network

• more work necessary due to missing practice
• unknown problems
• lower quality of tape drives and media
• presumably more operation failures (also on data movers)
Current Status: File Representation
a file archived is defined by
o archive name
o path name (independent of the local file system)
o file name (identical with the name in the local file system)

user access handled by an access table for all supported client platforms

files already archived are not overwritten
o except when explicitly required

local files are not overwritten (retrieve)
Current Status: Local Tape Handling
support of standard ANSI Label tapes on client side
tape volumes portable between client platforms (AIX, Linux, VMS)
enhanced error handling:
o corrupt files do not affect others when handling a file list
o missing EOF mark is handled
user friendly: archive a complete tape volume by invoking a single command
Current Status: why Disk Pools
disk pools help to avoid
1. tape mount and load times
2. concurrent access to same tape volume
3. blocking of fast tape drives in robot
this is useful if
• files are needed several times within a short time
• the network connection of clients is slow
• large working sets are retrieved file after file
• large working sets are accessed in parallel from a compute farm
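A back-of-the-envelope illustration of the first point: repeated reads from tape pay the mount/load overhead every time, while a staged copy pays it once. All numbers and function names here are invented for illustration:

```c
/* ASSUMPTION: toy model with made-up timings, only to show why a disk
 * pool helps when the same file is read several times. */

/* k reads straight from tape: mount/load + tape read on every access */
double tape_seconds(int k, double mount_s, double read_s)
{
    return k * (mount_s + read_s);
}

/* k reads via a disk pool: one staging pass, then k fast disk reads */
double pool_seconds(int k, double mount_s, double read_s, double disk_s)
{
    return (mount_s + read_s) + k * disk_s;
}
```

For example, with a 60 s mount, 30 s tape read and 5 s disk read, five accesses cost 450 s from tape but only 115 s via the pool.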
Disk Pool Manager: the current Pools
RetrievePool:
o files stored via adsmcli retrieve
  - inhibit with option stage=no
o files stored when read via API
o no guaranteed life time

StagePool:
o files stored via adsmcli stage
o min life time guaranteed (currently 3 days)

current space:
o hardware shared
o overall: 350 GByte
o StagePool: 100 GByte max
o the RetrievePool uses space unused by the StagePool ("lent space")
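The "lent space" rule above amounts to a simple calculation: the StagePool is capped at its maximum, and the RetrievePool may use whatever the StagePool does not occupy. A sketch with the current sizes (function name is illustrative):

```c
/* ASSUMPTION: illustrative sketch of the lent-space rule; sizes in
 * GByte, taken from the current configuration. */

#define OVERALL_GBYTE       350
#define STAGEPOOL_MAX_GBYTE 100

/* space currently available to the RetrievePool */
long retrieve_pool_space(long stagepool_used_gbyte)
{
    /* the StagePool can never occupy more than its cap */
    if (stagepool_used_gbyte > STAGEPOOL_MAX_GBYTE)
        stagepool_used_gbyte = STAGEPOOL_MAX_GBYTE;
    return OVERALL_GBYTE - stagepool_used_gbyte;
}
```

So with a full StagePool the RetrievePool still has 250 GByte, and with an empty one it can use all 350 GByte.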
API Client for Mass Storage
API client: functions are available e.g.
• to open/close files in mass storage
• to read/write buffers in remote files
• to shift the file pointer in remote files

=> data stream analysis program - mass storage, fully controlled by the user

useful if
• only selective access to (small) parts of a file is required
  o parts of a ROOT tree
  o ntuples
• local disk space is insufficient

requirement for the GSI API client:
• compatible with the CERN/HEP RFIO package
• RFIO interface available in CERN applications
API Client: Logical Structure
API Client: RFIO at GSI
RFIO functions developed:
• needed for ROOT: rfio_open, rfio_read, rfio_close, rfio_lseek
• additionally (e.g. for analysis programs): rfio_fopen, rfio_fread, rfio_fclose

file name: file representation in the GSI mass storage system

currently available at GSI:
– enhanced adsmcli server already in production
– sample C program using rfio_f... (read) on Linux: /GSI/staging/rfio
– ROOT with RFIO client (read)
– GO4 viewer with RFIO client (read)

in future: write functionality (rfio_write, rfio_fwrite)
API Client: ROOT with RFIO
ROOT at GSI since version 225-03 with RFIO API
For RFIO usage in ROOT:
load shared library libRFIO.so in your ROOT session
for file open: use class TRFIOFile instead of TFile
prefix the file representation in the mass storage system with `rfio:`
in GO4 viewer: no prefix to the file representation needed
Mass Storage @ GSI : The current Bottlenecks
data capacity:
• in April 2001: tape robot nearly completely filled (30 TByte uncompressed)
• since April 27, 2001: new tape drives IBM 3590E
  o write with double density: > 20 GByte/volume
  o copy all volumes => ~30 TByte free capacity

current requirements GSI (TByte):

year   experiment   backup   accumulated
2001   15           6.5      22
2002   30           4.0      56
2003   34           5.0      95
additionally: multiple instances of experiment data!