What is CERN? Particle physics laboratory
• What is CERN? Particle physics laboratory
• Europe old and new (plus collaborations with USA, Canada, Japan, India, Pakistan, Russia, China…)
• Planning for ~5 PB per year, 2-5 GB/s in 2007: data storage problem!
[Image with labels: LHC ring, Geneva Airport, ski slopes, CERN]
CERN: current tape situation
• Main drive: 9940B (50), already very busy
  – Peak test rate is ~1 GB/s
• Secondary: 9840A (20), for ‘small’ files
  – Do we modernise to 9840C?
• Main robotics: Powderhorn
• Secondary: L700e (exotics, LTO1, SDLT..)
• Efficiency is low, especially for read
• Lots of drives needed now, so in 2007?
Current usage: ~150 Mbytes/s
Media
• Main media: 9940 200GC 13,600
• Secondary: 9940 60GC 8,800
  – Being converted to 200GC, ~60% done
• 9840 A
  – 2,100 user data (move to 200GC, reuse?)
  – 2,800 Legato data (legacy, reuse?)
  – 2,900 ADSM data (legacy, reuse?)
• Oddities: SDLT, LTO, legacy 3590, DLT….
Current plans
• Avoid purchase of 9840 or 9940 media
  – Re-use existing media as far as possible, OK in 2004, but 2005?
• Consolidate backups, some aging out, but a lot of equipment! (virtualisation?)
  – 1 Timberwolf, 6 DLT700 for AFS
  – 2 Powderhorns, 14 IBM 3590E for ADSM
  – 2 Powderhorns, 6 STK 9840 for Legato
  – 1 Powderhorn, 4 STK 9940B for TSM
• Decide on LHC system components in 2005
  – Call for Tender
    • Drive: STK, IBM 3592, LTO 2, other?
    • Library: STK Powderhorn/8500, IBM, ADIC Scalar 10000, other?
Estimations for LHC
[Chart: data rate in GB/s (axis 0 to 5) for 2006, 2008, 2010]
[Chart: data volume in Petabyte (axis 0 to 60) for 2006, 2008, 2010]
Minimal drives for LHC
[Charts: drives needed in 2006, 2008 and 2010; one panel for 9940B (axis 0 to 160), one for 3592 (axis 0 to 120)]
Use at peak throughput assumed, realistically need 3 x this, > 3 x?
Powderhorns can cope re drive numbers (40/silo)…. But speed?
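The "minimal drives" estimate appears to be the required data rate divided by the peak throughput of a single drive, with the slide's factor of 3 on top for a realistic figure. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, 1 GB is taken as 1000 MB, and the 30 MB/s per-drive rate is an assumption: it is the figure quoted for SAIT later in these slides and is consistent with ~10 GB written in 350 s on a 9940B.

```python
import math


def minimal_drives(rate_mb_s, drive_mb_s):
    """Drives needed if every drive ran at peak throughput all the time."""
    return math.ceil(rate_mb_s / drive_mb_s)


def realistic_drives(rate_mb_s, drive_mb_s, factor=3):
    """Apply the slide's 'realistically need 3 x this' factor."""
    return factor * minimal_drives(rate_mb_s, drive_mb_s)


if __name__ == "__main__":
    ASSUMED_DRIVE_MB_S = 30  # assumption, not a measured CERN figure
    for rate_gb_s in (1, 2, 5):
        rate_mb_s = rate_gb_s * 1000
        print(
            f"{rate_gb_s} GB/s: minimal "
            f"{minimal_drives(rate_mb_s, ASSUMED_DRIVE_MB_S)}, realistic "
            f"{realistic_drives(rate_mb_s, ASSUMED_DRIVE_MB_S)}"
        )
```

Under these assumptions 1 GB/s needs 34 drives at peak, which is in line with the "~40 drives for 1 GByte/s" quoted for data recording.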
Minimal drives?
• Write can be reasonably effective, often >50% of possible maximum
  – Many GB (10?) in one mount
  – Drive definitely streams
    • ~60s unit reserved/pick/load/position
    • 350s writing, say, for 9940B
    • ~60s rewind/unload/place
    • We write ANSI standard tape files, minimum 3s per file today…
• Reading in CASTOR is poor, depends on files picked
  – 1 file, 1 GB, ~25% of possible maximum, depends heavily on robot speed
    • ~60s pick/load/position
    • 35s reading, say, for 9940B
    • ~60s rewind/unload/place
    • Some improvement in next CASTOR version… (marshalling requests)
• But we READ more than we WRITE, except for data recording
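The efficiency figures quoted for write and read can be checked from the mount timings on this slide: the useful fraction of a mount is the transfer time divided by the whole cycle (load, transfer, unload). The sketch below is only that arithmetic; the function name is hypothetical, and the 60 s load and unload times and the 350 s and 35 s transfer times are the slide's own round numbers.

```python
def mount_efficiency(transfer_s, load_s=60.0, unload_s=60.0):
    """Fraction of one mount cycle during which the drive moves data."""
    return transfer_s / (load_s + transfer_s + unload_s)


write_eff = mount_efficiency(350)  # ~10 GB written in one mount
read_eff = mount_efficiency(35)    # a single 1 GB file read per mount

print(f"write: {write_eff:.0%} of peak, read: {read_eff:.0%} of peak")
```

This gives roughly 74% for writing and 23% for reading, consistent with ">50%" and "~25%" above. It also shows why marshalling several read requests into one mount helps: the 120 s of robot and positioning overhead is paid once instead of once per file.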
Minimal cartridge slots for LHC
[Charts: cartridge slots needed in 2006, 2008 and 2010; one panel for 9940B and SAIT (axis 0 to 300), one for 3592 and 3592b? (axis 0 to 180)]
2010: 100K SAIT (or 78K 3592b?) is 18 (14?) Powderhorns, so => new building?
2010: 100K SAIT is fine with 8500 in existing zones, but not supported
A ‘3592b’ does not exist today. SAIT exists, 500GB, ~30 MB/s..
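The cartridge and silo counts can be reproduced with two divisions. The Python sketch below uses hypothetical function names and two assumptions that are not stated on the slides: a total of about 50 PB in 2010 (a value within the range of the Petabyte chart that yields exactly 100K cartridges at 500 GB each), and about 5,600 usable slots per Powderhorn (back-calculated so that 100K and 78K cartridges give the quoted 18 and 14 silos).

```python
import math


def cartridges_needed(volume_pb, cartridge_gb):
    """Cartridges required to hold volume_pb, with 1 PB = 1,000,000 GB."""
    return math.ceil(volume_pb * 1_000_000 / cartridge_gb)


def silos_needed(cartridges, slots_per_silo):
    """Whole silos required to house the given number of cartridges."""
    return math.ceil(cartridges / slots_per_silo)


ASSUMED_VOLUME_PB = 50        # assumption: not a number given on the slide
ASSUMED_SLOTS_PER_SILO = 5600  # assumption: back-calculated from 18 and 14

sait = cartridges_needed(ASSUMED_VOLUME_PB, 500)  # SAIT: 500 GB per cartridge
print(sait, silos_needed(sait, ASSUMED_SLOTS_PER_SILO))
print(78_000, silos_needed(78_000, ASSUMED_SLOTS_PER_SILO))
```

Changing either assumption shifts the silo count by one or two, which may be why the slide itself hedges with question marks.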
Costs for LHC, 2010
• Libraries: 20 8500 ~ 10 M $ (?) 33%
• Media: 100K SAIT ~ 10 M $ (‘usually’ 100$/cart) 33%
• Drives: ~300 SAIT ~ 10 M $ (‘usually’ 30 K$/drive) 33%
  – Why so many? Because read is poor at CERN but frequent..
• However, drives/media in 8500 not a ‘monopoly’ problem
• Today? Consider only major use, drives important…
• Libraries: 6 Powderhorn ~ 1.5 M $ 28%
• Media: 25 K 9940 ~ 2.5 M $ 28%
• Drives: 50 9940B ~ 1.5 M $ 44%
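The cost split can be recomputed from the unit prices and totals quoted above. The following Python sketch is only that arithmetic, with a hypothetical helper name; all inputs are the slide's own rough numbers in millions of dollars.

```python
def shares(costs_musd):
    """Return each item's fraction of the total cost."""
    total = sum(costs_musd.values())
    return {item: cost / total for item, cost in costs_musd.items()}


lhc_2010 = {
    "libraries": 10.0,             # 20 x 8500, the slide's estimate
    "media": 100_000 * 100 / 1e6,  # 100K cartridges at 100 $ each
    "drives": 300 * 30_000 / 1e6,  # 300 drives at 30 K$ each
}
today = {"libraries": 1.5, "media": 2.5, "drives": 1.5}

for label, costs in (("LHC 2010", lhc_2010), ("today", today)):
    split = ", ".join(f"{k} {v:.0%}" for k, v in shares(costs).items())
    print(f"{label}: {split}")
```

With these inputs the 2010 drive cost comes to 9 M $, which the slide rounds to ~10 M $ to give three equal thirds. For today's figures the computed split is about 27% libraries, 45% media and 27% drives, so the percentage labels in the extracted text may be attached to the items in a different order than on the original slide.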
Major operational interests
• Benefits of 8500 very clear
  – 99.9% available machinery, easy upgrading…
  – Speed very helpful in disorganised reading, common at CERN
  – Drive/media mix very helpful (but might not be used..)
• Benefits of SAIT-like capacity very clear
  – Higher capacity, no building needed
  – Data recording looks ‘easy’ at ~40 drives for 1 GByte/s
• Linux driver from STK?
  – Hard to write your own and maintain it, hard to adapt to ‘new drive’ quickly
  – Might they eventually do this?
• Better (WWW access)
  – Library/drive/media monitoring and logging features
  – Predictions of imminent failure, and timely requests for intervention
  – Access to MIR data for media monitoring, problem prediction, otherwise?
  – Customers ask for it, and it would save STK time, money…