
Page 1: HEPiX Storage Task Force Roger Jones Lancaster CHEP06, Mumbai, February 2006

HEPiX Storage Task Force

Roger Jones, Lancaster

CHEP06, Mumbai, February 2006


February 13th 2005

Mandate

– Examine the current LHC experiment computing models.

– Attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time.

– Consider the current storage technologies, their prices in various geographical regions and their suitability for various classes of data storage.

– Attempt to map the required storage capacities to suitable technologies.

– Formulate a plan to implement the required storage in a timely fashion.


Membership

• Roger Jones, Lancaster, ATLAS
• Andrew Sansum, RAL
• Bernd Panzer / Helge Meinhard, CERN
• David Stickland (latterly), CMS
• Peter Malzacher, GSI Tier-2, ALICE
• Andrei Maslennikov, CASPUR
• Jos van Wezel, GridKA, HEPiX
• Shadow 1
• Shadow 2
• Vincenzo Vagnoni, Bologna, LHCb
• Luca dell’Agnello
• Kors Bos, NIKHEF, by invitation

Thanks to all members!


Degree of Success

• Assessment of Computing Model
  – RJ shoulders the blame for this area!
  – Computing TDRs help – see many talks at this conference
  – Estimates of contention etc are rough; toy simulations are exactly that, and we need to improve this area beyond the lifetime of the task force.
• Disk
  – Thorough discussion of disk issues
  – Recommendations, prices etc
• Archival media
  – Less complete discussion
  – Final reporting at the April HEPiX/GDB meeting in Rome
• Procurement
  – Useful guidelines to help Tier-1 and Tier-2 procurement


Outcome

• Interim document available through the GDB
• Current high-level recommendations

– It is recommended that a better information exchange mechanism be established between (HEP) centres to mutually improve purchase procedures.

– An annual review should be made of the storage technologies and prices, and a report made publicly available.

– Particular further study of archival media is required, and tests should be made of the new technologies emerging.

– A similar regular report is required for CPU purchases. This is motivated by the many Tier-2 centres now making large purchases.

– People should note that the lead time from announcement to effective deployment of new technologies is up to a year.

– It is noted that the computing models assume that archived data is available at the time of attempted processing. This implies that the software layer allows pre-staging and pinning of data.


Inputs

• Informed by C-TDRs and computing model documents
  – Have tried to estimate contentions etc, but this requires much more detailed simulation work
  – Have looked at data classes and associated storage/access requirements, but this could be taken further
    • E.g. models often provide redundancy on disk, but some sites assume they still need to back up disk to tape in all cases
  – Have included bandwidths to MSS from LHC4 exercise, but more detail would be good


Storage Classes

1) tape, archive, possibly offline (vault), access > 2 days, 100 MB/s

2) tape, on line in library, access > 1 hour, 400 MB/s

3) disk, any type, in front of tape caches

4) disk, SATA type optimised for large files, sequential read-only IO

5) disk, SCSI/FC type optimised for small files, read/write random IO

6) disk, high speed and reliability, RAID 1 or 6 (catalogues, home directories etc)
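One way to make the six classes usable when mapping data to technologies is to hold them as records. The sketch below is only an illustration: the `StorageClass` type, its field names and the `classes_on` helper are assumptions, not part of the task force report. Access times and rates are filled in only where the slide quotes them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageClass:
    """Hypothetical record for one storage class from the slide."""
    number: int
    medium: str                                # "tape" or "disk"
    description: str
    min_access_hours: Optional[float] = None   # lower bound on access time, if quoted
    rate_mb_s: Optional[float] = None          # rate quoted on the slide, if any


STORAGE_CLASSES = [
    StorageClass(1, "tape", "archive, possibly offline (vault)", 48.0, 100.0),
    StorageClass(2, "tape", "online in library", 1.0, 400.0),
    StorageClass(3, "disk", "any type, in front of tape caches"),
    StorageClass(4, "disk", "SATA, large files, sequential read-only IO"),
    StorageClass(5, "disk", "SCSI/FC, small files, read/write random IO"),
    StorageClass(6, "disk", "high speed and reliability, RAID 1 or 6"),
]


def classes_on(medium: str) -> list:
    """Return the class numbers that live on the given medium."""
    return [c.number for c in STORAGE_CLASSES if c.medium == medium]
```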


Disk

• Two common disk types
  – SCSI/FibreChannel
    • Higher speed and throughput
    • Slightly longer lifetime (~4 years)
    • More expensive
  – SATA (II)
    • Cheaper
    • Available in storage arrays
    • Lifetime >3 years (judging by warranties!)
    • RAID5 gives fair data security
      – Could still have 10 TB/1 PB unavailable on any given day
    • RAID6 looks more secure
      – Some good initial experiences
      – Care needed with drive and other support
• Interconnects
  – Today
    • SATA (300 MB/s)
      – Good for disk to server, point to point
    • Fibre Channel (400 MB/s)
      – High speed IO interconnect, fabric
  – Soon (2006)
    • Serial Attached SCSI (SAS – multiple 300 MB/s)
    • Infiniband (IBA 900 MB/s)


Architectures

• Direct Attached Storage
  – Disk is directly attached to CPU
  – Cheap but administration costly
• Network Attached Storage
  – File servers on Ethernet network
  – Access by file-based protocols
    • Slightly more expensive but smaller number of dedicated nodes
    • Storage in a box – servers have internal disks
    • Storage out of box – fibre or SCSI connected
• Storage Area Networks
  – Block not file transport
  – Flexible and redundant paths, but expensive

[Diagram: clients, IP network, nodes 1 to n, storage controller]


Disk Data Access

• Access rates
  – 50 streams per RAID group or 2 MB/s per stream on a 1 Gbit interface
  – Double this for SCSI
• Can be impaired by
  – Software interface/SRM
  – Non-optimal hardware configuration
    • CPU, kernel, network interfaces
  – Recommend 2 x nominal interfaces for read and 3 x nominal for write
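The access-rate figures above can be turned into a rough sizing calculation. The sketch below is an assumption-laden illustration, not the task force's method: it reads "double this for SCSI" as doubling the aggregate rate, treats a 1 Gbit interface as 125 MB/s raw (ignoring protocol overhead), and reads the recommendation as provisioning 2 or 3 times the nominal interface count. The function names are hypothetical.

```python
import math

# 1 Gbit/s is 125 MB/s raw; protocol overhead is ignored (assumption).
GBIT_INTERFACE_MB_S = 125.0


def aggregate_rate_mb_s(streams: int = 50, per_stream_mb_s: float = 2.0,
                        scsi: bool = False) -> float:
    """Aggregate rate of one RAID group using the slide's figures."""
    rate = streams * per_stream_mb_s
    # "Double this for SCSI" is interpreted as doubling the aggregate rate.
    return 2 * rate if scsi else rate


def interfaces_needed(target_mb_s: float, write: bool = False,
                      per_interface_mb_s: float = GBIT_INTERFACE_MB_S) -> int:
    """Interfaces to provision: 2 x nominal for read, 3 x nominal for write."""
    nominal = math.ceil(target_mb_s / per_interface_mb_s)
    factor = 3 if write else 2
    return factor * nominal
```

For example, `interfaces_needed(aggregate_rate_mb_s())` sizes the read path for a single SATA RAID group under these assumptions.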


Disk Recommendations

• Storage in a box (DAS/NAS disks together with server logic in a single enclosure)
  – most storage for a fixed cost
  – more experience with large SATA + PCI RAID deployments desirable
  – more expensive solutions may require less labour/be more reliable (experiences differ)
  – high quality support may be the deciding factor
• Recommendation
  – Sites should declare the products they have in use
    • A possible central place would be the central repository set up at hepix.org
  – Where possible, experience with trial systems should be shared (Tier-1s and CERN have a big role here)


Procurement Guidelines

• These come from H Meinhard
• Many useful suggestions for procurement
• May need to be modified to fit local rules


Disk Prices

• DAS/NAS: storage in a box (disks together with server logic in a single enclosure)
  – 13500-17800 € per usable 10 TB
• SAN/S: SATA based storage systems with high speed interconnect
  – 22000-26000 € per usable 10 TB
• SAN/F: FibreChannel/SCSI based storage systems with high speed interconnect
  – ~55000 € per usable 10 TB
• These numbers are reassuringly close to those from the PASTA reviews, but it should be noted that there is a spread with geography and other circumstances
• Evolution (raw disks)
  – Expect Moore’s Law density increase of 1.6/year between 2006 and 2010
  – Also consider effect of increase at only 1.4/year
  – Cost reduction 30-40% per annum
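The evolution figures can be compounded year by year to see what they imply. The sketch below is illustrative only: the function names are hypothetical, 2006 is taken as the base year, and applying the raw-disk cost reduction of 30-40% per annum to the DAS/NAS system price is an assumption, since the slide quotes that reduction for raw disks.

```python
BASE_YEAR = 2006  # assumed base year for the quoted prices


def projected_price(price_base: float, year: int, annual_reduction: float) -> float:
    """Price after compounding a fractional cost reduction each year."""
    return price_base * (1.0 - annual_reduction) ** (year - BASE_YEAR)


def density_growth(year: int, factor_per_year: float) -> float:
    """Density relative to the base year for a given yearly growth factor."""
    return factor_per_year ** (year - BASE_YEAR)


if __name__ == "__main__":
    # DAS/NAS range from the slide: 13500-17800 EUR per usable 10 TB.
    for year in range(2006, 2011):
        low = projected_price(13500, year, 0.40)    # optimistic: low price, 40%/year
        high = projected_price(17800, year, 0.30)   # pessimistic: high price, 30%/year
        print(f"{year}: {low:8.0f} - {high:8.0f} EUR per usable 10 TB, "
              f"density x{density_growth(year, 1.4):.2f} to x{density_growth(year, 1.6):.2f}")
```

Running the two growth factors side by side shows how much the 2010 capacity estimate depends on whether density grows at 1.6 or only 1.4 per year.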


Tape and Archival

• This area is ongoing and needs more work
  – Less frequent procurements
• Disk system approaches active tape system costs by ~2008
• Note computing models generally only assume archive copies at the production site
• Initial price indications similar to LCG planning projections
  – 40 CHF/TB for medium
  – 25 MB/s effective scheduled bandwidth drive + server is 15 kCHF - 35 kCHF
  – Effective throughput is much lower for chaotic usage
  – 6000 slot silo is ~500 kCHF
• New possibilities include spin on demand disk etc
  – Needs study by T0 and T1s, should start now
  – Would be brave to change immediately
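The price indications above can be combined into a rough cost range for a tape archive. The sketch below is a hedged illustration rather than the task force's costing: `archive_cost_chf` is a hypothetical helper, the cartridge capacity is not given on the slide and must be supplied by the caller, and the drive count uses the scheduled 25 MB/s figure, so chaotic access would need more drives.

```python
import math

MEDIA_CHF_PER_TB = 40.0                  # medium cost from the slide
DRIVE_MB_S = 25.0                        # effective scheduled bandwidth per drive
DRIVE_CHF_RANGE = (15_000.0, 35_000.0)   # drive + server, low and high
SILO_SLOTS = 6000
SILO_CHF = 500_000.0                     # approximate silo price


def archive_cost_chf(capacity_tb: float, bandwidth_mb_s: float,
                     cartridge_tb: float) -> tuple:
    """Return (low, high) cost in CHF for media, silos and drives.

    cartridge_tb is an input assumption; the slide does not quote it.
    """
    drives = math.ceil(bandwidth_mb_s / DRIVE_MB_S)
    cartridges = math.ceil(capacity_tb / cartridge_tb)
    silos = math.ceil(cartridges / SILO_SLOTS)
    fixed = capacity_tb * MEDIA_CHF_PER_TB + silos * SILO_CHF
    low = fixed + drives * DRIVE_CHF_RANGE[0]
    high = fixed + drives * DRIVE_CHF_RANGE[1]
    return low, high
```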


Plans

• The group is now giving more consideration to archival
  – Need to do more on archival media
  – General need for more discussion of storage classes
  – More detail to be added on computing model operational details
• Final report in April
• Further task forces needed every year or so