TRANSCRIPT
ScotGRID Report: Prototype for Tier-2 Centre for LHC
Akram Khan
On Behalf of the ScotGRID Team
(http://www.scotgrid.ac.uk)
GridPP6 Collaboration Meeting ScotGRID Report
Overview of Talk
What are we hoping to do..?
Hardware / Operation
Misc Bits
Future Plans
Summary & Outlook
Never Forget The Spirit of the Project
The LHC Computing Challenge for Scotland

2000: JREI Bid. The JREI funds will make it possible to commission and fully exercise a prototype LHC computing centre in Scotland.

The Centre would provide:
1. Technical services for the grid (GIIS, VO services…)
2. A DataStore to handle samples of data for physics analysis
3. Significant simulation production capability
4. Excellent network connections to RAL and regional sites
5. Support for grid middleware development with CERN and RAL
6. Support for core software development within LHCb and ATLAS
7. Support for user applications in other scientific areas

This will enable us to answer: Is the grid a viable solution for the LHC computing challenge? Can a two-site Tier-2 centre be set up and operated effectively? How well does the network topology between Edinburgh, Glasgow, RAL and CERN perform?
ScotGRID: Glasgow / Edinburgh

Glasgow:
59 x330 dual PIII 1 GHz / 2 GByte compute nodes
2 x340 dual PIII 1 GHz / 2 GByte head nodes
3 x340 dual PIII 1 GHz / 2 GByte storage nodes, each with 11 x 34 GBytes in RAID 5
1 x340 dual PIII 1 GHz / 0.5 GByte masternode

Edinburgh:
xSeries quad Pentium Xeon 700 MHz / 16 GBytes server
1 FAStT 500 controller
7 disk arrays of 10 x 73 GB disks
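As a quick sanity check on the kit listed above, the raw disk arithmetic can be sketched as follows. The RAID-5 usable-capacity figure is an assumption for illustration (RAID 5 gives up one disk per array to parity); the Edinburgh schematic later in the talk quotes ~4.6 TB usable after such overheads.

```python
# Rough raw-capacity arithmetic for the ScotGRID storage listed above.
# RAID-5 parity and filesystem overhead mean usable space is lower than raw.

# Edinburgh: 7 disk arrays of 10 x 73 GB disks behind the FAStT 500
edinburgh_raw = 7 * 10 * 73            # 5110 GB raw

# Glasgow storage nodes: 3 nodes, each 11 x 34 GB in RAID 5
glasgow_raw = 3 * 11 * 34              # 1122 GB raw
# RAID 5 sacrifices one disk per array for parity (illustrative assumption):
glasgow_usable = 3 * (11 - 1) * 34     # 1020 GB usable

print(f"Edinburgh raw: {edinburgh_raw} GB (~{edinburgh_raw / 1000:.1f} TB)")
print(f"Glasgow raw:   {glasgow_raw} GB, usable after RAID 5: {glasgow_usable} GB")
```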
ScotGRID: Glasgow - Schematic
[Schematic: masternode, storage nodes, head nodes and compute nodes on an internal 10.0.0.0 VLAN at 1000 Mbps; a 100 Mbps Internet VLAN to the campus backbone is the bottleneck]
ScotGRID: Edinburgh - Schematic
Disk Arrays (Total 4.6 TB)
FAStT 500 Storage Controller
Server (4 x Pentium Xeon, 16 GB RAM)
SRIF Network
Towards a Prototype Tier-2
[Timeline: 2001 Q4 through 2005, by quarter]
Proposal JREI: 2000
ScotGRID delivery of kit: Dec 2001
Prototypes:
xCAT tutorial, attempt on masternode
ScotGRID room handed over to builders
Building work complete; xCAT reinstall
User registration, trial production
Installation of software
Configuring disk array
Reconfiguring kernel drivers for FAStT storage controller
User registration, upgrade storage controller
Group disk (re)organisation to match projects
Glasgow: MC farm
Edinburgh: Datastore
ScotGRID 1st Year Review
9:45 Arrive - Coffee
10:00-10:15 Welcome (Freddie Moran)
10:15-10:35 ScotGrid Introduction (Tony Doyle)
10:35-10:50 Technical Status Overview (Akram Khan)
10:50-11:05 Cluster Operations (David Martin)
11:05-11:30 Coffee
11:30-11:50 ScotGrid Upgrade Plans (Steve Playfer)
11:50-13:00 IBM IT Briefing Discussion
13:00-14:00 Lunch
14:00-14:30 IBM IT Briefing Discussion
14:40-14:55 Grid Data Management - simulations (David Cameron)
15:10-15:30 Tea
Particle Physics Applications
15:30-15:45 ATLAS (John Kennedy)
15:45-16:00 LHCb (Akram Khan)
16:00-16:15 BABAR (Steve Playfer)
16:15-16:30 CDF (Rick St Denis)
ScotGrid Meeting at IBM Briefing Centre (Greenock), Friday 10th Jan
Complete success, as you will see!
ScotGRID Statistics
The amount of storage space in ScotGRID used by each group:
Edinburgh (5 TBytes), Glasgow (600 GBytes)
ScotGRID: CPU Usage 24/6/02 – 6/1/2003
The % use by each group over the previous weeks
[Plot annotations: startup phase, Christmas period, different applications]
Forward Look: Introduction
ScotGrid JREI project includes a mid-term hardware upgrade.
As part of GridPP planning, we need to upgrade from Prototype to Production Tier 2 status by 2004.
JREI funding left to be spent by June 2003:
Edinburgh £220k + Glasgow £30k = £250k
Forward Look: Possible Upgrade Plan?
Possible upgrade kit (Edinburgh / Glasgow):
Dual FAStT700 + 20-32 TB disk
IBM eServer xSeries 440, 8 x Xeon (1.9 GHz), scalable configuration
Forward Look: Front-End Grid Servers
Front end for an EDG-style Compute Engine (LCFG).
Front end for an EDG-style Storage Engine.
An overall ScotGrid front end to arbitrate the Grid services being requested?
We would like to install Grid software on dedicated (modest-sized) servers; this decouples the Grid software from the compute and storage hardware.
Will there be a standard configuration for Grid access to Tier-2 sites? (RLS/SlashGrid)
Towards a Production Tier-2 & beyond
[Timeline: 2001 Q4 through 2005, by quarter]
Production:
Delivery of more kit…
End of JREI funding; start of ScotGRID-II
Start of GridPP-II
Links to other applications…
Production Tier-2 site
Future upgrades?
Technical Support Group
Core members of the group, plus invited members to discuss wider issues:
CORE: Akram Khan (Chair: Edinburgh), David Martin (sysadmin: Glasgow), Roy de Ruiter-Koelemeiger (sysadmin: Edinburgh), Gavin McCance (EDG: Glasgow), RA post (EDG: Edinburgh)
INVITED: Paul Mitchell (sysadmin: Edinburgh), Alan J. Flavell (networking: Glasgow), Steve Traylen (EDG: RAL), IBM team
Webpage: the "technical group" section of http://www.scotgrid.ac.uk/
Support is a real issue: we are just about OK now, but for a production Tier-2?
[Network schematic: Glasgow and Edinburgh each connect at 1 Gb/s to a 2.5 Gb/s backbone; all University traffic passes through packet filtering]
1. 194.36.1.1 (194.36.1.1) 1.479 ms 0.743 ms 0.558 ms
2. 130.209.2.1 (130.209.2.1) 2.343 ms 0.678 ms 0.577 ms
3. 130.209.2.118 (130.209.2.118) 0.577 ms 0.322 ms 0.454 ms
4. glasgow-bar.ja.net (146.97.40.105) 0.564 ms 0.305 ms 0.341 ms
5. po9-0.glas-scr.ja.net (146.97.35.53) 0.546 ms 0.544 ms 0.465 ms
6. po3-0.edin-scr.ja.net (146.97.33.62) 1.644 ms 1.471 ms 1.634 ms
7. po0-0.edinburgh-bar.ja.net (146.97.35.62) 1.509 ms 1.474 ms 1.400 ms
8. 146.97.40.62 (146.97.40.62) 1.622 ms 1.493 ms 1.518 ms
9. vlan686.kb5-msfc.net.ed.ac.uk (194.81.56.58) 2.084 ms 2.528 ms 1.869 ms
10. 129.215.255.242 (129.215.255.242) 1.851 ms 1.828 ms 1.624 ms
Traceroute
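The traceroute above gives three round-trip-time probes per hop. A small sketch of how such output could be reduced to per-hop average RTTs is shown below; the hop-line format assumed here ("N. host (ip) t1 ms t2 ms t3 ms") matches the trace in this slide, but real traceroute output varies between platforms.

```python
import re

# Assumed hop format: "N. host (ip) t1 ms t2 ms t3 ms" (as in the slide).
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+\(([\d.]+)\)\s+(.*)$")

def parse_hops(text):
    """Extract hop number, hostname and mean RTT from traceroute-style text."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue
        rtts = [float(x) for x in re.findall(r"([\d.]+)\s*ms", m.group(4))]
        hops.append({
            "hop": int(m.group(1)),
            "host": m.group(2),
            "avg_ms": sum(rtts) / len(rtts) if rtts else None,
        })
    return hops

# Two hops taken verbatim from the Glasgow -> Edinburgh trace above.
trace = """\
5. po9-0.glas-scr.ja.net (146.97.35.53) 0.546 ms 0.544 ms 0.465 ms
6. po3-0.edin-scr.ja.net (146.97.33.62) 1.644 ms 1.471 ms 1.634 ms
"""
for h in parse_hops(trace):
    print(h["hop"], h["host"], f"{h['avg_ms']:.3f} ms")
```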
EDG Middleware: Replica Optimiser Simulation
Using ScotGrid for large-scale simulation runs.
Uses ~15 MB of memory for ~60 threads.
2-12 hours per simulation.
Results to appear in IJHPCA 2003.
BaBar: Monte Carlo Production (SP4)
ScotGrid (= edin): 8 million events in 3 weeks
Expect to import some streams/skims to Edinburgh in 2003.
After the upgrade to ~30 TB there may be interest in using ScotGrid to add to the storage available at the RAL Tier-A site.
LHCb: Production Centres
CERN (932 k) and Bologna (857 k), RAL (471 k), Imperial College and Karlsruhe (437 k), Lyon (202 k), ScotGrid (194 k), Cambridge (100 k), Bristol (92 k), Moscow (87 k), Liverpool (70 k), Barcelona (56 k), Rio (32 k), CESGA (28 k), Oxford (25 k)
We can be confident for the TDR production: in 56 days with the current configuration we can produce 10 million events (March-April 2003).
Included in draft of LHCC document: B0 -> J/psi K0s
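The 10-million-events-in-56-days claim implies a sustained production rate, which can be sketched as below. The CPU count (59 dual-CPU Glasgow compute nodes) is taken from the hardware slide earlier in the talk; the derived per-event CPU time is illustrative only and assumes the farm is fully dedicated to this production.

```python
# Event-rate arithmetic behind "10 million events in 56 days".
events_target = 10_000_000
days = 56

events_per_day = events_target / days
print(f"Required rate: {events_per_day:,.0f} events/day "
      f"(~{events_per_day / 86400:.1f} events/s)")

# Assuming 59 dual-CPU compute nodes (118 CPUs) fully dedicated,
# the implied CPU cost per event:
cpus = 59 * 2
seconds_per_event = cpus * 86400 / events_per_day
print(f"~{seconds_per_event:.0f} CPU-seconds per event across {cpus} CPUs")
```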
Summary and Outlook
Exciting time for ScotGRID: there has been a lot of effort during the past year to get ScotGRID up and operational; we have learnt many tricks!
Operational Prototype Centre: We have an operational centre, meeting the short-term needs of the applications with modest resources (HEP + middleware + non-PP). Proof of principle for Tier-2 operation (pre-grid).
There is still a lot that needs to be done:
having a full production system (24x7) (opt-grid)
prototyping various architectural solutions for Tier-2
looking towards upgrades with a view to the LHC timetable
Support & resources are a real issue for the near-term future (Q1 2004).
RLS Architecture
[Diagram: Local Replica Catalogues (LRC on Storage Element) at Glasgow, Edinburgh and CERN, with Replica Location Indices (RLIs) built over them]
Multiply indexed LRC for higher availability
RLI indexing over the full namespace (all LRCs are indexed)
RLI indexing over a subset of LRCs
LRC indexed by only one RLI
A Replica Location Service (RLS) is a system that maintains and provides access to information about the physical locations of copies of data items.
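The LRC/RLI split described above can be sketched as two small catalogues: each LRC maps logical file names to physical replicas on one storage element, and an RLI records which LRCs know a given logical name, so a lookup is a two-step query. This is a minimal illustration of the idea only, not the EDG RLS API; all class names, site names and file URLs below are made up for the example.

```python
class LocalReplicaCatalogue:
    """LRC: logical file name -> physical replicas on one storage element."""
    def __init__(self, site):
        self.site = site
        self.replicas = {}           # lfn -> set of physical file names

    def register(self, lfn, pfn):
        self.replicas.setdefault(lfn, set()).add(pfn)

class ReplicaLocationIndex:
    """RLI: logical file name -> set of LRCs that hold a replica."""
    def __init__(self):
        self.index = {}              # lfn -> set of LRC objects

    def update_from(self, lrc):
        # In the RLS design, LRCs periodically push their namespace to RLIs.
        for lfn in lrc.replicas:
            self.index.setdefault(lfn, set()).add(lrc)

    def locate(self, lfn):
        """Two-step lookup: RLI finds the LRCs, the LRCs give the PFNs."""
        pfns = set()
        for lrc in self.index.get(lfn, ()):
            pfns |= lrc.replicas.get(lfn, set())
        return pfns

# Example: one file replicated at Glasgow and CERN, indexed by a single RLI.
glasgow = LocalReplicaCatalogue("Glasgow")
cern = LocalReplicaCatalogue("CERN")
glasgow.register("lfn:dst/run42", "gsiftp://se.gla.example.org/data/run42")
cern.register("lfn:dst/run42", "gsiftp://se.cern.example.org/data/run42")

rli = ReplicaLocationIndex()
rli.update_from(glasgow)
rli.update_from(cern)
print(sorted(rli.locate("lfn:dst/run42")))
```

A real deployment adds the availability options listed in the diagram (an LRC indexed by several RLIs, or an RLI covering only a subset of LRCs); in this sketch that just means calling `update_from` on more or fewer catalogues.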
Gavin McCance, Alasdair Earl, Akram Khan (starting Feb)