US ATLAS Tier 1 Facility
Rich Baker, Deputy Director
US ATLAS Computing Facilities
October 26, 2000
10/26/00 US ATLAS Tier 1 Facility - Rich Baker
Existing Facility
[Diagram: US ATLAS Tier 1 Facility, August 2000 Configuration. Legend: US ATLAS Equipment, RCF Infrastructure. Labels on the diagram:]
• E450 (NFS Server)
• XXX.USATLAS.BNL.GOV
• E450 front line with SSH
• Objectivity Lock Server
• 200 GBytes RAID Disk
• Dual Intel (shown twice)
• 62 Intel/Linux
– Dual 700/450 MHz
– 256/512 MBytes
– 9/18 GBytes
– 100 Mbit Ethernet
– (3,200 SPECint95)
• USATLAS Switch
• SAN Hub
• Backup Server
• HPSS Archive Server
• 9840 Tapes
• AFS Servers
• ~10 GBytes RAID Disk, US ATLAS AFS
• LSF, AFS, Objectivity, Gnu etc.
• ~50 GBytes JBOD Disk
• Intel/Linux Web Server
• 128 MBytes, 18 GBytes
• RCF LAN
Full Scale Facility (1)
• Based on NCB Review Numbers
– Focus on Analysis (200k of 209k SI95)
– Probably Insufficient for Simulation
• CPU: 209,000 SpecInt95
– Commodity Pentium/Linux
– Estimated 640 Dual Processor Nodes
• Online Storage: 365 TB Disk
– High Performance Storage Area Network
– Baseline: Fibre Channel Raid Array
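As a rough check on the node estimate, the sketch below divides the 209,000 SPECint95 target across 640 dual-processor nodes and compares the result with the August 2000 farm (3,200 SPECint95 on 62 dual nodes). The per-CPU averages and the ratio are derived here for illustration; they are not figures from the slides.

```python
# Figures quoted on the slides.
FULL_SCALE_SI95 = 209_000   # full-scale CPU target
FULL_SCALE_NODES = 640      # estimated dual-processor nodes
EXISTING_SI95 = 3_200       # August 2000 farm
EXISTING_NODES = 62         # dual Intel/Linux nodes
CPUS_PER_NODE = 2


def si95_per_cpu(total_si95, nodes, cpus_per_node=CPUS_PER_NODE):
    """Average SPECint95 rating per processor."""
    return total_si95 / (nodes * cpus_per_node)


target_per_cpu = si95_per_cpu(FULL_SCALE_SI95, FULL_SCALE_NODES)
existing_per_cpu = si95_per_cpu(EXISTING_SI95, EXISTING_NODES)
speedup_needed = target_per_cpu / existing_per_cpu

print(f"target:   {target_per_cpu:.1f} SI95 per CPU")
print(f"existing: {existing_per_cpu:.1f} SI95 per CPU")
print(f"ratio:    {speedup_needed:.1f}x")
```

The plan therefore assumes processors averaging roughly 163 SPECint95 each, about 6.3 times the average of the mixed 450/700 MHz farm.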
Full Scale Facility (2)
• Tertiary Storage: 2 PB Tape Library
– Baseline: HPSS, STK Media & Tape Drives
– 75% Event Summary Data
– 25% Simulation, Analysis Objects, Local Data
– “Raw” I/O Rate: 400 MB/second, 12.5 PB/year
– Exploit Use Patterns to Maximize Efficiency
• Random Access to AOD - Always on Disk
• Managed Access to ESD - Grid SW? Custom SW?
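The relation between the 400 MB/second rate and the yearly volume can be reproduced with a short calculation. The sketch assumes decimal units (1 PB = 10^9 MB) and a drive complex running at the full rate all year; neither assumption is stated on the slide. It also splits the 2 PB library by the 75%/25% shares.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: continuous operation
MB_PER_PB = 1e9                     # assumption: decimal units


def annual_volume_pb(rate_mb_per_s):
    """Data volume moved in one year at a sustained rate, in PB."""
    return rate_mb_per_s * SECONDS_PER_YEAR / MB_PER_PB


def library_split_pb(capacity_pb, esd_pct=75):
    """Split library capacity into (ESD, everything else), in PB."""
    esd = capacity_pb * esd_pct / 100
    return esd, capacity_pb - esd


annual = annual_volume_pb(400)
esd_pb, other_pb = library_split_pb(2)

print(f"annual volume at 400 MB/s: {annual:.1f} PB")
print(f"ESD: {esd_pb} PB, other: {other_pb} PB")
```

Under these assumptions the yearly volume comes to about 12.6 PB, in line with the 12.5 PB/year on the slide, and the library holds 1.5 PB of Event Summary Data plus 0.5 PB of simulation, analysis objects, and local data.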
Timeline Overview
• Prototype – FY ‘01 & FY ‘02
– Initial Development & Test, 1% to 2% scale
– Establish Facility Independent from RCF
– Lessons Learned from RCF Experience
• System Tests – FY ‘03 & FY ‘04
– Large Scale System Tests, 5% to 10% scale
– Support Growing Tier 2 Network
• Operation – FY ‘05, FY ‘06 & beyond
– Full Scale System Operation, 20% (‘05) to 100% (‘06)
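To make the scale percentages concrete, the sketch below applies them to the full-scale targets quoted earlier (209,000 SPECint95, 365 TB disk, 2 PB tape). It assumes every resource scales by the same percentage and that 2 PB equals 2,000 TB; the slides give only the overall percentages, so the per-resource numbers are illustrative.

```python
# Full-scale targets from the "Full Scale Facility" slides.
FULL_SCALE = {"cpu_si95": 209_000, "disk_tb": 365, "tape_tb": 2_000}

# (phase label, low percent, high percent) from the timeline.
PHASES = [
    ("Prototype, FY '01-'02", 1, 2),
    ("System Tests, FY '03-'04", 5, 10),
    ("Operation, FY '05-'06", 20, 100),
]


def capacity_at(percent):
    """Scale every full-scale target by the same percentage."""
    return {name: full * percent / 100 for name, full in FULL_SCALE.items()}


for label, low, high in PHASES:
    lo, hi = capacity_at(low), capacity_at(high)
    print(f"{label}: {lo['cpu_si95']:,.0f} to {hi['cpu_si95']:,.0f} SI95, "
          f"{lo['disk_tb']:.1f} to {hi['disk_tb']:.1f} TB disk, "
          f"{lo['tape_tb']:.0f} to {hi['tape_tb']:.0f} TB tape")
```

For comparison, the 3,200 SPECint95 of the August 2000 farm is about 1.5% of the full-scale CPU target, which sits inside the 1% to 2% prototype band.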
Tier 1 Facility Capacity
[Chart: Percentage Complete (0% to 100%) by Year, 2000 to 2007]
Tier 1 Facility Staffing
[Chart: staffing by year, 2001 to 2006; vertical axis 0 to 30. Categories: Operations, Performance Monitoring, Grid & WAN, HSM, System Administration, Planning & Management]
Estimation Methods - Hardware (1)
• Use Recent RCF Purchases as Cost Baseline
• Moore’s Law Scaling for Commodity Components (CPU, Disk, Tape)
• STK Tape Drives: Constant Cost per Drive, Double I/O Capacity Every 2 Years
• Similar Constant Cost Projections for High Performance Data Mover Nodes
– $40K per HPSS Mover Node
– $30K per SAN Control Node
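The two scaling rules above can be sketched as simple exponentials. The 2-year doubling of tape drive I/O comes from the slide; the 18-month price/performance doubling for commodity components is an assumption chosen for illustration, since the slide does not state the period it used. The baseline values in the example (100 k$, 10 MB/s) are hypothetical.

```python
import math


def moores_law_unit_cost(baseline_cost, years_ahead, doubling_years=1.5):
    """Cost of a fixed amount of capacity after `years_ahead` years.

    doubling_years=1.5 is an assumed price/performance doubling period.
    """
    return baseline_cost / 2 ** (years_ahead / doubling_years)


def tape_drive_rate(baseline_rate_mb_s, years_ahead, doubling_years=2.0):
    """Per-drive I/O rate; cost per drive is held constant (per the slide)."""
    return baseline_rate_mb_s * 2 ** (years_ahead / doubling_years)


def drives_needed(target_rate_mb_s, baseline_rate_mb_s, years_ahead):
    """Number of drives required to reach an aggregate I/O rate."""
    per_drive = tape_drive_rate(baseline_rate_mb_s, years_ahead)
    return math.ceil(target_rate_mb_s / per_drive)


# Hypothetical inputs: capacity costing 100 k$ today, bought 3 years later.
print(moores_law_unit_cost(100.0, 3.0))   # 25.0
# Hypothetical 10 MB/s drive today, 400 MB/s aggregate needed in 6 years.
print(drives_needed(400, 10.0, 6.0))      # 5
```

Because the drive price is held constant while its rate doubles, the number of drives (and so the drive budget) for a fixed aggregate rate halves every two years, which is one reason late procurement is attractive.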
Estimation Methods - Hardware (2)
• Local Area Network: 8% of Disk+CPU Cost plus $20K per HPSS Mover
• Firewall/WAN Hardware: 25% of LAN Cost
• Interactive Nodes
– 2 Linux Nodes Purchased per Year
– Maintain One Sun/Solaris Node
• “General Purpose” Nodes
– 21 Currently for RCF - Estimate 25 for ATLAS
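The network rules translate directly into two formulas. The sketch below encodes them in k$; the inputs in the example (1,000 k$ of disk, 500 k$ of CPU, two HPSS movers) are hypothetical round numbers, not values from the budget tables.

```python
LAN_FRACTION = 0.08        # 8% of Disk+CPU cost
LAN_PER_MOVER_K = 20       # $20K per HPSS mover
FIREWALL_FRACTION = 0.25   # 25% of LAN cost


def lan_cost_k(disk_k, cpu_k, hpss_movers):
    """Local area network cost in k$."""
    return LAN_FRACTION * (disk_k + cpu_k) + LAN_PER_MOVER_K * hpss_movers


def firewall_wan_cost_k(lan_k):
    """Firewall/WAN hardware cost in k$."""
    return FIREWALL_FRACTION * lan_k


# Hypothetical inputs for illustration only.
lan = lan_cost_k(disk_k=1000, cpu_k=500, hpss_movers=2)
fw = firewall_wan_cost_k(lan)
print(f"LAN: {lan:.0f} k$, firewall/WAN: {fw:.0f} k$")
```

Tying the network cost to the disk and CPU spend keeps the estimate proportional to the hardware it has to connect, with the per-mover term covering the high-bandwidth HPSS links.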
Estimation Methods - Software
• Share Site License Costs with RCF
– HPSS: 50% of $200K by 2005
– LSF: 50% of $65K Starting 2002
• Veritas: $5K per SAN Control Node
– Or Other SW Choice
• Most Other SW License Costs Can Only be Estimated - Total $97K in 2005
– Good Estimate Based on Actual RCF Costs to Support Operational Facility & Development
Tier 1 Budget Numbers (k$)

Year    Personnel   Material     Total
2001         $711       $675    $1,386
2002       $1,151       $569    $1,720
2003       $1,538     $1,029    $2,567
2004       $2,173     $1,215    $3,388
2005       $3,537     $1,741    $5,278
2006       $3,450     $5,103    $8,553
Total     $12,559    $10,333   $22,892

• HPSS License Double Counted
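The budget table can be checked for internal consistency with a few lines of arithmetic. The sketch below transcribes the yearly figures, confirms that Personnel plus Material equals Total in every year, and compares the column sums with the stated totals, allowing 1 k$ for rounding.

```python
# year: (personnel, material, total), all in k$, transcribed from the table.
budget = {
    2001: (711, 675, 1386),
    2002: (1151, 569, 1720),
    2003: (1538, 1029, 2567),
    2004: (2173, 1215, 3388),
    2005: (3537, 1741, 5278),
    2006: (3450, 5103, 8553),
}
stated_totals = (12559, 10333, 22892)

# Each row: personnel + material should equal the row total.
rows_ok = all(p + m == t for p, m, t in budget.values())

# Each column: sum over the years, compared with the stated total.
column_sums = tuple(sum(col) for col in zip(*budget.values()))
max_gap = max(abs(s - t) for s, t in zip(column_sums, stated_totals))

print("rows consistent:", rows_ok)
print("column sums:", column_sums)
print("largest gap vs stated totals (k$):", max_gap)
```

Every row adds up exactly, and the column sums differ from the stated totals by at most 1 k$, which is consistent with the yearly entries having been rounded independently.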
Tier 1 Budget Overview (k$)
[Chart: Material and Personnel budget by year, 2001 to 2006; vertical axis $0 to $9,000]
Tier 1 Material Numbers (k$)

Year      CPU     Disk     HPSS     SW   LAN/FW   Travel   Staff   Misc HW   Other
2001       $-     $279     $131    $85      $46      $39     $21       $20     $12
2002      $52     $147      $85   $120      $20      $66     $18       $11     $12
2003     $173     $307     $108   $120      $44      $94     $33       $48     $14
2004     $234     $311     $201   $142      $51     $129     $49       $15     $17
2005     $242     $371     $516   $142     $101     $165     $72       $15     $22
2006   $1,197   $1,988     $964   $142     $284     $165     $32       $18    $121
Total  $1,898   $3,403   $2,005   $753     $546     $657    $226      $128    $197

• HPSS License Included in HPSS Column
• Volume Manager SW (Veritas) Included in Disk Column
• “Other” Includes Power, Videoconference, Supplies
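A similar consistency check applies to the material breakdown. The sketch below transcribes the yearly rows (treating the 2001 CPU dash as zero, which is an assumption), sums each column, and compares the result with the Total row.

```python
columns = ["CPU", "Disk", "HPSS", "SW", "LAN/FW",
           "Travel", "Staff", "Misc HW", "Other"]

# Yearly material costs in k$; the 2001 CPU dash is read as 0 (assumption).
rows = {
    2001: [0, 279, 131, 85, 46, 39, 21, 20, 12],
    2002: [52, 147, 85, 120, 20, 66, 18, 11, 12],
    2003: [173, 307, 108, 120, 44, 94, 33, 48, 14],
    2004: [234, 311, 201, 142, 51, 129, 49, 15, 17],
    2005: [242, 371, 516, 142, 101, 165, 72, 15, 22],
    2006: [1197, 1988, 964, 142, 284, 165, 32, 18, 121],
}
stated_totals = [1898, 3403, 2005, 753, 546, 657, 226, 128, 197]

column_sums = [sum(col) for col in zip(*rows.values())]
gaps = {name: s - t
        for name, s, t in zip(columns, column_sums, stated_totals)}
max_diff = max(abs(g) for g in gaps.values())

print("column sums:", dict(zip(columns, column_sums)))
print("largest gap vs Total row (k$):", max_diff)
```

The column sums agree with the Total row to within 2 k$, again consistent with rounding. The yearly row sums, by contrast, come out lower than the Material column of the budget table (633 versus 675 k$ for 2001, for example); the slides do not itemize that difference.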
Tier 1 Material Costs (k$)
[Chart: material costs by year, 2001 to 2006; vertical axis $0 to $5,000. Categories: Other, Misc HW, Staff, Travel, LAN, SW, HPSS, Disk, CPU]
Tier 1 Facility Beyond 2006
• Staffing: Constant at 25.5 FTE
• Major HW Components
– Constant $ at 33% of 2006 Full Facility Cost
– Allows for Continual Upgrade
• All Other Costs Level at 2005/2006 Levels

CPU    Disk   HPSS   SW     LAN    Travel   Staff   Misc HW   Other
$494   $820   $456   $142   $117   $165     $50     $20       $50

Material   Personnel   Total
$2,315     $3,531      $5,846
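The slide does not show how the "33% of 2006 Full Facility Cost" figures were derived. One reading that reproduces the CPU and Disk entries, offered here only as an assumption, is that the 2006 purchase covers the step from 20% to 100% of the facility (per the timeline), so the full facility cost at 2006 prices is the 2006 spend divided by 0.8. The sketch tests that reading and also checks the slide's sums.

```python
# 2006 material spend and post-2006 annual figures (k$) from the slides.
spend_2006 = {"CPU": 1197, "Disk": 1988, "HPSS": 964}
stated_beyond_2006 = {"CPU": 494, "Disk": 820, "HPSS": 456}

STEP_2006 = 0.80          # assumption: 2006 buys the step from 20% to 100%
UPGRADE_FRACTION = 0.33   # from the slide


def steady_state_k(spend_2006_k):
    """Annual upgrade budget under the assumed reading, rounded to k$."""
    full_facility_cost = spend_2006_k / STEP_2006
    return round(full_facility_cost * UPGRADE_FRACTION)


estimates = {name: steady_state_k(v) for name, v in spend_2006.items()}

# Sum of the nine post-2006 material items versus the stated 2,315 k$.
material_items = [494, 820, 456, 142, 117, 165, 50, 20, 50]
material_sum = sum(material_items)

# Average personnel cost per FTE (derived, not stated on the slide).
personnel_per_fte = 3531 / 25.5

print("estimates:", estimates)
print("stated:   ", stated_beyond_2006)
print("material sum:", material_sum)
print(f"personnel per FTE: {personnel_per_fte:.1f} k$")
```

Under this reading the CPU and Disk estimates match the slide (494 and 820 k$), while HPSS comes out near 398 k$ rather than 456 k$, so the HPSS line evidently includes something else. The nine material items sum to 2,314 k$, within rounding of the stated 2,315 k$, and the personnel line works out to roughly 138 k$ per FTE.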
Comments
• Scaleable Design - Recent 2.5X Expansion
• Very Late Bulk Procurement
– Working System Earlier - Minimize Design Risk
– Maximize Moore’s Law Advantage
– Retain Flexibility as Long as Possible
• Leverage RCF Knowledge
– Lessons Learned
– Improved Estimation
Summary
• Facility Already Running
• Near Term Prototype Planning in Progress
• Budget Exceeds Agency Guideline by 33%
– Despite Recent 2.5X Scale Expansion!
• Estimates Are Realistic
– Detailed Cost Basis From Recent Purchases
– Moore’s Law Uncertainty
• Build to Cost Contingency Feasible
– As Long as Tier 2 Facilities Are Funded