Southgrid Status Report. Pete Gronbech, February 2005. GridPP 12 - Brunel


Page 1: Southgrid Status Report

Pete Gronbech: February 2005

GridPP 12 - Brunel

Page 2: Southgrid Member Institutions

• Oxford
• RAL PPD
• Cambridge
• Birmingham
• Bristol
• Warwick

Page 3: Status at Warwick

• No change since GridPP 11.
• Third-line institute: no resources as yet, but it remains interested in being involved in the future.
• Will not receive GridPP resources, and so does not need to sign the MoU yet.

Page 4: Operational Status

• RAL PPD
• Cambridge
• Bristol
• Birmingham
• Oxford

Page 5: Status at RAL PPD

• Always at the leading edge of software deployment (a benefit of being alongside the RAL Tier 1).
• SL3 cluster on LCG 2.3.0; the number of worker nodes is increasing.
• Legacy LCG 2.3.0 service on RH7.3 (winding down).
• CPUs: 24 x 2.4 GHz and 30 x 2.8 GHz
  – 100% dedicated to LCG
• 0.5 TB storage
  – 100% dedicated to LCG

Page 6: Status at Cambridge

• Currently LCG 2.2.0 on RH7.3.
• Parallel install of SL3 with LCG 2.3.0 using yaim (a sketch of a yaim-driven install follows this list).
• CPUs: 32 x 2.8 GHz, increasing to 40 soon
  – 100% dedicated to LCG
• 3 TB storage
  – 100% dedicated to LCG
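For context, a minimal sketch of how a parallel SL3 node install of this era might be scripted around yaim. The paths, script names and site-info.def variables shown are illustrative assumptions, not Cambridge's actual configuration; yaim was driven by a single site-info.def file plus per-node-type install and configure scripts, and the exact invocation should be checked against the release notes for the LCG version in use.

    # Illustrative only: drive a yaim-era worker-node install from a script.
    # Paths and variable names are assumptions based on LCG-2 conventions,
    # not the actual Cambridge setup.
    import subprocess

    SITE_INFO = "/root/site-info.def"  # hypothetical location

    # A minimal site-info.def fragment (bash variable syntax, as yaim expects).
    SITE_INFO_BODY = """\
    MY_DOMAIN=example.ac.uk          # placeholder domain
    CE_HOST=ce.example.ac.uk         # computing element
    SE_HOST=se.example.ac.uk         # storage element
    WN_LIST=/root/wn-list.conf       # file listing the worker nodes
    """

    def install_worker_node():
        with open(SITE_INFO, "w") as f:
            f.write(SITE_INFO_BODY)
        # Install the middleware meta-package, then configure the node type.
        # Script names follow the LCG-2 yaim layout; verify against the
        # release notes for the exact version deployed.
        subprocess.run(["/opt/lcg/yaim/scripts/install_node",
                        SITE_INFO, "lcg-WN"], check=True)
        subprocess.run(["/opt/lcg/yaim/scripts/configure_node",
                        SITE_INFO, "WN"], check=True)

    if __name__ == "__main__":
        install_worker_node()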

Page 7: Status at Bristol

• Status
  – LCG involvement limited (“black dot”) for the previous six months due to lack of manpower
  – New resources and posts now on the horizon!
• Existing resources
  – 80-CPU BaBar farm to be switched to LCG
  – ~2 TB of storage resources to be made LCG-accessible
  – LCG head nodes installed by the SouthGrid support team with 2.3.0
• New resources
  – Funding now confirmed for a large University investment in hardware
  – Includes CPU, plus high-quality and scratch disk resources
• Humans
  – New system manager post (RG) being filled
  – New SouthGrid support / development post (GridPP / HP) being filled
  – HP keen to expand industrial collaboration; suggestions?

Page 8: Status at Birmingham

• Currently LCG 2.2 (since August).
• Currently installing SL3 on the GridPP front-end nodes; will use yaim to install LCG-2_3_0.
• CPUs: 22 x 2.0 GHz Xeon (+48 soon)
  – 100% LCG
• 2 TB storage, awaiting “front-end machines”
  – 100% LCG
• SouthGrid’s “Hardware Support Post”: Yves Coppens appointed.

Page 9: Status at Oxford

• Currently LCG 2.3.0 on RH7.3.
• Parallel SL3 install; will use yaim to install 2.3.0 as soon as possible.
• CPUs: 80 x 2.8 GHz
  – 100% LCG
• 1.5 TB storage; upgrade to 3 TB planned
  – 100% LCG

Page 10: Oxford Tier 2 Centre for LHC

Two racks, each containing 20 Dell dual 2.8 GHz Xeon nodes with SCSI system disks.

A 1.6 TB SCSI disk array in each rack.

The systems are loaded with LCG2 software version 2.3.0.

The SCSI disks and Broadcom Gigabit Ethernet caused some problems with the installation initially.

The systems have been heavily used by the LHCb Data Challenge.

Page 11:

The first rack is in a very crowded computer room (room 650). The second rack is currently temporarily located in the theoretical physics computer room.

Room 650 is at the limit of its power capacity.

The air conditioning is not reliable.

Problems: Space, Power and Cooling.

A proposal for a new purpose-built computer room on Level 1 (underground) is in progress.

Page 12: CERN Computer Room

Page 13: Site on Level 1 for the proposed computer room

• An ideal location
  – Lots of power (5000 A)
  – Underground (no heat from the sun, and very secure)
  – Lots of headroom (a false floor and false ceiling for the cooling systems)
  – In the basement, so no floor-loading limit
• A false floor, large air-conditioning units and power for approximately 50-80 racks to be provided.
• A rack full of 1U servers can create 12 kW of heat and use 50 A of power (a rough check of these figures follows this list).
• Will offer space to other Oxford University departments.
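As a rough sanity check on those figures, assuming roughly 240 V single-phase mains (an assumption; the slide states only amps and kilowatts), 50 A per rack is consistent with the quoted 12 kW heat load, and the 5000 A supply bounds the room at around 100 such racks, comfortably above the planned 50-80:

    # Rough sanity check on the slide's power figures.
    # Assumption: ~240 V single-phase mains; the slide gives only amps and kW.
    MAINS_VOLTAGE = 240.0   # volts (assumed)
    RACK_CURRENT = 50.0     # amps per rack of 1U servers (from the slide)
    ROOM_SUPPLY = 5000.0    # amps available on Level 1 (from the slide)

    rack_power_kw = MAINS_VOLTAGE * RACK_CURRENT / 1000.0   # 12.0 kW
    max_racks = ROOM_SUPPLY / RACK_CURRENT                  # 100 racks

    print(f"Per-rack load: {rack_power_kw:.1f} kW")   # matches the quoted 12 kW
    print(f"Power-limited capacity: {max_racks:.0f} racks (plan is 50-80)")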

Page 14: DWB computer room project, 26-Nov-2004

Page 15: Centre of the Racks

Page 16: LCG2 Administrator’s Course

• A lot of interest in a repeat of the course, especially once the 8.5 “Hardware Support” posts are filled (suggestions welcome).
• PXE / kickstart install vs Quattor…? (A sketch of the PXE/kickstart approach follows below.)
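For those weighing the options, a minimal sketch of the PXE/kickstart side: generating a PXELINUX boot entry and a kickstart file for an SL3 worker node. All hostnames, URLs, paths and packages below are placeholders, and the exact kickstart directives should be checked against the Scientific Linux documentation for the release in use.

    # Illustrative sketch of PXE/kickstart provisioning for a batch of nodes.
    # All URLs, paths and packages below are placeholders.
    import os

    PXE_ENTRY = """\
    default sl3-install
    label sl3-install
        kernel vmlinuz
        append initrd=initrd.img ks=http://install.example.ac.uk/ks/wn.cfg
    """

    KICKSTART = """\
    install
    url --url http://install.example.ac.uk/sl3/i386
    lang en_GB
    rootpw --iscrypted $placeholder$
    clearpart --all --initlabel
    autopart
    reboot
    %packages
    @ base
    """

    def write_configs(tftp_root="/tftpboot", ks_dir="/var/www/html/ks"):
        # PXELINUX reads per-client (or default) files under pxelinux.cfg/.
        os.makedirs(f"{tftp_root}/pxelinux.cfg", exist_ok=True)
        os.makedirs(ks_dir, exist_ok=True)
        with open(f"{tftp_root}/pxelinux.cfg/default", "w") as f:
            f.write(PXE_ENTRY)
        # The kickstart file is served over HTTP to the installer.
        with open(f"{ks_dir}/wn.cfg", "w") as f:
            f.write(KICKSTART)

    if __name__ == "__main__":
        write_configs()

After an unattended install completes, the same yaim steps sketched earlier would configure the node as an LCG worker; this is the PXE/kickstart + yaim combination the last slide refers to.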

Page 17: Ongoing Issues

• Complexity of the installation: the new yaim scripts have helped enormously.
• Difficulty sharing resources: almost all of the resources listed are 100% LCG because sharing them has proved difficult.
• How will we manage clusters without LCFGng? Quattor has a learning curve; the course showed that it is very modular, but PXE/kickstart + yaim is the preferred option at the moment.
• Grid certificates and which browsers support them.