
Page 1: BINP/GCF Status Report BINP LCG Site Registration Oct 2009 A.S.Zaytsev@inp.nsk.su

BINP/GCF Status Report
BINP LCG Site Registration

Oct 2009, A.S.Zaytsev@inp.nsk.su


BINP/GCF Status Report, 21 Oct 2009

Overview

Current status
BINP LCG site registration procedures
Getting to production with ATLAS VO activities
NSC/SCN connectivity
Cooperation with NSU SC facility
Future prospects


BINP LCG Farm: Present Status


CPU: 40 cores (100 kSI2k) | 200 GB RAM
HDD: 25 TB raw (22 TB visible)
Input power limit: 15 kVA
Heat output: 5 kW


Current Resource Allocation
(up to 80 VM slots now available within 200 GB of RAM)

Computing Power
    LCG: 4 host systems now (40%); a 70% share is planned for production with ATLAS VO (near future)
    KEDR: 5 host systems (50%)
    VEPP-2000, CMD-3, test VMs, etc.: 1 host system (10%)

Centralized Storage
    LCG: 0.5 TB (VM images), 15 TB (DPM pool buffer, VOs software areas)
    KEDR: 0.5 TB (VM images), 4 TB (local backup of experimental data)
    Others (e.g. NSU): up to 4 TB reserved for local NFS/PVFS2 buffer
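The percentage shares quoted above follow directly from the host counts, and the VM slot figure implies an average memory budget per slot. A minimal sketch of that arithmetic, assuming all host systems are identical and that RAM is divided evenly across slots (neither is stated on the slide; the variable names are illustrative):

```python
# Host systems per group, as listed on the slide.
hosts = {
    "LCG": 4,
    "KEDR": 5,
    "VEPP-2000, CMD-3, test VMs, etc.": 1,
}

# Share of computing power per group, assuming identical host systems.
total_hosts = sum(hosts.values())
shares = {name: 100 * count / total_hosts for name, count in hosts.items()}

# Average RAM per VM slot, assuming an even split (an assumption, not a slide fact).
TOTAL_RAM_GB = 200
VM_SLOTS = 80
ram_per_slot_gb = TOTAL_RAM_GB / VM_SLOTS

for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
print(f"Average RAM per VM slot: {ram_per_slot_gb} GB")
```

Under these assumptions the shares come out as 40%, 50% and 10%, matching the slide, with an average of 2.5 GB of RAM per VM slot.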


BINP LCG Site Registration (1)
STEP 1: DONE

Defining the basic configuration values for the site (name, place within the hierarchy, geographic location, etc.)
    BINP-Novosibirsk-LCG; Tier-2 within the distributed RuTier-2 of WLCG
Creating the mailing lists covering site admin activities and WLCG site security issues
    [email protected], [email protected], [email protected]
Choosing the architecture of the site, setting up the software repositories, and deploying the start-up set of nodes (CE+SE+WNs)
    SLC4 x86 + gLite 3.1
Registering the site in GOC (Grid Operations Centre) with the help of the ROC (Regional Operations Centre) representative, getting the “Candidate” status for the site, and publishing the contact info of site admins & security officers
    A.Zaytsev, A.Suharev


BINP LCG Site Registration (2)
STEP 2: DONE

Installing utility nodes of the site (MON, LFC, WMS/LB, PX, UI, extra DPM_disk VMs, etc.)
Requesting the certificates for all the service nodes of the site
Configuring middleware on all the nodes
Tuning the local firewalls according to the site's internal and external connectivity requirements
Tuning local NAT engines / LCG farm-edge / BINP-edge / NSC firewalls to provide the service nodes of the site with external connectivity (with major help from S.D.Belov and ICT sysadmins)
Getting “OK” status with GStat tests run hourly by GOC
Getting “Certified/Production” status for the site from ROC
Defining the list of supported VOs
    DTEAM, RDTEAM, RDSTEST, OPS, ATLAS
Starting to receive the production SAM tests from GOC


BINP LCG Site Registration (3)
STEP 3: IN PROGRESS

Getting OK for all the SAM tests (currently being dealt with)
Confirm the stability of operations for 1-2 weeks
Upscale the number of WNs to the production level (from 12 up to 32 CPU cores = 80 kSI2k max)
Ask ATLAS VO admins to install the experiment software on the site
Test the site's ability to run ATLAS production jobs
Check whether the 110 Mbps SB RAS channel can carry the load of an 80 kSI2k site
Get to production with ATLAS VO (hopefully by the end of Nov 2009)
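The 80 kSI2k figure for 32 cores is consistent with the farm rating quoted earlier (100 kSI2k for 40 cores), and the 110 Mbps channel puts a hard ceiling on how much data can move per day. A minimal sketch of both calculations, assuming a uniform per-core rating and a fully saturated link with no protocol overhead (both are simplifying assumptions; the function names are illustrative):

```python
# Farm rating from the "Present Status" slide: 100 kSI2k over 40 cores.
FARM_KSI2K = 100.0
FARM_CORES = 40
ksi2k_per_core = FARM_KSI2K / FARM_CORES


def farm_power_ksi2k(cores: int) -> float:
    """Computing power of `cores` cores, assuming a uniform per-core rating."""
    return cores * ksi2k_per_core


def max_tb_per_day(link_mbps: float) -> float:
    """Upper bound on daily transfer volume (decimal TB) for a saturated link."""
    bytes_per_second = link_mbps * 1e6 / 8
    return bytes_per_second * 86400 / 1e12


print(farm_power_ksi2k(12))   # power at the current 12 cores, in kSI2k
print(farm_power_ksi2k(32))   # power at the planned 32 cores, in kSI2k
print(max_tb_per_day(110))    # ceiling for the 110 Mbps SB RAS channel, in TB/day
```

With these assumptions the per-core rating is 2.5 kSI2k, so 32 cores give 80 kSI2k, and the 110 Mbps channel can carry at most about 1.19 TB per day. Whether that is enough depends on the I/O profile of the ATLAS jobs, which the slide leaves to be tested.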


Future Prospects

Several ways to follow:
    Further upgrades of the farm up to 360 CPU cores (0.9-1.0 MSI2k) and 300 TB of disk space
    Extending the LCG site to outside computing resources (mainly the SC of the NSU; up to 128 cores might be granted for LCG activities)
        The 10 Gbps NSU-BINP channel is expected to operate at full throughput starting from this week (SSCC is on its way, TSU is on the horizon)
        The virtualization scheme proposed for NSU is to be validated within the next 2 weeks
    Both of the previous strategies in parallel

Important issues foreseen:
    Scaling up beyond 0.5 MSI2k might require more than 1 Gbps of external connectivity (exclusively); a major effort to improve the site's external connectivity is needed in 2010-2011
    10 Gbps links to the local experiments (KEDR, CMD-3) are required to make sure that all the resources of the farm (and its offshore parts) are used efficiently
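To put the 0.5 MSI2k threshold in terms of farm size, the per-core rating of the present farm (100 kSI2k over 40 cores) can be extrapolated. A rough sketch, assuming future cores are rated the same as the current ones; the 0.9-1.0 MSI2k range quoted for 360 cores suggests the real figure may be slightly higher, so treat the result as an estimate:

```python
# Per-core rating of the present farm: 100 kSI2k over 40 cores.
KSI2K_PER_CORE = 100.0 / 40


def cores_for(msi2k: float, per_core: float = KSI2K_PER_CORE) -> float:
    """Number of cores needed to reach `msi2k`, at `per_core` kSI2k per core."""
    return msi2k * 1000.0 / per_core


# Power of the proposed 360-core farm at today's per-core rating, in MSI2k.
upgraded_msi2k = 360 * KSI2K_PER_CORE / 1000.0

# Farm size at which the 0.5 MSI2k connectivity threshold is crossed.
threshold_cores = cores_for(0.5)

# Per-core ratings implied by the quoted 0.9-1.0 MSI2k range for 360 cores.
implied_per_core = (900.0 / 360, 1000.0 / 360)

print(upgraded_msi2k, threshold_cores, implied_per_core)
```

At 2.5 kSI2k per core, the 0.5 MSI2k threshold corresponds to about 200 cores, so the connectivity issue would arise well before the full 360-core configuration is reached.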


360 CPU cores / 300 TB Configuration


Prospected 10 Gbps Network Layout


Summary

LCG site registration progress: 2 of 3 steps are complete
With our own resources we export up to 80 kSI2k / 15 TB to WLCG (NSU may add 300 kSI2k in the near future)
The exact hardware and network equipment upgrade plan is yet to be defined (though specs are ready for up to 1.1 M$)
We are about to try to get to production with ATLAS VO exclusively in 2009Q4; support for more VOs is in demand starting from 2010


Questions & Comments