John Gordon, CCLRC e-Science Centre: LCG Deployment in the UK, GridPP10


Post on 03-Jan-2016






  • You've heard about LCG, so what's happening in the UK?

    LCG deployment, now and future
    The wider UK picture
    ...and what's this EGEE?
    The plan

  • A. Management Structure in LCG Context
    Diagram labels: Deployment Board (Tier1/Tier2, testbeds, rollout); service specification & provision; User Board; user feedback; Metadata; Workload; Network; Security; Info. Mon.; PMB; CB; Storage

  • Recent LCG
    Tier1 + 10 other sites
    DCs
    Tier2 structure
    Support structure
    GOC monitoring
    LCG accounting

  • GridPP Summary: From Prototype to Production
    Timeline: 2001, 2004, 2007
    Stages: separate experiments, resources, multiple accounts; prototype grids; 'one' production grid
    Experiments: BaBar, D0, CDF, ATLAS, CMS, LHCb, ALICE
    Sites: 19 UK institutes; RAL Computer Centre; CERN Computer Centre
    Prototype centres: UK Prototype Tier-1/A Centre; CERN Prototype Tier-0 Centre; 4 UK Prototype Tier-2 Centres
    Production centres: UK Tier-1/A Centre; CERN Tier-0 Centre; 4 UK Tier-2 Centres
    Projects and grids: SAMGrid, BaBarGrid, LCG, EDG, GANGA, EGEE, ARDA

  • Vision
    GridPP2 should deliver a production-quality grid:
    Meeting the computing needs of UK Particle Physics
    Autonomous and self-supporting, with its own identity
    Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members
    Part of an integrated UK Grid
    Independent but integrated, separate but seamless

  • Delivery Plans
    Keep up with LCG
    Participate in LHC Data Challenges
    TierA for BaBar and BaBarGrid
    Participate in LCG Service Challenges
    Use by other VOs

    Put in place the structure to deliver this... and more

  • Production Team
    Deployment
    User Support
    Middleware Support
    Applications Support
    Network Support
    Security
    Operations

  • UK Tier-2 Centres
    NorthGrid ****: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
    SouthGrid *: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
    ScotGrid *: Durham, Edinburgh, Glasgow
    LondonGrid ***: Brunel, Imperial, QMUL, RHUL, UCL
    Current UK status: 11 sites via LCG

  • Tier2 Centres
    UK model of distributed Tier2 Centres
    Managerial and organisational centre
    Each Tier2 is free to organise internally, so I cannot describe that yet
    A Tier2 is smaller than an EGEE Region, but some aspects of the model may be useful (their own VO? own RB?)
    May hide some of the internal structure (CE, GIIS?)

  • Deployment
    A team to roll out software across the UK
    Software release certification, installation support, site certification
    Specialist support for sysadmins
    Consists of staff from T1 + T2

  • User Support
    Migrate from mailing list to problem tracking
    From sysadmin support to user support
    Managed helpdesk for assignment, tracking, escalation
    We already have a lot of experience, but we haven't encapsulated it in FAQs etc.

  • Middleware, Security and Network Development
    M/S/N builds upon UK strengths as part of international development
    Configuration Management
    Storage Interfaces
    Network Monitoring
    Security
    Information Services
    Grid Data Management

  • Middleware Support
    GridPP2 middleware development should have an emphasis on delivery and support
    Middleware teams should support their software area
    T2 assigned 5 specialist support posts
    Integrate support effort into Production Team

  • Applications Support
    Stephen Burke: roaming support
    2 T1 experiment-facing people
    UK experiments

    Get deployment and middleware support working with experiments to ensure successful UK involvement in the experiments' use of the Grid.

  • Network Support
    Mark Leese (CCLRC-DL)
    Rolled out network monitoring to UK Core e-Science programme
    GridPP2 role in network support
    Network optimisation
    Participation in service challenges
    Hopefully using lightpaths

  • Security
    New Security Officer (to be appointed)
    Security operations
    Consultants:
    Kelsey: Joint EGEE-LCG Security
    Jensen: technical advice to CA/middleware
    McNab: e-Science Security Centre
    Track UK developments (Permis, Shibboleth)

  • Grid Operations

  • GOC
    Diagram labels: GOC GridSite; MySQL; Resource Centre (RC); resources & site information (EDG, LCG-1, LCG-2; ce, se, bdii, rb); monitoring; secure database management via HTTPS / X.509

  • Operations
    LCG Operations Centre
    EGEE ROC
    Monitor GridPP (and NGS and GridIreland)
    Developed tools for LCG, reuse for GridPP
    Continue developing for EGEE
    EGEE CIC running grid-wide services
    Accounting

  • LCG Core Accounting
    [Charts: Base CPU Time (Seconds); Base CPU Time (Seconds) per VO]
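The accounting charts on this slide show base CPU time in seconds, in total and per VO. As a rough illustration of that kind of roll-up, here is a minimal Python sketch. The record layout (`vo`, `site`, `cpu_seconds`) and the sample values are assumptions made up for this example; they are not the LCG accounting schema or real usage data.

```python
from collections import defaultdict

# Hypothetical job records; field names and numbers are illustrative only.
JOBS = [
    {"vo": "atlas", "site": "site-a", "cpu_seconds": 3600},
    {"vo": "cms", "site": "site-a", "cpu_seconds": 1800},
    {"vo": "atlas", "site": "site-b", "cpu_seconds": 1200},
    {"vo": "lhcb", "site": "site-b", "cpu_seconds": 600},
]


def cpu_seconds_per_vo(jobs):
    """Sum CPU seconds for each VO across all sites."""
    totals = defaultdict(int)
    for job in jobs:
        totals[job["vo"]] += job["cpu_seconds"]
    return dict(totals)


def total_cpu_seconds(jobs):
    """Sum CPU seconds over every job record."""
    return sum(job["cpu_seconds"] for job in jobs)


if __name__ == "__main__":
    for vo, seconds in sorted(cpu_seconds_per_vo(JOBS).items()):
        print(f"{vo}: {seconds} s")
    print(f"total: {total_cpu_seconds(JOBS)} s")
```

Grouping by `site` instead of `vo` would give a per-site view with the same pattern.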



  • Wider Support
    GSC
    UK helpdesk
    UK e-Science CA
    Training: our own and EGEE (NeSC)

  • Other UK Grids
    NGS: National Grid Service
    4 large clusters + 2 UK supercomputers
    Already using VDT and BDII
    ETF
    Developing UK OGSA/WSRF Grid
    UK Grid Operations Centre Director
    Speaking next
    Should all be part of EGEE

  • EGEE
    UK/I Region in EGEE covers GridPP, NGS, and Grid Ireland; one of 10 regions
    EGEE's aim is to integrate national grids
    Not to interfere or impose limits on them
    All of the work I have described, short of actually running the Resource Centres, is EGEE work
    Many sites are actually signed up to EGEE, so we can report it formally as such
    Many of you will be asked to report work to EGEE (timesheets, quarterly reports), but this shouldn't be an imposition
    The development of GridPP will be aligned with EGEE
    But EGEE is not well defined, so we plan GridPP and participate in the developing EGEE to learn, adopt, and influence.

  • EGEE Issues
    EGEE = LCG?
    Non-European sites in LCG
    Non-LCG sites in EGEE
    Platform support: non-Linux, free Linux (cf. RHEL)
    Integrated user support
    Support for new VOs
    Security, security, security

  • The Next Steps
    Just appointed Jeremy Coles as GridPP Production Manager
    Grid definition: define GridPP, get buy-in of stakeholders
    Production Team: build the team
    Workplan

  • Production Manager Tasks

    Develop work plan (deliverables/milestones)
    Compile problems and issues list (implement tracking)
    Organise a GridPP deployment group workshop
    Better establish GridPP identity; address UK-specific needs
    Review/develop operating procedures to maintain GridPP service
    Get GridPP more involved at UK/experiment software meetings
    Coordinate UK Tier-2 resource input to LCG and EGEE
    Work with other grids to establish a single production grid.

  • Running a production service: areas to be reviewed and developed
    Main areas to be considered (transparency, control, accountability, security, improvement)

    Grid accounting: Who needs to know what and in what form? Where are the gaps in LCG accounting?
    Grid monitoring: Service-level management tools. Efficiency of resource usage. Replication issues. Detailed metrics to be agreed. Real-time notification and problem resolution.
    Management & reporting: Grid management (VO setup procedures; adding new Tier-2 resources). Frequency, structure and content of reports to be agreed (e.g. resource usage, job success rates against targets).
    Security: Processes and procedures (e.g. incident handling). Mechanics of trust model defined: identity, privacy, policy and authority (e.g. how are rights revoked; appeals). Misuse of resources (intrusion), user & usage audits.
    Support: Installation (joining) requirements/guidelines. Integration & helpdesk requirements. Library deployment documentation. User feedback mechanism to inform future developments.
    Training: For new GridPP users and new operations staff.
    Middleware release strategy (and stabilisation!)
    Tier-2 management: Service levels (SLAs/MoUs to be developed). Resource, quota and priority handling.
    Resource: Maintenance plans.
    Audit: Of Grid usage by user/VO.
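One of the report items named on this slide is job success rates against targets. The sketch below shows one way such a report could be computed. It is only an illustration: the per-VO counts, the 80% target, and the function names are hypothetical, since the slide notes that the actual metrics and report content were still to be agreed.

```python
def success_rate(succeeded, total):
    """Return the fraction of successful jobs, or None if no jobs ran."""
    if total == 0:
        return None
    return succeeded / total


def success_report(counts, target):
    """Compare each VO's job success rate with a target fraction.

    counts maps a VO name to a (succeeded, total) pair of job counts.
    Returns a list of (vo, rate, meets_target) tuples sorted by VO name.
    A VO with no jobs has rate None and is reported as not meeting the target.
    """
    rows = []
    for vo in sorted(counts):
        succeeded, total = counts[vo]
        rate = success_rate(succeeded, total)
        meets = rate is not None and rate >= target
        rows.append((vo, rate, meets))
    return rows


if __name__ == "__main__":
    # Hypothetical counts and target, for illustration only.
    sample = {"atlas": (90, 100), "cms": (3, 4), "lhcb": (0, 0)}
    for vo, rate, meets in success_report(sample, target=0.8):
        shown = "n/a" if rate is None else f"{rate:.0%}"
        print(f"{vo}: {shown} ({'meets' if meets else 'below'} target)")
```

Treating a VO with no jobs as "below target" is a design choice for this sketch; a real report might list such VOs separately.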

  • Vision
    GridPP2 should deliver a production-quality grid:
    Meeting the computing needs of UK Particle Physics
    Autonomous and self-supporting, with its own identity
    Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members
    Part of an integrated UK Grid
    Independent but integrated, separate but seamless

  • Challenge
    LCG has given us a good base
    We now have a critical mass based on LCG2
    Make it a production-quality grid
    Attract the satellite grids: UKQCD, BaBar, ...
    And bring in other experiments
    Participate fully in LCG and EGEE
    Without alienating non-LHC experiments

  • Can we do it?
    Yes, we can!

