NorthGrid status
Alessandra Forti
GridPP13
Durham, 4 July 2005
4 July 2005 Alessandra Forti GridPP13 Durham
Outline
• Sites Resources Summary
• Posts
• Site-by-site situation
  – Lancaster, Liverpool, Manchester, Sheffield
• Installation and configuration
• dcache
• Communication
• Security
• Conclusions
Sites Resources Summary
Site        kSI2k   TB     Network         Status
Lancaster   464.6   2+30   290MB ukLight   online
Liverpool   0       0      1 GB            offline
Manchester  40      0.5    1 GB            online
Sheffield   240     2      1 GB            online
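As a quick check on the table above, the CPU figures can be totalled per status. This is only a minimal sketch: the numbers are copied from the table, while the dictionary layout and the `total_ksi2k` helper are illustrative names and not part of any NorthGrid tooling. Storage is left out because the "2+30" entry is not a single number.

```python
# kSI2k and status per site, copied from the Sites Resources Summary table.
# The data layout and helper name are illustrative assumptions.
sites = {
    "Lancaster": {"ksi2k": 464.6, "status": "online"},
    "Liverpool": {"ksi2k": 0.0, "status": "offline"},
    "Manchester": {"ksi2k": 40.0, "status": "online"},
    "Sheffield": {"ksi2k": 240.0, "status": "online"},
}


def total_ksi2k(sites, status=None):
    """Sum kSI2k over all sites, or only over sites with the given status."""
    return sum(
        site["ksi2k"]
        for site in sites.values()
        if status is None or site["status"] == status
    )


total = total_ksi2k(sites)
online = total_ksi2k(sites, status="online")
print(f"Total: {total:.1f} kSI2k, online: {online:.1f} kSI2k")
```

Both totals come to 744.6 kSI2k, since the only offline site (Liverpool) currently contributes no CPU.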
Sites Resources Summary
Site        OS    LCG     SRM            VOs
Lancaster   SL3   2_4_0   Yes (dcache)   5
Liverpool   SL3   2_4_0   No             3
Manchester  SL3   2_4_0   Yes (dcache)   7
Sheffield   SL3   2_4_0   No             6
Posts
• 4.5 FTE have been filled
  – Lancaster: Matt Doidge
  – Liverpool: Pawel Trepka
  – Manchester: Marc Kelly, Colin Morey
  – Sheffield: Andrew Beresford (0.5 FTE)
• Each post works for its own university and there is no common effort, but the post holders meet at the monthly technical meeting to report on their sites and exchange information.
Lancaster
• New farm was installed in May
  – A few problems with RGMA and publishing
    • the latter mostly due to a change in the name of the CE
• SC3 participation
  – Connected to ukLight and to the production network
    • In the process of subnetting and dual homing
  – Installing two dcache storage elements
    • To avoid overlap between production network and ukLight traffic
  – Installing other required software like FTS, LFC
Liverpool
• Still offline
• The new post will hopefully solve the manpower problem
• A peak of effort over the last 10 days to install LCG
  – Didn’t quite make it for today, but the effort will hopefully continue after GridPP
• Perhaps next quarter Liverpool will be online?
Manchester
• At the moment
  – Still online with the old cluster: 40 kSI2k
  – UI software being installed on department desktops
  – Dcache installed on 2 WNs
• The order for the new cluster has been placed
  – It is due to arrive at the beginning of August
  – Electricity bills sorted
    • sharing part of the CPUs with engineers (how has not been discussed yet)
• Main effort dedicated to preparing the structure
  – Setting up servers, networking, monitoring, security
  – Surveying assembly of the nodes, installation and testing
  – Establishing cooperative relations with MCC
  – Working on dcache configuration
Sheffield
• Not much to say about Sheffield
  – It is the best site in NorthGrid
    • Always on time with updates
    • Cluster always full of jobs
    • Totalled ~15300 kSI hours, ~5 times more than the second site
    • Really active Atlas user asking a lot of user questions
  – The only note I can make is that they tend not to answer emails
    • They fix the problem and don’t close the ticket!
Installation and configuration
• Ease of installation greatly improved
  ☺ YAIM installs and configures a standard site very easily
  ☺ YAIM can be easily debugged and fixed when things go wrong
  ☺ YAIM can be extended for non-standard sites
  ☺ Can be plugged into a kickstart
  × YAIM doesn’t configure a site to be secure
    • Security recipes should go in the installation notes or on a security web site, not in the scripts
  ☺ In general it has made sys admins’ lives much easier!
Dcache
• Lancaster and Manchester have installed dcache
  – Different configurations:
    • Lancaster has dedicated data servers
    • Manchester is using the disk space of (2) WNs
  – YAIM merely installs the components and starts the services
    • It configures neither dcache nor the Info System (waiting for the new LCG release to see improvements)
  – Dcache configuration documentation is lacking
  – Examples of hardware configuration and requirements are lacking
  – SRM available space per VO cannot be easily calculated
    • Matter of configuration again?
• Installation experience has been very different from site to site
Communication with the external world
• Users and experiments should open tickets, not write directly to the sys admin
  – Users don’t have to know site contact addresses
  – Miscommunication is avoided
  – Tickets send automatic reminders
  – An escalation procedure is followed if the problem is not solved
  – Tickets are traceable
• Sys admins should write to the ROC about rogue users
  – The sys admin doesn’t have to know the user’s personal email
  – Miscommunication is avoided
  – The user might not be a user but a hacker: better to have someone investigate it
Security
• Has it been said often enough?
  – Very simple things to check and block at OS configuration level that can improve security
    • Portmap
    • Ssh
    • Inetd
    • Cron acls
    • Root password
    • Switching off unused services (port table out of date and confusing)
    • Monitoring ports for services
    • Monitoring network traffic when something weird is noticed
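The "monitoring ports for services" item can be illustrated with a small probe. This is a minimal sketch and not a tool mentioned in the talk: the watch list (ssh on TCP 22, portmap on TCP 111), the function names, and the loopback-only scope are all assumptions made for illustration. It only tests TCP, whereas portmap also answers on UDP.

```python
import socket

# Hypothetical watch list: the conventional TCP ports for ssh and portmap.
PORTS_TO_CHECK = {22: "ssh", 111: "portmap"}


def is_listening(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def report(ports=PORTS_TO_CHECK):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for port, name in ports.items()}


if __name__ == "__main__":
    for name, is_open in report().items():
        print(f"{name}: {'open' if is_open else 'closed'}")
```

Run periodically on a node, a probe like this would flag a service that has been left on after it was meant to be switched off. It does not replace firewall rules or the OS-level checks listed above.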
Conclusions
• NorthGrid is slowly coming together
  – Posts are in place
  – Expected equipment has been set up or is on its way
  – Technical information is exchanged during the monthly technical meetings
    • It could be more often, but one step at a time
  – Networking is being followed up at two sites
  – SRM/dcache has been installed at two sites
  – One of the sites has a very good running record