Development of the distributed monitoring system for the NICA cluster

Ivan Slepov (LHEP, JINR)

Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013

The MultiPurpose Detector (MPD) to study heavy ion collisions at NICA

Software for MultiPurpose Detector

MpdRoot Framework = ROOT + FairRoot (FairBase + FairSoft software packages)

components:

- Detectors simulation
- Data reconstruction
- Event analysis

Computing resources for MPD data processing

CPU: 128 Xeon cores (in future ~10 000 Xeon cores)

GPU: ~1500 Tesla cores

Motivation to develop monitoring system

- Computing resources information (free space, memory, CPU, etc.)

- System load (load average, processes)

- MPD software information (FairSoft version)

- Cluster software information (SGE, xrootd, PROOF)

- User tasks monitoring (batch processing and interactive jobs)

MPD users need more information about all of their own cluster nodes and public computers!
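The kind of per-node figures listed above (free space, memory, load average) can be gathered with a few standard-library calls. The following is only a minimal sketch in Python, not the scripts used on the NICA cluster (the slides name BASH scripts for that role); the function names `parse_meminfo` and `collect_node_metrics` and the dictionary keys are hypothetical.

```python
import os
import platform
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def collect_node_metrics(path="/"):
    """Collect a small snapshot of resource and load figures for this node."""
    disk = shutil.disk_usage(path)
    metrics = {
        "host": platform.node(),
        "disk_free_bytes": disk.free,
        "disk_total_bytes": disk.total,
        "cpu_count": os.cpu_count(),
    }
    try:
        # Load average is only available on Unix-like systems.
        load1, load5, load15 = os.getloadavg()
        metrics.update(load1=load1, load5=load5, load15=load15)
    except (AttributeError, OSError):
        pass
    try:
        # /proc/meminfo exists on Linux only; skip memory figures elsewhere.
        with open("/proc/meminfo") as f:
            mem = parse_meminfo(f.read())
        metrics["mem_total_kb"] = mem.get("MemTotal")
        metrics["mem_available_kb"] = mem.get("MemAvailable")
    except OSError:
        pass
    return metrics


if __name__ == "__main__":
    for name, value in sorted(collect_node_metrics().items()):
        print(name, value)
```

Run on a node, the script prints one line per metric; a collector could just as well serialize the dictionary and ship it to a central store.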

Monitoring system schemes

Scheme 1: collecting general information

- MySQL DB
- BASH scripts
- DSH software
- Cron run job
- PHP scripts
- Web interface

Scheme 2: collecting information about user tasks and providing data management

- Web interface
- PHP scripts
- DSH software
- BASH scripts
- MySQL DB
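One plausible reading of Scheme 1, stated here as an assumption since the slide text does not spell out the data flow, is that a periodic job runs collection scripts on the nodes and the results land in a database that the web interface then queries. The sketch below illustrates only the storage and "latest value per node" query step. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server; the table name `node_metrics`, its columns, and the helper names are hypothetical.

```python
import sqlite3
import time

# Hypothetical schema: one row per node per collection run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS node_metrics (
    host            TEXT NOT NULL,
    collected_at    REAL NOT NULL,
    load1           REAL,
    disk_free_bytes INTEGER
)
"""


def open_db(path=":memory:"):
    """Open the database (in memory by default) and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_sample(conn, host, load1, disk_free_bytes, collected_at=None):
    """Insert one measurement; the timestamp defaults to the current time."""
    if collected_at is None:
        collected_at = time.time()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO node_metrics VALUES (?, ?, ?, ?)",
            (host, collected_at, load1, disk_free_bytes),
        )


def latest_per_host(conn):
    """Return the most recent sample for every host, ordered by host name."""
    return conn.execute(
        """
        SELECT m.host, m.collected_at, m.load1, m.disk_free_bytes
        FROM node_metrics AS m
        JOIN (SELECT host, MAX(collected_at) AS ts
              FROM node_metrics GROUP BY host) AS last
          ON last.host = m.host AND last.ts = m.collected_at
        ORDER BY m.host
        """
    ).fetchall()
```

A web page listing the current state of each node would only need the rows returned by `latest_per_host`; older rows stay available for history plots.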

Web interface for the monitoring system

1. MPD software information

2. Computing resources information

3. System load

4. User tasks monitoring

Monitoring system web interface: User tasks

Monitoring system web interface: Interactive nodes

Access to the monitoring system on the website mpd.jinr.ru

Thank you for your attention!

Motivation to develop monitoring system

MPD users need more information about all of their own cluster nodes and public computers!

Why, if, for example, the grid concept uses a layer of abstraction from the resources?

Because the MPD software is still under development and needs testing and debugging.
