PROOF system for parallel MPD event processing

Gertsenberger K. V.

Joint Institute for Nuclear Research, Dubna

NICA scheme

Multipurpose Detector (MPD)

The MPDRoot software is being developed for MPD event simulation, reconstruction of experimental or simulated data, and the subsequent physics analysis of heavy-ion collisions registered by the MultiPurpose Detector at the NICA collider.

Prerequisites of the parallel processing

high interaction rate (up to 6 kHz)

high particle multiplicity: about 1000 charged particles for a central collision at the NICA energy

one event reconstruction currently takes tens of seconds in MPDRoot, so 1M events would take months

large data stream from the MPD: estimated at 5 to 10 PB of raw data per year; 1M simulated events ~ 50 TB

MPD event data can be processed concurrently

the ability to use multicore / multiprocessor machines, computing clusters and, subsequently, a GRID system
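
The "tens of seconds per event, months for 1M events" estimate can be checked with a quick calculation. The sketch below assumes 20 s per event as one illustrative value for "tens of seconds", a hypothetical pool of 100 workers, and ideal linear scaling; none of these numbers are measured MPDRoot figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def processing_days(n_events, seconds_per_event, n_workers=1):
    """Wall-clock days to process n_events, assuming ideal (linear) scaling."""
    return n_events * seconds_per_event / n_workers / SECONDS_PER_DAY


# Assumed inputs: 1M events, 20 s per event (illustrative, not measured).
sequential_days = processing_days(1_000_000, 20)
# Hypothetical pool of 100 workers with perfect scaling.
parallel_days = processing_days(1_000_000, 20, n_workers=100)

print(f"sequential: {sequential_days:.0f} days, 100 workers: {parallel_days:.1f} days")
```

Under these assumptions a sequential run takes the better part of a year, which is the motivation for event-level parallelism.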

Current NICA cluster in LHEP

Data storage on the NICA cluster

Distributed file system GlusterFS

it aggregates existing file systems in a common distributed file system

automatic replication works as a background process

background self-checking service restores corrupted files in case of hardware or software failure

Parallel MPD event processing

PROOF server: parallel data processing in ROOT macros on parallel architectures

concurrent event processing

MPD-scheduler: scheduling system that distributes tasks to parallelize data processing on the cluster nodes

Parallel data processing with PROOF

PROOF (Parallel ROOT Facility) is part of the ROOT software, so no additional installation is needed

PROOF uses data-independent parallelism based on the lack of correlation between MPD events, which gives good scalability

Parallelization for three parallel architectures:

1. PROOF-Lite parallelizes the data processing on a single multiprocessor/multicore machine

2. PROOF parallelizes processing on a heterogeneous computing cluster

3. Parallel data processing in a GRID system

Transparency: the same program code can execute both sequentially and concurrently
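
The idea of data-independent parallelism and transparency can be illustrated outside ROOT. In the minimal sketch below, `reconstruct` is a hypothetical stand-in for per-event work, and Python threads stand in for PROOF workers (PROOF itself runs separate worker processes); the point is only that the same per-event code gives the same result whether it is run sequentially or concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event_id):
    """Hypothetical per-event task: it depends only on its own event."""
    return event_id, sum(i * i for i in range(event_id + 1))


def run(event_ids, n_workers=None):
    """Process events sequentially (n_workers is None) or concurrently."""
    if n_workers is None:
        return [reconstruct(e) for e in event_ids]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order, so both modes return the same list.
        return list(pool.map(reconstruct, event_ids))


events = range(100)
sequential = run(events)
parallel = run(events, n_workers=4)
print(sequential == parallel)
```

Because no event needs data from another event, the work can be divided among any number of workers without changing the per-event code.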

Using PROOF in MPDRoot

The last parameter of the reconstruction macro is run_type (default: "local").

Speedup on the user's multicore machine:

$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof")'

parallel processing of 1000 events with the thread count equal to the number of logical processors

$ root 'reco.C("evetest.root", "mpddst.root", 0, 500, "proof:workers=3")'

parallel processing of 500 events with three concurrent threads

Speedup on the NICA cluster:

$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof:[email protected]:21001")'

parallel processing of 1000 events on all cores of the PoD farm

$ root 'reco.C("evetest.root", …, 0, 500, "proof:[email protected]:21001:workers=15")'

parallel processing of 500 events on the PoD cluster with 15 workers

XRootD files support
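
The run_type strings above follow a visible pattern: "local", or "proof" optionally followed by a master address and a "workers=N" suffix. The sketch below shows one way such a string could be decomposed. It is not the MPDRoot implementation; `RunType`, `parse_run_type` and the address `user@example.org:21001` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunType:
    mode: str               # "local" or "proof"
    master: Optional[str]   # None: no remote master (local multicore run)
    workers: Optional[int]  # None: use the default worker count


def parse_run_type(run_type: str = "local") -> RunType:
    """Split a run_type string into mode, master address and worker count."""
    if run_type == "local":
        return RunType("local", None, None)
    if run_type != "proof" and not run_type.startswith("proof:"):
        raise ValueError(f"unsupported run_type: {run_type!r}")
    parts = run_type.split(":")[1:]  # drop the leading "proof"
    workers = None
    if parts and parts[-1].startswith("workers="):
        workers = int(parts.pop()[len("workers="):])
    # Whatever remains is the master address, e.g. "user@host:port".
    master = ":".join(parts) or None
    return RunType("proof", master, workers)


print(parse_run_type("proof:workers=3"))
print(parse_run_type("proof:user@example.org:21001:workers=15"))
```

The "workers=" suffix is handled first because the master address itself contains a colon before the port number.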

The speedup of the reconstruction on a 4-core machine
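
For reference, speedup is conventionally defined as the sequential run time divided by the parallel run time, and parallel efficiency as the speedup divided by the number of workers. The sketch below uses hypothetical timings chosen only for illustration; they are not the measurements behind this slide.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential to parallel wall-clock time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, n_workers):
    """Speedup per worker; 1.0 would mean ideal linear scaling."""
    return speedup(t_sequential, t_parallel) / n_workers


# Hypothetical timings in seconds (illustrative only, not measured data).
t1, t4 = 400.0, 125.0
print(speedup(t1, t4), efficiency(t1, t4, 4))
```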

PROOF on the NICA cluster

[Diagram: the Proof On Demand cluster, with one PROOF master server and PROOF slave nodes labelled (10), (10) and (14); *.root files are stored on GlusterFS. The events of evetest.root (event №0, event №1, event №2) are processed on the cluster and the result is written to mpddst.root.]

$ root 'reco.C("evetest.root", "mpddst.root", 0, 3, "proof:[email protected]:21001")'

The fourth argument (3) is the event count.
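
The way an event range is divided among workers can be sketched as follows. PROOF assigns work to its workers dynamically in packets; the static, contiguous split below is a simplification used only to illustrate the idea, and `split_events` is a hypothetical helper, not part of ROOT or MPDRoot.

```python
def split_events(start_event, event_count, n_workers):
    """Split [start_event, start_event + event_count) into contiguous chunks.

    Returns a list of (first, last_exclusive) pairs, one per busy worker.
    """
    base, extra = divmod(event_count, n_workers)
    chunks, first = [], start_event
    for worker in range(n_workers):
        # The first `extra` workers take one additional event each.
        size = base + (1 if worker < extra else 0)
        if size:
            chunks.append((first, first + size))
        first += size
    return chunks


# Three events on three workers: one event each.
print(split_events(0, 3, 3))
```

With three events and three workers each worker receives exactly one event, matching the three-event example on this slide.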

The speedup of the reconstruction on the NICA cluster

The description of the PROOF system on mpd.jinr.ru

Conclusions

The distributed NICA cluster (128 cores) was deployed on the LHEP farm for the NICA/MPD experiment (FairSoft, ROOT/PROOF, MPDRoot, GlusterFS).

The data storage (10 TB) was organized with the GlusterFS distributed file system: /nica/mpd[1-8].

A PROOF On Demand cluster consisting of the nc10 (with the PoD server), nc11 and nc13 machines, with 34 processor cores in total, was implemented to parallelize event data processing for the MPD experiment. PROOF support was added to the reconstruction macro.

The web site mpd.jinr.ru presents the manual for the PROOF system in the section Computing – NICA cluster – PROOF parallelize.
