Page 1

Thomas Jefferson National Accelerator Facility (JLab)

6/16/09 Multi-threaded event processing with JANA -- David Lawrence

• 6 GeV electron accelerator user facility funded by the US Dept. of Energy

Located in Newport News on the east coast of Virginia, USA

• 1 of the 2 major nuclear physics research labs in the U.S. for basic research into the quark structure of nuclear matter

[Figure labels: CHL2, 12 GeV, 11 GeV]

(CD-3 approval came in Sept. 2008 with data planned in 2014)

Page 2

The GlueX Experiment in Hall-D

real γ beam

2 Tesla solenoid magnet

30 cm LH2 target

Forward EM calorimeter and forward TOF wall downstream

Cylindrical and planar drift chambers inside magnet

Barrel EM calorimeter inside magnet

Conventional meson has quantum numbers determined only by constituent quarks

Hybrid meson has some quantum properties due to contributions from the “glue”

Mapping the spectrum of light-quark, exotic, hybrid mesons

Page 3

Data Rates in 12 GeV era

Lab   Experiment  Front End DAQ Rate  Event Size  L1 Trigger Rate  Bandwidth to Mass Storage
JLab  GlueX       3 GB/s              15 kB       200 kHz          300 MB/s
JLab  CLAS12      100 MB/s            20 kB       10 kHz           100 MB/s
LHC   ALICE       500 GB/s            2.5 MB      200 kHz          200 MB/s
LHC   ATLAS       113 GB/s            1.5 MB      75 kHz           300 MB/s
LHC   CMS         200 GB/s            1 MB        100 kHz          100 MB/s
LHC   LHCb        40 GB/s             40 kB       1 MHz            100 MB/s
BNL   STAR *      8 GB/s              80 MB       100 Hz           30 MB/s
BNL   PHENIX **   900 MB/s            ~60 kB      ~15 kHz          450 MB/s

Sources: CHEP2007 talk; Sylvain Chapelin private comm.
* NIM A499 Mar. 2003 pp. 762-765
** CHEP2006 talk Martin L. Purschke

Page 4

Categories of Developers

A: Framework
B: Reconstruction
C: End users (Analysis)

Page 5

What is the Job of the Reconstruction Framework?

• Allow users to implement code in a consistent way

• Provide access to calibrations, run conditions and geometry databases

• Modular enough to allow simple programs (calibration)

• Extensible enough to allow complex programs (full reconstruction)

motherhood statements

Page 6

CPU development in the coming years

From “Platform 2015: Intel Platform Evolution for the Next Decade”

expect more than 100 cores in a box by 2014!

• CPU development has shifted from increased clock speed to multiple cores

• Dual quad-core CPUs are common today (8 cores + 8 hyperthreads)

• Some type of parallelization must be done to use all of the power in a next generation CPU

Page 7

The JANA Factory Model

• Algorithms are represented as “factory” classes
• Only const pointers are passed out of the factory (ownership stays with the factory)
• Passing out only const pointers guarantees that only the factory may modify the objects
• Subsequent requests get the same const pointers

vector<const DTrack*> tracks;
loop->Get(tracks);

• Templated Get() method helps ensure type safety
• Data-on-demand structure allows dynamic determination of which algorithms are activated for a given event
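
The factory behaviour listed above can be illustrated with a small sketch. This is not JANA's implementation: EventLoop, Factory, Hit, Track, AddFactory, NextEvent, NumCalls and MakeDemoLoop are hypothetical names chosen for the example. It shows a factory that owns its objects, hands out only const pointers, returns the same pointers on repeated requests within an event, and is activated only when some caller asks for its type.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical data classes, for illustration only.
struct Hit   { double energy; };
struct Track { int nhits; };

class EventLoop;

struct FactoryBase {
    virtual ~FactoryBase() = default;
    virtual void Reset() = 0;  // discard objects at the end of an event
};

// A factory owns its objects and hands out const pointers only.
template <typename T>
class Factory : public FactoryBase {
public:
    using Algorithm = std::function<std::vector<std::unique_ptr<T>>(EventLoop&)>;

    explicit Factory(Algorithm algo) : algo_(std::move(algo)) {}

    const std::vector<const T*>& Get(EventLoop& loop) {
        if (!evaluated_) {  // run the algorithm at most once per event
            owned_ = algo_(loop);
            for (const auto& p : owned_) view_.push_back(p.get());
            evaluated_ = true;
            ++ncalls_;
        }
        return view_;
    }

    void Reset() override { view_.clear(); owned_.clear(); evaluated_ = false; }
    int NumCalls() const { return ncalls_; }

private:
    Algorithm algo_;
    std::vector<std::unique_ptr<T>> owned_;  // ownership stays here
    std::vector<const T*> view_;             // what callers receive
    bool evaluated_ = false;
    int ncalls_ = 0;
};

class EventLoop {
public:
    template <typename T>
    void AddFactory(typename Factory<T>::Algorithm algo) {
        factories_[std::type_index(typeid(T))] =
            std::make_unique<Factory<T>>(std::move(algo));
    }

    template <typename T>
    Factory<T>& GetFactory() {
        return static_cast<Factory<T>&>(*factories_.at(std::type_index(typeid(T))));
    }

    // Templated Get(): the requested type is deduced from the container.
    template <typename T>
    void Get(std::vector<const T*>& out) { out = GetFactory<T>().Get(*this); }

    void NextEvent() { for (auto& kv : factories_) kv.second->Reset(); }

private:
    std::unordered_map<std::type_index, std::unique_ptr<FactoryBase>> factories_;
};

// Builds a loop in which Track objects are made on demand from Hit objects.
inline EventLoop MakeDemoLoop() {
    EventLoop loop;
    loop.AddFactory<Hit>([](EventLoop&) {
        std::vector<std::unique_ptr<Hit>> hits;
        for (int i = 0; i < 3; ++i) hits.push_back(std::make_unique<Hit>(Hit{1.0 * i}));
        return hits;
    });
    loop.AddFactory<Track>([](EventLoop& l) {
        std::vector<const Hit*> hits;
        l.Get(hits);  // the Hit factory runs only because Track asks for hits
        std::vector<std::unique_ptr<Track>> tracks;
        tracks.push_back(std::make_unique<Track>(Track{static_cast<int>(hits.size())}));
        return tracks;
    });
    return loop;
}
```

Requesting tracks twice in the same event returns the same pointers and runs each algorithm once; calling NextEvent() clears the factories so the next request recomputes.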

Page 8

Threads in JANA

• Each thread in JANA is composed of its own event processing loop and a complete set of factories

• Reconstruction of a given event is done entirely inside of a single thread

• No mutex locking is required by authors of reconstruction code

• Threads work asynchronously to maximize rates at the expense of not maintaining the event order on output

raw data read in

reconstructed values written out (e.g. ROOT tree)
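
The threading model described above can be sketched with standard C++ threads. This is only an illustration under stated assumptions: EventSource, Reconstruction and ProcessAll are hypothetical names, and squaring an integer stands in for reconstructing an event. Each worker has its own private reconstruction state, the only locks are in framework-level code (reading input, writing output), and results are appended in whatever order the threads finish.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical event source shared by all threads. The lock lives here,
// in "framework" code, not in the reconstruction code.
class EventSource {
public:
    explicit EventSource(std::vector<int> events) : events_(std::move(events)) {}

    bool Next(int& event) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (next_ >= events_.size()) return false;
        event = events_[next_++];
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<int> events_;
    std::size_t next_ = 0;
};

// Stand-in for one thread's private set of factories.
struct Reconstruction {
    long nprocessed = 0;  // per-thread state, never shared
    int Reconstruct(int raw) { ++nprocessed; return raw * raw; }
};

// Processes all events with nthreads workers and returns the results
// in completion order, which is generally not the input order.
std::vector<int> ProcessAll(std::vector<int> events, unsigned nthreads) {
    EventSource source(std::move(events));
    std::vector<int> output;  // e.g. rows of an output tree
    std::mutex output_mutex;

    auto worker = [&]() {
        Reconstruction reco;  // complete and private to this thread
        int event = 0;
        while (source.Next(event)) {
            int result = reco.Reconstruct(event);  // no locking needed here
            std::lock_guard<std::mutex> lock(output_mutex);
            output.push_back(result);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < nthreads; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return output;
}
```

Because output order is not preserved, a caller that needs ordered results has to sort them afterwards or carry an event number along with each result.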

Page 9

The GlueX Reconstruction Tree (factory call graph created using janadot plugin)

Number of calls and amount of time spent satisfying each is reported

Objects at bottom of graph are (mostly) supplied by event source

Arrows indicate calling sequence; data flow is in opposite direction

Page 10

Multi-threading when CPU limited

• CPU intensive jobs are the ideal application for multi-threading

• Blue circles are reconstruction of data from a Monte Carlo simulation

• Red triangles are from a CPU-hungry speed testing plugin

• Both show very good scaling of the event processing rate with the number of threads

Reconstruction of MC data, CPU bound jobs only

Overall event processing rate scales linearly with the number of threads

Page 11

Multi-threading when I/O limited

• Multiple processes accessing different locations on the same disk compete with each other, forcing the read head to physically move back and forth from one location on the disk to another

• A multi-threaded application will access a single file in sequence reducing the number of moves the read head must make

blue circles: one multi-threaded process reading from a single file

red triangles: multiple single-thread processes reading different files from the same disk

No processing of event data, I/O bound jobs only

Page 12

JANA is implemented in C++

• JANA relies heavily on templates
– Provides better type safety
– Easily extensible

• External services need not be in C++
– Calibration server
– Geometry description (XML)
– Distributed deployment layer

Page 13

Targeted Customers for JCalibration

• “B”-coders = reconstruction code authors (“A”-coders = framework authors; “C”-coders = physics analysis)

[Diagram labels: JCalibration; URL, run_number, context (string) …; Reconstruction Code; namepath (string), STL container ref.; Database; Reconstruction Program; B-coders; A-coders]

The minimal information needed from the reconstruction code is the name of the constants and a container to put them in. (The container implicitly contains a data type.)

Page 14

Calibration Web Service

• The JCalibrationWS class provides calibration constants through a web service

– Implemented as a plugin so remote access can be added to an existing executable

– Allows read-only access to calibration constants from anywhere in the world over HTTP (http://www.jlab.org/Hall-D/cgi-bin/calib)

– Uses gSOAP, a C++ SOAP implementation

– Currently works like a proxy for JCalibrationFile on server side, but could trivially be made to use another type of backend

• Calibration constants will need to be accessible from remote computers via the internet

• Direct access to a database is problematic due to cybersecurity concerns

• Web services work over HTTP and so are the appropriate mechanism for remote access


Page 15

Saving a (semi-)complete set of calibration constants to the local disk

All JANA programs have the command line option: --dumpcalibrations

• Records which namepaths are requested during a job and writes the constants into ASCII files compatible with JCalibrationFile

• Avoids copying and running an entire database, or even copying a “complete” set of calibration constants (which could include obsolete ones or ones not applicable to the current run/code version)
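
The recording idea behind this option can be sketched as follows. Everything here is an assumption made for illustration: RecordingCalibration, its Get and Dump methods, the true-on-success return convention and the simple "key value" text layout are hypothetical, and the actual JCalibrationFile format is not reproduced. The sketch only shows that constants which are never requested are never written out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

using Constants = std::map<std::string, std::string>;

// Hypothetical wrapper that remembers which namepaths a job asked for.
class RecordingCalibration {
public:
    explicit RecordingCalibration(std::map<std::string, Constants> db)
        : db_(std::move(db)) {}

    // Returns true on success (an assumed convention) and records the request.
    bool Get(const std::string& namepath, Constants& vals) {
        auto it = db_.find(namepath);
        if (it == db_.end()) return false;
        vals = it->second;
        used_[namepath] = it->second;
        return true;
    }

    // Writes only the constants that were actually requested.
    // The text layout is illustrative, not the JCalibrationFile format.
    void Dump(std::ostream& out) const {
        for (const auto& entry : used_) {
            out << "# " << entry.first << "\n";
            for (const auto& kv : entry.second) {
                out << kv.first << " " << kv.second << "\n";
            }
        }
    }

    std::size_t NumUsed() const { return used_.size(); }

private:
    std::map<std::string, Constants> db_;    // stands in for the full database
    std::map<std::string, Constants> used_;  // what this job requested
};
```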


Page 16

BCAL Beam test 2006

The JANA framework was used during a beam test in 2006 to test a 4 m long calorimeter module in a photon beam.

Reconstruction built around objects that could be constructed from multiple event sources.

Online/offline systems built around simulated data in ROOT files. Later fed from DAQ system in online environment and finally from raw data files in offline.


Page 17

The janaroot plugin (for automatic creation of ROOT TTrees)

• Each data object implements a toStrings() method which provides an expression of the data object that may not be a full representation of the object
• The toStrings() mechanism was developed for allowing a simple, low-level dump of objects from single events to the screen
• This mechanism is leveraged by janaroot to provide a similar expression as TTrees
• An empty event tree is also created with all other trees listed as friends so that leaves from multiple objects can be used together in expressions
• Limitations make this unsuitable for all applications, but it does provide a quick, easy way to make plots of some reconstructed values for less experienced users

Each leaf is an array of size “N” to represent the N objects of this type in the event

A leaf named “N” is automatically added to each tree
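
How a toStrings()-style dump can be turned into tree-like leaves is sketched below without using ROOT. The names Shower, StringPairs, Leaves and MakeLeaves are hypothetical, the toStrings() signature is an assumption (the slide does not show it), and a map of arrays stands in for a TTree. Each name reported by an object becomes a leaf holding one value per object, and a leaf named N records how many objects the event contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using StringPairs = std::vector<std::pair<std::string, std::string>>;

// Hypothetical data object. toStrings() reports a subset of its content
// as (name, value) strings; the signature is assumed for this sketch.
struct Shower {
    double E;
    int nhits;

    void toStrings(StringPairs& items) const {
        items.push_back({"E", std::to_string(E)});
        items.push_back({"nhits", std::to_string(nhits)});
    }
};

// Leaf name -> array of values for one event. A real implementation
// would fill TTree branches instead of a map.
using Leaves = std::map<std::string, std::vector<double>>;

template <typename T>
Leaves MakeLeaves(const std::vector<const T*>& objects) {
    Leaves leaves;
    leaves["N"] = {static_cast<double>(objects.size())};  // object count
    for (const T* obj : objects) {
        StringPairs items;
        obj->toStrings(items);
        for (const auto& item : items) {
            // Values arrive as strings, so they are parsed back to numbers.
            leaves[item.first].push_back(std::stod(item.second));
        }
    }
    return leaves;
}
```

The round trip through strings is one reason such a scheme may not be a full representation of the object: only what toStrings() chooses to report, and only what parses as a number, ends up in a leaf.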

Page 18

SUMMARY

• JANA is a multi-threaded event processing framework designed to build full reconstruction packages for a multi-core environment

• Numerous features, including plugins, automatic ROOT-tree generation, and a calibrations/conditions DB API with a working Web Services implementation

• Data-on-demand design makes it suitable for an L3 trigger algorithm.

http://www.jlab.org/JANA

Page 19

Backup slides


Page 20

CEBAF

The Continuous Electron Beam Accelerator Facility

Electron beam accelerator

• continuous-wave (1497 MHz, 2 ns bunch structure in halls)

• Polarized electron beam

• Upgrading to 12 GeV (from 6 GeV)

• 70 μA max @ 12 GeV (200 μA max @ 6 GeV)

Existing experimental halls A, B, C

Future Hall-D site


Page 21

Plugins

A plugin defines one external routine:

void InitPlugin(JApplication *app)

Plugins can be used to add:

• Event Processors
• Event Source Generators
• Factory Generators
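
A plugin's single entry point might look like the sketch below. DemoApplication is a hypothetical stand-in for the application object, its three Add... methods take plain strings purely for illustration, and the registered names are invented; the real JApplication interface is not shown on the slide.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the application object a plugin receives.
struct DemoApplication {
    std::vector<std::string> processors;
    std::vector<std::string> source_generators;
    std::vector<std::string> factory_generators;

    void AddProcessor(const std::string& name) { processors.push_back(name); }
    void AddEventSourceGenerator(const std::string& name) { source_generators.push_back(name); }
    void AddFactoryGenerator(const std::string& name) { factory_generators.push_back(name); }
};

// The single external routine of the plugin. extern "C" keeps the symbol
// name unmangled so a loader can look it up by name in a shared library.
extern "C" void InitPlugin(DemoApplication* app) {
    app->AddProcessor("MyHistogramProcessor");            // an event processor
    app->AddEventSourceGenerator("MyFileFormatGenerator"); // a new input format
    app->AddFactoryGenerator("MyTrackingFactories");       // new algorithms
}
```

In a real deployment the routine would be compiled into a shared library and resolved at run time (for example with dlopen and dlsym on POSIX systems), which is what lets features be attached to an existing executable without relinking it.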

Page 22

API For Accessing Constants

Constants can be stored in either arrays (1D) or tables (2D) and can be indexed either by name (key-value) or by position.

arrays:

// Get 1-D array of values indexed by name
bool Get(string namepath, map<string, T> &vals)

// Get 1-D array of values indexed by row
bool Get(string namepath, vector<T> &vals)

tables:

// Get 2-D table of values indexed by row and name
bool Get(string namepath, vector< map<string, T> > &vals)

// Get 2-D table of values indexed by row and column
bool Get(string namepath, vector< vector<T> > &vals)

discovery:

// Get list of available namepaths from backend
void GetListOfNamepaths(vector<string> &namepaths)
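
The overloads above can be mimicked in a few lines to show how one templated Get() serves several container shapes. DemoCalibration, Store and ConvertRow are hypothetical names, only the two name-indexed overloads are sketched, and the true-on-success return value is an assumption. Constants are held as strings and converted to the caller's type with a string stream.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory calibration store. Constants are kept as strings
// and converted to the caller's type T on the way out.
class DemoCalibration {
public:
    using Row = std::map<std::string, std::string>;

    void Store(const std::string& namepath, std::vector<Row> rows) {
        tables_[namepath] = std::move(rows);
    }

    // 1-D array of values indexed by name (first row of the stored table).
    template <typename T>
    bool Get(const std::string& namepath, std::map<std::string, T>& vals) const {
        auto it = tables_.find(namepath);
        if (it == tables_.end() || it->second.empty()) return false;
        vals = ConvertRow<T>(it->second.front());
        return true;
    }

    // 2-D table of values indexed by row and name.
    template <typename T>
    bool Get(const std::string& namepath,
             std::vector<std::map<std::string, T>>& vals) const {
        auto it = tables_.find(namepath);
        if (it == tables_.end()) return false;
        vals.clear();
        for (const Row& row : it->second) vals.push_back(ConvertRow<T>(row));
        return true;
    }

    // Discovery: list every namepath the store knows about.
    void GetListOfNamepaths(std::vector<std::string>& namepaths) const {
        namepaths.clear();
        for (const auto& entry : tables_) namepaths.push_back(entry.first);
    }

private:
    template <typename T>
    static std::map<std::string, T> ConvertRow(const Row& row) {
        std::map<std::string, T> out;
        for (const auto& kv : row) {
            std::istringstream ss(kv.second);  // string -> T via stream extraction
            T value{};
            ss >> value;
            out[kv.first] = value;
        }
        return out;
    }

    std::map<std::string, std::vector<Row>> tables_;
};
```

The container passed by the caller selects both the overload and the value type, so the namepath and the container are the only things the calling code has to supply.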

Page 23

Example of Accessing Calibration Constants as key-value pairs

... in factory class definition ...

double slope, offset, exponent;

... in brun() method ...

map<string, double> twpars;
loop->Get("FDC/driftvelocity/timewalk_parameters", twpars);
slope = twpars["slope"];
offset = twpars["offset"];
exponent = twpars["exponent"];

Template method converts values to doubles using stringstream class

For a few parameters like this, it makes sense to copy them into local data members of the factory class

Page 24

Backend Requirements

To implement a JCalibration interface to a new backend, only 3 virtual methods need to be implemented!

virtual bool GetCalib(string namepath, map<string, string> &svals)
virtual bool GetCalib(string namepath, vector< map<string, string> > &svals)
virtual void GetListOfNamepaths(vector<string> &namepaths)

OK, actually, it’s closer to 6, but these 3 are for the generator class and are trivial.

virtual const char* Description(void)
virtual double CheckOpenable(string url, int run, string context)
virtual JCalibration* MakeJCalibration(string url, int run, string context)

• The generator mechanism allows access to multiple types of databases in the same executable.
• Access to a new type can even be brought in through a plugin.
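
To give a feel for how small a backend can be, here is a sketch of an in-memory one. CalibrationBackend is a hypothetical abstract class that mirrors the three virtual methods listed above (it is not the real JCalibration base class), InMemoryBackend is invented for the example, and returning true on success is an assumption. The generator side is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Row = std::map<std::string, std::string>;

// Hypothetical interface mirroring the three virtual methods on the slide.
class CalibrationBackend {
public:
    virtual ~CalibrationBackend() = default;
    virtual bool GetCalib(std::string namepath, Row& svals) = 0;
    virtual bool GetCalib(std::string namepath, std::vector<Row>& svals) = 0;
    virtual void GetListOfNamepaths(std::vector<std::string>& namepaths) = 0;
};

// A backend that serves constants from a map held in memory.
class InMemoryBackend : public CalibrationBackend {
public:
    explicit InMemoryBackend(std::map<std::string, std::vector<Row>> data)
        : data_(std::move(data)) {}

    // Key-value access: the first row of the stored table.
    bool GetCalib(std::string namepath, Row& svals) override {
        std::vector<Row> rows;
        if (!GetCalib(namepath, rows) || rows.empty()) return false;
        svals = rows.front();
        return true;
    }

    // Table access: every row stored under the namepath.
    bool GetCalib(std::string namepath, std::vector<Row>& svals) override {
        auto it = data_.find(namepath);
        if (it == data_.end()) return false;
        svals = it->second;
        return true;
    }

    void GetListOfNamepaths(std::vector<std::string>& namepaths) override {
        namepaths.clear();
        for (const auto& entry : data_) namepaths.push_back(entry.first);
    }

private:
    std::map<std::string, std::vector<Row>> data_;
};
```

Because the backend deals only in strings, conversion to numeric types can stay in the shared templated layer, which keeps each new backend limited to these three methods.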

Page 25

Configuration Parameters


Page 26

Implementing an Event Source

To implement an event source, 2 classes must be provided:

JEventSourceGenerator

const char* Description(void);
double CheckOpenable(string source);
JEventSource* MakeJEventSource(string source);

JEventSource

jerror_t GetEvent(JEvent &event);
void FreeEvent(JEvent &event);
jerror_t GetObjects(JEvent &event, JFactory_base *factory);
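
The two-class pattern can be sketched with simplified stand-ins. None of the names below are JANA classes: DemoEvent, DemoEventSource, DemoEventSourceGenerator, CountingSource, CountingGenerator and OpenSource are hypothetical, GetObjects() is omitted, and the rule that the generator with the highest CheckOpenable() value wins is an assumption about how such a score would be used.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct DemoEvent { int number = 0; };

class DemoEventSource {
public:
    virtual ~DemoEventSource() = default;
    virtual bool GetEvent(DemoEvent& event) = 0;   // false once exhausted
    virtual void FreeEvent(DemoEvent& event) = 0;  // release per-event resources
};

class DemoEventSourceGenerator {
public:
    virtual ~DemoEventSourceGenerator() = default;
    virtual const char* Description() const = 0;
    // Assumed convention: 0 means "cannot open", larger means "better match".
    virtual double CheckOpenable(const std::string& source) const = 0;
    virtual std::unique_ptr<DemoEventSource> MakeEventSource(const std::string& source) const = 0;
};

// Toy source: the string "count:5" yields events numbered 1..5.
class CountingSource : public DemoEventSource {
public:
    explicit CountingSource(int n) : n_(n) {}
    bool GetEvent(DemoEvent& event) override {
        if (next_ >= n_) return false;
        event.number = ++next_;
        return true;
    }
    void FreeEvent(DemoEvent&) override {}  // nothing to free in this toy source

private:
    int n_;
    int next_ = 0;
};

class CountingGenerator : public DemoEventSourceGenerator {
public:
    const char* Description() const override { return "Counting demo source"; }
    double CheckOpenable(const std::string& source) const override {
        return source.compare(0, 6, "count:") == 0 ? 1.0 : 0.0;
    }
    std::unique_ptr<DemoEventSource> MakeEventSource(const std::string& source) const override {
        return std::make_unique<CountingSource>(std::stoi(source.substr(6)));
    }
};

using Generators = std::vector<std::unique_ptr<DemoEventSourceGenerator>>;

// Asks every generator how well it can open the source and uses the best one.
std::unique_ptr<DemoEventSource> OpenSource(const Generators& generators,
                                            const std::string& source) {
    const DemoEventSourceGenerator* best = nullptr;
    double best_score = 0.0;
    for (const auto& gen : generators) {
        double score = gen->CheckOpenable(source);
        if (score > best_score) { best_score = score; best = gen.get(); }
    }
    if (!best) return nullptr;  // no generator recognised the source
    return best->MakeEventSource(source);
}
```

Splitting the generator from the source means that recognising an input and reading it are separate concerns, so several input formats can coexist in one executable.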

Page 27

JANA’s GUI API

• JANA has publicly accessible, lower-level methods that allow the main “event” loop to be external to JEventLoop