Helmholtz International Center for CBM – Online Reconstruction and Event Selection Open Charm...


Upload: lilian-cooper

Post on 12-Jan-2016


Helmholtz International Center for CBM – Online Reconstruction and Event Selection

Open Charm Event Selection – Driving Force for FEE and DAQ

Open charm:

D± (cτ = 312 μm): D⁺ → K⁻π⁺π⁺ (9.5%)

D⁰ (cτ = 123 μm): D⁰ → K⁻π⁺ (3.8%), D⁰ → K⁻π⁺π⁺π⁻ (7.5%)

Ds± (cτ = 150 μm): Ds⁺ → K⁺K⁻π⁺ (5.3%)

Λc⁺ (cτ = 60 μm): Λc⁺ → pK⁻π⁺ (5.0%)

No simple, single-track-level trigger primitive, such as high p_t, is available to tag events of interest.

The only selective signature is the detection of the decay vertex.

Track reconstruction in the STS/MVD and a displaced vertex search are required at the first trigger level.

Such a complex trigger is not feasible within the latency limits of conventional Front-End Electronics, typically 4 μsec at LHC.
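To make the decay-vertex signature concrete, the mean laboratory decay length follows from cτ as L = βγ·cτ = (p/m)·cτ. The sketch below evaluates this for the four open charm hadrons, using the cτ values of 312, 123, 150 and 60 μm from the list above. The masses are rounded PDG values and the momentum of 5 GeV/c is an arbitrary illustrative choice; neither is taken from the poster.

```python
# Mean laboratory decay length L = beta*gamma * c*tau = (p / m) * c*tau.
# c*tau values (micrometres) are the ones quoted for the open charm hadrons.
# ASSUMPTIONS: masses (GeV/c^2) are rounded PDG values, and the momentum used
# below is an arbitrary illustrative choice, not a number from the poster.
C_TAU_UM = {"D+-": 312.0, "D0": 123.0, "Ds": 150.0, "Lambda_c": 60.0}
MASS_GEV = {"D+-": 1.870, "D0": 1.865, "Ds": 1.968, "Lambda_c": 2.286}


def mean_decay_length_um(particle: str, momentum_gev: float) -> float:
    """Return the mean decay length in micrometres for a given momentum."""
    beta_gamma = momentum_gev / MASS_GEV[particle]
    return beta_gamma * C_TAU_UM[particle]


for name in C_TAU_UM:
    length = mean_decay_length_um(name, 5.0)
    print(f"{name:9s} p = 5 GeV/c  ->  L = {length:6.1f} um")
```

Under these assumptions all decay lengths stay below one millimetre, which illustrates why the selection depends on precise track reconstruction close to the target rather than on a single-track quantity.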

Work without L1 trigger

Use Self-triggered Front-End Electronics

Use timestamps to organize and correlate data

Ship all hits to subsequent data buffer and processing stages
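A minimal sketch of how timestamps can organize data from self-triggered electronics: hits arrive without a trigger, are sorted by time, and are split into groups wherever the gap between consecutive hits exceeds a threshold. The `Hit` fields, the function name `group_by_time` and the gap value are hypothetical and chosen for illustration only; the poster does not describe the actual correlation procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    channel: int     # hypothetical channel identifier
    time_ns: float   # timestamp attached by the self-triggered front-end


def group_by_time(hits, max_gap_ns):
    """Sort hits by timestamp and split them wherever the gap exceeds max_gap_ns."""
    groups, current = [], []
    for hit in sorted(hits, key=lambda h: h.time_ns):
        if current and hit.time_ns - current[-1].time_ns > max_gap_ns:
            groups.append(current)
            current = []
        current.append(hit)
    if current:
        groups.append(current)
    return groups


# Hits arrive unordered, as they would from independent channels.
hits = [Hit(3, 105.0), Hit(1, 100.0), Hit(7, 230.0), Hit(2, 102.5), Hit(5, 228.0)]
for index, group in enumerate(group_by_time(hits, max_gap_ns=20.0)):
    print(index, [(h.channel, h.time_ns) for h in group])
```

At 10^7 interactions per second the mean spacing between interactions is only 100 ns, so interactions can overlap in time and a simple gap criterion like this one would not be sufficient on its own.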

High-Speed DAQ and Event Building

Typical parameters (for 10^7 int/sec and 1% occupancy):
- 100 kHz channel hit rate
- 600 kByte/sec per channel data flow

First level event selection, which replaces the L1 trigger in a conventional system, is done in a processor farm fed with data from the event building network

Very efficient tracking algorithms are essential for the feasibility of the open charm event selection

Up to 10^9 tracks/sec in the Silicon tracker

Co-develop Silicon tracker layout and tracking algorithm for best overall performance

Develop algorithms which exploit the full potential of modern processors. First step:
- use 'Single Instruction Multiple Data' (SIMD) instructions. They are essential for the high performance of many multi-media applications (e.g. video codecs), but rarely used in data analysis.

Best results were obtained with a Cellular Automaton based track finder with integrated Kalman filter track fit

allows usage of double-sided strip detectors even at high track densities

highly optimized code
- field approximated by polynomials
- compact, cache-efficient data
- most calculations SIMDized
- fast on standard PCs
- well adapted to next generation many-core and wide-SIMD processors
- already ported to IBM Cell processor

very fast when only hard quasi-primary tracks are reconstructed, as needed in the online first level event selection of open charm candidates

supports reconstruction of soft tracks down to 100 MeV/c, as needed in the offline analysis
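The idea behind a Cellular Automaton track finder can be shown with a toy model. This is a sketch under strong simplifying assumptions (one coordinate per hit, equally spaced planes, straight tracks, no magnetic field) and not the track finder described here. Cells are segments between hits on adjacent planes. Each cell's state grows with the length of the chain of compatible cells to its left. Track candidates are then read out starting from the cells with the highest state. The function name and the cut values are hypothetical.

```python
def find_tracks(layers, max_slope=0.5, max_kink=0.1, min_hits=3):
    """layers[i] lists hit positions x on plane i (unit spacing, unique per plane)."""
    # 1. Build cells: segments joining hits on adjacent planes.
    cells = [(i, xl, xr)
             for i in range(len(layers) - 1)
             for xl in layers[i]
             for xr in layers[i + 1]
             if abs(xr - xl) <= max_slope]

    # 2. Left neighbours: cells ending on this cell's first hit with similar slope.
    def kink(a, b):
        return abs((a[2] - a[1]) - (b[2] - b[1]))

    neighbours = {c: [n for n in cells
                      if n[0] == c[0] - 1 and n[2] == c[1] and kink(c, n) <= max_kink]
                  for c in cells}

    # 3. Evolve the automaton synchronously until no state changes any more.
    state = {c: 1 for c in cells}
    changed = True
    while changed:
        new_state = {c: 1 + max((state[n] for n in neighbours[c]), default=0)
                     for c in cells}
        changed = new_state != state
        state = new_state

    # 4. Collect candidates, longest chains first, never reusing a hit.
    tracks, used = [], set()
    for cell in sorted(cells, key=lambda c: state[c], reverse=True):
        chain = [cell]
        while state[chain[-1]] > 1:
            last = chain[-1]
            previous = [n for n in neighbours[last] if state[n] == state[last] - 1]
            chain.append(min(previous, key=lambda n: kink(last, n)))
        chain.reverse()
        hits = [(chain[0][0], chain[0][1])] + [(c[0] + 1, c[2]) for c in chain]
        if len(hits) >= min_hits and not used.intersection(hits):
            tracks.append(hits)
            used.update(hits)
    return tracks


# Two straight tracks plus one isolated noise hit (x = 5.0 on plane 1).
layers = [[0.0, 2.0], [0.2, 1.8, 5.0], [0.4, 1.6], [0.6, 1.4]]
for track in find_tracks(layers):
    print(track)
```

Because every cell only looks at its local neighbours, the evolution step is the same small computation repeated over many cells, which is the property that makes this class of algorithm attractive for vectorized and parallel hardware.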

High Speed Tracking Algorithms

Source: I. Kisel, KIP, Heidelberg and GSI, Darmstadt

Figure: sub-farm diagram (labels: FPGA, PC, Sub-Farm)

Figure: candidate platforms
- Gaming: STI Cell
- GP GPU: Nvidia Tesla
- GP CPU: Intel Larrabee
- CPU/GPU: AMD Fusion
- ?

Cell: heterogeneous multi-core

Figure labels: Intel P4, Cell; lxg1411, eh102, blade11bc4

Data flow out of the Front-End Electronics at 10^7 int/sec will be about 1 TByte/sec
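The quoted rates can be turned into per-interaction figures with simple arithmetic. The sketch below uses only numbers stated in this transcript (10^7 interactions/sec, about 1 TByte/sec, up to 10^9 tracks/sec) and assumes 1 TByte = 10^12 bytes. The variable names are illustrative.

```python
# Rates quoted in the text; 1 TByte is taken as 10^12 bytes (an assumption).
INTERACTION_RATE_HZ = 1e7          # 10^7 interactions/sec
FEE_DATA_FLOW_BYTES_PER_S = 1e12   # about 1 TByte/sec out of the front-end
MAX_TRACK_RATE_HZ = 1e9            # up to 10^9 tracks/sec in the silicon tracker

bytes_per_interaction = FEE_DATA_FLOW_BYTES_PER_S / INTERACTION_RATE_HZ
tracks_per_interaction = MAX_TRACK_RATE_HZ / INTERACTION_RATE_HZ
mean_spacing_ns = 1e9 / INTERACTION_RATE_HZ

print(f"data per interaction   : {bytes_per_interaction / 1e3:.0f} kByte")
print(f"tracks per interaction : up to {tracks_per_interaction:.0f}")
print(f"mean time between interactions: {mean_spacing_ns:.0f} ns")
```

These derived values (roughly 100 kByte and up to 100 tracks per interaction, with interactions 100 ns apart on average) are upper-scale estimates that follow directly from the stated rates, not independent measurements.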

Optimization steps for the track fit routine
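For readers unfamiliar with the Kalman filter track fit, the following is a minimal sketch of the predict/update cycle for a straight-line track with state (x, tx) measured on successive planes. It assumes no magnetic field and no material effects, so it is far simpler than the optimized fit routine referred to here. The function name, the prior variance and the example geometry are all hypothetical.

```python
def kalman_line_fit(z_planes, x_meas, sigma, x0=0.0, tx0=0.0, prior_var=1e4):
    """Filter measurements x_meas taken at z_planes, one plane at a time.

    Returns the state (x, tx) and covariance (c00, c01, c11) at the last plane.
    """
    x, tx = x0, tx0
    c00, c01, c11 = prior_var, 0.0, prior_var   # weak prior: large initial errors
    s2 = sigma * sigma
    z_prev = z_planes[0]
    for z, m in zip(z_planes, x_meas):
        # Prediction: straight-line transport by dz (no field, no material).
        dz = z - z_prev
        x += tx * dz
        c00 += 2.0 * dz * c01 + dz * dz * c11
        c01 += dz * c11
        # Update with the measurement m of x (measurement matrix H = [1, 0]).
        s = c00 + s2                  # variance of the residual
        k0, k1 = c00 / s, c01 / s     # Kalman gain
        r = m - x                     # residual
        x += k0 * r
        tx += k1 * r
        c00, c01, c11 = s2 * c00 / s, s2 * c01 / s, c11 - c01 * c01 / s
        z_prev = z
    return (x, tx), (c00, c01, c11)


# Noise-free hits on the line x = 1 + 0.5 * z, measured on five planes.
z_planes = [0.0, 1.0, 2.0, 3.0, 4.0]
x_meas = [1.0 + 0.5 * z for z in z_planes]
(state_x, state_tx), covariance = kalman_line_fit(z_planes, x_meas, sigma=0.01)
print(state_x, state_tx, covariance)
```

The loop body consists only of a handful of multiplications and additions per plane with no data-dependent branching, which is the kind of arithmetic that lends itself to being applied to several tracks at once.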

Performance on different platforms

CPU time for track reconstruction and fit

Typical Au+Au collision

Concept of SIMD instructions: process a short vector per cycle
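The programming model behind this concept can be emulated in a few lines. The sketch below is an illustration only: Python does not issue SIMD instructions and gives no speed-up here. The hypothetical `Vec4` class stands in for a 4-lane vector register, and the same unmodified formula is applied either to one track or to four tracks packed into lanes.

```python
class Vec4:
    """Four floats handled as one value, emulating a 4-lane SIMD register."""

    def __init__(self, a, b, c, d):
        self.lanes = (float(a), float(b), float(c), float(d))

    def __add__(self, other):
        return Vec4(*(p + q for p, q in zip(self.lanes, other.lanes)))

    def __mul__(self, other):
        return Vec4(*(p * q for p, q in zip(self.lanes, other.lanes)))


def extrapolate(x, tx, dz):
    """Straight-line extrapolation x + tx*dz; works for floats and for Vec4 alike."""
    return x + tx * dz


# Scalar version: one track per call.
print(extrapolate(1.0, 0.5, 2.0))

# Vector version: four tracks per call, one track per lane.
x = Vec4(1.0, 0.0, -1.0, 2.0)
tx = Vec4(0.5, 0.1, 0.0, -0.25)
dz = Vec4(2.0, 2.0, 2.0, 2.0)
print(extrapolate(x, tx, dz).lanes)
```

On real hardware the lane-wise addition and multiplication would each map to a single vector instruction, so the cost per track drops roughly with the number of lanes as long as all tracks follow the same sequence of operations.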

R&D Roadmap

Detailed simulation and co-optimization of the tracking system and the analysis algorithms
- alternate sensor types (single-sided sensors)
- alternate module layouts

Detailed studies of event selection algorithms
- open charm selector covering all relevant channels (D0, D±, Ds, Λc)
- design of multi-level event selection

Mathematical and computational optimization of all algorithms

Determine best platform (programmable logic vs. processor) for the different processing steps:
- Hit/Cluster finding
- Tracklet finding
- Tracking/Vertexing

Go beyond SIMDization (from scalars to vectors)

Address MIMDization (multi-threads, multi-cores and many-core systems)

Exploit the numerical throughput of dedicated purpose processors like GPUs (Graphics Processors)

Be ready for the emerging heterogeneous many-core systems

Re-design algorithms to run efficiently on all CPU/GPU architectures

Investigate new languages for the performance critical core of algorithms, like Ct or CUDA
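The step from SIMD to MIMD listed above can be sketched as event-level parallelism: independent events are handed to several workers, each running its own instruction stream. The example is a structural illustration only. `select_event` and its acceptance criterion are hypothetical placeholders, and in CPython a thread pool does not run pure-Python code on several cores at once, so a process pool or a compiled language would be needed for a real gain.

```python
from concurrent.futures import ThreadPoolExecutor


def select_event(event):
    """Hypothetical stand-in for reconstruction plus selection of one event."""
    event_id, hits = event
    return event_id, len(hits) >= 3   # placeholder criterion, not a physics cut


# Each event is (identifier, list of hit identifiers); the values are made up.
events = [(0, [1, 2, 3, 4]), (1, [5]), (2, [6, 7, 8]), (3, [])]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(select_event, events))   # input order is preserved

accepted = [event_id for event_id, keep in results if keep]
print(accepted)
```

Since events share no state, no locking is needed, and the same decomposition carries over to multi-core and many-core systems by changing only the executor.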

GPU: Controller plus many ALUs

CPU: SIMD, multi-core