Helmholtz International Center for CBM – Online Reconstruction and Event Selection

Open Charm Event Selection – Driving Force for FEE and DAQ

Open charm:

D± (cτ = 312 μm): D+ → K- π+ π+ (9.5%)

D0 (cτ = 123 μm): D0 → K- π+ (3.8%), D0 → K- π+ π+ π- (7.5%)

Ds± (cτ = 150 μm): Ds+ → K+ K- π+ (5.3%)

Λc+ (cτ = 60 μm): Λc+ → p K- π+ (5.0%)
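To give a feeling for the displacement scale behind the cτ values listed above, the mean decay length in the laboratory frame is (p/m)·cτ. The sketch below is only illustrative: the cτ values are taken from the list, while the masses are approximate values assumed here and the function name is hypothetical.

```python
# c*tau in micrometres, as quoted in the list above.
C_TAU_UM = {"D+": 312.0, "D0": 123.0, "Ds+": 150.0, "Lambda_c+": 60.0}

# Approximate masses in GeV/c^2 (assumed here, not taken from the poster).
MASS_GEV = {"D+": 1.870, "D0": 1.865, "Ds+": 1.968, "Lambda_c+": 2.286}


def mean_decay_length_um(particle, momentum_gev):
    """Mean lab-frame decay length: beta*gamma*c*tau = (p/m) * c*tau."""
    beta_gamma = momentum_gev / MASS_GEV[particle]
    return beta_gamma * C_TAU_UM[particle]


if __name__ == "__main__":
    # 5 GeV/c is an arbitrary example momentum.
    for name in C_TAU_UM:
        print(name, round(mean_decay_length_um(name, 5.0), 1), "um")
```

At momenta of a few GeV/c the mean decay lengths stay in the sub-millimetre to millimetre range, which is why the decay vertex can only be tagged with precise tracking close to the target.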

No simple single-track trigger primitive, such as high pt, is available to tag events of interest.

The only selective signature is the detection of the decay vertex.

Track reconstruction in STS/MVD and displaced vertex search required in the first trigger level.

Such a complex trigger is not feasible within the latency limits of conventional Front-End Electronics, typically 4 μsec at LHC.
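As a rough illustration of what a displaced vertex search has to compute, the following sketch selects tracks whose distance of closest approach to the primary vertex exceeds a cut. It assumes straight-line tracks given as a point and a direction; the function names and the cut value are hypothetical, and a real selection would also fit a common secondary vertex.

```python
import math


def impact_parameter(point, direction, vertex):
    """Distance from `vertex` to the straight line point + t * direction."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    delta = [p - v for p, v in zip(point, vertex)]
    # Remove the component of delta along the track direction.
    along = sum(d * u for d, u in zip(delta, unit))
    perp = [d - along * u for d, u in zip(delta, unit)]
    return math.sqrt(sum(c * c for c in perp))


def displaced_tracks(tracks, vertex, min_ip):
    """Keep tracks that miss the primary vertex by more than `min_ip`."""
    return [t for t in tracks if impact_parameter(t[0], t[1], vertex) > min_ip]


if __name__ == "__main__":
    primary = (0.0, 0.0, 0.0)
    tracks = [
        ((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)),   # points back to the vertex
        ((0.03, 0.0, 1.0), (0.0, 0.0, 1.0)),  # offset by 0.03 (e.g. cm)
    ]
    print(displaced_tracks(tracks, primary, min_ip=0.01))
```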

Work without L1 trigger

Use Self-triggered Front-End Electronics

Use timestamps to organize and correlate data

Ship all hits to subsequent data buffer and processing stages
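The idea of organizing self-triggered data by timestamp can be sketched as follows. This is a minimal illustration, not the CBM event builder: hits are assumed to be `(timestamp_ns, channel)` tuples, and `max_gap_ns` is a hypothetical parameter that closes a group when no hit arrives within that gap.

```python
def group_by_time(hits, max_gap_ns):
    """Sort time-stamped hits and split them into groups at large time gaps."""
    groups = []
    for hit in sorted(hits):
        # Extend the current group if this hit is close in time to the last one.
        if groups and hit[0] - groups[-1][-1][0] <= max_gap_ns:
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return groups


if __name__ == "__main__":
    hits = [(105, 3), (100, 1), (1000, 2), (1003, 7), (102, 5)]
    for group in group_by_time(hits, max_gap_ns=10):
        print(group)
```

At high interaction rates hits from different interactions can overlap in time, so a gap-based grouping like this one is only a starting point.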

High-Speed DAQ and Event Building

Typical parameters (for 10^7 int/sec and 1% occupancy):
- 100 kHz channel hit rate
- 600 Byte/sec per channel data flow
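The quoted channel hit rate follows directly from the interaction rate and the occupancy. A minimal check, with a hypothetical function name:

```python
def channel_hit_rate_hz(interaction_rate_hz, occupancy):
    """Average hit rate of one channel: interactions/sec times hit probability."""
    return interaction_rate_hz * occupancy


if __name__ == "__main__":
    rate = channel_hit_rate_hz(1e7, 0.01)
    print(rate / 1e3, "kHz")
```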

First level event selection, which replaces the L1 trigger in a conventional system, is done in a processor farm fed with data from the event building network

Very efficient tracking algorithms are essential for the feasibility of the open charm event selection

Up to 10^9 tracks/sec in the Silicon tracker

Co-develop Silicon tracker layout and tracking algorithm for best overall performance

Develop algorithms which exploit the full potential of modern processors. First step:
- use 'Single Instruction Multiple Data' (SIMD) instructions. They are essential for the high performance of many multi-media applications (e.g. video codecs), but rarely used in data analysis.
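The SIMD idea, one instruction acting on a short vector of values, can be emulated conceptually as below. This is only an illustration of the data layout: the `Vec4` class is hypothetical, and pure Python still executes the four lanes one after another, whereas real SIMD code would use compiler intrinsics or a vector library.

```python
class Vec4:
    """Four values handled together, emulating one 4-lane SIMD register."""

    def __init__(self, *lanes):
        if len(lanes) != 4:
            raise ValueError("Vec4 needs exactly four lanes")
        self.lanes = tuple(float(v) for v in lanes)

    def __add__(self, other):
        return Vec4(*(a + b for a, b in zip(self.lanes, other.lanes)))

    def __mul__(self, other):
        return Vec4(*(a * b for a, b in zip(self.lanes, other.lanes)))


def extrapolate(x, tx, dz):
    """Straight-line extrapolation x + tx * dz for four tracks at once."""
    return x + tx * dz


if __name__ == "__main__":
    x = Vec4(0.0, 1.0, 2.0, 3.0)       # positions of four tracks
    tx = Vec4(0.5, 0.25, -0.5, 1.0)    # slopes of the same four tracks
    dz = Vec4(4.0, 4.0, 4.0, 4.0)      # common step along z
    print(extrapolate(x, tx, dz).lanes)
```

The point of the layout is that the same arithmetic is written once and applied to several tracks packed side by side.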

Best results were obtained with a Cellular Automaton based track finder with integrated Kalman filter track fit

allows usage of double-sided strip detectors even at high track densities

highly optimized code
- field approximated by polynomials
- compact, cache-efficient data
- most calculations SIMDized
- fast on standard PCs
- well adapted to next generation many-core and wide-SIMD processors
- already ported to IBM Cell processor

very fast when only hard quasi-primary tracks are reconstructed, as needed in the online first level event selection of open charm candidates

supports reconstruction of soft tracks down to 100 MeV/c, as needed in the offline analysis
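The general principle of a cellular automaton track finder can be sketched in a few lines. This is a toy version under strong assumptions, not the algorithm described above: tracks are straight lines in one projection (no magnetic field), there is no Kalman filter fit, and all names and cut values (`max_slope`, `max_kink`, `min_hits`) are hypothetical.

```python
def find_tracks(stations, max_slope=0.5, max_kink=0.02, min_hits=3):
    """Toy CA track finder. `stations` is a z-ordered list of (z, [x, ...])."""
    # 1. Build cells: segments between hits in adjacent stations.
    cells = []
    for i in range(len(stations) - 1):
        (z0, xs0), (z1, xs1) = stations[i], stations[i + 1]
        layer = []
        for a, x0 in enumerate(xs0):
            for b, x1 in enumerate(xs1):
                slope = (x1 - x0) / (z1 - z0)
                if abs(slope) <= max_slope:
                    layer.append({"a": a, "b": b, "slope": slope,
                                  "level": 1, "prev": None})
        cells.append(layer)

    # 2. Evolution: a cell's level is one more than its best left neighbour,
    #    i.e. a cell ending on its first hit with a compatible slope.
    for i in range(1, len(cells)):
        for cell in cells[i]:
            for left in cells[i - 1]:
                compatible = (left["b"] == cell["a"] and
                              abs(left["slope"] - cell["slope"]) <= max_kink)
                if compatible and left["level"] + 1 > cell["level"]:
                    cell["level"] = left["level"] + 1
                    cell["prev"] = left

    # 3. Collection: follow the longest chains first, never reusing a hit.
    candidates = [(i, c) for i, layer in enumerate(cells) for c in layer]
    candidates.sort(key=lambda ic: ic[1]["level"], reverse=True)
    used, tracks = set(), []
    for i, cell in candidates:
        hits = [(i + 1, cell["b"])]
        while cell is not None:
            hits.append((i, cell["a"]))
            cell, i = cell["prev"], i - 1
        hits.reverse()
        if len(hits) >= min_hits and not used.intersection(hits):
            used.update(hits)
            tracks.append(hits)
    return tracks


if __name__ == "__main__":
    stations = [(0, [0.0, 5.0]), (10, [1.0, 4.0, 9.0]),
                (20, [2.0, 3.0]), (30, [3.0, 2.0])]
    for track in find_tracks(stations):
        print(track)  # list of (station index, hit index)
```

Because each cell only looks at its immediate neighbours, the evolution step is local and regular, which is the property that makes this family of algorithms a good match for SIMD and many-core hardware.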

High Speed Tracking Algorithms

Source: I. Kisel, KIP, Heidelberg and GSI, Darmstadt

[Figure: sub-farm built from FPGA and PC nodes]

[Figure: candidate processor platforms: Gaming (STI: Cell), GP GPU (Nvidia: Tesla), GP CPU (Intel: Larrabee), CPU/GPU (AMD: Fusion), ?]

Cell: heterogeneous multi-core

[Figure: platform labels Intel P4 and Cell; machine labels lxg1411, eh102, blade11bc]

Data flow out of the Front-End Electronics at 10^7 int/sec will be about 1 TByte/sec

Optimization steps for the track fit routine

Performance on different platforms

CPU time for track reconstruction and fit

Typ. Au+Au collision

Concept of SIMD instructions: process a short vector per cycle

R&D Roadmap

Detailed simulation and co-optimization of the tracking system and the analysis algorithms
- alternate sensor types (single-sided sensors)
- alternate module layouts

Detailed studies of event selection algorithms
- open charm selector covering all relevant channels (D0, D±, Ds, Λc)
- design of multi-level event selection

Mathematical and computational optimization of all algorithms

Determine best platform (programmable logic vs. processor) for the different processing steps:
- Hit/Cluster finding
- Tracklet finding
- Tracking/Vertexing

Go beyond SIMDization (from scalars to vectors)

Address MIMDization (multi-threads, multi-cores and many-core systems)

Exploit the numerical throughput of dedicated purpose processors like GPUs (Graphics Processors)

Be ready for the emerging heterogeneous many-core systems

Re-design algorithms to run efficiently on all CPU/GPU architectures

Investigate new languages for the performance critical core of algorithms, like Ct or CUDA
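The MIMDization item above amounts to running independent work units, for example events, on several threads or cores at once. The sketch below shows only the structure; `reconstruct` is a hypothetical placeholder workload, and in CPython a thread pool does not speed up CPU-bound pure-Python code, so a real implementation would use native code or separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Placeholder workload: count hit amplitudes above a threshold."""
    return sum(1 for amplitude in event if amplitude > 0.5)


def reconstruct_all(events, workers=4):
    """Process independent events concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))


if __name__ == "__main__":
    events = [[0.1, 0.9], [0.6, 0.7, 0.8], []]
    print(reconstruct_all(events))
```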

GPU: Controller plus many ALUs

CPU: SIMD, multi-core
