helmholtz international center for cbm – online reconstruction and event selection open charm...
Helmholtz International Center for CBM – Online Reconstruction and Event Selection
Open Charm Event Selection – Driving Force for FEE and DAQ
Open charm:
D± (cτ = 312 μm): D+ → K- π+ π+ (9.5%)
D0 (cτ = 123 μm): D0 → K- π+ (3.8%), D0 → K- π+ π+ π- (7.5%)
Ds± (cτ = 150 μm): Ds+ → K+ K- π+ (5.3%)
Λc+ (cτ = 60 μm): Λc+ → p K- π+ (5.0%)
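To give a feel for the scale of the displaced-vertex signature, the cτ values listed above can be converted into a mean lab-frame decay length, L = (p/m)·cτ. The following is a minimal sketch: the helper name, the rounded particle masses, and the 5 GeV/c example momentum are illustrative assumptions, not values taken from the slides.

```python
# c*tau in micrometres, as quoted in the list above.
C_TAU_UM = {"D+": 312.0, "D0": 123.0, "Ds+": 150.0, "Lambda_c+": 60.0}

# Approximate masses in GeV/c^2 (rounded; an assumption added for illustration).
MASS_GEV = {"D+": 1.870, "D0": 1.865, "Ds+": 1.968, "Lambda_c+": 2.286}


def mean_decay_length_um(particle, p_gev):
    """Mean lab-frame decay length L = beta*gamma * c*tau, with beta*gamma = p/m."""
    beta_gamma = p_gev / MASS_GEV[particle]
    return beta_gamma * C_TAU_UM[particle]


if __name__ == "__main__":
    example_momentum = 5.0  # GeV/c, hypothetical value chosen only for illustration
    for name in C_TAU_UM:
        length = mean_decay_length_um(name, example_momentum)
        print(f"{name}: {length:.0f} um")
```

Under these assumptions the decay lengths come out in the range of a few hundred micrometres, which is why the vertex has to be reconstructed from precisely measured tracks rather than tagged by a single-track quantity.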
No simple, single track level trigger primitive, like high pt, available to tag events of interest.
The only selective signature is the detection of the decay vertex.
Track reconstruction in STS/MVD and displaced vertex search required in the first trigger level.
Such a complex trigger is not feasible within the latency limits of conventional Front-End Electronics, typically 4 μsec at LHC.
Work without L1 trigger
Use Self-triggered Front-End Electronics
Use timestamps to organize and correlate data
Ship all hits to subsequent data buffer and processing stages
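Without a hardware trigger, hits arrive as a free-running, time-stamped stream and have to be associated in software. As a minimal sketch of that idea, the function below sorts hits by timestamp and starts a new group whenever the gap to the previous hit exceeds a threshold. The hit layout `(timestamp_ns, channel)`, the function name, and the gap value are hypothetical; a real system would need a more careful treatment of overlapping interactions (at 10⁷ int/sec the mean spacing between interactions is only 100 ns).

```python
def group_hits_by_time(hits, max_gap_ns):
    """Group (timestamp_ns, channel) hits whose consecutive timestamps
    are at most max_gap_ns apart. Returns a list of hit lists."""
    groups = []
    for hit in sorted(hits):  # tuples sort by timestamp first
        if groups and hit[0] - groups[-1][-1][0] <= max_gap_ns:
            groups[-1].append(hit)  # close in time to the previous hit
        else:
            groups.append([hit])    # gap too large: start a new group
    return groups


if __name__ == "__main__":
    # Unordered hits, as they might arrive from different channels.
    stream = [(105.0, 3), (0.0, 1), (2.5, 7), (101.0, 2)]
    for group in group_hits_by_time(stream, max_gap_ns=5.0):
        print(group)
```

The point of the sketch is only that the timestamp, not a trigger signal, is the key used to correlate data from different channels.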
High-Speed DAQ and Event Building
Typical parameters (for 10⁷ int/sec and 1% occupancy):
- 100 kHz channel hit rate
- 600 Byte/sec per channel data flow
First level event selection, which replaces the L1 trigger in a conventional system, is done in a processor farm fed with data from the event building network
Very efficient tracking algorithms are essential for the feasibility of the open charm event selection
Up to 10⁹ tracks/sec in the Silicon tracker
Co-develop Silicon tracker layout and tracking algorithm for best overall performance
Develop algorithms which exploit the full potential of modern processors. First step:
- use 'Single Instruction Multiple Data' (SIMD) instructions. They are essential for the high performance of many multi-media applications (e.g. video codecs), but rarely used in data analysis.
Best results were obtained with a Cellular Automaton based track finder with integrated Kalman filter track fit
allows usage of double-side strip detectors even at high track densities
highly optimized code
- field approximated by polynomials
- compact, cache-efficient data
- most calculations SIMDized
- fast on standard PCs
- well adapted to next generation many-core and wide-SIMD processors
- already ported to IBM Cell processor
very fast when only hard quasi-primary tracks are reconstructed, as needed in the online first level event selection of open charm candidates
supports reconstruction of soft tracks down to 100 MeV/c, as needed in the offline analysis
High Speed Tracking Algorithms
Source: I. Kisel, KIP, Heidelberg and GSI, Darmstadt
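To illustrate the kind of arithmetic a Kalman filter track fit performs per hit, here is a deliberately simplified sketch: a straight-line fit in one projection with state (x, tx), no magnetic field, no material effects, and equal measurement errors. All of these simplifications, as well as the function name and the large initial slope variance, are assumptions made for the example; this is not the fit used in the work cited above, which also handles the field and a larger state vector.

```python
def kalman_line_fit(zs, xs, sigma):
    """Fit x(z) = x0 + tx*z hit by hit; returns (x, tx) at the last z."""
    # Initialise from the first hit; slope is unknown, so give it a huge variance.
    x, tx = xs[0], 0.0
    c00, c01, c11 = sigma ** 2, 0.0, 1.0e6  # symmetric 2x2 covariance
    z = zs[0]
    for zk, mk in zip(zs[1:], xs[1:]):
        # Prediction: extrapolate state and covariance to the next plane.
        dz = zk - z
        x += tx * dz
        c00 = c00 + 2.0 * dz * c01 + dz * dz * c11
        c01 = c01 + dz * c11
        z = zk
        # Filter: weight the residual by the gain K = C H^T / S, with H = (1, 0).
        s = c00 + sigma ** 2
        k0, k1 = c00 / s, c01 / s
        residual = mk - x
        x += k0 * residual
        tx += k1 * residual
        c00, c01, c11 = (1.0 - k0) * c00, (1.0 - k0) * c01, c11 - k1 * c01
    return x, tx


if __name__ == "__main__":
    planes = [0.0, 1.0, 2.0, 3.0]
    hits = [1.0 + 0.5 * z for z in planes]  # noise-free hits on a known line
    print(kalman_line_fit(planes, hits, sigma=0.1))
```

Every step is a handful of multiply-add operations on a small, fixed-size state, which is what makes this type of fit a good candidate for the SIMD treatment described in these slides.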
[Figure: processing sub-farm built from FPGA and PC nodes. Candidate processor platforms: Gaming (STI Cell), GP GPU (Nvidia Tesla), GP CPU (Intel Larrabee), CPU/GPU (AMD Fusion), ??]
Cell: heterogeneous multi-core
[Figure: performance comparison; labels: Intel P4, Cell; hosts lxg1411, eh102blade11bc]
Data flow out of the Front-end Electronics at 10⁷ int/sec will be about 1 TByte/sec
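Dividing the quoted data flow by the interaction rate gives the average data volume per interaction. The short calculation below is only a sketch of that arithmetic; it assumes 1 TByte = 10¹² Byte, and the function name is hypothetical.

```python
INTERACTION_RATE_HZ = 1.0e7       # interactions per second, from the slides
FEE_OUTPUT_BYTES_PER_S = 1.0e12   # about 1 TByte/sec, assuming 1 TByte = 10^12 Byte


def bytes_per_interaction(total_bytes_per_s, interactions_per_s):
    """Average data volume shipped per interaction."""
    return total_bytes_per_s / interactions_per_s


if __name__ == "__main__":
    volume = bytes_per_interaction(FEE_OUTPUT_BYTES_PER_S, INTERACTION_RATE_HZ)
    print(f"{volume / 1.0e3:.0f} kByte per interaction")
```

Under these assumptions the front-end delivers on the order of 100 kByte per interaction, all of which has to pass through the event building network.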
Optimization steps for the track fit routine
Performance on different platforms
CPU time for track reconstruction and fit
Typ. Au+Au collision
Concept of SIMD instructions: process a short vector per cycle
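The programming model behind SIMDization can be sketched as follows: the same formula is written once and then applied either to a single track (scalars) or to four tracks packed into one short vector. The `Vec4` class and the `extrapolate` function are hypothetical names for this illustration. Pure Python gives no speed-up here; on real hardware each lane-wise operation would map to a single SIMD instruction.

```python
class Vec4:
    """Four float lanes; every operator acts on all lanes at once."""

    def __init__(self, *lanes):
        if len(lanes) != 4:
            raise ValueError("Vec4 needs exactly four lanes")
        self.lanes = tuple(float(v) for v in lanes)

    @staticmethod
    def _lift(other):
        # Broadcast a plain number to all four lanes.
        return other.lanes if isinstance(other, Vec4) else (float(other),) * 4

    def __add__(self, other):
        return Vec4(*(a + b for a, b in zip(self.lanes, self._lift(other))))

    __radd__ = __add__

    def __mul__(self, other):
        return Vec4(*(a * b for a, b in zip(self.lanes, self._lift(other))))

    __rmul__ = __mul__


def extrapolate(x, tx, dz):
    """Straight-line extrapolation; works for floats and for Vec4 alike."""
    return x + tx * dz


if __name__ == "__main__":
    # One track at a time (scalar code).
    print(extrapolate(1.0, 0.25, 2.0))
    # Four tracks at once (vector code), same source line.
    packed = extrapolate(Vec4(0.0, 1.0, 2.0, 3.0), Vec4(0.5, 0.25, 1.0, 2.0), 2.0)
    print(packed.lanes)
```

Keeping the algorithm independent of the operand type is one way to move the same code from scalars to vectors of different widths.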
R&D Roadmap
Detailed simulation and co-optimization of the tracking system and the analysis algorithms
- alternate sensor types (single-sided sensors)
- alternate module layouts
Detailed studies of event selection algorithms
- open charm selector covering all relevant channels (D0, D±, Ds, Λc)
- design of multi-level event selection
Mathematical and computational optimization of all algorithms
Determine best platform (programmable logic vs. processor) for the different processing steps:
- Hit/Cluster finding
- Tracklet finding
- Tracking/Vertexing
Go beyond SIMDization (from scalars to vectors)
Address MIMDization (multi-threads, multi-cores and many-core systems)
Exploit the numerical throughput of dedicated purpose processors like GPUs (Graphics Processors)
Be ready for the emerging heterogeneous many-core systems
Re-design algorithms to run efficiently on all CPU/GPU architectures
Investigate new languages for the performance-critical core of algorithms, like Ct or CUDA
GPU: Controller plus many ALU
CPU: SIMD, multi-core