e-/π separation with TRD
Gennady Ososkov (LIT JINR, Dubna), Semen Lebedev (GSI, Darmstadt and LIT JINR, Dubna), Claudia Höhne, Florian Uhlig (GSI, Darmstadt), Pavel Nevski (BNL-CERN)
14th CBM Collaboration Meeting, October 6-9 2009, Split, Croatia


Page 1:

e-/π separation with TRD

Gennady Ososkov

LIT JINR, Dubna

Semen Lebedev GSI, Darmstadt and LIT JINR, Dubna

Claudia Höhne, Florian Uhlig, GSI, Darmstadt

Pavel Nevski, BNL-CERN

14th CBM Collaboration Meeting, October 6-9 2009, Split, Croatia

Page 2:

G. Ososkov, e/π separation, CBM Collaboration Meeting, 7.10.2009

TR Production (recall)

dE/dx is described reasonably well; TR is calculated using a model:

M. Castellano et al., Computer Physics Communications 61 (1990)

Parameters of the model are chosen to describe the measured data

• number of foils
• foil thickness
• gas thickness

Parameters are adjusted only for 1.5 GeV/c. The trunk version of CBM ROOT from SEPT09 was used.

Florian Uhlig, CBM Collaboration Meeting, March 2009

π contribution = dE/dx

e- contribution = dE/dx + TR

See also the presentation by S. Lebedev et al. on energy-loss simulation.
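The two signal components above can be illustrated with a toy generator. This is a sketch, not the tuned CBM ROOT simulation: the Landau distribution is approximated by scipy's Moyal distribution, and the TR contribution by an exponential photon-energy spectrum with an assumed absorption probability per layer; all parameter values here are illustrative assumptions.

```python
import numpy as np
from scipy.stats import moyal

rng = np.random.default_rng(42)

def pion_losses(n_layers=12, size=1000):
    """dE/dx only (keV): one row per track, one column per TRD layer.
    Moyal stands in for the Landau distribution (assumed parameters)."""
    return moyal.rvs(loc=2.0, scale=0.5, size=(size, n_layers), random_state=rng)

def electron_losses(n_layers=12, size=1000, tr_mean=5.0, tr_prob=0.6):
    """dE/dx plus TR: each layer absorbs a TR photon with probability
    tr_prob (assumed); photon energies drawn from an exponential (assumed)."""
    dedx = moyal.rvs(loc=2.0, scale=0.5, size=(size, n_layers), random_state=rng)
    tr = rng.exponential(tr_mean, size=(size, n_layers))
    absorbed = rng.random((size, n_layers)) < tr_prob
    return dedx + tr * absorbed
```

Such toy samples are convenient for checking the behaviour of the classifiers discussed in the following slides.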

Page 3:


From: B. Dolgoshein, TRD overview, NIM A326 (1993)

…two methods of signal processing have mainly been used:

1) Total energy deposition ("Q-method"). The main factor limiting the rejection of non-radiating particles in this case is the Landau "tail" of ionization loss, which simulates a large energy deposition comparable with that of a radiating particle.

Inspired by discussions with Pavel Nevski (BNL, CERN)

2) Cluster counting ("N-method"), proposed already in 1980. With this method, the avalanches produced by X-ray photoelectrons are recorded and counted when they exceed a given threshold (typically 4-5 keV). Non-radiating particles should produce fewer detectable clusters.

There are important distinctions between the two methods (Q and N):
- for the N-method, thinner foils and detector gas layers are needed than for the Q-method;
- the readout for the two methods is also different: ADCs (or FADCs) are needed for the Q-method, while the N-method requires fast discriminators and scalers.

Page 4:


Default method of e-/π identification in CBM. 30 years have passed, and new, more powerful classifiers have appeared.

The method currently used in CBM as the default (V. Ivanov, S. Lebedev, T. Akishina, O. Denisova) is based on applying a neural network. The energy losses ΔEi from all 12 TRD layers, i = 1,…,12, are normalized and ordered; then the values of the Landau distribution function are calculated and used as input to the NN. The pion-suppression result depends on the parameters of the NN classifier.

For a fixed choice of initial weights it is 550.
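The preprocessing chain described above (normalize, order, map through the Landau distribution function) can be sketched as follows. This is a hypothetical helper, not the actual CBM ROOT code: scipy's Moyal CDF stands in for the Landau CDF, and dividing by the sample mean is an assumed normalization.

```python
import numpy as np
from scipy.stats import moyal

def nn_input(delta_e, loc=2.0, scale=0.5):
    """12 energy losses (keV) -> 12 ordered CDF values in [0, 1] for the NN.
    loc/scale of the Moyal stand-in are illustrative assumptions."""
    de = np.asarray(delta_e, dtype=float)
    de = de / de.mean()        # normalize (assumed convention)
    de = np.sort(de)           # order the sample
    return moyal.cdf(de, loc=loc, scale=scale)  # Landau-like CDF values
```

Because the CDF is monotone and the input is sorted, the NN always sees a non-decreasing vector of values in [0, 1].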

Page 5:


Threshold methods on the CBM TRD: (1) photon cluster counting


1) A simple algorithm for N counting:
- clear N;
- compare the layer deposit with the cut = 5 keV; if it is above, increase N by 1;
- repeat for each of the 12 TRD layers.
After these 12 comparisons, N is the photon cluster count, histogrammed separately for pions and electrons.

2) Then the PID algorithm is the usual one: if N > threshold, an e- is identified; otherwise a pion is identified.
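The two steps above map directly onto code. A minimal sketch of the N-method (the threshold value on N, n_threshold = 4, is an assumption for illustration; the slide does not state it):

```python
def count_clusters(delta_e, cut_kev=5.0):
    """Number of TRD layers (out of 12) with a deposit above the cut."""
    n = 0
    for de in delta_e:          # one deposit value per layer
        if de > cut_kev:
            n += 1
    return n

def identify(delta_e, cut_kev=5.0, n_threshold=4):
    """Return 'e-' if the cluster count exceeds the (assumed) threshold,
    otherwise 'pi'."""
    return "e-" if count_clusters(delta_e, cut_kev) > n_threshold else "pi"
```

In practice both the energy cut and n_threshold would be optimized against the pion-suppression and electron-efficiency figures quoted below.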

Pion suppression with the cut = 5 keV is 448. After cut optimization it is 584, although the e- efficiency drops to 88.3%.

Boris Dolgoshein (1993): «In general, the cluster counting (N) method should be the best method, due to the distinction of the Poisson-distributed number of ionization clusters produced by non-radiating particles against the Landau tail of dE/dx losses in the case of the Q-method. But the comparison of the two methods is complicated and should be done for each individual case, because of the different optimum structures required for both methods and the problem of the cluster-counting ability of TR chambers.»

The main lesson: a transformation is needed to reduce the Landau tails of the dE/dx losses.

Page 6:


Threshold methods on the CBM TRD: (2) ordered statistics

Such a transformation can be provided by ordering the sample ΔEi, i = 1,…,12. For instance, the first order statistic λ1 = ΔEmin has the distribution function

F(x) = P{λ1 < x} = 1 − P{ΔE1 ≥ x, ΔE2 ≥ x, …, ΔE12 ≥ x} = 1 − [1 − F_Landau(x)]^12

(note that [F_Landau(x)]^12 would be the distribution of the maximum, λ12). That means a substantial compression of the distribution along the horizontal axis, i.e. a diminishing of the tails.

Now, the simple threshold algorithm with the corresponding cut on λ1 gives a pion suppression of 10. The median, which is the 6th order statistic λ6, with density proportional to [F_Landau(x)]^5 [1 − F_Landau(x)]^6, gives a pion suppression of 374.
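The order-statistics distribution can be checked with a quick Monte Carlo, again using scipy's Moyal as a Landau stand-in with assumed parameters. For the minimum λ1 of 12 independent draws, the distribution function is 1 − [1 − F(x)]^12 (while [F(x)]^12 is the distribution of the maximum):

```python
import numpy as np
from scipy.stats import moyal

rng = np.random.default_rng(0)

# 100k toy tracks, 12 Moyal-distributed layer deposits each (assumed params)
samples = moyal.rvs(loc=2.0, scale=0.5, size=(100_000, 12), random_state=rng)
lam1 = samples.min(axis=1)      # first order statistic per track

x = 1.5                          # arbitrary test point
empirical = (lam1 < x).mean()
predicted = 1.0 - (1.0 - moyal.cdf(x, loc=2.0, scale=0.5)) ** 12
assert abs(empirical - predicted) < 0.01
```

The same check with `samples.max(axis=1)` against `moyal.cdf(x, ...) ** 12` confirms the formula for the maximum.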

The main conclusion: some of the ordered statistics can also be used for pion suppression. However, why not use the information from all of them as neural-net input?

[Plots: original Landau distribution, λ1 distribution, median distribution; π and e- curves shown.]

Page 7:


The idea of ordering the signals to be input to the NN

Distributions of all 12 dE/dx values after ordering

An NN with the 12 ordered ΔE values as input gives pion suppression = 685.

Page 8:


Idea: input to the NN the probabilities of the ΔE values

Plot of cumulative distributions of ΔE calculated from the previous histograms

Note: all ΔE values must be scaled to the interval [0,1] to be input to the NN.

The excellent guess: input to the NN not the ΔE values themselves, but their probabilities, calculated individually for each ΔEi. One can calculate these probabilities either from the pion distribution or from the electron one.

Pion suppression for the first case = 1200; when the e- distribution is used, it is 786.
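All the suppression figures in this talk are quoted at a fixed electron efficiency (typically 90%). A sketch of how such a number is extracted from any classifier's output scores (the function and its conventions are illustrative, not the CBM ROOT implementation; higher score is assumed to mean more electron-like):

```python
import numpy as np

def pion_suppression(e_scores, pi_scores, e_efficiency=0.90):
    """Suppression = 1 / (pion efficiency) at the score cut that keeps
    the requested fraction of electrons."""
    # cut keeping e_efficiency of the electron sample above it
    cut = np.quantile(np.asarray(e_scores), 1.0 - e_efficiency)
    pi_eff = (np.asarray(pi_scores) >= cut).mean()
    return np.inf if pi_eff == 0 else 1.0 / pi_eff
```

Applied to the outputs of the NN or BDT classifiers below, this yields a single suppression figure per working point.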

Page 9:


A thought: let us try some other classifiers from TMVA (Toolkit for MultiVariate data Analysis with ROOT).

Page 10:


Decision trees in particle identification (from Ososkov's lecture at the CBM Tracking workshop, 15 June 2009)

[Diagram: a single decision tree applied to a data sample, with root node, branches, internal nodes, and leaves.]

● Go through all PID variables, sort them, find the best variable to separate signal from background, and cut on it.
● For each of the two subsets, repeat the process.
● This forking decision pattern is called a tree.
● Split points are called nodes.
● Ending nodes are called leaves.

1) Multiple cuts on X and Y in a big tree (only growth steps 1-4 shown)

However, a danger exists: demanding perfect separation on the training sample degrades classifier performance, which is called "overtraining".

[Plot: all cuts for the decision tree]
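The split search in the bullets above can be sketched as an exhaustive scan over candidate cuts on one variable, choosing the one that minimizes an impurity measure. Gini impurity is used here as one standard choice (TMVA offers this among other separation criteria); the function itself is illustrative, not TMVA code.

```python
import numpy as np

def gini(n_sig, n_total):
    """Gini impurity of a subset with n_sig signal events out of n_total."""
    if n_total == 0:
        return 0.0
    p = n_sig / n_total
    return 2.0 * p * (1.0 - p)

def best_cut(values, is_signal):
    """Scan all candidate cuts on one variable; return the cut minimizing
    the weighted Gini impurity of the two resulting subsets."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    s = np.asarray(is_signal, dtype=float)[order]
    best_cut_val, best_cost = None, np.inf
    for i in range(1, len(v)):
        cut = 0.5 * (v[i - 1] + v[i])        # midpoint between neighbours
        left_n, right_n = i, len(v) - i
        cost = (left_n * gini(s[:i].sum(), left_n)
                + right_n * gini(s[i:].sum(), right_n)) / len(v)
        if cost < best_cost:
            best_cut_val, best_cost = cut, cost
    return best_cut_val
```

For each of the two subsets the same scan is repeated, which grows the tree node by node.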

Page 11:


How to boost Decision Trees

● Given a training sample, boosting increases the weights of misclassified events (background which is classified as signal, or vice versa), so that they have a higher chance of being correctly classified in subsequent trees.
● Trees with more misclassified events are also weighted, getting a lower weight than trees with fewer misclassified events.
● Build many trees (~1000) and form a weighted sum of the event scores from all trees (the score is +1 for a signal leaf, -1 for a background leaf). The renormalized, possibly weighted, sum of all scores is the final score of the event: high scores mean the event is most likely signal, low scores that it is most likely background.

Boosted Decision Trees (BDT): the boosting algorithm has all the advantages of single decision trees and less susceptibility to overtraining.

Many weak trees (single-cut trees) combined (only 4 trees shown)

The boosting algorithm combines 500 weak trees.
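The reweighting loop described above can be sketched as a minimal discrete AdaBoost over single-cut trees (stumps) on one variable. TMVA's BDT follows the same scheme with full trees and its own options; this toy version is an illustration only. Labels are +1 (signal) and -1 (background).

```python
import numpy as np

def train_adaboost(x, y, n_trees=50):
    """x: 1-D feature array; y: labels in {+1, -1}. Returns list of stumps."""
    w = np.full(len(x), 1.0 / len(x))          # uniform event weights
    stumps = []                                 # (cut, polarity, alpha)
    for _ in range(n_trees):
        # find the stump (cut, polarity) with minimal weighted error
        best = (x[0], 1, 0.5)
        for cut in np.unique(x):
            for pol in (1, -1):
                pred = np.where(pol * (x - cut) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[2]:
                    best = (cut, pol, err)
        cut, pol, err = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)   # tree weight: low error -> high
        pred = np.where(pol * (x - cut) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)          # boost misclassified events
        w /= w.sum()                            # renormalize
        stumps.append((cut, pol, alpha))
    return stumps

def score(x, stumps):
    """Weighted sum of stump votes; high = signal-like."""
    return sum(alpha * np.where(pol * (x - cut) > 0, 1, -1)
               for cut, pol, alpha in stumps)
```

Thresholding this score (as with the cut = 0.77 on the BDT output below, after rescaling) then gives the e-/π decision.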

Page 12:


e-/π separation with a boosted decision tree

BDT output

Result for the BDT classifier: pion suppression is 2180 for 90% electron efficiency.

Cut = 0.77

Page 13:


Summary and outlook

A comparative study of e/π separation methods was accomplished for:
- 1D cut methods:
  - photon cluster counting
  - ordered statistics of dE/dx
- the default neural-net classifier
- neural-net classifiers with input of:
  - ordered statistics
  - probabilities of ordered statistics
- the boosted-decision-tree classifier

The BDT shows the best performance.

Outlook:
- correct the simulation parameters to obtain better correspondence with experimental results
- facilitate the input for the NN and BDT by approximations of the cumulative distributions
- stability and robustness studies for the NN and BDT classifiers
- test other classifiers from TMVA (Toolkit for MultiVariate data Analysis with ROOT)

Page 14:


P. Nevski's comment, related to the practical aspects of π/e rejection:

For experiments like CBM one should consider not only the rejection procedure as such; it is also necessary to take into account its robustness against such experimental factors as the calibration of the measurements, pile-up of signals, etc. Since these factors are different for each station, the measurements are taken under different conditions and are, inevitably, heterogeneous. That seriously degrades all neural-network methods.

The cluster-counting method, as has been shown in practice, is quite stable with respect to its only parameter, the threshold, and is therefore very robust.

However, that is a subject for more detailed study.

The final question of a mathematician who is not experienced in TRD design and electronics:
- if a pion suppression such as 500 is enough, and
- if photon cluster counting is cheaper to carry out than the existing approach,
then why not consider the "N-method" as a real alternative to the "Q-methods", despite all the improvements shown above?

Page 15:


Thanks for your attention!