
OPTIMAL JET FINDER

Ernest Jankowski

University of Alberta

in collaboration with

D. Grigoriev • F. Tkachov

Acknowledgements

• Alberta Ingenuity Fund

• J Gordin Kaplan Graduate Student Award

Plan of the talk

• introduction

• Optimal Jet Definition and its implementation

• comparison with cone and kT algorithms

• benchmark physics application test based on $e^+e^- \to W^+W^- \to 4\ \text{jets}$ at 180 GeV

• motivations behind OJD

Introduction

• final states of particle collisions in HEP experiments often consist of sprays of hadrons

• simple interpretation: in the collision quarks with high kinetic energy are produced or released from the colliding particles

• but quarks interact very strongly at large distances and never appear as free particles (confinement)

• any attempt to separate free quarks results in production of extra pairs of quarks and antiquarks

• which recombine into colorless states: hadrons

• if the collision energy is high enough the hadrons appear in sprays, called jets

Hadronic jets: an example

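$$e^+e^- \to W^+W^- \to q_1\bar{q}_2\, q_3\bar{q}_4 \to 4\ \text{jets}$$

[Diagram: $e^+e^-$ annihilate into a $W^+W^-$ pair; each $W$ decays into a quark pair, giving the four quarks $q_1$, $q_2$, $q_3$, $q_4$]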

Jet algorithms

• roughly, jets correspond to the quarks and gluons produced in the hard scattering process

• a jet algorithm takes the final state hadrons and assigns them to jets

• a jet’s momentum is then computed from the momenta of the particles that belong to that jet (the so-called recombination scheme)

• the jet’s momentum corresponds approximately to the momentum of the quarks and gluons in the hard scattering process (partons in perturbative calculations)

• jet algorithms in use: the cone algorithm and successive recombination algorithms, such as Durham (kT)

Optimal Jet Definition

• I will present the so-called Optimal Jet Definition (OJD) proposed by Fyodor Tkachov

• a short introduction to the subject is Phys. Rev. Lett. 91, 061801 (2003)

• a FORTRAN 77 implementation of OJD, called Optimal Jet Finder (OJF), is described in hep-ph/0301226 (Comp. Phys. Commun., in print)

Recombination matrix $z_{aj}$

HEP event: list of particles (partons • hadrons • calorimeter cells • towers • preclusters): $p_a,\ a = 1, 2, \ldots, n_\text{parts}$

result: list of jets: $q_j,\ j = 1, 2, \ldots, n_\text{jets}$

recombination matrix: $\{z_{aj}\}$, with $n_\text{parts} \times n_\text{jets}$ entries

the 4-momentum $q_j$ of the j-th jet expressed by the 4-momenta $p_a$ of the particles:

$$q_j = \sum_{a=1}^{n_\text{parts}} z_{aj}\, p_a$$


• $z_{aj}$ describes the fraction of the a-th particle that belongs to the j-th jet (i.e. we split up particles between jets)

• conventional jet algorithms have $z_{aj}$ equal to 0 or 1, i.e. a particle either entirely belongs to some jet or does not belong to that jet at all

• fragmentation and hadronization are always the effect of the interaction of (at least) two hard partons evolving into two jets, so some hadrons that emerge in this process can belong partially to both jets

• fractional $z_{aj}$ is also very convenient algorithmically, because a jet configuration is then described by a set of continuous numbers $z_{aj}$


$$q_j = \sum_{a=1}^{n_\text{parts}} z_{aj}\, p_a$$

the 4-momentum $q_j$ of the j-th jet expressed by the 4-momenta $p_a$ of the particles ($a = 1, 2, \ldots, n_\text{parts}$)

$$z_{aj} \ge 0$$

the fraction of the energy of the a-th particle can be positive only

$$\bar{z}_a \equiv 1 - \sum_{j=1}^{n_\text{jets}} z_{aj} \ge 0$$

i.e. no more than 100% of each particle is assigned to jets; $\bar{z}_a$ is the fraction of the energy of the a-th particle that does not go into any jet
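As an illustration of the recombination matrix, here is a minimal Python sketch that builds jet 4-momenta from $q_j = \sum_a z_{aj} p_a$ and checks the two constraints on $z_{aj}$. The function names, the `(E, px, py, pz)` tuple layout and the toy event are assumptions made for this example; they are not taken from the OJF code.

```python
def jet_momenta(particles, z):
    """Return the jet 4-momenta q_j = sum_a z[a][j] * p_a.

    particles: list of 4-momenta (E, px, py, pz)
    z:         recombination matrix as a list of rows, z[a][j]
    """
    n_jets = len(z[0])
    jets = [[0.0] * 4 for _ in range(n_jets)]
    for p, row in zip(particles, z):
        for j, fraction in enumerate(row):
            for mu in range(4):
                jets[j][mu] += fraction * p[mu]
    return jets


def is_allowed(z, tol=1e-12):
    """Check z_aj >= 0 and sum_j z_aj <= 1 for every particle a."""
    return all(min(row) >= -tol and sum(row) <= 1.0 + tol for row in z)


# Hypothetical toy event: three massless particles, two jets.
particles = [(1.0, 1.0, 0.0, 0.0), (2.0, 0.0, 2.0, 0.0), (1.0, 0.0, 0.0, 1.0)]
# The third particle is shared between the jets and partly left out.
z = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.25]]

jets = jet_momenta(particles, z)
# z-bar_a: the fraction of each particle that goes into no jet
unassigned = [1.0 - sum(row) for row in z]
```

A conventional algorithm corresponds to rows that contain only 0 and 1; the third row above is the kind of fractional assignment that only a continuous $z_{aj}$ allows.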

Optimal Jet Definition

• any allowed value of the recombination matrix $\{z_{aj}\}$ describes some jet configuration

• the desired optimal jet configuration is the one that minimizes some function $\Omega(\{z_{aj}\})$

• details of $\Omega$ are different for CM lepton-lepton collisions (spherical kinematics) and collisions involving hadrons (cylindrical kinematics), where boost invariance along the beam axis should be maintained

Optimal Jet Definition: spherical kinematics

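$$\Omega(\{z_{aj}\}) = \frac{4}{R^2} \sum_{j=1}^{n_\text{jets}} \sum_{a=1}^{n_\text{parts}} z_{aj} E_a \sin^2\frac{\theta_{aj}}{2} + \sum_{a=1}^{n_\text{parts}} \bar{z}_a E_a$$

the inner sum over a in the first term is the width of the j-th jet; the last term is the energy outside jets

$$q_j = (E_j, \mathbf{q}_j) = \sum_{a=1}^{n_\text{parts}} z_{aj}\, p_a$$

$\theta_{aj}$ is the angle between the a-th particle and the j-th jet

$E_a$ is the energy of the a-th particle

R>0 is a parameter with a similar meaning as the cone radius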
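The spherical-kinematics criterion can be evaluated directly from a list of particles and a recombination matrix. The following Python sketch does this under stated assumptions: particles are massless `(E, px, py, pz)` tuples, the identity $\sin^2(\theta/2) = (1-\cos\theta)/2$ replaces the explicit angle, and a jet with zero 3-momentum is simply skipped. The function name and the example event are hypothetical.

```python
import math


def omega_spherical(particles, z, R):
    """Evaluate Omega = (4/R^2) sum_j sum_a z_aj E_a sin^2(theta_aj/2)
    + sum_a zbar_a E_a for particles given as (E, px, py, pz)."""
    n_jets = len(z[0])
    width = 0.0
    for j in range(n_jets):
        # jet 3-momentum: spatial part of q_j = sum_a z_aj p_a
        q = [sum(z[a][j] * p[k] for a, p in enumerate(particles)) for k in (1, 2, 3)]
        q_norm = math.sqrt(sum(c * c for c in q))
        if q_norm == 0.0:
            continue  # jet direction undefined; skipped in this sketch
        for a, p in enumerate(particles):
            p_norm = math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2)
            if p_norm == 0.0:
                continue
            cos_theta = sum(p[k + 1] * q[k] for k in range(3)) / (p_norm * q_norm)
            # sin^2(theta/2) = (1 - cos(theta)) / 2
            width += z[a][j] * p[0] * (1.0 - cos_theta) / 2.0
    # energy outside jets: sum_a (1 - sum_j z_aj) E_a
    soft = sum((1.0 - sum(row)) * p[0] for p, row in zip(particles, z))
    return 4.0 / R ** 2 * width + soft


# Two particles at 90 degrees to each other, both fully assigned to one jet.
event = [(0.5, 0.5, 0.0, 0.0), (0.5, 0.0, 0.5, 0.0)]
one_jet = omega_spherical(event, [[1.0], [1.0]], R=1.0)
# Nothing assigned to any jet: only the "energy outside jets" term remains.
no_jet = omega_spherical(event, [[0.0], [0.0]], R=1.0)
```

Because the first term carries the factor $1/R^2$, a smaller R makes wide jets more expensive relative to leaving energy outside jets, which is the sense in which R plays a role similar to a cone radius.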

[Figures: event display for $e^+e^- \to W^+W^- \to 4\ \text{jets}$ at 180 GeV over the angular ranges [0-360] and [0-180], one panel for each of R = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9, 1.1]

Optimal Jet Finder: fixed number of jets

• desired optimal jet configuration corresponds to the minimum of $\Omega(\{z_{aj}\})$

• the program finds a minimum iteratively using a simple gradient-based method

• facilitated by analytical formulas for gradient

• start with some candidate minimum (the initial jet configuration) and descend into a local minimum in subsequent iterations

• the initial jet configuration may be completely random
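The descent from a random starting configuration can be sketched in Python as follows. This is not the specific minimization method implemented in OJF: it is a generic projected gradient descent with step halving, swapped in for illustration. It assumes massless particles and uses the form $\Omega = \frac{2}{R^2}\sum_j (E_j - |\mathbf{q}_j|) + \sum_a \bar{z}_a E_a$, whose analytical gradient is $\partial\Omega/\partial z_{aj} = \frac{2}{R^2}(E_a - \mathbf{p}_a\cdot\hat{\mathbf{q}}_j) - E_a$. All names and the toy event are hypothetical.

```python
import math
import random


def jet_momenta(particles, z):
    """q_j = sum_a z_aj p_a for particles given as (E, px, py, pz)."""
    return [[sum(row[j] * p[k] for p, row in zip(particles, z)) for k in range(4)]
            for j in range(len(z[0]))]


def omega(particles, z, R):
    width = sum(q[0] - math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
                for q in jet_momenta(particles, z))
    soft = sum((1.0 - sum(row)) * p[0] for p, row in zip(particles, z))
    return 2.0 / R ** 2 * width + soft


def gradient(particles, z, R):
    """Analytical d(Omega)/d(z_aj); a zero-momentum jet gets q_hat = 0."""
    grad = [[0.0] * len(z[0]) for _ in particles]
    for j, q in enumerate(jet_momenta(particles, z)):
        norm = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
        q_hat = [c / norm for c in q[1:]] if norm > 0.0 else [0.0, 0.0, 0.0]
        for a, p in enumerate(particles):
            along = sum(p[k + 1] * q_hat[k] for k in range(3))
            grad[a][j] = 2.0 / R ** 2 * (p[0] - along) - p[0]
    return grad


def project_row(row):
    """Euclidean projection of one row onto {z >= 0, sum_j z <= 1}."""
    clipped = [max(0.0, x) for x in row]
    if sum(clipped) <= 1.0:
        return clipped
    running, theta = 0.0, 0.0
    for k, u in enumerate(sorted(row, reverse=True), start=1):
        running += u
        shift = (running - 1.0) / k
        if u - shift > 0.0:
            theta = shift
    return [max(0.0, x - theta) for x in row]


def descend(particles, z, R, step=0.5, max_iter=500):
    """Accept a projected gradient step only if Omega decreases."""
    best = omega(particles, z, R)
    for _ in range(max_iter):
        grad = gradient(particles, z, R)
        trial = [project_row([x - step * g for x, g in zip(zr, gr)])
                 for zr, gr in zip(z, grad)]
        value = omega(particles, trial, R)
        if value < best:
            z, best = trial, value
        else:
            step *= 0.5
            if step < 1e-8:
                break
    return z, best


# Hypothetical event: two nearly collinear pairs, roughly back to back.
c, s = math.cos(0.2), math.sin(0.2)
event = [(0.3, 0.3, 0.0, 0.0), (0.2, 0.2 * c, 0.2 * s, 0.0),
         (0.3, -0.3, 0.0, 0.0), (0.2, -0.2 * c, 0.0, 0.2 * s)]
rng = random.Random(1)
z_start = [[rng.random() / 2.0 for _ in range(2)] for _ in event]
omega_start = omega(event, z_start, R=1.0)
z_final, omega_final = descend(event, z_start, R=1.0)
```

The projection step is what keeps every row of $z_{aj}$ inside its simplex; a step that would leave the simplex ends up on its boundary.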

Minimization: 2 jets

[Figure: for 2 jets, the allowed values of $(z_{a1}, z_{a2})$ for each particle form a simplex bounded by the two axes and the line $z_{a1} + z_{a2} = 1$; left panel: a point inside the simplex; right panel: a point on the boundary of the simplex]

[Figures: event display for $e^+e^- \to W^+W^- \to 4\ \text{jets}$ at 180 GeV over the angular ranges [0-360] and [0-180], one panel per minimization step, from iteration = 0 to iteration = 6]

Local minima of $\Omega$

• each time the program starts with a random configuration, it finds a local minimum, which does not need to be the global minimum of $\Omega(\{z_{aj}\})$

• this is similar to how the result of the cone algorithm depends on the initial position for the cone iterations

• but here, in contrast with the cone algorithm, we know which local minimum we should choose: the one that gives the smallest value of $\Omega$

Optimal Jet Finder: number of tries

• in order to find the global minimum, or to increase the probability of finding it, the program tries several different random initial configurations

• OJF parameter: number of tries $n_\text{tries}$

• it was sufficient to take $n_\text{tries} \approx 10$ (even $n_\text{tries} \approx 3$) in the cases we studied

• compromise between quality of jets found and computing time used
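The multi-start strategy can be sketched independently of the details of $\Omega$. In the Python example below a one-dimensional toy function with two local minima stands in for $\Omega$, and plain gradient descent stands in for the OJF minimizer; both are assumptions chosen only to keep the example short, as are all function names.

```python
import random


def toy_objective(x):
    # Stand-in for Omega: two local minima, the lower one at negative x.
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_minimize(x, step=0.01, n_iter=2000):
    """Plain gradient descent; returns (value, position) of a local minimum."""
    for _ in range(n_iter):
        x -= step * (4.0 * x * (x * x - 1.0) + 0.3)
    return toy_objective(x), x


def best_of_tries(minimize, random_start, n_tries, rng):
    """Run the local minimizer from n_tries starts and keep the lowest value."""
    best = None
    for _ in range(n_tries):
        candidate = minimize(random_start(rng))
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


rng = random.Random(2024)
value, position = best_of_tries(
    local_minimize, lambda r: r.uniform(-2.0, 2.0), n_tries=10, rng=rng)
```

Computing time grows linearly with `n_tries`, while the chance of missing the lowest minimum shrinks with every additional start, which is the compromise mentioned above.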

OJF: number of jets to be determined

1. assume some small positive parameter $\omega_\text{cut}$, which is analogous to the jet resolution parameter $y_\text{cut}$ in binary recombination algorithms

2. start with (for example) njets=1

3. find the best configuration with the number of jets equal to njets, as described previously (starting from ntries different initial configurations and choosing the best configuration)

4. check if $\Omega(\{z_{aj}\}) < \omega_\text{cut}$

5. if so, this is the final jet configuration and the final number of jets is njets

6. if not, increase njets by 1 and go to point 3
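The procedure for determining the number of jets is a simple loop around the fixed-njets minimization. A minimal Python sketch follows; `best_config_for` is a hypothetical callable that returns the best $(\Omega, \text{configuration})$ pair for a given number of jets, and the table of $\Omega$ values in the usage example is invented for illustration only.

```python
def find_number_of_jets(best_config_for, omega_cut, max_jets=20):
    """Increase n_jets from 1 until the minimized Omega drops below omega_cut.

    best_config_for(n_jets) must return (omega, configuration) for the best
    configuration found with exactly n_jets jets.
    """
    for n_jets in range(1, max_jets + 1):
        omega, configuration = best_config_for(n_jets)
        if omega < omega_cut:
            return n_jets, omega, configuration
    raise ValueError("Omega stayed above omega_cut up to max_jets jets")


# Invented values of the minimized Omega for a hypothetical 4-parton event.
fake_minima = {1: 0.40, 2: 0.15, 3: 0.04, 4: 0.01}
n_jets, omega, _ = find_number_of_jets(
    lambda n: (fake_minima[n], None), omega_cut=0.05)
```

The `max_jets` guard is an addition for this sketch, so that the loop terminates even if the cut is never reached.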

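Cone algorithm

• cone algorithm defines a jet as all particles within a cone of certain radius R in $(\eta, \phi)$ space:

$$\sqrt{(\eta_a - \eta_j)^2 + (\phi_a - \phi_j)^2} \le R$$

$\eta_j, \phi_j$ are coordinates of the geometrical center of the cone; $\eta_a, \phi_a$ are coordinates of the particles

$\eta_j, \phi_j$ are found from the requirement that the E-weighted centroid computed from all particles in the cone coincides with the geometrical center of the cone:

$$\eta_j = \frac{\sum_{a \in \text{cone}} E_a \eta_a}{\sum_{a \in \text{cone}} E_a}, \qquad \phi_j = \frac{\sum_{a \in \text{cone}} E_a \phi_a}{\sum_{a \in \text{cone}} E_a}$$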

Cone algorithm

$\eta_j, \phi_j$ are found in an iterative procedure:

• start with some trial position $\eta_j, \phi_j$ for a cone

• for all particles within the cone compute the E-weighted centroid $\eta'_j, \phi'_j$ (usually different from $\eta_j, \phi_j$)

• use the centroid as the new position of the center of the cone

• the content of the cone will change accordingly; update it and go back to the centroid computation

• the procedure ends when the cone center and the E-weighted centroid for all particles in the cone coincide: a stable cone
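The iteration towards a stable cone can be sketched in Python as below. Particles are assumed to be `(E, eta, phi)` tuples; the azimuthal centroid is computed relative to the current cone center so that the 2π wrap-around is handled, which is a choice made for this sketch and not something specified on the slide. Function names and the example event are hypothetical.

```python
import math


def delta_phi(a, b):
    """Difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def stable_cone(particles, eta, phi, R, max_iter=100, tol=1e-9):
    """Iterate the cone center to the E-weighted centroid of its content.

    Returns (eta, phi, particles_in_cone), or None if the cone is empty
    or no stable position is reached within max_iter iterations.
    """
    for _ in range(max_iter):
        inside = [p for p in particles
                  if math.hypot(p[1] - eta, delta_phi(p[2], phi)) <= R]
        if not inside:
            return None
        e_sum = sum(p[0] for p in inside)
        new_eta = sum(p[0] * p[1] for p in inside) / e_sum
        new_phi = phi + sum(p[0] * delta_phi(p[2], phi) for p in inside) / e_sum
        if abs(new_eta - eta) < tol and abs(new_phi - phi) < tol:
            return new_eta, new_phi % (2.0 * math.pi), inside
        eta, phi = new_eta, new_phi
    return None


# Two nearby particles and one far away; start the cone on the hardest one.
event = [(10.0, 0.0, 1.0), (5.0, 0.3, 1.2), (1.0, 2.0, 4.0)]
cone_eta, cone_phi, content = stable_cone(event, eta=0.0, phi=1.0, R=0.7)
```

Starting the same function from a different trial position can give a different stable cone, which is why the choice of starting points matters for this algorithm.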

Cone algorithm: seeds

• the iterative procedure described above needs some initial cone position

• look for stable cones starting everywhere (i.e. at every tower or cell): very expensive computationally

• start only at energetic towers, the so-called seeds

Problems with seeds

• problems arise when we compare theory and experimental data

• when we apply the cone algorithm involving seeds to partons beyond the LO, soft radiation or collinear splitting of partons may change the jet configuration significantly

• whereas the experimental jet configuration remains unchanged

Soft radiation at NLO

R < (angle between two partons) < 2R

$e^+e^- \to q\bar{q}$: the event is reconstructed into 2 jets

$e^+e^- \to q\bar{q}g$: with a soft gluon as a seed the event is reconstructed into 1 jet

Collinear splitting at NNLO

R < (angle between two partons) < 2R

$e^+e^- \to q\bar{q}g$: the event is reconstructed into 1 jet using the most energetic parton (red) as the seed

$e^+e^- \to q\bar{q}gg,\ e^+e^- \to q\bar{q}Q\bar{Q}$: the “red” parton splits into 2 collinear pieces; now the “blue” parton is the most energetic and the event is reconstructed into 2 jets

“Collinear splitting” in the experiment

$e^+e^- \to q\bar{q}g$: the particle in the center hits a single calorimeter cell; the cell has sufficient energy to be a seed

$e^+e^- \to q\bar{q}g$: the particle in the center hits two separate cells; none of the cells has enough energy to be a seed

Cone algorithm: overlapping cones

• after all stable cones are found, deal with overlapping cones

• if the fraction f of overlapping energy (with respect to the smaller energy jet) exceeds some threshold (e.g. f>50%) merge the two jets

• otherwise split two jets, assigning particles to the closest jet

• if there are more than two jets overlapping, the final result may depend on the ordering of this procedure (so some standard order has to be specified)

• it is difficult to take account of the merging/splitting procedure in theoretical calculations

Cone algorithm: overlapping cones

[Figure: two panels labelled “experiment” and “theory”]

Solution to seed problems

• cone algorithm involving seeds becomes unstable at higher order perturbative calculations

• seedless algorithm: look for jets everywhere

• Improved Legacy Cone Algorithm (ILCA):

midpoints pab=pa+pb, pabc=pa+pb+pc, ... are used as seeds

• still problem with overlapping cones

• successive recombination algorithms (binary algorithms), such as JADE, Durham (kT)

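$\Omega$ is infrared and collinear safe:

$$\Omega(\{z_{aj}\}) = \frac{2}{R^2} \sum_{j=1}^{n_\text{jets}} \left( \sum_{a=1}^{n_\text{parts}} z_{aj} E_a - \left| \sum_{a=1}^{n_\text{parts}} z_{aj}\, \mathbf{p}_a \right| \right) + \sum_{a=1}^{n_\text{parts}} \bar{z}_a E_a$$

$$\Omega(\{z_{aj}\}) = \frac{2}{R^2} \sum_{j=1}^{n_\text{jets}} \left( E_j - |\mathbf{q}_j| \right) + \sum_{a=1}^{n_\text{parts}} \bar{z}_a E_a$$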

Successive recombination algorithms

• for each pair of particles compute the “distance” $d_{ab}$ between the particles in the pair, for example

JADE: $d_{ab} = 4 E_a E_b \sin^2\frac{\theta_{ab}}{2}$

Durham ($k_T$): $d_{ab} = 4 \min(E_a^2, E_b^2) \sin^2\frac{\theta_{ab}}{2}$

• choose the pair with the smallest distance

• if $\min(d_{ab}) < y_\text{cut}$ ($y_\text{cut}$ is the jet resolution parameter) combine the two particles into one particle using $p_\text{new} = p_a + p_b$ and go back to the beginning (now having the number of particles decreased by one)

• if $\min(d_{ab}) \ge y_\text{cut}$ then stop (we achieved the final jet configuration)
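A successive recombination algorithm with the two distance measures can be sketched in Python as follows. The sketch assumes particles are `(E, px, py, pz)` tuples with energies normalized to the total event energy, so that $y_\text{cut}$ is dimensionless; the function names and the toy event are hypothetical. The pair search is done by brute force over all pairs in every step.

```python
import math


def sin2_half_angle(pa, pb):
    """sin^2(theta_ab / 2) = (1 - cos(theta_ab)) / 2 from the 3-momenta."""
    norm_a = math.sqrt(pa[1] ** 2 + pa[2] ** 2 + pa[3] ** 2)
    norm_b = math.sqrt(pb[1] ** 2 + pb[2] ** 2 + pb[3] ** 2)
    cos_theta = (pa[1] * pb[1] + pa[2] * pb[2] + pa[3] * pb[3]) / (norm_a * norm_b)
    return (1.0 - cos_theta) / 2.0


def d_jade(pa, pb):
    return 4.0 * pa[0] * pb[0] * sin2_half_angle(pa, pb)


def d_durham(pa, pb):
    return 4.0 * min(pa[0], pb[0]) ** 2 * sin2_half_angle(pa, pb)


def cluster(particles, y_cut, distance):
    """Merge the closest pair (p_new = p_a + p_b) until min(d_ab) >= y_cut."""
    jets = [list(p) for p in particles]
    while len(jets) > 1:
        d_min, a, b = min((distance(jets[a], jets[b]), a, b)
                          for a in range(len(jets))
                          for b in range(a + 1, len(jets)))
        if d_min >= y_cut:
            break
        merged = [x + y for x, y in zip(jets[a], jets[b])]
        jets = [p for i, p in enumerate(jets) if i not in (a, b)] + [merged]
    return jets


# Hypothetical event: two nearly collinear pairs, roughly back to back.
c, s = math.cos(0.1), math.sin(0.1)
event = [(0.3, 0.3, 0.0, 0.0), (0.2, 0.2 * c, 0.2 * s, 0.0),
         (0.3, -0.3, 0.0, 0.0), (0.2, -0.2 * c, 0.0, 0.2 * s)]
durham_jets = cluster(event, y_cut=0.01, distance=d_durham)
jade_jets = cluster(event, y_cut=0.01, distance=d_jade)
```

Only two objects are combined per step, and every step looks at all remaining pairs again.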

OJF and kT

• kT merges only 2 particles at a time (binary recombination 2→1)

• other recombination schemes also considered: 3→2, m→n

• OJF takes into account the global structure of the energy flow in the event, i.e. the jet configuration is found from the momenta of all particles in the event (corresponds to m→n)

• OJF finds more regular jets than kT (still, a jet is not a cone)

OJF and kT

• OJF is much faster than kT if a large number of particles has to be analyzed

• average time per event scales as $n_\text{parts}$ for OJF and as $n_\text{parts}^3$ for kT

• this does not matter when we apply the algorithm to theoretical calculations, but in experiments we have to analyze a large number of calorimeter cells

• D0: 45 000 cells, ATLAS: 200 000 cells
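To get a feeling for what linear versus cubic scaling means at these cell counts, the short Python sketch below compares the growth in cost relative to an input of 200 objects. The reference size of 200 and the function name are choices made for this illustration. Only the scaling is compared; the constant prefactors of the two algorithms are not known from the slides, so these are growth factors and not timings.

```python
def relative_cost(n, n_ref, power):
    """Growth factor of the cost when the input grows from n_ref to n,
    assuming the time per event scales as n ** power."""
    return (n / n_ref) ** power


N_REF = 200  # reference input size chosen for this illustration
for name, n_cells in (("D0", 45_000), ("ATLAS", 200_000)):
    linear = relative_cost(n_cells, N_REF, 1)  # OJF-like scaling
    cubic = relative_cost(n_cells, N_REF, 3)   # kT-like scaling
    print(f"{name}: x{linear:,.0f} (linear) vs x{cubic:,.0f} (cubic)")
```

For 45 000 cells the linear factor is 225 while the cubic factor is 225³ = 11 390 625; for 200 000 cells the factors are 10³ and 10⁹.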

OJF and kT

• the cubic dependence determines the way kT has to be used in experiments

• it cannot be applied directly at the level of cells

• a preclustering step is needed to reduce the input data to 200 preclusters (D0 and CDF practice)

• how does the preclustering affect measurements?

• it is difficult to account for the preclustering in theoretical calculations

• the preclustering is a completely independent procedure from the kT algorithm itself

Benchmark test: W-boson mass extraction

• benchmark test based on the W-boson mass extraction from the process $e^+e^- \to W^+W^- \to q_1\bar{q}_2\, q_3\bar{q}_4 \to 4\ \text{jets}$

• modeled on the OPAL analysis (CERN-EP-2000-099)

• we compared OJF with the JADE and Durham (kT) algorithms (Durham being the best algorithm used by the OPAL collaboration)

• we obtained the same accuracy as Durham (still, we did not explore all possibilities)

• we studied the speed of OJF and found that it is much faster than Durham when a large number of calorimeter cells needs to be analyzed

Quality of jets

statistical error of the W-boson mass (corresponding to 1000 experimental events), based on Fisher’s information

ALGORITHM      ERROR [MeV] (±3)
Durham (kT)    105
JADE           118
OJF            106

[Figure: average time per event, time [seconds] versus number of particles/cells, for kT, OJF ntries=10 and OJF ntries=5; $e^+e^- \to W^+W^- \to 4\ \text{jets}$ at 180 GeV]

[Figure: average time per event, time [seconds] versus number of particles/cells, for kT and OJF (ntries=10, ntries=5); $e^+e^- \to W^+W^- \to 4\ \text{jets}$ at 180 GeV]

Method of moments

• parameter estimation, a fundamental problem of mathematical statistics

• for an event P theory gives the probability density $\pi_M(\mathbf{P})$ that depends on some parameter M, e.g. $M_W$ or $\alpha_s$

• given an experimental sample of events, we want to estimate the best value of the parameter M with the smallest possible statistical error

• method of maximal likelihood

• reinterpreted as method of moments

Method of moments

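$$\langle f \rangle_\text{theory} = \int d\mathbf{P}\, \pi_M(\mathbf{P})\, f(\mathbf{P})$$

depends on M

$$\langle f \rangle_\text{exp} = \frac{1}{N_\text{exp}} \sum_{i=1}^{N_\text{exp}} f(\mathbf{P}_i)$$

computed from the experimental sample

$$\langle f \rangle_\text{theory} = \langle f \rangle_\text{exp} \;\Rightarrow\; M_\text{exp},\ \delta M_\text{exp}$$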

Method of moments

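$$I_f = \frac{1}{\operatorname{var} f} \left( \frac{\partial \langle f \rangle}{\partial M} \right)^2$$

informativeness of the observable f, based on the statistical error that f gives

$$f_\text{opt}(\mathbf{P}) = \frac{\partial \ln \pi_M(\mathbf{P})}{\partial M}$$

$$I_{f_\text{opt}} = \langle f_\text{opt}^2 \rangle$$

Fisher’s information

$$\delta M_\text{exp} \ge \frac{1}{\sqrt{N_\text{exp}\, I_{f_\text{opt}}}}$$

Rao-Cramer inequality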

Event representation: basic shape observables

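event: particles

$$\mathbf{P} = \{p_a\}, \quad a = 1, 2, \ldots, n_\text{parts}$$

event as energy flow (collinearly invariant)

$$\mathbf{P}(\hat{n}) = \sum_{a=1}^{n_\text{parts}} E_a\, \delta(\hat{n}, \hat{p}_a)$$

basic shape observables (f is a function of a direction only)

$$f(\mathbf{P}) = \int_\text{unit sphere} d\hat{n}\, \mathbf{P}(\hat{n})\, f(\hat{n}) = \sum_{a=1}^{n_\text{parts}} E_a f(\hat{p}_a)$$

$$\sum_{a=1}^{n_\text{parts}} E_a = 1$$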

Factorial estimate

• the event is represented by the values of all basic shape observables f(P)

• the optimal observable for the measurement of some parameter can be expressed as a combination of basic shape observables

• applying a jet algorithm we reduce the available information about the event P to make the analysis computationally manageable

• we use values of all basic shape observables f(Q), taken on the jet configuration

$$\mathbf{P} \xrightarrow{\ \text{jet algorithm}\ } \mathbf{Q}$$

Hadronic event approximated by jets

• good approximations for f(P) could exist among functions that depend only on Q, which is a parameterization of P in terms of a few jets, found from the condition $f(\mathbf{Q}) \approx f(\mathbf{P})$

• modeled after $f(\mathbf{q}) \approx f(\mathbf{P})$

• (q is the partonic structure of the event) this expresses the fact that most of the information about the event is inherited from its gluon-and-quark structure

Factorial estimate

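$$|f(\mathbf{P}) - f(\mathbf{Q})| \le C_{f,R}\, \Omega_R[\mathbf{P}, \mathbf{Q}]$$

$$f(\mathbf{P}) = \sum_{a=1}^{n_\text{parts}} E_a f(\hat{p}_a), \qquad f(\mathbf{Q}) = \sum_{j=1}^{n_\text{jets}} E_j f(\hat{q}_j)$$

for any f; $C_{f,R}$ depends only on f, it does not depend on the event or the jet configuration

$\Omega_R[\mathbf{P}, \mathbf{Q}]$ depends only on the event P and the jet configuration Q

we choose the jet configuration so that the loss of information about the event is minimal generically for all f (we minimize $\Omega_R$)

this is essentially the Optimal Jet Definition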

Summary

• I presented the Optimal Jet Finder

• based on the global energy flow in the event

• infra-red and collinear safe: no seed-related problems

• no overlapping jets

• returns additional numerical characteristics of the jet configuration found (so called dynamical width and soft energy) which may be helpful in construction of (quasi-) optimal observables in statistical problems

• much faster than kT for a large number of input cells

Optimal Jet Definition: cylindrical kinematics

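$$\Omega(\{z_{aj}\}) = \frac{4}{R^2} \sum_{j=1}^{n_\text{jets}} \sum_{a=1}^{n_\text{parts}} z_{aj} E_a^\perp \left[ \sinh^2\frac{\eta_a - \eta_j}{2} + \sin^2\frac{\phi_a - \phi_j}{2} \right] + \sum_{a=1}^{n_\text{parts}} \bar{z}_a E_a^\perp$$

$$q_j = (E_j, \mathbf{q}_j) = \sum_{a=1}^{n_\text{parts}} z_{aj}\, p_a, \qquad \frac{\mathbf{q}_j^\perp}{|\mathbf{q}_j^\perp|} = (\cos\phi_j, \sin\phi_j)$$

$$\eta_j = \frac{\sum_{a=1}^{n_\text{parts}} z_{aj} E_a^\perp \eta_a}{\sum_{a=1}^{n_\text{parts}} z_{aj} E_a^\perp}, \qquad E_a^\perp = \sqrt{p_{a,x}^2 + p_{a,y}^2}$$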
