R.K.Bock, Durham, March 2002 1

Gamma/Hadron separation in atmospheric Cherenkov telescopes

Overview

• multi-wavelength astrophysics

• imaging Cherenkov telescopes (IACT-s)

• image classification

• methods under study

• trying for a rigorous comparison

Wavelength regimes in astrophysics

• extend over 20 orders of magnitude in energy, if one adds infrared, radio and microwave observations

• Cherenkov telescopes use visible light, but few quanta: ‘imaging’ takes a different meaning

• some instruments have to be satellite-based, due to the absorbing effect of the atmosphere

Full sky at different wavelengths

An AGN at different wavelengths

Objects of interest: active galactic nuclei

Black holes spin and develop a jet with shock waves: electrons and protons get accelerated and impart their energy to high-energy γ-rays

Principle of imaging Cherenkov telescopes

• a shower develops in the atmosphere; charged relativistic particles emit Cherenkov radiation (at wavelengths from visible to UV)

• some photons reach the ground and are reflected by a mirror onto a camera

• high sensitivity and good time resolution are vital, precision is not: high reflectivity mirrors, the best possible photomultipliers in the camera

Principle of image parameters

• hadron showers (cosmics) dominate the hardware trigger, image analysis must discriminate gammas from hadrons

• showers show different characteristics (as in any calorimeter): features must be extracted using principal component analysis and other characteristics, experimenting with a view to the best separation

One of the predecessor telescopes (HEGRA) in 1999

Photomontage of the MAGIC telescope in La Palma (2000)

Installing the mirror dish of MAGIC

La Palma, Dec 2001

Deciding on a single variable (test statistic)

A test statistic can be used to discriminate between members of two different data samples if they have different statistical distributions. Any feature (parameter) extracted from the images of our telescopes can be considered such a test statistic. The following example is for the particularly discriminating orientation variable alpha, one of our image parameters.
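As a rough illustration of cutting on a single test statistic, the sketch below applies a cut "alpha below some value" to two toy samples and reports the fraction of each that survives. The distributions, the sample sizes and the function name `cut_efficiencies` are assumptions made for the example only; they are not taken from MAGIC data or software.

```python
import random


def cut_efficiencies(gammas, hadrons, cut):
    """Return the fraction of each sample whose test statistic is below `cut`."""
    gamma_acceptance = sum(a < cut for a in gammas) / len(gammas)
    hadron_acceptance = sum(a < cut for a in hadrons) / len(hadrons)
    return gamma_acceptance, hadron_acceptance


rng = random.Random(1)
# Toy model (assumed): gammas from a point source peak at small |alpha|,
# hadrons are roughly flat between 0 and 90 degrees.
gamma_alpha = [min(abs(rng.gauss(0.0, 8.0)), 90.0) for _ in range(5000)]
hadron_alpha = [rng.uniform(0.0, 90.0) for _ in range(5000)]

for cut in (5.0, 10.0, 20.0):
    eg, eh = cut_efficiencies(gamma_alpha, hadron_alpha, cut)
    print(f"alpha < {cut:4.1f} deg: gammas kept {eg:.2f}, hadrons kept {eh:.2f}")
```

Scanning the cut value traces out the trade-off between gamma acceptance and surviving hadron background for this one variable.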

Multivariate classification

• cuts are in the n-space of features (in our case image parameters), the problem gets unwieldy even at low n

• correlations between the features cause simple cuts in variables to be an ineffective method

• decorrelation by standard methods (e.g. Karhunen-Loeve) does not solve the problem, being a linear operation

• finding new variables does help, as do cuts along one axis that depend on features along a different axis: dynamic cuts (subjective!)

• ideally, a transformation to a single test statistic should be found

Different classification methods

• cuts in the image parameters (including dynamic cuts)

• mathematically optimized cuts in the image parameters: classification and regression tree (CART), commercial products available

• linear discriminant analysis (LDA)

• composite (2-D) probabilities (CP)

• kernel methods

• artificial neural networks (ANN)

There are many general methods on the market (this slide from A.Faruque, Mississippi State University)

Method details and comments: cuts and supercuts

• wide experience exists in many physics experiments and for all IACT-s; any method claiming to be superior must use results from these as yardstick

• needs an optimization criterion, and does not result in a relation between gamma acceptance and hadron contamination (i.e. no single test statistic)

• usually leads to separate studies and approximations for each new data set (this is past experience) - often difficult to reproduce

Method details and comments: CART

• developed originally by high-energy physicists to do away with the randomness in optimizing cuts (Breiman, Friedman, Olshen, Stone, 1984)

• now developed into a data mining method, commercially available from several companies

• basic operations: growing a tree, pruning it, splitting the leaves again - done in some heuristic succession

• the problem is to find a robust measure to choose from the many trees that are (or can be) grown

• made for large samples: no experience with IACT-s, but there are promising early results
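The elementary step in growing a tree is choosing one mathematically optimized cut on one feature. The sketch below finds that cut by minimizing the Gini impurity of the two resulting subsamples. The Gini criterion is the usual choice in CART but is an assumption here, since the slides do not name the splitting measure; the function names and toy values are likewise illustrative. Growing a full tree would repeat this step recursively on each subsample, and pruning is not shown.

```python
def gini(n_gamma, n_hadron):
    """Gini impurity of a node holding the given numbers of gammas and hadrons."""
    n = n_gamma + n_hadron
    if n == 0:
        return 0.0
    p = n_gamma / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, impurity) of the best single cut on one feature.

    labels: 1 for gamma, 0 for hadron. The impurity is the size-weighted
    Gini impurity of the two sides; threshold is None if no cut helps.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_g = sum(label for _, label in pairs)
    total_h = n - total_g
    best = (None, gini(total_g, total_h))
    left_g = left_h = 0
    for k in range(n - 1):
        value, label = pairs[k]
        if label:
            left_g += 1
        else:
            left_h += 1
        next_value = pairs[k + 1][0]
        if next_value == value:
            continue  # cannot cut between identical values
        right_g, right_h = total_g - left_g, total_h - left_h
        score = ((k + 1) * gini(left_g, left_h)
                 + (n - k - 1) * gini(right_g, right_h)) / n
        if score < best[1]:
            best = ((value + next_value) / 2.0, score)
    return best


# Toy image parameter: small for gammas, large for hadrons.
print(best_split([1, 2, 3, 10, 11, 12], [1, 1, 1, 0, 0, 0]))
```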

Method details and comments: LDA

• parametric method, finding linear combinations of the original image parameters such that the separation between signal (gamma) and background (hadron) distributions gets maximized

• fast, simple and (probably) very robust

• ignores non-linear correlations in n-dimensional space (because of linear transformation)

• little experience with LDA in IACT-s, early tests show that higher-order variables are needed (e.g. x, y -> x²y)

Method details and comments: LDA

Given training samples $g_i$ ($i = 1, n_g$) for gammas and $p_j$ ($j = 1, n_p$) for protons, with nvar elements each, find a linear transformation vector $a$ (nvar) such that $g' = a \cdot g$ and $p' = a \cdot p$, and the discriminating power $d = y^T S_b y / y^T (S_b + S_w) y$ gets maximized, where $y = \{g'\ p'\}$. $S_b$ (between-class variance) and $S_w$ (within-class variance) are defined by:

$$S_b = \sum_{obs} (\mu_{class} - \mu_{tot})(\mu_{class} - \mu_{tot})^T \quad \text{and} \quad S_w = \sum_{obs} (x - \mu_{class})(x - \mu_{class})^T$$

($x = \{g\ p\}$, $\mu_{class}$ = class mean, $\mu_{tot}$ = overall mean). This leads, for two classes, to the result:

$$a = \frac{n_g n_p}{n_g + n_p} (S_b + S_w)^{-1} \{\mu_g - \mu_p\}$$
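A minimal sketch of the two-class result, assuming the standard Fisher form in which the direction a is proportional to (S_b + S_w)^{-1} applied to the difference of the class means. It uses the fact that S_b + S_w, summed over all observations, equals the total scatter about the overall mean. The constant factor in front is dropped because only the direction of a matters for separation. All function names and the toy samples are illustrative assumptions.

```python
import random


def mean(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]


def scatter(rows, centre):
    """Sum over observations of (x - centre)(x - centre)^T."""
    d = len(centre)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[k] - centre[k] for k in range(d)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(m)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x


def lda_direction(gammas, protons):
    """Direction a proportional to (S_b + S_w)^{-1} (mu_g - mu_p)."""
    mu_g, mu_p = mean(gammas), mean(protons)
    everything = gammas + protons
    s_total = scatter(everything, mean(everything))  # equals S_b + S_w
    return solve(s_total, [g - p for g, p in zip(mu_g, mu_p)])


def project(a, x):
    """Test statistic: the image parameters projected onto a."""
    return sum(ak * xk for ak, xk in zip(a, x))


rng = random.Random(2)
gammas = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
protons = [[rng.gauss(1.5, 1.0), rng.gauss(0.5, 1.5)] for _ in range(500)]
a = lda_direction(gammas, protons)
print("a =", a)
```

The projection `project(a, x)` then serves as the single test statistic on which a cut is placed.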

Method details and comments: LDA

Like Principal Component Analysis (PCA), LDA is used for data classification and dimensionality reduction. LDA maximizes the ratio of between-class variance to within-class variance, for any pair of data sets. This guarantees maximal separability.

The prime difference between LDA and PCA is that PCA performs feature classification (e.g. image parameters!) while LDA performs data classification. PCA changes both the shape and location of the data in its transformed space, whereas LDA provides more class separability by building a decision region between the classes.

The formalism is simple: the transformation into the ‘best separable space’ is performed by the eigenvectors of a matrix readily derived from the data (for our application: in two classes, gammas and hadrons)

Caveat: both the PCA and LDA are linear transformations; they may be of limited efficiency when non-linearity is involved.

Method details and comments: kernel

• kernel density estimation is a nonparametric multivariate classification technique. The advantage is that of generality of the class-conditional and consistently estimated densities

• uses individual event likelihoods, defined as the closeness to the population of gamma events or hadron events in n-dimensional space. The closeness is expressed by a kernel function as metric

• mathematically convincing, but leading into practical problems, including limitations in dimensionality; there is also some randomness in choosing the kernel function

• has been toyed with in Whipple (the earliest functioning IACT), results look convincing; however, Whipple still uses supercuts; only first experience with kernels in MAGIC: positive

Method details and comments: kernel

We define reference samples $g_i$ ($i = 1, n_g$) for gammas and $p_j$ ($j = 1, n_p$) for protons: Monte Carlo gammas, and 'off' events. We then find as classifier a likelihood function

$$R_g = k_g / k_p, \quad \text{with the kernel functions} \quad k_{g,p} = k_{g,p}(x - x_r)$$

$x$ is the point under consideration, $x_r$ are the points in the reference sample (gamma or proton). The trick is to define a valid kernel function. Whipple has used a multivariate Gaussian (like a point spread function):

$$k = \exp\{-(x - x_r)^T C_r^{-1} (x - x_r)/2\} \, / \, \sqrt{(2\pi)^n |C_r|}$$

$C_r$ is the covariance matrix of the reference data set.

The method needs comparing every event with every event in both reference samples, and thus is computationally very costly. Whipple has reduced the parameter space and precomputed the kernel function for a lattice, using interpolation. (See Gamma Ray Workshop 1999, Snowbird, p. 338, and ICRC 2001, Hamburg, p. 2939)
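The likelihood ratio can be sketched for two image parameters as follows. Each density is taken as the average of Gaussian kernels centred on the reference points, using the covariance matrix of the reference sample. The averaging over reference points and the optional width factor `h` (1.0 means the covariance is used unscaled) are assumptions of this sketch, as are all names and the toy samples. The loop over every reference point for each event is what makes the method costly.

```python
import math
import random


def covariance_2d(points):
    """Sample covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def kernel_density(x, reference, cov, h=1.0):
    """Average Gaussian kernel between x and every reference point."""
    sxx, sxy, syy = (h * h * c for c in cov)
    det = sxx * syy - sxy * sxy
    norm = 2.0 * math.pi * math.sqrt(det)  # (2 pi)^(n/2) |C|^(1/2) for n = 2
    total = 0.0
    for r in reference:
        dx, dy = x[0] - r[0], x[1] - r[1]
        # (x - x_r)^T C^{-1} (x - x_r), with the 2x2 inverse written out
        q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        total += math.exp(-0.5 * q)
    return total / (norm * len(reference))


def likelihood_ratio(x, gammas, protons, cov_g, cov_p, h=1.0):
    """R = k_g / k_p at the point x."""
    return (kernel_density(x, gammas, cov_g, h)
            / kernel_density(x, protons, cov_p, h))


rng = random.Random(3)
gammas = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(400)]
protons = [(rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)) for _ in range(400)]
cov_g, cov_p = covariance_2d(gammas), covariance_2d(protons)

for point in ((0.0, 0.0), (1.5, 1.5), (3.0, 3.0)):
    print(point, likelihood_ratio(point, gammas, protons, cov_g, cov_p))
```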

Method details and comments: composite probabilities (2-D)

• intuitive determination of event probabilities by multiplying the probabilities in all 2D projections that can be made from image parameters, using constant bin content for some data

• shown on some IACT data to at least match best existing results (but strict comparisons suffered from moving data sets)

Method details and comments: composite probabilities (2-D)

• CP program uses same-content binning in 2 dimensions

• bins are set up for gammas (red), probabilities are evaluated for protons (blue)

• all possible 2-D projections are used
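One possible reading of the composite-probability idea is sketched below; it is not the CP program itself. For every pair of image parameters, a grid is built whose cells hold equal numbers of reference gammas (quantile slices along the first parameter, then quantile bins along the second within each slice). Protons are counted in the same cells, and the per-cell gamma probabilities are multiplied over all pairs. The binning scheme, the 0.5 smoothing term and all names are assumptions.

```python
import bisect
import itertools
import random


def quantile_edges(values, nbins):
    """Interior edges splitting `values` into nbins bins of nearly equal content."""
    s = sorted(values)
    return [s[len(s) * k // nbins] for k in range(1, nbins)] if s else []


class EqualContentGrid:
    """2-D grid in parameters (ia, ib) whose cells hold equal numbers of gammas."""

    def __init__(self, gammas, ia, ib, nbins):
        self.ia, self.ib = ia, ib
        self.edges_a = quantile_edges([g[ia] for g in gammas], nbins)
        slices = [[] for _ in range(nbins)]
        for g in gammas:
            slices[bisect.bisect_right(self.edges_a, g[ia])].append(g[ib])
        self.edges_b = [quantile_edges(s, nbins) for s in slices]

    def cell(self, event):
        i = bisect.bisect_right(self.edges_a, event[self.ia])
        j = bisect.bisect_right(self.edges_b[i], event[self.ib])
        return i, j


def build_tables(gammas, protons, nbins=4):
    """Count gammas and protons per cell for every 2-D projection."""
    tables = []
    for ia, ib in itertools.combinations(range(len(gammas[0])), 2):
        grid = EqualContentGrid(gammas, ia, ib, nbins)
        counts = {}
        for column, sample in ((0, gammas), (1, protons)):
            for event in sample:
                counts.setdefault(grid.cell(event), [0, 0])[column] += 1
        tables.append((grid, counts))
    return tables, len(gammas), len(protons)


def composite_probability(event, model):
    """Product over all 2-D projections of the per-cell gamma probability."""
    tables, n_g, n_p = model
    prob = 1.0
    for grid, counts in tables:
        g, p = counts.get(grid.cell(event), (0, 0))
        frac_g = (g + 0.5) / n_g  # 0.5 avoids zero probabilities (assumed)
        frac_p = (p + 0.5) / n_p
        prob *= frac_g / (frac_g + frac_p)
    return prob


rng = random.Random(4)
gammas = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2000)]
protons = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(2000)]
model = build_tables(gammas, protons)
print(composite_probability([0.0, 0.0, 0.0], model))
print(composite_probability([2.0, 2.0, 2.0], model))
```

With three parameters there are three projections; the number of projections grows quadratically with the number of image parameters.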

Method details and comments: ANN-s

• method has been presented often in the past - resembles the CART method but works in locally linearly transformed data

• substantial randomness in choosing depth of tree, training method, transfer function...

• so far no convincing results on IACT-s; Whipple has tried and rejected them

Gamma events in MAGIC

before and after cleaning

Proton events in MAGIC

before and after cleaning

Comparison MC gammas / MC protons

Typically, optimization parameters are fully defined by cost, purity, and sample size

Different methods on the same data set

We are running a comparative study: criteria

• strictly defined disjoint training and control samples

• must give estimators for hadron contamination and gamma acceptance (purity and cost)

• should ideally result in a smooth function relating purity with cost, i.e. result in a single test statistic

• if not, must show results for several optimization criteria, e.g. estimated hadron contamination at fixed gamma acceptance values, significance, etc.

• for MC events, can control results by comparing classification to the known origin of events
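Two of the criteria above lend themselves to a short sketch: splitting events into disjoint training and control samples, and quoting the surviving hadron fraction at fixed gamma acceptance values for any method that delivers a single test statistic. The function names, the convention that larger scores are more gamma-like, and the toy scores are assumptions of the example. Converting the surviving hadron fraction into a contamination of the selected sample would additionally need the relative gamma and hadron rates, which are not modelled here.

```python
import random


def split_disjoint(events, train_fraction, seed):
    """Shuffle once and cut into disjoint training and control samples."""
    shuffled = events[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def hadron_acceptance_at(gamma_scores, hadron_scores, gamma_acceptance):
    """Cut that keeps the requested fraction of gammas, and the hadron fraction kept.

    Larger scores are taken to be more gamma-like.
    """
    ranked = sorted(gamma_scores, reverse=True)
    n_keep = max(1, round(gamma_acceptance * len(ranked)))
    cut = ranked[n_keep - 1]
    kept = sum(s >= cut for s in hadron_scores)
    return cut, kept / len(hadron_scores)


# Toy control-sample scores from some classifier (assumed values).
gamma_scores = [0.9, 0.8, 0.7, 0.6]
hadron_scores = [0.75, 0.2, 0.1, 0.65]
for acceptance in (0.5, 0.75, 1.0):
    cut, kept = hadron_acceptance_at(gamma_scores, hadron_scores, acceptance)
    print(f"gamma acceptance {acceptance:.2f}: cut {cut}, hadrons kept {kept:.2f}")
```

Evaluating the second function on the control sample only, for a common list of gamma acceptance values, gives numbers that can be compared across methods.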

Even if there were a clear conclusion... there remain some serious caveats

• these methods all assume an abstract space of image parameters, which is acceptable only in Monte Carlo situations

• real data are subject to influences that distort this space:
  • starfield and night sky background
  • atmospheric conditions
  • unavoidable detector changes and malfunction

• no method can invent new independent parameters

• we assume that in the final analysis, gammas will be Monte Carlo and measurements will be on/off: we must deal with variables which may not be representative in Monte Carlo events and yet influence the observed image parameters; e.g. zenith angle changes continuously, and energy is something we want to observe, hence unknown

• some compromise between frequent Monte Carlo-ing and parametric corrections to parameters is the likely solution