signal/background discrimination harrison b. prosper samsi, march 20061 signal/background...

34
Signal/Background Discrimination H arrison B. Prosper SAMSI, March 2006 1 Signal/Background Signal/Background Discrimination Discrimination in in Particle Physics Particle Physics Harrison B. Prosper Florida State University SAMSI 8 March, 2006

Upload: gabriella-oleary

Post on 27-Mar-2015

225 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 1

Signal/Background Signal/Background Discrimination Discrimination

in in Particle PhysicsParticle Physics

Harrison B. ProsperFlorida State University

SAMSI8 March, 2006

Page 2: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 2

OutlineOutline

Particle Physics Data

Signal/Background Discrimination

Summary

Page 3: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 3

Particle Physics DataParticle Physics Data

proton + anti-proton

-> positron (e+)neutrino ()Jet1Jet2Jet3Jet4

This event is described by(at least) 3 + 2 + 3 x 4 = 17measured quantities.

Page 4: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 4

Particle Physics DataParticle Physics Data

H0 Standard ModelH1 Model of the Week

1

106

Page 5: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 5

Signal/Background Signal/Background DiscriminationDiscrimination

To minimize misclassification probability, compute

p(S|x) = p(x|S) p(S) / [p(x|S) p(S) + p(x|B) p(B)]

Every signal/background discrimination method is ultimately an algorithm to approximate this function, or a mapping thereof.

p(s) / p(b) is the prior signal to background ratio, that is, it is S/B before applying a cut to p(S|x).

Page 6: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 6

GivenD D = x, y

x = {x1,…xN}, y = {y1,…yN}of N training examples (events)

Infer A discriminant function f(x, w), with parameters w p(ww|x, y) = p(x, y|ww) p(ww) / p(x, y)

= p(y|x, w) p(x|ww) p(ww) / p(y|x) p(x)= p(y|x, w) p(ww) / p(y|x)

assuming p(x|w) -> p(x)

Signal/Background Signal/Background DiscriminationDiscrimination

Page 7: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 7

A typical likelihood for classification:

p(y|x, ww) = i f(xi, ww)y [1 – f(xi, ww)]1-y

where y = 0 for background eventsy = 1 for signal events

If f(x, ww) flexible enough, then maximizing p(y|x, ww) with respect to w yields f = p(S|x), asymptotically.

Signal/Background Signal/Background DiscriminationDiscrimination

Page 8: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 8

However, in a Bayesian calculation it is more natural to average

y(x) = ∫ f(x, ww) p(ww|D) dw

Questions:1. Do suitably flexible functions f(x, ww) exist?

2. Is there a feasible way to do the integral?

Signal/Background Signal/Background DiscriminationDiscrimination

Page 9: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 9

Answer 1: Yes!Answer 1: Yes!

Hilbert’s 13th problem: Prove a special case of the conjecture:The following is impossible, in general,

f(x1,…,xn) = F( g1(x1),…, gn(xn) )

In 1957, Kolmogorov proved thecontrary: A function f:Rn -> R can berepresented as followsf(x1,..,xn) = ∑i=1

2n+1 Qi( ∑j=1n Gij(xj) )

where Gij are independent of f(.)

Page 10: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 10

Kolmogorov FunctionsKolmogorov Functions

H

j

P

iiijjj xuavbwxf

1 1

tanh),(

n(x,w)

x1

x2

u, a

v, b )],(exp[1

1),(

wxfwxn

A neural network is an example of a Kolmogorov function, that is, a function capable of approximating arbitrary mappings f:Rn -> R

The parameters w = (u, a, v, b) are called weightsweights

Page 11: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 11

Answer 2: Yes!Answer 2: Yes!

Computational MethodGenerate a Markov chain (MC) of N points

{w}, whose stationary density is p(w|D), and average over the last M points.

Map problem into that of “particle” moving in a spatially-varying “potential” and use methods of statistical mechanics to generate states (p, w) with probability

~ exp(- H),

where H is the “Hamiltonian”H = log p(w|D) + p2, with

“momentum” p.

Page 12: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 12

Hybrid Markov Chain Monte Hybrid Markov Chain Monte CarloCarlo

Computational Method…For a fixed H traverse space (p, w) using

Hamilton’s equations, which guarantees that all points consistent with H will be visited with equal probability ~ exp(-H).

To allow exploration of states with differing values of H one introduces, periodically, random changes to the momentum p.

SoftwareFlexible Bayesian Modeling by Radford Nealhttp://www.cs.utoronto.ca/~radford/fbm.software.html

Page 13: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Example 1Example 1

Page 14: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 14

Example 1: 1-DExample 1: 1-D

Signal p+pbar -> t q b

Background p+pbar -> W b b

NN Model Class (1, 15, 1)

MCMC 500 tqb + Wbb events Use last 20 points in a

chain of 10,000,

x

tqb

skipping every 20th

Wbb

Page 15: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 15

Example 1: 1-DExample 1: 1-D

x

Dots p(S|x) = HS/(HS+HB)

HS, HB, 1-D histograms

Curves Individual NNs n(x, wwkk)

Black curve < n(x, w) >

Page 16: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Example 2Example 2

Page 17: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 17

Example 2: 14-D (Finding Example 2: 14-D (Finding Susy!)Susy!)

Transversemomentumspectra

Signal:blackcurve

Signal/Noise

1/25,0001/25,000

Page 18: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 18

Example 2: 14-D (Finding Example 2: 14-D (Finding Susy!)Susy!)

Missingtransversemomentumspectrum

(caused byescape ofneutrinosand Susyparticles)

Measuredquantities:

4 x (ET, , )

+ (ET, )

= 1414

Page 19: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 19

Likelihood Prior

Example 2: 14-D (Finding Example 2: 14-D (Finding Susy!)Susy!)

Signal250 p+pbar -> gluino, gluino (Susy) events

Background250 p+pbar -> top, anti-top events

NN Model Class(14, 40, 1) (w є 641-D parameter space!)

MCMCUse last 100 networks in a Markov chain of

10,000, skipping every 20.

Page 20: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 20

ResultsResults

Network distributionbeyond n(x) > 0.9

Assuming L = 10 fb-1

Cut S B S/√B0.90 5x103 2x106 3.50.95 4x103 7x105 4.70.99 1x103 2x104 7.0

Page 21: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 21

But Does It Really Work? But Does It Really Work?

Let d(x) = N p(x|S) + N p(x|B) be the density of the data, containing 2N events, assuming, for simplicity, p(S) = p(B).

A properly trained classifier y(x) approximates

p(S|x) = p(x|S)/[p(x|S) + p(x|B)]

Therefore, if the data (signal + background) are weighted with y(x), we should recover the signal density.

Page 22: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 22

But Does It Really Work? But Does It Really Work?

It seems to!

Page 23: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Example 3Example 3

Page 24: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 24

Particle Physics Data, Take 2Particle Physics Data, Take 2

Two varieties of jet:

1. Tagged (Jet 1, Jet 4)

2. Untagged (Jet 2, Jet 3)

We are often interested in

Pr(Tagged|Jet Variables)

Page 25: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 25

Example 3: “Tagging” JetsExample 3: “Tagging” Jets

Tagged-jet

Untagged-jetcollision point

p(T|x)= pp(x|T) p(T) / dd(x)

d(x) = pp(x|T) p(T) + p(x|U) p(U)

x = (PT, , )(red curve is dd(x)!)

pp(x

|T)

or dd

(x)

Page 26: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 26

Probability Density Probability Density EstimationEstimation

Approximate a density by a sum over kernels K(.), one placed at each of the N points xi of the training sample.

h is one or more smoothing parameters adjusted to provide the best approximation to the true density p(x).

If h is too small, the model will be very spiky; if h is too large, features of the density p(x) will be lost.

N

i

i

h

xxk

Nxp

1

1)(ˆ

Page 27: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 27

Probability Density Probability Density EstimationEstimation

Why does this work? Consider the limit as N -> ∞ of

In the limit N -> ∞, the true density p(x) will be recovered provided that h -> 0 in such a way that

N

i

i

h

xxk

Nxp

1

1)(ˆ

dzzp

h

zxkxp )(

)()(ˆ

)( zxh

xxk ni

Page 28: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 28

Probability Density Probability Density EstimationEstimation

As long as the kernel behaves sensibly in the N -> ∞ limit any kernel will do. In practice, the most commonly used kernel is the product of 1-D Gaussians, one for each dimension “i”:

One advantage of the PDE approximation is that it contains very few adjustable parameters: basically, the smoothing parameters.

2/

1

2

)2(/2/exp/ nnn

i

i hh

zxhzxk

Page 29: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 29

Example 3: “Tagging” JetsExample 3: “Tagging” Jets

Tagged-jet

Untagged-jetcollision point

Projections of estimated p(T|x) (black curve) onto thePT, and axes. Blue points: ratio of blue to red histograms

(see slide 25)

Page 30: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 30

Example 3: “Tagging” JetsExample 3: “Tagging” Jets

Tagged-jet

Untagged-jetcollision point

Projections of data weighted by p(T|x). Recovers tagged density p(xx|T).

Page 31: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 31

But, How Well Does It Work?But, How Well Does It Work?

Tagged-jet

Untagged-jetcollision point

How well do the n-D model and the n-D data agree?A thought (JL, HBP):

1. Project the model and the data onto the same set ofrandomly directed rays through the origin.2. Compute some measure of discrepancy for each pairof projections.3. Do something sensible with this set of numbers!!

Page 32: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 32

Tagged-jet

Untagged-jetcollision point

But, How Well Does It Work?But, How Well Does It Work?

Projections of p(T|x)onto 3 randomly chosenrays through the origin.

Page 33: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 33

Tagged-jet

Untagged-jetcollision point

Projections of weighted tagged + untagged data onto the 3 randomly selected rays.

But, How Well Does It Work?But, How Well Does It Work?

Page 34: Signal/Background Discrimination Harrison B. Prosper SAMSI, March 20061 Signal/Background Discrimination in Particle Physics Harrison B. Prosper Florida

Signal/Background Discrimination Harrison B. Prosper SAMSI, March 2006 34

SummarySummary

Multivariate methods have been applied with considerable success in particle physics, especially for classification. However, there is considerable room for improving our understanding of them as well as expanding their domain of application.

The main challenge is data/model comparison when each datum is a point in 1…20 dimensions. During the SAMSI workshop we hope to make some progress on the use of projections onto multiple rays. This may be an interesting area for collaboration between physicists and statisticians.