G. Cowan Lectures on Statistical Data Analysis 1
Statistical Data Analysis: Lecture 10
1. Probability, Bayes’ theorem, random variables, pdfs
2. Functions of r.v.s, expectation values, error propagation
3. Catalogue of pdfs
4. The Monte Carlo method
5. Statistical tests: general concepts
6. Test statistics, multivariate methods
7. Significance tests
8. Parameter estimation, maximum likelihood
9. More maximum likelihood
10. Method of least squares
11. Interval estimation, setting limits
12. Nuisance parameters, systematic uncertainties
13. Examples of Bayesian approach
14. tba
The method of least squares

Suppose we measure N values, y1, ..., yN, assumed to be independent Gaussian r.v.s with

    E[yi] = λ(xi; θ).

Assume known values of the control variable x1, ..., xN and known variances

    V[yi] = σi².

The likelihood function is

    L(θ) = ∏i (2πσi²)^(−1/2) exp[ −(yi − λ(xi; θ))² / 2σi² ].

We want to estimate θ, i.e., fit the curve to the data points.
The method of least squares (2)
The log-likelihood function is therefore

    ln L(θ) = −½ ∑i (yi − λ(xi; θ))² / σi²  +  terms not depending on θ.

So maximizing the likelihood is equivalent to minimizing

    χ²(θ) = ∑i (yi − λ(xi; θ))² / σi².

Minimum defines the least squares (LS) estimator θ̂.
Very often measurement errors are ~ Gaussian, and so ML and LS are essentially the same.

Often one minimizes χ² numerically (e.g. with the program MINUIT).
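As an illustrative sketch (not the lecture's code; MINUIT is the tool named above, but here `scipy.optimize.minimize` stands in, with made-up data and a straight-line hypothesis), the numerical χ² minimization might look like:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: control variable x, measurements y, known errors sigma
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.7, 2.3, 3.5, 3.3, 4.3])
sigma = np.array([0.5, 0.3, 0.4, 0.4, 0.5])

def lam(x, theta):
    """Straight-line hypothesis lambda(x; theta) = theta0 + theta1 * x."""
    return theta[0] + theta[1] * x

def chi2(theta):
    """chi^2(theta) = sum_i (y_i - lambda(x_i; theta))^2 / sigma_i^2."""
    return np.sum((y - lam(x, theta)) ** 2 / sigma ** 2)

# Numerical minimization, playing the role MINUIT plays in the lecture
result = minimize(chi2, x0=[0.0, 1.0])
theta_hat = result.x    # LS estimators
chi2_min = result.fun   # chi^2 at its minimum
```

Because χ² is quadratic in θ for a model linear in the parameters, any standard minimizer converges to the same θ̂ a direct matrix solution would give.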
LS with correlated measurements
If the yi follow a multivariate Gaussian with covariance matrix V,

    L(θ) ∝ exp[ −½ (y − λ(θ))ᵀ V⁻¹ (y − λ(θ)) ],

then maximizing the likelihood is equivalent to minimizing

    χ²(θ) = ∑i,j (yi − λ(xi; θ)) (V⁻¹)ij (yj − λ(xj; θ)).
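A short NumPy sketch of evaluating this correlated χ² (with assumed numbers, not the lecture's: two correlated measurements of the same quantity, so λi = θ):

```python
import numpy as np

# Hypothetical: two measurements of the same quantity theta, correlated errors
y = np.array([9.5, 10.5])
V = np.array([[1.0, 0.6],
              [0.6, 1.5]])       # covariance matrix of the measurements
Vinv = np.linalg.inv(V)

def chi2(theta):
    """chi^2(theta) = (y - lambda)^T V^{-1} (y - lambda), here lambda_i = theta."""
    r = y - theta                # residual vector
    return r @ Vinv @ r

# The LS estimator is here a weighted average, with weights from V^{-1}:
w = Vinv.sum(axis=1) / Vinv.sum()
theta_hat = w @ y
```

For uncorrelated measurements V is diagonal and this reduces to the usual sum of squared residuals over σi².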
Example of least squares fit
Fit a polynomial of order p:

    λ(x; θ) = θ0 + θ1 x + ... + θp xᵖ.
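Since a polynomial is linear in the parameters θ, the χ² minimum can be found directly by solving a weighted linear system rather than by iteration. A sketch with made-up data points (not the lecture's example values):

```python
import numpy as np

# Hypothetical data points with known errors
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([0.2, 0.9, 2.1, 3.2, 3.9])
sigma = np.full_like(x, 0.3)

p = 1  # order of the polynomial (1 = straight line)

# Design matrix A_ij = x_i^j; dividing rows by sigma_i makes lstsq minimize chi^2
A = np.vander(x, p + 1, increasing=True) / sigma[:, None]
b = y / sigma
theta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# chi^2 at the minimum, for the goodness-of-fit discussion below
chi2_min = np.sum((y - np.vander(x, p + 1, increasing=True) @ theta_hat) ** 2
                  / sigma ** 2)
```

`theta_hat[j]` is the coefficient of xʲ; increasing p always lowers χ²min, which is why the degrees of freedom are reduced by the number of fitted parameters.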
Variance of LS estimators

In most cases of interest we obtain the variance in a manner similar to ML. E.g. for data ~ Gaussian we have

    ln L = −χ²/2

and so

    σ̂²θ̂ = 2 ( ∂²χ²/∂θ² |θ̂ )⁻¹,

or for the graphical method we take the values of θ where

    χ²(θ̂ ± σ̂θ̂) = χ²min + 1.0.
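The graphical method can be illustrated with the simplest LS fit, a constant λ(x; θ) = θ (assumed numbers below). There θ̂ is the weighted mean with analytic variance 1/∑(1/σi²), and because χ²(θ) is an exact parabola, χ² rises by exactly 1.0 at θ̂ ± σ̂θ̂:

```python
import numpy as np

# Hypothetical repeated measurements of one quantity, with known errors
y = np.array([4.8, 5.1, 5.3, 4.9])
sigma = np.array([0.2, 0.3, 0.25, 0.2])

def chi2(theta):
    """chi^2 for the constant hypothesis lambda(x; theta) = theta."""
    return np.sum((y - theta) ** 2 / sigma ** 2)

w = 1.0 / sigma ** 2
theta_hat = np.sum(w * y) / np.sum(w)      # weighted mean = LS estimator
sigma_theta = 1.0 / np.sqrt(np.sum(w))     # analytic standard deviation

# Graphical method: chi^2 increases by 1 one standard deviation from the minimum
delta = chi2(theta_hat + sigma_theta) - chi2(theta_hat)   # equals 1.0 here
```

For a non-parabolic χ² (model nonlinear in θ) the Δχ² = 1 crossing points still define the interval, but it need not be symmetric about θ̂.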
Goodness-of-fit with least squares

The value of the χ² at its minimum is a measure of the level of agreement between the data and the fitted curve:

    χ²min = ∑i (yi − λ(xi; θ̂))² / σi².

It can therefore be employed as a goodness-of-fit statistic to test the hypothesized functional form λ(x; θ).

We can show that if the hypothesis is correct, then the statistic t = χ²min follows the chi-square pdf

    f(t; nd) = 1 / (2^(nd/2) Γ(nd/2)) · t^(nd/2 − 1) e^(−t/2),

where the number of degrees of freedom is

    nd = number of data points − number of fitted parameters.
Goodness-of-fit with least squares (2)
The chi-square pdf has an expectation value equal to the number of degrees of freedom, so if χ²min ≈ nd the fit is ‘good’.
More generally, find the p-value:

    p = ∫ f(t; nd) dt,  integrated from χ²min to ∞.
E.g. for the previous example with 1st order polynomial (line),
whereas for the 0th order polynomial (horizontal line),
This is the probability of obtaining a χ²min as high as the one we got, or higher, if the hypothesis is correct.
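The tail integral above is available directly in SciPy as the chi-square survival function. A sketch with illustrative numbers (not the values from the lecture's fits):

```python
from scipy.stats import chi2 as chi2_dist

chi2_min = 12.0   # hypothetical minimized chi^2
nd = 10           # degrees of freedom = data points - fitted parameters

# p = integral of the chi-square pdf from chi2_min to infinity
p_value = chi2_dist.sf(chi2_min, df=nd)   # survival function = 1 - cdf
```

With χ²min close to nd, as here, the p-value comes out of order a few tenths, i.e., a ‘good’ fit in the sense above; a tiny p-value would signal that the hypothesized form λ(x; θ) is disfavoured.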
Wrapping up lecture 10
Considering ML with Gaussian data led to the method of least squares.

Several caveats apply when the data are not (quite) Gaussian, e.g., histogram-based data.

Goodness-of-fit with LS is “easy” (but do not confuse a good fit with small statistical errors).
LS can be used for averaging measurements.
Next lecture: Interval estimation