G. Cowan Lectures on Statistical Data Analysis 1
Statistical Data Analysis: Lecture 10
1 Probability, Bayes’ theorem, random variables, pdfs
2 Functions of r.v.s, expectation values, error propagation
3 Catalogue of pdfs
4 The Monte Carlo method
5 Statistical tests: general concepts
6 Test statistics, multivariate methods
7 Significance tests
8 Parameter estimation, maximum likelihood
9 More maximum likelihood
10 Method of least squares
11 Interval estimation, setting limits
12 Nuisance parameters, systematic uncertainties
13 Examples of Bayesian approach
14 tba
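The method of least squares

Suppose we measure N values, $y_1, \ldots, y_N$, assumed to be independent Gaussian r.v.s with

$$E[y_i] = \lambda(x_i; \theta).$$

Assume known values of the control variable $x_1, \ldots, x_N$ and known variances

$$V[y_i] = \sigma_i^2.$$

We want to estimate θ, i.e., fit the curve to the data points.

The likelihood function is

$$L(\theta) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{(y_i - \lambda(x_i;\theta))^2}{2\sigma_i^2}\right].$$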
The method of least squares (2)
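The log-likelihood function is therefore

$$\ln L(\theta) = -\frac{1}{2}\sum_{i=1}^{N} \frac{(y_i - \lambda(x_i;\theta))^2}{\sigma_i^2} + \text{terms not depending on } \theta.$$

So maximizing the likelihood is equivalent to minimizing

$$\chi^2(\theta) = \sum_{i=1}^{N} \frac{(y_i - \lambda(x_i;\theta))^2}{\sigma_i^2}.$$

The minimum defines the least squares (LS) estimator $\hat{\theta}$.

Very often measurement errors are approximately Gaussian, and so ML and LS are essentially the same.

Often one minimizes χ² numerically (e.g. with the program MINUIT).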
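As a rough illustration of numerical χ² minimization, the sketch below fits a one-parameter model with a golden-section search. The data values, the model λ(x; θ) = θx, and all function names are hypothetical. Golden-section search is only a simple stand-in here; it is not the algorithm used by MINUIT.

```python
import math

def chi2(theta, x, y, sigma, model):
    """Sum of squared residuals, each divided by its variance."""
    return sum(((yi - model(xi, theta)) / si) ** 2
               for xi, yi, si in zip(x, y, sigma))

def minimize_1d(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c = c, d
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

def line_through_origin(xi, theta):
    """Hypothetical model: lambda(x; theta) = theta * x."""
    return theta * xi

# Hypothetical measurements with equal, known standard deviations.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
sigma = [0.2, 0.2, 0.2, 0.2]

theta_hat = minimize_1d(
    lambda t: chi2(t, x, y, sigma, line_through_origin), 0.0, 5.0)
chi2_min = chi2(theta_hat, x, y, sigma, line_through_origin)
print(theta_hat, chi2_min)
```

For this model the minimum can also be found in closed form, $\hat{\theta} = \sum x_i y_i / \sum x_i^2$ for equal $\sigma_i$, which gives a convenient cross-check of the numerical result.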
LS with correlated measurements
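If the $y_i$ follow a multivariate Gaussian with covariance matrix V,

$$g(\vec{y}; \vec{\lambda}, V) = \frac{1}{(2\pi)^{N/2}|V|^{1/2}} \exp\left[-\frac{1}{2}(\vec{y}-\vec{\lambda})^T V^{-1} (\vec{y}-\vec{\lambda})\right], \qquad \lambda_i = \lambda(x_i;\theta),$$

then maximizing the likelihood is equivalent to minimizing

$$\chi^2(\theta) = \sum_{i,j=1}^{N} (y_i - \lambda(x_i;\theta))\,(V^{-1})_{ij}\,(y_j - \lambda(x_j;\theta)).$$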
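The χ² for correlated measurements can be evaluated without forming $V^{-1}$ explicitly, by solving $Vz = r$ for the residual vector $r = y - \lambda$ and taking $r^T z$. The following is a minimal sketch under that approach; the helper names and the numbers are illustrative assumptions, not taken from the lecture.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a list of rows)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def chi2_correlated(y, lam, cov):
    """chi^2 = (y - lam)^T V^{-1} (y - lam) for covariance matrix cov."""
    r = [yi - li for yi, li in zip(y, lam)]
    z = solve(cov, r)  # z = V^{-1} r
    return sum(ri * zi for ri, zi in zip(r, z))

# Hypothetical example: two measurements with unit variance and
# correlation coefficient 0.5, each one unit above the prediction.
cov = [[1.0, 0.5],
       [0.5, 1.0]]
print(chi2_correlated([1.0, 1.0], [0.0, 0.0], cov))
```

With a diagonal covariance matrix the function reduces to the uncorrelated sum of $(y_i - \lambda_i)^2/\sigma_i^2$.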
Example of least squares fit
Fit a polynomial of order p:

$$\lambda(x; \theta_0, \ldots, \theta_p) = \sum_{n=0}^{p} \theta_n x^n.$$
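Because a polynomial is linear in its coefficients, the χ² minimum can be found by solving the normal equations rather than by iterative minimization. The sketch below assumes uncorrelated measurements with known $\sigma_i$; the data points and function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_polynomial(x, y, sigma, order):
    """Return LS estimates [theta_0, ..., theta_order] for
    lambda(x) = sum_n theta_n x^n with weights 1/sigma_i^2."""
    n_par = order + 1
    w = [1.0 / s ** 2 for s in sigma]
    # Normal equations M theta = b, obtained from d(chi2)/d(theta_j) = 0.
    m = [[sum(wi * xi ** (j + k) for xi, wi in zip(x, w))
          for k in range(n_par)] for j in range(n_par)]
    b = [sum(wi * yi * xi ** j for xi, yi, wi in zip(x, y, w))
         for j in range(n_par)]
    return solve(m, b)

def chi2_polynomial(theta, x, y, sigma):
    """chi^2 of the polynomial with coefficients theta."""
    total = 0.0
    for xi, yi, si in zip(x, y, sigma):
        lam = sum(t * xi ** n for n, t in enumerate(theta))
        total += ((yi - lam) / si) ** 2
    return total

# Hypothetical data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
sigma = [0.5, 0.5, 0.5, 0.5]

theta_line = fit_polynomial(x, y, sigma, order=1)   # straight line
theta_const = fit_polynomial(x, y, sigma, order=0)  # horizontal line
print(theta_line, chi2_polynomial(theta_line, x, y, sigma))
print(theta_const, chi2_polynomial(theta_const, x, y, sigma))
```

For high polynomial orders the normal equations become ill-conditioned, so this direct approach is only a reasonable choice for low orders.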
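Variance of LS estimators

In most cases of interest we obtain the variance in a manner similar to ML. E.g. for data ~ Gaussian we have

$$\chi^2(\theta) = -2\ln L(\theta) + \text{const}$$

and so

$$\hat{\sigma}^2_{\hat{\theta}} \approx 2\left[\frac{\partial^2\chi^2}{\partial\theta^2}\right]^{-1}_{\theta=\hat{\theta}}$$

or for the graphical method we take the values of θ where

$$\chi^2(\theta) = \chi^2_{\min} + 1.0$$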
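Both ways of obtaining the standard deviation can be compared on a one-parameter fit. The sketch below assumes a constant model λ(x; θ) = θ, for which χ² is exactly quadratic in θ, so the two methods should agree with the analytic value $(\sum_i 1/\sigma_i^2)^{-1/2}$. The data and function names are hypothetical.

```python
import math

def chi2_const(theta, y, sigma):
    """chi^2 for fitting a constant theta to the measurements y."""
    return sum(((yi - theta) / si) ** 2 for yi, si in zip(y, sigma))

def weighted_mean(y, sigma):
    """LS estimator of a constant: the 1/sigma^2-weighted mean."""
    w = [1.0 / s ** 2 for s in sigma]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def sigma_from_curvature(theta_hat, y, sigma, h=1e-3):
    """sigma^2 = 2 / (d^2 chi2 / d theta^2), by central differences."""
    d2 = (chi2_const(theta_hat + h, y, sigma)
          - 2.0 * chi2_const(theta_hat, y, sigma)
          + chi2_const(theta_hat - h, y, sigma)) / h ** 2
    return math.sqrt(2.0 / d2)

def sigma_from_delta_chi2(theta_hat, y, sigma):
    """Distance above theta_hat at which chi2 = chi2_min + 1 (bisection)."""
    target = chi2_const(theta_hat, y, sigma) + 1.0
    lo, hi = theta_hat, theta_hat + 1.0
    while chi2_const(hi, y, sigma) < target:  # widen until bracketed
        hi = theta_hat + 2.0 * (hi - theta_hat)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_const(mid, y, sigma) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - theta_hat

# Hypothetical measurements of the same quantity.
y = [10.2, 9.8, 10.5]
sigma = [0.3, 0.4, 0.5]

theta_hat = weighted_mean(y, sigma)
sig_curv = sigma_from_curvature(theta_hat, y, sigma)
sig_graph = sigma_from_delta_chi2(theta_hat, y, sigma)
print(theta_hat, sig_curv, sig_graph)
```

For models that are nonlinear in θ the χ² curve is not exactly parabolic, and the two estimates can differ.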
Two-parameter LS fit
Goodness-of-fit with least squares

The value of the χ² at its minimum is a measure of the level of agreement between the data and fitted curve:

$$\chi^2_{\min} = \sum_{i=1}^{N} \frac{(y_i - \lambda(x_i; \hat{\theta}))^2}{\sigma_i^2}.$$

It can therefore be employed as a goodness-of-fit statistic to test the hypothesized functional form λ(x; θ).

We can show that if the hypothesis is correct, then the statistic $t = \chi^2_{\min}$ follows the chi-square pdf,

$$f(t; n_d) = \frac{1}{2^{n_d/2}\,\Gamma(n_d/2)}\, t^{n_d/2 - 1} e^{-t/2},$$

where the number of degrees of freedom is

$n_d$ = number of data points - number of fitted parameters
Goodness-of-fit with least squares (2)
The chi-square pdf has an expectation value equal to the number of degrees of freedom, so if $\chi^2_{\min} \approx n_d$ the fit is ‘good’.

More generally, find the p-value:

$$p = \int_{\chi^2_{\min}}^{\infty} f(t; n_d)\, dt.$$
E.g. for the previous example with 1st order polynomial (line),
whereas for the 0th order polynomial (horizontal line),
The p-value is the probability of obtaining a $\chi^2_{\min}$ as high as the one we got, or higher, if the hypothesis is correct.
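The p-value integral has a closed form for integer degrees of freedom, built up with the recurrence $Q(n_d + 2) = Q(n_d) + (t/2)^{n_d/2} e^{-t/2} / \Gamma(n_d/2 + 1)$, starting from $Q(1) = \mathrm{erfc}(\sqrt{t/2})$ or $Q(2) = e^{-t/2}$. The sketch below uses only the Python standard library. The function name is hypothetical and the inputs are illustrative, not the values from the lecture's example.

```python
import math

def chi2_pvalue(chi2_min, ndof):
    """Upper-tail probability of the chi-square pdf with integer ndof.

    Uses math.gamma, so it is limited to ndof below roughly 340.
    """
    if ndof < 1 or chi2_min < 0.0:
        raise ValueError("need ndof >= 1 and chi2_min >= 0")
    z = 0.5 * chi2_min
    if ndof % 2 == 1:
        k = 1
        q = math.erfc(math.sqrt(z))  # exact result for 1 degree of freedom
    else:
        k = 2
        q = math.exp(-z)             # exact result for 2 degrees of freedom
    while k < ndof:
        a = 0.5 * k
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        k += 2
    return q

# Illustrative inputs: chi2_min close to ndof gives an unremarkable p-value,
# chi2_min far above ndof gives a very small one.
print(chi2_pvalue(3.0, 3))
print(chi2_pvalue(40.0, 4))
```

A small p-value indicates poor agreement between the data and the hypothesized functional form; it says nothing by itself about the size of the statistical errors on the fitted parameters.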
Goodness-of-fit vs. statistical errors
Goodness-of-fit vs. stat. errors (2)
LS with binned data
LS with binned data (2)
LS with binned data — normalization
LS normalization example
Using LS to combine measurements
Combining correlated measurements with LS
Example: averaging two correlated measurements
Negative weights in LS average
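A minimal sketch of how a negative weight can arise, assuming the standard LS average of two measurements $y_1 \pm \sigma_1$ and $y_2 \pm \sigma_2$ with correlation coefficient ρ. Minimizing $\chi^2(\lambda) = (\vec{y} - \lambda\vec{1})^T V^{-1} (\vec{y} - \lambda\vec{1})$ gives weights $w_1 = (\sigma_2^2 - \rho\sigma_1\sigma_2)/(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)$ and $w_2 = 1 - w_1$. The function name and the numbers are illustrative and are not taken from the lecture.

```python
import math

def average_two_correlated(y1, s1, y2, s2, rho):
    """LS average of two correlated measurements.

    Returns (mean, sigma_mean, w1, w2). The weight of the less precise
    measurement becomes negative when rho exceeds the ratio of the
    smaller to the larger standard deviation.
    """
    denom = s1 ** 2 + s2 ** 2 - 2.0 * rho * s1 * s2
    if denom <= 0.0:
        raise ValueError("degenerate covariance matrix")
    w1 = (s2 ** 2 - rho * s1 * s2) / denom
    w2 = 1.0 - w1
    mean = w1 * y1 + w2 * y2
    var = (1.0 - rho ** 2) * s1 ** 2 * s2 ** 2 / denom
    return mean, math.sqrt(var), w1, w2

# Uncorrelated case: the familiar equal-weight average.
print(average_two_correlated(10.0, 1.0, 11.0, 1.0, 0.0))

# Strong positive correlation with rho > s1/s2: w2 is negative and
# the average lies outside the interval spanned by y1 and y2.
print(average_two_correlated(10.0, 1.0, 11.0, 2.0, 0.8))
```

In the second call the average falls below both input values, which is the behaviour a negative weight produces.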
Wrapping up lecture 10
Considering ML with Gaussian data led to the method of Least Squares.
Several caveats when the data are not (quite) Gaussian, e.g., histogram-based data.
Goodness-of-fit with LS “easy” (but do not confuse good fit with small stat. errors)
LS can be used for averaging measurements.
Next lecture: Interval estimation