ES07
These slides can be found at http://www.hep.lu.se/staff/stenlund/Somethings.ppt (optimized for Windows)
The Gaussian distribution
f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²))
f(x)dx is the probability that an observation will fall between x − dx/2 and x + dx/2
Normally the Gaussian distribution is standardized by putting μ = 0 and σ = 1.
φ(λ) = (1/√(2π)) exp(−λ²/2) is called the frequency function.
Note that φ(λ) = φ(−λ).
The distribution function Φ(λ) is the primitive function of the frequency function.
The distribution function cannot be calculated analytically, but it is tabulated in most standard books, or it can be approximated.
Note that Φ(−λ) = 1 − Φ(λ) and that F(x) = Φ((x − μ)/σ).
The probability to obtain a value between λ1 and λ2 (λ1 < λ2) is given by Φ(λ2) − Φ(λ1).
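Φ is not part of the standard libraries under that name, but it can be expressed through the error function as Φ(λ) = ½(1 + erf(λ/√2)). A minimal sketch in Python (the language choice is ours, not the slides'):

```python
import math

def Phi(lam):
    # Standard normal distribution function via the error function:
    # Phi(lam) = (1 + erf(lam / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(lam / math.sqrt(2.0)))

# Probability of an observation falling between lambda1 = 1 and lambda2 = 2:
p = Phi(2.0) - Phi(1.0)
print(round(p, 4))   # → 0.1359
```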
An approximation which can be used for Φ(λ) is:
Φ(λ) = 1 − φ(λ)(a1·t + a2·t² + a3·t³) + ε(λ)   (λ ≥ 0)
where t = (1 + pλ)⁻¹
with p = 0.33267
a1 = 0.4361836
a2 = −0.1201676
a3 = 0.9372980
giving |ε(λ)| < 1·10⁻⁵
Expectation value, variance and covariance
E[X] = (1/N) Σ xi
V[X] = E[(X − E[X])²] = E[X²] − (E[X])²
Cov[X,Y] = E[(X − E[X])(Y − E[Y])]
The sum is over the whole population.
Standard deviation: σ = √V[X]
Variance of the population mean value
V[X̄] = V[X]/N
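The relation V[X̄] = V[X]/N can be checked by simulation: draw many samples of size N, compute their means, and look at the spread of those means. A sketch in Python (the sample size and seed are our choices):

```python
import random

random.seed(1)
N = 25          # sample size
M = 20000       # number of repeated samples

# Draw M samples of size N from a unit Gaussian, so V[X] = 1
means = []
for _ in range(M):
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    means.append(sum(sample) / N)

mbar = sum(means) / M
v_mean = sum((m - mbar) ** 2 for m in means) / M
print(v_mean)   # close to V[X]/N = 1/25 = 0.04
```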
Expectation value and variance from a sample
Estimates with correct expectation value are thus given by:
x̄ = (1/N) Σ xi   and   s² = (1/(N − 1)) Σ (xi − x̄)²
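The two estimates above translate directly into code; note the division by N − 1 in the variance, which makes the estimate unbiased. A sketch in Python with made-up data:

```python
def sample_mean(xs):
    # x-bar = (1/N) * sum(x_i)
    return sum(xs) / len(xs)

def sample_variance(xs):
    # s^2 = (1/(N-1)) * sum((x_i - x-bar)^2); the N-1 gives E[s^2] = V[X]
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

xs = [1.1, 0.9, 1.3, 0.7, 1.0]   # illustrative data
print(sample_mean(xs), sample_variance(xs))
```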
The variance of the variance
This leads to an estimate of the “error” in the estimate of the standard deviation of a distribution.
Beware! V[V[X]] is normally a small positive number, but the terms used in its calculation are normally very large. High precision is needed in the calculations.
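The precision warning applies already to the variance itself: the "small difference of large terms" problem is easy to provoke. A sketch in Python contrasting the single-pass formula Σx² − (Σx)²/N, which cancels catastrophically, with the two-pass formula that subtracts the mean first (the data values are our illustrative choice):

```python
def var_naive(xs):
    # Single-pass formula: (sum(x^2) - (sum x)^2 / N) / (N - 1)
    n = len(xs)
    s, s2 = sum(xs), sum(x * x for x in xs)
    return (s2 - s * s / n) / (n - 1)

def var_twopass(xs):
    # Two-pass formula: subtract the mean first, then square
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

# Small spread around a huge offset: the true variance is 1e-4
xs = [1e8 - 0.01, 1e8, 1e8 + 0.01]
print(var_twopass(xs))   # accurate, close to 1e-04
print(var_naive(xs))     # wrong: catastrophic cancellation
```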
Parameter fitting with the maximum likelihood method
If we know that the sample we want to study comes from a certain distribution, e.g. a Gaussian with unknown parameters, we can fit those using the maximum likelihood method.
Calculate the probability to obtain exactly the sample you have as a function of the parameters, and maximize this probability:
L(μ,σ) = Π f(xi)   or   l(μ,σ) = Σ ln f(xi)
The “error” Δp of a parameter p is estimated by
l(p ± Δp) = lmax − ½
The l-function is usually close to a parabola near its maximum, so
l(p ± Δp) = lmax − ½
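For a Gaussian sample with σ fixed, l(μ) is an exact parabola, so the half-unit drop can be checked directly: l falls by exactly ½ at μ̂ ± σ̂/√N. A sketch in Python (the sample parameters and seed are our choices):

```python
import math, random

random.seed(7)
xs = [random.gauss(5.0, 2.0) for _ in range(400)]
N = len(xs)

mu_hat = sum(xs) / N
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / N)   # ML estimate

def loglike(mu):
    # l(mu) = sum of ln f(x_i), with sigma fixed at its ML estimate
    return sum(-0.5 * ((x - mu) / sigma_hat) ** 2
               - math.log(sigma_hat * math.sqrt(2.0 * math.pi)) for x in xs)

# l(mu) peaks at mu_hat; moving sigma_hat/sqrt(N) away drops it by 1/2
l_max = loglike(mu_hat)
delta = sigma_hat / math.sqrt(N)
print(l_max - loglike(mu_hat + delta))   # 0.5 up to rounding
```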
χ² fitting and χ² testing
This method needs binning of the data.
In each bin we have (xi)min, (xi)max, yi = ni/N and σi, which can be taken as √(ni)/N as long as ni ≥ 5 (no less than five observations in a bin) and ni ≪ N.
Minimize the sum S:
S = Σ (yi − yith)²/σi²
yith is calculated from the tested distribution. If this is a Gaussian with parameters μG and σG we have
yith = Φ(((xi)max − μG)/σG) − Φ(((xi)min − μG)/σG)
S can now be minimized with respect to the parameters to be fitted.
When Smin is found, the “error” of a parameter can be estimated from (cf. the maximum likelihood method)
S(p ± Δp) = Smin + 1
S is in many cases approximately of parabolic shape close to the minimum.
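The whole procedure can be sketched end to end in Python: bin Gaussian data into 7 bins, form yi and σi, build yith from Φ over the bin edges, and minimize S. The true parameters, bin edges, grid, and seed are our illustrative choices; a real fit would use a proper minimizer rather than this crude grid search:

```python
import math, random

def Phi(lam):
    return 0.5 * (1.0 + math.erf(lam / math.sqrt(2.0)))

random.seed(3)
data = [random.gauss(10.0, 2.0) for _ in range(1000)]
N = len(data)

# Bin the data: 7 equal-width bins spanning most of the sample
edges = [4.0 + i * 1.75 for i in range(8)]
counts = [sum(1 for x in data if lo <= x < hi)
          for lo, hi in zip(edges, edges[1:])]
y = [n / N for n in counts]
sig = [math.sqrt(n) / N for n in counts]   # valid while n >= 5 and n << N

def S(mu, sigma):
    # Chi-square sum: y_th per bin is the Gaussian probability content
    s = 0.0
    for (lo, hi), yi, si in zip(zip(edges, edges[1:]), y, sig):
        y_th = Phi((hi - mu) / sigma) - Phi((lo - mu) / sigma)
        s += ((yi - y_th) / si) ** 2
    return s

# Crude grid search for Smin; recovers mu, sigma near (10, 2)
best = min((S(mu, sg), mu, sg)
           for mu in [9.0 + 0.02 * i for i in range(100)]
           for sg in [1.5 + 0.02 * j for j in range(50)])
print(best)
```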
S is χ² distributed with ν degrees of freedom. The number of degrees of freedom is the number of bins minus the number of parameters that are fitted.
In the previous example we had 7 bins and two parameters, giving ν = 5.

S (ν = 5)   P(χ² > S)   Meaning
1.6         0.90125     in only about 10 % of the cases a smaller S-value would be obtained
3.2         0.66918     in about 33 % of the cases a smaller S-value would be obtained
5.0         0.41588     in about 42 % of the cases a larger S-value would be obtained
7.8         0.16761     in about 17 % of the cases a larger S-value would be obtained
9.8         0.08110     in only about 8 % of the cases a larger S-value would be obtained

Generally we expect S/ν to be close to 1 if the fluctuations in the data are of purely statistical origin and if the data is described by the distribution in question.
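The tabulated values can be reproduced without special libraries by integrating the χ² frequency function numerically. A sketch in Python using Simpson's rule (the step count is our choice; a statistics library would normally provide this as a survival function):

```python
import math

def chi2_sf(s, nu, steps=10000):
    # P(chi^2_nu > s): one minus the CDF, via Simpson integration of the pdf
    def pdf(x):
        if x <= 0.0:
            return 0.0
        return (x ** (nu / 2.0 - 1.0) * math.exp(-x / 2.0)
                / (2.0 ** (nu / 2.0) * math.gamma(nu / 2.0)))
    h = s / steps
    total = pdf(0.0) + pdf(s)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - total * h / 3.0

print(round(chi2_sf(5.0, 5), 5))   # cf. the table row for S = 5.0 (≈ 0.41588)
```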
Confidence levels and confidence intervals
Assume that we have estimated a parameter p and found that p = 1.23 with σp = 0.11.
Let's say that we want to construct an interval that covers the true value of p with 90 % confidence. This means that we leave out 5 % on each side.
Start by finding λ so that Φ(λ) = 0.95: λ = 1.6449
pmax = 1.6449 · 0.11 + 1.23 = 1.41 and
pmin = −1.6449 · 0.11 + 1.23 = 1.05
We have found the two-sided confidence interval of our estimate of p on the 90 % confidence level to be
1.05 – 1.41
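The only nontrivial step is inverting Φ to get λ from the desired confidence level. A sketch in Python that inverts the erf-based Φ by bisection and reproduces the interval above:

```python
import math

def Phi(lam):
    return 0.5 * (1.0 + math.erf(lam / math.sqrt(2.0)))

def Phi_inv(q, lo=-10.0, hi=10.0):
    # Invert the (monotone) distribution function by bisection
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p, sigma_p = 1.23, 0.11
lam = Phi_inv(0.95)   # 1.6449: leaves 5 % in each tail for 90 % coverage
print(round(p - lam * sigma_p, 2), round(p + lam * sigma_p, 2))   # 1.05 1.41
```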
If we want to state that p < x with some confidence we can construct a one-sided confidence region.
Let's say that we want to construct a region that covers the true value of p with 99 % confidence.
Start by finding λ so that Φ(λ) = 0.99: λ = 2.3263
pmax = 2.3263 · 0.11 + 1.23 = 1.49
We have found the one-sided confidence region of our estimate of p on the 99 % confidence level to be
p < 1.49
Hypothesis testing (simple case)
Let's again assume that we have estimated a parameter p and found that p = 1.23 with σp = 0.11.
Now we have a hypothesis stating that p = 1.4.
We now ask ourselves with what probability the hypothesis is wrong.
We calculate λ = (1.4 − 1.23)/0.11 = 1.5455, and the probability is given by Φ(λ) = 0.939, i.e. we can state with 94 % confidence that the hypothesis is wrong.
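This test is a one-liner once Φ is available; a sketch in Python reproducing the numbers above:

```python
import math

def Phi(lam):
    return 0.5 * (1.0 + math.erf(lam / math.sqrt(2.0)))

p_est, sigma_p, p_hyp = 1.23, 0.11, 1.4

lam = (p_hyp - p_est) / sigma_p
print(round(lam, 4))        # 1.5455
print(round(Phi(lam), 3))   # 0.939
```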