ME - 495 Mechanical and Thermal Systems Lab
Fall 2011
Chapter 3: Assessing and Presenting
Experimental Data
Professor: Sam Kassegne, PhD, PE
Lecture Covers
Terminologies (error, precision, resolution, etc.)
Statistical Theory Based on Population
Statistical Theory Based on Sample (Student’s ‘t’, Chi squared, etc.)
Homework Problems
1) TERMINOLOGIES
ERRORS
Error = ε = Xm – Xtrue: difference between measured and true values
  Can never be calculated exactly
Bias/Systematic errors: errors that occur the same way each time a measurement is taken
  Cannot be treated using statistics
  Commonly a zero offset or scale error
Precision/Random errors: different for each measurement but average to zero
COMBINED ERROR
[Figure: two cases of combined error. Case 1: bias error > typical precision error. Case 2: typical precision error > bias error.]
CLASSIFICATION OF ERROR
Bias or systematic error
  Calibration errors
  Certain consistently recurring human errors
  Certain errors caused by defective equipment
  Loading errors
  Limitations of system resolution
Precision or random error
  Human error
  Errors caused by disturbance of the equipment
  Errors caused by fluctuating experimental conditions
  Errors derived from insufficient measuring-system sensitivity
OTHER ERRORS
Illegitimate error
  Blunders and mistakes during experimentation
  Computational errors in data reduction
Errors that are sometimes bias and sometimes precision error
  Errors from instrument backlash, friction, hysteresis
  Errors from calibration drift, variation in test and environmental conditions
Introduction to UNCERTAINTY
Uncertainty: $U_x = \sqrt{B_x^2 + P_x^2}$
Where: Bx = bias uncertainty
Px = precision uncertainty
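As a small illustration, the sketch below combines a bias uncertainty and a precision uncertainty, assuming the usual root-sum-square form Ux = sqrt(Bx^2 + Px^2). The function name `total_uncertainty` and the numerical values are hypothetical and are not taken from the lecture.

```python
import math


def total_uncertainty(bias, precision):
    """Combine bias and precision uncertainty by root-sum-square."""
    return math.hypot(bias, precision)


# Hypothetical values, in the same units as the measured quantity
B_x = 0.5   # bias uncertainty (assumed)
P_x = 1.2   # precision uncertainty (assumed)

U_x = total_uncertainty(B_x, P_x)
print(f"U_x = {U_x:.3f}")
```

Because the two terms add in quadrature, the larger of the two dominates the total.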
ESTIMATING PRECISION UNCERTAINTY
Distribution of error: probability that an error of given size will occur
Population: finite or infinite group from which samples are drawn
Sample: limited set of measurements
Gaussian or Normal Distribution (bell curve)
Probable difference/Confidence Interval: estimate of precision uncertainty of the measured sample
SAMPLE vs. POPULATION
A sample has a defined number of members (n)
A population may have: (n) members or may be infinite
A sample is randomly taken from a population of indefinite size
PROBABILITY DISTRIBUTIONS
Histogram (Tossed Dice)
Distribution Types:
  Gaussian (standard)
  Student’s t-distribution
  χ2 distribution (lower right)
2) THEORY BASED ON POPULATION
Probability Density Function (PDF)
Describes random precision error of sampling
Probability = area under curve
$P(x_1 \le x \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$
GAUSSIAN DISTRIBUTION
For Gaussian Distribution:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
x = magnitude of particular measurement
μ = mean value of entire population
σ = standard deviation of entire population
μ ≈ Xavg = (x1+x2+…+xn)/n for large n
For a measurement (x) the deviation (d): d = x - μ
GAUSSIAN DISTRIBUTION (ctd.)
σ = standard deviation: $\sigma = \sqrt{\frac{d_1^2 + d_2^2 + \dots + d_n^2}{n}}$
Probability estimates
Standard curve: $f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$, where $z = \frac{x-\mu}{\sigma}$
Probability table: z vs. probability in one direction
x = magnitude of a particular measurement
μ = the mean value of the entire population
σ = the standard deviation of the entire population
Question:
1) What is the area under the curve between z = -1.43 and z=1.43?
2) What range of z will contain 90% of the data?
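The two questions can be checked numerically instead of reading a z-table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Question 1: area under the curve between z = -1.43 and z = +1.43
area = std_normal.cdf(1.43) - std_normal.cdf(-1.43)

# Question 2: symmetric range +/- z that contains 90% of the data,
# which leaves 5% in each tail, so we need the 95th percentile
z_90 = std_normal.inv_cdf(0.95)

print(f"area between z = -1.43 and z = 1.43: {area:.4f}")
print(f"90% of the data lies within z = +/- {z_90:.3f}")
```

The area comes out at roughly 0.847, and the 90% range is roughly z = ±1.645.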
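3) THEORY BASED ON SAMPLE
Mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Standard Deviation: $S_x = \sqrt{\frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + \dots + (X_n-\bar{X})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}}$
Degree of freedom ν = n-1; n = number of samples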
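To make the sample mean and sample standard deviation concrete, here is a short sketch that computes both for a made-up data set (the readings are hypothetical). It evaluates the standard deviation with n - 1 in the denominator, once from the deviations and once from the sum-of-squares shortcut, and compares against `statistics.stdev`.

```python
import math
import statistics

# Hypothetical sample of n = 8 readings
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

# Sample mean
x_bar = sum(data) / n

# Sample standard deviation from the deviations, with n - 1 degrees of freedom
s_dev = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# Same quantity from the shortcut form: (sum of x^2 - n * x_bar^2) / (n - 1)
s_short = math.sqrt((sum(x * x for x in data) - n * x_bar ** 2) / (n - 1))

# Library value for comparison (statistics.stdev also divides by n - 1)
s_lib = statistics.stdev(data)

print(f"mean = {x_bar:.3f}, S_x = {s_dev:.3f}")
```

The shortcut form is convenient by hand but can lose precision in floating point when the mean is large compared with the spread.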
Useful Terms in Theory Based on Samples
Central Limit Theorem - For a large sample size n, the distribution of the sample mean values is Gaussian (Eq 1).
Confidence Interval (c%) - used to indicate the reliability of an estimate. How likely the interval is to contain the parameter is determined by the confidence level. For a population with unknown mean μ and known standard deviation σ, a confidence interval for the population mean, based on a simple random sample of size n, is given by Eq 2.
Remember that: Eq 3 & Eq 4.
Eq 1: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
Eq 2: $\bar{x} \pm z_{c/2}\,\sigma/\sqrt{n}$
Eq 3: $z = \frac{\bar{x}-\mu}{\sigma_{\bar{x}}} = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$
Eq 4: $\pm z_{c/2}\,\sigma/\sqrt{n}$ (half-width of the interval)
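A confidence interval for the mean with known population standard deviation can be sketched as follows. The sample size, sample mean, and σ below are hypothetical, and `z_confidence_interval` is an illustrative helper name, not something from the lecture.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (low, high, half_width) for the mean, with known sigma."""
    # Two-sided interval: split the excluded area equally between the tails
    tail = (1.0 - confidence) / 2.0
    z = NormalDist().inv_cdf(1.0 - tail)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width, half_width


# Hypothetical numbers: 36 readings, sample mean 10.0, known sigma 0.6
low, high, half = z_confidence_interval(x_bar=10.0, sigma=0.6, n=36, confidence=0.95)
print(f"mean = 10.0 +/- {half:.3f}  ->  [{low:.3f}, {high:.3f}] at 95%")
```

Quadrupling the number of samples halves the width of the interval, since the half-width scales with 1/sqrt(n).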
A) Student’s t-statistic
Useful for estimating confidence interval for small samples (n < 30)
Assumes the underlying population satisfies the Gaussian distribution
Considers the distribution of a quantity ‘t’ for a specific confidence interval
$t = \frac{\bar{x}-\mu}{S_x/\sqrt{n}}$
x̄ = sample average
μ = true population average
Sx = standard deviation of sample
n = number of samples
Student’s t-statistic
Probability for the t-statistic approaches Gaussian as n → ∞
t-statistic eliminates probable outliers and what remains must center around the true value
(ν = n-1 is called the degree of freedom)
Student’s t-statistic Example 3.6 on pg 74
$P_{\bar{x}} = t_{\alpha/2,\nu}\,\frac{S_x}{\sqrt{n}}$
Px̄ = precision uncertainty in the value x avg for a given confidence level
tα/2,ν = t-distribution for a confidence percent (α/2) and degree of freedom (ν) (from table 3.6 on pg 73)
Sx = standard deviation of sample
n = number of samples
α designates the area (hence probability) outside a specified value of t
Conversely, the probability that a given value will lie inside of the limits ±t is 1-α.
n = 14, x̄ = 1.009 oz, Sx = 0.04178, ν = 13.
From Table 3.6, t0.025,13 = 2.160
two-sided interval = +/- t0.025,13 Sx/sqrt(n) = +/- 0.02412.
μ = 1.009 +/- 0.024 oz with CI of 95%
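The arithmetic of this example can be reproduced in a few lines. This sketch takes the t value 2.160 directly from the table cited in the example rather than computing it, since the Python standard library has no t-distribution; the variable names are illustrative.

```python
import math

n = 14             # number of samples
x_bar = 1.009      # sample average, oz
S_x = 0.04178      # sample standard deviation
nu = n - 1         # degrees of freedom
t_table = 2.160    # t for alpha/2 = 0.025 and nu = 13, read from the table

# Precision uncertainty of the mean at 95% confidence
P_x = t_table * S_x / math.sqrt(n)

print(f"nu = {nu}")
print(f"mean = {x_bar:.3f} +/- {P_x:.3f} oz (95% confidence)")
```

With SciPy available, the table lookup could be replaced by `scipy.stats.t.ppf(0.975, 13)`, which is an optional dependency and is not assumed here.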
B) CHI SQUARED TEST
* The χ2 test measures ‘goodness of fit’
* Tests the probability that a sample of data is or is not normally distributed.
$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
Total area under curve = unity.
Curve starts at χ2 = 0
The curve is not symmetrical. As the number of data increases, it becomes similar to the normal distribution.
CHI SQUARED TEST – measures the ‘goodness of fit’
The χ2 goodness of fit test
$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
Oi = the number of observed data in a bin
Ei = the number of expected data in a bin
n = the number of data bins, or “class intervals”
ν = degrees of freedom = number of bins - 1
A large discrepancy between Oi and Ei gives a large χ2
CHI SQUARED EXAMPLE
For a group of 50 trees on a set area of land, to determine if the trees prefer dry soil we examine the following:
If 50% of the land is dry, 30% is moist, and 20% is wet, then if there were no preference, we would expect:
25 trees on dry soil; 15 on moist; and 10 on wet
If actually: 40 on dry; 8 on moist; 2 on wet
(40-25)^2/25 = 15^2/25 = 9 | (8-15)^2/15 = 7^2/15 ≈ 3.27 | (2-10)^2/10 = 8^2/10 = 6.4
χ2 = (9 + 3.27 + 6.4) ≈ 18.67
Examining the previous graph shows that it is extremely unlikely that there is no soil preference, as our Chi squared value falls far above the questionable probability. (Note that the number of bins is n = 3, so ν = 2.)
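The tree example can be checked with a short script. The sketch below recomputes the expected counts and the χ2 sum; evaluated exactly, the middle term is 49/15 ≈ 3.27, so the total is about 18.67. For the tail probability it relies on the fact that, for exactly 2 degrees of freedom, the area to the right of a χ2 value equals exp(-χ2/2); that shortcut does not hold for other degrees of freedom.

```python
import math

observed = [40, 8, 2]             # trees counted on dry, moist, wet soil
land_fraction = [0.5, 0.3, 0.2]   # share of land that is dry, moist, wet

total = sum(observed)
# Expected counts if the trees had no soil preference
expected = [f * total for f in land_fraction]

# Chi-squared statistic: sum of (O - E)^2 / E over the bins
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of bins - 1
nu = len(observed) - 1

# Right-tail probability; this closed form is valid only for nu = 2
p_value = math.exp(-chi2 / 2.0)

print(f"chi2 = {chi2:.2f}, nu = {nu}, tail probability = {p_value:.1e}")
```

A tail probability this small (below 0.1%) is consistent with the conclusion that a no-preference distribution is extremely unlikely.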