9/23/2009
Systematic Error
Illustration of Bias
Sources of Systematic Errors
Instrument Errors
Method Errors
Personal
– Prejudice
– Preconceived notion of “true” value
– Number bias
Prefer 0/5
Small over large
Even over odd
Effects of Systematic Errors
Constant Errors
– Become more serious as the size of the measurement gets smaller
Proportional Errors
– Interfering contaminants
If the amount of contaminant becomes larger, the signal becomes larger.
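The contrast between constant and proportional errors can be sketched numerically. The sample sizes, the 0.5 mg constant error, and the 1% proportional error below are hypothetical values chosen only for illustration, and `relative_error_percent` is a helper name introduced here.

```python
def relative_error_percent(true_value, constant_error=0.0, proportional_error=0.0):
    """Relative error (%) of a measurement carrying a constant and/or proportional error.

    proportional_error is a fraction of the true value (0.01 means 1%).
    """
    measured = true_value + constant_error + proportional_error * true_value
    return 100.0 * (measured - true_value) / true_value

# Hypothetical sample sizes in mg
for sample_mg in (500.0, 50.0, 5.0):
    const = relative_error_percent(sample_mg, constant_error=0.5)
    prop = relative_error_percent(sample_mg, proportional_error=0.01)
    print(f"{sample_mg:6.1f} mg  constant: {const:5.2f}%  proportional: {prop:5.2f}%")
```

Under these assumptions the constant error grows in relative terms as the sample shrinks, while the proportional error stays at the same percentage.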
Detection of Systematic and Personal Errors
Calibration
Care and Self-discipline
– Instrument readings
– Notebook entries
– Calculations
– Physical disabilities--color blindness
Bias
Difficult to detect
– Analyze standard samples
– Do an independent analysis
– Determine a blank
– Vary the sample size
Applying Statistics to Data Evaluation
Gross error or segment of population?
Define the confidence interval.
Find the number of replicates necessary to ensure that the mean falls within a predetermined interval.
What is the probability that an experimental mean and a “true” value, or two experimental means, are different?
Calibrate
Gross Errors
The Q-test: rejecting outliers
$$Q_{exp} = \frac{d}{w} = \frac{|x_q - x_n|}{\text{range}}$$
where $x_q$ is the questionable result and $x_n$ is its nearest neighbor.
The Q -test: An Example
A calcite sample yields the following data
for the determination of calcium as CaO:
55.95, 56.00, 56.04, 56.08, and 56.23.
Should we reject 56.23?
$$Q_{exp} = \frac{|x_q - x_n|}{\text{range}} = \frac{56.23 - 56.08}{56.23 - 55.95} = 0.54$$
The Q-test: The Q-Table (5-1)
Number of               Qcrit
Observations    90%     95%     99%
3               0.941   0.970   0.994
4               0.765   0.829   0.926
5               0.642   0.710   0.821
6               0.560   0.625   0.740
7               0.507   0.568   0.680
What’s the criterion?
If Qexp > Qcrit , reject.
If Qexp < Qcrit , accept.
Qexp = 0.54; Qcrit = 0.64, so accept.
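The calcite example can be checked with a short script. This is a minimal sketch: the function name `q_test` is introduced here for illustration, the critical values are the 90% column of the Q-table, and the function assumes the suspect value is whichever end of the sorted data lies farther from its neighbor.

```python
# Critical Q values at 90% confidence, copied from the Q-table
Q_CRIT_90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560, 7: 0.507}

def q_test(data, q_crit_table=Q_CRIT_90):
    """Return (suspect, q_exp, q_crit, reject) for the most extreme value in data."""
    values = sorted(data)
    spread = values[-1] - values[0]        # w: the range of the data
    gap_low = values[1] - values[0]        # d if the lowest value is the suspect
    gap_high = values[-1] - values[-2]     # d if the highest value is the suspect
    if gap_high >= gap_low:
        suspect, gap = values[-1], gap_high
    else:
        suspect, gap = values[0], gap_low
    q_exp = gap / spread
    q_crit = q_crit_table[len(values)]
    return suspect, q_exp, q_crit, q_exp > q_crit

suspect, q_exp, q_crit, reject = q_test([55.95, 56.00, 56.04, 56.08, 56.23])
print(f"suspect={suspect}, Q_exp={q_exp:.2f}, Q_crit={q_crit}, reject={reject}")
```

The lookup only covers 3 to 7 observations, matching the table; other sample sizes would need additional critical values.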
Can we reject data?
Blind application of statistical tests is no better than doing nothing.
Use good judgement based on experience.
If you know that something went wrong with a sample and the sample produces an
outlier, then rejection may be warranted.
Be cautious about rejecting data for any reason.
Recommendations
Keep good records and examine the data carefully.
If possible, estimate the precision of the method.
Repeat the analysis if time and sample are available. Compare with first data.
If not feasible, apply the Q -test.
If the Q-test indicates retention, consider reporting the median.
The median allows inclusion of all of the data without undue influence from the outlier.
The median of a set of 3 measurements from a normal distribution gives a better estimate than the mean of the remaining 2 values after an outlier is rejected.
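As a small illustration of the median recommendation, the calcite data from the Q-test example can be summarized both ways. This sketch only uses Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

# % CaO results from the Q-test example, with the suspect value 56.23 retained
calcite = [55.95, 56.00, 56.04, 56.08, 56.23]

mean_all = statistics.mean(calcite)
median_all = statistics.median(calcite)
print(f"mean = {mean_all:.2f}, median = {median_all:.2f}")
```

The median sits closer to the four clustered results than the mean does, because the suspect value pulls only the mean upward.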
Confidence Limits and Intervals
Confidence limits are limits around an experimentally determined mean within which the true mean lies with a given degree of probability.
The confidence interval is the interval around the mean defined by the confidence limits.
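Confidence limits if s is a good estimate of σ
$$\text{CL for } \mu = x \pm z\sigma \quad \text{(single measurement)}$$
$$\text{CL for } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} \quad \text{(mean of } N \text{ measurements)}$$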
50% Confidence Limits
99% Confidence Limits
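Confidence limits if s is not a good estimate of σ
$$\text{Student's } t: \quad t = \frac{x - \mu}{s} \quad \text{(analogous to } z \text{)}$$
$$\text{CL for } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad \text{(mean of } N \text{ measurements)}$$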
Values of Student's t
Degrees of          Probability Level
Freedom     90%     95%     99%     99.8%
1           6.31    12.7    63.7    318
2           2.92    4.30    9.92    22.3
3           2.35    3.18    5.84    10.2
4           2.13    2.78    4.60    7.17
5           2.02    2.57    4.03    5.89
∞ (z)       1.64    1.96    2.58    3.09
Finding the Confidence Interval: An Example
Determination of the alcohol content in blood
gives the following data: % C2H5OH: 0.084,
0.089, and 0.079.
(a) If the precision of the method is unknown,
find the 95% confidence limits of the mean.
(b) Perform the same calculation if s is a good estimate of σ, with
s = 0.0050% C2H5OH.
(How could we determine s?)
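(a) σ unknown (use t)
$$\bar{x} = 0.084\ \%\ \mathrm{C_2H_5OH}, \qquad s = 0.0050\ \%\ \mathrm{C_2H_5OH}$$
$$95\%\ \text{CL} = \bar{x} \pm \frac{ts}{\sqrt{N}} = 0.084 \pm \frac{(4.30)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.012\ \%\ \mathrm{C_2H_5OH}$$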
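Part (a) can be reproduced with the standard library. This is a sketch under stated assumptions: `t_confidence_limits` is a name introduced here, and the lookup holds only the two-sided 95% column of the Student's t table for 1 to 5 degrees of freedom.

```python
import math
import statistics

# Two-sided 95% values of Student's t, copied from the table (d.f. -> t)
T_95 = {1: 12.7, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57}

def t_confidence_limits(data, t_table=T_95):
    """Return (mean, half_width) so that the limits are mean +/- half_width."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)             # sample standard deviation, N - 1 d.f.
    half_width = t_table[n - 1] * s / math.sqrt(n)
    return mean, half_width

mean, half_width = t_confidence_limits([0.084, 0.089, 0.079])
print(f"95% CL = {mean:.3f} +/- {half_width:.3f} % C2H5OH")
```

Computing s from the three results gives 0.0050% C2H5OH, the same value the slide uses.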
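(b) s → σ = 0.0050 % (use z)
$$\bar{x} = 0.084\ \%\ \mathrm{C_2H_5OH}, \qquad \sigma = 0.0050\ \%\ \mathrm{C_2H_5OH}$$
$$95\%\ \text{CL} = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} = 0.084 \pm \frac{(1.96)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.006\ \%\ \mathrm{C_2H_5OH}$$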
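Part (b) can be sketched the same way. Here z is computed from the standard normal distribution with `statistics.NormalDist` (Python 3.8+) instead of being read from the table; the function name `z_confidence_limits` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def z_confidence_limits(mean, sigma, n, confidence=0.95):
    """Return (z, lower, upper) for the mean of n results when sigma is known."""
    # Two-sided interval: (1 - confidence) / 2 of the probability in each tail
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return z, mean - half_width, mean + half_width

z, lower, upper = z_confidence_limits(mean=0.084, sigma=0.0050, n=3)
print(f"z = {z:.2f}, 95% CL = {lower:.3f} to {upper:.3f} % C2H5OH")
```

The interval is narrower than the t-based one because a known σ removes the uncertainty in estimating the standard deviation from only three results.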
Comparing a mean to the true value: The Null Hypothesis
The null hypothesis assumes that two measurements are the same.
Any numerical difference is assumed to be due to random error.
If the observed difference is greater than or equal to the difference that random error alone would produce 5% of the time, the null hypothesis is rejected, and the difference is judged significant.
The Critical Value
We rearrange the equation for the confidence interval.
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad \Longrightarrow \quad \bar{x} - \mu = \pm \frac{ts}{\sqrt{N}}$$
The difference $\bar{x} - \mu$ is compared to the critical value $ts/\sqrt{N}$ at the desired probability level.
If $|\bar{x} - \mu|$ is greater than the critical value, the null hypothesis is rejected.
An Example: The Determination of Sulfur in Kerosenes
A known sample containing 0.123% sulfur was analyzed and the results for four samples were: 0.112, 0.118, 0.115, and 0.119 %S. Is there bias in the method? Let’s do a spreadsheet.
The Spreadsheet (5%)
True Val.         0.123
Data              0.112, 0.118, 0.115, 0.119
Mean              0.116
Std. Dev.         0.0032
Difference        -0.007
t(95%, 3 df)      3.18
ts/sqrt(N)        0.0050
If we wish to be wrong no more than 5% of the time, we must reject the null hypothesis, and there is systematic error.
What about 1%?
True Val.         0.123
Data              0.112, 0.118, 0.115, 0.119
Mean              0.116
Std. Dev.         0.0032
Difference        -0.007
t(99%, 3 df)      5.84
ts/sqrt(N)        0.0092
If we wish to be wrong no more than 1% of the time, we cannot reject the null hypothesis, and no systematic error has been demonstrated.
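The spreadsheet calculation can also be sketched in a few lines of Python. The function name `bias_test` is introduced here for illustration; the t values are the 3-degrees-of-freedom entries from the Student's t table.

```python
import math
import statistics

def bias_test(data, true_value, t_crit):
    """Compare |mean - true_value| with the critical value t*s/sqrt(N)."""
    n = len(data)
    difference = statistics.mean(data) - true_value
    critical = t_crit * statistics.stdev(data) / math.sqrt(n)
    return difference, critical, abs(difference) > critical

sulfur = [0.112, 0.118, 0.115, 0.119]      # % S found; known value is 0.123 % S
for label, t_crit in (("95%", 3.18), ("99%", 5.84)):
    diff, crit, reject = bias_test(sulfur, 0.123, t_crit)
    print(f"{label}: difference={diff:.3f}, critical={crit:.4f}, reject H0={reject}")
```

The same difference of -0.007 is significant at the 95% level but not at the 99% level, which is why the conclusion depends on the chosen probability level.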
Comparing Two Experimental Means
$$\bar{x}_1 - \bar{x}_2 = \pm\, t\, s_{pooled} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}, \qquad \text{d.f.} = N_1 + N_2 - 2$$
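A sketch of the two-mean comparison follows. The slide does not spell out the pooled standard deviation, so the code assumes the usual definition, sqrt(((N1 - 1)s1^2 + (N2 - 1)s2^2) / (N1 + N2 - 2)). The two data sets are hypothetical, and `compare_means` is a name introduced here.

```python
import math
import statistics

def compare_means(a, b, t_crit):
    """Compare |mean(a) - mean(b)| with t * s_pooled * sqrt((N1 + N2) / (N1 * N2))."""
    n1, n2 = len(a), len(b)
    # Pooled standard deviation (assumed standard definition, N1 + N2 - 2 d.f.)
    s_pooled = math.sqrt(((n1 - 1) * statistics.variance(a)
                          + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2))
    difference = statistics.mean(a) - statistics.mean(b)
    critical = t_crit * s_pooled * math.sqrt((n1 + n2) / (n1 * n2))
    return difference, critical, abs(difference) > critical

# Hypothetical replicate results from two methods; d.f. = 3 + 3 - 2 = 4
set_a = [10.1, 10.3, 10.2]
set_b = [10.6, 10.8, 10.7]
diff, crit, significant = compare_means(set_a, set_b, t_crit=2.78)  # t(95%, 4 d.f.)
print(f"difference={diff:.2f}, critical={crit:.3f}, significant={significant}")
```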
Least-Squares for Analyzing Linear Calibrations: y = mx + b
Least-squares assumes that there is relatively little error in the x measurement.
The mathematics of the derivation of the equations minimizes the sum of the squares of the deviations (the residuals ) of the points from the best line in the y direction only.
From calculus, take the partial derivatives of the equation for the sum of squares with respect to m and b, set them equal to zero, and solve for the variables.
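The calculus step can be written out as follows. This is a standard derivation sketched here for reference; the symbol $SS$ for the sum of squared residuals is a label introduced for this note.

```latex
SS = \sum_i \left( y_i - m x_i - b \right)^2

\frac{\partial SS}{\partial m} = -2 \sum_i x_i \left( y_i - m x_i - b \right) = 0
\qquad
\frac{\partial SS}{\partial b} = -2 \sum_i \left( y_i - m x_i - b \right) = 0

% The second condition gives the intercept; substituting it into the first gives the slope.
b = \bar{y} - m \bar{x}
\qquad
m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
```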
The Intermediate Equations (See pp. 161-2)
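$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}$$
$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{N}$$
$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N}$$
$$\bar{x} = \frac{\sum x_i}{N} \quad \text{and} \quad \bar{y} = \frac{\sum y_i}{N}$$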
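The Results
1. Slope: $m = S_{xy} / S_{xx}$
2. Intercept: $b = \bar{y} - m\bar{x}$
3. The standard deviation about regression, or the standard error of the estimate:
$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} \quad \text{where } N - 2 = \text{d.f.}$$
4. The standard deviation of the slope:
$$s_m = \sqrt{\frac{s_r^2}{S_{xx}}}$$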
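Results 1 through 4 translate directly into code. The sketch below uses hypothetical calibration data (five standards) and a function name, `linear_calibration`, introduced only for this example.

```python
import math

def linear_calibration(x, y):
    """Least-squares slope, intercept, s_r, and s_m for y = m*x + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = s_xy / s_xx                                   # 1. slope
    b = y_bar - m * x_bar                             # 2. intercept
    # 3. standard deviation about regression; max() guards against a tiny
    #    negative value caused by floating-point rounding on near-perfect fits
    s_r = math.sqrt(max(s_yy - m ** 2 * s_xx, 0.0) / (n - 2))
    s_m = math.sqrt(s_r ** 2 / s_xx)                  # 4. std. dev. of the slope
    return m, b, s_r, s_m

# Hypothetical calibration data: concentration (x) versus signal (y)
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 3.9, 6.1, 7.9]
m, b, s_r, s_m = linear_calibration(conc, signal)
print(f"m={m:.3f}, b={b:.3f}, s_r={s_r:.4f}, s_m={s_m:.4f}")
```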
The Standard Deviation about Regression
Analogous to the standard deviation
Measure of the scatter of points
Precision similar to individual data
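$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} = \sqrt{\frac{\sum \left[ y_i - (b + m x_i) \right]^2}{N - 2}} = \sqrt{\frac{\sum \left( y_i - y_{line} \right)^2}{N - 2}}$$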
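More Results
5. The standard deviation of the intercept:
$$s_b = s_r \sqrt{\frac{\sum x_i^2}{N \sum x_i^2 - \left(\sum x_i\right)^2}} = s_r \sqrt{\frac{1}{N - \left(\sum x_i\right)^2 / \sum x_i^2}}$$
6. The standard deviation of results from the calibration curve:
$$s_c = \frac{s_r}{m} \sqrt{\frac{1}{M} + \frac{1}{N} + \frac{\left(\bar{y}_c - \bar{y}\right)^2}{m^2 S_{xx}}}$$
where $M$ = no. of replicates of the unknown and
$$\bar{y}_c = \frac{\sum_{i=1}^{M} y_i}{M}$$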
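Results 5 and 6 can be sketched in the same way. The calibration data and the three replicate readings of the unknown are hypothetical, and `calibration_uncertainty` is a name introduced for this example. The block recomputes the slope, intercept, and s_r so that it runs on its own.

```python
import math

def calibration_uncertainty(x, y, y_unknown):
    """Return (s_b, x_c, s_c): intercept std. dev., unknown's value, and its std. dev."""
    n, big_m = len(x), len(y_unknown)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = s_xy / s_xx
    b = y_bar - m * x_bar
    s_r = math.sqrt(max(s_yy - m ** 2 * s_xx, 0.0) / (n - 2))
    # 5. standard deviation of the intercept
    sum_x2 = sum(xi ** 2 for xi in x)
    s_b = s_r * math.sqrt(sum_x2 / (n * sum_x2 - sum(x) ** 2))
    # 6. standard deviation of a result read from the calibration curve
    y_c_bar = sum(y_unknown) / big_m       # mean of the M replicate readings
    x_c = (y_c_bar - b) / m                # unknown's value from y = m*x + b
    s_c = (s_r / m) * math.sqrt(1 / big_m + 1 / n
                                + (y_c_bar - y_bar) ** 2 / (m ** 2 * s_xx))
    return s_b, x_c, s_c

# Hypothetical calibration standards and three replicate readings of an unknown
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 3.9, 6.1, 7.9]
s_b, x_c, s_c = calibration_uncertainty(conc, signal, [5.0, 5.2, 5.1])
print(f"s_b={s_b:.4f}, x_c={x_c:.3f}, s_c={s_c:.4f}")
```

With these numbers, more replicates of the unknown (larger M) or readings nearer the centroid of the calibration data would both reduce s_c.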