9/23/2009

Systematic Error: Illustration of Bias

Sources of Systematic Errors

Instrument Errors

Method Errors

Personal

– Prejudice

– Preconceived notion of “true” value

– Number bias

Prefer 0/5

Small over large

Even over odd

Effects of Systematic Errors

Constant Errors

– Become more serious as the size of the measurement gets smaller

Proportional Errors

– Interfering contaminants

If the contaminant becomes larger, the signal becomes larger.
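The contrast between constant and proportional errors can be sketched numerically. The values below (a 0.5 mg constant error and a 1% proportional error) are hypothetical and chosen only to show the trend; they are not from the slides.

```python
# Hypothetical error sizes, assumed purely for illustration.
CONSTANT_ERROR_MG = 0.5      # same absolute error for every sample
PROPORTIONAL_FRACTION = 0.01  # error equal to 1% of the quantity measured


def relative_errors(true_mass_mg):
    """Return (constant, proportional) relative errors in percent."""
    rel_constant = 100.0 * CONSTANT_ERROR_MG / true_mass_mg
    rel_proportional = 100.0 * (PROPORTIONAL_FRACTION * true_mass_mg) / true_mass_mg
    return rel_constant, rel_proportional


for mass in (500.0, 50.0, 5.0):
    rel_c, rel_p = relative_errors(mass)
    print(f"{mass:6.1f} mg: constant {rel_c:5.2f}%   proportional {rel_p:5.2f}%")
```

The relative effect of the constant error grows as the sample shrinks, while the proportional error stays at the same percentage. This is why varying the sample size can reveal a constant error.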

Detection of Systematic and Personal Errors

Calibration

Care and Self-discipline

– Instrument readings

– Notebook entries

– Calculations

– Physical disabilities--color blindness

Bias

Difficult to detect

– Analyze standard samples

– Do an independent analysis

– Determine a blank

– Vary the sample size

Applying Statistics to Data Evaluation

Gross error or segment of population?

Define the confidence interval.

Find the number of replicates necessary to ensure that the mean falls within a predetermined interval.

What is the probability that an experimental mean and a “true” value, or two experimental means, are different?

Calibrate

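Gross Errors: The Q-test for Rejecting Outliers

$$Q_{exp} = \frac{d}{w} = \frac{|x_q - x_n|}{|x_q - x_1|} = \frac{|\text{questionable result} - \text{nearest neighbor}|}{\text{range}}$$

Here $x_q$ is the questionable result, $x_n$ is its nearest neighbor, and $x_1$ is the result farthest from $x_q$, so that $w$ is the range of the data.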

The Q-test: An Example

A calcite sample yields the following data for the determination of calcium as CaO: 55.95, 56.00, 56.04, 56.08, and 56.23.

Should we reject 56.23?

$$Q_{exp} = \frac{|x_q - x_n|}{|x_q - x_1|} = \frac{56.23 - 56.08}{56.23 - 55.95} = 0.54$$

The Q-test: The Q-Table (5-1)

Number of                 Qcrit
Observations     90%      95%      99%
     3          0.941    0.970    0.994
     4          0.765    0.829    0.926
     5          0.642    0.710    0.821
     6          0.560    0.625    0.740
     7          0.507    0.568    0.680

What’s the criterion?

If Qexp > Qcrit, reject.

If Qexp < Qcrit, accept (retain the value).

Qexp = 0.54; Qcrit = 0.64 (5 observations, 90% confidence), so accept.
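The Q-test procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `q_test` and its return layout are assumptions, and the critical values are copied from the Q-table (5-1) in these notes, so only 3 to 7 observations are supported.

```python
# Critical values from the Q-table (5-1): {observations: {confidence %: Qcrit}}
Q_CRIT = {
    3: {90: 0.941, 95: 0.970, 99: 0.994},
    4: {90: 0.765, 95: 0.829, 99: 0.926},
    5: {90: 0.642, 95: 0.710, 99: 0.821},
    6: {90: 0.560, 95: 0.625, 99: 0.740},
    7: {90: 0.507, 95: 0.568, 99: 0.680},
}


def q_test(data, confidence=90):
    """Return (suspect value, Qexp, Qcrit, reject?) for the most extreme point."""
    values = sorted(data)
    n = len(values)
    if n not in Q_CRIT:
        raise ValueError("table covers 3 to 7 observations only")
    spread = values[-1] - values[0]          # w, the range
    if spread == 0:
        raise ValueError("all values are identical")
    gap_low = values[1] - values[0]          # gap if the lowest value is suspect
    gap_high = values[-1] - values[-2]       # gap if the highest value is suspect
    if gap_high >= gap_low:
        suspect, gap = values[-1], gap_high
    else:
        suspect, gap = values[0], gap_low
    q_exp = gap / spread                     # Qexp = d / w
    q_crit = Q_CRIT[n][confidence]
    return suspect, q_exp, q_crit, q_exp > q_crit


calcite = [55.95, 56.00, 56.04, 56.08, 56.23]
suspect, q_exp, q_crit, reject = q_test(calcite, confidence=90)
print(f"suspect={suspect}, Qexp={q_exp:.2f}, Qcrit={q_crit}, reject={reject}")
```

For the calcite data this reproduces Qexp = 0.54 against Qcrit = 0.642, so the value 56.23 is retained.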

Can we reject data?

Blind application of statistical tests is no better than doing nothing.

Use good judgement based on experience.

If you know that something went wrong with a sample and the sample produces an outlier, then rejection may be warranted.

Be cautious about rejecting data for any reason.

Recommendations

Keep good records and examine the data carefully.

If possible, estimate the precision of the method.

Repeat the analysis if time and sample are available. Compare with first data.

If repeating the analysis is not feasible, apply the Q-test.

Recommendations (continued)

If the Q-test indicates retention, consider reporting the median.

The median allows inclusion of all of the data without undue influence from the outlier.

The median of a set of 3 measurements from a normal distribution gives a better estimate than the mean of the remaining 2 values after an outlier is rejected.

Confidence Limits and Intervals

Confidence limits are limits around an experimentally determined mean within which the true mean lies with a given degree of probability.

The confidence interval is the interval around the mean defined by the confidence limits.

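Confidence limits if s is a good estimate of σ

$$\text{CL for } \mu = x \pm z\sigma \quad \text{(single measurement)}$$

$$\text{CL for } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} \quad \text{(mean of } N \text{ measurements)}$$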

[Figure: 50% Confidence Limits]

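Confidence limits if s is not a good estimate of σ

Student's t:

$$t = \frac{x - \mu}{s} \quad \left(\text{analogous to } z = \frac{x - \mu}{\sigma}\right)$$

$$\text{CL for } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad \text{(mean of } N \text{ measurements)}$$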

Values of Student's t

Degrees of              Probability Level
Freedom        90%      95%      99%      99.8%
   1           6.31     12.7     63.7     318
   2           2.92     4.30     9.92     22.3
   3           2.35     3.18     5.84     10.2
   4           2.13     2.78     4.60     7.17
   5           2.02     2.57     4.03     5.89
   ∞ (z)       1.64     1.96     2.58     3.09

Finding the Confidence Interval: An Example

Determination of the alcohol content in blood gives the following data: % C2H5OH: 0.084, 0.089, and 0.079.

(a) If the precision of the method is unknown, find the 95% confidence limits of the mean.

(b) Perform the same calculation if s = 0.0050% C2H5OH is known to be a good estimate of σ. (How could we determine s?)

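(a) σ unknown (use t)

$$\bar{x} = 0.084\%\ \mathrm{C_2H_5OH}, \qquad s = 0.0050\%\ \mathrm{C_2H_5OH}$$

$$95\%\ \text{CL} = \bar{x} \pm \frac{ts}{\sqrt{N}} = 0.084 \pm \frac{(4.30)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.012\%\ \mathrm{C_2H_5OH}$$

(t = 4.30 for N − 1 = 2 degrees of freedom at the 95% level.)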



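(b) s = 0.0050% taken as σ (use z)

$$\bar{x} = 0.084\%\ \mathrm{C_2H_5OH}, \qquad \sigma = 0.0050\%\ \mathrm{C_2H_5OH}$$

$$95\%\ \text{CL} = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} = 0.084 \pm \frac{(1.96)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.006\%\ \mathrm{C_2H_5OH}$$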
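Both parts of the blood alcohol example can be checked with a short script. This is a sketch under stated assumptions: the helper names `half_width_t` and `half_width_z` are illustrative, and the 95% values of t and z are copied from the Student's t table in these notes rather than computed.

```python
import math
import statistics

# 95% values from the Student's t table above, keyed by degrees of freedom.
T_95 = {1: 12.7, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57}
Z_95 = 1.96


def half_width_t(data):
    """Half-width t*s/sqrt(N) when s must be estimated from the data itself."""
    n = len(data)
    s = statistics.stdev(data)          # sample standard deviation (N - 1)
    return T_95[n - 1] * s / math.sqrt(n)


def half_width_z(sigma, n):
    """Half-width z*sigma/sqrt(N) when sigma is known from prior work."""
    return Z_95 * sigma / math.sqrt(n)


alcohol = [0.084, 0.089, 0.079]         # % C2H5OH
mean = statistics.mean(alcohol)

print(f"(a) 95% CL = {mean:.3f} +/- {half_width_t(alcohol):.3f}")
print(f"(b) 95% CL = {mean:.3f} +/- {half_width_z(0.0050, len(alcohol)):.3f}")
```

The interval in (a) is about twice as wide as in (b) because t for 2 degrees of freedom (4.30) is much larger than z (1.96).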



Comparing a Mean to the True Value: The Null Hypothesis

The null hypothesis assumes that the two quantities being compared are the same.

Any numerical difference is assumed to be due to random error.

If the observed difference is greater than or equal to the difference that would occur 5% of the time, the null hypothesis is rejected, and the difference is judged significant.

The Critical Value

We rearrange the equation for the confidence interval:

$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad\Longrightarrow\quad \bar{x} - \mu = \pm\frac{ts}{\sqrt{N}}$$

The difference $\bar{x} - \mu$ is compared to the critical value $ts/\sqrt{N}$ at the desired probability level.

If $|\bar{x} - \mu|$ is greater than the critical value, the null hypothesis is rejected.

An Example: The Determination of Sulfur in Kerosenes

A known sample containing 0.123% sulfur was analyzed and the results for four samples were: 0.112, 0.118, 0.115, and 0.119 %S. Is there bias in the method? Let’s do a spreadsheet.

The Spreadsheet (5%)

True Val.     Data      t(95%, 3 df)
0.123         0.112     3.18
Difference    0.118
-0.007        0.115     ts/sqrt(N)
              0.119     0.0050
Mean          0.116
Std. Dev.     0.0032

If we wish to be wrong no more than 5% of the time, we must reject the null hypothesis, and there is systematic error.

What about 1%?

True Val.     Data      t(99%, 3 df)
0.123         0.112     5.84
Difference    0.118
-0.007        0.115     ts/sqrt(N)
              0.119     0.0092
Mean          0.116
Std. Dev.     0.0032

If we wish to be wrong no more than 1% of the time, we cannot reject the null hypothesis (0.007 < 0.0092), and no systematic error is demonstrated at this level.
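The spreadsheet comparison can be reproduced in Python. This is a minimal sketch: the function name `bias_test` is an assumption, and the critical t values for 3 degrees of freedom (3.18 at 95%, 5.84 at 99%) are taken from the table in these notes.

```python
import math
import statistics


def bias_test(data, true_value, t_crit):
    """Compare |mean - true value| with the critical value t*s/sqrt(N)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)                  # sample standard deviation
    difference = mean - true_value
    critical = t_crit * s / math.sqrt(n)
    return difference, critical, abs(difference) > critical


sulfur = [0.112, 0.118, 0.115, 0.119]           # % S found
TRUE_VALUE = 0.123                              # % S in the known sample

for level, t_crit in ((95, 3.18), (99, 5.84)):
    diff, crit, reject = bias_test(sulfur, TRUE_VALUE, t_crit)
    print(f"{level}%: difference={diff:.3f}, critical={crit:.4f}, "
          f"reject null hypothesis={reject}")
```

The same difference of -0.007 is significant at the 95% level but not at the 99% level, which is why the probability level should be chosen before the test is run.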

Comparing Two Experimental Means

$$\bar{x}_1 - \bar{x}_2 = \pm\, t\, s_{pooled}\sqrt{\frac{N_1 + N_2}{N_1 N_2}}, \qquad \text{d.f.} = N_1 + N_2 - 2$$
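A sketch of the two-mean comparison follows. The slide does not spell out how the pooled standard deviation is obtained; the code assumes the usual definition, s_pooled = sqrt(((N1 − 1)s1² + (N2 − 1)s2²)/(N1 + N2 − 2)). The function name and the two data sets are hypothetical, made up only to exercise the formula.

```python
import math
import statistics


def compare_means(set_1, set_2, t_crit):
    """Return (difference, critical value, significant?) for two data sets."""
    n1, n2 = len(set_1), len(set_2)
    s1, s2 = statistics.stdev(set_1), statistics.stdev(set_2)
    # Pooled standard deviation (assumed standard definition, see lead-in).
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    difference = statistics.mean(set_1) - statistics.mean(set_2)
    critical = t_crit * s_pooled * math.sqrt((n1 + n2) / (n1 * n2))
    return difference, critical, abs(difference) > critical


# Hypothetical replicate results, not taken from the slides.
method_a = [10.1, 10.3, 10.2]
method_b = [10.6, 10.8, 10.7]

# d.f. = 3 + 3 - 2 = 4, so t(95%) = 2.78 from the table in these notes.
diff, crit, significant = compare_means(method_a, method_b, t_crit=2.78)
print(f"difference={diff:.2f}, critical={crit:.3f}, significant={significant}")
```

The caller supplies t for N1 + N2 − 2 degrees of freedom, because the table in these notes stops at 5 degrees of freedom.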

Least-Squares for Analyzing Linear Calibrations: y = mx + b

Least-squares assumes that there is relatively little error in the x measurement.

The mathematics of the derivation of the equations minimizes the sum of the squares of the deviations (the residuals ) of the points from the best line in the y direction only.

From calculus, take the partial derivatives of the equation for the sum of squares with respect to m and b, set each equal to zero, and solve for m and b.

The Intermediate Equations (See pp. 161-2)

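$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}$$

$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{N}$$

$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N}$$

$$\bar{x} = \frac{\sum x_i}{N} \quad \text{and} \quad \bar{y} = \frac{\sum y_i}{N}$$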

The Results

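1. Slope:

$$m = \frac{S_{xy}}{S_{xx}}$$

2. Intercept:

$$b = \bar{y} - m\bar{x}$$

3. The standard deviation about regression, or the standard error of the estimate:

$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} \quad \text{where } N - 2 = \text{d.f.}$$

4. The standard deviation of the slope:

$$s_m = \sqrt{\frac{s_r^2}{S_{xx}}}$$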

The Standard Deviation about Regression

Analogous to the standard deviation

Measure of the scatter of points

Precision similar to individual data

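$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} = \sqrt{\frac{\sum (y_i - y_{line})^2}{N - 2}} = \sqrt{\frac{\sum \left[y_i - (b + m x_i)\right]^2}{N - 2}}$$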

More Results

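5. The standard deviation of the intercept:

$$s_b = s_r\sqrt{\frac{\sum x_i^2}{N\sum x_i^2 - \left(\sum x_i\right)^2}} = s_r\sqrt{\frac{1}{N - \left(\sum x_i\right)^2 / \sum x_i^2}}$$

6. The standard deviation of results from the calibration curve:

$$s_c = \frac{s_r}{m}\sqrt{\frac{1}{M} + \frac{1}{N} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$

where M = no. of replicates of the unknown and $\bar{y}_c$ is the mean response of those replicates.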
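The least-squares results 1 through 6 can be collected into one short routine. This is a sketch, not the textbook's code: the function names, the dictionary layout, and the calibration data are hypothetical, and the formulas are the standard ones for a line y = mx + b fitted with error in y only.

```python
import math


def fit_calibration(x, y):
    """Fit y = m*x + b and return slope, intercept, and their deviations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    s_xx = sum_x2 - sum_x**2 / n
    s_yy = sum(yi * yi for yi in y) - sum_y**2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    m = s_xy / s_xx                               # 1. slope
    b = sum_y / n - m * sum_x / n                 # 2. intercept
    # 3. standard deviation about regression; max() guards against a tiny
    #    negative value caused by floating-point rounding
    s_r = math.sqrt(max(s_yy - m * m * s_xx, 0.0) / (n - 2))
    s_m = s_r / math.sqrt(s_xx)                   # 4. std. dev. of the slope
    s_b = s_r * math.sqrt(sum_x2 / (n * sum_x2 - sum_x**2))  # 5. intercept
    return {"m": m, "b": b, "s_r": s_r, "s_m": s_m, "s_b": s_b,
            "n": n, "y_mean": sum_y / n, "s_xx": s_xx}


def unknown_std_dev(fit, y_unknown_mean, replicates):
    """6. Standard deviation of a result read from the calibration line."""
    m = fit["m"]
    spread = (y_unknown_mean - fit["y_mean"])**2 / (m * m * fit["s_xx"])
    # abs(m) keeps the result positive for a negative slope
    return fit["s_r"] / abs(m) * math.sqrt(1 / replicates + 1 / fit["n"] + spread)


# Hypothetical calibration data: concentration (x) and instrument signal (y).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.0, 4.1, 5.9, 8.0]

fit = fit_calibration(conc, signal)
s_c = unknown_std_dev(fit, y_unknown_mean=5.0, replicates=1)
x_unknown = (5.0 - fit["b"]) / fit["m"]
print(f"m={fit['m']:.3f}  b={fit['b']:.3f}  s_r={fit['s_r']:.4f}")
print(f"unknown = {x_unknown:.3f} +/- {s_c:.3f}")
```

The uncertainty s_c is smallest when the unknown's signal lies near the mean of the calibration signals, because the last term under the square root vanishes there.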

Assignment 2

7-2, 7-4, 7-6, 7-11, 7-16, 7-19

SS p. 164
