lecture 3 & 4-datahandling1

8/20/2019 Lecture 3 & 4-DataHandling1

DATA HANDLING IN ANALYTICAL CHEMISTRY


SIGNIFICANT FIGURES

Two types of numbers:

• Exact number – considered to have no uncertainty, e.g. a factor in a multiplication (100 in the calculation of a percentage), an exponent (10^2, 10^-3), etc.
• Measured number – any number other than an exact number; it has a degree of uncertainty, normally in the last digit, e.g. the weight of a compound is found to be 12.4567 g, which can be taken to mean 12.4567 ± 0.0001 g.


What is the length of the wooden stick?

1) 4.5 cm
2) 4.54 cm
3) 4.547 cm


Measured Numbers

• Do you see why measured numbers have error? You have to make that guess!
• All but one of the significant figures are known with certainty. The last significant figure is only the best possible estimate.
• To indicate the precision of a measurement, the value recorded should use all the digits known with certainty, plus the one estimated digit.


Below are two measurements of the mass of the same object. The same quantity is being described at two different levels of precision or certainty.


• The weak link in the chain of any analysis is the measurement that can be made with the least accuracy.
• This means that the least accurate measurement determines the number of significant figures in the results of any analysis.


When a measurement is recorded, only those digits that are dependable are written down.

• If you measured the width of a paper with your ruler, you might record 21.7 cm.
• To a mathematician, 21.7, 21.70 and 21.700 are the same.
• But to a scientist, 21.7 cm and 21.70 cm are NOT the same.
• To a scientist, 21.700 cm means the measurement is accurate to within one thousandth of a cm.
• However, if you used an ordinary ruler, the smallest marking is the mm, so your measurement has to be recorded as 21.7 cm.


• If we weigh a sample with an analytical balance, the weight can be expressed as e.g. 10.1234 g.
• If we measure the volume of a solution with a burette, the volume can be expressed as e.g. 21.25 ml.
• This means that a final result calculated from these two values should appropriately contain only 2 decimal places. Why?




Analytical balance: precise up to 4 decimal places in grams, e.g. 12.0123 g

Top pan balance: precise up to 2 decimal places in grams, e.g. 12.01 g


Number of Significant Figures

• 23.45 ⇒ has 4 significant figures
• 45678, 4567.8, 456.78, 45.678, 4.5678 and 0.45678 ⇒ each has 5 significant figures, regardless of where the decimal point is.
• 92067 μm = 9.2067 cm = 0.092067 m
• All have the same number of significant figures, merely expressed in different units.


The position of zero

Number          Significant figures   Rule
45.8736         6                     All digits count
.000239         3                     Leading 0's don't count
.00023900       5                     Trailing 0's do count
48000.          5                     0's count in decimal form
48000           2                     0's don't count without a decimal point
3.982 x 10^6    4                     All digits count
1.00040         6                     0's between digits count, as do trailing 0's in decimal form


• State the number of significant figures in each of these numbers.

1) 274.03
2) 1.060 x 10^4
3) 0.00054
4) 600
5) 72000.0
6) 8.000
7) 67.030
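The zero rules listed earlier can be sketched as a small Python helper. This is an illustration only, not part of the lecture: the name `count_sig_figs` is an assumption, the number must be passed as text (so that trailing zeros are preserved), and powers of ten are only accepted in `e` notation (e.g. `3.982e6`).

```python
def count_sig_figs(text: str) -> int:
    """Count significant figures in a number written as text, e.g. '0.00054'."""
    # The power of ten is exact, so only the part before 'e' is examined.
    mantissa = text.lower().replace(" ", "").split("e")[0].lstrip("+-")
    has_point = "." in mantissa
    digits = mantissa.replace(".", "")
    # Leading zeros only locate the decimal point; they never count.
    digits = digits.lstrip("0")
    # Without a decimal point, trailing zeros are treated as placeholders.
    if not has_point:
        digits = digits.rstrip("0")
    return len(digits)


for example in ["45.8736", ".000239", ".00023900", "48000.", "48000", "3.982e6", "1.00040"]:
    print(example, count_sig_figs(example))
```

The same helper can be used to check answers to the exercise, provided the numbers are typed exactly as written (for instance `"72000.0"` rather than `72000.0`).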


The General Analytical Problem

Select sample
→ Extract analyte(s) from matrix
→ Separate analytes
→ Detect, identify and quantify analytes
→ Determine reliability and significance of results


Introduction

• The design of experiments (including the size of sample required, the accuracy of measurements required and the number of analyses needed) is determined from a proper understanding of what the data will represent.


Introduction

• A knowledge of statistical analysis will be required as you perform experiments in the laboratory.
• It is needed to understand the significance of the data that are collected.


Mean

Defined as follows:

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$

where $x_i$ = individual values of $x$ and $N$ = number of replicate measurements.

Median

The middle result when data are arranged in order of size (for an even number of results, the mean of the middle two). The median can be preferred when there is an "outlier", i.e. one reading very different from the rest. The median is less affected by an outlier than the mean is.
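A minimal Python sketch of the two definitions, using the standard-library `statistics` module. The five replicate values are hypothetical (they are not from the lecture) and were chosen so that the last one is an obvious outlier.

```python
import statistics

# Hypothetical replicate results (not from the lecture); the last is an outlier.
results = [10.1, 10.2, 10.3, 10.4, 15.0]

mean = statistics.mean(results)      # sum of the values divided by N
median = statistics.median(results)  # middle value of the sorted data

print(f"mean   = {mean:.2f}")    # pulled towards the outlier
print(f"median = {median:.2f}")  # stays with the bulk of the data
```

For an even number of results, `statistics.median` returns the mean of the middle two values, as in the definition above.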


Illustration of "Mean" and "Median"

Results of 6 determinations of the Fe(III) content of a solution, known to contain 20 ppm:

Note: The mean value is 19.78 ppm (i.e. 19.8 ppm); the median value is 19.7 ppm (i.e. (19.6 + 19.8)/2).


Precision

Relates to the reproducibility of results: how similar are values obtained in exactly the same way?

A useful measure is the deviation from the mean:

$$d_i = |x_i - \bar{x}|$$


Accuracy

A measure of the agreement between the experimental mean and the true value (which may not be known!).

Measures of accuracy:

Absolute error: $E = x_i - x_t$ (where $x_t$ = true or accepted value)

Relative error: $E_r = \frac{x_i - x_t}{x_t} \times 100\%$

(the latter is more useful in practice)
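As a worked example, the two error formulas can be applied to the Fe(III) illustration quoted earlier (mean 19.78 ppm, known content 20 ppm). This is a sketch: the function names are assumptions, and applying the formulas to the mean rather than to a single result $x_i$ is a choice made here for illustration.

```python
def absolute_error(measured: float, true_value: float) -> float:
    """E = measured - true; the sign shows whether the result is high or low."""
    return measured - true_value


def relative_error_percent(measured: float, true_value: float) -> float:
    """E_r = (measured - true) / true, expressed as a percentage."""
    return (measured - true_value) / true_value * 100.0


mean_fe_ppm = 19.78  # mean of the six Fe(III) determinations
true_fe_ppm = 20.0   # known content of the solution

E = absolute_error(mean_fe_ppm, true_fe_ppm)
E_r = relative_error_percent(mean_fe_ppm, true_fe_ppm)
print(f"E = {E:.2f} ppm, E_r = {E_r:.1f}%")
```

The negative signs indicate that the mean lies below the accepted value.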


Accuracy and Precision: There is a difference

Accuracy: the degree of agreement between the measured value and the true value.

Precision: the degree of agreement between replicate measurements of the same quantity. It means the repeatability of a result and is expressed as the standard deviation. The more measurements that are made, the more reliable the measure of precision will be.

Good precision does not assure good accuracy.


Illustrating the difference between "accuracy" and "precision"

[Figure, four panels: low accuracy, low precision; low accuracy, high precision; high accuracy, low precision; high accuracy, high precision]


Errors in Chemical Analysis

It is impossible to eliminate errors.
How reliable are our data?
Data of unknown quality are useless!

• Carry out replicate measurements
• Analyse accurately known standards
• Perform statistical tests on data


Types of Error in Experimental Data

Three types:

(1) Random (indeterminate) Error

(2) Systematic (determinate) Error

(3) Gross Errors
Usually obvious; they give "outlier" readings. Detectable by carrying out sufficient replicate measurements.


Systematic/determinate Error

• Readings all too high or too low.
• Affects accuracy.
• Systematic errors can be constant (e.g. an error in a burette reading)
  • less important for larger values of the reading
• or proportional (e.g. the presence of a given proportion of an interfering impurity in the sample)
  • equally significant for all values of the measurement
• Several possible sources.
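The difference between constant and proportional systematic errors can be seen by expressing each as a percentage of the reading. The sketch below uses hypothetical numbers (a fixed 0.05 ml reading error and a 0.5% interfering impurity); none of these values come from the lecture.

```python
def relative_error_percent(error: float, reading: float) -> float:
    """Size of an error relative to the reading, as a percentage."""
    return error / reading * 100.0


readings_ml = [10.0, 20.0, 40.0]  # hypothetical burette volumes
constant_error_ml = 0.05          # hypothetical fixed reading error
proportional_fraction = 0.005     # hypothetical 0.5% interfering impurity

for volume in readings_ml:
    constant_pct = relative_error_percent(constant_error_ml, volume)
    proportional_pct = relative_error_percent(proportional_fraction * volume, volume)
    print(f"{volume:5.1f} ml: constant {constant_pct:.3f}%, proportional {proportional_pct:.3f}%")
```

The constant error shrinks in relative terms as the reading grows, whereas the proportional error stays at the same percentage for every reading.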


Sources of Systematic/determinate Error

Some common determinate errors are:

(1) Instrumental errors: these include faulty equipment, uncalibrated weights and uncalibrated glassware. Frequent calibration is needed, both for apparatus such as volumetric flasks, burettes etc., and for electronic devices such as spectrometers.


(2) Operative errors: these include personal errors and can be reduced by the experience and care of the analyst in the physical manipulations involved.

• Operations – include transfer of solutions, bumping during sample dissolution, incomplete drying of samples etc.
• Insensitivity to colour changes for spectrophotometric methods
• Other personal errors – mathematical errors in calculations
• Prejudice in estimating measurements: tendency to estimate scale readings to improve precision; preconceived idea of the "true" value.
• Difficult to correct for.



(3) Errors of the method – the most serious errors of an analysis.

• Errors that are inherent in the method cannot be changed unless the conditions of the determination are altered.
• Due to inadequacies in the physical or chemical behaviour of reagents or reactions (e.g. slow or incomplete reactions)
• Example: nicotinic acid does not react completely under normal Kjeldahl conditions for nitrogen determination.
• Reagent blank – an analysis on the added reagents only. It is standard practice to run such blanks and to subtract the results from those for the sample.


Minimisation of Systematic Errors in Chemical Analyses

Minimise instrument errors by careful recalibration and good maintenance of equipment.

Minimise personal errors by care and self-discipline.

Method errors are the most difficult; the "true" value may not be known. Three approaches to minimise them:

• analysis of certified standards
• use of two or more independent methods
• analysis of blanks


Random Errors or Indeterminate Errors

• Often called accidental errors.
• Data are scattered approximately symmetrically about a mean value.
• Affect precision; dealt with statistically.
• Represent the experimental uncertainty that occurs in any measurement.
• Revealed by small differences in successive measurements made by the same analyst under virtually identical conditions; cannot be predicted or estimated.


Treatments of Random Errors

There are always a large number of small, random errors in making any measurement.

These can be small changes in temperature or pressure, random responses of electronic detectors ("noise") etc.

Statistical treatment can be used for random errors.


Standard Deviation

$\sigma$: a measure of the precision of a population of data, given by

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

where $\mu$ = population mean and $N$ is very large.

The equation for a Gaussian curve is defined in terms of $\mu$ and $\sigma$, as follows:

$$y = \frac{e^{-(x - \mu)^2 / 2\sigma^2}}{\sigma \sqrt{2\pi}}$$


Standard Error of a Mean

The standard deviation relates to the probable error in a single measurement. If we take a series of $N$ measurements, the probable error of the mean is less than the probable error of any one measurement.

The standard error of the mean, $s_m$, is defined as follows:

$$s_m = \frac{s}{\sqrt{N}}$$
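A short Python sketch of how the standard error of the mean falls as more measurements are averaged. The function name is an assumption; the value $s$ = 0.0056 ml is the standard deviation quoted for the pipette calibration data elsewhere in these slides, and the choices of $N$ are illustrative.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the mean: s divided by the square root of N."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)


s_single_ml = 0.0056  # standard deviation of a single pipette delivery, ml

for n in (1, 4, 16, 50):
    print(f"N = {n:2d}: s_m = {standard_error(s_single_ml, n):.5f} ml")
```

Because of the square root, quadrupling the number of measurements only halves the standard error.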


Gaussian curve

SAMPLE = finite number of observations

POPULATION = total (infinite) number of observations

Properties of the Gaussian curve are defined in terms of the population.

Modifications are needed for small samples of data.

Main properties of the Gaussian curve:

Population mean ($\mu$): defined as earlier ($N \to \infty$). In the absence of systematic error, $\mu$ is the true value (maximum on the Gaussian curve).

Sample mean ($\bar{x}$): defined for small values of $N$. (The sample mean approaches the population mean when $N$ is about 20 or more.)


Frequency Distribution for Measurements Containing Random Errors

[Figure, three panels: 4 random uncertainties; 10 random uncertainties; a very large number of random uncertainties]

The last is a Gaussian or normal error curve, symmetrical about the mean.


Replicate Data on the Calibration of a 10 ml Pipette

No.   Vol, ml    No.   Vol, ml    No.   Vol, ml
 1    9.988      18    9.975      35    9.976
 2    9.973      19    9.980      36    9.990
 3    9.986      20    9.994      37    9.988
 4    9.980      21    9.992      38    9.971
 5    9.975      22    9.984      39    9.986
 6    9.982      23    9.981      40    9.978
 7    9.986      24    9.987      41    9.986
 8    9.982      25    9.978      42    9.982
 9    9.981      26    9.983      43    9.977
10    9.990      27    9.982      44    9.977
11    9.980      28    9.991      45    9.986
12    9.989      29    9.981      46    9.978
13    9.978      30    9.969      47    9.983
14    9.971      31    9.985      48    9.980
15    9.982      32    9.977      49    9.983
16    9.983      33    9.976      50    9.979
17    9.988      34    9.983

Mean volume 9.982 ml
Median volume 9.982 ml
Standard deviation 0.0056 ml
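The summary values for the pipette data can be reproduced with Python's standard-library `statistics` module. This is a sketch: the list below is transcribed from the table in measurement order, and `statistics.stdev` is used because it divides by N - 1 (the sample standard deviation).

```python
import math
import statistics

# The 50 pipette volumes in ml, in measurement order (1 to 50).
volumes_ml = [
    9.988, 9.973, 9.986, 9.980, 9.975, 9.982, 9.986, 9.982, 9.981, 9.990,
    9.980, 9.989, 9.978, 9.971, 9.982, 9.983, 9.988, 9.975, 9.980, 9.994,
    9.992, 9.984, 9.981, 9.987, 9.978, 9.983, 9.982, 9.991, 9.981, 9.969,
    9.985, 9.977, 9.976, 9.983, 9.976, 9.990, 9.988, 9.971, 9.986, 9.978,
    9.986, 9.982, 9.977, 9.977, 9.986, 9.978, 9.983, 9.980, 9.983, 9.979,
]

mean = statistics.mean(volumes_ml)
median = statistics.median(volumes_ml)
s = statistics.stdev(volumes_ml)      # sample standard deviation (N - 1)
s_m = s / math.sqrt(len(volumes_ml))  # standard error of the mean

print(f"N = {len(volumes_ml)}")
print(f"mean = {mean:.3f} ml, median = {median:.3f} ml")
print(f"s = {s:.4f} ml, s_m = {s_m:.4f} ml")
```

Rounding is applied only when printing, so the intermediate values keep full precision.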


Standard Deviation & Gaussian Curve

Calibration data in graphical form

A = histogram of experimental results

B = Gaussian curve with the same mean value, the same precision (see later) and the same area under the curve as for the histogram.


Standard Deviation & Gaussian Curve

Two Gaussian curves with two different standard deviations, $\sigma_A$ and $\sigma_B$ ($= 2\sigma_A$)

General Gaussian curve plotted in units of $z$, where

$$z = \frac{x - \mu}{\sigma}$$

i.e. the deviation from the mean of a datum in units of standard deviation. The plot can be used for data with any given value of the mean and any standard deviation.


Area under a Gaussian Curve

From the equation above, and illustrated by the previous curves, 68.3% of the data lie within $\pm\sigma$ of the mean ($\mu$), i.e. 68.3% of the area under the curve lies within $\pm\sigma$ of $\mu$.

Similarly, 95.5% of the area lies within $\pm 2\sigma$, and 99.7% within $\pm 3\sigma$.

There are 68.3 chances in 100 that for a single datum the random error in the measurement will not exceed $\pm\sigma$.

The chances are 95.5 in 100 that the error will not exceed $\pm 2\sigma$.
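These areas can be checked numerically. For a normal distribution, the fraction of the area within k standard deviations of the mean equals erf(k/√2); the sketch below uses `math.erf` from the Python standard library, and the function name is an assumption.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within ±{k}σ: {100 * fraction_within(k):.2f}%")
```

Two decimal places are printed so that the computed values can be compared directly with the rounded percentages quoted above.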


Sample Standard Deviation, s

The equation for $\sigma$ must be modified for small samples of data, i.e. small $N$:

$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$

Two differences compared with the equation for $\sigma$:

1. Use the sample mean instead of the population mean.

2. Use degrees of freedom, $N - 1$, instead of $N$. The reason is that in working out the mean, the sum of the differences from the mean must be zero. If $N - 1$ values are known, the last value is defined. Thus there are only $N - 1$ degrees of freedom. For large values of $N$, as used in calculating $\sigma$, $N$ and $N - 1$ are effectively equal.


Alternative Expression for sample standard deviation, s (suitable for calculators)

$$s = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N - 1}}$$

Note: NEVER round off figures before the end of the calculation
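The definition of s and the calculator form are algebraically equivalent, which can be checked with a short Python sketch. The function names and the five data values are hypothetical, chosen only for illustration.

```python
import math


def stdev_definition(values):
    """s from the squared deviations about the sample mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def stdev_calculator(values):
    """s from the running sums of x and x squared (calculator form)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [2.1, 2.4, 2.2, 2.5, 2.3]  # hypothetical replicate results

print(f"definition: {stdev_definition(data):.4f}")
print(f"calculator: {stdev_calculator(data):.4f}")
```

The calculator form subtracts two nearly equal numbers, which is why intermediate sums should never be rounded: any digits lost there are magnified in the difference.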


Methods for expressing the precision (CV)

VARIANCE: This is the square of the standard deviation:

$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$$

COEFFICIENT OF VARIATION (CV) (or RELATIVE STANDARD DEVIATION): Divide the standard deviation by the mean value and express as a percentage:

$$CV = \frac{s}{\bar{x}} \times 100\%$$


Exercise - Sample Standard Deviation

The reproducibility of a method for determining the content of selenium in foods was studied. 9 measurements were made on a single batch of brown rice.

Sample   Selenium content (μg/g) ($x_i$)
1        0.07
2        0.07
3        0.08
4        0.07
5        0.07
6        0.08
7        0.08
8        0.09
9        0.08

Based on these data, what can you say about the reproducibility of the method?


Answer to exercise

Sample   Selenium content (μg/g) ($x_i$)   $x_i^2$
1        0.07                              0.0049
2        0.07                              0.0049
3        0.08                              0.0064
4        0.07                              0.0049
5        0.07                              0.0049
6        0.08                              0.0064
7        0.08                              0.0064
8        0.09                              0.0081
9        0.08                              0.0064

$\sum x_i = 0.69$    $\sum x_i^2 = 0.0533$

Mean = $\sum x_i / N$ = 0.077 μg/g
$(\sum x_i)^2 / N$ = 0.4761/9 = 0.0529

Standard deviation:

$$s = \sqrt{\frac{0.0533 - 0.0529}{9 - 1}} = 0.00707106 \approx 0.007$$

Coefficient of variation = 9.2%
Concentration = 0.077 ± 0.007 μg/g
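The answer can be cross-checked with Python's standard-library `statistics` module. This is a sketch for verification only; the variable names are assumptions, and the values are the nine selenium results from the exercise in the units given there.

```python
import statistics

# The nine selenium results from the exercise.
selenium = [0.07, 0.07, 0.08, 0.07, 0.07, 0.08, 0.08, 0.09, 0.08]

mean = statistics.mean(selenium)
s = statistics.stdev(selenium)            # sample standard deviation (N - 1)
variance = statistics.variance(selenium)  # square of s
cv_percent = s / mean * 100               # relative standard deviation

print(f"mean = {mean:.3f}")
print(f"s = {s:.3f}, variance = {variance:.6f}")
print(f"CV = {cv_percent:.1f}%")
```

The mean, s and CV are rounded only at the printing step, in line with the advice not to round before the end of the calculation.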
