lecture 3 & 4-datahandling1
TRANSCRIPT
-
8/20/2019 Lecture 3 & 4-DataHandling1
1/46
DATA HANDLING IN
ANALYTICAL CHEMISTRY
-
SIGNIFICANT FIGURES
Two types of numbers:
• Exact number – considered to have no uncertainty, e.g. a factor in a multiplication (the 100 in a percentage calculation) or an exponent (10^2, 10^-3).
• Measured number – any number other than an exact number; it has a degree of uncertainty, normally in the last digit. E.g. the wt. of a compound is found to be 12.4567 g, which can be taken to mean 12.4567 ± 0.0001 g.
-
What is the length of the wooden stick?
1) 4.5 cm
2) 4.54 cm
3) 4.547 cm
-
Measured Numbers
• Do you see why measured numbers have error? You have to make that guess!
• All but one of the significant figures are known with certainty. The last significant figure is only the best possible estimate.
• To indicate the precision of a measurement, the value recorded should use all the digits known with certainty plus the one estimated digit.
-
Below are two measurements of the mass of the same object. The same quantity is being described at two different levels of precision or certainty.
-
• The weak link in the chain of any analysis is the measurement that can be made with the least accuracy.
• This means that the least accurate measurement will determine the number of significant figures of the results of any analysis.
-
When a measurement is recorded, only those digits that are dependable are written down.
• If you measured the width of a paper with your ruler you might record 21.7 cm.
• To a mathematician 21.7, 21.70 and 21.700 are the same.
• But to a scientist 21.7 cm and 21.70 cm are NOT the same.
• 21.700 cm to a scientist means the measurement is accurate to within one thousandth of a cm.
• However, if you used an ordinary ruler, the smallest marking is the mm, so your measurement has to be recorded as 21.7 cm.
-
• If we weigh a sample with an analytical balance, the wt. can be expressed as e.g. 10.1234 g.
• If we measure a volume of a solution with a burette, the volume can be expressed as e.g. 21.25 ml.
• This means that the final results calculated from the two values should appropriately contain only 2 decimal places. Why?
-
Analytical balance: precise up to 4 decimal places in grams, e.g. 12.0123 g
Top pan balance: precise up to 2 decimal places in grams, e.g. 12.01 g
-
Number of Significant Figures
• 23.45 ⇒ has 4 significant figures
• 45678, 4567.8, 456.78, 45.678, 4.5678 and 0.45678 ⇒ each has 5 significant figures regardless of where the decimal point is.
• 92067 μm = 9.2067 cm = 0.092067 m
• All have the same number of significant figures, merely expressed in different units.
-
The position of zero
-
Number          Sig. figs   Rule
45.8736         6           All digits count
.000239         3           Leading 0's don't
.00023900       5           Trailing 0's do
48000.          5           0's count in decimal form
48000           2           0's don't count w/o decimal
3.982 x 10^6    4           All digits count
1.00040         6           0's between digits count as well as trailing in decimal form
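The zero rules above can be sketched as a small Python function. This is only an illustrative sketch: the function name `count_sig_figs` and the choice to pass numbers as strings (with `e` notation standing in for "x 10^n") are assumptions, and it follows this slide's convention that trailing zeros without a decimal point do not count.

```python
def count_sig_figs(text: str) -> int:
    """Count significant figures in a number written as a string."""
    # Drop the sign and any exponent part; the power of ten is exact.
    mantissa = text.strip().lstrip("+-").lower().split("e")[0]
    if "." in mantissa:
        # Decimal form: leading zeros don't count, every other digit does.
        digits = mantissa.replace(".", "").lstrip("0")
    else:
        # No decimal point: trailing zeros are treated as placeholders.
        digits = mantissa.lstrip("0").rstrip("0")
    return len(digits)


# The seven numbers from the slide, in the same order.
examples = ["45.8736", ".000239", ".00023900", "48000.", "48000", "3.982e6", "1.00040"]
counts = [count_sig_figs(e) for e in examples]
print(counts)  # [6, 3, 5, 5, 2, 4, 6]
```

Passing strings rather than floats matters here, because a float cannot remember whether the number was written as 21.7 or 21.70.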
-
• State the number of significant figures in each of these numbers.
1) 274.03
2) 1.060 x 10^4
3) 0.00054
4) 600
5) 72000.0
6) 8.000
7) 67.030
-
The General Analytical Problem
Select sample
→ Extract analyte(s) from matrix
→ Separate analytes
→ Detect, identify and quantify analytes
→ Determine reliability and significance of results
-
Introduction
• The design of experiments (including size of sample required, accuracy of measurements required and the number of analyses needed) is determined from a proper understanding of what the data will represent.
-
Introduction
• A knowledge of statistical analysis will be required as you perform experiments in the laboratory.
• It is needed to understand the significance of the data that are collected.
-
Mean
Defined as follows:
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$
where x_i = individual values of x and N = number of replicate measurements.
Median
The middle result when data are arranged in order of size (for an even number of results, the mean of the middle two).
The median can be preferred when there is an “outlier”: one reading very different from the rest.
The median is less affected by an outlier than the mean is.
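A short sketch of how an outlier pulls the mean more than the median, using Python's standard `statistics` module. The five readings are hypothetical values chosen only for illustration; they are not data from this lecture.

```python
from statistics import mean, median

# Hypothetical replicate results; the last reading is a deliberate outlier.
readings = [10.1, 10.2, 10.0, 10.3, 12.5]

avg = mean(readings)    # sum of the values divided by N
mid = median(readings)  # middle value once the data are sorted

print(f"mean = {avg:.2f}, median = {mid:.2f}")  # mean = 10.62, median = 10.20
```

Without the outlier the mean of the first four readings would be 10.15, so the single stray value shifts the mean by almost 0.5 while the median barely moves.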
-
Illustration of “Mean” and “Median”
Results of 6 determinations of the Fe(III) content of a solution, known to contain 20 ppm:
Note: The mean value is 19.78 ppm (i.e. 19.8 ppm); the median value is 19.7 ppm (i.e. (19.6 + 19.8)/2).
-
Precision
Relates to reproducibility of results.
How similar are values obtained in exactly the same way?
Useful for measuring this: deviation from the mean:
$$d_i = |x_i - \bar{x}|$$
-
Accuracy
Measurement of agreement between experimental mean and true value (which may not be known!).
Measures of accuracy:
Absolute error: E = x_i - x_t (where x_t = true or accepted value)
Relative error:
$$E_r = \frac{x_i - x_t}{x_t} \times 100\%$$
(the latter is more useful in practice)
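As a worked illustration, the two error measures can be applied to the Fe(III) example from earlier in the lecture (mean 19.8 ppm, known content 20 ppm). The function names are hypothetical, and applying the formulas to the mean rather than to a single reading is a choice made for this sketch.

```python
def absolute_error(measured: float, true_value: float) -> float:
    """E = x - x_t, in the same units as the measurement."""
    return measured - true_value


def relative_error_percent(measured: float, true_value: float) -> float:
    """E_r = (x - x_t) / x_t * 100 %."""
    return (measured - true_value) / true_value * 100.0


e_abs = absolute_error(19.8, 20.0)          # about -0.2 ppm
e_rel = relative_error_percent(19.8, 20.0)  # about -1.0 %

print(f"E = {e_abs:.1f} ppm, E_r = {e_rel:.1f} %")
```

The sign is kept on purpose: a negative error says the result is low, which is lost if only the magnitude is reported.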
-
Accuracy and Precision: There is a difference
Accuracy: the degree of agreement between the measured value and the true value.
Precision: the degree of agreement between replicate measurements of the same quantity.
Means the repeatability of a result.
Expressed as the standard deviation.
The more measurements that are made, the more reliable will be the measure of precision.
Good precision does not assure good accuracy.
-
Illustrating the difference between “accuracy” and “precision”
[Figure panels: low accuracy, low precision; low accuracy, high precision; high accuracy, low precision; high accuracy, high precision]
-
Errors in Chemical Analysis
Impossible to eliminate errors.
How reliable are our data?
Data of unknown quality are useless!
• Carry out replicate measurements
• Analyse accurately known standards
• Perform statistical tests on data
-
Types of Error in Experimental Data
Three types:
(1) Random (indeterminate) Error
(2) Systematic (determinate) Error
(3) Gross Errors
Usually obvious - give “outlier” readings.
Detectable by carrying out sufficient replicate measurements.
-
Systematic/determinate Error
• Readings all too high or too low.
• Affects accuracy.
• Systematic errors can be constant (e.g. error in burette reading):
  less important for larger values of reading.
• Or proportional (e.g. presence of a given proportion of interfering impurity in sample):
  equally significant for all values of measurement.
• Several possible sources.
-
Sources of Systematic/determinate Error
Some common determinate errors are:
(1) Instrumental errors: these include faulty equipment, uncalibrated weights and uncalibrated glassware.
Frequent calibration is needed, both for apparatus such as volumetric flasks, burettes etc. and for electronic devices such as spectrometers.
-
Sources of Systematic/determinate Error
(2) Operative errors: include personal errors and can be reduced by experience and care of the analyst in the physical manipulations involved.
• Operations – include transfer of solutions, bumping during sample dissolution, incomplete drying of samples etc.
• Insensitivity to colour changes for spectrophotometric methods
• Other personal errors – mathematical errors in calculations
• Prejudice in estimating measurements: tendency to estimate scale readings to improve precision; preconceived idea of “true” value.
• Difficult to correct for.
-
Sources of Systematic/determinate Error
(3) Errors of the method – the most serious errors of an analysis.
• Errors that are inherent in the method cannot be changed unless the conditions of the determination are altered.
• Due to inadequacies in physical or chemical behaviour of reagents or reactions (e.g. slow or incomplete reactions).
• Example: nicotinic acid does not react completely under normal Kjeldahl conditions for nitrogen determination.
• Reagent blank – analysis on the added reagents only. It is standard practice to run such blanks and to subtract the results from those for the sample.
-
Minimisation of Systematic Errors in Chemical Analyses
Minimise instrument errors by careful recalibration and good maintenance of equipment.
Minimise personal errors by care and self-discipline.
Method errors - most difficult. “True” value may not be known.
Three approaches to minimise:
• analysis of certified standards
• use of two or more independent methods
• analysis of blanks
-
Random Errors or Indeterminate Errors
• Often called accidental errors.
• Data scattered approx. symmetrically about a mean value.
• Affects precision - dealt with statistically.
• Represent the experimental uncertainty that occurs in any measurement.
• Revealed by small differences in successive measurements made by the same analyst under virtually identical conditions, and cannot be predicted or estimated.
-
Treatments of Random Errors
There are always a large number of small, random errors in making any measurement.
These can be small changes in temperature or pressure; random responses of electronic detectors (“noise”) etc.
Can use statistical treatment for random errors.
-
Standard Deviation
σ : measure of precision of a population of data, given by
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
where μ = population mean; N is very large.
The equation for a Gaussian curve is defined in terms of μ and σ, as follows:
$$y = \frac{e^{-(x - \mu)^2 / 2\sigma^2}}{\sigma \sqrt{2\pi}}$$
-
Standard Error of a Mean
The standard deviation relates to the probable error in a single measurement.
If we take a series of N measurements, the probable error of the mean is less than the probable error of any one measurement.
The standard error of the mean is defined as follows:
$$s_m = \frac{s}{\sqrt{N}}$$
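A minimal sketch of the standard error of the mean, dividing the standard deviation by the square root of N. The function name is hypothetical; the input values (s = 0.0056 ml, N = 50) are the ones quoted for the pipette calibration data later in this lecture.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the mean: s divided by the square root of N."""
    if n < 1:
        raise ValueError("need at least one measurement")
    return s / math.sqrt(n)


s = 0.0056  # ml, standard deviation of a single pipette delivery
for n in (1, 4, 50):
    print(f"N = {n:2d}: s_m = {standard_error(s, n):.5f} ml")

sem_50 = standard_error(s, 50)  # about 0.00079 ml
```

Because of the square root, quadrupling the number of replicates only halves the uncertainty of the mean.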
-
Gaussian curve
SAMPLE = finite number of observations
POPULATION = total (infinite) number of observations
Properties of the Gaussian curve are defined in terms of the population.
Modifications are needed for small samples of data.
Main properties of the Gaussian curve:
Population mean (μ): defined as earlier (N → ∞). In the absence of systematic error, μ is the true value (maximum on the Gaussian curve).
Sample mean (x̄): defined for small values of N.
(The sample mean approaches the population mean when N is about 20 or more.)
-
Frequency Distribution for Measurements Containing Random Errors
[Figure panels: 4 random uncertainties; 10 random uncertainties; a very large number of random uncertainties]
With a very large number of random uncertainties, this is a Gaussian or normal error curve, symmetrical about the mean.
-
Replicate Data on the Calibration of a 10 ml Pipette
No.  Vol, ml    No.  Vol, ml    No.  Vol, ml
 1   9.988      18   9.975      35   9.976
 2   9.973      19   9.980      36   9.990
 3   9.986      20   9.994      37   9.988
 4   9.980      21   9.992      38   9.971
 5   9.975      22   9.984      39   9.986
 6   9.982      23   9.981      40   9.978
 7   9.986      24   9.987      41   9.986
 8   9.982      25   9.978      42   9.982
 9   9.981      26   9.983      43   9.977
10   9.990      27   9.982      44   9.977
11   9.980      28   9.991      45   9.986
12   9.989      29   9.981      46   9.978
13   9.978      30   9.969      47   9.983
14   9.971      31   9.985      48   9.980
15   9.982      32   9.977      49   9.983
16   9.983      33   9.976      50   9.979
17   9.988      34   9.983
Mean volume 9.982 ml    Median volume 9.982 ml
Standard deviation 0.0056 ml
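The summary figures for the pipette data can be checked with a few lines of Python. This is a sketch using the standard `statistics` module; it assumes the quoted standard deviation is the sample form (N - 1 in the denominator), which is what `statistics.stdev` computes.

```python
from statistics import mean, median, stdev

# The 50 delivered volumes (ml), in order of measurement number.
volumes = [
    9.988, 9.973, 9.986, 9.980, 9.975, 9.982, 9.986, 9.982, 9.981, 9.990,
    9.980, 9.989, 9.978, 9.971, 9.982, 9.983, 9.988, 9.975, 9.980, 9.994,
    9.992, 9.984, 9.981, 9.987, 9.978, 9.983, 9.982, 9.991, 9.981, 9.969,
    9.985, 9.977, 9.976, 9.983, 9.976, 9.990, 9.988, 9.971, 9.986, 9.978,
    9.986, 9.982, 9.977, 9.977, 9.986, 9.978, 9.983, 9.980, 9.983, 9.979,
]

vol_mean = mean(volumes)
vol_median = median(volumes)
vol_s = stdev(volumes)  # sample standard deviation (N - 1)

print(f"N = {len(volumes)}")
print(f"mean = {vol_mean:.3f} ml, median = {vol_median:.3f} ml, s = {vol_s:.4f} ml")
```

With N = 50 the choice between N and N - 1 makes no visible difference at this number of figures.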
-
Standard Deviation & Gaussian Curve
Calibration data in graphical form
A = histogram of experimental results
B = Gaussian curve with the same mean value, the same precision (see later) and the same area under the curve as for the histogram.
-
Standard Deviation & Gaussian Curve
Two Gaussian curves with two different standard deviations, σ_A and σ_B (= 2σ_A)
General Gaussian curve plotted in units of z, where
z = (x - μ)/σ
i.e. the deviation from the mean of a datum in units of standard deviation. The plot can be used for data with a given value of mean and any standard deviation.
-
Area under a Gaussian Curve
From the equation above, and illustrated by the previous curves, 68.3% of the data lie within ±1σ of the mean (μ), i.e. 68.3% of the area under the curve lies between ±1σ of μ.
Similarly, 95.5% of the area lies between ±2σ, and 99.7% between ±3σ.
There are 68.3 chances in 100 that for a single datum the random error in the measurement will not exceed ±1σ.
The chances are 95.5 in 100 that the error will not exceed ±2σ.
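These areas can be reproduced from the error function, since the fraction of a Gaussian population within ±kσ of the mean is erf(k/√2). The sketch below uses `math.erf` from the Python standard library; the function name `area_within` is hypothetical.

```python
import math


def area_within(k: float) -> float:
    """Fraction of a Gaussian population within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


areas = {k: area_within(k) for k in (1, 2, 3)}
for k, a in areas.items():
    print(f"within ±{k}σ: {100 * a:.2f} %")
```

The ±2σ value computes to about 95.45 %, which is quoted as 95.5 % above and as 95.4 % in some other texts, depending on how it is rounded.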
-
Sample Standard Deviation, s
The equation for σ must be modified for small samples of data, i.e. small N:
$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$
Two differences compared with the equation for σ:
1. Use the sample mean instead of the population mean.
2. Use the degrees of freedom, N - 1, instead of N.
The reason is that in working out the mean, the sum of the differences from the mean must be zero. If N - 1 values are known, the last value is defined. Thus there are only N - 1 degrees of freedom. For the large values of N used in calculating σ, N and N - 1 are effectively equal.
-
Alternative Expression for sample standard deviation, s (suitable for calculators)
$$s = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N - 1}}$$
Note: NEVER round off figures before the end of the calculation.
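A sketch comparing the definition of s with the calculator form, to show that they give the same number when nothing is rounded along the way. The function names and the five data values are hypothetical, chosen only for illustration.

```python
import math


def s_definition(xs):
    """Sample standard deviation from deviations about the sample mean."""
    n = len(xs)
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))


def s_calculator(xs):
    """Same quantity from running totals of x and x squared."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [2.51, 2.53, 2.49, 2.52, 2.50]  # hypothetical replicate results
s_def = s_definition(data)
s_calc = s_calculator(data)
print(f"definition: {s_def:.5f}, calculator form: {s_calc:.5f}")
```

The calculator form subtracts two nearly equal totals (here about 31.5015 and 31.5005), which is why rounding either total early can wipe out most of the significant figures in s.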
-
Methods for expressing the precision (CV)
VARIANCE: This is the square of the standard deviation:
$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$$
COEFFICIENT OF VARIATION (CV) (or RELATIVE STANDARD DEVIATION):
Divide the standard deviation by the mean value and express as a percentage:
$$CV = \left(\frac{s}{\bar{x}}\right) \times 100\%$$
-
Exercise - Sample Standard Deviation
Reproducibility of a method for determining the content of selenium in foods was studied. 9 measurements were made on a single batch of brown rice.
Sample   Selenium content (μg/g) (x_i)
1        0.07
2        0.07
3        0.08
4        0.07
5        0.07
6        0.08
7        0.08
8        0.09
9        0.08
Based on this data, what can you say about the reproducibility of the method?
-
Answer to exercise
Sample   Selenium content (μg/g) (x_i)   x_i^2
1        0.07                            0.0049
2        0.07                            0.0049
3        0.08                            0.0064
4        0.07                            0.0049
5        0.07                            0.0049
6        0.08                            0.0064
7        0.08                            0.0064
8        0.09                            0.0081
9        0.08                            0.0064
Σx_i = 0.69    Σx_i^2 = 0.0533
Mean = Σx_i/N = 0.077 μg/g    (Σx_i)^2/N = 0.4761/9 = 0.0529
Standard deviation:
$$s = \sqrt{\frac{0.0533 - 0.0529}{9 - 1}} = 0.00707106 \approx 0.007\ \mu g/g$$
Coefficient of variation = 9.2%    Concentration = 0.077 ± 0.007 μg/g
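The answer can be cross-checked in a few lines of Python. This is a sketch using the standard `statistics` module, whose `stdev` uses the N - 1 form; the variable names are illustrative only.

```python
from statistics import mean, stdev

# Selenium content of the nine brown rice measurements, from the exercise.
selenium = [0.07, 0.07, 0.08, 0.07, 0.07, 0.08, 0.08, 0.09, 0.08]

x_bar = mean(selenium)
s = stdev(selenium)     # sample standard deviation (N - 1)
cv = s / x_bar * 100.0  # coefficient of variation, in percent

print(f"mean = {x_bar:.3f}, s = {s:.3f}, CV = {cv:.1f} %")
# mean = 0.077, s = 0.007, CV = 9.2 %
```

Note that the CV is computed from the unrounded mean and standard deviation, and only the printed values are rounded.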