lecture 3 & 4-datahandling1

Upload: muhammad-naufal

Post on 07-Aug-2018


  • 8/20/2019 Lecture 3 & 4-DataHandling1

    1/46

    DATA HANDLING IN ANALYTICAL CHEMISTRY

    SIGNIFICANT FIGURES

    Two types of numbers:

    • Exact number – considered a number with no uncertainty, e.g. a factor in a multiplication (100 in the calculation of a percentage), an exponent (10^2, 10^-3) etc.

    • Measured number – any number other than an exact number; it has a degree of uncertainty, normally in the last digit. E.g. the wt. of a compound is found to be 12.4567 g. This can simply mean 12.4567 ± 0.0001 g.

    What is the length of the wooden stick?

    1) 4.5 cm
    2) 4.54 cm
    3) 4.547 cm

    Measured Numbers

    • Do you see why measured numbers have error? You have to make that guess!
    • All but one of the significant figures are known with certainty. The last significant figure is only the best possible estimate.
    • To indicate the precision of a measurement, the value recorded should use all the digits known with certainty plus the one estimated digit.

    Below are two measurements of the mass of the same object. The same quantity is being described at two different levels of precision or certainty.

    • The weak link in the chain of any analysis is the measurement that can be made with the least accuracy.

    • This means that the least accurate measurement will determine the number of significant figures in the results of any analysis.

    When a measurement is recorded, only those digits that are dependable are written down.

    • If you measured the width of a sheet of paper with your ruler, you might record 21.7 cm.
    • To a mathematician, 21.7, 21.70 and 21.700 are the same.
    • But to a scientist, 21.7 cm and 21.70 cm are NOT the same.
    • 21.700 cm to a scientist means the measurement is accurate to within one thousandth of a cm.
    • However, if you used an ordinary ruler, the smallest marking is the mm, so your measurement has to be recorded as 21.7 cm.

    • If we weigh a sample with an analytical balance, the wt. can be expressed as e.g. 10.1234 g.

    • If we measure the volume of a solution with a burette, the volume can be expressed as e.g. 21.25 ml.

    • This means that a final result calculated from the two values should appropriately contain only 2 decimal places. Why?


    Analytical balance: precise up to 4 decimal places in grams, e.g. 12.0123 g

    Top pan balance: precise up to 2 decimal places in grams, e.g. 12.01 g

    Number of Significant Figures

    • 23.45 ⇒ has 4 significant figures

    • 45678, 4567.8, 456.78, 45.678, 4.5678 and 0.45678 ⇒ each has 5 significant figures regardless of where the decimal point is.

    • 92067 μm = 9.2067 cm = 0.092067 m

    • All have the same number of significant figures, merely expressed in different units.

    The position of zero

    Number          Sig. figs   Rule
    45.8736         6           All digits count
    .000239         3           Leading 0’s don’t
    .00023900       5           Trailing 0’s do
    48000.          5           0’s count in decimal form
    48000           2           0’s don’t count w/o decimal
    3.982 × 10^6    4           All digits count
    1.00040         6           0’s between digits count, as well as trailing 0’s in decimal form
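The zero rules above can be sketched as a small counting routine. This is only an illustration: the function name `count_sig_figs` and the choice to pass numbers as strings (so that trailing zeros and a trailing decimal point are preserved) are assumptions, not part of the lecture.

```python
def count_sig_figs(text: str) -> int:
    """Count significant figures in a decimal number written as a string."""
    # Drop the sign and any power-of-ten part ("3.982e6" or "3.982 x 10^6").
    mantissa = text.strip().lower().lstrip("+-")
    for separator in ("e", "x", "×"):
        mantissa = mantissa.split(separator)[0]
    mantissa = mantissa.strip()

    has_point = "." in mantissa
    digits = mantissa.replace(".", "")
    digits = digits.lstrip("0")        # leading zeros never count
    if not has_point:
        digits = digits.rstrip("0")    # trailing zeros count only in decimal form
    return len(digits)


for number in ["45.8736", ".000239", ".00023900", "48000.", "48000",
               "3.982 x 10^6", "1.00040"]:
    print(number, "->", count_sig_figs(number))
```

Passing the value as text matters: once `48000.` is parsed into a float, the information carried by the written decimal point is lost.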

    • State the number of significant figures in each of these numbers.

    1) 274.03
    2) 1.060 × 10^4
    3) 0.00054
    4) 600
    5) 72000.0
    6) 8.000
    7) 67.030

    The General Analytical Problem

    1. Select sample
    2. Extract analyte(s) from matrix
    3. Separate analytes
    4. Detect, identify and quantify analytes
    5. Determine reliability and significance of results

    Introduction

    • The design of experiments (including size of sample required, accuracy of measurements required and the number of analyses needed) is determined from a proper understanding of what the data will represent.

    • A knowledge of statistical analysis will be required as you perform experiments in the laboratory, in order to understand the significance of the data that are collected.

    Mean

    Defined as follows:

    $$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$

    where $x_i$ = individual values of x and N = number of replicate measurements.

    Median

    The middle result when data are arranged in order of size (for an even number of results, the mean of the middle two). The median can be preferred when there is an “outlier”, i.e. one reading very different from the rest. The median is less affected by an outlier than is the mean.
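A short sketch of why the median can be preferred when an outlier is present. The replicate values below are hypothetical, chosen only for illustration; the helper name `summarize` is likewise an assumption.

```python
import statistics


def summarize(values):
    """Return (mean, median) of a list of replicate results."""
    return statistics.mean(values), statistics.median(values)


# Hypothetical replicates; the extra value 12.5 is a deliberate outlier.
clean = [10.0, 10.1, 10.1, 10.2, 10.3]
with_outlier = clean + [12.5]

mean_clean, median_clean = summarize(clean)
mean_out, median_out = summarize(with_outlier)

print(f"without outlier: mean = {mean_clean:.2f}, median = {median_clean:.2f}")
print(f"with outlier:    mean = {mean_out:.2f}, median = {median_out:.2f}")
```

With these numbers the single outlier shifts the mean far more than it shifts the median.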

    Illustration of “Mean” and “Median”

    Results of 6 determinations of the Fe(III) content of a solution, known to contain 20 ppm:

    Note: The mean value is 19.78 ppm (i.e. 19.8 ppm); the median value is 19.7 ppm (i.e. (19.6 + 19.8)/2).

    Precision

    Relates to reproducibility of results. How similar are values obtained in exactly the same way?

    Useful for measuring this: deviation from the mean:

    $$d_i = |x_i - \bar{x}|$$

    Accuracy

    Measurement of agreement between experimental mean and true value (which may not be known!).

    Measures of accuracy:

    Absolute error: $E = x_i - x_t$ (where $x_t$ = true or accepted value)

    Relative error: $E_r = \frac{x_i - x_t}{x_t} \times 100\%$

    (the latter is more useful in practice)
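As a worked instance of the two error measures, the sketch below applies them to the Fe(III) illustration (mean 19.78 ppm against a known content of 20 ppm). The function names are illustrative assumptions.

```python
def absolute_error(measured: float, true_value: float) -> float:
    """Absolute error E = measured value minus true (accepted) value."""
    return measured - true_value


def relative_error_percent(measured: float, true_value: float) -> float:
    """Relative error E_r, expressed as a percentage of the true value."""
    return (measured - true_value) / true_value * 100.0


mean_fe = 19.78   # ppm, experimental mean from the Fe(III) illustration
true_fe = 20.0    # ppm, known content of the solution

E = absolute_error(mean_fe, true_fe)
E_r = relative_error_percent(mean_fe, true_fe)
print(f"E = {E:.2f} ppm, E_r = {E_r:.1f} %")
```

The negative sign is kept on purpose: it shows that the experimental result lies below the accepted value.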

    Accuracy and Precision: There is a difference

    Accuracy: the degree of agreement between the measured value and the true value.

    Precision: the degree of agreement between replicate measurements of the same quantity. Means the repeatability of a result. Expressed as the standard deviation. The more measurements that are made, the more reliable will be the measure of precision.

    Good precision does not assure good accuracy.

    [Figure: illustrating the difference between “accuracy” and “precision”. Panels: low accuracy, low precision; low accuracy, high precision; high accuracy, low precision; high accuracy, high precision.]

    Errors in Chemical Analysis

    Impossible to eliminate errors. How reliable are our data? Data of unknown quality are useless!

    • Carry out replicate measurements
    • Analyse accurately known standards
    • Perform statistical tests on data

    Types of Error in Experimental Data

    Three types:

    (1) Random (indeterminate) Error

    (2) Systematic (determinate) Error

    (3) Gross Errors
    Usually obvious - give “outlier” readings. Detectable by carrying out sufficient replicate measurements.

    Systematic/determinate Error

    • Readings all too high or too low.
    • Affects accuracy.
    • Systematic errors can be:
        • Constant (e.g. error in burette reading): less important for larger values of reading.
        • Proportional (e.g. presence of a given proportion of interfering impurity in sample): equally significant for all values of measurement.
    • Several possible sources.
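The difference between a constant and a proportional systematic error can be seen by expressing each as a relative error at several reading sizes. The sketch below uses hypothetical numbers (a +0.05 ml reading error and a +1 % bias); the function name and parameters are assumptions made for illustration.

```python
def relative_error_percent(true_value, constant=0.0, proportional=0.0):
    """Relative error (%) of a reading with a constant offset and/or a proportional bias."""
    measured = true_value * (1.0 + proportional) + constant
    return (measured - true_value) / true_value * 100.0


# Hypothetical burette volumes in ml.
for volume in (5.0, 25.0, 50.0):
    const_pct = relative_error_percent(volume, constant=0.05)     # fixed +0.05 ml error
    prop_pct = relative_error_percent(volume, proportional=0.01)  # +1 % bias
    print(f"{volume:5.1f} ml: constant {const_pct:.2f} %, proportional {prop_pct:.2f} %")
```

The constant error shrinks in relative terms as the reading grows, while the proportional error stays at the same percentage for every reading.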

    Sources of Systematic/determinate Error

    Some common determinate errors are:

    (1) Instrumental errors: these include faulty equipment, uncalibrated weights and uncalibrated glassware. Need frequent calibration, both for apparatus such as volumetric flasks, burettes etc. and for electronic devices such as spectrometers.

    (2) Operative errors: these include personal errors and can be reduced by experience and care of the analyst in the physical manipulations involved.

    • Operations – include transfer of solutions, bumping during sample dissolution, incomplete drying of samples etc.

    • Insensitivity to colour changes for spectrophotometric methods

    • Other personal errors – mathematical errors in calculations

    • Prejudice in estimating measurements: tendency to estimate scale readings to improve precision; preconceived idea of “true” value.

    • Difficult to correct for.

    (3) Errors of the method – the most serious errors of an analysis.

    • Errors that are inherent in the method cannot be changed unless the conditions of the determination are altered.

    • Due to inadequacies in physical or chemical behaviour of reagents or reactions (e.g. slow or incomplete reactions)

    • Example: nicotinic acid does not react completely under normal Kjeldahl conditions for nitrogen determination.

    • Reagent blank – analysis on the added reagents only. It is standard practice to run such blanks and to subtract the results from those for the sample.

    Minimisation of Systematic Errors in Chemical Analyses

    Minimise instrument errors by careful recalibration and good maintenance of equipment.

    Minimise personal errors by care and self-discipline.

    Method errors - most difficult. “True” value may not be known. Three approaches to minimise:

    • analysis of certified standards
    • use two or more independent methods
    • analysis of blanks

    Random Errors or Indeterminate Errors

    • Often called accidental errors.
    • Data scattered approx. symmetrically about a mean value.
    • Affects precision - dealt with statistically.
    • Represent the experimental uncertainty that occurs in any measurement.
    • Revealed by small differences in successive measurements made by the same analyst under virtually identical conditions; cannot be predicted or estimated.

    Treatments of Random Errors

    There are always a large number of small, random errors in making any measurement.

    These can be small changes in temperature or pressure; random responses of electronic detectors (“noise”) etc.

    Can use statistical treatment for random errors.

    Standard Deviation

    σ: measure of precision of a population of data, given by

    $$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

    where μ = population mean; N is very large.

    The equation for a Gaussian curve is defined in terms of μ and σ, as follows:

    $$y = \frac{e^{-(x - \mu)^2 / 2\sigma^2}}{\sigma \sqrt{2\pi}}$$
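A minimal sketch of the two population formulas, writing μ for the population mean and σ for the population standard deviation. The function names and the small data set are hypothetical and serve only to show the arithmetic.

```python
import math


def population_std(values, mu):
    """Population standard deviation: root of the mean squared deviation from mu."""
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def gaussian(x, mu, sigma):
    """Height y of the Gaussian (normal error) curve at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical data with a known centre of 10.0.
data = [9.8, 10.0, 10.2, 9.9, 10.1]
sigma = population_std(data, mu=10.0)
print(f"sigma = {sigma:.4f}")

# The curve peaks at x = mu and is symmetrical about it.
print(gaussian(10.0, 10.0, sigma), gaussian(9.9, 10.0, sigma), gaussian(10.1, 10.0, sigma))
```

Because the peak height is 1/(σ√(2π)), doubling σ halves the maximum while the area under the curve stays equal to 1.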

    Standard Error of a Mean

    The standard deviation relates to the probable error in a single measurement. If we take a series of N measurements, the probable error of the mean is less than the probable error of any one measurement.

    The standard error of the mean is defined as follows:

    $$\sigma_m = \frac{\sigma}{\sqrt{N}}$$

    Gaussian curve

    SAMPLE = finite number of observations

    POPULATION = total (infinite) number of observations

    Properties of Gaussian curve defined in terms of population.

    Modifications needed for small samples of data.

    Main properties of Gaussian curve:

    Population mean (μ): defined as earlier (N → ∞). In absence of systematic error, μ is the true value (maximum on Gaussian curve).

    Sample mean ($\bar{x}$): defined for small values of N. (Sample mean ≈ population mean when N > 20.)

    Frequency Distribution for Measurements Containing Random Errors

    [Figure panels: 4 random uncertainties; 10 random uncertainties; a very large number of random uncertainties.]

    For a very large number of random uncertainties, this is a Gaussian or normal error curve, symmetrical about the mean.

    Replicate Data on the Calibration of a 10 ml Pipette

    No.   Vol, ml     No.   Vol, ml     No.   Vol, ml
     1    9.988       18    9.975       35    9.976
     2    9.973       19    9.980       36    9.990
     3    9.986       20    9.994       37    9.988
     4    9.980       21    9.992       38    9.971
     5    9.975       22    9.984       39    9.986
     6    9.982       23    9.981       40    9.978
     7    9.986       24    9.987       41    9.986
     8    9.982       25    9.978       42    9.982
     9    9.981       26    9.983       43    9.977
    10    9.990       27    9.982       44    9.977
    11    9.980       28    9.991       45    9.986
    12    9.989       29    9.981       46    9.978
    13    9.978       30    9.969       47    9.983
    14    9.971       31    9.985       48    9.980
    15    9.982       32    9.977       49    9.983
    16    9.983       33    9.976       50    9.979
    17    9.988       34    9.983

    Mean volume 9.982 ml
    Median volume 9.982 ml
    Standard deviation 0.0056 ml
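The summary values quoted for the pipette calibration can be checked directly from the 50 replicate volumes. The sketch below uses Python's standard `statistics` module; applying the standard-error formula with the sample standard deviation in place of the population value is an assumption made here for illustration.

```python
import math
import statistics

# The 50 replicate volumes (ml) from the calibration table, in order 1..50.
volumes = [
    9.988, 9.973, 9.986, 9.980, 9.975, 9.982, 9.986, 9.982, 9.981, 9.990,
    9.980, 9.989, 9.978, 9.971, 9.982, 9.983, 9.988, 9.975, 9.980, 9.994,
    9.992, 9.984, 9.981, 9.987, 9.978, 9.983, 9.982, 9.991, 9.981, 9.969,
    9.985, 9.977, 9.976, 9.983, 9.976, 9.990, 9.988, 9.971, 9.986, 9.978,
    9.986, 9.982, 9.977, 9.977, 9.986, 9.978, 9.983, 9.980, 9.983, 9.979,
]

mean_volume = statistics.mean(volumes)
median_volume = statistics.median(volumes)
std_dev = statistics.stdev(volumes)                 # sample standard deviation (N - 1)
std_error = std_dev / math.sqrt(len(volumes))       # standard error of the mean

print(f"N = {len(volumes)}")
print(f"mean   = {mean_volume:.3f} ml")
print(f"median = {median_volume:.3f} ml")
print(f"s      = {std_dev:.4f} ml")
print(f"s_m    = {std_error:.4f} ml")
```

The standard error comes out roughly seven times smaller than the standard deviation, since it scales with 1/√50.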

    Standard Deviation & Gaussian Curve

    Calibration data in graphical form

    [Figure] A = histogram of experimental results

    B = Gaussian curve with the same mean value, the same precision (see later) and the same area under the curve as for the histogram.

    [Figure: two Gaussian curves with two different standard deviations, σ_A and σ_B (= 2σ_A).]

    General Gaussian curve plotted in units of z, where

    $$z = \frac{x - \mu}{\sigma}$$

    i.e. the deviation from the mean of a datum in units of standard deviation. The plot can be used for data with any given value of mean and any standard deviation.

    Area under a Gaussian Curve

    From the equation above, and illustrated by the previous curves, 68.3% of the data lie within ±σ of the mean (μ), i.e. 68.3% of the area under the curve lies between ±σ of μ.

    Similarly, 95.5% of the area lies between ±2σ, and 99.7% between ±3σ.

    There are 68.3 chances in 100 that for a single datum the random error in the measurement will not exceed ±σ.

    The chances are 95.5 in 100 that the error will not exceed ±2σ.
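The quoted areas can be reproduced, to within rounding, from the error function in Python's standard library. For a Gaussian population, the fraction lying within k standard deviations of the mean is erf(k/√2); the function name `area_within` is an illustrative assumption.

```python
import math


def area_within(k: float) -> float:
    """Fraction of a Gaussian population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100.0 * area_within(k):.2f} %")
```

The same function gives the chance that the random error in a single datum stays inside the chosen limit.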

    Sample Standard Deviation, s

    The equation for σ must be modified for small samples of data, i.e. small N:

    $$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$

    Two differences cf. the equation for σ:

    1. Use sample mean instead of population mean.

    2. Use degrees of freedom, N - 1, instead of N. The reason is that, in working out the mean, the sum of the differences from the mean must be zero. If N - 1 values are known, the last value is defined. Thus there are only N - 1 degrees of freedom. For large values of N, as used in calculating σ, N and N - 1 are effectively equal.

    Alternative expression for sample standard deviation, s (suitable for calculators)

    $$s = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N - 1}}$$

    Note: NEVER round off figures before the end of the calculation.
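A quick check that the calculator form gives the same sample standard deviation as the definition based on deviations from the sample mean. The data set is hypothetical and the function names are assumptions made for illustration.

```python
import math


def s_definition(values):
    """Sample standard deviation from squared deviations about the sample mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def s_calculator(values):
    """Sample standard deviation from the running sums of x and x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [10.1, 10.3, 10.2, 10.4, 10.0]   # hypothetical replicate results
print(f"definition: {s_definition(data):.6f}")
print(f"calculator: {s_calculator(data):.6f}")
```

The calculator form subtracts two nearly equal numbers, which is why intermediate sums should be carried at full precision and rounding left to the very end.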

    Methods for expressing the precision (CV)

    VARIANCE: This is the square of the standard deviation:

    $$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$$

    COEFFICIENT OF VARIATION (CV) (or RELATIVE STANDARD DEVIATION): Divide the standard deviation by the mean value and express as a percentage:

    $$CV = \frac{s}{\bar{x}} \times 100\%$$

    Exercise - Sample Standard Deviation

    Reproducibility of a method for determining the content of selenium in foods was studied. 9 measurements were made on a single batch of brown rice.

    Sample    Selenium content (μg/g) (x_i)
    1         0.07
    2         0.07
    3         0.08
    4         0.07
    5         0.07
    6         0.08
    7         0.08
    8         0.09
    9         0.08

    Based on this data, what can you say about the reproducibility of the method?

    Answer to exercise

    Sample    Selenium content (μg/g) (x_i)    x_i^2
    1         0.07                             0.0049
    2         0.07                             0.0049
    3         0.08                             0.0064
    4         0.07                             0.0049
    5         0.07                             0.0049
    6         0.08                             0.0064
    7         0.08                             0.0064
    8         0.09                             0.0081
    9         0.08                             0.0064

    Σx_i = 0.69    Σx_i^2 = 0.0533

    Mean = Σx_i/N = 0.077 μg/g    (Σx_i)^2/N = 0.4761/9 = 0.0529

    Standard deviation:

    $$s = \sqrt{\frac{0.0533 - 0.0529}{9 - 1}} = 0.00707106 \approx 0.007$$

    Coefficient of variation = 9.2%    Concentration = 0.077 ± 0.007 μg/g
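The arithmetic of the answer can be retraced in a few lines, using the calculator form of the sample standard deviation on the nine selenium results. Variable names are illustrative; the values are those of the exercise table.

```python
import math

# Selenium content of the nine brown rice measurements (μg/g), samples 1 to 9.
selenium = [0.07, 0.07, 0.08, 0.07, 0.07, 0.08, 0.08, 0.09, 0.08]

n = len(selenium)
sum_x = sum(selenium)                       # sum of x_i
sum_x2 = sum(x * x for x in selenium)       # sum of x_i squared
mean = sum_x / n

# Calculator form; nothing is rounded until the final print statements.
s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
cv = s / mean * 100.0

print(f"sum x = {sum_x:.2f}, sum x^2 = {sum_x2:.4f}")
print(f"mean = {mean:.3f} ug/g, s = {s:.5f} ug/g, CV = {cv:.1f} %")
```

A coefficient of variation of about 9 % means the scatter is close to a tenth of the measured value, which is a fairly large relative spread for a result reported to one significant figure.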