Systematic errors in analytical measurement results

Journal of Chromatography A, 1158 (2007) 25–32

Review

D. Brynn Hibbert
School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
Tel.: +61 2 9385 4713; fax: +61 2 9385 6141. E-mail address: [email protected]

Available online 15 March 2007
doi:10.1016/j.chroma.2007.03.021

Abstract

Definitions of the concepts of bias and recovery are discussed and approaches to dealing with them described. The Guide to Uncertainty in Measurement (GUM) recommends correction for all significant systematic effects, but it is also possible to expand measurement uncertainty to take account of uncorrected bias. Run, laboratory and method bias can be defined as components of the bias of a particular measurement result, and can be useful as concepts in method validation. Estimation of run bias allows a simplification of the estimation of measurement uncertainty. Multivariate calibration brings its own biases that must be quantified and minimised. © 2007 Elsevier B.V. All rights reserved.

Keywords: Bias; Recovery; Measurement uncertainty; Metrological traceability; Systematic errors

Contents

1. Introduction ........................................................... 25
2. Definitions ............................................................ 26
3. Components of the systematic error of a measurement result ............ 26
   3.1. Sampling bias ..................................................... 26
   3.2. Calibration bias .................................................. 27
   3.3. Recovery .......................................................... 27
   3.4. Analytical biases ................................................. 28
   3.5. Measurement of method bias by inter-laboratory studies ............ 29
4. Bias of empirical methods ............................................. 29
5. Treatment of bias ..................................................... 30
6. Some examples of systematic errors in practice ........................ 30
7. Conclusions ........................................................... 31
References .............................................................. 31

1. Introduction

The concepts of 'bias' (and 'recovery') are important aspects of the understanding of a measurement result in analytical chemistry. This paper will discuss the present definitions and will review different approaches to dealing with systematic effects. In addition to the metrological debate, field laboratories need to be able to estimate and, if necessary correct for, systematic effects. Examples of present practice will be given.

The concept of bias of a measurement result is best understood in terms of a measurement model that recognises systematic and random components of error.

x̂ = x + δ + ε (1)

The true value of a measurand, x, is estimated by x̂, which differs from it by a systematic component, the bias δ, and a random component ε. The random error is considered to be Normally distributed with expectation zero and standard deviation σ. Therefore, a large number of measurements will have a mean of (x + δ), as shown in Fig. 1. A single measurement result cannot distinguish between systematic and random error, but several

measurements combined with knowledge about the characteristics of the method can allow calculation of an interval about x̂ that contains the true value with a certain level of confidence.

x = x̂ ± U (2)

U is known as the expanded uncertainty [1] and is obtained from considerations of all aspects of the uncertainty of the measurement result. The so-called GUM approach (after the Guide to the Expression of Uncertainty in Measurement [1]) was first published in 1993 and has been the basis of the recommended methods for characterising a measurement result. A requirement of the international standard for testing laboratories (ISO/IEC 17025 [2]) is that an appropriate measurement uncertainty be estimated for the results. One reason that the traditional concepts of systematic and random error have been subsumed into the uncertainty approach is that, depending on the information used, one kind of error can be turned into another; there is therefore no general definition of these terms, and the measuring system must be described very carefully. This discussion will be expanded below.

Fig. 1. Schematic representation showing the relationship between random and systematic error in a measurement result.
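The error model of Eq. (1) can be illustrated with a short simulation. This is a sketch, not part of the paper; the values of x, δ and σ below are arbitrary assumptions chosen for illustration:

```python
import random

# Simulate Eq. (1), x_hat = x + delta + epsilon, to show that averaging
# reduces random error but leaves the systematic error (bias) untouched.
# x_true, delta and sigma are invented values for illustration only.
random.seed(1)

x_true = 10.0   # true quantity value x
delta = 0.5     # systematic error (bias)
sigma = 0.2     # standard deviation of the random error epsilon

results = [x_true + delta + random.gauss(0.0, sigma) for _ in range(10000)]
mean = sum(results) / len(results)

# The mean converges to x + delta, not to x: no number of repeats removes the bias.
print(f"mean of results = {mean:.3f} (x + delta = {x_true + delta})")
```

However many replicates are averaged, the mean settles on x + δ, which is the point Fig. 1 makes graphically.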

2. Definitions

In measurement science there is a need to carefully define the basic terms and concepts on which the subject rests. Fundamental terms such as "measurement" must mean the same to a chemist as to an astronomer or psychologist. All major international bodies with an interest in measurement, including the International Bureau of Weights and Measures, ISO, and IUPAC, have come together to revise the International Vocabulary of Basic and General Terms in Metrology, in order to provide this sought-after common basis. In the forthcoming third edition (VIM3, [3]) many of the changes have been made to accommodate chemical measurement. Measurement bias (synonym bias) will be defined as:

2.19 measurement bias; bias
systematic measurement error or its estimate, with respect to a reference quantity value

where bold terms are also defined in the VIM. Systematic measurement error is defined as:

2.18 systematic error of measurement; systematic error
component of measurement error that in replicate measurements remains constant or varies in a predictable manner

NOTES
(1) The reference quantity value for a systematic measurement error is a true quantity value, or a measured quantity value of a measurement standard of negligible measurement uncertainty, or a conventional quantity value.
(2) Systematic measurement error, and its causes, can be known or unknown. A correction can be applied to compensate for a known systematic measurement error.
(3) Systematic measurement error equals the difference of measurement error and random measurement error.

The earlier definition, and the one used by ISO 5725 [4], is:

3.8 bias: The difference between the expectation of the test results and an accepted reference value.
NOTE 5: Bias is the total systematic error as contrasted with random error. There may be one or more systematic error components contributing to the bias. A larger systematic difference from the accepted reference value is reflected by a larger bias value.

The depiction in Fig. 1 is consistent with these definitions. The definitions also imply that bias can be estimated by measurement of a reference quantity value, and then subsequently corrected for.

3. Components of the systematic error of a measurement result

In the definitions given above there is no distinction made among different sources of systematic error. However, identification of the source of systematic error can impact on its estimation and treatment. The headings below represent sources that have been proposed as worthy of individual attention. They may overlap, and it must be remembered that for a single measurement result there is but one, unknowable, 'measurement error': the difference between x̂ and x in Eq. (1).

3.1. Sampling bias

When the measurand is a quantity of a larger whole, sampling error can be a major systematic effect, and will not be treated in the same way as effects in the laboratory procedures. A goal of a sampling protocol is often to randomise effects that can then be treated statistically [5]. Ramsey [6] has pointed out that the traditional assertion of random sampling does not guarantee the desired result. 'Analytical bias' (Ramsey's term for
systematic effects arising during the laboratory measurement, to distinguish the effect from 'sampling bias') is usually estimated by measurement of a reference material. By analogy, sampling bias can be estimated by use of a reference sampling target (RST). The RST is synthesized to have a known concentration of analyte [7], or it is a routine sample that has been selected for the purpose and its quantity value established by an inter-laboratory study [8]. The certified value may also be specified for its spatial extent. The second method suggested by Ramsey [6], which is designed to randomise sampling bias, is to use multiple sampling protocols, again in a collaborative study. Each different bias will contribute to the sampling variance in an assumed random fashion. Therefore, a realistic uncertainty due to sampling can be estimated and a decision regarding whether a measurement will be fit for purpose made. Ramsey calls this 'appropriate' sampling.

3.2. Calibration bias

Calibration bias for direct reading instruments (where the indication of the measuring instrument is expressed in the same quantity as the standard) has been identified by Cuadros-Rodriguez et al. [9], in contrast to factors employed in indirect calibration. An example is a direct-reading balance, for which the bias, measured by weighing a calibrated mass, is subtracted from subsequent measurements as a correction. The correction can be an absolute value or a relative correction factor that multiplies the uncorrected result.

Indirect calibration is required for the majority of analytical measurements, and involves the establishment of the relation between the indication of the measuring instrument (counts, voltages, currents, peak areas, etc.) and values of the quantity being measured. If the calibration equation holds for the measuring system, is linear in the coefficients, and the random components of error are known to be constant or proportional to the quantity value, then there are algebraic solutions for the coefficients and uncertainties of quantity value estimates [10]. A bias arises when the calibration equation does not fit the quantity value-response relation; for example, when a straight line is forced through data that curve at higher concentrations. The potential error in misusing straight line calibrations has been demonstrated by Mulholland and Hibbert [11], and that of employing an inappropriate function by Kirkup and Mulholland [12]. It is important, therefore, that in establishing the calibration the adequacy of the mathematical model is demonstrated.

For the case of multivariate calibration (for example, principal components regression, ridge regression and partial least squares regression), because the calibration function is usually a linear combination of variables, there is almost inevitable bias. Kalivas [13] discusses this in a review in which he describes the "bias/variance" trade-off in the optimal harmonious model. Bias in multivariate calibration may be expressed as the root mean square error of calibration or, by using a separate data set, the root mean square error of prediction (or validation). Leave-one-out cross validation methods also give estimates of bias.
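The bias estimates mentioned above, the root mean square error of calibration and a leave-one-out cross-validated counterpart, can be sketched for the simple straight-line case. This is an illustrative sketch only; the calibration data below are invented:

```python
# RMSEC (root mean square error of calibration) and a leave-one-out
# cross-validated RMSE for a straight-line fit. Data are invented.

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b

def rmse(pairs):
    return (sum((yi - fi) ** 2 for yi, fi in pairs) / len(pairs)) ** 0.5

conc = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]          # standard concentrations
resp = [0.02, 0.98, 2.05, 2.95, 4.10, 4.95]    # instrument responses

a, b = fit_line(conc, resp)
rmsec = rmse([(y, a + b * x) for x, y in zip(conc, resp)])

# Leave-one-out cross validation: refit without each point, predict it back.
loo = []
for i in range(len(conc)):
    xs = conc[:i] + conc[i + 1:]
    ys = resp[:i] + resp[i + 1:]
    ai, bi = fit_line(xs, ys)
    loo.append((resp[i], ai + bi * conc[i]))
rmsecv = rmse(loo)

print(f"RMSEC = {rmsec:.3f}, LOO-CV RMSE = {rmsecv:.3f}")
```

The cross-validated figure is always at least as large as RMSEC, which is why it gives the more honest estimate of prediction bias for an unseen sample.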

3.3. Recovery

Although incomplete recovery contributes systematic errors to a measurement result, and so is included in this discussion, the actions that give rise to the need to consider recovery perhaps should be seen as a part of the overall measurement procedure. Indeed, one definition of the term concerns the physical separation of an analyte from a matrix, as opposed to our ability to measure a quantity in the course of an analysis. Confusion among the different terminology and interpretations of the concept 'recovery' still permeates the analytical community. For a flavour of the problems in defining this term, see the concise paper of Dybkaer [14], which was written in response to two IUPAC recommendations [15,16]. Here we shall adopt the Dybkaer terminology, although there appears to be no great dispute about the intent of the definitions of the concept. If in a chemical analysis it is necessary to apply a procedure to the sample to be analysed that separates or derivatises or otherwise changes the sample before presentation to the measuring instrument, there is a possibility that not all of the analyte will be measured by the instrument. The ratio of the measured amount (Dybkaer: the initially estimated quantity) and the amount actually present in the sample (Dybkaer: the actual amount) is called the recovered quantity ratio (colloquially the 'recovery') [14]. The problem arises when the recovery has to be estimated, in order to apply a recovery factor to subsequent measurement results. For a routine measurement the actual amount is perforce unknown, and so a separate procedure is needed to establish the recovery factor. If a well characterised matrix-matched certified reference material is available, the recovery factor is simply the mean of a suitably large number of measurement results on this material divided by its certified quantity value (see Eq. (3)). A statistical test for a null hypothesis of R = 1 can be performed to decide if any correction is to be applied to routine results, and then an uncertainty component must be included when the combined standard uncertainty is calculated for the corrected or uncorrected field measurement result.

R = (Σᵢ₌₁ᵖ C(ref)meas,i)/(p C(ref)cert) = C̄(ref)meas/C(ref)cert (3)

u(R)/R = √[(u(C(ref)cert)/C(ref)cert)² + (1/p)(u(C(ref)meas)/C(ref)meas)²] (4)

where C(ref)meas is the initially estimated quantity that is measured p times to give the mean, and C(ref)cert the actual amount, which is usually certified in the documentation accompanying the reference material. The standard uncertainty of the estimate of the quantity value of the reference material is u(C(ref)meas) and the standard uncertainty of the reference value itself is u(C(ref)cert). In the majority of cases there is no matrix-matched reference material, and so the commutability of any material used must be considered. Thus, if a blank matrix material is spiked with neat analyte, the question arises as to what extent the analyte is taken up in the matrix in the same manner as a field sample. Perhaps the spike will be recovered more easily? In principle recovery is estimated in the same
way, by replicate measurements of the spiked material. The uncertainty of the recovery factor must now include an estimate of the error arising from lack of commutability. This component is difficult to estimate and might be of a magnitude that makes it difficult to observe a significant recovery. Under this regime, equations for the recovery and its uncertainty become

R = (Σᵢ₌₁ᵖ C(ref)meas,i)/(p C(ref)grav) × f = (C̄(ref)meas/C(ref)grav) × f (5)

u(R)/R = √[(u(C(ref)grav)/C(ref)grav)² + (1/p)(u(C(ref)meas)/C(ref)meas)² + u(f)²] (6)

C(ref)grav is the quantity value of the analyte in a gravimetrically prepared spike, and f appears formally in the equation for R as a correction for the lack of commutability, with a value of 1 and uncertainty u(f).
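The recovery calculation of Eqs. (3) and (4), together with the test of the null hypothesis R = 1 described above, can be sketched as follows. The replicate values, certified value and uncertainties are invented for illustration, and u(C(ref)meas) is taken here as the standard deviation of the replicates:

```python
import math

# Sketch of Eqs. (3) and (4): recovery factor R from p replicate
# measurements of a matrix-matched CRM, with its standard uncertainty.
# All numerical values are invented for illustration.

crm_meas = [9.3, 9.6, 9.1, 9.5, 9.4, 9.2]   # C(ref)meas, p replicates
c_cert = 10.0                                # C(ref)cert, certified value
u_cert = 0.10                                # u(C(ref)cert)

p = len(crm_meas)
mean_meas = sum(crm_meas) / p
# u(C(ref)meas) taken as the sample standard deviation of the replicates
s_meas = math.sqrt(sum((c - mean_meas) ** 2 for c in crm_meas) / (p - 1))

R = mean_meas / c_cert                                        # Eq. (3)
u_R = R * math.sqrt((u_cert / c_cert) ** 2
                    + (1.0 / p) * (s_meas / mean_meas) ** 2)  # Eq. (4)

# Test of the null hypothesis R = 1: compare |R - 1| / u(R) with a
# coverage factor, here ~2 for roughly 95 % confidence.
significant = abs(R - 1.0) / u_R > 2.0
print(f"R = {R:.3f}, u(R) = {u_R:.3f}, correction needed: {significant}")
```

If the test is significant, routine results are divided by R and u(R) enters the combined standard uncertainty; if not, R is left at 1 but an uncertainty component for the untested recovery should still be carried.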

3.4. Analytical biases

It is possible to conceive of bias terms arising from different aspects of the analytical process. A rather poor method that consistently gave low results would not be susceptible to improvement by more quality control at the laboratory level, whereas a hapless analyst who consistently misapplies the method procedure will contribute to that laboratory's bias only. Uncorrected, a biased method would be unlikely to give results that could be compared with those from a different method. The Analytical Methods Committee of the Royal Society of Chemistry [17] in 1995, and then later Thompson [18], split the bias term of Eq. (1) into contributions from the method, laboratory and run, which together form a 'ladder of errors' [18]. Fig. 2 shows the relationships among these concepts.

Fig. 2. Systematic errors for a method with a large inherent method bias, showing the relationship between method, intra-laboratory and run bias as defined by Thompson [18].

O'Donnell and Hibbert [19,20] have considered bias from the VIM definition and thus discuss different biases in relation to the conditions under which the measurement result is obtained. They emphasise that only a single systematic error can be measured in a given set of experiments, whether done under repeatability conditions (run bias), intra-laboratory reproducibility conditions (laboratory bias) or as part of an inter-laboratory method validation study (method bias). How lower level biases become random components as the analysis moves up the hierarchy is shown in Fig. 3. For example, if an individual analyst uses the same pipette for preparing solutions, any bias from the nominal volume will be included in all measurements made. No matter how many repeats are made, there will be a systematic error. However, when a result is the mean of measurements made across many laboratories, the biases of individual pipettes now average to (hopefully) zero, but these biases now contribute to the overall variance. This is shown in the figure by different run means with small variances being combined into an overall inter-laboratory mean with larger variance (Fig. 3). Although the usages of the terms run bias, laboratory bias and method bias differ, it can be seen how they are related from Figs. 2 and 3. It is to be noted that ISO 5725 also defines laboratory bias, bias of the measurement method and laboratory component of bias, which mirror some of the usage here [4].

Fig. 3. A hierarchy of experimental conditions showing how run bias is randomised in intra-laboratory studies, and laboratory bias is randomised in inter-laboratory method validation studies.

3.5. Measurement of method bias by inter-laboratory studies

ISO 5725, part 4 [21] provides the statistical basis for an inter-laboratory study to estimate the bias associated with a method. This approach is used by organizations that publish standard methods of analysis to determine bias in the course of a method validation campaign. As with other inter-laboratory studies [22], laboratories and materials distributed are chosen with the aims of the study in mind, which are to estimate the magnitude of the bias of the measurement method and to determine if it is statistically significant. If the bias is found to be statistically insignificant, then a further objective is to determine the magnitude of the maximum bias that would, with a certain probability, remain undetected by the results of the experiment (i.e. the power of the test). Laboratories are chosen as a competent and representative selection of field laboratories. The number of laboratories and replicate materials sent to each laboratory is determined by the minimum bias that it is wished to discover. When the results are returned they are checked for homogeneity of variance (Cochran's test) and the bias and its 95% confidence interval calculated. As with all inter-laboratory studies, the organizing panel will scrutinize results and issue a report to the sponsoring organization. This standard also shows how laboratory bias can be estimated from results from a single laboratory if an inter-laboratory study has previously established the repeatability standard deviation of the method. Hund et al. [23] also mention estimation of method and laboratory bias by inter-laboratory studies.

Kuselman [24] treats results from a proficiency testing study in terms of individual laboratories having significant or insignificant bias compared with the certified value of a reference material. The inter-laboratory mean and standard deviation from the population of the participating laboratories are used to establish criteria for rejecting the null hypothesis, or accepting alternative hypotheses.
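The homogeneity-of-variance check mentioned above can be sketched with Cochran's test statistic, the ratio of the largest replicate variance to the sum of all laboratory variances. The variances and the critical value used here are illustrative assumptions; in practice the critical value is read from the tables accompanying ISO 5725:

```python
# Sketch of Cochran's test for homogeneity of variance applied to the
# replicate variances returned by laboratories in an inter-laboratory
# study. The variances and the critical value are invented for illustration.

lab_variances = [0.04, 0.05, 0.03, 0.06, 0.21, 0.04]  # s_i^2 for each laboratory

C = max(lab_variances) / sum(lab_variances)  # Cochran's test statistic

# Assumed critical value for illustration only; it depends on the number
# of laboratories and the number of replicates per laboratory.
C_crit = 0.445
outlier = C > C_crit
print(f"Cochran C = {C:.3f}; variance outlier present: {outlier}")
```

A laboratory flagged by the test is investigated (and possibly excluded) before the method bias and its 95% confidence interval are calculated.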

4. Bias of empirical methods

An empirical method, one for which the measurand is defined by the method, has zero method bias by definition when used exactly according to the prescribed method. An example is the measurement of chemical oxygen demand (COD) by ISO 6060, in which a given amount of dichromate is reduced by oxidisable material in the sample according to a carefully prescribed method [25]. Drolc et al. [26] have discussed the uncertainty of this method, and also demonstrated the absence of run bias using certified reference materials whose COD had been established by the method. However, it is noted that the uncertainty of the insignificant bias was not included in the uncertainty budget. If bias is measured, even if it is insignificant, there must be an uncertainty associated with that estimate, which should be included in the overall uncertainty budget. The only exception would be for a well characterized method in which there is no reason to expect bias, but which is checked as part of QC procedures. By analogy with the certification of reference materials for stability, measurements that just verify stability do not add to the uncertainty [27].
5. Treatment of bias

Authoritative guides on the treatment of measurement results and the estimation of measurement uncertainty agree that all significant bias should be estimated and corrected for [1,28]. Section 3.2.4 of the GUM reads "It is assumed that the result of a measurement has been corrected for all recognised significant systematic effects and that every effort has been made to identify such effects". The example given is of a clear case in physical measurement in which the finite impedance of a voltmeter gives rise to a bias, the form of which is well known and related to the measurable impedance. In chemical measurement, as has been shown, the nature of bias can be less obvious and arise at different stages of an analysis. Because of the greater complexity of systematic errors in chemistry, and perhaps because a particular value of the bias will be unique to each measurement, some sectors have tended to avoid explicit correction for bias and instead have used an estimate of the magnitude of the bias to augment the measurement uncertainty. O'Donnell and Hibbert [19] have reviewed such methods in comparison with the GUM-recommended approach by Monte Carlo simulation. Correction of a measurement result, with inclusion of the uncertainty of the correction in the combined standard uncertainty, is demonstrated to be the best approach. If a correction is not to be done, then the method known as SUMUMax [29], in which the absolute value of the estimate of run bias (δrun as shown in Fig. 3) is added to the expanded uncertainty, gives the best estimate, albeit a consistent overestimate.

U(Ctest) = k uc(Ctest) + |δrun| (7)

uc is the combined standard uncertainty and k is the coverage factor, which is determined by the required level of confidence and the number of degrees of freedom of uc.
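Eq. (7) is simple enough to state directly in code. The numerical values below are invented for illustration:

```python
# Sketch of Eq. (7), the SUMUMax approach: instead of correcting for run
# bias, its absolute value is added to the expanded uncertainty.
# All numerical values are invented for illustration.

u_c = 0.12          # combined standard uncertainty u_c(C_test)
k = 2.0             # coverage factor for ~95 % confidence
delta_run = -0.08   # estimated run bias

U = k * u_c + abs(delta_run)   # Eq. (7): U(C_test) = k*u_c(C_test) + |delta_run|
print(f"expanded uncertainty U = {U:.2f}")
```

Because |δrun| is added whatever its sign, the interval is widened on both sides, which is why the approach consistently overestimates the uncertainty.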

The definition of run bias as the difference between the average of results obtained under repeatability conditions and the true value leads to a straightforward approach to correction for bias in batch analyses and estimation of measurement uncertainty. In a routine batch measurement, if suitable matrix reference materials are available, then run bias can be estimated and, if significant, a correction applied. O'Donnell and Hibbert [30] argue that if this is done, then the only components of the measurement uncertainty are the repeatability precision that pertains to the run conditions under which the measurement and bias estimation are being made, and the uncertainty of the correction. For a bias correction this becomes

δrun = (Σ₁ᵖ C(CRM)meas)/p − C(CRM)cert = C̄(CRM)meas − C(CRM)cert (8)

C̄(sample) = (Σ₁ⁿ C(sample)meas)/n − δrun (9)

uc(C̄(sample)) = √[(ur(C(sample))/√n)² + (ur(C(CRM)meas)/√p)² + u(C(CRM)cert)²] (10)

where a certified reference material (CRM), appropriately matrix matched, is measured p times and the test sample is measured n times, all under repeatability conditions with standard uncertainty ur. The correction in Eq. (9) is applied if the value of δrun is significantly different from zero by a Student's t-test

t = δrun/u(δ) (11)

where the standard uncertainty of the estimate of the bias is the combination of the measurement and certification terms in Eq. (10):

u(δ) = √[(ur(C(CRM)meas)/√p)² + u(C(CRM)cert)²] (12)

If it can be demonstrated that the measurements are all made under repeatability conditions, for example using the tests in ISO 5725 [31], then the appropriate uncertainty (ur) is the repeatability from quality control data with essentially infinite degrees of freedom.

An example of the analysis of the concentration of creatinine in urine by a spectrophotometric method in a commercial analyser has been given by O'Donnell and Hibbert [30].

. Some examples of systematic errors in practice

Multivariate methods of calibration cause concern about bias relative to more traditional methods. Sugar content of cane was measured by principal components regression of mid-infrared data. A bias for sucrose is reported as 0.041 g/100 mL, which is claimed to be better than that for direct polarimetry [32]. Other infrared applications for which bias in transferring a calibration from one instrument to another is important include the analysis of red grapes [33], the properties of wood in the presence of blue stain [34], and the analysis of petrochemicals [35]. The more general question of the robustness of multivariate calibration using near infrared is discussed by Zeaiter et al. [36].

An example which is titled "universal bias", but which might be called sampling bias, has been identified in geophysics with the analysis of rainwater. Ayers et al. [37] showed that biological effects on the ionic composition of rainwater are not restricted to the previously reported pH. Ammonium, potassium, nitrate, sulfate, methanesulfonate, and phosphate ions are also removed by biological processes, but remain in the rainwater in biomass. The implication is that most previous rainwater composition studies based on ionic analyses will have systematically underestimated nutrient deposition. More careful consideration of the definition of the measurand, and its relation to the measurement function, might reveal that the problem lies in this definition, rather than bias as such. The analysis of bias in geochemical analysis is of concern in a work by de Castilho [38], who reports Monte Carlo simulations to test statistical methods for detecting analytical bias.




Charlet and Marschal [39] report the use of certified reference materials in making metrologically sound measurements of heavy metals in groundwater. They underline the role of a matrix CRM in estimating bias in method validation. Muller [40] also emphasises the use of appropriate reference materials to establish the analytical bias of routine methods in laboratory medicine.

Heydorn and Anglov propose a new approach for estimating the measurement uncertainty of methods for which a functional form of the reproducibility standard deviation of the measurement result can be assigned. The variation of the standard deviation implies that a simple linear or weighted regression to produce the calibration function is not optimal, and they give the analysis of synthetic lead solutions by ICP-AES as an example [41]. In this example bias is shown to be borderline significant (at 5%), but it is not clear if the uncertainty of the bias estimate is included in their uncertainty budget.
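The consequence of a concentration-dependent standard deviation can be illustrated with a short sketch (synthetic data, not the data of ref. [41], and only the textbook unweighted versus 1/u² weighted least-squares contrast, not Heydorn and Anglov's own approach): when the response SD grows with concentration, the two fits give different calibration lines, so the choice of regression model matters.

```python
import numpy as np

# Synthetic calibration data with an assumed functional form for the SD:
# the response standard deviation grows linearly with concentration.
rng = np.random.default_rng(1)
conc = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0])
u = 0.01 + 0.05 * conc                       # SD of the signal at each level
signal = 1.0 * conc + rng.normal(0.0, u)     # true slope 1, heteroscedastic noise

# Unweighted (ordinary) least squares
b_ols, a_ols = np.polyfit(conc, signal, 1)

# Weighted least squares with weights 1/u^2
w = 1.0 / u**2
W = w.sum()
xbar = (w * conc).sum() / W
ybar = (w * signal).sum() / W
b_wls = (w * (conc - xbar) * (signal - ybar)).sum() / (w * (conc - xbar) ** 2).sum()
a_wls = ybar - b_wls * xbar

print(f"OLS: slope {b_ols:.4f}, intercept {a_ols:.4f}")
print(f"WLS: slope {b_wls:.4f}, intercept {a_wls:.4f}")
```

The unweighted fit is dominated by the noisy high-concentration points; the weighted fit anchors the line at the precise low end, which changes results back-calculated near the detection limit.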

In electrophoretic analysis in microchips, Lacharme and Gijs [42] have shown a bias (called "sample bias") on the same chip between injections by the electrokinetic effect and by back-gate pressure injection for Rhodamine B, but not for fluorescein. Apparently Rhodamine B flows faster during electrokinetic injection. This shows the importance of proper tests for bias during development and validation of a method.

Two recent approaches to the treatment of bias in new analytical methods based on chromatography, by Holden et al. [43] and Yang et al. [44], will be discussed. Holden and her team at NIST (National Institute of Standards and Technology) have reported making metrologically traceable (to the SI) measurements of the amount of DNA using HPLC (and high performance inductively coupled optical emission spectroscopy, HP-ICP-OES) for the phosphorus content of thymidine 5′-monophosphate (TMP) as phosphate. The purity of a TMP standard used to measure bias was determined by two chromatographic methods, both using diode array detection, to be better than 99%. Standard solutions made up to measure the bias of the HPLC and HP-ICP-OES methods investigated were weighed on five-figure balances with buoyancy correction. Contributions to the uncertainty of the mass fraction of TMP included uncertainties in: the mass of TMP, the mass of the 5% HCl digestion solution, the Karl Fischer determination of the water content of TMP, and possible undetected TMP impurities. Analysis of phosphate was carried out using a Dionex DX 500 ion chromatograph with an ED 50 conductivity detector. The retention time of phosphate ion was 7.0 min. The phosphate peak was well separated from the chloride and nitrate ion peaks, which had retention times less than 4 min. For quantitative analysis, a standard phosphate solution (made from a NIST SRM) was injected before and after each injection of the dilute digested phosphate sample solution. Full GUM uncertainty budgets were prepared, and bias and expanded uncertainty reported. The expanded uncertainty of the HPLC method was about 1%, with biases measured from 3 to 5%. The extreme care and attention to detail that is clear from this paper gives high confidence in the reported characteristics of the method.

Yang et al. [44] report the validation of the measurement of apomorphine in canine plasma by liquid chromatography–electrospray ionization mass spectrometry. Using what appears to be the same standard material for calibration and recovery, they quote a relative standard deviation of less than 5.9% and a relative bias of less than 7.5%. The method calibrates the ratio of areas under the protonated molecular ion m/z 268 for apomorphine and m/z 234 for the internal standard mentazinol, and measures bias as the difference between the nominal concentration of a QC spike and the measured concentration, expressed as a percentage. Apart from the concern of using the same standard for calibration and quality control (of which the traceability of the purity is not asserted), this might be seen as following the O'Donnell approach discussed above, although Yang et al. do not go on to claim their RSD values as measurement uncertainty, nor do they state that the bias is not significant and therefore not corrected.
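Yang et al.'s figures of merit can be sketched in a few lines (the QC numbers below are hypothetical, chosen only to show the arithmetic):

```python
# Relative bias as defined by Yang et al. [44]: the difference between the
# nominal concentration of a QC spike and the mean measured concentration,
# expressed as a percentage of the nominal value. Numbers are illustrative.
qc_nominal = 50.0                           # nominal QC spike concentration (assumed units)
qc_measured = [52.1, 51.4, 53.0, 52.5, 51.8]

n = len(qc_measured)
mean = sum(qc_measured) / n
sd = (sum((x - mean) ** 2 for x in qc_measured) / (n - 1)) ** 0.5

relative_bias = 100.0 * (mean - qc_nominal) / qc_nominal   # percent
rsd = 100.0 * sd / mean                                    # percent

print(f"relative bias = {relative_bias:.1f}%, RSD = {rsd:.1f}%")
```

Note that the RSD here characterises precision only; as discussed above, it is not itself a measurement uncertainty unless the bias and its uncertainty are also accounted for.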

These two papers show that the rigor with which methods are validated depends on the ultimate use of the method and the requirements of potential analysts.

7. Conclusions

A proper understanding of systematic effects in chemical analysis is a requirement for the production of metrologically sound measurement results. In method validation the inherent characteristics of the method, which include bias, must be established by an appropriate statistically designed campaign. Estimation of run bias, whether deemed significant or not, allows for a simplified assessment of measurement uncertainty. To maintain the metrological traceability of a measurement result, any estimates of systematic effects that are subsequently used to correct initially estimated amounts must themselves be traceable. This is best done using appropriate matrix-matched certified reference materials.

References

[1] Guide to the Expression of Uncertainty in Measurement (ISO), Geneva, 1995.
[2] ISO/IEC 17025, General requirements for the competence of calibration and testing laboratories (ISO), Geneva, 2005.
[3] Joint Committee for Guides in Metrology, International vocabulary of basic and general terms in metrology (ISO), Geneva, third ed., 2007.
[4] ISO 5725-1, Accuracy (trueness and precision) of measurement methods and results—Part 1: General principles and definitions (ISO), Geneva, 1994.
[5] ISO 11648-1, Statistical aspects of sampling from bulk materials—Part 1: General principles (ISO), Geneva, 2003.
[6] M.H. Ramsey, Accred. Qual. Assur. 7 (2002) 274.
[7] M.H. Ramsey, S. Squire, M.J. Gardner, Analyst 124 (1999) 1701.
[8] A. Argyraki, M.H. Ramsey, M. Thompson, Analyst 120 (1995) 2799.
[9] L. Cuadros-Rodriguez, L. Gamiz-Gracia, E. Almansa-Lopez, J. Laso-Sanchez, Trends Anal. Chem. 20 (2001) 195.
[10] D.B. Hibbert, Analyst 131 (2006) 1273.
[11] M. Mulholland, D.B. Hibbert, J. Chromatogr. A 762 (1997) 73.
[12] L. Kirkup, M. Mulholland, J. Chromatogr. A 1029 (2004) 1.
[13] J.H. Kalivas, Anal. Lett. 38 (2005) 2259.
[14] R. Dybkær, Accred. Qual. Assur. 10 (2005) 302.
[15] M. Thompson, S. Ellison, A. Fajgelj, R. Willetts, R. Wood, Pure Appl. Chem. 71 (1999) 337.
[16] D.T. Burns, K. Danzer, A. Townshend, Pure Appl. Chem. 74 (2002) 2201.
[17] Analytical Methods Committee, Analyst 120 (1995) 2303.
[18] M. Thompson, Analyst 125 (2000) 2020.
[19] G.E. O'Donnell, D.B. Hibbert, Analyst 130 (2005) 721.
[20] D.B. Hibbert, Quality Assurance for the Analytical Chemistry Laboratory, Oxford University Press, New York, 2007.
[21] ISO 5725-4, Accuracy (trueness and precision) of measurement methods and results—Part 4: Basic methods for the determination of the trueness of a standard measurement method (ISO), Geneva, 1994.
[22] D. Hibbert, in: P.J. Worsfold, A. Townshend, C.F. Poole (Eds.), Encyclopedia of Analytical Science, Elsevier, Oxford, 2005, p. 449.
[23] E. Hund, D.L. Massart, J. Smeyers-Verbeke, Anal. Chim. Acta 423 (2000) 145.
[24] I. Kuselman, Accred. Qual. Assur. 10 (2006) 466.
[25] ISO 6060, Water quality—determination of the chemical oxygen demand (ISO), Geneva, 1989.
[26] A. Drolc, M. Cotman, M. Ros, Accred. Qual. Assur. 8 (2003) 138.
[27] ISO Guide 35, Certification of reference materials—General and statistical principles (ISO), Geneva, 1989.
[28] Eurachem/CITAC Guide, in: S. Ellison, M. Rosslein, A. Williams (Eds.), Quantifying Uncertainty in Analytical Measurement, LGC, Teddington, 2000.
[29] A. Maroto, R. Boque, J. Riu, F.X. Rius, Accred. Qual. Assur. 7 (2002) 90.
[30] G. O'Donnell, D.B. Hibbert, Accred. Qual. Assur., in review.
[31] ISO 5725-2, Accuracy (trueness and precision) of measurement methods and results—Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method (ISO), Geneva, 1994.
[32] F. Cadet, Talanta 48 (1999) 867.
[33] D. Cozzolino, W.U. Cynkar, R.G. Dambergs, L. Janik, M. Gishen, J. Near Infrared Spectrosc. 13 (2005) 213.
[34] B.K. Via, C.L. So, T.F. Shupe, L.G. Eckhardt, M. Stine, L.H. Groom, J. Near Infrared Spectrosc. 13 (2005) 201.
[35] U. Hoffmann, N. Zanier-Szydlowski, J. Near Infrared Spectrosc. 7 (1999) 33.
[36] M. Zeaiter, J.M. Roger, V. Bellon-Maurel, D.N. Rutledge, Trends Anal. Chem. 23 (2004) 157.
[37] G.P. Ayers, R.W. Gillett, P.W. Selleck, Geophys. Res. Lett. 30 (2003) 1715.
[38] M.V. de Castilho, Geostandards Geoanal. Res. 28 (2004) 277.
[39] P. Charlet, A. Marschal, Trends Anal. Chem. 23 (2004) 178.
[40] M.M. Muller, Accred. Qual. Assur. 8 (2003) 340.
[41] K. Heydorn, T. Anglov, Accred. Qual. Assur. 7 (2002) 153.
[42] F. Lacharme, M.A.M. Gijs, Sens. Actuators B 117 (2006) 384.
[43] M.J. Holden, S.A. Rabb, Y.B. Tewari, M.R. Winchester, Anal. Chem. 79 (2007) 1536.
[44] B. Yang, Y.Q. Yu, L. Cai, C.H. Deng, G.L. Duan, J. Sep. Sci. 29 (2006) 2173.