Validation of Analytical Methods

By Ricard Boqué, Alicia Maroto, Jordi Riu and F. Xavier Rius*
Department of Analytical Chemistry and Organic Chemistry, University Rovira i Virgili, Pl. Imperial Tàrraco, 1, 43005-Tarragona, Spain

CONTENTS

1. Introduction
2. Validation and fitness-for-purpose
3. Analytical requirements and performance criteria
   3.1. Trueness
      3.1.1. Certified reference materials (CRM), reference materials and in-house materials
      3.1.2. Reference methods
      3.1.3. Proficiency testing
      3.1.4. Spiked samples
      3.1.5. Assessment of trueness in restricted concentration ranges
      3.1.6. Assessment of trueness in large concentration ranges. Recovery studies
   3.2. Precision and uncertainty
      3.2.1. Calculation of the within-laboratory precision
      3.2.2. Calculation of the repeatability limit
      3.2.3. Calculation of the "internal reproducibility" limit
   3.3. Uncertainty
      3.3.1. Uncertainty of the assessment of trueness, u_trueness
      3.3.2. Uncertainty of sample pretreatments, u_pretreatments
      3.3.3. Uncertainty of other terms
   3.4. Linear and working ranges
   3.5. Limit of detection (LOD)
      3.5.1. Estimating the standard deviations of the net concentrations
      3.5.2. Alternatives for LOD calculation
   3.6. Limit of quantification (LOQ)
   3.7. Selectivity (specificity)
   3.8. Sensitivity
   3.9. Ruggedness (or robustness)
4. The basic principles of method validation
5. Performing the method validation
6. Inter-laboratory comparison as a means for method validation
7. In-house validation
8. Reporting method validation
References

SUMMARY

Validation of analytical methods.

In this paper we discuss the concept of method validation, describe its various elements and explain its close relationship with fitness for purpose. Method validation is based on the fulfilment of a series of requirements, and we explain how these requirements are selected, the way in which evidence is supplied and what work has to be carried out in the laboratory. The basic principles of method validation and the different ways to validate a methodology, by inter-laboratory comparison or by performing an in-house validation, are also described.

KEY-WORDS: Fitness-for-purpose - In-house validation - Method validation - Performance criteria.

1. INTRODUCTION

Analytical information can be used for a variety of purposes: to take decisions involving the control of the manufacturing process of a product, to assess whether a product complies with regulatory limits, or to take decisions about legal affairs, international trade, health problems or the environment. Consequently, the producer of analytical results must not only ensure that they are reliable but also that they carry all the elements that give confidence to the user.

Quantitative analytical results are expressed as "estimated value ± U", where the estimated value is the estimate of the true value obtained when the analytical method is applied to the test sample and U is the uncertainty associated with that estimate. Nowadays, the users of test results are also interested in other characteristics. They may require, for example, that the estimated value be repeatable under certain conditions, that the analytical method provide similar results when the experimental conditions are slightly modified, and that the test method be capable of quantifying very low concentrations of the analyte.

Since it is materially impossible to guarantee these characteristics for every single test result, the international analytical institutions have developed a system that ensures the quality of the results provided they have been obtained under certain conditions. This quality assurance system, although highly developed nowadays, is in continuous evolution so that it can adapt to the new requirements of society.



This system is based on a few fundamental concepts:

1. The laboratories that produce analytical information must operate under quality assurance principles. The best way to guarantee that they do is to ensure that they are accredited according to ISO/IEC 17025 (1999).

2. The laboratory must participate in proficiency testing schemes, which should be designed and conducted in accordance with the International harmonized protocol for proficiency testing of analytical laboratories (Thompson 1993).

3. The laboratory must use internal quality control procedures which comply with the Harmonized guidelines for internal quality control in analytical chemistry (Thompson 1995).

4. The laboratory must use validated methods of analysis.

In other words, if the laboratories that produce the results can show that they use good tools properly (validated methods under quality assurance conditions), that they internally control their processes and that they demonstrate proficiency by comparing their performance with that of similar laboratories, there are reasons to think that the general quality of the laboratory is transferred to its individual results.

In this paper we shall discuss the concept of validated methods of analysis and explain their different elements, after establishing the close relationship between validation and fitness for purpose. We shall also describe the analytical requirements and performance characteristics and provide useful practical information, such as which basic norms to follow in a validation process or how to conduct validation studies in-house or by using inter-laboratory comparison studies.

2. VALIDATION AND FITNESS-FOR-PURPOSE

Validation is defined as "the confirmation by examination and provision of objective evidence that the particular requirements for a specified intended use are fulfilled" (ISO 1994b).

This definition implies that analytical methods should be validated taking into account the requirements of specific applications. Therefore, it is misleading to think that there is a single process of method validation that will demonstrate that a method fulfils a list of requirements and that it can be used in a general way for many different applications. On the contrary, the definition implies that the specific requirements for an intended use should be defined before the performance capabilities of the method are specified.

Fitness for purpose, defined by IUPAC in the Orange Book as the "degree to which data produced by a measurement process enables a user to make technically and administratively correct decisions for a stated purpose", is more related to the results of the analytical method than to the method itself. In a very recent joint project between ISO and AOAC International, IUPAC states that fitness-for-purpose is not only related to statistically based performance criteria but also to practicability and suitability criteria (IUPAC 1999a). So it is important to evaluate, among other characteristics, the ease of operation or the cost. In fact, method validation is the means that analysts have of demonstrating that the method is fit for purpose, showing the customers that they can trust the reported results.

This present concept of method validation introduces the need for versatility in the laboratory and therefore compels analysts to adapt the analytical methodologies to the different problems to be solved. For instance, there is no need to calculate the limit of detection for a method used to determine oleic acid in olive oils, but there is such a need when another method is used to determine trace residues in the same oil sample. On the other hand, the need for versatility introduces some complexity. The requirements should be defined by first taking into account the needs of the end user (something that is not evident in all cases), and they should then be confirmed by using calibrated instruments, trained analysts and all other quality assurance conditions.

3. ANALYTICAL REQUIREMENTS AND PERFORMANCE CRITERIA

Once the specific analytical problem has been clearly defined, the producer of the analytical information should ideally be able to agree with the customer about the specific analytical requirements.

For instance, if a laboratory carries out the quality control of a product, legislation often sets particular limits for certain parameters. The analyst must ensure that the test results reflect the characteristics of the analysed sample and not those of the performance of the method used to obtain the results. Therefore, the estimated value must not only be traceable but also be associated with an appropriate level of uncertainty, so that compliance or non-compliance of the product with the regulations can be established. Analysts also know that they must use a method whose limit of quantification for that parameter is well below the legal limit. At the same time, the rapidity with which results are provided, the cost of the analysis and the availability of trained staff should also be evaluated.

In fact, the end user is often not well defined. The customer may take different forms: it could be an internal or external client of the company, or it could be the administration, which is usually guided by the current legislation or even by a written standard that must be followed. It is not unusual to meet customers who are unable to define the requirements for a specific analysis. They generally require that the cost of the analysis be reduced to a minimum and that the results be available very quickly, but they find it difficult to understand the actual significance of the uncertainty values. In these cases, analysts must take decisions for the end users and, simultaneously, try to raise their metrological knowledge. Since characterising the method performance is not a simple task (it takes time and considerable resources), there must always be a balance between the different requirements, so the performance criteria must be carefully selected.

The essential parameters that analysts need to assess in order to check whether a method satisfies previously defined analytical requirements are the performance criteria or performance characteristics. In fact, method validation consists of deriving experimental values for the selected performance criteria. There are several types of performance criteria. The basic parameters usually refer to the reliability of the method and are commonly derived by using statistical procedures. Trueness, precision, selectivity, sensitivity, range, ruggedness, limit of detection and limit of quantification are the commonest; these will be studied individually in this section.

Other complementary criteria refer to whether the method can be applied to a wide range of sample types and whether it is practical (e.g. cost, ease of use, availability of material, instruments and trained staff, sample throughput, etc.).

Whenever the sampling and subsampling steps are included in the analytical procedure, it is apparent that the test sample must be representative of the analytical problem. Method validation should also include operations that ensure the representativity of the sample submitted to the test procedure in the initial stages of the analysis.

3.1. Trueness

Trueness is defined as "the closeness of agreement between the average value obtained from a large set of test results and an accepted reference value" (ISO 3534 1993). Trueness should be evaluated, in terms of bias, through the analysis of reference samples. However, not all the references have the same level of traceability. Therefore, the reference selected should be the one that has the suitable level of traceability for our purpose. The references commonly used in chemical analysis are listed in Scheme 1, ordered according to their level of traceability.

3.1.1. Certified reference materials (CRM), reference materials and in-house materials

These reference materials can be used as long as they are matched, as far as possible, to the routine samples to be analysed in terms of matrix and analyte concentration. CRMs should be used whenever possible because they have the highest level of traceability. However, this may not be possible because of the limited availability of CRMs that are very similar to the routine samples. On the other hand, reference materials or in-house materials can also be used as long as the content of the analyte has been thoroughly investigated. Preferably, the analyte should be determined using two different methods based on different physical-chemical principles and, if possible, based on determinations carried out in different laboratories (IUPAC 1999a).

3.1.2. Reference methods

Trueness can be assessed using a reference method provided that it is well known by the laboratory. Here, analysing the same representative test sample with both the reference method and the test method to be validated is the way to assess the trueness of the test method. It is important to note that the test samples analysed should be homogeneous, stable and as similar as possible to the routine samples (in terms of matrix and analyte concentration).

3.1.3. Proficiency testing

Trueness can also be assessed when the laboratory takes part in a proficiency testing scheme. In this case, the reference value corresponds to the consensus value obtained by the participating laboratories.

Scheme 1. References commonly used to assess trueness in chemical measurements, ordered by decreasing level of traceability: (1) certified reference materials (CRM); (2) reference materials / in-house materials; (3) reference methods; (4) proficiency testing; (5) spiked samples.

3.1.4. Spiked samples

These references have the lowest level of traceability. However, the analyst usually has to resort to spiked samples when the other references are not available. This has the disadvantage that bias may be incorrectly estimated if the spiked analyte behaves differently from the native analyte. On the other hand, a spike can be a good representation of the native analyte in the case of liquid samples and in analytical methods that involve the total dissolution or destruction of the samples (IUPAC 1999b). Otherwise, the different behaviour of the native analyte can be investigated using a less representative reference material (Barwick 1999).

The assessment of trueness depends on whether the method is intended to be used in a restricted concentration range or in large concentration ranges. In the following sections we describe how to assess trueness in both situations.

3.1.5. Assessment of trueness in restricted concentration ranges

Trueness is assessed by assuming that the bias is the same in the whole concentration range. This bias is calculated using a reference sample similar to the routine samples. This reference sample should be analysed n times with the method to be validated in different runs, i.e. on different days, with different operators, etc. It is recommended that the reference sample be analysed at least 7 times (preferably n ≥ 10) to obtain an acceptable reliability (IUPAC 1999a). An average value, c_method, and a standard deviation, s, can be calculated from these n measurements.

Trueness is then assessed by statistically comparing the mean value obtained with the analytical method, c_method, with the reference value, c_ref. This comparison is done with a t test. The type of t test applied depends on the reference used. We can distinguish two different cases:

a) Trueness is assessed against a reference value (here we consider CRMs, reference materials, proficiency testing and spiked samples).

b) Trueness is assessed against a reference method.

Case a. Trueness is assessed against a reference value

The following t test is used to assess trueness:

t = |c_ref − c_method| / sqrt( u(c_ref)² + s²/n )        (1)

where u(c_ref) is the standard uncertainty of the reference value. For instance, if the certified value of a CRM is expressed as c_ref ± U(c_ref) (with k = 2), u(c_ref) is calculated as U(c_ref)/k (e.g. if the certified concentration of copper is 10.1 ± 0.1 mg/l (with k = 2), u(c_ref) = 0.1/2 = 0.05 mg/l).

The calculated t value is then compared with the two-sided tabulated t value, t_tab, for α = 0.05 and the effective degrees of freedom, ν_eff, obtained with the Welch-Satterthwaite approach (Satterthwaite 1941):

ν_eff = ( u(c_ref)² + s²/n )² / ( u(c_ref)⁴/ν_ref + (s²/n)²/(n−1) )        (2)

where ν_ref are the degrees of freedom associated with u(c_ref). If this information is not available, we can assume that ν_eff = n−1. The trueness of the method is assessed if t ≤ t_tab.

Case b. Trueness is assessed against a reference method

Trueness is assessed by analysing a test sample both with the reference method and with the method to be validated. This test sample should be analysed with both methods in different runs (i.e. on different days, with different operators, etc.). An average value, c_ref, and a variance, s²_ref, can be obtained from the n_ref results obtained with the reference method. Likewise, an average value, c, and a variance, s², can be obtained from the n results obtained with the method to be validated. The t test used to assess trueness depends on whether the difference between the variances of both methods is statistically significant. This is checked with an F test:

F = s1² / s2²        (3)

in which s1² is the larger of the two variances. The calculated F value is compared with the two-sided tabulated F value, F(α,ν1,ν2), for α = 0.05 and the degrees of freedom ν1 = n1−1 and ν2 = n2−1.

If F ≤ F(α,ν1,ν2), the difference between both variances is not statistically significant. Therefore, trueness is verified with the following t test, based on the pooled variance of both methods:

t = |c_ref − c| / sqrt( [ (n_ref−1)·s²_ref + (n−1)·s² ] / (n + n_ref − 2) · (1/n + 1/n_ref) )        (4)

The calculated t value is then compared with the two-sided tabulated t value, t_tab, for α = 0.05 and n + n_ref − 2 degrees of freedom. The trueness is assessed if t ≤ t_tab.

If F > F(α,ν1,ν2), the difference between both variances is statistically significant. Therefore, trueness is assessed with the following t test:

t = |c_ref − c| / sqrt( s²_ref/n_ref + s²/n )        (5)

The calculated t value is then compared with the two-sided tabulated t value, t_tab, for α = 0.05 and the effective degrees of freedom, ν_eff, obtained with the Welch-Satterthwaite approach (Satterthwaite 1941):

ν_eff = ( s²_ref/n_ref + s²/n )² / ( (s²_ref/n_ref)²/(n_ref−1) + (s²/n)²/(n−1) )        (6)

The trueness is assessed if t ≤ t_tab.
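To make the case (a) calculation concrete, the following is a minimal Python sketch of Eqs. (1)-(2). It is not part of the original paper: the function name, the example data and the use of numpy and scipy are our own assumptions.

```python
# Sketch (not from the paper): trueness check against a certified reference
# value, following Eqs. (1)-(2). Requires numpy and scipy.
import numpy as np
from scipy import stats

def trueness_vs_reference_value(results, c_ref, u_ref, nu_ref=None, alpha=0.05):
    """Return (t, t_tab, trueness_assessed) for Eqs. (1)-(2).

    results : replicate concentrations obtained with the method (n >= 7 advised)
    c_ref   : accepted reference value
    u_ref   : standard uncertainty of the reference value, u(c_ref)
    nu_ref  : degrees of freedom of u(c_ref); if None, nu_eff = n - 1 is used
    """
    x = np.asarray(results, dtype=float)
    n = x.size
    c_method = x.mean()
    s = x.std(ddof=1)
    var_sum = u_ref**2 + s**2 / n
    t_calc = abs(c_ref - c_method) / np.sqrt(var_sum)             # Eq. (1)
    if nu_ref is None:
        nu_eff = n - 1
    else:                                                         # Eq. (2), Welch-Satterthwaite
        nu_eff = var_sum**2 / (u_ref**4 / nu_ref + (s**2 / n)**2 / (n - 1))
    t_tab = stats.t.ppf(1 - alpha / 2, nu_eff)                    # two-sided, alpha = 0.05
    return t_calc, t_tab, t_calc <= t_tab

# Example: certified Cu content 10.1 +/- 0.1 mg/l (k = 2) -> u(c_ref) = 0.05 mg/l
replicates = [10.15, 10.22, 10.08, 10.18, 10.25, 10.11, 10.19]
print(trueness_vs_reference_value(replicates, c_ref=10.1, u_ref=0.05))
```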

3.1.6. Assessment of trueness in large concentration ranges. Recovery studies

Trueness can be assessed in large concentration ranges with recovery studies. Recovery is defined as "the proportion of the amount of analyte, present or added to the analytical portion of test material, which is extracted and presented for measurement" (IUPAC 1999b). At present the term recovery is used in two different contexts (IUPAC 1999a):

- To express the yield of an analyte in a preconcentration or extraction stage of an analytical method.

- To denote the ratio of the concentration found, c, obtained from an analytical process via a calibration graph, to the reference value, c_ref, i.e. R = c/c_ref.

The first use of recovery should be clearly distinguished from the second one, because a recovery of 100% does not necessarily require a 100% yield in every separation or preconcentration stage. Hence, IUPAC recommends using two different terms to distinguish between the two uses of recovery: the term "recovery" should be used for the yield, whereas the term "apparent recovery" should be used to express the ratio of the concentration found to the reference value (IUPAC 2001).

Trueness is then assessed in terms of apparent recovery. This apparent recovery should be calculated for different reference samples that representatively cover the variability of the routine samples (in terms of matrix and analyte concentration). Figure 1 shows the necessity of calculating the apparent recovery at several concentration levels: the apparent recovery may fortuitously be 100% in the presence of proportional and constant bias (see point (2) of Figure 1) (IUPAC 2001).

Ideally, the apparent recovery should be estimated using several reference samples. However, these references are seldom available and the analyst often has to resort to spiked samples. Such a simple procedure is usually effective in the determination of trace elements but, for instance, it might not be applicable to a pesticide residue. In this latter case, vigorous reagents cannot be used to destroy the matrix and, as a result, the apparent recovery of the spike is likely to be larger than the one corresponding to the native analyte. In this case, less representative reference materials can be used to investigate the different behaviour of the native and the spiked analyte (Barwick 1999).

The apparent recovery is calculated as:

R = (c / c_ref) · 100        (7)

where c is the concentration found with the method to be validated and c_ref is the reference concentration. This expression is valid when reference samples are available or when the analyte is spiked to matrix blanks. However, if the analyte is spiked to samples that already contain the analyte, the apparent recovery is calculated as:

R = ( (c_O+S − c_O) / c_ref ) · 100        (8)

where c_O+S is the concentration obtained for the spiked sample and c_O is the concentration obtained for the unspiked sample. The disadvantage of spiking an analyte to a sample which already contains the analyte is that the recovery could be 100% and, even so, a constant bias may be present (Parkany 1996).

[Figure 1. Recovery: concentration found, c, versus reference concentration, c_ref. The dotted line (c = c_ref) shows the validation function obtained with an unbiased method; the solid line (c = a + b·c_ref) shows the validation function obtained with a method that has a constant bias, a, and a proportional bias, b.]

The absence of constant bias can be checked using the Youden method (Youden 1947), which consists of estimating the constant bias by analysing different amounts of a sample.

For each reference sample, the apparent recovery should be calculated at least 7 times (preferably n ≥ 10). An average apparent recovery, R̄, and a standard deviation, s(R), can be obtained from the n apparent recoveries calculated. Trueness is then assessed by statistically comparing the mean apparent recovery, R̄, with 100%. This comparison can be done with the following t test:

t = |100 − R̄| / ( s(R) / √n )        (9)

This t test considers that recoveries are expressed as a percentage and that the uncertainty of the reference value is negligible. The calculated t value is then compared with the two-sided tabulated t value, t_tab, for α = 0.05 and n−1 degrees of freedom. There can be two possibilities:

a) t ≤ t_tab: the apparent recovery does not differ significantly from 100%. The trueness of the method is assessed and, therefore, there is no reason to correct the results by the apparent recovery.

b) t > t_tab: the apparent recovery differs significantly from 100%. Therefore, the trueness of the method cannot be assessed. In this situation, there is a current debate on whether results should be corrected by the apparent recovery. IUPAC, ISO and EURACHEM recommend correcting the results; however, AOAC International does not agree that results should be corrected as a general policy (IUPAC 1999b).
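A minimal sketch of the recovery check of Eq. (9), assuming Python with numpy and scipy; the function name and the example recoveries are hypothetical, not data from the paper.

```python
# Sketch (not from the paper): assessment of trueness from apparent recoveries,
# following Eq. (9). Recoveries are expressed as percentages.
import numpy as np
from scipy import stats

def recovery_t_test(recoveries_percent, alpha=0.05):
    """Compare the mean apparent recovery with 100% (Eq. 9)."""
    r = np.asarray(recoveries_percent, dtype=float)
    n = r.size
    r_mean = r.mean()
    s_r = r.std(ddof=1)
    t_calc = abs(100.0 - r_mean) / (s_r / np.sqrt(n))
    t_tab = stats.t.ppf(1 - alpha / 2, n - 1)   # two-sided, n-1 degrees of freedom
    return r_mean, t_calc, t_tab, t_calc <= t_tab

# Example with 8 spiked samples (hypothetical data)
print(recovery_t_test([97.8, 101.2, 99.5, 96.9, 102.3, 98.7, 100.4, 97.5]))
```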

3.2. Precision and uncertainty

According to ISO 3534, precision is "the closeness of agreement between independent test results obtained under stipulated conditions" (ISO 1993). These conditions depend on the factors that may be changed between each test result, for instance the laboratory, the operator, the equipment, the calibration and the day on which the test result is obtained. Depending on the factors changed, three types of precision can be obtained: the repeatability, the intermediate precision and the reproducibility (ISO 5725 1994a).

The two extreme precision measures are the reproducibility and the repeatability. The reproducibility gives the largest expected variability of the results because it is obtained by varying all the factors that may affect them, i.e. results are obtained in different laboratories, with different operators and using different equipment. This precision can only be estimated by interlaboratory studies (ISO 5725 1994a). On the contrary, repeatability gives the smallest variability because the results are obtained by the same operator, with the same equipment and within short intervals of time. Intermediate precision relates to the variation in results when one or more factors, such as time, equipment and operator, are varied within a laboratory. Different types of intermediate precision estimates are obtained depending on the factors changed between the measurements (ISO 5725 1994a). For instance, the time-different intermediate precision is obtained when the test results are obtained on different days. If results are obtained on different days and by different operators, we obtain the [time + operator]-different intermediate precision. Finally, if the laboratory obtains the results by changing all the factors that may affect them (i.e. day, operator, instrument, calibration, etc.), we obtain the [time + operator + instrument + calibration + ...]-different intermediate precision, also known as the run-different intermediate precision. For a laboratory, this is the most useful precision because it gives an idea of the sort of variability that the laboratory can expect of its results and also because it is an essential component of the overall uncertainty of the results.

The within-laboratory precision of an analytical method should be characterised by the repeatability and the run-different intermediate precision. This latter precision is an estimate of the internal reproducibility of the laboratory. Precision should be calculated with test samples that have been taken through the whole analytical procedure. These test samples may be reference materials, control materials or actual test samples.

[Figure 2. Two-factor fully nested design. The factors studied are the runs and the replicates: p is the number of runs in which the measurement is carried out and n is the number of replicates performed in each run; x̄_i is the mean of the n replicate measurements performed in run i and x̄ is the mean of the mean values obtained for the different runs.]

It is important to note that we should not estimate precision using synthetic samples that do not represent the test samples (IUPAC 1999a). Moreover, if the analytical method is intended to be used over a large concentration range, precision should be estimated at several concentration levels across the working range, for instance at low, medium and high concentration levels.

3.2.1. Calculation of the within-laboratory precision

The simplest method for estimating precision within one laboratory consists of calculating the standard deviation of a series of n measurements. It is recommended that at least 7 measurements (preferably n ≥ 10) be carried out to obtain good precision estimates (IUPAC 1999a). The repeatability standard deviation is obtained when the n measurements are made under repeatability conditions, and the run-different intermediate precision is obtained when the measurements are made while changing the factors that may affect the analytical results (i.e. operator, day, equipment, etc.).

An alternative procedure consists of performing n replicates in each of the p runs in which the test sample is analysed. This experimental design corresponds to a two-factor fully nested design (ISO 5725 1994a); the factors studied are the run and the replicate (Figure 2). Analysis of variance (ANOVA) then provides the repeatability, the run-different intermediate precision and the between-run precision (Table I). Table II shows the expressions for calculating the precision estimates.

It is recommended that the number of replicates per day be equal to 2 (n = 2) and that the effort be focused on the number of runs in which the test sample is analysed (e.g. if we want to estimate precision using a series of 20 measurements, it is better to analyse the sample twice in 10 runs than to analyse it 4 times in 5 runs) (Kuttatharmmakul 1999). It is also recommended to analyse the sample in at least 7 different runs (preferably p ≥ 10) to obtain good precision estimates. The advantage of this methodology is that it provides better precision estimates than the simpler method.

Table I. ANOVA table and calculation of variances for the design proposed in Figure 2

Source       Mean square                                      Degrees of freedom    Expected mean square
Run          MS_run = n·Σ_i (x̄_i − x̄)² / (p−1)                p−1                   σ²_e + n·σ²_run
Replicate    MS_e = Σ_i Σ_j (x_ij − x̄_i)² / [p·(n−1)]          p·(n−1)               σ²_e

Table II. Calculation of the precision estimates for the experimental design proposed in Figure 2

Variance                                           Expression             Degrees of freedom
Repeatability variance, s²_r                       MS_e                   p·(n−1)
Between-run variance, s²_run                       (MS_run − MS_e) / n
Run-different intermediate variance, s²_I(run)     s²_run + s²_r

3.2.2. Calculation of the repeatability limit

The repeatability limit, r, is the value below which the absolute difference between two single test results obtained under repeatability conditions may be expected to lie with a probability of 95%. This limit is obtained as (ISO 5725 1994a):

r = 2.8 · s_r        (10)

3.2.3. Calculation of the internal reproducibility limit

An internal reproducibility limit, R, can be calculated from the run-different intermediate standard deviation. This limit is the value below which the absolute difference between two single test results obtained under run-different intermediate conditions may be expected to lie with a probability of 95%. This limit is obtained as:

R = 2.8 · s_I(run)        (11)
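The nested-design calculations of Tables I-II and the limits of Eqs. (10)-(11) can be sketched as follows. This is a Python illustration with hypothetical duplicate data; it is not part of the original paper.

```python
# Sketch (not from the paper): precision estimates from a two-factor fully
# nested design (p runs x n replicates), following Tables I-II and Eqs. (10)-(11).
import numpy as np

def nested_precision(data):
    """data: 2-D array of shape (p runs, n replicates per run)."""
    x = np.asarray(data, dtype=float)
    p, n = x.shape
    run_means = x.mean(axis=1)
    grand_mean = run_means.mean()
    ms_run = n * np.sum((run_means - grand_mean) ** 2) / (p - 1)      # Table I
    ms_e = np.sum((x - run_means[:, None]) ** 2) / (p * (n - 1))      # Table I
    s2_r = ms_e                                                       # repeatability variance
    s2_run = max((ms_run - ms_e) / n, 0.0)                            # between-run variance
    s2_i_run = s2_r + s2_run                                          # run-different intermediate variance
    return {
        "s_r": np.sqrt(s2_r),
        "s_I(run)": np.sqrt(s2_i_run),
        "repeatability limit r": 2.8 * np.sqrt(s2_r),                 # Eq. (10)
        "internal reproducibility limit R": 2.8 * np.sqrt(s2_i_run),  # Eq. (11)
    }

# Example: duplicate analyses (n = 2) of a control sample in 10 runs (hypothetical)
data = [[5.1, 5.3], [5.4, 5.2], [5.0, 5.1], [5.3, 5.5], [5.2, 5.2],
        [5.4, 5.3], [5.1, 5.0], [5.2, 5.4], [5.3, 5.2], [5.0, 5.2]]
print(nested_precision(data))
```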

3.3. Uncertainty

It is widely recognised that the evaluation of the uncertainty associated with a result is an essential part of any quantitative analysis (ISO 17025 1999). Uncertainty is defined as "a parameter, associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand" (BIPM 1993).

Uncertainty and trueness are closely related concepts. This is because we cannot guarantee that all the possible systematic errors have been corrected if we have not previously assessed the trueness of the analytical method and, consequently, it is impossible to ensure that the true value is included within the interval "estimated value ± U" (where U is the uncertainty of the estimated result). Therefore, as Figure 3 shows, every analyst should verify the trueness of the method before calculating uncertainty. Uncertainty can then be calculated using the information generated in the assessment of trueness. Moreover, precision and robustness studies also give useful information for calculating uncertainty (Barwick 2000; Maroto 1999; IUPAC 1999a; EURACHEM 2000). The uncertainty calculated using this information can then be related to the routine test samples as long as the reference sample is representative of these test samples and the quality assurance conditions are effectively implemented in the laboratory.

Uncertainty can be calculated by combining four terms (Maroto 1999):

U = 2 · sqrt( s²_I(run) + u²_trueness + u²_pretreatments + u²_other terms )        (12)

The first component of uncertainty, s_I(run), corresponds to the run-different intermediate precision and considers the experimental variation due to the conditions of the measurement (i.e. day, operator, calibration, etc.). The second component, u_trueness, considers the uncertainty of the assessment of trueness; the third component considers the uncertainty of subsampling and/or sample pretreatments not carried out in the assessment of trueness. Finally, the fourth component, u_other terms, contains all the sources of uncertainty not considered in the former terms.

3.3.1. Uncertainty of the assessment of trueness, u_trueness

In the assessment of trueness it is checked whether the bias of the analytical method is significant. If the bias is not statistically significant, the trueness of the method is assessed. However, an uncertainty remains due to the estimation of the bias itself. This uncertainty is calculated as:

u_trueness = sqrt( s²_I(run)/n + u(c_ref)² )        (13)

where n is the number of times that the reference sample is analysed in the assessment of trueness and u(c_ref) is the standard uncertainty of the reference sample. The calculation of this uncertainty depends on the reference used in the assessment of trueness. For instance, if a CRM is used, u(c_ref) = U(c_ref)/2 (where U(c_ref) is the uncertainty provided by the manufacturer). If a reference method is used, u(c_ref) = s_ref/√n_ref (where s_ref is the standard deviation of the n_ref results obtained when the test sample is analysed with the reference method).

3.3.2. Uncertainty of sample pretreatments, u_pretreatments

This component considers the uncertainty due to the heterogeneity of the sample and/or to sample pretreatments (filtration, weighing, drying, etc.) not carried out in the assessment of trueness. This uncertainty can be estimated using a sample with the same characteristics as the routine samples.

[Figure 3. Assessment of the trueness of an analytical method against an accepted reference. Once the trueness is assessed, the uncertainty of routine samples can be calculated, under quality assurance conditions, using the information generated in the validation process.]

The replicates incorporating the subsampling and/or preprocessing steps should be analysed by changing all the factors that may affect them (e.g. operator, day, material, etc.). The uncertainty can then be calculated as:

u_pretreatments = sqrt( s²_pretreatments − s²_conditions )        (14)

where s²_pretreatments is the variance of the results obtained after analysing the different portions and s²_conditions depends on the conditions in which the different subsampled and/or pre-treated portions are analysed. If they are analysed under repeatability conditions, it corresponds to the repeatability variance, i.e. s²_conditions = s²_r. If they are analysed under intermediate precision conditions, it corresponds to the run-different intermediate variance, i.e. s²_conditions = s²_I(run).

3.3.3. Uncertainty of other terms

This term contains all the sources of uncertainty that have not yet been considered in the former terms. Most of these sources correspond to factors not representatively varied in the estimation of the intermediate precision. The uncertainty of these factors can be calculated using information from robustness studies (Barwick 2000), from the bibliography or from other sources such as previous knowledge.
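As a worked illustration of Eqs. (12)-(14), the following Python sketch combines the four contributions. The helper names and numerical values are hypothetical assumptions, not values from the paper.

```python
# Sketch (not from the paper): combination of the four uncertainty contributions
# of Eq. (12), with u_trueness and u_pretreatments from Eqs. (13)-(14).
import math

def u_trueness(s_i_run, n, u_cref):
    """Eq. (13): uncertainty of the assessment of trueness."""
    return math.sqrt(s_i_run**2 / n + u_cref**2)

def u_pretreatments(s_pretreatments, s_conditions):
    """Eq. (14): uncertainty of subsampling/pretreatment steps."""
    return math.sqrt(max(s_pretreatments**2 - s_conditions**2, 0.0))

def expanded_uncertainty(s_i_run, u_true, u_pre, u_other=0.0, k=2):
    """Eq. (12): expanded uncertainty U with coverage factor k (k = 2 in the paper)."""
    return k * math.sqrt(s_i_run**2 + u_true**2 + u_pre**2 + u_other**2)

# Example (hypothetical values, all in mg/kg)
s_i_run = 0.12
ut = u_trueness(s_i_run, n=10, u_cref=0.05)
up = u_pretreatments(s_pretreatments=0.15, s_conditions=0.12)
print(expanded_uncertainty(s_i_run, ut, up, u_other=0.02))
```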

3.4. Linear and working ranges

In all quantitative methods it is necessary to determine the range of analyte concentrations or property values over which the analytical method may be applied. The lower end of the working range is limited by the values of the limits of detection and quantification. At the upper end, several effects limit the working range.

Within the working range there may exist a linear response range, in which the method gives results that are proportional to the concentration of analyte. Obviously, the linear range can never be wider than the working range. For instance, in the determination of calcium in edible oils using atomic absorption spectroscopy, the working range comprises any oil containing calcium, while the linear range is normally between 1 and 10 ppm.

An initial estimation of the working and linear ranges may be obtained using a blank and reference materials (or a blank and fortified sample blanks) at several concentration levels. Ideally, at least six concentration levels plus the blank are needed. One replicate at every concentration level is enough to visually identify the linear range and the boundaries of the working range. After this primary estimation, a more accurate plot can be obtained using reference materials or fortified sample blanks. At least six different concentration levels, with at least three replicates at each level, are necessary. The linearity of the calibration line can be assessed using the correlation coefficient and the plot of the residual values. The residual values (e_i, with i = 1…n, where n is the number of concentration levels used) in a straight-line model are calculated by subtracting the model estimates from the experimental data (e_i = y_i − b0 − b1·x_i, where b0 and b1 are, respectively, the intercept and the slope of the straight line).

A high value of the correlation coefficient and a good plot of the residual values are normally enough to assess linearity. It is important to point out that the sole use of the correlation coefficient does not always guarantee the linearity of the calibration line, even if its value is close to one. A good plot of the residual values has to fulfil the following conditions:

- the residual values must not show tendencies;
- the residual values must be more or less randomly distributed;
- the number of negative and positive residual values has to be approximately the same;
- the residual values should have approximately the same absolute value.

Figure 4 shows some examples of plots of residual values. Figure 4a shows a plot with a good distribution of residual values, where all the residual values fulfil the aforementioned requisites. Figure 4b shows a plot where the residuals do not have the same absolute value, probably showing the need for a weighted calibration method. In Figure 4c the residual values clearly show tendencies (probably indicating the presence of non-linearities). Figure 4d shows the presence of a probable outlier in the data set.

Experimental points with an abnormally high residual value may be outliers, and they should be carefully examined using a test for the detection of outliers (for instance, Cook's test (Meloun 2002)).

[Figure 4. Different plots of residual values: (a) a good distribution of residuals; (b) residuals whose absolute value is not constant; (c) residuals showing a clear tendency; (d) a probable outlier.]

If a more accurate check of the linearity is desired, a test for lack of fit based on the analysis of variance (ANOVA) can be used alongside the correlation coefficient and the plot of the residual values (Massart 1997).

It is important to note that the calibration line is usually found using the ordinary least squares method, but if the variance of the replicates at each concentration level is not constant throughout the linear range (giving rise, for instance, to the plot of residual values shown in Figure 4b), then a better option is to use the weighted least squares method, which takes into account the individual variance values at each calibration point (Massart 1997).
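A minimal Python sketch of the straight-line fit and residual inspection described above (not from the paper; numpy is assumed and the calibration data are simulated for illustration).

```python
# Sketch (not from the paper): straight-line calibration by ordinary least
# squares and inspection of the residuals e_i = y_i - b0 - b1*x_i.
import numpy as np

def calibration_residuals(x, y):
    """Fit y = b0 + b1*x and return slope, intercept, residuals and r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, 1)               # slope, intercept
    residuals = y - (b0 + b1 * x)
    r = np.corrcoef(x, y)[0, 1]                # correlation coefficient (not sufficient on its own)
    return b1, b0, residuals, r

# Example: six standards, three replicates each (simulated responses)
x = np.repeat([1, 2, 4, 6, 8, 10], 3)
y = 0.02 + 0.095 * x + np.random.default_rng(0).normal(0, 0.004, x.size)
b1, b0, e, r = calibration_residuals(x, y)
print(f"slope={b1:.4f}, intercept={b0:.4f}, r={r:.5f}")
print("signs of residuals:", np.sign(e).astype(int))   # should look random, ~half + and half -
```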

3.5. Limit of detection (LOD)

The limit of detection, LOD, is commonly defined as the minimum amount or concentration of substance that can be reliably detected by a given analytical method. In a more recent definition, ISO 11843-1 (1997) avoids the term limit of detection and introduces the general term "minimum detectable net concentration": the true net concentration or amount of the analyte in the material to be analysed which will lead, with probability (1−β), to the conclusion that the concentration or amount of the analyte in the analysed material is larger than that in the blank material. IUPAC (1995), in a previous document, provided a similar definition and adopted the term "minimum detectable (true) value".

When an analytical method is used to provide results above the quantification limit, but an estimation of the detection limit is still required, it is usually sufficient to estimate this detection limit as three times the standard deviation of the concentration in a matrix blank or in a sample containing the analyte at a low concentration level. The number of independent determinations is recommended to be at least 10.

When the analytical method is used for trace-level analysis, or in cases (e.g. food or pharmaceutical analysis) where the absence of certain analytes is required by regulatory provisions, the limit of detection must be estimated by a more rigorous approach, such as the one described by IUPAC (IUPAC 1995, IUPAC 1999a); it must encompass the whole analytical method and consider the probabilities α and β of taking false decisions.

According to the ISO and IUPAC definitions, the LOD is an a priori defined parameter of an analytical method, because it is fixed before the measurements are made. The LOD is essentially different from a detection decision, because the latter is taken once the result of the measurement is known, in other words a posteriori. The decision of whether a given analyte is present or not in a sample is based on a comparison with the critical level, L_C, which is defined as:

L_C = z_{1−α} · σ_0        (15)

where z_{1−α} is the upper α percentage point of a normal distribution and σ_0 is the standard deviation of the net concentration when the analyte is not present in the sample (i.e. the true value is zero). The critical level L_C marks the minimum value above which a predicted concentration is considered as being caused by the analyte. By doing so, there exists a risk of committing a type I error, i.e. a false positive decision, which means stating that the analyte is present when in fact it is not. However, if we also want to keep the risk of a false negative decision (called β) low, the LOD of the method, L_D, must be higher, taking into account both probabilities of error:

L_D = z_{1−α} · σ_0 + z_{1−β} · σ_D        (16)

where z_{1−β} is the upper β percentage point of a normal distribution and σ_D is the standard deviation of the net concentration when the analyte is present in the sample at the level of the LOD. It has been assumed in equations (15) and (16) that the concentrations are normally distributed with known variance. Taking the default values α = β = 0.05, and assuming constant variance between c = 0 and c = L_D, equation (16) becomes:

L_D = 3.3 · σ_0        (17)

If the variances are not known and have to be estimated from replicates, then the σ_0 and σ_D values in equations (15) and (16) have to be replaced by their corresponding estimates, s_0 and s_D. Accordingly, the z values, based on normal distributions, must be replaced by the corresponding t values from a Student t distribution with ν degrees of freedom. Taking α = β, the appropriate expressions for L_C and L_D (assuming constant variance) are:

L_C = t_{1−α,ν} · s_0        (18)

L_D ≈ 2 · t_{1−α,ν} · s_0        (19)

Equation (19) is an approximation that approaches the true L_D as the number of degrees of freedom becomes larger. When the number of replicates used to estimate s_0 is low, the term 2·t_{1−α,ν} must be corrected for the degrees of freedom (see Currie 1997). Additionally, if s_0 is used instead of σ_0, the LOD is uncertain by the ratio s_0/σ_0 and an upper bound for L_D has to be calculated (Currie 1997, IUPAC 1995).
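A minimal Python sketch of Eqs. (18)-(19) from replicate blank determinations, assuming α = β = 0.05; the blank values and the function are our own illustration, not part of the paper.

```python
# Sketch (not from the paper): critical level and detection limit from
# replicate blank measurements, following Eqs. (18)-(19) with alpha = beta = 0.05.
import numpy as np
from scipy import stats

def lod_from_blanks(blank_concentrations, alpha=0.05):
    """Return (L_C, L_D) estimated from n independent blank determinations."""
    b = np.asarray(blank_concentrations, dtype=float)
    n = b.size                           # at least 10 determinations are recommended
    s0 = b.std(ddof=1)                   # estimate of sigma_0
    t1 = stats.t.ppf(1 - alpha, n - 1)   # one-sided t value
    l_c = t1 * s0                        # Eq. (18)
    l_d = 2 * t1 * s0                    # Eq. (19), approximation for alpha = beta
    return l_c, l_d

# Example (hypothetical net concentrations of 10 matrix blanks, in ug/kg)
print(lod_from_blanks([0.8, -0.3, 0.5, 1.1, -0.6, 0.2, 0.9, -0.1, 0.4, 0.7]))
```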

3.5.1. Estimating the standard deviations of the net concentrations

It has already been mentioned that the decision of whether a given analyte is present or not in a sample is based on a comparison with the critical level. As in any other situation, this process of comparison implies experimental errors (in the measurement of the blank sample and of the analysed sample), which are assumed to be random and normally distributed. At zero concentration level, the standard deviation of the net concentration is expressed in a general way as:

σ_0 = η · σ_B        (20)

where σ_B is the standard deviation of the blank and η = sqrt(1/m + 1/n), where m and n are the number of replicates on the analysed sample and on the blank sample, respectively. η = 1 (σ_0 = σ_B) in the special case when m = 1 and n is high. When the net concentration is calculated in a paired experiment (i.e. as the concentration in the sample minus the blank), then η = √2. If σ_B is not known, it has to be replaced by its corresponding estimate, s_B, in Eq. (20). For a reliable estimation of the detection limit, a minimum of 10 determinations is suggested on the blank sample. The precision conditions (repeatability, intermediate precision) under which the blank is analysed should be clearly specified.

3.5.2. Alternatives for LOD calculation

If the calibration line used to check the linearity of the analytical method has been built in a restricted concentration interval close to the expected limit of detection, then this curve can be used to calculate the LOD of the method. The expression for the LOD in this case is:

L_D = (t_{1−α,ν} · s_y/x / b1) · sqrt( 1/m + 1/N + x̄² / Σ(x_i − x̄)² ) + (t_{1−α,ν} · s_y/x / b1) · sqrt( 1/m + 1/N + (L_D − x̄)² / Σ(x_i − x̄)² )        (21)

where s_y/x is the standard deviation of the residuals of the calibration line, b1 is its slope, N is the number of calibration standards, m is the number of replicate measurements of the test sample, x_i is each of the concentrations of the calibration standards and x̄ is the mean value of the concentrations of the calibration standards. The LOD can be calculated by solving Eq. (21) or by an iterative procedure.

In chromatographic methods of analysis, in which the peak corresponding to the analyte has to be distinguished from the chromatographic baseline, the detection limit can be calculated by measuring the noise of the chromatograms of several (usually three) sample blanks which have been submitted to the whole analytical process. The procedure recommended by the SFSTP (STP Pharma Pratiques, 1992) consists of determining the maximum amplitude of the baseline in a time interval equivalent to twenty times the width at half height of the peak of the analyte. The LOD is expressed as:

L_D = 2 · z_{1−α} · h_noise · R        (22)

where h_noise is the average of the maximum amplitudes of the noise and R is the response factor (concentration/peak height). This procedure is valid only when chromatographic peak heights are used for quantification purposes. If peak areas are used instead, then the LOD has to be estimated by other means, such as those described above.

Finally, when reporting the detection limit of a given analytical method, it is highly recommended to specify the approach used for its calculation.
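Assuming the reading of Eq. (21) given above, the iterative calculation can be sketched in Python as follows; the function, the starting value and the example standards are our own assumptions, not data from the paper.

```python
# Sketch (not from the paper): LOD from a calibration line built near the
# expected detection limit, solving Eq. (21) by fixed-point iteration.
import numpy as np
from scipy import stats

def lod_from_calibration(x, y, m=1, alpha=0.05, iterations=50):
    """x, y: calibration standards and responses; m: replicates of the test sample."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_std = x.size
    b1, b0 = np.polyfit(x, y, 1)
    y_hat = b0 + b1 * x
    s_yx = np.sqrt(np.sum((y - y_hat) ** 2) / (n_std - 2))   # std. deviation of the residuals
    sxx = np.sum((x - x.mean()) ** 2)
    t1 = stats.t.ppf(1 - alpha, n_std - 2)
    base = 1.0 / m + 1.0 / n_std
    l_d = 0.0
    for _ in range(iterations):                              # iterate Eq. (21)
        l_d = (t1 * s_yx / b1) * (np.sqrt(base + x.mean() ** 2 / sxx)
                                  + np.sqrt(base + (l_d - x.mean()) ** 2 / sxx))
    return l_d

# Example with hypothetical low-level standards
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0.003, 0.051, 0.102, 0.149, 0.203, 0.251]
print(lod_from_calibration(x, y, m=2))
```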

3.6. Limit of quantification (LOQ)

The limit of quantification, LOQ, is a performance characteristic that marks the ability of a chemical measurement process to adequately quantify an analyte. It is defined as the lowest amount or concentration of analyte that can be determined with an acceptable level of precision and accuracy (EURACHEM 1998). In practice, the quantification limit is generally expressed as the concentration that can be determined with a specified relative standard deviation (usually 10%). Thus:

L_Q = k_Q · σ_Q        (23)

where k_Q = 10 if the required RSD = 10% (this is the value recommended by IUPAC) and σ_Q is the standard deviation of the measurements at the level of the limit of quantification. To estimate σ_Q, a number of independent determinations (n > 10) must be carried out on a sample which is known to contain the analyte at a concentration close to L_Q. Since L_Q is not known, some guides recommend analysing a sample with a concentration between 2 and 5 times the estimated detection limit (IUPAC 2001). Other guidelines (EURACHEM 1998) suggest performing the determinations on a blank sample; such a procedure, however, is discouraged unless there is strong evidence that the precision is constant between c = 0 and c = L_Q.

When determining σ_Q, the sample has to be taken through the whole analytical procedure. Also, the precision conditions under which the sample is analysed must be clearly specified.

Usually, the LOQ estimation is carried out as part of the study to determine the working range of the analytical method, and it is common practice to fix the LOQ as the lowest standard concentration of the calibration range.
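A short illustration of Eq. (23), assuming k_Q = 10 and a set of hypothetical low-level results (not from the paper):

```python
# Sketch (not from the paper): limit of quantification from replicate analyses
# of a sample close to the expected LOQ, following Eq. (23) with k_Q = 10.
import numpy as np

def loq(low_level_results, k_q=10):
    """Return L_Q = k_Q * s_Q from n > 10 independent determinations."""
    s_q = np.asarray(low_level_results, dtype=float).std(ddof=1)
    return k_q * s_q

# Example (hypothetical results, in ug/kg, for a sample at ~3x the detection limit)
print(loq([2.8, 3.1, 2.6, 3.3, 2.9, 3.0, 2.7, 3.2, 2.8, 3.1, 2.9]))
```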

3.7. Selectivity (specificity)

  • interest, and not to the presence of interferences inthe sample. Hence it is necessary to test theselectivity (or specificity) of the analytical method.

    The terms selectivity and specificity are ofteninterchanged. Selectivity (or specificity) can bedefined as the ability of a method to accurately andspecifically determine the analyte of interest in thepresence of other components in a sample matrixunder the stated conditions of the test (EURACHEM1998). Some authors consider specificity to be 100%selectivity, and hence make a difference between thetwo terms: specificity is referred to a method that onlygives response for a single analyte, while selectivityrefers to a method that gives responses for a numberof chemical entities that may or may not bedistinguished from each other. This situation createsunnecessary confusions and can be avoided byauthors by giving preference for the use of selectivity(IUPAC 2001). IUPAC clarified this overlapexpressing the view that specificity is the ultimate ofselectivity (den Boef 1983).

    From a practical point of view, the selectivity of amethod may be checked studying its ability to measurethe analyte of interest when other specific interferenceshave been introduced in the sample. In a similar way,taking into account the matrix effect, which canenhance or suppress the sensitivity of the method. Thisis usually carried out analysing reference materials andsamples containing the analyte of interest and varioussuspected interferences with both the candidate andother independent analytical methods. From thecomparison of the results using all the methods, we canassess the ability of the candidate method to analysethe required analyte in presence of other substances.

Another option, if no independent analytical methods are available, is to analyse a series of three matrix blanks (or samples containing the analyte of interest if matrix blanks are not available), three method blanks and the lowest concentration standard (or a standard corresponding to the analyte concentration in the positive sample if matrix blanks are not available), and to assess their background signals and the significance of any interferences relative to the lowest analyte level specified in the working range (IUPAC, 1999a). Responses from interfering substances should be less than 1% of the response of the lowest concentration measured.
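A minimal sketch of this second check is given below, assuming hypothetical signal values and function names; the 1% acceptance criterion is the one quoted above.

def interferences_acceptable(blank_signals, lowest_standard_signal, limit=0.01):
    """Return True if every blank/interference response is below 1% of the
    response of the lowest concentration measured (IUPAC, 1999a)."""
    return all(abs(signal) < limit * lowest_standard_signal
               for signal in blank_signals)

# Hypothetical responses of three matrix blanks and three method blanks,
# and of the lowest concentration standard (arbitrary signal units)
blank_responses = [0.8, 1.1, 0.6, 0.4, 0.7, 0.5]
lowest_standard_response = 250.0
print(interferences_acceptable(blank_responses, lowest_standard_response))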

3.8. Sensitivity

Sensitivity is defined as the change in the response of a measuring instrument divided by the corresponding change in the stimulus (the stimulus being, for instance, the amount of the measurand) (EURACHEM 1998). Although it clearly applies to the measuring instrument, sensitivity can also be applied to the method as a whole. From a practical point of view, it corresponds to the gradient of the response curve. Inside the linear range, the sensitivity is simply the slope of the calibration straight line.
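Within the linear range the sensitivity can thus be estimated by least-squares regression of the calibration data. A short sketch with made-up calibration points follows; the function name and data are illustrative only.

def calibration_slope(concentrations, responses):
    """Least-squares slope of the calibration straight line (sensitivity)."""
    n = len(concentrations)
    x_mean = sum(concentrations) / n
    y_mean = sum(responses) / n
    s_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(concentrations, responses))
    s_xx = sum((x - x_mean) ** 2 for x in concentrations)
    return s_xy / s_xx

# Hypothetical calibration data: concentration (mg/l) vs instrument response
x = [0.0, 1.0, 2.0, 4.0, 8.0]
y = [0.3, 10.5, 20.1, 40.8, 80.2]
print(f"Sensitivity = {calibration_slope(x, y):.2f} response units per mg/l")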

3.9. Ruggedness (or robustness)

The robustness/ruggedness of an analytical procedure is a measure of its capacity to remain unaffected by small but deliberate variations in method parameters, and provides an indication of its reliability during normal usage (ICH 1994). ISO and IUPAC have not yet provided definitions and recommendations to evaluate ruggedness, but Vander Heyden et al. (2001) have covered the subject extensively.

Where different laboratories use the same method, they inevitably introduce small variations in the procedure, which may or may not have a significant influence on the performance of the method. The ruggedness of a method is tested by deliberately introducing small changes to a number of parameters of the method and examining the effect of these changes on the accuracy and precision of the final results.

The evaluation of ruggedness is normally carried out by the individual laboratory before it participates in a collaborative trial, and should already be considered at an earlier stage, during the development and/or optimisation of the analytical method.

A large number of parameters may need to be considered but, because most of these will have a negligible effect, it will normally be possible to vary several at once. An established technique for ruggedness testing is covered in detail by Youden and Steiner (1975) and uses Plackett-Burman experimental designs. Such designs allow the influence of several parameters to be investigated in a limited number of experiments. Typical parameters can be subdivided into continuous (e.g. extraction time, mobile phase composition, temperature, flow rate, etc.) and non-continuous (e.g. type of chromatographic column, brands, chemicals, etc.).

The results from ruggedness testing can also be used to estimate components of uncertainty which have not been considered in the assessment of trueness and precision of the method (Barwick 2000).

Table III
Design to calculate the ruggedness of a method

                    Parameter
Experiment     A     B     C     Result
    1          +     +     +      y1
    2          -     +     -      y2
    3          +     -     -      y3
    4          -     -     +      y4


The ruggedness test should be carried out on a representative sample, preferably a reference material. The first step consists of identifying and selecting the parameters that are going to be tested and defining their nominal and extreme levels (e.g. flow rate at 1.0 ± 0.2 ml/min). After this, the experimental design has to be selected; for instance, if three parameters were to be evaluated, a design with only four experiments would be sufficient, as indicated in Table III. The + and - signs denote the coded levels for the maximum and minimum values of the parameter under study. For each parameter, there are two experiments at each level. To study the effect of a given parameter, the mean of the results obtained at the - level has to be subtracted from the mean of the results at the + level. In the example, the effect of parameter A would be calculated as:

Effect A = (y1 + y3)/2 - (y2 + y4)/2

Once the effects have been calculated for every parameter, a statistical and/or graphical analysis of the effects must be carried out in order to draw chemically relevant conclusions and, if necessary, to take actions to improve the performance of the method.
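The effect calculation for the four-experiment design of Table III can be sketched as below; the coded levels follow the table, while the results y1 to y4 and the function name are hypothetical.

# Coded levels of parameters A, B and C in experiments 1-4 (Table III)
design = {
    "A": [+1, -1, +1, -1],
    "B": [+1, +1, -1, -1],
    "C": [+1, -1, -1, +1],
}

# Hypothetical results y1..y4 of the four experiments (e.g. recovery, %)
results = [99.2, 98.7, 99.5, 98.9]

def effect(levels, y):
    """Mean of the results at the + level minus the mean at the - level."""
    plus = [yi for level, yi in zip(levels, y) if level > 0]
    minus = [yi for level, yi in zip(levels, y) if level < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

for parameter, levels in design.items():
    print(f"Effect of {parameter} = {effect(levels, results):+.2f}")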

    4. THE BASIC PRINCIPLES OF METHOD VALIDATION

There are three basic principles of any process of method validation:

1. The validation must include the complete analytical procedure. The measurement of the instrumental signal is often considered to be the main step in the analytical procedure, and considerable effort is made to establish the linear range and to calculate the uncertainty derived from this step. However, nowadays, due to advances in instrumentation, this step usually contributes little to the bias or to the expanded uncertainty of the final result. Special care has to be taken with the sample pre-treatment process, where the intervention of the analysts and the non-automated steps have a crucial effect on the performance parameters.

2. The validation must comprise the whole range of concentrations in which the analyte may be present in the test samples.

3. Representativity is an essential factor. The method should be validated by considering the range of matrices in which the analyte has to be determined. Moreover, the different effects influencing the results during the normal application of the method should be considered during the method validation process. This concept will be further developed when the uncertainty values are calculated.

    5. PERFORMING THE METHOD VALIDATION

When, and by whom, the analytical methodology should be validated are also aspects of method validation that are of practical importance.

There are many laboratory situations in which a method needs to be validated. For instance, the validation required for a new method is clearly different from the validation required for a method that has previously been validated by a method performance study. Validating modifications introduced into a test procedure is different from validating a well-known method for testing samples that incorporate new matrices. In all cases, some sort of method validation should be performed. However, the extent of the validation varies from case to case. This topic is further developed in section 7.

Clearly, the laboratory that is going to use a method for analysing test samples is responsible for guaranteeing that the method is fit-for-purpose. Therefore, this laboratory should demonstrate that the method is valid for analysing the test samples adequately.

    6. INTER-LABORATORY COMPARISON AS A MEANS OF METHOD VALIDATION

When a method is going to be used by many laboratories throughout the world, it is advisable that it be validated by means of a collaborative or method performance study involving a group of laboratories.

In this type of study, all the laboratories follow the same written analytical protocol and analyse the same samples in order to estimate the performance criteria of the method being studied. Usually the within-laboratory and among-laboratory precision are estimated and, whenever necessary and possible, so are other criteria such as the systematic error, recovery, sensitivity or limit of detection.

After the consensus reached under the auspices of the IUPAC, William Horwitz prepared the Protocol for the Design, Conduct and Interpretation of Method-Performance Studies (Horwitz 1995). In this document, the design of the method performance study is described in considerable detail. For instance, at least five different materials are needed for a single type of substance, a minimum of eight laboratories should participate in the study, and the designs used to estimate the repeatability and reproducibility are described. The statistical analysis, the final estimation of the parameters and the final report are also described.

However, collaborative studies have not been undertaken for many analytical methods. Among other problems, they are too time-consuming and require expensive resources, particularly for the laboratory in charge of the organisation. Moreover, the general guidelines for method validation are too


impractical to be used in industrial laboratories or laboratories that monitor, for instance, trace-level organic compounds.

Because of these unfavourable factors, the Association of Official Analytical Chemists, AOAC International, has introduced the Peer Verified Method Program (AOAC 1998) so that methods used by laboratories working with only one or two others can be validated. The aim of this program is to provide a class of tested methods whose performance has been checked in at least one independent laboratory.

    7. IN-HOUSE VALIDATION

We have seen that collaborative trials give rise to a sound validation of methodologies. But they also present some drawbacks:

- They require gathering enough laboratories that can devote the necessary resources to achieve a proper validation.

- There are not enough scientists willing to undertake the organisation of the very large number of method trials that are required.

- There are problems with shipping regulations, import restrictions and analyte stability.

- This type of study is uneconomic, too slow and restricted in scope.

For these and other reasons, there is currently a move towards giving more importance to the use of methods of analysis that have only been validated in-house, provided that the laboratory implements appropriate quality assurance procedures. In-house validation is the process of checking, inside a laboratory, that a method gives correct results. It can provide a practical and cost-effective alternative approach to other validation systems.

Many analysts think that a previously validated standard or reference method may be directly applied in their laboratory. The previous validation of the method is erroneously considered a sufficient guarantee of quality results. This is not true: when a new method is introduced in a laboratory (even one previously validated by an international organisation) it always has to be validated, although in this case a full validation is not needed. Since there is some parallelism between the analytical laboratory and the kitchen, we can compare a validated analytical procedure with a recipe in a good cookbook. There are many good, previously tested recipes that are very carefully described. Is that enough to get a good dish in our own kitchen? We all know that, unfortunately, it is not. The same happens with validated methods: we can get good results using them, but we need to prove that they work adequately in our laboratory. Hence a method always has to be validated to check that its quality parameters are valid for the particular analytical problem that we have in our laboratory (EURACHEM 1998). Some degree of validation is always needed, although its extent may change significantly in each particular situation. Some of these particular situations are (AOAC/FAO/IAEA/IUPAC 2000):

The laboratory wants to use as a routine method one previously validated in another laboratory or group of laboratories: here a full validation is not necessary, since the absence of method bias has already been assessed. The laboratory only needs to verify that it is capable of achieving the published performance characteristics of the method by checking the absence of laboratory bias and by calculating its own precision (repeatability and intermediate precision) estimates. In the case of measurements at low concentration values, the laboratory also needs to calculate the limits of detection and quantification, since these values depend, among other factors, on the laboratory instrumentation and on the analysts.

The laboratory wants to use a validated method, but the analytes to be determined are either new or found in new sample matrices: in this case the validation has to be extended to the new analytes or matrices. Parameters like trueness, precision, calibration, analytical ranges and selectivity have to be checked for every new analyte or matrix.

The method is published in the scientific literature together with some analytical characteristics: the laboratory has to undertake verification and a limited validation in order to meet the requirements.

The method is published in the scientific literature but no characteristics are given: the laboratory has to undertake a full validation of the method and a self-verification in order to establish the required characteristics.

The method has been developed in-house: the laboratory has to arrange a full validation of the method.

Apart from these situations, some other checks are routinely used to test the whole analytical procedure before starting a new set of analyses, for instance, checking the sensitivity of the method (slope of the calibration straight line) or the selectivity in some chromatographic assays. These latter checks are often referred to as suitability checks.

Another different kind of in-house validation is retrospective validation. Retrospective validation is the validation of a method already in use, based upon accumulated production, testing and control data. In some cases an analytical method may have been in use without sufficient validation. In these cases, it may be possible to validate, to some extent, the adequacy of the method by examination of


accumulated analyses. Retrospective validation allows, for instance, precision to be estimated over a period of time.

    8. REPORTING METHOD VALIDATION

Once an analytical method has been validated, all the procedures must be documented to ensure that the method will always be used under the same conditions. The values of the performance criteria are quite sensitive to any variations in the procedure caused by a lack of proper documentation: the uncertainty usually increases and bias may appear. Inconsistencies when the validated method is used to test samples can only be avoided, therefore, by proper reporting and proper application of the validation process.

Several documents recommend the way in which a validated method should be reported. Among others, ISO 78/2 provides advice on how to document general chemical methods. It is advisable that the standardised written procedures be reviewed by an experienced independent analyst, so that all possible misinterpretations are detected and avoided.

Following the general procedures for managing written documents, the authorised person should check that only the up-to-date documents are used. Similarly, documents should only be modified by the person responsible, who should also ensure that obsolete documents are withdrawn and the revised methodologies put into practice.

    REFERENCES

AOAC (1998) Peer Verified Programs. http://aoac.org/vmeth/peerverimtd[1].htm

AOAC/FAO/IAEA/IUPAC (2000) Guidelines for Single-Laboratory Validation of Analytical Methods for Trace-Level Concentrations of Organic Chemicals. In: Principles and Practices of Method Validation. Edited by A. Fajgelj and A. Ambrus. Royal Society of Chemistry, Cambridge.

Barwick, V.J., Ellison, S.L.R. (1999) Measurement uncertainty: Approaches to the evaluation of uncertainties associated with recovery. Analyst, 124, 981-990.

Barwick, V.J., Ellison, S.L.R. (2000) Development and Harmonisation of Measurement Uncertainty Principles. Part (d): Protocol for uncertainty evaluation from validation data, VAM Project 3.2.1, Report no. LGC/VAM/1998/088. LGC, Teddington.

BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International Vocabulary of Basic and General Terms in Metrology, VIM. ISO, Geneva.

Currie, L.A. (1997) Detection: International update, and some emerging dilemmas involving calibration, the blank, and multiple detection decisions. Chemom. Intell. Lab. Syst., 37, 151-181.

EURACHEM (1998) The Fitness for Purpose of Analytical Methods. A Laboratory Guide to Method Validation and Related Topics. EURACHEM Secretariat, Teddington, Middlesex. http://www.vtt.fi/ket/eurachem

EURACHEM/CITAC (2000) EURACHEM/CITAC Guide: Quantifying Uncertainty in Analytical Measurement, 2nd Edition, Teddington, Middlesex.

den Boef, G., Hulanicki, A. (1983) Recommendations for the usage of selective, selectivity and related terms in analytical chemistry. Pure Appl. Chem., 55, 553-556.

Horwitz, W. (1995) Protocol for the Design, Conduct and Interpretation of Method-Performance Studies. Pure Appl. Chem., 67, 331-343.

ICH (1994) Harmonised Tripartite Guideline prepared within the Third International Conference on Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for Human Use (ICH), Text on Validation of Analytical Procedures, 1994. http:/www.ifpma.org/ich1.html

ISO, International Organization for Standardization (1993) ISO 3534-1, Statistics, Vocabulary and Symbols. ISO, Geneva.

ISO, International Organization for Standardization (1994a) ISO 5725, Accuracy (Trueness and Precision) of Measurement Methods and Results. ISO, Geneva.

ISO, International Organization for Standardization (1994b) ISO 8402, Quality - Vocabulary. ISO, Geneva.

ISO, International Organization for Standardization (1997) ISO 11843-1, Capability of Detection. Part 1: Terms and Definitions. ISO, Geneva.

ISO, International Organization for Standardization (1999) ISO/IEC 17025, General Requirements for the Competence of Calibration and Testing Laboratories. ISO, Geneva. Ibid: UNE-EN ISO/IEC 17025. Requisitos generales relativos a la competencia de los laboratorios de ensayo y calibración. AENOR, Madrid, 2000.

IUPAC, International Union of Pure and Applied Chemistry (1995) Nomenclature in Evaluation of Analytical Methods including Detection and Quantification Capabilities. Pure Appl. Chem., 67, 1699-1723.

IUPAC, International Union of Pure and Applied Chemistry (1999a) Harmonised Guidelines for In-House Validation of Methods of Analysis (Technical Report). http://www.iupac.org/projects/1997/501_8_97.html

IUPAC, International Union of Pure and Applied Chemistry (1999b) Harmonised guidelines for the use of recovery information in analytical measurement. Pure Appl. Chem., 71, 337-348.

IUPAC, International Union of Pure and Applied Chemistry (2001) Selectivity in Analytical Chemistry, Technical Draft. http://www.iupac.org/publications/pac/2001/

IUPAC, International Union of Pure and Applied Chemistry (2002) Recommendations for the use of the term recovery in analytical procedures, Draft. http://www.iupac.org/

Kuttatharmmakul, S., Massart, D.L., Smeyers-Verbeke, J. (1999) Comparison of alternative measurement methods. Anal. Chim. Acta, 391, 203-225.

Maroto, A., Riu, J., Boqué, R., Rius, F.X. (1999) Estimating uncertainties of analytical results using information from the validation process. Anal. Chim. Acta, 391, 173-185.

Massart, D.L., Vandeginste, B.M.G., Buydens, L.M.C., de Jong, S., Lewi, P.J., Smeyers-Verbeke, J. (1997) Handbook of Chemometrics and Qualimetrics: Part A. Elsevier, Amsterdam.

Meloun, M., Militký, J., Hill, M., Brereton, R.G. (2002) Problems in regression modelling and their solutions. The Analyst. In press.

Parkany, M. (1996) The use of recovery factors in trace analysis. Royal Society of Chemistry, Geneva.

Satterthwaite, F.E. (1941) Synthesis of variance. Psychometrika, 6, 309-316.


STP Pharma Pratiques (1992) Guide de validation analytique: Rapport SFSTP. Méthodologie et exemples, 2 (4), 205-226.

Thompson, M., Wood, R. (1993) The International Harmonised Protocol for the Proficiency Testing of (Chemical) Analytical Laboratories. Pure Appl. Chem., 65, 2123-2144. Also published in J. AOAC Intl., 76, 926-940.

Thompson, M., Wood, R. (1995) Harmonised Guidelines for Internal Quality Control in Analytical Chemistry Laboratories. Pure Appl. Chem., 67, 49-56.

Vander Heyden, Y., Nijhuis, A., Smeyers-Verbeke, J., Vandeginste, B.G.M., Massart, D.L. (2001) Guidance for robustness/ruggedness tests in method validation. J. Pharmaceutical and Biomedical Analysis, 24, 723-753.

Youden, W.J. (1947) Technique for testing the accuracy of analytical data. Anal. Chem., 19, 946-950.

Youden, W.J., Steiner, E.H. (1975) Statistical Manual of the Association of Official Analytical Chemists. AOAC, Arlington, VA.
