
Chapter 2
LABORATORY METHODS

© 2001 Dennis A. Noe

LABORATORY MEASUREMENTS

The majority of the studies performed in the clinical laboratory consist of measurements of the concentrations of blood cells and of chemical substances in body fluids (Table 2.1). The chemical substances that are studied include gases, electrolytes, metabolic intermediates, waste products, tissue proteins, plasma proteins, hormones, micronutrients, drugs, and toxins. Most of what is discussed in this chapter applies most directly to such concentration measurements. However, with some modifications, the concepts can also be applied to other kinds of laboratory measurements.

THE PROCESS OF MEASUREMENT

The process of laboratory measurement proceeds in four steps: sample preparation, analyte separation, analytical signal production and detection, and calculation of results. Sample preparation consists of the operations that must be performed on a specimen to yield the test material that constitutes the sample. It is the sample, or a portion of the sample, that is introduced into the measurement system and on which the measurement is carried out. In the determination of the plasma concentration of a chemical substance such as creatinine, sample preparation consists of the separation of the plasma (or serum if the specimen is clotted blood) from the blood cells by centrifugation of the specimen.

There may be an analyte separation step that represents a more-or-less specific isolation of the analyte of interest from the other chemical substances in the sample. For instance, if creatinine is to be measured using the Jaffé reaction, it can be adsorbed to porous aluminum silicate clay or a cation exchange resin to separate it from other plasma substances that react with the signal generating reagent.

Analytical signals are generated and detected in a wide variety of ways. In automated blood cell counting, for instance, the signal consists of a voltage pulse arising from a change in electrical impedance across an aperture as a cell passes through the aperture. The signal is detected by a voltmeter. One way to produce an analytical signal for the measurement of chemical substances is by the formation of a chemical species that absorbs light. In the Jaffé reaction for creatinine, picrate reacts with creatinine under alkaline conditions to form a strongly light-absorbing red-orange compound, the Janovski complex. The signal consists of the reduction in the amount of light passing through the reaction solution. The signal is detected by a spectrophotometer.


Table 2.1  Concentrations of selected blood analytes (approximate, by order of magnitude)

Blood cells (cells/L)
  10^12   red cells
  10^11   platelets, reticulocytes
  10^9    neutrophils, lymphocytes
  10^8    eosinophils

Chemical substances (mol/L)
  10^-1   sodium, chloride
  10^-2   potassium, glucose, urea
  10^-3   calcium, carbon dioxide, albumin
  10^-4   uric acid
  10^-5   bilirubin, iron, haptoglobin, fibrinogen
  10^-6   cortisol
  10^-7   thyroxine, hydrogen ion
  10^-9   testosterone, factor VIII, cobalamin
  10^-10  ferritin
  10^-11  parathyroid hormone




Calculation of results

In order to arrive at a study result, the magnitude of the signal generated by a sample must be converted to a concentration value. This can be done in either of two ways. If the measurement system has been shown to have signal generating properties that closely match the theoretical ideal, the theoretical relationship between analyte concentration and signal magnitude can be used to calculate the result. For example, the theoretical relationship between analyte concentration and light absorbance is embodied in the Beer-Lambert law,

analyte concentration = absorbance / (analyte absorptivity × light path)

For an ideal system, dividing the observed absorbance value by the known values for the absorptivity of the analyte and the length of the light path in the detector yields the analyte concentration.
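As an illustration, the Beer-Lambert calculation reduces to a single line of arithmetic. The following Python sketch uses invented absorptivity and light-path values; it is not drawn from any particular method.

```python
def concentration_from_absorbance(absorbance, absorptivity, light_path_cm):
    """Beer-Lambert law: concentration = absorbance / (absorptivity * light path)."""
    return absorbance / (absorptivity * light_path_cm)

# Hypothetical values: molar absorptivity in L/(mol * cm), 1 cm light path.
print(concentration_from_absorbance(absorbance=0.48, absorptivity=4800.0, light_path_cm=1.0))
# -> approximately 1.0e-4 mol/L
```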

In practice, the behavior of measurement systems is rarely ideal, so the use of theoretical relationships is not a satisfactory way to calculate results. Instead, results are calculated using the empirical relationship between analyte concentration and signal magnitude as established by the measurement of signals produced by a set of test materials with known analyte concentrations. These test materials are called calibrators (formerly, standards) and the relationship between analyte concentration and signal magnitude is called a calibration curve. A hypothetical linear calibration curve is shown in the upper graph of Figure 2.1.

To convert the signal generated by a test sample to a concentration value, the calibration curve is used in reverse. That is, rather than finding the y (signal) value on the curve for a known x (concentration) value, an x value on the curve is found that corresponds to the known y value. For a signal of 100 units, for instance, the corresponding point on the calibration curve has a concentration of 10 units. The measurement result is, therefore, 10 concentration units. Because it is unconventional to reverse the roles of the x- and y-axes, results are calculated using what is called a measurement curve, which is identical to the calibration curve except that the x- and y-axes are switched (the lower graph in Figure 2.1). Note that inversion of the equation describing the calibration curve yields the equation that defines the measurement curve.
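The whole calibrate-then-invert sequence can be written compactly. The sketch below fits a linear calibration curve to invented calibrator data with numpy.polyfit and then applies the inverted (measurement) equation to a test-sample signal; it is illustrative only.

```python
import numpy as np

# Hypothetical calibrator concentrations (x) and observed signals (y).
conc   = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
signal = np.array([1.0, 52.0, 99.0, 151.0, 198.0])

# Calibration curve: signal = b0 + b1 * concentration (ordinary least squares).
b1, b0 = np.polyfit(conc, signal, deg=1)

# Measurement curve: the inverted calibration equation, concentration = (signal - b0) / b1.
def measure(sample_signal):
    return (sample_signal - b0) / b1

print(round(measure(100.0), 1))  # a signal of 100 units corresponds to roughly 10 concentration units
```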

Calibration and measurement curves are usually constructed each time that a batch of measurements is made. Calibrators are run along with the test samples and the parameters of the equations defining the curves are estimated from the observed calibrator and signal magnitude pairs. The number and spacing of the calibrators should be chosen with the intent of providing for highly reliable estimation of the equation parameters. According to the statistical theory of optimal design, there exists a unique set of calibrator concentrations that yields the most precise estimates of the equation parameters (Fedorov 1972, Steinberg and Hunter 1984). In general, the number of separate concentrations that should be evaluated equals the number of parameters in the equation. In the case of a linear calibration curve there are two parameters, the slope and the intercept, so two concentrations should be evaluated.


Figure 2.1  Hypothetical calibration and measurement curves. The calibration curve plots signal magnitude (0 to 200) against calibrator concentration (0 to 20) according to y = b0 + b1 x; the measurement curve plots sample concentration against signal magnitude according to y = (x − b0) / b1. The graphical technique for the calculation of the result given a signal of 100 units is depicted.


The optimal concentrations are the concentrations at each end of the range of measurement. Figure 2.2 illustrates a simulated application of this optimal design. The hypothetical linear calibration relationship has an intercept of zero and a slope of two and proportional measurement variability. Using a scheme in which the calibrators are evenly spaced over the clinical range results in the parameter estimates shown in the upper graph. Using the optimal scheme (lower graph) with the same number of calibrators gives parameter estimates that are closer to the true (simulation) values. An added benefit of the end-of-range calibrator spacing scheme is that the inconstancy of the measurement variability is much more clearly demonstrated than with the even spacing scheme. Optimal spacing schemes can also be devised for more complex equations such as those for immunoassays (Bezeau and Endrenyi 1986).

Ordinary linear regression is the technique most frequently used for the estimation of the parameter values of linear calibration curves. This technique has an underlying assumption: the variability in the measurement of the dependent variable has a constant variance (Berry 1993). If the variability in the measurement of the analytical signal is not constant over the range of calibrator concentrations employed, weighted linear regression analysis, which adjusts for the inconstancy in the variability of measurement, should be used instead.
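A brief sketch of the distinction, using numpy.polyfit with invented calibrator data in which the signal variability grows with concentration. Note that polyfit's weights apply to the unsquared residuals, so the conventional weight is the reciprocal of the signal standard deviation at each calibrator.

```python
import numpy as np

conc      = np.array([1.0, 5.0, 10.0, 20.0, 40.0])
signal    = np.array([2.1, 10.3, 19.8, 40.5, 79.0])
signal_sd = np.array([0.1, 0.5, 1.0, 2.0, 4.0])   # hypothetical replicate SDs (proportional variability)

# Ordinary least squares: every calibrator treated as equally reliable.
b1_ols, b0_ols = np.polyfit(conc, signal, deg=1)

# Weighted least squares: high-variability calibrators are down-weighted.
b1_wls, b0_wls = np.polyfit(conc, signal, deg=1, w=1.0 / signal_sd)

print(b0_ols, b1_ols)   # OLS intercept and slope
print(b0_wls, b1_wls)   # weighted intercept and slope, dominated by the precise low-end calibrators
```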

Nonlinear regression analysis is the most accurate and precise technique of parameter estimation for nonlinear curves (Motulsky and Ransnas 1987). It is the technique that is used to estimate the parameters of the sigmoidal calibration curves found in immunoassay systems.
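For the sigmoidal curves of immunoassays, one commonly used form (among several) is the four-parameter logistic model. The sketch below fits it to invented calibrator data with scipy.optimize.curve_fit; real immunoassay software typically also weights the fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(conc, a, b, c, d):
    """Four-parameter logistic: response runs from a (zero dose) to d (infinite dose),
    with midpoint c and slope factor b."""
    return d + (a - d) / (1.0 + (conc / c) ** b)

# Hypothetical immunoassay calibrators: concentration vs. signal.
conc   = np.array([0.1, 0.5, 2.0, 8.0, 32.0, 128.0])
signal = np.array([0.10, 0.26, 0.72, 1.50, 2.08, 2.31])

popt, _ = curve_fit(four_pl, conc, signal, p0=[0.05, 1.0, 5.0, 2.4], maxfev=10000)
print(popt)  # fitted a, b, c, d
```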

THE METHOD OF MEASUREMENT

Most measurements can be performed in a number of ways that vary in the nature of the sample, the technique of analyte separation, the means of producing and detecting the analytical signal, and the technique for calculating results. A specific way of performing a measurement is referred to as a method of measurement or, informally, as a laboratory method or, more simply still, as a method. Laboratory methods are the basic units of laboratory practice.

METHOD QUALITY

The quality of a laboratory method can be defined as the ability of the method to satisfy the clinical needs served by measurements made with the method. There is always a need for trueness and precision.

Trueness of measurement

Trueness is defined as the closeness of agreement between the true value of an analyte and the average result value obtained from a large number of replicate measurements (Stöckl 1996, Dybkaer 1995). It is the term that is currently applied to the concept that used to be called accuracy. Accuracy is now used to denote the broader concept of closeness of agreement between the true value of an analyte and a single measurement result. As such, accuracy reflects both trueness and precision of measurement. Trueness is measured on an ordinal scale of the sort: poor/average/excellent.


Figure 2.2  Calibration curves as obtained by two different calibrator spacing schemes. Both graphs plot signal magnitude (0 to 100) against calibrator concentration (0 to 50). The even spacing scheme is shown in the top graph (estimated intercept = 0.053, slope = 1.994) and the end-of-range spacing scheme is shown in the bottom graph (estimated intercept = 0.004, slope = 1.997).


The inverse of trueness, which is called bias (or systematic error of measurement), is a quantitative measure. It is the difference between the true value of an analyte and the average result value. Because it is useful to have a quantitative measure of quality, bias is the measure that is applied when appraising the trueness of a measurement method.

Bias arises when the calibration process does not reflect the test measurement process perfectly. Causes of bias include matrix effects, calibrator effects, and treatment effects (Strike 1996).

Matrix effects arise from the differences that exist between the complex biological matrix found in test samples and the artificial matrix of the calibrators. Physical matrix effects are caused by physical properties, such as sample viscosity, that result in test samples being processed differently than calibrators by the measurement instrument. Nonspecific chemical matrix effects, called interference effects, are caused by substances in the test sample that, while not generating a signal themselves, affect the magnitude of the signal generated by the analyte being measured (Kroll and Elin 1994). Specific chemical matrix effects are referred to as cross-reaction effects. They are caused by substances in the test sample that generate a signal identical to that of the analyte of interest. Considerable effort is devoted to the evaluation of cross-reactions during the development of laboratory methods and various techniques may be employed to improve the specificity of the method by reducing or eliminating the cross-reacting substances. Separation of the analyte from cross-reacting substances, discussed earlier as a frequent step in the measurement process, is one means of increasing method specificity. The selective adsorption of creatinine to fuller's earth was mentioned as an example. Some other separation techniques are listed in Table 2.2. One of these techniques, liquid chromatography, is even more successful than fuller's earth in specifically isolating creatinine from substances that cross-react in the Jaffé reaction.

Method specificity can also be improved in the analytical signal generation and detection step of analyte measurement. One approach taken at this step is to increase the selectivity of the signal generating reaction so that only the specific analyte participates in the reaction. One way to do this is to use analyte-specific enzymes to catalyze a reaction that leads to the production of the signal. A number of enzymatic methods are available for the measurement of creatinine. In the most popular method, creatinine is hydrolyzed to creatine by the highly specific enzyme, creatinine amidohydrolase. Creatine formation is coupled, through a series of specific enzymatic reactions, to the production of a light-absorbing species that provides the analytical signal. The specificity inherent in enzyme reactions can also be taken advantage of when an enzyme is itself the analyte of interest. In that case, a reagent that is a substrate of the enzyme undergoes a catalytic conversion to a product that is coupled to the production of the analytical signal. For example, creatine kinase concentrations are determined by measuring the rate of formation of ATP and creatine from the substrates ADP and creatine phosphate. Note that, in methods of this sort, enzyme concentrations are measured and reported in terms of enzymatic activity rather than substance concentration.

Another approach for improving method specificity in the analytical signal generation and detection step is to increase the selectivity of signal detection so that only the signal generated by the analyte is detected. One way this is done is by the so-called kinetic technique, which depends upon a differential rate of signal production for the analyte and cross-reacting substances. In the Jaffé reaction, the cross-reacting substances tend to react with picrate slowly compared to creatinine, so the initial rate of Janovski complex formation is due largely to creatinine. By measuring the initial rate, the signal from creatinine can be segregated from the signals arising from the cross-reacting substances.

Bias due to calibrator effects arises from differences between the analyte used in the calibrators and the analyte as found in patients. One way this happens is when a class of chemical species represents the analyte of interest but the calibration material is based on a single species within the class. The measurement of total protein in the urine is a good example of this situation. Another cause of calibrator effects is the use of non-human or altered human analyte in calibrators. Analytes used in calibrators must be available in quantity and they must be stable. It is often simply impossible to procure from human sources adequate amounts of trace analytes, such as hormones. It is similarly impossible to preserve in unaltered form fragile analytes such as blood cells.

Bias can also be caused by treatment effects. These effects arise when test specimens and calibrators are not treated in an identical fashion when preparing analytical samples. For example, prior to measuring the concentration of some protein-bound and intracellular analytes, it is necessary to release the analyte into solution. To accomplish this, a release reaction will be performed on the test specimens. The same reaction may not be performed on the calibrators because the analyte in calibrators is already in solution.

Precision of measurement

Precision is defined as the closeness of agreement among the result values obtained from a large number of replicate measurements (Dybkaer 1995). Precision is measured on an ordinal scale but its inverse, imprecision, is a quantitative measure. Imprecision (or random error of measurement) is the dispersion of results for a large number of replicate measurements. It is usually expressed in terms of standard deviations.

Imprecision arises from multiple sources which can be categorized according to the following scheme (Dybkaer 1995): (1) those that arise during a single batch of measurements, (2) those that arise over the course of the performance of multiple batches of measurement, and (3) those that arise when several laboratories contribute to the production of results.

Imprecision arising during a single batch of measurements is called within-run, or within-batch, imprecision and is measured in terms of the within-run standard deviation. Because it quantifies method precision in the setting of the minimum number of sources of measurement variability, within-run imprecision is the minimum imprecision attainable by the method. The causes of within-run imprecision include volumetric errors, instrumental fluctuations, variability in the efficiency of the separation step, and vagaries in the rate and completeness of the signal generating step.

Imprecision arising from the performance of multiple batches of measurement is called between-run, or between-batch, imprecision and is measured in terms of the between-run standard deviation. The causes of between-run imprecision include recalibration of the measurement system, different calibrators and reagents, different operators, and time-dependent phenomena such as drift in the performance characteristics of the measurement instrument.


Table 2.2  Selected Separation Techniques for Improving Method Specificity

Membrane permeability
  An analyte-permeable membrane separates the sample from the site of signal generation; only the specific analyte passes through the membrane.
  Examples: O2 and CO2 electrodes, ion-selective electrodes, ultrafiltration, equilibrium dialysis.

Electrophoresis
  Chemicals in buffer migrate through a support medium in an electric field; chemicals have different migration rates due to different charges and different degrees of interaction with the stationary phase; the specific analyte migrates to a characteristic location.
  Examples: cellulose acetate electrophoresis, agarose gel electrophoresis (named for the support medium).

Chromatography
  A mobile phase (gas or liquid) passes through a column with a stationary phase bound to a support medium; chemicals in the mobile phase have different degrees of interaction with the stationary phase, which leads to different migration rates through the column; the specific analyte elutes from the column at a characteristic retention time.
  Examples: gas chromatography, liquid chromatography (named for the mobile phase).

Antibody binding
  Analyte-specific antibodies are bound to a support medium; only the specific analyte binds to the antibodies and is retained on the support medium following a wash.
  Examples: heterogeneous radio-, enzyme-, and fluorescence immunoassays.



Within-laboratory imprecision is the total imprecision in the measurement of an analyte in a single laboratory. It reflects the variability arising within and between runs. The variances of these two sources add together to give the total variance,

varwithin-laboratory = varwithin-run + varbetween-run

where var is variance. In terms of the usual measure of imprecision, standard deviations,

SDwithin-laboratory = sqrt( SDwithin-run² + SDbetween-run² )

where SD is standard deviation.

The imprecision that arises when several laboratories contribute to the production of results is called between-laboratory imprecision. It is caused by inter-laboratory variation in calibrators, calibration spacing scheme, choice of calibration function, and technique for estimation of calibration curve parameters. Other causes include differences in operating conditions, differences in operator skill, and differences in the measurement system.

Resolving power. The precision of a method determines how good the method is at distinguishing differences in analyte concentration. This property, referred to as the resolving power of a method, is a useful alternative measure of method precision, especially when small changes in analyte concentration must be discerned and when trace concentrations of analyte must be detected (Sadler et al. 1992, Gautschi et al. 1993). The resolving power of a method is what is often referred to as the analytical sensitivity of the method (Ekins and Edwards 1997). Resolving power is a less confusing term, however, because analytic sensitivity is also taken to mean the slope of the calibration curve (Pardue 1997).

The usual way in which resolving power is expressed is as the minimum distinguishable difference in concentration, Dmin. This parameter can be defined for within-run differences or for between-run differences. For between-run differences (Sadler et al. 1992),

Dmin = zc × sqrt(2) × SDwithin-laboratory

where zc is the confidence coefficient as found with the standard normal distribution; zc equals 1.645 for a 95% confidence level. This formula assumes that method precision is essentially constant over intervals of analyte concentration equal in length to Dmin. The detection limit of a method, which is the smallest analyte concentration that can reliably be distinguished from zero, is a special case of Dmin.
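A short numerical sketch, with invented component SDs, that strings together the within-laboratory SD and the between-run Dmin as given above:

```python
import math

sd_within_run  = 1.5   # hypothetical, in concentration units
sd_between_run = 1.0   # hypothetical

sd_within_lab = math.sqrt(sd_within_run ** 2 + sd_between_run ** 2)   # about 1.80

z_c = 1.645                                   # 95% confidence coefficient
d_min = z_c * math.sqrt(2.0) * sd_within_lab  # about 4.19: the smallest between-run
                                              # difference distinguishable from imprecision
print(round(sd_within_lab, 2), round(d_min, 2))
```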

Hierarchy of method quality

The ideal of laboratory practice is to implement methods of the highest quality. Unfortunately, of the methods available for the measurement of a particular analyte, those of the very highest quality are always too expensive and too impractical for most clinical laboratories. These methods, which are called definitive methods, are used to validate the accuracy of the methods at the next level of quality, called reference methods. Reference methods, which have only negligible inaccuracy compared to definitive methods, are generally less costly than definitive methods but they are still impractical for routine use. They are used to validate the accuracy of the affordable and practical methods of lower quality that are actually implemented in the clinical laboratory. These methods are called field methods. This hierarchic chain of validation of the accuracy of laboratory methods represents one of the two elements of the system of accuracy transfer that is used to assure the quality of field methods. The other element of the system is a hierarchy of calibrators. In this hierarchy, field methods are calibrated with secondary reference materials, these being calibrators whose values have been established using a reference method. Reference methods, in turn, are calibrated with primary reference materials, which are calibrators whose values have been certified by competent authority through the use of a definitive method.

Analytical quality goals

It is recognized that field methods cannot provide reference method-level analytical quality given the constraints of affordability and practicality within which the methods must operate. However, it is necessary that the methods achieve a minimum level of quality, one that allows them to be of use clinically. It is therefore useful to define a desirable level of quality that can be used by both method developers and laboratorians as a benchmark for field method performance.

A number of different approaches can be used to define desirable analytical quality goals (Stöckl et al. 1995). These approaches include defining goals in keeping with the current "state of the art" in high-quality laboratories, having experts define the goals, and basing goals on the quality expectations of clinicians. Additionally, quality goals can be derived from a consideration of the biologic variability of the analyte being measured. For instance, as discussed in Chapter 1, the extent of the variability in study results with repeated testing of an individual is determined by the within-individual biologic variability of the analyte and the within-laboratory analytical variability,

SDwithin-individual = sqrt( SDwithin-individual,biologic² + SDwithin-laboratory² )

The fractional increase in the total within-individual variability attributable to analytic variability is, therefore,

sqrt( 1 + (SDwithin-laboratory / SDwithin-individual,biologic)² ) − 1

To keep the contribution of the analytical component at a reasonable level, say 10 percent, the ratio of the within-laboratory variability to the within-individual biologic variability must be less than 0.459. Rounding up to 0.5 (for which the fractional increase in within-individual variability is 11.8 percent) yields the quality goal for repeated testing (Cotlove et al. 1970, Harris 1979, Fraser et al. 1997),

SDwithin-laboratory < 0.5 SDintra-individual,biologic

This rule can also be expressed as

CVwithin-laboratory < 0.5 CVintra-individual,biologic

which is particularly useful if within-individual biologic variability is proportional to analyte concentration. Then both coefficients of variation will be constant.
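The 0.459 and 11.8 percent figures follow directly from the expression above, as a quick check shows:

```python
import math

def fractional_increase(ratio):
    """Fractional increase in total within-individual variability when
    SDwithin-laboratory equals ratio times SDwithin-individual,biologic."""
    return math.sqrt(1.0 + ratio ** 2) - 1.0

print(round(fractional_increase(0.459), 3))   # 0.1   -> a 10 percent increase
print(round(fractional_increase(0.5), 3))     # 0.118 -> an 11.8 percent increase
```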

The reference interval for an analyte depends upon the median analyte value and the total (intra- and inter-individual) biologic variability of the analyte. It also depends upon the bias in the measurement of the analyte and the within-laboratory analytical variability. When calculated based on the assumption of a normal frequency distribution,

reference interval = median value + bias ± 1.96 SDtotal

where

SDtotal = sqrt( SDbiologic² + SDwithin-laboratory² )

The presence of bias results in a displacement of the measured reference interval from the true reference interval, which is

median value ± 1.96 SDbiologic

As a result, at one side of the reference interval, individuals who fall inside the measured reference interval fall outside of the true reference interval. At the other side of the reference interval, individuals who are outside of the measured reference interval are inside the true reference interval. The fraction of the population misclassified in this way should be kept to an acceptable level. If a 5 percent misclassification rate is used as the standard, in the absence of analytical imprecision, the ratio of the bias to the total biologic variability should be kept less than 0.315. Rounding down to 0.25 (for which the misclassification rate equals 3.7 percent) yields the quality goal for reference intervals (Gowans et al. 1988 and 1989, Fraser et al. 1997),

bias < 0.25 SDbiologic

which can also be expressed as

relative bias < 0.25 CVbiologic

Analytic imprecision widens the reference interval and thereby also results in misclassification. Using 3.7 percent misclassification as the standard, in the absence of bias, the ratio of the within-laboratory variability to the total biologic variability should be kept less than 0.56. This is another quality goal for reference intervals (Fraser et al. 1997),

SDwithin-laboratory < 0.56 SDbiologic

which can also be expressed as

CVwithin-laboratory < 0.56 CVbiologic

Of course, the misclassification rate should also be within desirable limits when both bias and imprecision are present. Figure 2.3 shows the paired values for relative bias and imprecision that satisfy the 3.7 percent misclassification standard.

The application of these quality goals can be illustrated by using them to define the desirable analytical quality of a field method for plasma creatinine concentration. At a concentration of 100 µmol/L, the average SDwithin-individual,biologic is 4.3 µmol/L (CVwithin-individual,biologic, 4.3%) and the average SDbiologic is 11.3 µmol/L (CVbiologic, 11.3%) (Sebastián-Gámbaro et al. 1997). The quality goal for precision based on the rule for repeated testing is an SDwithin-laboratory of less than 2.15 µmol/L (CVwithin-laboratory of less than 2.15%). At this level of imprecision, the ratio of SDwithin-laboratory to SDbiologic is 0.19. So, using Figure 2.3, the quality goal for trueness based on the rule for reference intervals is a relative bias of less than 0.215, which corresponds to an absolute bias of less than 2.43 µmol/L.
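The same arithmetic, written out as a sketch; the relative-bias limit of 0.215 is simply read from Figure 2.3 for an imprecision ratio of about 0.19, not computed here.

```python
# Plasma creatinine at 100 µmol/L (Sebastián-Gámbaro et al. 1997).
sd_within_individual_biologic = 4.3    # µmol/L
sd_biologic                   = 11.3   # µmol/L (total biologic variability)

# Precision goal from the repeated-testing rule.
sd_within_laboratory_goal = 0.5 * sd_within_individual_biologic      # 2.15 µmol/L

# Trueness goal: relative bias limit taken from Figure 2.3 at
# SDwithin-laboratory / SDbiologic = 2.15 / 11.3, i.e. about 0.19.
relative_bias_limit = 0.215
bias_goal = relative_bias_limit * sd_biologic                        # about 2.43 µmol/L

print(sd_within_laboratory_goal, round(bias_goal, 2))
```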

MAINTAINING QUALITY

Once a laboratory method has been implemented in the clinical laboratory, its quality is maintained by strict adherence to the approved procedure for performing the method, by the maintenance of high technical skill among the method operators, by the regular maintenance of the instruments utilized in the method, and by a rigorous quality assurance program.

Written measurement procedure

The laboratory document that contains the description of the steps in the performance of a method is called the written measurement procedure (Dybkaer 1997). In addition to describing the method, this document includes introductory material, a description of the quality assurance program for the method, and a summary report of the quality evaluation of the method (Table 2.3).

The introductory material identifies the method, summarizes the clinical rationale for the measurement of the analyte and the laboratory rationale for the choice of method, specifies the type of specimen, and stipulates safety precautions in the use of the method. The introductory material also provides lexicographic support for an unambiguous reading of the method description and lists of cited and recommended references. The procedure is periodically reviewed and is abridged as necessary. The dates of review and the dates and details of changes in the method are recorded and filed with the introductory material.

The method description is thorough and detailed. It includes sections devoted to specimen collection and handling, reagents and equipment, preparation and performance of the method, calculation of results, and quality assurance. Aspects of specimen collection to be considered include any special preparation of the patient for the taking of a specimen and the identification of the collection device and the specimen container. The handling of the specimen is detailed as regards anaerobic conditions, temperature, the allowable time prior to processing, the method of processing, and the conditions of storage of the processed specimen. The list of reagents stipulates the identity and source of each reagent and gives instructions for its storage, handling, and disposal. The preparation of stock and working solutions is described and the shelf-lives of the solutions are indicated.


Figure 2.3  Analytical quality goals for method imprecision and method bias based on a consideration of patient classification using the reference interval for an analyte. Imprecision and bias are expressed relative to total biologic variability (bias/SDbiologic is plotted against SDwithin-laboratory/SDbiologic over the range 0 to 0.6).

Table 2.3  Components of a Written Measurement Procedure

Introductory material
  Title
  Table of contents
  Introduction
  Scope
  Warning and safety precautions
  Definitions
  Symbols and abbreviations
  References
  Dates

Method description
  Sampling and specimen handling
  Principle of measurement
  Reagents
  Apparatus
  Preparation of measurement system
  Use of measurement system
  Modifications of the usual procedure for special cases
  Calculation of results

Quality assurance program

Analytical performance description
  Analytical quality evaluation findings
  Method comparison findings


The instruments and auxiliary equipment called for by the method are identified and their operation and maintenance are covered, often by reference to the manufacturer's manual. The procedure for readying the equipment and the samples (natural samples, blanks, calibrators, and controls) is indicated. The steps involved in the performance of the method are presented in a sequential fashion; included are the steps leading to the standby condition, if there is one, and the steps for closing down the method. If the operating steps are different in certain circumstances, the modifications of the method and the circumstances for their application are described. The mathematical methods for deriving the calibration function and the measurement function are described and the algorithm for the calculation of sample results is given. The computer software to be used to perform these calculations is stated.

Quality assurance program

In the broadest possible sense, quality assurance is concerned with the reliability of the patient data generated by the laboratory. It therefore encompasses the procedures used to recognize, quantify, and control the sources of measurement variability that arise within the laboratory between the receipt of a specimen and the posting of the study results (Büttner et al. 1980a). In its more common usage, quality assurance refers to the control of the precision and trueness of laboratory methods. It is in this more narrow sense of quality control that quality assurance will be discussed here.

Internal quality control

Internal quality control refers to the procedures for quality monitoring, intervention, and remediation undertaken in a single laboratory (Büttner et al. 1983a, Nix et al. 1987, Petersen et al. 1996). The unit of control is typically the set of samples that constitute one batch of the method. As mentioned previously, each batch includes a set of calibrators for the purpose of constructing a calibration curve for that batch. This is done to reduce the variability that results from between-run calibration variation in the method. As this is a quality maintenance goal, batchwise calibration is properly considered one of the elements of internal quality control.

Also included in each batch of samples is a set of control samples for which the measurement result frequency distributions are known. Using statistical tests called control rules to compare the current results for the control samples with their known frequency distributions, the trueness and precision of the method can be monitored.

Control samples are derived from control material rather than individual patient specimens but are otherwise handled in a fashion identical to test samples. Indeed, valid internal quality control depends upon the identical treatment of the control samples and the test samples. Control material must come from a large, homogeneous, and stable pool of material (Büttner et al. 1980c). The composition of control material should be as similar as possible to the composition of the test material used in the method being controlled. This requirement is best satisfied by control material made from human products but such material is generally biohazardous. Instead, the most widely used type of control material is commercially manufactured artificial material which is contrived to simulate the corresponding test material.

Using the techniques for the assessment of method quality discussed later in this chapter, the mean analyte concentration of the control material is established, as is the within-laboratory imprecision in the measurement of the concentration. For quality control purposes, method trueness is then evaluated in terms of this mean and method precision is evaluated in terms of this SDwithin-laboratory. This is done each time a new lot of control material is introduced into use in the laboratory.

Control rules. The logic of control rules is best appreciated by considering the effects that a decline in method quality has upon the frequency distribution of control sample results.

Degradation in the quality of a method may be characterized by an increase in method bias, by an increase in method imprecision, or by both. If there is an increase in method bias, the control sample results will tend to be displaced from the established mean value for the control material. This means that there will be an increased probability for the control sample results to be in the region of one of the tails of the established distribution. For instance, as shown in the graph on the left in Figure 2.4, if the current bias is equal to 1 SDwithin-laboratory, 15.9 percent of the control sample results will be larger than the established mean plus 2 SDwithin-laboratory. For the established distribution, only 2.3 percent of the control sample results would be that far above the mean. On the other tail of the distribution, the bias will result in only 0.1 percent rather than 2.3 percent of the control sample results having values smaller than the established mean minus 2 SDwithin-laboratory. The net effect of the bias is an overall increase in the percentage of control sample results outside of the central region of the established distribution, 16.0 percent versus 4.6 percent. Identical percentages apply when the bias is negative.

If there is an increase in method imprecision, there will be an increased probability for the control sample results to be in the region of the tails of the established distribution. An example of this is shown in the graph on the right in Figure 2.4. For a 1.5-fold increase in imprecision, 9.1 percent of the control sample results will be more than 2 SDwithin-laboratory larger than the established mean and 9.1 percent of the control sample results will be more than 2 SDwithin-laboratory smaller than the established mean. Hence, the net effect of the increase in imprecision is an overall increase from 4.6 percent to 18.2 percent in the percentage of control sample results outside of the central region of the established distribution.
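The tail percentages quoted above are properties of the normal distribution and are easily verified; the sketch below uses only the standard library.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fraction_outside_2sd(bias_in_sd=0.0, sd_ratio=1.0):
    """Fraction of control results outside the established mean +/- 2 SD when the
    method has the given bias (in established SD units) and its current SD is
    sd_ratio times the established SD."""
    upper = 1.0 - phi((2.0 - bias_in_sd) / sd_ratio)
    lower = phi((-2.0 - bias_in_sd) / sd_ratio)
    return upper + lower

print(round(fraction_outside_2sd(), 3))                 # 0.046: method in control
print(round(fraction_outside_2sd(bias_in_sd=1.0), 3))   # 0.16:  bias of 1 SD
print(round(fraction_outside_2sd(sd_ratio=1.5), 3))     # 0.182: 1.5-fold increase in imprecision
```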

In general, there is an increased probability for control sample results to be more than 2 SDwithin-laboratory larger or smaller than the established mean if there is currently a bias in the method or an increase in the imprecision of the method. Therefore, a control sample result of this magnitude can be taken as an indication of a reduction in method quality. This is the statistical basis of the 12s control rule: a batch of measurements should not be considered to be in-control if one control result exceeds the mean plus or minus 2 SDwithin-laboratory.

The performance of this control rule, or any control rule, is characterized by the relationship between the probability of rejecting a batch using the rule and the magnitude of the current change in quality in the method. The graphical presentation of this relationship is called the operating characteristic curve. Figure 2.5 shows the operating characteristic curves for the 12s control rule when using one to five control samples per batch. These curves represent the operating characteristics of the 12s control rule for detecting current bias; operating characteristic curves can also be drawn for the detection of increased imprecision, and a 3-dimensional surface can be used to present the operating characteristics for the simultaneous detection of bias and increased imprecision. A bias of zero means that the quality of the method is unchanged, so a rejection of a batch at this value represents a false rejection. Thus, the y-intercepts of the curves equal the rates of false rejection when using the indicated number of control samples.

In the quality control of a particular method, the control rule that should be used is the one that most closely achieves the clinical quality control goals for performance and false-rejection rate while minimizing the number of control samples run per batch. Consider, for instance, a method for which the false-rejection rate goal is 5 percent and the performance goal is 90 percent rejection of batches for which the current method bias is equal to or greater than 2.5 SDwithin-laboratory. Using the 12s control rule, two control samples must be run per batch to achieve the performance goal at the stipulated level of bias.


Figure 2.4  Frequency distributions of control sample results for a hypothetical laboratory method. The established distribution has a mean of 100 µmol/L and an SDwithin-laboratory of 10 µmol/L; it is shown as a light curve in both graphs, with its mean ± 2 SD limits marked. The current distributions are shown as dark curves. In the graph on the left, the method currently has a bias of 10 µmol/L. In the graph on the right, the SDwithin-laboratory of the method is currently 15 µmol/L.


However, with two control samples per batch, the false-rejection rate is too high, 8.9 percent. Using the 13s control rule (which states that a batch of measurements should not be considered to be in-control if one control result exceeds the mean plus or minus 3 SDwithin-laboratory), an acceptable false-rejection rate, 1.6 percent, can be achieved at the performance goal, but at the expense of requiring six control samples per batch. Fewer control samples per batch can be used while still achieving the quality control goals if a control rule intermediate between the 12s and 13s rules is employed. Specifically, using three control samples per batch, the 12.385s rule will yield a 5 percent false-rejection rate and a 90.6 percent rejection rate. In general, the best control rules in terms of control sample requirements are those for which the control limits have been calculated to achieve the specific quality control goals (Bishop and Nix 1993).
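These batch rejection rates can be reproduced with a short calculation. The sketch below assumes independent, normally distributed control results and expresses any rule of this type (one control result exceeding the mean ± k SD) as a rejection probability.

```python
import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rejection_probability(k, n_controls, bias_in_sd=0.0):
    """Probability that at least one of n control results falls outside
    mean +/- k SDwithin-laboratory, for a given current bias (in SD units)."""
    p_one_result_in = phi(k - bias_in_sd) - phi(-k - bias_in_sd)
    return 1.0 - p_one_result_in ** n_controls

print(round(rejection_probability(2.0, 2), 3))                    # ~0.089 false rejections (12s, n = 2)
print(round(rejection_probability(2.0, 2, bias_in_sd=2.5), 3))    # ~0.90 rejection at a bias of 2.5 SD
print(round(rejection_probability(3.0, 6), 3))                    # ~0.016 false rejections (13s, n = 6)
print(round(rejection_probability(2.385, 3), 3))                  # ~0.05 false rejections (12.385s, n = 3)
print(round(rejection_probability(2.385, 3, bias_in_sd=2.5), 3))  # ~0.906 rejection at a bias of 2.5 SD
```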

A somewhat different approach to evaluating control performance is taken when considering the detection of long-term, or persistent, quality degradation in a method. Here the focus is on how many batches will be accepted before the control rule leads to a batch rejection and, thereby, detection of the quality problem (Nix et al. 1987). The most informative way to present this performance behavior is as the cumulative run length distribution for the control rule. This distribution gives the probability of having rejected any batch, including the current batch, as a function of the number of batches run since the inception of the quality problem. Figure 2.6 shows the cumulative run length distributions for the 12s rule at five different levels of persistent method bias. The graph also has a line that indicates the medians of the distributions. For instance, the median run length for zero method bias is 15 batches. That is the median number of consecutive batches that can be expected to be accepted using this control rule when the quality of the method is unchanged. The median run lengths for the other levels of persistent method bias in the graph are 9, 4, 2, and 1, in order of increasing bias. (Note: average run length is the usual measure for the evaluation of control rules in the setting of a persistent problem in method quality. Because run length frequency distributions are highly right-skewed, average run lengths will be larger than median run lengths. For example, the average run length for the 12s control rule is 22 batches when the quality of the method is unchanged. Compare this to the median value of 15 batches. Because it is less influenced by extreme run length values of low probability, median run length better reflects the run length behavior expected of a control rule.)
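For a per-batch rejection probability that stays constant from batch to batch, the run length follows a geometric distribution, so the median and average run lengths quoted for the 12s rule (one control per batch) can be checked directly:

```python
import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Per-batch rejection probability for the 12s rule, one control, no bias.
p_reject = 1.0 - (phi(2.0) - phi(-2.0))          # about 0.0455

median_run_length = math.ceil(math.log(0.5) / math.log(1.0 - p_reject))
average_run_length = 1.0 / p_reject

print(median_run_length, round(average_run_length))   # 15 batches and 22 batches
```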

The control rule used to monitor a method for persistent quality degradation must satisfy the clinical performance and false-rejection quality control goals for such problems. The performance goal can be expressed as a maximum median (or average) run length at a specified level of quality decline in the method. The false-rejection quality goal can be expressed either as an acceptable median run length when the quality of the method is unchanged or, as for single-batch quality monitoring, as an acceptable false-rejection rate per batch. Typically, single-rule control procedures are not able to provide the requisite combination of performance and false-rejection rate.


Figure 2.5  Operating characteristic curves for the detection of current method bias using the 12s control rule. Rejection probability is plotted against bias (in multiples of the within-laboratory SD, 0 to 4) for one to five control samples per batch.

Figure 2.6  Cumulative run length distributions for the detection of persistent method bias using the 12s control rule. Rejection probability is plotted against run length (0 to 20 batches) for method bias in half multiples of SDwithin-laboratory, from 0 to 2.0.


A number of multiple-rule procedures have been proposed that perform much better than single-rule procedures. The multiple-rule procedure of Westgard et al. (1981) has achieved the greatest popularity. The rules used in the procedure are listed in Table 2.4. The control sample set for this procedure consists of a low concentration control sample and a high concentration control sample. If both control results pass the 12s control rule using the respective established result distributions, the batch is considered to be in-control. If one of the control results exceeds its 2 SDwithin-laboratory limits, the results are evaluated using the remaining control rules. If the results fail to pass any of the rules, the batch of measurements is rejected. If the results pass all of the rules, the batch is in-control.
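A sketch of the batch-level logic for two controls, following the rule definitions in Table 2.4. For simplicity a single established mean and SD are used, whereas in practice the low and high controls are each judged against their own established distributions, and the between-batch forms of the 22s, 41s, and 10x̄ rules also require the control history from earlier batches.

```python
def batch_in_control(results, mean, sd):
    """Westgard-style multiple-rule check applied to one batch of control results."""
    z = [(x - mean) / sd for x in results]

    # 12s warning rule: if no result exceeds +/- 2 SD, the batch is accepted outright.
    if all(abs(v) <= 2.0 for v in z):
        return True

    # 13s rule: any single result beyond +/- 3 SD rejects the batch.
    if any(abs(v) > 3.0 for v in z):
        return False

    # R4s rule: one result above +2 SD and another below -2 SD rejects the batch.
    if any(v > 2.0 for v in z) and any(v < -2.0 for v in z):
        return False

    # 22s rule (within-batch form): two results beyond 2 SD on the same side rejects the batch.
    if sum(v > 2.0 for v in z) >= 2 or sum(v < -2.0 for v in z) >= 2:
        return False

    # Only a 12s warning: the batch is reported as in-control.
    return True

# Hypothetical control results against an established mean of 100 and SD of 10:
print(batch_in_control([101.0, 122.0], mean=100.0, sd=10.0))   # True  (12s warning only)
print(batch_in_control([122.0, 131.0], mean=100.0, sd=10.0))   # False (13s violation)
```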

Cumulative sum procedures and moving average procedures have also been proposed as tools for monitoring for persistent degradation in method quality (Nix et al. 1987, Strike 1996). The performance of these approaches in the control of method trueness has been shown to be superior to that of the multiple-rule procedures (Bishop and Nix 1993, Parvin 1992). These approaches are computationally intensive but can easily be implemented in the modern computerized laboratory.

Test sample-based quality control. The use of control material for internal quality control has several practical problems. The material is expensive, it may have limited stability, and often its composition is different from that of the test samples. This has prompted the development of alternative procedures for monitoring method precision and trueness that use test samples rather than control material. These procedures have not been accepted as replacements for quality control using control material but many laboratories use them to supplement their control material-based internal quality control program.

To monitor within-run precision, a test sample is divided into aliquots prior to being assayed and the aliquots are run in the same batch. The variance of the replicate results for the test sample is calculated using the formula,

var = Σ (xi − mean)² / (n − 1)

where xi is the ith replicate result, mean is the mean of the replicates, and n is the number of replicates. If duplicates are used, the formula is,

var = (x1 − x2)² / 2

Once 20 to 30 such replicate determinations have been made, the within-run imprecision is estimated using the formula,

SDwithin-run = sqrt( Σ varj / N )

where N is the number of test samples studied. This estimate is compared to the within-run imprecision established for the method. When the next test sample replicate set is run, its variance is added to the variance data set and the oldest variance value in the data set is deleted. The within-run imprecision is recalculated and the updated estimate is compared to the imprecision standard. Within-laboratory analytic precision can be evaluated in a similar fashion by having test sample replicates assayed in different batches.
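A sketch of the duplicate-based calculation with invented results; a real program would maintain the rolling set of 20 to 30 most recent duplicate pairs described above.

```python
import math

def within_run_sd_from_duplicates(duplicate_pairs):
    """Pooled within-run SD from test samples assayed in duplicate:
    var_j = (x1 - x2)^2 / 2 for each pair, SDwithin-run = sqrt(sum(var_j) / N)."""
    variances = [(x1 - x2) ** 2 / 2.0 for x1, x2 in duplicate_pairs]
    return math.sqrt(sum(variances) / len(variances))

# Hypothetical duplicate results (concentration units) for five test samples:
pairs = [(101.0, 99.0), (87.0, 90.0), (110.0, 108.0), (95.0, 95.0), (120.0, 117.0)]
print(round(within_run_sd_from_duplicates(pairs), 2))   # about 1.61
```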

To monitor method trueness, the mean value is calculated for a block of consecutive test sample results. The block may consist of a specified number of results, of all of the results in a batch, or of all of the results for a defined period of time, usually a day. The block mean is compared to the population mean established for the method. Assuming that the mix of patients is similar over time, the mean value of a block of test sample results will equal the established population mean. Truncation of the data to exclude extreme result values improves the performance of the method. Improved performance also comes from the use of a smoothing function for the calculation of the mean values (Gardner 1985, Strike 1996).


Table 2.4  Multiple-Rule Control Procedure (SD is SDwithin-laboratory)

Warning rule
  12s    one control result exceeds the mean ± 2 SD

Within-batch rules
  13s    one control result exceeds the mean ± 3 SD
  R4s    one control result exceeds the mean + 2 SD and one control result exceeds the mean − 2 SD

Within- and between-batch rules
  22s    two consecutive control results exceed the mean + 2 SD or the mean − 2 SD
  41s    four consecutive control results exceed the mean + 1 SD or the mean − 1 SD
  10x̄    ten consecutive control results fall on one or the other side of the mean


Smoothing leads to a reduction in the variability in the estimates of the means that would otherwise arise from day-to-day variation in the makeup of the clinical population from which the test samples come. In addition, if the test sample mean is recalculated for each successive result rather than for blocks of results, method trueness can be continuously monitored, although continuous monitoring may not offer any better performance than block-wise monitoring (Smith and Kroft 1997).
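One simple smoothing function of this kind is an exponentially weighted moving average of the block means. The sketch below uses an arbitrary smoothing constant and invented daily means; it is not the specific algorithm of any of the cited authors.

```python
def smoothed_means(block_means, alpha=0.1):
    """Exponentially weighted moving average of successive block (e.g. daily) means.
    Smaller alpha gives heavier smoothing; 0.1 is an arbitrary illustrative choice."""
    smoothed = block_means[0]
    trace = []
    for m in block_means:
        smoothed = alpha * m + (1.0 - alpha) * smoothed
        trace.append(round(smoothed, 2))
    return trace

# Hypothetical daily means of truncated patient results for one analyte:
print(smoothed_means([99.2, 101.5, 100.8, 98.7, 104.9, 105.3, 106.1]))
# A persistent shift in method trueness appears as a sustained drift in the smoothed mean.
```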

External quality control

External quality control refers to the procedures for quality control that involve the participation of two or more laboratories. Proficiency testing is an external quality control program mandated by a regulatory body for the purpose of determining laboratory quality. Most commonly, programs of external quality control are conducted by professional societies or manufacturers of control materials. Control material is provided to participating laboratories where control samples are assayed on a regular schedule, usually in conjunction with the internal control samples. All of the participating laboratories receive control material from the same production lot. The control sample results are transmitted to the program sponsors who analyze the data and issue reports that describe the result distributions among the participating laboratories and indicate the location of the individual laboratory results within that distribution. In this way, external quality control serves as an adjunct to internal quality control by providing an additional mechanism for monitoring the long-term trueness of a method.

There are a number of other important aims served by external quality control. These aims include providing a measure of the "state of the art" for the measurement of an analyte; obtaining consensus values for control material when neither definitive nor reference methods exist for the measurement of an analyte; and, importantly, investigating the sources of inter-laboratory variability in the measurement of an analyte (Büttner et al. 1983b). By analyzing subgroup results from a large number of laboratories, it is possible to compare the performance of different methods and to evaluate the extent to which inter-laboratory measurement variability is explained by laboratory variables such as laboratory size and laboratory workload.

METHOD PRACTICABILITY

Practicability refers to those properties of a method that relate to practical aspects of its implementation. These include speed, cost, technical skill requirements, dependability, and safety (Büttner et al. 1980a). These are clearly important concerns in the decision to employ a particular method.

The speed of a method is determined by the time needed for method and sample preparation, the time spent assaying the sample, and the time required for calculation of the results. The time needed for method preparation is at its longest if specimens are received infrequently and one at a time, for the method must then be set up anew for each specimen. Method preparation is at its shortest if samples are received and run while the method is maintained in a fully operational state. There is sometimes a trade-off between the speed of a method and its quality. This may mean that two methods need to be set up in the same laboratory, a slower method of high quality that is used for routine work and a rapid method of lower quality that is used when circumstances demand a quick turnaround, provided, of course, that lower quality results can be tolerated clinically in exchange for rapidity in obtaining the results.

The cost of a method includes not only the expense of the reagents and materials used in sample preparation and assay, but also capital and operating costs, such as maintenance and repair costs for the instrument on which the method is implemented, labor costs, which will vary depending upon the technical expertise required of the staff who run the method, and overhead costs. Besides its effect upon method cost, the technical skill requirements of a method, which determine who on the staff can run the method, also determine when it can be run: only when the appropriate staff members are scheduled to be at work.

The dependability of a method quantifies the rate or frequency with which the method succeeds in producing valid results. Dependability is not wholly intrinsic to the method but can vary from location to location due to differences in laboratory conditions and staff quality. This is particularly relevant when the method is to be used outside of the central clinical laboratory, in a setting closer to the patient. Such point-of-care testing may take place in a satellite laboratory, on a hospital ward, in a physician's office, or in the home of the patient. In these settings, the personnel are typically trained in the performance of the method but usually have only a limited amount of training and experience in general laboratory practices. Consequently, the method needs to be easy to perform and highly reliable if it is to be dependable. For home testing, in which the patient or a family member performs the test, the need for method ease and reliability is even greater.

Method safety encompasses the whole range of safety considerations in the performance of a method. It includes concerns for the biological, chemical, and radiation hazards to which the laboratory staff may be exposed during the preparation of samples and the performance of the method and for the electrical and mechanical safety of equipment used in the performance of the method. Safety considerations may determine who can run the method and where the method can be set up in the laboratory.

METHOD EVALUATION

There are two kinds of method evaluations: the evaluation of a newly developed method, which is usually undertaken and reported by the laboratory scientists who devised the method, and the evaluation of a validated method, which is performed by the laboratorians who are considering setting up the method in their clinical laboratory. The first type of evaluation is ordinarily conducted under the best possible laboratory conditions and with the loving attention of the developers. The second type of evaluation, which is usually conducted under routine laboratory conditions, establishes how well a method performs in the laboratory in which it will be used. The performance may not be as good as that reported by the developers because of the differences in the operating conditions. That is why on-site method evaluation is necessary before implementing any method. The evaluation of a newly developed method must be very thorough. The report of the evaluation should include the components listed in Table 2.5.

Laboratory methods are developed for three reasons. The first is to provide a method for the measurement of an analyte which is recognized to be clinically useful but for which there is no existing method. The second is to provide a method with analytic quality superior to that of other methods currently in use, and the third is to provide a method of greater practicability than that of currently available methods. In an exemplary evaluation of a new method for determining plasma phosphate concentration, Luque de Castro et al. (1995) indicate that their primary motivation for developing the method was to improve practicability. The method uses a flow injection (FI) system with immobilized enzymes. As they state, immobilized enzymes

have several advantages over the use of dissolved enzymes in batch assays, such as lower analytical cost, higher selectivity and stability, and long life span.

A similar method had already been developed,

Male and Luong (16) developed the first FI method for the determination of phosphate with immobilized [nucleoside phosphorylase] and xanthine oxidase

but that method used amperometric detection of the reaction product. The method of Luque de Castro et al. produces a different indicator product that is detected fluorometrically.

Method description
The description of the method should include the same items as are found in a written measurement procedure (Table 2.3). The level of detail should also be comparable to that of a written procedure: using only the method description and general laboratory knowledge, a reader should be able to set up and perform the method in his or her clinical laboratory.

Sampling and specimen handling. The report should state the kinds of specimens that can be assayed using the method. It should be noted if special processing is required of some specimens. For example, urine may need to be diluted prior to measurement. The kinds of specimens actually used in the performance of the method evaluation should be indicated.


Table 2.5 Components of a Method Evaluation Report
1. Statement of the motivation for the development of the method
2. Description of the method
3. Description of the optimization of analytical variables
4. Characterization of the calibration curve
5. Assessment of analytical quality
6. Determination of analytical range


Principle and method of measurement. The statement of the principle of measurement of the method should include the techniques of analyte separation and of signal generation and detection. The phosphate method of Luque de Castro et al. does not employ an analyte separation step. The signal is generated by an enzymatic reaction for phosphate coupled to the production of a fluorescent end-product,

A number of enzymatic methods . . . are based on the ability of phosphate to activate some enzymatic reactions catalyzed by glyceraldehyde-3-phosphate dehydrogenase (8), phosphorylase a (9), maltose phosphorylase (10), sucrose phosphorylase (11), and nucleoside phosphorylase

The method of Luque de Castro et al. utilizes nucleoside phosphorylase as the enzyme,

Inosine + Pi  --(NP)-->  hypoxanthine + ribose-5-P

Hypoxanthine + 2 H2O  --(XOD)-->  uric acid + 2 H2O2

p-Hydroxyphenylacetic acid + 2 H2O2  --(HPOX)-->  H2O + bi(p-hydroxyphenylacetic acid)

where NP is nucleoside phosphorylase, XOD is xanthine oxidase, and HPOX is peroxidase.

The species monitored fluorometrically is the dimer of p-hydroxyphenylacetic acid (p-HPA), which exhibits maximum excitation at 325 nm and maximum emission at 415 nm.

In this method analyte specificity is achieved both by the use of an enzymatic reaction specific for phosphate and by the use of fluorometric detection.

Reagents and apparatus. As in the written procedure, the method evaluation report should stipulate the identity and source of reagents and describe the preparation of stock solutions and working solutions. The storage conditions and shelf lives of the solutions should be stated. The preparation of the immobilized enzyme reactors (IMERs) is of particular note in the method of Luque de Castro et al.,

NP and XOD were immobilized on controlled-pore glass (CPG 120-200 mesh; Electronucleonics, Fairfield, MA) by using Masoom and Townshend's procedure (17). Pump tubes of different lengths [1.5 mm (i.d.)] were then packed with each support-enzyme conjugate and stored at 4°C in the following solutions: 100 mmol/L Tris-HCl, pH 7.0, for the NP immobilized enzyme reactor (IMER) and 1 mol/L ammonium sulfate + 0.5 mmol/L sodium salicylate in 100 mmol/L Tris-HCl, pH 7.0, for the XOD IMER. Under these conditions both enzyme reactors kept their activity for at least 3 weeks.

The description of the instruments used in the method should include a detailed discussion of any modifications required for the performance of the method. In the case of flow injection systems, such as that of Luque de Castro et al., the flow injection manifold needs to be described.

The hydrodynamic system used (Fig. 1) consists of a peristaltic pump that propels the reagent streams through the channels. The sample, diluted appropriately, is injected into a stream of reagent A, which merges with a stream of reagent B; reagent B contains inosine, the substrate for NP biocatalysis. The first two enzymatic reactions take place along the NP and XOD IMERs. An additional merging point located after the IMERs allows the main stream to be mixed with reagent C (which contains p-HPA and HPOX), which reacts and catalyzes, respectively, the derivatizing reaction of the hydrogen peroxide produced in the previous step. The derivatizing reaction is developed along the reactor. Finally, the sample reaches the flow cell and provides the analytical signal. The enzyme reactor and the open reactor are thermostated at 37°C.

Preparation and use of the measurement system. In most cases, the preparation and use of the measurement system is a matter of general laboratory knowledge so no explicit discussion of these topics is required. If the instrumentation is new or if a familiar instrument is modified or operated in a novel fashion, the report should provide a detailed, preferably stepwise, account of the performance of the method.

Analysis and optimization of analytical variables
The principal objective in the development of a laboratory method is to end up with the maximum possible analytical quality given the level of practicability envisioned by the developers. To achieve this goal, it is necessary to identify the optimal combination of operating set-points for the analytical variables of the method. This requires an analysis of the dependency of the quality endpoints upon the operating set-points. Luque de Castro et al. performed an exceptionally thorough analysis of this sort in the development of their method. They used the magnitude of the analytical signal as the primary quality endpoint and studied a variety of analytical variables,

The variables affecting the analytical process and hence the signal it provided were classified as chemical, physical, and hydrodynamic (Table 1), and then studied by univariate analysis.

The analysis and optimization of the flow injection variables is described as follows,

High flow rates (2.32 mL/min) decreased the analytical signal, but low flow rates (0.58 mL/min) decreased the sampling frequency and resulted in increased dispersion. A flow rate of 1.60 mL/min was selected as a compromise.

A sample volume of 300 µL was chosen to obtain the best analytical signal, since at greater volumes the signal remained almost constant.

The optimal lengths of the enzyme reactors were 1 cm each. Using a longer NP IMER provided a sharp increase in the baseline and a decreased analytical signal. Increasing the XOD IMER did not improve the analytical signal.

A length of 250 cm for the open reactor was enough to achieve a reproducible mixture of reagent C and the main stream, thus providing optimal analytical signal.

Notice that, even though the primary quality endpoint was the magnitude of the analytical signal, the flow rate that was selected as optimal was not the flow rate resulting in the maximum value of the signal. Larger signals were obtained at lower flow rates. However, the lower flow rates decreased measurement precision, a secondary quality endpoint, and decreased the sampling frequency of the system, a practicability endpoint. It is not infrequent in method development that the choice of an optimal analytical variable set-point represents such a compromise among competing quality and practicability considerations.

A univariate approach to method optimization was employed by Luque de Castro et al. In the univariate approach, the response of a system to the set-point of one variable is studied with all of the other variable set-points held constant. This works fairly well if all of the variable set-points are held near to their true optimal values or if the sensitivity of the system to the set-point of each variable is largely independent of the set-points of the other variables. It works poorly if the approximate values of the optimal set-points are not known beforehand and if there is interdependence among the variables in their effect upon the system response.

A multivariate optimization approach can be used in circumstances in which the univariate approach is not likely to perform well. In the multivariate approach, none of the set-points of the analytical variables are kept constant; instead, the response of a system to various set-point combinations is studied (Box and Draper 1987). The combinations are chosen so that they will cover what is a priori believed to be the most interesting portion of the multivariate solution space. This provides data points on the response surface, the multi-dimensional surface that characterizes the relationship between system response and the set-points of the analytical variables. These points can be used to fit a response surface model, if a functional form for the surface is suggested by the data. The optimal set-point combination can then be calculated from the model equation. Alternatively, the data points can be used as a starting point for an empirical search algorithm. Search algorithms seek out optima in an iterative fashion: using the response data from the preceding step, the algorithms indicate the most informative set-point combinations to test next. The iterations continue until the maximum system response and its associated set-point combination is identified. The popular search algorithms are highly efficient and rapidly converge to the response surface optimum. This means that the delineation of the optimal analytical variable set-point combination for a new method can be achieved without undue expense. Multivariate optimization should, therefore, be considered whenever there is uncertainty about the validity of the univariate optimization approach.
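As a rough illustration of the search-algorithm route to multivariate optimization, the sketch below applies the Nelder-Mead simplex algorithm (one of the popular empirical search algorithms) to a hypothetical response surface for two analytical variables; the response function, the variable names, and the starting set-points are all invented for the purpose of the example.

```python
import numpy as np
from scipy.optimize import minimize

def response(setpoints):
    """Hypothetical response surface: analytical signal as a function of
    flow rate (mL/min) and reactor length (cm)."""
    flow, length = setpoints
    return 100 * np.exp(-((flow - 1.4) ** 2) / 0.5 - ((length - 1.2) ** 2) / 2.0)

# Nelder-Mead searches for a minimum, so minimize the negative response.
result = minimize(lambda s: -response(s), x0=[2.0, 2.5], method="Nelder-Mead")

print("optimal set-points (flow, length):", np.round(result.x, 2))
print("maximum response:", round(-result.fun, 1))
```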

Characterization of the calibration curve
The most common form for the equation of the calibration curve is a straight line. Sentiments of the sort, "Linearity is a state sought by all clinical laboratorians. It means straight and predictable, good work, and good value" (Passey and Maluf 1992) are common despite the fact that there are measurement systems in the clinical laboratory, such as competitive immunoassay systems, that produce results of high quality despite having nonlinear calibration curves. Nevertheless, when calibration linearity is possible, considerable efforts are made to assure that the operating conditions for a measurement system yield a linear calibration curve.

In order to evaluate the linearity of a calibration curve, a measure or test of linearity is needed. Such a measure has not proven easy to come by. Tholen (1992) listed 22 different statistical techniques that had been proposed for evaluating calibration linearity up to the time he wrote his review. Since then, a technique developed by Kroll and Emancipator (1993, Emancipator and Kroll 1993) has achieved a degree of acceptance in the laboratory medicine community as a linearity measure. That technique is based upon the quantification of nonlinearity as the root mean square of the deviation of the calibration curve from an ideal straight line,

\text{nonlinearity} = \sqrt{\frac{\int_{x_{low}}^{x_{high}} \left[ c(x) - g(x) \right]^2 dx}{x_{high} - x_{low}}}

where c(x) is the equation of the curve that best fits the empirical calibration data, g(x) is the equation of the ideal straight line fit of the data, and x_high and x_low are the values of the high and low calibrators, respectively. Defining the relative nonlinearity as,

\text{relative nonlinearity} = \frac{\text{nonlinearity}}{y_{high} - y_{low}}

where y_high and y_low are the highest and lowest signal magnitudes recorded during the linearity study, Emancipator and Kroll (1993) found that calibration curves that are acceptably linear by visual inspection have relative nonlinearities of less than 2.5%. Using this technique, Luque de Castro et al. found that their phosphate method demonstrated acceptable linearity over the range of measurement,

The linearity of the method was assessed by means of Kroll and Emancipator's procedure (18,19) recently adopted by the College of American Pathologists (20). The . . . nonlinearity was 0.063 mmol/L; the relative nonlinearity, 1.29%.

A number of practical considerations are involved in the implementation of the technique of Kroll and Emancipator. The number and spacing of the calibrators need to be chosen. Kroll and Emancipator suggest using 5 equally spaced calibrators. This scheme will reveal both monotonic and sigmoidal nonlinearity. In cases in which the curvature in the calibration curve appears to be limited to one or the other end of the curve, such as when there is concavity at the high end of a calibration curve for a method which suffers substrate exhaustion at high analyte concentrations, it may be necessary to add additional calibrators within the suspect interval. Each calibrator should be run in replicate. The number of replicates needed depends upon the measurement variability of the method: for highly precise methods, duplicates are adequate; for methods that are imprecise, quintuplicates are warranted. Most authors report the use of triplicates, a convenient compromise number. The replicates should be between-run rather than within-run (Kroll and Emancipator 1993, Emancipator and Kroll 1993).


The formula for nonlinearity requires two equations, one for the curve that best fits the empirical calibration data and one for the ideal straight line fit of the data. Kroll and Emancipator recommend, and themselves use, polynomial equations to derive the best fit curve. Polynomial equations are a good choice because they can be readily fit to the data by weighted multiple linear regression and they are also easy to integrate. The ideal straight line fit is the line that results in the minimum value for nonlinearity. Formulas for calculating the slope and intercept of this line can be found in Emancipator and Kroll (1993). Note that these equations refer to the calibration curve and, as such, relate the calibrator concentration to the signal magnitude. The y data in a linearity study, therefore, are signal magnitudes. Luque de Castro et al. properly used the strength of the fluorescence signal as the y variable in the linearity study of their method,

A series of eight standard solutions with concentrations between 0.1 and 20.0 µmol/L were prepared from the phosphate solution described in Materials and Methods. The equation of the analytical signal obtained by triplicate injection of these standards into the FI manifold was as follows: fluorescence intensity = 27.5 + 49.2 [Pi] (µmol/L)

Some authors report using measurement results rather than signal magnitudes as the y data. This is improper and, even more to the point, paradoxical in that the calibration curve is the matter under study; without a calibration curve there cannot be a measurement curve and, in turn, there cannot be measurement results.
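A minimal sketch of the nonlinearity calculation is given below. The calibration data are hypothetical, the best-fit curve is a second-order polynomial, and the ideal straight line is obtained numerically as the line that minimizes the integrated squared deviation from the curve (the defining property stated above); Emancipator and Kroll (1993) give closed-form expressions for its slope and intercept.

```python
import numpy as np

# Hypothetical linearity-study data: calibrator concentrations (x) and the
# mean signal magnitudes (y) of the between-run replicates at each level.
x = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
y = np.array([27.0, 275.0, 520.0, 750.0, 960.0])

# Best-fit calibration curve c(x): here a second-order polynomial.
c = np.polynomial.Polynomial.fit(x, y, deg=2).convert()

# Ideal straight line g(x): approximated numerically as the least-squares
# line fitted to c(x) on a dense, evenly spaced grid, which approximates the
# line minimizing the integrated squared deviation from c(x).
grid = np.linspace(x.min(), x.max(), 2001)
slope, intercept = np.polyfit(grid, c(grid), 1)
g = intercept + slope * grid

# Root mean square deviation of the curve from the line, and the relative value.
nonlinearity = np.sqrt(np.mean((c(grid) - g) ** 2))
relative_nonlinearity = nonlinearity / (y.max() - y.min())
print(f"nonlinearity = {nonlinearity:.2f}, "
      f"relative nonlinearity = {100 * relative_nonlinearity:.2f}%")
```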

Assessment of analytical quality
The quality of an analytical method is assessed through a characterization of the method trueness and the method precision.

Trueness. As discussed earlier, trueness is the closeness of agreement between the true value of an analyte and the average value obtained from a large number of replicate measurements. To evaluate the trueness of a method it is, therefore, necessary to know the true analyte value in a sample. Indeed, it is necessary to know the true value in a number of samples with analyte concentrations that vary across the proposed range of measurement of the method. This requirement can be met in either of two ways.

If the method evaluators have access to a reference method for the measurement of the analyte, the true concentrations can be determined in clinical samples from the evaluators' laboratory. Otherwise, the evaluators can use certified reference material for which the true concentration of the analyte of interest has been determined by the certifying agency through the use of reference or definitive methods.

The steps in the characterization of the trueness of a laboratory method are depicted in Figure 2.7. Between-run replicate measurements are made of the samples with known analyte concentration (top graph). At least 5 replicate measurements should be performed and 10 is preferred. The average value of each set of replicates is computed and the differences between the averages and the true values are calculated. The differences, which represent the bias of the method at the sampled analyte concentrations, are plotted (middle graph). In order to characterize the bias at all concentrations within the range of measurement, a linear bias model is fit to the data using weighted linear regression. Bias is classified as being constant if the constant term of the model is nonzero. It is classified as proportional if the slope of the model is nonzero. In the example, the bias shows a mixed pattern.

[Figure 2.7  The bias of a hypothetical laboratory method. The top graph shows replicate measurement data (symbols) and the line of identity. The middle graph shows the empirical biases (symbols) and the linear model fit (line), bias = 0.977 + 0.037 concentration. The bottom graph shows the relative bias profile (curve) and a bias criterion (line).]

The graph of the bias model is referred to as a bias profile (Keller and Passing 1989). The bias profile can be graphed in terms of absolute bias (middle graph) or relative bias, which is the ratio of the bias to the analyte concentration expressed as percent (bottom graph). It is especially useful to graph the bias profile in relative terms because method bias criteria, which are rules for deciding if a method shows adequate trueness for clinical purposes, are usually expressed in relative rather than absolute terms. A bias criterion can therefore be plotted on the same graph as the bias profile and the interval over which the method meets the criterion can be easily appreciated. An example of this is shown in the bottom graph. The bias criterion depicted is a relative bias of less than 10%. The relative bias of the method satisfies the criterion at analyte concentrations greater than 19 units.
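A minimal sketch of the bias-profile calculation follows; the true concentrations, replicate means, and replicate variances are hypothetical, and weighting the regression by the reciprocal replicate variances is simply one reasonable choice of weights.

```python
import numpy as np

# Hypothetical trueness data: true concentrations, the means of the
# between-run replicate measurements, and the replicate variances.
true_conc = np.array([5.0, 15.0, 25.0, 35.0, 45.0])
rep_means = np.array([6.1, 16.4, 26.0, 36.5, 46.8])
rep_vars  = np.array([0.3, 0.5, 0.8, 1.2, 1.6])

bias = rep_means - true_conc                       # empirical biases

# Linear bias model, bias = b0 + b1 * concentration, fit by weighted
# regression (np.polyfit expects weights of 1/sigma, i.e. 1/sqrt(variance)).
b1, b0 = np.polyfit(true_conc, bias, 1, w=1.0 / np.sqrt(rep_vars))

relative_bias = 100 * (b0 + b1 * true_conc) / true_conc   # relative bias profile (%)
print(f"bias = {b0:.3f} + {b1:.3f} concentration")
print(np.round(relative_bias, 1))
```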

Recovery. If method evaluators do not have access to a reference or definitive method and if there are no readily available certified reference materials, method trueness cannot be evaluated by the approach outlined in the preceding section. Trueness must then be evaluated by means of a recovery study. A known amount of analyte is added to a clinical sample and the increment in analyte concentration as measured by the method is compared to the increment in concentration calculated from the sample volume and amount of analyte added. The comparison is usually expressed in terms of the percentage recovery,

\text{recovery} = \frac{\text{measured concentration increment}}{\text{predicted concentration increment}} \times 100\%

Using multiple aliquots of a clinical sample with a low initial analyte concentration, a number of recovery samples of varying final concentrations are made. The concentrations should span the proposed range of measurement of the method. Between 5 and 10 between-run replicate determinations are performed on each of the recovery samples and the average recovery at each addition level is calculated. Luque de Castro et al. used a recovery study to evaluate trueness in their method evaluation,

Two aliquots of six samples were subjected to additions of standards (0.5 and 1.7 mmol/L) to establish the recovery of the method. The results obtained (Table 3) ranged from 96% to 104%, which represent a good recovery for the supplemented samples.

There are two potential problems with recovery studies that should be kept in mind. First, the calculated increment in concentration in a recovery sample is subject to error due to possible errors in the amount of analyte added or in the measurement of the volume of the sample. Second, unless a clinical sample with a very low analyte concentration can be found, the trueness of the method is studied only in the higher concentration range (initial analyte concentration plus added analyte concentration).
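The recovery calculation itself is simple; the sketch below computes the percentage recovery for a single hypothetical spiked sample (all volumes, amounts, and results are invented).

```python
def percent_recovery(baseline, spiked_measured, added_amount, sample_volume):
    """Percentage recovery for a spiked sample.  The predicted increment is
    the amount of analyte added divided by the sample volume; the measured
    increment is the spiked measurement minus the baseline measurement."""
    predicted_increment = added_amount / sample_volume
    measured_increment = spiked_measured - baseline
    return 100.0 * measured_increment / predicted_increment

# Hypothetical recovery sample: a 2.0 mL (0.002 L) aliquot with a baseline of
# 0.40 mmol/L, spiked with 0.002 mmol of analyte (a 1.0 mmol/L predicted increment).
print(round(percent_recovery(0.40, 1.38, added_amount=0.002, sample_volume=0.002), 1))
```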

Cross-reactions and interferences. Method trueness is also evaluated by studying the effects of potential interfering and cross-reacting substances. Usually the substances are studied one at a time. The potential interferent or cross-reactant is added to a clinical sample and the change in the analyte concentration is measured. The effects of the substance are most often expressed in terms of percent change in measured analyte concentration. The substance is evaluated over the range of substance concentrations that spans the anticipated clinical range for the substance so that the effects can be related to the concentration of the substance, usually by use of a linear model. This one-at-a-time approach was taken by Luque de Castro et al.,

The effects of bilirubin and hemoglobin as potential optical interferences and of uric acid, ascorbic acid, xanthine, hypoxanthine, and glucose as possible chemical interferences were studied. Each compound was added to a pool of serum (phosphate concentration 1.25 mmol/L) and its influence established. No interferences were detected from bilirubin for concentrations <600 mg/L, hemoglobin <20 g/L, uric acid <150 mg/L, ascorbic acid <75 mg/L, and glucose <200 mmol/L . . . The presence of allopurinol < 200 mg/L in serum did not cause interference; it did cause an error of ~ -4% at 400 mg/L.

Xanthine and hypoxanthine, which are substrates of the second enzymatic reaction, produced a positive interference, which presumably was proportional to the concentration of the added substance. The authors comment that interference from these substances in the clinical setting should be "no special problem" because they are "present in serum very infrequently."

The one-at-a-time approach will not reveal complex chemical interactions such as those that occur between an interferent or cross-reactant and the analyte and those that occur between different interferents and cross-reactants. Quantitative evaluation of complex interferences and cross-reactions requires a response surface modeling approach similar to that discussed in the section on optimization of analytical variables (Kroll and Chesler 1992). In this approach, the potential interferents and cross-reactants are added in varying concentrations to each aliquot of the clinical sample. For a complete factorial design, all possible combinations of the various concentrations for each of the interferents and cross-reactants are studied (Box and Draper 1987). The measured analyte concentrations are fit to a response surface model by multiple regression analysis. A model limited to first-order and interaction terms is usually used, such as the following,

\text{analyte concentration} = b_0 + b_1 x_1 + b_2 x_2 + b_{12} x_1 x_2

for two interferents where x_1 and x_2 are the concentrations of the respective interferents and b_0 represents the concentration of analyte in a sample free of interferents.

The presence of complex chemical interactions is indicated by statistical significance of the coefficients of the interaction terms. The clinical significance of an interaction depends upon the magnitude of the interaction effects.
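The sketch below illustrates this kind of analysis for a hypothetical complete factorial design with two potential interferents; the concentrations and measured results are invented, and the first-order-plus-interaction model is fit by ordinary least squares.

```python
import numpy as np

# Hypothetical complete factorial design: two potential interferents added
# at three concentrations each to aliquots of a single clinical sample.
x1, x2 = np.meshgrid([0.0, 50.0, 100.0], [0.0, 5.0, 10.0])
x1, x2 = x1.ravel(), x2.ravel()

# Hypothetical measured analyte concentrations for the nine aliquots.
measured = np.array([1.25, 1.27, 1.30,
                     1.28, 1.33, 1.41,
                     1.31, 1.42, 1.55])

# First-order-plus-interaction model fit by multiple regression (least squares).
design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
b, *_ = np.linalg.lstsq(design, measured, rcond=None)

b0, b1, b2, b12 = b
print(f"analyte concentration = {b0:.3f} + {b1:.4f} x1 + {b2:.4f} x2 + {b12:.5f} x1*x2")
```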

Precision. Figure 2.8 illustrates the steps in the characterization of the precision of a method. Replicate measurements are made using samples with analyte concentrations that span the range of measurement (top graph). If within-run imprecision is being studied, all of the measurements on a sample must be made during the same run. In a study of within-laboratory imprecision, the measurements on a sample need to be performed during different runs and, preferably, on different days. An absolute minimum of 10 replicate measurements need to be made to obtain a moderately precise estimate of the standard deviation; 20 replicates is better and 50 replicates is better still (Sadler and Smith 1990).

[Figure 2.8  The precision of a hypothetical laboratory method. The top graph shows replicate measurement data (symbols) and the line of identity. The middle graph shows the empirical standard deviations (symbols) and the imprecision model fit (curve), SD = (1.2 + 0.12 concentration)^1.3. The bottom graph shows the relative imprecision profile (curve) and a precision criterion (line).]

The standard deviation of each set of replicate measurements is calculated using the formula (Bookbinder and Panosian 1986),

\text{standard deviation} = \sqrt{\frac{\sum_i (x_i - \text{mean})^2}{n - 1}}

where x_i is the i th replicate result, mean is the mean of the replicates, and n is the number of replicates. The standard deviations are plotted versus analyte concentration. If the imprecision is constant over the measurement range, the empirical standard deviations will be roughly equal and will form a fairly flat line. If the imprecision is proportional to analyte concentration, the empirical standard deviations will increase in magnitude with increasing analyte concentration. The imprecision model that is usually fit to the empirical data is the 3-parameter model proposed by Sadler et al. (1988),

\text{SD} = (b_0 + b_1 \, \text{concentration})^{b_2}

This model is quite flexible. If b_2 is one, the model defines a line. Otherwise, the model defines a curve that can be either convex (b_2 greater than one) or concave (b_2 less than one). Because the model is nonlinear, it must be fit by nonlinear regression.
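A minimal sketch of the nonlinear fit is shown below, using a general-purpose curve-fitting routine; the concentrations, standard deviations, and starting parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def sadler_model(conc, b0, b1, b2):
    """Three-parameter imprecision model: SD = (b0 + b1 * concentration)**b2."""
    return (b0 + b1 * conc) ** b2

# Hypothetical precision-study results: analyte concentrations and the
# standard deviations of the replicate sets measured at each concentration.
conc = np.array([2.0, 5.0, 10.0, 20.0, 40.0, 60.0])
sd   = np.array([1.6, 2.1, 3.1, 5.4, 10.5, 16.3])

params, _ = curve_fit(sadler_model, conc, sd, p0=[1.0, 0.1, 1.0])
b0, b1, b2 = params

cv = 100 * sadler_model(conc, *params) / conc      # imprecision profile as CV (%)
print(f"SD = ({b0:.2f} + {b1:.3f} concentration)^{b2:.2f}")
print(np.round(cv, 1))
```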


The graph of the imprecision model was originally called a precision profile (Ekins 1983) but the designation, imprecision profile, has become more popular. The y-axis of the imprecision profile can be either the magnitude of the standard deviation (middle graph) or the magnitude of the coefficient of variation (bottom graph). Almost always, imprecision profiles are graphed with imprecision quantified as coefficient of variation. The reason is similar to that for graphing the bias profile in relative terms: method precision criteria are expressed in relative terms, i.e. as coefficients of variation, so they can be plotted on the same graph as the imprecision profiles. A method precision criterion of 30% is plotted in the bottom graph. The precision of the method satisfies the criterion at analyte concentrations greater than 11 units.

In the phosphate method evaluated by Luque de Castro et al., within-run and between-run precision were evaluated at three different concentrations,

The precision . . . of the method was checked by assaying three serum pool samples from a clinical laboratory that contained low, medium, and high concentrations of phosphate (~0.900, 1.070, and 1.700 mmol/L, respectively). Aliquots of the three samples were analyzed after a 1:250 dilution, both in single run and during 11 days for within- and between-run studies, respectively.

Because the range of measurement is small, three concentrations seems an adequate number to study. Typically, the range of measurement is much larger and a greater number of concentrations need to be studied. The authors did not model the precision data even though neither within-run nor between-run precision appear to be constant.

The preceding discussion and example describe the separate evaluation of within-run and between-run precision. This is a straightforward and clear-cut way to conduct a precision study but it isn't the only way that such studies are performed. Often, a few within-run replicates are assayed in each of a large number of runs and both within-run precision and between-run precision are computed from the resulting data set. The variance of the replicate results for each run is calculated using the formula,

\text{var}_j = \frac{\sum_i (x_i - \text{mean}_j)^2}{n - 1}

where x_i is the i th replicate result in run j, mean_j is the mean of the replicates in run j, and n is the number of replicates assayed per run. If only two replicates are assayed per run, n − 1 is set equal to 2, not to 1. This biases the analysis somewhat, so it is better to assay three or more replicates per run. Within-run variance is calculated as the average of the individual run variances so,

\text{SD}_{within\text{-}run} = \sqrt{\frac{\sum_j \text{var}_j}{N}}

where N is the number of runs. The variance of the individual run replicate means is computed using the formula,

\text{var}_{means} = \frac{\sum_j (\text{mean}_j - \text{overall mean})^2}{N - 1}

where overall mean is the mean value of the individual run replicate means. Then (Box et al. 1978),

\text{SD}_{between\text{-}run} = \sqrt{\text{var}_{means} - \frac{\text{SD}_{within\text{-}run}^2}{n}}
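The sketch below carries out these calculations for a hypothetical data set with three within-run replicates assayed in each of five runs.

```python
import numpy as np

# Hypothetical precision data: one row per run, three within-run replicates per run.
runs = np.array([
    [10.2, 10.5, 10.1],
    [10.8, 10.6, 11.0],
    [ 9.9, 10.3, 10.0],
    [10.4, 10.7, 10.6],
    [10.1,  9.8, 10.2],
])
N, n = runs.shape                         # number of runs, replicates per run

run_vars  = runs.var(axis=1, ddof=1)      # var_j for each run
run_means = runs.mean(axis=1)             # mean_j for each run

sd_within  = np.sqrt(run_vars.mean())     # square root of the average run variance
var_means  = run_means.var(ddof=1)        # variance of the run means
sd_between = np.sqrt(max(var_means - sd_within**2 / n, 0.0))  # clip a negative estimate

print(f"within-run SD:  {sd_within:.3f}")
print(f"between-run SD: {sd_between:.3f}")
```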

A point that needs to be mentioned in any discussion of method precision is that within-run precision can always be improved by measuring samples in duplicate or triplicate and reporting the average value of the results. With replicate measurements, the imprecision decreases by a factor equal to the square root of the number of replicates. For duplicate measurements, the within-run imprecision is 0.71 times as large as with single measurements and with triplicate measurements, it is 0.58 times as large. This approach is costly in that fewer samples can be run per batch but the resultant improvement in method precision may significantly increase the clinical utility of the method.

Resolving power and detection limit. The resolving power of a method is expressed in terms of the minimum distinguishable difference in concentration, D_min. Using the formula,

D_{min} = z_c \sqrt{2} \, \text{SD}_{within\text{-}laboratory}

the resolution profile for a method can be calculated from the imprecision model, giving,

D_{min} = z_c \sqrt{2} \, (b_0 + b_1 \, \text{concentration})^{b_2}

where the model parameters b_0, b_1, and b_2 apply to within-laboratory imprecision.

The detection limit can be calculated as the smallest concentration that solves the equation

\text{concentration} = z_c \sqrt{2} \, (b_0 + b_1 \, \text{concentration})^{b_2}

An approximate solution can be found graphically from a plot of the resolution profile. It is the concentration at which the resolution profile intersects the line of identity. Using this starting value, the exact solution can be found using an iterative root solving algorithm such as Newton's method (Sadler et al. 1992). Such algorithms are now widely available in computer spreadsheet programs.
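A minimal sketch of the root-finding step is given below; the imprecision model parameters and z value are hypothetical, and a library implementation of Newton's method is used in place of a spreadsheet routine.

```python
from scipy.optimize import newton

# Hypothetical within-laboratory imprecision model parameters and z value.
b0, b1, b2 = 1.2, 0.12, 1.3
zc = 1.96

def gap(conc):
    """Zero when the concentration equals the minimum distinguishable
    difference at that concentration, i.e. at the detection limit."""
    return conc - zc * (2 ** 0.5) * (b0 + b1 * conc) ** b2

# Start Newton's method from a rough graphical estimate of the crossing point.
detection_limit = newton(gap, x0=5.0)
print(round(detection_limit, 2))
```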

An alternative and more direct way of calculating the detection limit is to use the imprecision data from the replicate sets with concentrations that are near the detection limit. Typically, the zero concentration replicate set and the lowest concentration replicate set are used. The variances of the data sets are calculated (var_zero and var_low) as is the pooled variance,

\text{var}_{pool} = \frac{n_1 \text{var}_{zero} + n_2 \text{var}_{low}}{n_1 + n_2 - 2}

where n_1 and n_2 are the number of replicates in the zero concentration and low concentration data sets, respectively. The detection limit is calculated using the following formula (Rodbard 1978, Büttner et al. 1980b),

\text{detection limit} = \text{blank} - t_c \sqrt{\text{var}_{pool} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}

where blank is the mean value of the zero concentration replicate set and t_c is the confidence coefficient as found with the t distribution. For a 95% confidence level and equal numbers of replicates in each data set, t_c equals 1.734 for 20 replicates, 1.701 for 30 replicates, 1.686 for 40 replicates, and 1.677 for 50 replicates.

Analytical range
The analytical range, or working range, is the range of analyte concentrations for which the method satisfies all of the following criteria: the bias of the method is within acceptable limits, the precision of the method is within acceptable limits, and, if appropriate, the calibration curve is acceptably linear. The range is determined by reference to the bias profile, the imprecision profile, and the findings of the linearity study.


It is obviously desirable that the analytical range cover the entirety of the pathophysiological range of values for the analyte measured by the method. If the analytical range falls short, explicit rules need to be devised to deal with samples that have analyte concentrations outside of the analytical range. In practice, this problem is almost always due to samples with analyte concentrations that are above the upper limit of the analytical range. The question in such cases is whether appropriate dilution of the sample will yield a valid measurement. If so, the dilution procedure and the means for calculating a corrected result need to be included in the method procedure. If not, the manner of reporting the out-of-bounds result needs to be specified in the procedure.

METHOD COMPARISON

The purpose of a method comparison is to ascertain if the test results for a set of clinical specimens as obtained from one field method are, on average, the same as those obtained by another field method. The comparison amounts to an exploration of the clinical equivalence of the two methods. Clinically equivalent methods can be freely substituted for one another. The substitution of a method for another with which it is not clinically equivalent requires the establishment of a new reference interval for the analyte being measured and the development of a conversion formula that can be used to equate test results from the old method with test results from the new method.

Motivation
A complete report of a method comparison includes the components listed in Table 2.6. The first component is a statement of the motivation for the comparison. The most common motivation is the contemplated replacement of a method with one that possesses greater practicability.

The following excerpt from Turpeinen et al. (1995) explains their motivation for undertaking a comparison of three different methods for measuring hemoglobin A1c (HbA1c). HbA1c is a stable adduct of glucose and hemoglobin A. The percent of hemoglobin present as HbA1c depends upon the blood glucose concentrations to which the hemoglobin is exposed over the life-span of the red cell and is, therefore, a useful clinical marker of long-term blood glucose concentration control in patients with diabetes.

. . . the methods currently used for [HbA1c] measurement in clinical chemistry laboratories show large differences between reported values, and comparison of results from different laboratories is difficult.

At present there is no accepted standard or acknowledged reference method. Recently, calibration based on a cation-exchange HPLC method has been shown to increase the comparability between various analytical methods (5,6).

In this study we compare our own high-resolution HPLC cation-exchange method (PolyCAT A) with two other assays: a boronate affinity binding assay (IMx) and an automated system for [glycohemoglobin] analysis by cation-exchange chromatography (Diamat™).

The motivation for this method comparison is to compare two commercially available methods for the determination of HbA1c. In addition, the methods are compared with a high-resolution version of the methodology that has been gaining support as a calibrating method, cation-exchange chromatography. In this instance, it can be appreciated that the employment of cation-exchange as a calibrating method can be taken as tacit acknowledgment that it is of adequate accuracy for routine laboratory practice and, indeed, is probably more accurate than most other methods in routine use.

Table 2.6 Components of a Method Comparison Report
1. Statement of the motivation for the comparison of the methods
2. Description of the analytical methods
3. Description of the study population
4. Evaluation of concordance of the methods
5. Assessment of clinical equivalence

Analytical methods
When comparing methods, it is of course essential that it be clear exactly what the methods are. It therefore behooves the laboratorian to thoroughly describe the methods under study. In fact, it is appropriate to provide the same degree of completeness in the method description as one would for a method evaluation. However, it may be possible to cite a previously published description of the method or to refer to the manufacturer's instructions when a commercial method is being studied. When any options exist in the performance of a method, the options chosen should be indicated.

Study population
Method comparison studies are performed using clinical specimens that have been submitted to the laboratory in the course of the medical care of patients. Most often, the specimens that are used are those that have been submitted for determination of the analyte measured by the methods under study. This is an obvious necessity in the case of xenobiotics but it also makes sense for other kinds of analytes because the range of values for the analyte is usually largest among those patients in whom the analyte is being measured. For instance, HbA1c concentrations are only measured in patients with diabetes. These patients have values that range from normal to greatly increased with the majority being slightly to moderately elevated. A random sampling of laboratory specimens would be expected to uncover only a few specimens with elevated concentrations. In keeping with this logic, in their comparison study, Turpeinen et al. utilized specimens obtained from patients with diabetes:

For method comparison we used 123 blood samples obtained mainly from diabetics sent to our routine laboratory for HbA1c analysis.

A wide clinical spectrum should be represented by the specimens used in a method comparison study. This is important for two reasons. First, the range of values of the analyte often relates to the spectrum of disease in the patients from whom the specimens are obtained. For instance, if the specimens that are submitted to the routine laboratory for HbA1c analysis come almost exclusively from outpatients whose disease is well controlled, the values will be for the most part normal or slightly increased. The concordance of the methods in this range may not reflect the concordance at the high values seen in patients whose disease is out of control. The HbA1c values reported by Turpeinen et al. in their article include many above 10% of total hemoglobin, indicating moderate to severe chronic hyperglycemia, so their clinical population clearly encompasses a broad spectrum of glycemic control. The second reason to seek a broad clinical spectrum is to guarantee a wide spectrum of biochemical variability. The wider the biochemical spectrum, the more likely it is that measurement differences due to differential method specificity will be detected.

Evaluation of concordance
There are two general approaches for the evaluation of concordance between the results of field methods, regression analysis and difference analysis. Correlation analysis is not a useful approach for evaluating concordance for a number of reasons (Bland and Altman 1986, Hollis 1996). First and foremost of these is that the correlation coefficient is not a measure of agreement between data pairs but, rather, is a measure of the goodness-of-fit of a linear model of the data pairs. Thus, for example, data pairs that are aligned along the line,

\text{result}_{method\ 2} = 10 + 2 \, \text{result}_{method\ 1}

will have a perfect correlation coefficient despite the fact that the data pairs do not agree at all. Another problem with the correlation coefficient is that its value depends upon the range of the data analyzed. The wider the range, the larger the correlation coefficient. In this way, the inclusion of extreme data pairs, even pairs with less than average agreement, will inflate the value. Yet another problem with the correlation coefficient is that it reflects data variability as well as data linearity. As a result, the correlation coefficient of two highly precise methods that, on average, do not agree particularly well may be larger than the correlation coefficient of two less precise methods that, on average, agree very well.

Regression analysis. The goal of regression analysis, which has been the standard approach for concordance evaluation for decades, is to define the functional relationship between the results of the methods so that it can be compared to the relationship that characterizes ideal concordance. In practice, this means using a linear regression technique to find the equation of the line that best fits the paired result data,

\text{result}_{method\ 2} = b_0 + b_1 \, \text{result}_{method\ 1}

The estimated values of the intercept and the slope are compared to the values expected for perfect concordance, i.e. an intercept of zero and a slope of one.

Turpeinen et al. used the Deming technique of linear regression in their comparison study. The more familiar technique of ordinary linear regression analysis has as one of its underlying assumptions that the x variable has no appreciable measurement variability (Berry 1993). In order for this assumption to be satisfied, the variability in the x variable must be small compared to the variability in the y variable. In method comparison studies, however, it is typical for the x variable, i.e. the test results obtained using one method, to have a variability comparable to that of the y variable, the test results obtained using the other method. Consequently, ordinary regression analysis is usually not an appropriate regression technique for a method comparison study. Instead, one must use a linear regression technique that takes into account variability in the measurement of the x variable. One such "errors-in-variables" technique is the Deming method (Strike 1996). It has been found to be among the most reliable of the errors-in-variables linear regression techniques (Wakkers et al. 1975, Riggs et al. 1978, Linnet 1993). Weighted regression modifications of the Deming method are available for data sets in which the variance of the data pairs is not constant over the range of measurement (Riggs et al. 1978, Linnet 1990, 1993). The modification developed by Linnet applies to data sets in which the variance increases proportionally with analyte concentration.
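A minimal sketch of unweighted Deming regression is given below, together with jackknife confidence intervals for the slope and intercept of the kind discussed later in this section; the paired results are hypothetical and the ratio of the two methods' error variances is assumed to be 1 (equal imprecision), the usual default when the ratio is unknown.

```python
import numpy as np

def deming(x, y, lam=1.0):
    """Unweighted Deming regression of y on x.  `lam` is the assumed ratio
    of the y-method error variance to the x-method error variance."""
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    syy = np.sum((y - ybar) ** 2)
    sxy = np.sum((x - xbar) * (y - ybar))
    slope = ((syy - lam * sxx) +
             np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    return ybar - slope * xbar, slope          # intercept, slope

def jackknife_ci(x, y, zc=1.96):
    """Leave-one-out jackknife confidence intervals for the Deming
    intercept and slope (normal approximation)."""
    n = len(x)
    full = np.array(deming(x, y))
    loo = np.array([deming(np.delete(x, i), np.delete(y, i)) for i in range(n)])
    pseudo = n * full - (n - 1) * loo          # pseudo-values
    est = pseudo.mean(axis=0)
    se = pseudo.std(axis=0, ddof=1) / np.sqrt(n)
    return [(e - zc * s, e + zc * s) for e, s in zip(est, se)]

# Hypothetical paired results from two field methods.
method1 = np.array([4.8, 5.6, 6.9, 8.2, 9.5, 11.3, 12.8, 14.0, 15.6])
method2 = np.array([4.6, 5.7, 7.1, 8.6, 10.2, 12.1, 13.9, 15.1, 16.9])

b0, b1 = deming(method1, method2)
ci_b0, ci_b1 = jackknife_ci(method1, method2)
print(f"method 2 = {b0:.2f} + {b1:.2f} method 1")
print("95% CI intercept:", np.round(ci_b0, 2), " slope:", np.round(ci_b1, 2))
```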

Turpeinen et al. present graphs of the data and regression lines for each of the three method pairings. The graph for the IMx and Diamat method pairing (their Fig 2A) is not reproduced here.

The authors found that the IMx and Diamat methods showed the best result concordance by (unweighted) Deming regression (note that the authors have been careless in their terminology: when they write "correlation" they mean "regression"),

The correlation between IMx, calculated as %HbA1c, and the Bio-Rad Diamat (Fig 2A) gave the following results: IMx = 1.16 Diamat - 0.98 (r = 0.922). The good correlation is explained by the fact that the IMx assay has been standardized with an ion-exchange HPLC method, with the Diamat assay as a secondary reference HPLC system maintained in close calibration to the primary reference HPLC assay.

The 95% confidence intervals on the estimates of the slope and intercept are 1.152 to 1.170 and -0.986 to -0.980, respectively. The confidence interval for the slope does not include 1 so the estimate is statistically different from 1. The confidence interval for the intercept does not include 0 so it is statistically different from 0. As neither parameter is equal to the value expected of perfect concordance, bias is present. When the intercept is not 0, the data are said to show constant bias. When the slope is not equal to 1, the data show proportional bias. When both parameters do not equal their ideal values, as here, the bias is referred to as mixed constant and proportional. The paper states

the confidence limits for the slope and intercept values were calculated with the jackknife method

The jackknife method is a nonparametric technique for generating empirical likelihood distributions for the parameters of a statistical model (Mooney and Duval 1993). In this case, the statistical model is a line and the parameters are its slope and intercept. The jackknife method has been found to be a reliable way to calculate the parameter confidence intervals when either unweighted or weighted Deming regression is used (Linnet 1993). Parametric approaches for the calculation of the confidence intervals can be used if the methods have relatively constant variability over the clinical range of values for the analyte. Calculation of the exact confidence intervals is complex (Creasy 1956) but highly accurate, simple approximations are available (Strike 1996). The approximate confidence interval for the slope is,

b_1 \pm t_c \cdot \text{standard error of } b_1

in which

\text{standard error of } b_1 = \sqrt{\frac{b_1^2 (1 - r^2)/r^2}{n - 2}}


where r² is the square of the correlation coefficient and n is the number of data pairs. The approximate confidence interval for the intercept is,

b_0 \pm t_c \cdot \text{standard error of } b_1 \sqrt{\frac{\sum x^2}{n}}

where Σ x² is the sum of the squared x values.

The graph of the data and the Deming regression line for the Diamat and PolyCAT A pairing (their Fig 2B) is likewise not reproduced here.

The result concordance of these methods is much less good than that between the Diamat and IMx methods:

When the two ion-exchange chromatographic methods, PolyCAT A and Diamat, were compared (Fig 2B) . . . the regression equation (PolyCAT A = 1.03 Diamat - 2.84) shows that much lower results were obtained by ion-exchange chromatography with high resolution.

The 95% confidence intervals on the estimates of the slope and intercept are 1.026 to 1.038 and -2.842 to -2.834, respectively. Although the slope is statistically different from 1, the difference is small, so the authors conclude that the lack of concordance appears to be due largely to constant bias. The authors offer the following explanation for the bias:

This might be due to the fact that the Diamat method also measures carbamylated and acetylated forms of Hb … and possibly some other derivatives formed in blood during storage, which can be separated from the HbA1c peak by using methods with higher resolution. Our PolyCAT A assay has been optimized to separate different Hb variants from HbA1c.

This apparently partly explains the lower results obtained by this method. However, the difference between Diamat method and PolyCAT A assay cannot be explained only by carbamylated and acetylated Hbs, for which concentrations <0.4% have been reported …

Difference analysis. Difference analysis was introduced as an approach for evaluating method comparisons by Altman and Bland (1983). This approach strives to avoid the shortcomings of ordinary regression analysis for method comparison by directly measuring how well each result pair agrees in terms of the difference in the values of the results. The differences are plotted against the average values of the result pairs (Bland and Altman 1986, Hollis 1996) or, as in the article by Turpeinen et al., the differences are plotted against the results of one of the methods. The resulting plot is called a difference plot. For the comparison of the IMx and Diamat methods, the difference plot (Fig 3A of Turpeinen et al.) is not reproduced here.

Examination of the difference plot suggests that there is a small inter-method bias present because there are more negative differences than positive differences. If there were no bias present, the differences would be evenly distributed about the line of zero difference. The authors describe the pattern as follows:

The bias observed (Fig 3A) suggests that the IMx method gives slightly higher results than the Diamat method at high amounts but similar results at normal amounts of HbA1c.
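A minimal sketch of a difference analysis follows; the paired results are hypothetical. The differences are summarized by their mean (the inter-method bias) and, following Bland and Altman (1986), by 95% limits of agreement; they can be plotted against one method's results, as Turpeinen et al. did, or against the pair averages.

```python
import numpy as np

# Hypothetical paired results from two field methods (same specimens).
method1 = np.array([5.1, 5.8, 6.4, 7.2, 8.0, 9.1, 10.4, 11.6])
method2 = np.array([5.0, 5.9, 6.6, 7.5, 8.4, 9.6, 11.0, 12.3])

differences = method2 - method1
x_axis = method1            # plot against method 1, as in Turpeinen et al.;
                            # the Bland-Altman convention uses (method1 + method2) / 2

mean_diff = differences.mean()
sd_diff = differences.std(ddof=1)
print(f"mean difference (inter-method bias): {mean_diff:.2f}")
print(f"95% limits of agreement: {mean_diff - 1.96 * sd_diff:.2f} "
      f"to {mean_diff + 1.96 * sd_diff:.2f}")
```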

The difference plot for the Diamat and PolyCAT A method pairing (not reproduced here) shows the bias clearly. The authors comment:

The differential plots show that the negative bias of ~ 2-3% of total Hb is seen at all values of HbA1c when PolyCAT A is compared with Diamat

Assessment of clinical equivalence
As stated earlier, clinical equivalence of two analytical methods means that they can be used interchangeably. In practical terms, clinical equivalence means two things: that the results of the two methods show a high degree of concordance and that the reference ranges for the measured analyte, as determined using the two methods, are essentially identical. The degree of concordance and the closeness of the agreement of the reference ranges that are required in order to consider two methods clinically equivalent are matters of clinical judgment which may be codified in recommendations promulgated by professional societies or in standards imposed by regulatory agencies. Statistical evidence is important in coming to this decision but it is not the only consideration. For instance, the confidence interval for the estimate of the intercept of the regression line for paired results may indicate that it is statistically different from zero. This indicates the presence of a bias in the methods. However, the magnitude of the bias may be considered to be clinically insignificant and the results of the methods deemed to be highly concordant.

Figure 2.9 demonstrates a graphical approach for judging if two methods are clinically equivalent in terms of the degree of concordance of the methods. In this figure the data pairs have been plotted as they would be for a regression analysis. The lighter, inner lines demarcate the region in which 95% of the data pairs will be if the methods are, on average, perfectly concordant. Because each sample is only measured once by each method in the usual method comparison, individual method measurement variability results in less than perfect concordance even if the methods are, on average, perfectly concordant. If each sample were measured repeatedly by each method, the average result values obtained by two methods that were, on average, perfectly concordant would show exact agreement. These boundary lines are constructed using the formula,

\text{range} = \text{analyte concentration} \pm z_c \, \text{SD}_{result\ pairs}

where, for the 95% range, z_c equals 1.96, and

\text{SD}_{result\ pairs} = \sqrt{\text{var}_{method\ 1} + \text{var}_{method\ 2}}

[Figure 2.9  Two sets of hypothetical method comparison data. The data are shown as symbols. Clinical equivalence boundary lines are indicated.]

The width of the range will vary with analyte concentration if the variance of either method depends upon the concentration. In the figure, the variance was treated as being proportional to analyte concentration. The darker, outer lines delimit the region in which 95% of the data pairs will be if the methods are concordant to within a clinically acceptable amount of bias. These are the clinical equivalence boundary lines. These boundary lines are constructed using the formula,

$$\text{range} = \text{analyte concentration} \pm \left( z_c \, SD_{\text{result pairs}} + \text{acceptable bias} \right)$$
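A minimal sketch of how the boundary lines can be applied numerically is given below (Python). The proportionality constants relating the method variances to analyte concentration, the 7.5% relative bias criterion, and the data pairs are illustrative assumptions, and the method 1 result is used as a stand-in for the analyte concentration.

```python
import numpy as np

def inside_equivalence_boundaries(m1, m2, k1, k2, acceptable_bias_fraction, zc=1.96):
    """Return a boolean array: True where a data pair lies inside the
    clinical equivalence boundary lines.

    m1, m2 -- paired results from methods 1 and 2
    k1, k2 -- proportionality constants, variance = k * concentration (assumed)
    acceptable_bias_fraction -- clinically acceptable relative bias (e.g. 0.075)
    """
    conc = m1                                  # method 1 result taken as the concentration
    sd_pairs = np.sqrt(k1 * conc + k2 * conc)  # SD of the result pairs
    half_width = zc * sd_pairs + acceptable_bias_fraction * conc
    return np.abs(m2 - m1) <= half_width

# Hypothetical comparison data.
m1 = np.array([10.0, 30.0, 55.0, 70.0, 95.0, 120.0])
m2 = np.array([11.0, 29.0, 57.5, 68.0, 97.0, 118.5])

inside = inside_equivalence_boundaries(m1, m2, 0.05, 0.05, 0.075)
print(f"{inside.mean() * 100:.1f}% of pairs inside the boundaries")
```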

The upper graph in Figure 2.9 shows the hypothetical results of a comparison of two highly concordant methods. All of the data points are inside of the clinical equivalence boundary lines (here the bias criterion is a relative bias of less than 7.5%). Using the approximate formula (Newcombe 1998),

$$\text{confidence interval} \;=\; \frac{\text{estimate} + \dfrac{z_c^2}{2N} \;\pm\; z_c \sqrt{\dfrac{\text{estimate}\,(1 - \text{estimate})}{N} + \dfrac{z_c^2}{4N^2}}}{1 + \dfrac{z_c^2}{N}}$$

where N is the number of data pairs, the approximate 95% confidence interval on the proportion is 95.4 to 100%. Because the entire confidence interval lies above 95%, at least 95% of the data pairs can be taken to be contained within the boundary lines, so the concordance of the methods satisfies the criterion for clinical equivalence. The lower graph shows hypothetical results from two methods which are discordant. Seventy percent of the data points are inside of the clinical equivalence boundary lines, with a 95% confidence interval of 51.5 to 72.3%. Because the entire confidence interval lies below 95%, fewer than 95% of the data pairs are contained within the boundary lines. Hence, the methods are not clinically equivalent due, in this case, to the presence of an unacceptably large inter-method bias.
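This calculation is easy to reproduce. The sketch below (Python) implements the Wilson score interval given above; the count of 80 data pairs is an assumed value, chosen because it reproduces the 95.4-100% interval quoted for the upper graph.

```python
from math import sqrt

def wilson_interval(p_hat, n, zc=1.96):
    """Approximate two-sided confidence interval for a proportion
    (Wilson score interval, as recommended by Newcombe 1998)."""
    center = p_hat + zc**2 / (2 * n)
    half = zc * sqrt(p_hat * (1 - p_hat) / n + zc**2 / (4 * n**2))
    denom = 1 + zc**2 / n
    return (center - half) / denom, (center + half) / denom

# Example: all data pairs inside the boundary lines (proportion = 1.0), assumed N = 80.
low, high = wilson_interval(1.0, 80)
print(f"95% CI: {low * 100:.1f}% to {high * 100:.1f}%")   # about 95.4% to 100%
```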

A similar graphical approach can be taken when the result data are plotted as differences (Petersen et al. 1997). This is shown in Figure 2.10 for the same hypothetical data that were used to make Figure 2.9. Notice that in this figure, the x-axis is the result as measured by the field method already in place (method 1). When plotted in this fashion, the graphs of the two data sets reveal exactly the same percentage of data points inside of the clinical equivalence boundary lines.
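A sketch of the difference-plot version of the check follows (Python; the data pairs, the variance proportionality constants, and the 7.5% bias criterion are again illustrative assumptions). In a difference plot the boundary curves are simply plus and minus the half-width used above, plotted against the field-method result.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired results; method 1 is the field method already in place.
m1 = np.array([15.0, 35.0, 50.0, 72.0, 90.0, 110.0])
m2 = np.array([16.2, 33.8, 52.0, 70.5, 92.5, 108.0])
diff = m2 - m1

# Half-width of the clinical equivalence region at each concentration
# (variances assumed proportional to concentration; 7.5% acceptable relative bias).
zc, k1, k2, acceptable_bias = 1.96, 0.05, 0.05, 0.075
x = np.linspace(m1.min(), m1.max(), 100)
half_width = zc * np.sqrt((k1 + k2) * x) + acceptable_bias * x

inside = np.abs(diff) <= zc * np.sqrt((k1 + k2) * m1) + acceptable_bias * m1
print(f"{inside.mean() * 100:.1f}% of pairs inside the difference-plot boundaries")

# Difference plot with the clinical equivalence boundary lines.
plt.plot(m1, diff, "o")
plt.plot(x, half_width, "k-", x, -half_width, "k-")
plt.xlabel("Method 1 result")
plt.ylabel("Result difference")
plt.show()
```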

Based upon the difference analysis findings, Turpeinen et al. conclude that the methods they studied do not demonstrate a clinically acceptable degree of concordance. The authors also constructed reference ranges for HbA1c using two of the methods under study:

Reference values for the Diamat and PolyCAT A methods were determined by using 60 freshly drawn blood samples from healthy controls

They found that the ranges are not similar:

For IMx a reference range of 4.5–5.5% has been reported (7). Our estimates [of] the reference values for HbA1c with the PolyCAT A and Diamat methods with samples from 60 healthy controls were, for Diamat, mean (±SD) 5.13% ± 0.33%, the reference range (mean ± 2SD) 4.5–5.8%, and the total range 4.4–6.1%. For the PolyCAT A method the mean (±SD) was 3.43% ± 0.47%, the reference range 2.5–4.4%, and the range 2.6–5.0%.

Figure 2.10 The same hypothetical method comparison data as in Figure 2.9 but here presented as difference plots. The data are shown as symbols. Clinical equivalence boundary lines are indicated.
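The quoted reference ranges follow directly from the reported means and standard deviations (mean ± 2SD), as the short check below shows; it also confirms that the Diamat and PolyCAT A ranges do not even overlap.

```python
def reference_range(mean, sd):
    """Reference range computed as mean +/- 2 SD."""
    return mean - 2 * sd, mean + 2 * sd

diamat = reference_range(5.13, 0.33)    # (4.47, 5.79), i.e. about 4.5-5.8%
polycat = reference_range(3.43, 0.47)   # (2.49, 4.37), i.e. about 2.5-4.4%

overlap = diamat[0] <= polycat[1] and polycat[0] <= diamat[1]

print(f"Diamat reference range:    {diamat[0]:.1f}-{diamat[1]:.1f}%")
print(f"PolyCAT A reference range: {polycat[0]:.1f}-{polycat[1]:.1f}%")
print("Ranges overlap:", overlap)
```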

Because the methods do not produce identical reference ranges and because they are not highly concordant, the authors decided that the methods are not clinically equivalent.

REFERENCES

Altman DG and Bland JM. 1983. Measurement in medicine: the analysis of method comparison studies. Statistician 32:307.

Berry WD. 1993. Understanding Regression Assumptions. Sage Publications, Newbury Park CA.

Bezeau M and Endrenyi L. 1986. Design of experiments for the precise estimation of dose-response parameters: the Hill equation. J Theor Biol 123:415.

Bishop J and Nix ABJ. 1993. Comparison of quality-control rules used in clinical chemistry laboratories. Clin Chem 39:1638.

Bland JM and Altman DG. 1986. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i:307.

Bookbinder MJ and Panosian KJ. 1986. Correct and incorrect estimation of within-day and between-day variation. Clin Chem 32:1734.

Box GEP and Draper NR. 1987. Empirical Model-Building and Response Surfaces. John Wiley & Sons, New York.

Box GEP, Hunter WG, and Hunter JS. 1978. Statistics for Experimenters. An Introduction to Design, Data Analysis, and Model Building. John Wiley & Sons, New York.

Büttner J, Borth R, Boutwell HJ, Broughton PMG, and Bowyer RC. 1980a. Approved recommendation (1978) on quality control in clinical chemistry. Part 1. General principles and terminology. J Clin Chem Clin Biochem 18:69.

Büttner J, Borth R, Boutwell HJ, Broughton PMG, and Bowyer RC. 1980b. Approved recommendation (1978) on quality control in clinical chemistry. Part 2. Assessment of analytical methods for routine use. J Clin Chem Clin Biochem 18:78.

Büttner J, Borth R, Boutwell HJ, Broughton PMG, and Bowyer RC. 1980c. Approved recommendation (1978) on quality control in clinical chemistry. Part 3. Calibration and control materials. J Clin Chem Clin Biochem 18:855.

Büttner J, Borth R, Boutwell HJ, Broughton PMG, and Bowyer RC. 1983a. Approved recommendation (1978) on quality control in clinical chemistry. Part 4. Internal quality control. J Clin Chem Clin Biochem 18:877.

Büttner J, Borth R, Boutwell HJ, Broughton PMG, and Bowyer RC. 1983b. Approved recommendation (1978) on quality control in clinical chemistry. Part 5. External quality control. J Clin Chem Clin Biochem 18:885.

Cotlove E, Harris EK, and Williams GZ. 1970. Biological and analytical components of variations in long-term studies of serum constituents in normal subjects. III. Physiological and medical implications. Clin Chem 16:1028.

Creasy MA. 1956. Confidence limits for the gradient in the linear functional relationship. J Roy Stat Soc B 18:65.

Dybkaer R. 1995. Result, error and uncertainty. Scand J Clin Lab Invest 55:97.

Dybkaer R. 1997. Vocabulary for use in measurement procedures and description of reference materials in laboratory medicine. Eur J Clin Chem Clin Biochem 35:141.

Ekins RP. 1983. The precision profile: its use in assay design, assessment and quality control. In Hunter WM and Corrie JET (eds). Immunoassays for Clinical Chemistry. Second edition. Churchill Livingstone, Edinburgh.

Ekins R and Edwards P. 1997. On the meaning of "sensitivity". Clin Chem 43:1824.

Emancipator K and Kroll MH. 1993. A quantitative measure of nonlinearity. Clin Chem 39:766.

Fedorov VV. 1972. Theory of Optimal Experiments. Academic Press, New York.

Fraser CG, Petersen PH, Libeer J-C, and Ricós C. 1997. Proposals for setting generally applicable quality goals solely based on biology. Ann Clin Biochem 34:8.

Gardner E. 1985. Exponential smoothing: the state of the art. J Forecast 4:1.

Gautschi K, Keller B, Keller H, Pei P, and Vonderschmitt DJ. 1993. A new look at the limits of detection (LD), quantification (LQ) and power of definition (PD). Eur J Clin Chem Clin Biochem 31:433.

Gowans EMS, Hyltoft Petersen P, Blaabjerg O, and Hørder M. 1988. Analytical goals for the acceptance of common reference intervals for laboratories throughout a geographical area. Scand J Clin Lab Invest 48:757.

Gowans EMS, Hyltoft Petersen P, Blaabjerg O, and Hørder M. 1989. Analytical goals for the estimation of non-Gaussian reference intervals. Scand J Clin Lab Invest 49:727.



Harris EK. 1979. Statistical principles underlying analytic goal-setting in clinical chemistry. Am J Clin Pathol 72:374.

Hollis S. 1996. Analysis of method comparison studies. Ann Clin Biochem 33:1.

Keller H and Passing H. 1989. Performance profiles: new tools for characterization and comparison of clinical chemical results. J Clin Chem Clin Biochem 27:613.

Kroll MH and Chesler R. 1992. Rationale for using multiple regression analysis with complex interferences. Eur J Clin Chem Clin Biochem 30:415.

Kroll MH and Elin RJ. 1994. Interference with clinical laboratory analyses. Clin Chem 40:1996.

Kroll MH and Emancipator K. 1993. A theoretical evaluation of linearity. Clin Chem 39:405.

Linnet K. 1990. Estimation of the linear relationship between the measurements of two methods with proportional errors. Stat Med 9:1463.

Linnet K. 1993. Evaluation of regression procedures for methods comparison studies. Clin Chem 39:424.

Luque de Castro MD, Quiles R, Fernández-Romero JM, and Fernández E. 1995. Continuous-flow assay with immobilized enzymes for determination of inorganic phosphate in serum. Clin Chem 41:99.

Motulsky HJ and Ransnas LA. 1987. Fitting curves to data using nonlinear regression: a practical and nonmathematical review. FASEB J 1:365.

Mooney CZ and Duval RD. 1993. Bootstrapping. A Nonparametric Approach to Statistical Inference. Sage Publications, Newbury Park CA.

Newcombe RG. 1998. Two-sided confidence intervals for the single proportion: comparisons of seven methods. Stat Med 17:857.

Nix ABJ, Rowlands RJ, Kemp KW, Wilson DW, and Griffiths K. 1987. Internal quality control in clinical chemistry: a teaching review. Stat Med 6:425.

Pardue HL. 1997. The inseparable triad: analytical sensitivity, measurement uncertainty, and quantitative resolution. Clin Chem 43:1831.

Parvin CA. 1992. Comparing the power of quality-control rules to detect persistent systematic error. Clin Chem 38:358.

Passey RB and Maluf KC. 1992. Linearity and calibration. A clinical laboratory perspective. Arch Pathol Lab Med 116:757.

Petersen PH, Ricós C, Stöckl D, Libeer JC, Baadenhuijsen H, Fraser C, and Thienpont L. 1996. Proposed guidelines for the internal quality control of analytical results in the medical laboratory. Eur J Clin Chem Clin Biochem 34:983.

Petersen PH, Stöckl D, Blaabjerg O, Pedersen B, Birkemose E, Thienpont L, Lassen JF, and Kjeldsen J. 1997. Graphical interpretation of a field method with a reference method by use of difference plots. Clin Chem 43:11.

Riggs DS, Guarnieri JA, and Addelman S. 1978. Fitting straight lines when both variables are subject to error. Life Sciences 22:1305.

Rodbard D. 1978. Statistical estimation of the minimal detectable concentration ("sensitivity") for radioligand assays. Anal Biochem 90:1.

Sadler WA, Murray LM, and Turner JG. 1992. Minimum distinguishable difference in concentration: a clinically oriented translation of assay precision summaries. Clin Chem 38:1773.

Sadler WA and Smith MH. 1990. Use and abuse of imprecision profiles: some pitfalls illustrated by computing and plotting confidence intervals. Clin Chem 36:1346.

Sadler WA, Smith MH, and Legge HM. 1988. A method for direct estimation of imprecision profiles, with reference to immunoassay data. Clin Chem 34:1058.

Sebastián-Gámbaro MÁ, Lirón-Hernandez FJ, and Fuentes-Arderiu X. 1997. Intra- and inter-individual biological variability data bank. Eur J Clin Chem Clin Biochem 35:845.

Smith FA and Kroft SH. 1997. Optimal procedures for detecting analytic bias using patient samples. Am J Clin Pathol 108:254.

Steinberg DM and Hunter WG. 1984. Experimental design: review and comment. Technometrics 26:71.

Stöckl D. 1996. Metrology and analysis in laboratory medicine: a criticism from the workbench. Scand J Clin Lab Invest 56:193.

Stöckl D, Baadenhuijsen H, Fraser CG, Libeer J-C, Petersen PH, and Ricós C. 1995. Desirable routine analytical goals for quantities assayed in serum. Eur J Clin Chem Clin Biochem 33:157.

Strike PW. 1996. Measurement in Laboratory Medicine: A Primer on Control and Interpretation. Butterworth-Heinemann, Oxford.

Tholen DW. 1992. Alternative statistical techniques to evaluate linearity. Arch Pathol Lab Med 116:746.

Turpeinen U, Karjalainen U, and Stenman U-H. 1995. Three assays for glycohemoglobin compared. Clin Chem 41:191.

Wakkers PJM, Hellendoorn HBA, Op de Weegh GJ, and Heerspink W. 1975. Applications of statistics in clinical chemistry. A critical evaluation of regression lines. Clin Chim Acta 64:173.

Westgard JO, Barry PL, Hunt MR, and Groth T. 1981. A multi-rule Shewhart chart for quality control in clinical chemistry. Clin Chem 27:493.
