

Digital Filtering and Model Updating Methods for Improving the Robustness of Near-Infrared Multivariate Calibrations

KIRSTEN E. KRAMER* and GARY W. SMALL†

Optical Science and Technology Center and Department of Chemistry, University of Iowa, Iowa City, Iowa 52242

Fourier transform near-infrared (NIR) transmission spectra are used for quantitative analysis of glucose for 17 sets of prediction data sampled as much as six months outside the timeframe of the corresponding calibration data. Aqueous samples containing physiological levels of glucose in a matrix of bovine serum albumin and triacetin are used to simulate clinical samples such as blood plasma. Background spectra of a single analyte-free matrix sample acquired during the instrumental warm-up period on the prediction day are used for calibration updating and for determining the optimal frequency response of a preprocessing infinite impulse response time-domain digital filter. By tuning the filter and the calibration model to the specific instrumental response associated with the prediction day, the calibration model is given enhanced ability to operate over time. This methodology is demonstrated in conjunction with partial least squares calibration models built with a spectral range of 4700–4300 cm⁻¹. By using a subset of the background spectra to evaluate the prediction performance of the updated model, projections can be made regarding the success of subsequent glucose predictions. If a threshold standard error of prediction (SEP) of 1.5 mM is used to establish successful model performance with the glucose samples, the corresponding threshold for the SEP of the background spectra is found to be 1.3 mM. For calibration updating in conjunction with digital filtering, SEP values of all 17 prediction sets collected over 3–178 days displaced from the calibration data are below 1.5 mM. In addition, the diagnostic based on the background spectra correctly assesses the prediction performance in 16 of the 17 cases.

Index Headings: Near-infrared spectroscopy; NIR spectroscopy; Calibration; Digital filtering.

INTRODUCTION

Near-infrared (NIR) spectroscopy has become a widely used analytical technique in many monitoring applications.1 The method is rapid and nondestructive, and little or no sample preparation is necessary. In addition, the reduced background absorbance of water in the NIR region allows measurements to be made directly in aqueous samples.

Despite these clear advantages, the development of NIR applications is often limited by the weak absorptivities and broad spectral bands associated with many target analytes. In a complex matrix such as a biological sample, high spectral overlap dictates that multivariate calibration methods must be used to implement a quantitative analysis. Techniques such as principal component regression (PCR) or partial least squares (PLS) are typically used to build quantitative models.2–4

A ramification of the weak absorptivities associated with analyte spectral bands is the real possibility that spectral signals due to changes in the instrumental profile, sample temperature, or the background absorbance of the sample matrix may all be significantly larger than those of the analyte.5–7 Multivariate calibration models may thus require many terms to extract the analyte information from the background. Because these non-analyte factors are embedded in the model, spectrometer drift may be a serious concern when a calibration model created with data collected earlier is applied to unknown samples.8,9

Traditional techniques to extend the usefulness of a calibration model over time include calibration transfer10–13 or calibration updating methods.14–17 Calibration updating usually involves periodic re-collection of a small number of reference samples. The new data are incorporated into the original calibration (with possible deletion of some original data) to allow the calibration to "evolve" according to the changes in the instrumental profile over time. Calibration transfer is a technique designed to use calibration data collected on one spectrometer to achieve accurate predictions for unknowns sampled with another instrument. This technique may also be used for unknowns collected with the same spectrometer at a date removed from the collection of the calibration data. Both calibration updating and calibration transfer usually involve re-collection of samples of known analyte concentration.

In recent work, we have introduced a new approach to updating calibration models so that accurate predictions are maintained over time.18 This technique is based on the collection of repeated spectra of a single background sample during the instrumental warm-up period. This background sample contains no analyte and is employed to capture the instrumental profile on the prediction day. In the initial work, procedures were devised for incorporating these background spectra into the set of calibration spectra used to compute a PLS model. Effective predictions were maintained for an evaluation period of six months after the collection of the calibration data.

In the work reported here, the updating method is enhanced by combining it with an adaptive digital filtering strategy. In past research, various digital filtering methods5,19–24 have been investigated for use in helping to extract analyte signals from NIR data. The motivation for the use of digital filtering in this context is to remove data variance before submitting spectra to the calibration model. If filters can be tuned to remove non-analyte spectral features selectively, the models are potentially simplified and fewer features susceptible to instrumental drift are embedded. Such models should exhibit more robust performance over time. In the current work, the background data used to drive the model-updating procedure are also employed to optimize the frequency response of a preprocessing digital filter. The developed methodology is evaluated by application to an NIR analysis of glucose in a simulated biological matrix.

EXPERIMENTAL

Solution Preparation. Samples in this experiment were composed of glucose in an aqueous matrix of varying levels of bovine serum albumin (BSA) and triacetin. These species were designed to model total protein and triglycerides, respectively, in biological samples such as blood plasma. The concentrations of each of the three components were varied over the physiological range according to a uniform experimental design.25 In this design, glucose spanned 1.4–19.3 mM, BSA ranged from 51.0 to 99.0 g/L, and triacetin ranged from 1.4 to 3.9 g/L. The calibration set consisted of 70 samples and the prediction set (a separate uniform design) contained 21 samples. The maximum correlation coefficients between component concentrations were 0.01 and 0.25 for the calibration and prediction sets, respectively.

Received 14 July 2008; accepted 24 November 2008.
* Present address: Cognis Corp., 4900 Este Ave., Cincinnati, OH 45232.
† Author to whom correspondence should be sent. E-mail: gary-small@uiowa.edu.
Applied Spectroscopy, Volume 63, Number 2, 2009, page 246. 0003-7028/09/6302-0246$2.00/0 © 2009 Society for Applied Spectroscopy

Samples were prepared in a 0.1 M, pH 7.4 phosphate buffer (NaH2PO4 (ACS reagent, Fisher Scientific, Fair Lawn, NJ) + 50% w/w NaOH (Fisher)). Reagent-grade water obtained from a Milli-Q Plus water purification system (Millipore, Inc., Bedford, MA) was used in all solution preparations. Sodium benzoate (5 g/L, Fisher) was added to the buffer as a preservative. Mixture samples were prepared by volumetric dilutions of stock solutions of glucose (71.4 mM, ACS reagent, Fisher), BSA (94.9 g/L, Cohn fraction V powder, minimum 96% by electrophoresis, Product No. A4503, Sigma Chemical Co., St. Louis, MO), and triacetin (29.7 g/L, 99%, Sigma).

Apparatus. A Digilab FTS-60A Fourier transform (FT) spectrometer (Varian, Inc., Randolph, MA) was used to collect interferogram data of the mixture samples. The spectrometer was equipped with a 100 W tungsten-halogen lamp, CaF2 beam splitter, and cryogenically cooled InSb detector. A liquid transmission cell with a 20 mm diameter circular aperture (model 118–3, Wilmad Glass, Buena, NJ) and sapphire windows (Meller Optics, Providence, RI) was used to hold the samples. The path length was fixed at 2 mm. To ensure detector linearity, a source aperture setting of 2 cm⁻¹ was used and a 63% thin-film neutral density filter (Rolyn Optics, Covina, CA) was placed before the sample. A K-band interference filter (Barr Associates, Westford, MA) was also placed before the sample to limit the throughput range to 5000–4000 cm⁻¹.

The sample cell was encased in a metal water jacket and a Fisher Model 9100 circulator (Fisher Scientific, Inc., Pittsburgh, PA) was used to control the temperature. Temperatures were measured with a copper-constantan thermocouple probe (Omega Engineering Inc., Stamford, CT) inserted into a port within the cell jacket. An Omega Model 670 digital meter was used to record the temperature of the sample with a precision of ±0.1 °C. The mean and standard deviation of all temperatures recorded throughout the experiment were 37.0 ± 0.1 °C. Glucose concentrations were checked daily with a YSI Model 2300 Stat Plus Glucose Analyzer (YSI, Inc., Yellow Springs, OH).

Raw Data Processing. All glucose samples were collected as single-sided interferograms consisting of 2048 points and 256 co-added scans. The spectral bandwidth was 15 801 cm⁻¹. Interferograms of the background data (described below) were collected as 256 co-added scans for analysis days 1, 2, 3, and 30, and 64 co-added scans for all other days. Interferograms were downloaded to a Silicon Graphics Origin 200 computer (Silicon Graphics, Inc., Mountain View, CA) operating under Irix (Version 6.5, Silicon Graphics, Inc., Mountain View, CA). Programs written in Fortran 77 and subroutines from the IMSL software package (IMSL, Inc., Houston, TX) were used for Fourier processing interferograms. Mertz phase correction and triangular apodization were used, and the spectral point spacing was 15.4 cm⁻¹. The corresponding instrumental resolution was determined to be 16.6 cm⁻¹ on the basis of previous experiments with a xenon line source.26
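The quoted point spacing can be checked with one line of arithmetic. The sketch below assumes, which the text does not state explicitly, that a 2048-point interferogram transforms to 1024 spectral points spanning the full bandwidth with no zero filling; the function name is hypothetical.

```python
def spectral_point_spacing(bandwidth_cm1, n_interferogram_points):
    """Spacing between spectral points, assuming an N-point interferogram
    yields N/2 spectral points across the bandwidth (no zero filling)."""
    return bandwidth_cm1 / (n_interferogram_points / 2)


spacing = spectral_point_spacing(15801.0, 2048)
print(round(spacing, 1))  # 15.4
```

Under that assumption the result agrees with the 15.4 cm⁻¹ spacing reported above.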

Calibration computations were performed with functions written in the Matlab development environment (Version 6.5, The MathWorks, Inc., Natick, MA). Some calculations employed functions from the PLS_Toolbox (Version 3.5, Eigenvector Research, Inc., Manson, WA). Digital filter design used the Matlab Signal Processing Toolbox (Version 6.1, The MathWorks, Inc.).

Software timing experiments were performed with a Dell XPS 600 computer equipped with a Pentium 4 processor operating at 3.80 GHz (Dell Computer, Inc., Round Rock, TX). This system ran under Microsoft Windows XP (Microsoft, Inc., Redmond, WA). Filtering and calibration calculations associated with the timing experiments employed Matlab, Version 7.1, and the Matlab Signal Processing Toolbox, Version 6.4 (The MathWorks, Inc.). Functions from the PLS_Toolbox, Version 4.0 (Eigenvector Research, Inc.) were also used.

Data Collection. The 70 samples of the calibration set were collected over days 1–2 (calibration set A) and again on days 100–101 (calibration set B). The prediction set was sampled on days 3, 30, 37, 44, 51, 58, 65, 72, 79, 102, 109, 124, 136, 143, 150, 157, 164, 171, and 178. On each calibration day, 35 samples were collected for a total of 105 triplicate spectra and the full calibration consisted of 210 spectra. On each prediction day, the entire prediction set (21 samples, 63 spectra) was collected. To collect the replicate spectra, the sample cell was filled and positioned, the temperature was allowed to stabilize to 37 °C, and three consecutive 256-scan interferograms were acquired. All samples were kept frozen between analysis days and neither the calibration set nor the prediction set was re-prepared during the study. A randomized sample run order was devised for the calibration and prediction sets to minimize correlations between the sample constituents and time. The same run order was used each day.

Before the glucose samples were measured, background spectra of the sample matrix were collected during the instrumental source warm-up period. The background sample was composed of 75.0 g/L BSA and 2.5 g/L triacetin, roughly the mean matrix of the calibration set of samples. Approximately 1–2 hours were allowed for stabilization of the instrument after the NIR source was powered and the detector filled with liquid nitrogen. Background collection took approximately 20–30 minutes and was performed during the first 1.5 hours after the source was powered. Other types of backgrounds (buffer solvent and a sample containing analyte) were also collected during this period but were not used for this study. Since the collection order of backgrounds was swapped, the matrix backgrounds were collected roughly 10 minutes after the source was powered for some days. For other days, the backgrounds were collected after the source had been on for a longer time. Batches of 20 background spectra were collected on days 1–3, while 50 spectra were collected on all other days. In all cases, backgrounds were collected from a single filling and positioning of the sample cell.

RESULTS AND DISCUSSION

Data Preprocessing. The quantitative models for glucose reported here are based on the direct analysis of single-beam spectra preprocessed as

$$x'_i = \log_{10}\left(\frac{1}{x_i}\right) \qquad (1)$$

where $x_i$ is the raw single-beam intensity at resolution element $i$. This transformation is often used for single-beam data to approximate the units of absorbance.27 Spectra that are processed in this way will be termed log/I spectra. Single-beam spectra are used here for model building on the basis of interest in our laboratory in avoiding the introduction of additional spectral variance by taking the ratio of two extremely temperature-sensitive single-beam spectra.27,28 All of the data analysis procedures described could be similarly applied to other types of spectra, however (e.g., spectra in absorbance units (AU), Raman spectra, etc.).
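As a minimal illustration of the transformation in Eq. 1, the following sketch applies it point by point to a single-beam spectrum. The function name and the intensity values are made up for the example and do not come from the study.

```python
import math


def to_log_inverse(single_beam):
    """Apply Eq. 1, x'_i = log10(1 / x_i), to each single-beam intensity."""
    return [math.log10(1.0 / x) for x in single_beam]


# Made-up intensities, for illustration only.
raw = [0.5, 0.1, 1.0]
processed = to_log_inverse(raw)
```

An intensity of 1.0 maps to zero and smaller intensities map to larger positive values, which is why the result behaves like an absorbance scale without requiring a ratio to a second spectrum.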

Noise Analysis and Elimination of Outlying Data. To assess data quality, short-term noise was computed on each analysis day. For the glucose samples, three noise values were computed from each group of three replicate spectra by taking the ratio of each pair-wise combination of single-beam spectra and converting the resulting transmittance values to absorbance. To remove the effects of temperature variance, these spectra were fit to a second-order polynomial function over the region of 4500 to 4300 cm⁻¹. Noise estimates were then computed as the root mean square (rms) values about the polynomial fit. For the background data sets, a similar calculation was performed, taking each group of three consecutive spectra as replicates.
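The noise calculation described above can be sketched for one pair of replicate spectra as follows. This is an illustrative reading rather than the authors' code: the function names are hypothetical, the rms is taken with a divisor of n (the text does not say whether degrees of freedom were subtracted), and the quadratic fit is done with plain normal equations to keep the example free of external libraries.

```python
import math


def quadratic_fit(x, y):
    """Least-squares second-order polynomial fit; returns the fitted y values."""
    # Center and scale x so the normal equations stay well conditioned.
    mid = sum(x) / len(x)
    half = (max(x) - min(x)) / 2 or 1.0
    t = [(xi - mid) / half for xi in x]
    # Normal equations A c = b for the basis [1, t, t^2].
    s = [sum(ti ** k for ti in t) for k in range(5)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for k in range(col, 3):
                a[r][k] -= f * a[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (b[i] - sum(a[i][j] * c[j] for j in range(i + 1, 3))) / a[i][i]
    return [c[0] + c[1] * ti + c[2] * ti * ti for ti in t]


def rms_noise(wavenumbers, beam_a, beam_b, lo=4300.0, hi=4500.0):
    """Ratio two single-beam spectra, convert to absorbance, and return the
    rms residual about a quadratic fit over lo..hi (in cm^-1)."""
    idx = [i for i, w in enumerate(wavenumbers) if lo <= w <= hi]
    w = [wavenumbers[i] for i in idx]
    absorb = [-math.log10(beam_a[i] / beam_b[i]) for i in idx]
    fit = quadratic_fit(w, absorb)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(absorb, fit)) / len(w))


# Synthetic check: a purely quadratic baseline should give essentially zero noise.
wn = [4300.0 + 15.4 * k for k in range(14)]
baseline = [1e-3 + 2e-6 * (w - 4400.0) + 1e-8 * (w - 4400.0) ** 2 for w in wn]
beam_b = [1.0] * len(wn)
beam_a = [10.0 ** (-s) for s in baseline]
smooth_noise = rms_noise(wn, beam_a, beam_b)

# Adding an alternating perturbation produces a nonzero residual.
rough_a = [x * (1.0 + 1e-4 * (-1) ** k) for k, x in enumerate(beam_a)]
rough_noise = rms_noise(wn, rough_a, beam_b)
```

Multiplying the returned value by 10^6 would express it in μAU, the unit used in Fig. 1.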

Figure 1A plots the mean rms noise in μAU for the sets of glucose data collected on each analysis day. Day 124 suffers from particularly high noise and after investigation of the spectra by principal component analysis29 it was apparent that the data collected during the first half of the day appeared to be quite different from the second half. Analysis of day 124 was consistently poor for the second half of the samples and therefore this entire day was removed from the analysis.

FIG. 1. The rms noise values in μAU for (A) the sets of glucose samples and (B) matrix backgrounds collected on each analysis day (calculations and removal of outlying data are discussed in the text). (C) Values of SECV for PLS cross-validations performed separately for each analysis day for the available glucose data. These values provide an estimate of the best prediction performance that could be expected on each day.

Figure 1B plots the noise for the sets of matrix backgrounds collected on each analysis day. Days 1–3 have lower noise because the background spectra collected on these days were based on 256 co-added scans (versus 64 scans on other days). The noise is higher than normal for four analysis days, 102, 109, 150, and 171. After inspecting the spectra, it was concluded that the presence of a bubble in the sample cell (i.e., transmission through air) most likely caused some anomalous background spectra to be collected on these days. Within these four groups of spectra, the sum of squares of intensities in the 5000–4800 and 4200–4000 cm⁻¹ regions was used to define an intensity threshold for elimination of outlying spectra. When bubbles are present, these regions will have greater than expected transmittance. For days 102, 109, 150, and 171, the numbers of spectra passing the threshold were 2, 1, 27, and 11, respectively. Since day 109 contained only one spectrum, this day was eliminated from the filter study. Day 150 was considered to have a sufficient number of backgrounds for the study. Days 102 and 171 were analyzed with the understanding that prediction performance could be degraded because of the limited number of backgrounds available.
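The bubble screen described in the text can be sketched as a sum of squared intensities over the two edge regions compared against a cutoff. The function names, the toy spectra, and the threshold value are all hypothetical; the text does not report the threshold that was actually used.

```python
def edge_energy(wavenumbers, intensities,
                regions=((4800.0, 5000.0), (4000.0, 4200.0))):
    """Sum of squared single-beam intensities inside the edge regions (cm^-1)."""
    return sum(
        x * x
        for w, x in zip(wavenumbers, intensities)
        if any(lo <= w <= hi for lo, hi in regions)
    )


def passes_bubble_screen(wavenumbers, intensities, threshold):
    """A spectrum passes when its edge-region energy does not exceed the threshold."""
    return edge_energy(wavenumbers, intensities) <= threshold


# Toy spectra: a bubble raises transmittance in the edge regions.
wn = [4000.0, 4100.0, 4500.0, 4900.0, 5000.0]
normal = [0.1, 0.2, 1.0, 0.2, 0.1]
bubble = [0.4, 0.5, 1.0, 0.5, 0.4]
THRESHOLD = 0.5  # hypothetical cutoff chosen for this toy data only

normal_ok = passes_bubble_screen(wn, normal, THRESHOLD)
bubble_ok = passes_bubble_screen(wn, bubble, THRESHOLD)
```

In practice the cutoff would have to be set from spectra known to be free of bubbles, since the appropriate value depends on the instrument and filter throughput.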

Figure 1C shows the standard error of cross-validation (SECV) for each analysis day from a PLS model computed with the log/I data collected on that day. For each set of glucose samples, a leave-one-sample-out cross-validation was performed on the data set for PLS model sizes of 1–20 factors and a spectral range of 4700–4300 cm⁻¹. This region has been found optimal in previous work with the glucose/BSA/triacetin matrix.20,30–32 It includes the important C–H combination band of glucose centered near 4400 cm⁻¹.

In this experiment, one sample corresponded to three replicate spectra. The cross-validation involved three spectra being withheld for prediction while a PLS model was formulated with the remaining data. Mean centering was performed separately for each calibration set in the cross-validation before submitting the spectra to the PLS calculation. The standard error of cross-validation was computed as

$$\mathrm{SECV} = \sqrt{\frac{\sum_{i=1}^{n} (c_i - \hat{c}_i)^2}{n}} \qquad (2)$$

where $c_i$ and $\hat{c}_i$ are the prepared and predicted concentrations of glucose, respectively, and $n$ denotes the total number of spectra used. For the various days, n = 105 for calibration days 1, 2, 100, and 101, and n = 63 for all other days. After the cross-validations were performed for 1–20 PLS factors, the lowest model size was selected whereby the error was not significantly different from the minimum SECV according to an F-test (P = 0.70). The model sizes in Fig. 1C were between 7 and 13 factors for the internal cross-validations. Excluding day 124, the SECV values ranged from approximately 0.5 to 0.9 mM. These values were considered to be an assessment of the best prediction error that could be expected for each day.
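Two pieces of the cross-validation described above lend themselves to a short sketch: the error statistic of Eq. 2 and the bookkeeping that withholds all replicates of a sample together. The function names and sample identifiers are hypothetical, and the PLS fit and the F-test for choosing model size are deliberately left out.

```python
import math


def standard_error(prepared, predicted):
    """Eq. 2: root mean squared difference between prepared and predicted values."""
    n = len(prepared)
    return math.sqrt(
        sum((c - c_hat) ** 2 for c, c_hat in zip(prepared, predicted)) / n
    )


def leave_one_sample_out(sample_ids):
    """Yield (train_idx, test_idx) pairs in which every replicate spectrum of
    one sample is withheld at the same time."""
    for held in sorted(set(sample_ids)):
        test = [i for i, s in enumerate(sample_ids) if s == held]
        train = [i for i, s in enumerate(sample_ids) if s != held]
        yield train, test


# Two samples with three replicate spectra each (made-up identifiers).
ids = [0, 0, 0, 1, 1, 1]
folds = list(leave_one_sample_out(ids))
err = standard_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Keeping replicates together prevents a spectrum in the withheld fold from having a near-identical twin in the training data, which would make the SECV optimistic.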

Prediction Results with Long-Term Data. Figure 2 shows prediction results for each analysis day using log/I single-beam PLS models for calibration sets A (black) and B (gray). The range of 4700 to 4300 cm⁻¹ was again used in building the models, and the calibration data were mean centered before submission to the PLS algorithm. Model sizes were computed by use of a leave-10%-out cross-validation procedure applied to the calibration data, followed by analysis of the SECV values with the F-test at P = 0.70. Replicate spectra were carried together in each cycle of the cross-validation. The resulting models were applied to the spectra in the prediction set collected on each day. Model performance was evaluated by computing the standard error of prediction (SEP). The SEP was computed with Eq. 2, where n = 63 denotes the number of spectra in the prediction set for a given analysis day.

For the glucose analysis in the BSA/triacetin matrix and considering the noise levels achievable with the spectrometer used here, a maximum acceptable SEP was established at 1.5 mM. Prediction results are ≤1.5 mM for only three days using calibration set A and for only one day using calibration set B. This motivates the need for additional data treatment to improve the robustness of the models.

FIG. 2. Values of SEP for each analysis day using PLS models based on calibration set A collected over days 1–2 (black) and calibration set B collected on days 100–101 (gray). Values of SEP for prediction days 3, 65, and 72 were below the target limit of 1.5 mM for models based on calibration set A, and the SEP for day 102 was below 1.5 mM for the models based on calibration set B. For all other days, prediction results were poor and show severe degradation in many instances. These results illustrate the effects of instrumental drift on the prediction performance for data taken significantly outside the time span of the calibration.

Overview of Calibration Updating Procedure. This work explored the development of digital filtering methods33–35 for use in helping to update multivariate calibration models to account for changes in instrumental response with time. This work assumes the availability of a set of previously acquired calibration spectra that can be used to build a multivariate model for use in predicting analyte concentrations in unknown samples. Furthermore, as demonstrated by the results displayed in Fig. 2, we assume that in the time since the calibration spectra were collected, instrumental drift has occurred that prevents a model based on the calibration data alone from performing adequately. On a given day in which unknown samples are analyzed (i.e., the "prediction day"), the goal of this work was to develop an effective model-updating protocol that required the minimum amount of additional data collection.

In an initial study, we have demonstrated that models can be updated by augmenting the calibration data with replicate spectra of a single analyte-free matrix sample collected on the prediction day during the instrumental warm-up period.18 This work built upon previous research by Haaland and co-workers in the development of their augmented classical least squares (ACLS) methods.15,36–40

In the present study, we have sought to improve this calibration updating methodology by incorporating an adaptive digital filter into the spectral processing stream for the purpose of removing variation that may interfere with the performance of the calibration model. These filters are adaptive in the sense that a new filter frequency response is optimized for each prediction day. In the revised processing strategy, the filter is applied to the spectral data immediately prior to the PLS calculation.

In implementing this strategy, a Butterworth design was used to develop high-pass and bandpass infinite impulse response (IIR) filters.33,35 To prevent band shifts in the filtered spectra, filters were applied with a zero-phase implementation. This involved filtering the data in the forward direction, reversing the filtered sequence, and re-applying the filter.
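The forward-then-reverse idea can be illustrated with a small library-free sketch. Note the substitution: the coefficients below define a simple first-order recursive low-pass filter chosen only to make the phase shift visible; they are not the Butterworth high-pass or bandpass designs used in the study. The function names and the synthetic band are likewise made up for the example.

```python
import math


def iir_filter(b, a, x):
    """Direct-form recursive filter:
    a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y


def zero_phase(b, a, x):
    """Filter forward, reverse the result, filter again, and reverse back."""
    forward = iir_filter(b, a, x)
    backward = iir_filter(b, a, forward[::-1])
    return backward[::-1]


# Stand-in coefficients (an assumption): y[n] = 0.3*x[n] + 0.7*y[n-1].
b, a = [0.3], [1.0, -0.7]

# Synthetic Gaussian band centered at point 50 of a 101-point "spectrum".
band = [math.exp(-((i - 50) ** 2) / (2 * 5.0 ** 2)) for i in range(101)]

single_pass = iir_filter(b, a, band)
both_passes = zero_phase(b, a, band)
single_pass_peak = single_pass.index(max(single_pass))
zero_phase_peak = both_passes.index(max(both_passes))
```

A single forward pass moves the band maximum toward higher point indices, whereas the forward-reverse application leaves it at point 50, which is the behavior needed to avoid band shifts in filtered spectra. The price is that the magnitude response is applied twice.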

Filter Optimization Strategy. The filtering methodology was evaluated both with and without the updating procedure (i.e., with and without adding backgrounds from the prediction day into the calibration set). Both high-pass and bandpass filters were studied thoroughly and were found to produce similar results. Only results obtained with bandpass filters will be presented here.

Bandpass filters are defined in terms of their pass-band and stop-band cutoff frequencies. Filters were evaluated for orders 2, 4, 6, and 8, with center frequencies searched from 0.01 to 0.8 in increments of 0.02. These filter specifications apply to a normalized frequency axis of 0.0 to 1.0. For each center frequency, values from 0.01 to 0.5 in increments of 0.02 were subtracted from the center frequency to obtain the pass-band cutoff and were added to the center frequency to obtain the stop-band cutoff for all computationally possible filters.
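The search grid described above can be enumerated as in the sketch below. Two points are assumptions: "computationally possible" is read here as requiring both cutoffs to lie strictly between 0 and 1 on the normalized axis, and the names `low_cut` and `high_cut` are labels for the center-minus-offset and center-plus-offset values rather than terms from the text. Integer hundredths are used internally to avoid floating-point drift in the grid.

```python
def candidate_bandpass_settings():
    """Enumerate (order, low_cut, high_cut) on a normalized 0-1 frequency axis."""
    settings = []
    for order in (2, 4, 6, 8):
        for center in range(1, 80, 2):        # 0.01, 0.03, ..., 0.79
            for offset in range(1, 50, 2):    # 0.01, 0.03, ..., 0.49
                low, high = center - offset, center + offset
                if 0 < low and high < 100:    # keep only realizable band edges
                    settings.append((order, low / 100, high / 100))
    return settings


grid = candidate_bandpass_settings()
```

Each tuple in `grid` would then be passed to a filter design routine and scored by the prediction error obtained with the withheld background spectra.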

For each filter setting, filtering was applied to the spectral range of 4800–4200 cm⁻¹. After filtering, the data were truncated to 4700–4300 cm⁻¹ for a total of 27 points in the analysis range. The filtered data were then mean centered and submitted to the PLS algorithm.

When calibration updating was not performed, 20 spectra were selected from the pool of backgrounds collected over the calibration days and added to the 210 mixture spectra in the calibration set. By contrast, calibration updating involved adding 10 backgrounds from the calibration days and 10 backgrounds from the prediction day, for the same total of 230 spectra. The remaining backgrounds from the prediction day were then used to test the updated model.

Two procedures were evaluated in an investigation of how to select the backgrounds most appropriately for use in updating and testing the calibration model. The first procedure was based on sampling the backgrounds equally over time. For example, if 50 backgrounds were collected, sequence numbers 1, 6, 12, 17, 23, 28, 34, 39, 45, and 50 were used to augment the calibration set and the remaining backgrounds were used to test the updated model and determine the optimal filter bandpass.
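The equal-time sampling can be reproduced with evenly spaced positions rounded to the nearest sequence number. The rounding rule is an assumption, but it happens to regenerate the ten sequence numbers listed in the text for 50 collected backgrounds. The function name is hypothetical.

```python
def evenly_spaced_sequence_numbers(n_collected, n_selected):
    """1-based sequence numbers spread evenly from the first to the last spectrum."""
    step = (n_collected - 1) / (n_selected - 1)
    return [round(1 + k * step) for k in range(n_selected)]


selected = evenly_spaced_sequence_numbers(50, 10)
print(selected)  # [1, 6, 12, 17, 23, 28, 34, 39, 45, 50]
```

The remaining 40 sequence numbers would form the test pool in this first procedure.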

The second procedure used a multivariate distance approach. To select the group of 10 (or 20) backgrounds from the calibration or prediction groups, an iterative procedure was used to subdivide the pool of collected backgrounds into two sets. If the number of collected backgrounds is indicated by n, and q spectra were to be selected (q = 10 or 20), the first set was of size n − q. This set was selected to maximize diversity. Operating with the spectra filtered with the current pass-band specifications being evaluated, a stepwise procedure was used to identify the n − q spectra farthest from the multivariate mean. At each iteration, the spectrum farthest from the mean was identified and added to the selected subset. The mean was then re-computed and the procedure was repeated until the set of n − q spectra was assembled.

After performing the above procedure, the set of q spectra closest to the multivariate mean were used to augment the calibration set and the model was then updated. The remaining set of n − q spectra farthest from the mean were subsequently used to test the updated model and evaluate the filter bandpass parameters.
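The stepwise split described above can be sketched as follows. Two details are assumptions, since the text does not specify them: the distance is taken to be Euclidean, and the mean is re-computed over the spectra still remaining in the pool after each removal. The function name `split_by_distance` is hypothetical.

```python
import math


def split_by_distance(spectra, q):
    """Split spectra into q 'update' indices (closest to the mean) and
    n - q 'test' indices (farthest from the mean, in order of removal)."""
    remaining = list(range(len(spectra)))
    test_idx = []
    while len(remaining) > q:
        # Mean spectrum of the spectra still in the pool (assumption).
        columns = zip(*(spectra[i] for i in remaining))
        mean = [sum(col) / len(remaining) for col in columns]
        # Move the spectrum farthest from the current mean to the test set.
        farthest = max(remaining, key=lambda i: math.dist(spectra[i], mean))
        remaining.remove(farthest)
        test_idx.append(farthest)
    return remaining, test_idx


# Toy two-point "spectra" (hypothetical values).
pool = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
        [10.0, 10.0], [-8.0, -9.0], [0.5, 0.5]]
update_idx, test_idx = split_by_distance(pool, q=4)
```

In this toy pool the two outlying spectra are reserved for testing, while the four spectra clustered near the mean are kept for updating.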

The two background selection procedures are fundamentally different from the standpoint of how representative the backgrounds added to the calibration set are with respect to the total set of backgrounds collected. By selecting the backgrounds equally spaced over time, the first approach attempted to augment the calibration set with spectra that were representative of all the backgrounds. By contrast, the second procedure was designed to select the most diverse backgrounds and reserve them for use in testing the updated model. Consequently, the backgrounds added to the calibration set were much less diverse than those used to update the model with the first selection procedure.

Evaluation of these two procedures revealed that the second approach was superior. Having a more diverse (i.e., more challenging) set of spectra for use in testing the model made selecting the best filter bandpass easier and also provided a better estimate of the performance of the updated model when it was subsequently applied to predict the glucose concentrations in the set of actual prediction samples. The set of q spectra closest to the mean proved adequate for use in updating the calibration model. This suggests that the key information required to adjust the model is found in the mean background spectrum as long as enough statistical weight is given to this spectrum so that the model is forced to take it into account (i.e., it is effectively present q times in the calibration set).

On the basis of this evaluation, the results presented here employed the set of q background spectra to augment the calibration set and the group of n − q spectra to test the computed model. Once the updated calibration set was assembled, the mean spectrum was computed, the data were mean centered, and the PLS calculation was performed.

For prediction days 102 and 171, only 2 and 11 backgrounds were available, respectively, and the above procedure could not be used. For day 102, one background was repeated 10 times.

250 Volume 63, Number 2, 2009

For day 171, q was set to 5, and these selected backgrounds were duplicated for the subset of 10.
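The handling of these two days amounts to repeating the available update spectra until 10 are present. A minimal sketch, assuming the spectra are simply cycled in order (the helper name `pad_by_repetition` and the string labels are hypothetical stand-ins for spectra):

```python
from itertools import cycle, islice


def pad_by_repetition(backgrounds, target=10):
    """Repeat the available spectra in order until `target` are collected."""
    if not backgrounds:
        raise ValueError("at least one background spectrum is required")
    return list(islice(cycle(backgrounds), target))


# Day 102: a single background repeated 10 times.
day_102 = pad_by_repetition(["bg_a"])
# Day 171: five selected backgrounds, each appearing twice.
day_171 = pad_by_repetition(["bg1", "bg2", "bg3", "bg4", "bg5"])
```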

After supplementing the calibration set with backgrounds, a leave-10%-out cross-validation was performed for 1–20 PLS factors using the procedures outlined previously. As before, a model size was selected using the F-test (P = 0.70). Two checks were performed at this stage to identify potentially anomalous filters. If the model size was larger than that obtained without filtering, the filter was discarded. Similarly, if the standard error of calibration (SEC) or SECV exceeded 1.0 mM, the filter was judged ineffective. The SEC is computed in a manner analogous to Eq. 2, except that the calibration spectra are used and the degrees of freedom in the denominator are decreased by the model size (h). This estimation of the degrees of freedom is a commonly used approximation, although it does not take into account that when a latent variable method such as PLS is employed, more than h original data points are typically used to produce the scores that comprise the h independent variables of the calibration model.
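The two screening checks can be sketched as below. The SEC function assumes that Eq. 2 divides the summed squared residuals by the number of spectra n, so that the SEC divides by n − h; if Eq. 2 uses a different denominator, the function would need to be adjusted accordingly. The names `sec` and `passes_screen` are hypothetical.

```python
import math


def sec(reference, predicted, h):
    """Standard error of calibration with n - h degrees of freedom
    (assumes Eq. 2 uses n in the denominator)."""
    n = len(reference)
    ss = sum((c - p) ** 2 for c, p in zip(reference, predicted))
    return math.sqrt(ss / (n - h))


def passes_screen(model_size, unfiltered_model_size,
                  sec_value, secv_value, limit=1.0):
    """Apply the two checks: the model must not be larger than the
    unfiltered model, and SEC and SECV must not exceed `limit` (mM)."""
    if model_size > unfiltered_model_size:
        return False
    return sec_value <= limit and secv_value <= limit


# Toy values (hypothetical): six calibration samples, two latent variables.
toy_sec = sec([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
              [1.0, 2.0, 3.0, 4.0, 5.0, 8.0], h=2)
```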

For filters that passed the initial screen, the SEP was computed from the set of n − q backgrounds that were not added to the calibration set, using zero as the reference concentration. The SEP value of the background data (SEPBG) was used to guide the filter search, with the smallest value of SEPBG indicating the best filter to use.
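Because the reference concentration of every background is zero, SEPBG reduces to the root-mean-square of the predicted concentrations. The sketch below assumes that Eq. 2 is the usual root-mean-square form with the number of spectra in the denominator; the function names, the tuple layout (order, pass-band, stop-band), and the numerical values are hypothetical.

```python
import math


def sep_background(predicted):
    """SEP of background spectra, using zero as the reference concentration."""
    return math.sqrt(sum(p * p for p in predicted) / len(predicted))


def best_filter(results):
    """Pick the filter setting with the smallest SEPBG.

    `results` maps a filter setting to the concentrations (mM) predicted
    for the n - q test backgrounds after applying that filter.
    """
    scored = {setting: sep_background(p) for setting, p in results.items()}
    best = min(scored, key=scored.get)
    return best, scored[best]


# Toy predictions (hypothetical) for two candidate filters.
results = {
    (2, 0.40, 0.70): [0.6, -0.8, 0.0, 1.0],
    (2, 0.45, 0.80): [0.3, -0.4, 0.0, 0.5],
}
setting, sep_bg = best_filter(results)
```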

Results of Initial Studies. Initial studies were performed by subdividing the calibration data into calibration and prediction subsets on the basis of the calibration set (A or B) and the day of data collection. In terms of days, calibration/prediction sets used were 1/100, 1/101, 2/100, 2/101, 100/1, 100/2, 101/1, and 101/2. This allowed the long-term prediction sets to be reserved for final testing once the filtering/updating algorithm was optimized.

Figure 3A plots the first three samples (nine log/I spectra) of calibration day 2 and the same three samples (after frozen storage) collected on day 101 (18 spectra total). The same spectra after mean centering are plotted in Fig. 3C. Although the spectra appear to be well overlapped in Fig. 3A, the mean-centered spectra in Fig. 3C exhibit quite a bit of variation. Six distinct groups of spectra are observed throughout the 4700–4300 cm⁻¹ spectral range, rather than the three groups that would be considered ideal. Long-term predictions were poor for the non-filtered data, yielding results in excess of 13 mM for SEP.

Figure 3B plots the spectra after application of the optimal bandpass filter found using day 2 for calibration and day 101 for prediction. The same filtered spectra after mean centering are plotted in Fig. 3D. The order of the filter was 2 and cutoff frequencies were 0.16 and 0.74 for the pass-band and stop-band, respectively. An inspection of Fig. 3D reveals the desired case of only three distinct groups of spectra in the important 4450 to 4300 cm⁻¹ range. This indicates that the filter has removed some of the spectral variation associated with the measurement of the three samples across a 99-day time interval. In addition, even though the 4700 to 4450 cm⁻¹ region still shows six groups of spectra, variation within the groups of replicates is much reduced relative to the unfiltered spectra displayed in Fig. 3C. Sharper and therefore potentially more selective spectral features are also apparent in the filtered spectra in Figs. 3B and 3D. For the prediction set, SEP values obtained with the filtered spectra were less than 1.2 mM, even without calibration updating. These results illustrate the potential utility of filtering the data prior to the application of PLS regression.

FIG. 3. (A) First three triplicates (nine spectra) of calibration day 2 and first nine spectra of calibration day 101 (18 total) are plotted for log/I spectra. These represent the same samples collected 100 days apart from each other. (B) Filtered spectra using the optimal bandpass filter (order = 2, pass-band = 0.16, stop-band = 0.74) obtained using day 2 for calibration and day 101 for prediction. (C) Unfiltered log/I spectra from (A) after mean centering. (D) Filtered spectra from (B) after mean centering.

APPLIED SPECTROSCOPY 251

A goal of the initial studies was to investigate whether the optimal value of SEPBG obtained in the filter search could be used to estimate SEP for the prediction samples. If successful, this would allow a projection to be made regarding the prediction performance and allow a decision to be made regarding the need for calibration updating.

Examinations of correlations between SEPBG values and the corresponding values of SEP for the prediction sets revealed several trends. For each of the eight calibration/prediction subsets, Fig. 4 plots the optimal SEPBG (gray bars) and the corresponding SEP (black bars) for the results of bandpass filtering. No calibration updating was performed. On average, the SEPBG values are approximately 0.2 mM lower than the SEP values. This suggests that if 1.5 mM is judged to be an acceptable performance level for the SEP, a value of SEPBG of 1.3 mM must be obtained. In principle, this cutoff value of SEPBG could be used to signal when filtering alone is expected to be sufficient to ensure a good prediction performance.

Inspections of the pass-band specifications of individual filters and their relationships to the SEPBG and SEP values also revealed that in some cases, low pass-band cutoffs could lead to occurrences of good values of SEPBG but not correspondingly good values of SEP. This illustrates the possibility of a case in which a filter is obtained that works well with the background data but not with data in which the glucose feature is present. It was found that these occurrences could be minimized, however, by restricting the filter searches to pass-band cutoffs of 0.4 or higher.

Results with Long-Term Prediction Data. Results are presented in terms of the ability to provide an SEP of 1.5 mM or lower for the prediction sets collected over time. In this study, the IIR filter search described above was implemented and the filter settings occurring at the minimum SEPBG were considered to be optimal. If the value of SEPBG did not meet the 1.3 mM threshold, the prediction was anticipated to be unsuccessful (SEP = 1.5 mM or greater).

Model updating was also applied in conjunction with filtering. When updating was used, SEPBG will be termed SEPBGU. Results for filter searches with and without updating will be presented as a general approach to improving the results of long-term predictions. In this study, a false positive will be defined as the SEPBG or SEPBGU being above the threshold when the final SEP was below 1.5 mM and a false negative will mean the SEPBG or SEPBGU was below the threshold when the SEP was greater than or equal to 1.5 mM.
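These definitions can be written as a small decision rule. The sketch below follows the thresholds given in the text (1.3 mM for the background diagnostic and 1.5 mM for the SEP); the treatment of a diagnostic value exactly equal to the threshold as "not above" is an assumption, and the function name `classify` and the SEP values in the usage lines are hypothetical.

```python
def classify(sep_bg, sep, bg_threshold=1.3, sep_threshold=1.5):
    """Classify a diagnostic outcome from SEPBG (or SEPBGU) and the final SEP."""
    flagged = sep_bg > bg_threshold   # diagnostic anticipates a poor prediction
    failed = sep >= sep_threshold     # prediction actually was poor
    if flagged and failed:
        return "true positive"
    if flagged and not failed:
        return "false positive"
    if not flagged and failed:
        return "false negative"
    return "true negative"


# Diagnostic below threshold but poor SEP (SEP value is illustrative only).
case_a = classify(1.20, 1.6)
# Diagnostic just above threshold but acceptable SEP (SEP value illustrative).
case_b = classify(1.31, 1.2)
```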

Figure 5 shows the SEP values obtained when optimized bandpass filters were applied to each prediction day. The calibration set was not updated in obtaining these results. Prediction results using calibration set A are plotted in black (the first bar of the pair) and results for calibration set B are plotted in gray (the second bar). Solid colors indicate "true negatives", where the SEPBG was below the 1.30 mM threshold and the corresponding SEP was less than 1.5 mM. Stripes indicate "true positives", where the SEPBG exceeded the 1.30 mM threshold and correctly identified a poor prediction result (SEP > 1.5 mM). Crosses indicate false positives or false negatives, where the SEPBG value did not correctly predict the results of SEP.

Optimal filter orders were 2 or 4, the mean ± standard deviation of the pass-band values was 0.24 ± 0.14 frequency units, and the mean ± standard deviation of the stop-band was 0.76 ± 0.21 frequency units. A false negative was observed for day 143 with calibration set A, where SEPBG was 1.20 mM. Overall, the results in Fig. 5 are improved compared to the non-filtered results of Fig. 2, but filtering alone did not achieve the desired results in many cases.

FIG. 4. Results for each pair of calibration and prediction days for the bandpass filter searches without calibration updating. The x-axis gives the day used for calibration followed by the day used for prediction in parentheses. The lowest value of SEPBG (gray) is plotted along with the final SEP result (black) after the filter settings were applied.

Figure 6 shows the results using calibration updating in conjunction with the restricted filter search (0.4 and greater frequency units for the pass-band cutoff). As in Fig. 5, the first bars are for calibration set A (black) and the second are for calibration set B (gray). All final SEP values were below 1.5 mM. One false positive occurred for day 136 using calibration set B, where SEPBGU was 1.31 mM (just above the threshold).

When the results of the two calibrations are treated collectively, one false positive out of 34 is a 3% incident rate. As shown in Fig. 7, updating in conjunction with the filter search gave much superior results to updating alone, which produced final SEP values that exceeded 1.5 mM for a number of different days. For the bandpass filters used in conjunction with updating, all optimized filter orders were 2, the mean ± standard deviation of the pass-band cutoff was 0.45 ± 0.05 frequency units, and the mean ± standard deviation of the stop-band cutoff was 0.77 ± 0.11 frequency units.

Efficiency of Updating Procedure. The time required to perform the calibration updating consists of the data acquisition time for the background spectra, plus the time devoted to the filter optimization. In this work, collection of the background spectra was performed in an automated manner at the start of the day during the instrumental warm-up period. As noted previously, acquisition of the background spectra required 20–30 minutes. During this time, the instrument would normally be idle. In a practical implementation of the methodology, the filter optimization procedure would be performed immediately after acquisition of the background data and could also be completely automated.

Software timing experiments were performed with a standard desktop computer (described in the Experimental section) to provide an estimate of the time required to obtain the optimal updated calibration model. With the restricted filter search described previously, a total of 712 Butterworth filters were investigated in conjunction with calibration updating. Using a static spectral range of 4700–4300 cm⁻¹, PLS models assessed with leave-10%-out cross-validation, and an evaluation of 1–20 latent variables, 3.9 ± 0.1 minutes were required to perform the entire optimization procedure. This average and associated standard deviation were computed from three separate executions of the optimization.

Software profiling revealed that the generation and application of the filters accounted for only about 19% of the total execution time. The bulk of the computational time was devoted to the cross-validation/PLS computations. As such, the total execution time was heavily dependent on the number of calibration spectra (210 sample spectra + 20 augmented backgrounds), spectral dimensionality (27 points in the 4700–4300 cm⁻¹ range) and number of latent variables studied (20). For another application, execution times would increase or decrease as these values changed.

CONCLUSION

In this study, a method of calibration updating was used to improve prediction results of unknown samples collected far outside the timeframe of the calibration data. Long-term predictions were improved in this study using IIR filtering or a combination of model updating and IIR filtering. The method required collection of a set of matrix backgrounds on each prediction day. In addition to being used to update the calibration model, the backgrounds were treated as a monitoring set for filter optimization as well as for estimation of the final prediction error of the glucose samples.

FIG. 5. Results of filtering alone without calibration updating for bandpass filter searches using the filter producing the lowest value of SEPBG as the optimal filter for each prediction day. Successful results for calibration sets A and B are plotted in black and gray, respectively. Stripes are used to denote true positives or negatives, while crosses denote false positives or negatives (defined in text). Slanted stripes and crosses are used for calibration set A, while horizontal stripes and crosses denote calibration set B. Filtering greatly improved the results compared to non-filtered predictions (Fig. 2), but many final SEP values were above the acceptable level of 1.5 mM.

Final results were best when calibration updating was combined with filter optimization. Model updating consisted of adding a subset of backgrounds to the calibration set. The remaining backgrounds were used to guide the filter optimization. It was also found useful to confine the filter search to a limited range of pass-band frequencies in order to minimize bias in SEPBGU, the diagnostic computed to provide an estimate of the prediction error. In order to create the most challenging prediction set, backgrounds clustered closest to the mean of the set of background data were added to the calibration, while the remaining backgrounds were used for prediction.

FIG. 6. Final results of the updated calibrations combined with filter searches that were restricted to pass-band cutoff frequencies of 0.4 or higher for the bandpass filter optimizations. Symbols are identical to those in Fig. 5. An acceptable level of performance (SEP < 1.5 mM) was found in all instances, although one false positive was found.

FIG. 7. Final prediction results for (A) calibration set A and (B) calibration set B when model updating was performed without filtering (black) and when updating was performed in conjunction with bandpass restrictive filtering (gray). The combination of filtering with updating provides superior results to updating alone.

In the final results, IIR filtering alone produced many successful prediction results for days far outside the calibration, but also many instances where the SEP was above 1.5 mM. In these instances, SEPBG usually exceeded the 1.30 mM threshold, resulting in only one false negative. When updating was combined with filter searching, all final glucose SEP values were below 1.5 mM with only one false positive observed for the SEPBGU diagnostic over all studies.

In this work, calibration updating and estimation of prediction performance were found to be successful using a single matrix background sample that did not contain the analyte in question. Successful calibrations were maintained over 178 days. In addition, the 50 background spectra were collected in an automated manner during the instrumental warm-up period at the start of the day.

To put this work in context, it is useful to contrast it to the ACLS procedures developed by Wehlburg, Haaland, and co-workers⁴⁰ in which one repeat calibration sample was measured after every two calibration or prediction samples. In their work, maintenance of the calibration was investigated over 29 days. Long-term drift effects were modeled by differences in the spectra of the repeat sample between calibration and prediction days, while short-term drift effects were encoded by differences in the spectra of the repeat sample within the prediction day. These difference spectra were used to update the ACLS model.

While the Wehlburg method has the theoretical advantage of explicitly handling short-term drift effects occurring during the prediction day, it has the disadvantage that more user interaction is required to collect the spectra used in updating the model. Also, the repeat sample has to be maintained over time with a constant analyte concentration. A key advantage to our approach is that the use of an analyte-free matrix background sample eliminates the need to maintain a static repeat sample and eliminates worry over whether that sample has changed over time. In addition, as implemented here, the automated collection of a large group of backgrounds also provides data for the computation of the SEPBGU diagnostic that serves to estimate prediction performance. The Wehlburg method provides no comparable diagnostic.

ACKNOWLEDGMENTS

This research was supported entirely by the National Institutes of Health under grants DK67445 and DK60657.

1. D. A. Burns and E. W. Ciurczak, Handbook of Near-Infrared Analysis, Second Edition, Revised and Expanded (Marcel Dekker, Inc., New York, 2001).
2. D. M. Haaland and E. V. Thomas, Anal. Chem. 60, 1193 (1988).
3. R. G. Brereton, Analyst (Cambridge, U.K.) 125, 2125 (2000).
4. H. Martens and T. Næs, Multivariate Calibration (Wiley, New York, 1989).
5. M. A. Arnold and G. W. Small, Anal. Chem. 62, 1457 (1990).
6. L. A. Marquardt, M. A. Arnold, and G. W. Small, Anal. Chem. 65, 22, 3271 (1993).
7. K. H. Hazen, M. A. Arnold, and G. W. Small, Appl. Spectrosc. 48, 477 (1994).
8. L. Zhang, G. W. Small, and M. A. Arnold, Anal. Chem. 74, 4097 (2002).
9. L. Zhang, G. W. Small, and M. A. Arnold, Anal. Chem. 75, 5905 (2003).
10. Y. Wang, D. J. Veltkamp, and B. R. Kowalski, Anal. Chem. 63, 2750 (1991).
11. O. E. de Noord, Chemom. Intell. Lab. Syst. 25, 85 (1994).
12. E. Bouveresse and D. L. Massart, Vib. Spectrosc. 11, 3 (1996).
13. R. N. Feudale, N. A. Woody, H. Tan, A. J. Myles, S. D. Brown, and J. Ferre, Chemom. Intell. Lab. Syst. 64, 181 (2002).
14. K. Helland, H. E. Berntsen, O. S. Borgen, and H. Martens, Chemom. Intell. Lab. Syst. 14, 129 (1991).
15. D. M. Haaland and D. K. Melgaard, Vib. Spectrosc. 29, 171 (2002).
16. C. L. Stork and B. R. Kowalski, Chemom. Intell. Lab. Syst. 48, 151 (1999).
17. X. Capron, B. Walczak, O. E. de Noord, and D. L. Massart, Chemom. Intell. Lab. Syst. 76, 205 (2005).
18. K. E. Kramer and G. W. Small, Appl. Spectrosc. 61, 497 (2007).
19. G. Horlick, Anal. Chem. 44, 943 (1972).
20. R. E. Shaffer, G. W. Small, and M. A. Arnold, Anal. Chem. 68, 2663 (1996).
21. G. W. Small, M. A. Arnold, and L. A. Marquardt, Anal. Chem. 65, 3279 (1993).
22. M. J. Mattu, G. W. Small, and M. A. Arnold, Anal. Chem. 69, 4695 (1997).
23. N. A. Cingo, G. W. Small, and M. A. Arnold, Vib. Spectrosc. 23, 103 (2000).
24. M. J. Wabomba, G. W. Small, and M. A. Arnold, Anal. Chim. Acta 490, 325 (2003).
25. Y. Liang, K. Fang, and Q. Xu, Chemom. Intell. Lab. Syst. 58, 43 (2001).
26. K. E. Kramer and G. W. Small, J. Near Infrared Spectrosc. 14, 291 (2006).
27. Q. Ding, G. W. Small, and M. A. Arnold, Appl. Spectrosc. 53, 402 (1999).
28. G. Lu, X. Zhou, M. A. Arnold, and G. A. Small, Appl. Spectrosc. 51, 1330 (1997).
29. S. Wold, K. Esbensen, and P. Geladi, Chemom. Intell. Lab. Syst. 2, 37 (1987).
30. A. S. Bangalore, R. E. Shaffer, G. W. Small, and M. A. Arnold, Anal. Chem. 68, 4200 (1996).
31. S. T. Pan, H. Chung, M. A. Arnold, and G. W. Small, Anal. Chem. 68, 1124 (1996).
32. Q. Ding, G. W. Small, and M. A. Arnold, Anal. Chem. 70, 4472 (1998).
33. S. W. Smith, The Scientist and Engineer's Guide to Digital Signal Processing (California Technical Publishing, San Diego, 1997).
34. C. Chen, Digital Signal Processing: Spectral Computation and Filter Design (Oxford University Press, New York, 2001).
35. S. M. Kuo and W. Gan, Digital Signal Processors: Architectures, Implementations, and Applications (Prentice-Hall, Inc., Upper Saddle River, NJ, 2005).
36. D. M. Haaland and D. K. Melgaard, Appl. Spectrosc. 54, 1303 (2000).
37. D. M. Haaland and D. K. Melgaard, Appl. Spectrosc. 55, 1 (2001).
38. D. K. Melgaard, D. M. Haaland, and C. M. Wehlburg, Appl. Spectrosc. 56, 615 (2002).
39. C. M. Wehlburg, D. M. Haaland, and D. K. Melgaard, Appl. Spectrosc. 56, 877 (2002).
40. C. M. Wehlburg, D. M. Haaland, D. K. Melgaard, and L. E. Martin, Appl. Spectrosc. 56, 605 (2002).
