

ANALYTICA CHIMICA ACTA

ELSEVIER Analytica Chimica Acta 314 (1995) 131-139

Eigenstructure tracking analysis for assessment of peak purity in high-performance liquid chromatography with diode array detection

F. Cuesta Sánchez a, J. Toft a, O.M. Kvalheim b, D.L. Massart a,*

a ChemoAC, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium

b Department of Chemistry, University of Bergen, N-5007 Bergen, Norway

Received 18 April 1995; accepted 17 May 1995

Abstract

The application of eigenstructure tracking analysis (ETA) for the detection of an impurity under a chromatographic peak is discussed. A window of size three seems to be the most adequate for this problem. Some guidelines for interpretation of the ETA plots are given. A new normalization of spectra is proposed to remove heteroscedastic noise. The effect of this new normalization is compared with other data pretreatments. The results obtained are compared with the performance of methods such as fixed size window evolving factor analysis (FSW EFA), the methods based on the Gram-Schmidt orthogonalization and SIMPLISMA.

Keywords: Liquid chromatography; Principal component analysis; Peak purity; Eigenstructure tracking analysis; Spectrum normalization

1. Introduction

In some fields, e.g., pharmacy, it is crucial to know whether a compound is pure or not. For this purpose, many approaches have been proposed in recent years. An important step forward was the introduction of the hyphenated techniques and the corresponding development of multivariate methods for data analysis. Among them, evolving factor analysis (EFA) [1,2] is probably the best known and most widely used. EFA searches for systematic variation in the data by analyzing an increasing number of consecutive spectra by singular value decomposition (SVD). Keller and Massart [3] proposed a modification of EFA, called fixed size window evolving factor analysis (FSW EFA). In this approach a moving "window" is used, embracing a fixed number of consecutive spectra to be analyzed by SVD. Another window approach, known as eigenstructure tracking analysis (ETA) [4,5], which systematically increases the size of the moving window between repeated runs, has also been proposed. The procedure starts by analyzing windows of size 2, i.e., analyzing two consecutive spectra at a time throughout the data matrix. In the second run the size of the moving window is increased by one, so that three consecutive spectra are analyzed. The maximum window size needed exceeds the maximum number of compounds present by one. Therefore, if two compounds are present, a window containing three spectra is sufficient. In all cases the singular values of each window are determined and plotted as a function of time.

* Corresponding author.

0003-2670/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved. SSDI 0003-2670(95)00280-4

A comparison of methods for the assessment of peak purity in chromatography is presently being carried out. The goal is to find the most sensitive method for the detection of an impurity under a chromatographic peak when both the resolution and the concentration of the impurity are rather low, e.g., R_s = 0.5 and 0.5% of impurity. For this purpose different methods, i.e., FSW EFA [3], methods based on the Gram-Schmidt orthogonalization [6,7], PCA after different pretreatments [8], and SIMPLISMA [9-11], have been applied to the high-performance liquid chromatography with diode array detection (HPLC-DAD) data published by Keller et al. [12]. In this paper the application of ETA to this HPLC-DAD data set is discussed. The results are compared with those obtained by the other methods.

One of the problems affecting the evolving singular value plots of the EFA based methods is the presence of heteroscedastic noise in the data [13,14]. Several pretreatments have been proposed to transform heteroscedastic noise into homoscedastic noise. In this paper we discuss selective normalization as proposed by Liang et al. [13], together with a new normalization which is a combination of the latter and the one proposed by Windig [9,10]. Some guidelines for the interpretation of the ETA plots are given.

2. Theory

HPLC-DAD yields a data matrix X (m × n) where the m rows represent spectra measured at the different time intervals and the n columns are chromatograms measured at the different wavelengths. Starting from the first spectrum, a window containing p consecutive spectra is moved along the data matrix. If, e.g., p is equal to 2, the first window contains the first two spectra, the second window contains spectra 2 and 3, and so on. Each window is decomposed by means of SVD and the logarithm of the p singular values is plotted as a function of the window number.
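The moving-window decomposition described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `eta_singular_values`, the use of base-10 logarithms and the clipping of exact zeros before taking the logarithm are assumptions made here.

```python
import numpy as np

def eta_singular_values(X, p=3):
    """Return log10 singular values for each window of p consecutive spectra.

    X is an (m x n) matrix with spectra as rows; the result has one row per
    window (m - p + 1 windows) and one column per singular value (p columns).
    The name, the log base and the zero clipping are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    if not (2 <= p <= m and p <= n):
        raise ValueError("window size p must satisfy 2 <= p <= min(m, n)")
    log_sv = np.empty((m - p + 1, p))
    for w in range(m - p + 1):
        # Singular values of the window made of spectra w, w+1, ..., w+p-1.
        s = np.linalg.svd(X[w:w + p], compute_uv=False)
        # Clip exact zeros so that the logarithm stays finite.
        log_sv[w] = np.log10(np.maximum(s, np.finfo(float).tiny))
    return log_sv

# Usage on a noise-free, single-compound (rank-one) matrix: only the first
# singular value carries signal in every window.
profile = np.exp(-0.5 * ((np.arange(20) - 10) / 3.0) ** 2)
spectrum = np.array([1.0, 0.5, 0.2, 0.1])
log_sv = eta_singular_values(np.outer(profile, spectrum), p=3)
```

Plotting each column of `log_sv` against the window index gives one curve per singular value, which is the kind of plot discussed in the rest of the paper.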

To explain the procedure, a system is simulated in which two compounds elute (for further details see Section 3). For graphical illustration only, the data matrix is reduced to two wavelengths, 240 and 260 nm, which correspond to the maximum absorption of the first and second compound, respectively. Initially one starts with a window of size p = 2, i.e., matrices whose rows are two consecutive spectra are decomposed by SVD. The two singular values (or their logarithms) determined for each window are plotted as a function of the window number. In Fig. 1a the absorbance at 240 nm is plotted versus the absorbance at 260 nm, together with the two PCs determined for the window defined by the spectra at times 21 and 22. The score of, e.g., spectrum 21 on the first PC (PC1) is the distance from the origin of the variable space to the orthogonal projection of the object vector onto PC1. The score of each spectrum on PC1 is equal to the cosine of the angle between the spectrum considered and PC1 multiplied by the length of the object vector representing the spectrum. Similarly, the score of spectrum 21 on the second PC (PC2) is the distance from the origin to its orthogonal projection onto PC2. The score of each spectrum on PC2 is equal to the sine of the angle between the spectrum considered and PC1 multiplied by the length of the vector representing the spectrum. Spectra 21 and 22 correspond to the elution of the first pure compound. Therefore, in the variable space, they are situated on a straight line passing through the origin which, theoretically, coincides with the direction of PC1 of this window. In practice, they will not be situated exactly in the same direction because of the noise. The angle between, e.g., spectrum 21 and the corresponding PC1 is close to zero (its cosine is approximately 1). Therefore, the score on PC1 is practically equal to the Euclidean distance between the origin and the point representing spectrum 21, while the score on PC2 is virtually zero. The first singular value for the window defined by spectra 21 and 22 is equal to the square root of the sum of the squared scores of spectra 21 and 22 on the first PC.

Thus, the singular value is a measure of variation in the data around the origin of the variable space. Only when the data are column centered is the singular value related to the usual definition of variance. In Fig. 1b the two PCs obtained for the window defined by spectra 31 and 32 are shown. These spectra are mixture spectra. The overlapping region corresponds to the time where the first compound finishes eluting and the second one starts to elute, and the absorbance passes through a minimum. One also observes that the singular value for PC2 is no longer zero. The first singular value is a measure of the similarity of the spectra, while the second is a measure of the dissimilarity between the spectra and PC1. The same explanation can be applied to windows of size 3, 4, ....

Fig. 1. Simulated data: (a) absorbance at 240 nm versus absorbance at 260 nm, together with the first two PCs for the window containing spectra 21 and 22 and their corresponding scores for the first PC (arrows); (b) absorbance at 240 nm versus absorbance at 260 nm, together with the first two PCs for the window containing spectra 31 and 32 and their corresponding scores for both PCs (arrows).
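The geometric relations used in this explanation (score = length × cosine or sine of the angle with PC1, singular value = square root of the sum of squared scores) can be checked numerically. The sketch below uses two made-up two-wavelength spectra; the numbers and the helper name `window_scores` are hypothetical and serve only as an illustration.

```python
import numpy as np

def window_scores(window):
    """Scores, lengths and cosines with PC1 for the spectra (rows) of a window.

    Hypothetical helper for illustration; no column centering is applied,
    so the PCs describe variation around the origin.
    """
    window = np.asarray(window, dtype=float)
    _, s, vt = np.linalg.svd(window, full_matrices=False)
    scores = window @ vt.T                      # projections onto PC1, PC2, ...
    lengths = np.linalg.norm(window, axis=1)    # length of each object vector
    cos_pc1 = np.abs(scores[:, 0]) / lengths    # |cos| of the angle with PC1
    return s, scores, lengths, cos_pc1

# Two nearly collinear spectra, mimicking a pure-compound window
# (absorbance values are invented).
window = np.array([[0.100, 0.050],
                   [0.120, 0.061]])
s, scores, lengths, cos_pc1 = window_scores(window)

# Each singular value equals the root of the summed squared scores on its PC.
sv_from_scores = np.sqrt((scores ** 2).sum(axis=0))
```

For such a window `cos_pc1` is very close to 1, the scores on PC1 are practically the vector lengths, and the second singular value is small compared with the first.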

In the specific case that we are dealing with, i.e., the assessment of peak purity, a window of three spectra is the most adequate choice. The first singular value contains the overall information of the system. The second provides additional information on the chromatographic overlap of an impurity with a spectrum different from that of the main compound. As we will show later, the third singular value provides information about the noise structure.

As pointed out by Liang et al. [13] and Keller et al. [14], one of the factors influencing the detection of a minor compound is the presence of heteroscedastic noise. To deal with this problem, different approaches for data pretreatment are tried out.

(a) No pretreatment; the data are left unchanged:

$$y_{ij} = x_{ij} \quad \text{for } i = 1, \ldots, m \text{ and } j = 1, \ldots, n \qquad (1)$$

This is useful to study the structure of the noise.

(b) Selective spectrum normalization to constant sum [13]. The spectra with large absorbance are normalized according to

$$y_{ij} = \frac{x_{ij}}{\sum_{k=1}^{n} x_{ik}} \quad \text{for } i = 1, \ldots, m \text{ and } j = 1, \ldots, n \qquad (2)$$

if

$$\sum_{k=1}^{n} x_{ik} > z \qquad (3)$$

where z is a threshold to be fixed.

(c) Spectrum normalization with an offset. The normalization proposed here is a combination of the selective normalization explained above and Windig's normalization [9,10]. Each spectrum is normalized by dividing each intensity by the sum of all the intensities in the spectrum plus an offset:

$$y_{ij} = \frac{x_{ij}}{\sum_{k=1}^{n} x_{ik} + \text{offset}} \quad \text{for } i = 1, \ldots, m \text{ and } j = 1, \ldots, n \qquad (4)$$

The offset is a fraction of the maximum of the sums $\sum_{k=1}^{n} x_{ik}$ over $i = 1, \ldots, m$. The spectra with large absorbance are normalized to a sum close to 1, and the spectra with low absorbance to a sum close to 0.
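The two normalizations can be written compactly with NumPy. The following is a minimal sketch under the definitions given above; the function names, the argument name `fraction` and the small example matrix are assumptions introduced here for illustration, and no special handling of all-zero data is included.

```python
import numpy as np

def normalize_selective(X, z):
    """Selective normalization to constant sum: rows whose sum exceeds the
    threshold z are divided by their sum, all other rows are left unchanged."""
    X = np.asarray(X, dtype=float)
    sums = X.sum(axis=1)
    Y = X.copy()
    large = sums > z
    Y[large] = X[large] / sums[large, None]
    return Y

def normalize_with_offset(X, fraction):
    """Normalization with an offset: every row is divided by (row sum + offset),
    where the offset is the given fraction of the largest row sum."""
    X = np.asarray(X, dtype=float)
    sums = X.sum(axis=1)
    offset = fraction * sums.max()
    return X / (sums + offset)[:, None]

# Invented example: one "large" spectrum and one noise-like spectrum.
X = np.array([[0.50, 0.70],
              [0.01, 0.02]])
Y_sel = normalize_selective(X, z=1.0)
Y_off = normalize_with_offset(X, fraction=0.2)
```

In this toy example the large spectrum is scaled to sum 1 by the selective normalization while the small one is untouched; with the offset, the large spectrum ends up with a sum of about 0.83 and the small one with a sum of about 0.11.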

3. Data

3.1. Simulated data

An LC-DAD data matrix was simulated with two coeluting compounds. The data matrix consists of spectra recorded at 70 analysis times and 46 wavelengths. The chromatographic separation quality (resolution) between the two analytes is equal to 1, while the ratio of concentrations is 3/4. Concentration profiles and spectra were simulated with Gaussian functions. Homoscedastic, normally distributed noise with a standard deviation equal to 0.5% of the average absorbance was added.
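A data set of this kind could be generated along the following lines. The sketch only reuses what the text states (70 times, 46 wavelengths, Gaussian profiles and spectra, resolution 1, concentration ratio 3/4, noise at 0.5% of the average absorbance, spectral maxima at 240 and 260 nm). Everything else is an assumption made here: the wavelength grid, the peak positions and widths, which compound is the less concentrated one, and the use of R_s = Δt/(4σ) for two equally wide Gaussian peaks.

```python
import numpy as np

def simulate_lc_dad(resolution=1.0, conc_ratio=0.75, noise_fraction=0.005, seed=0):
    """Simulate a two-compound LC-DAD matrix (70 times x 46 wavelengths).

    Returns the noisy matrix and the noise-free matrix. Grid, peak positions
    and peak widths are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    times = np.arange(1, 71)                      # 70 analysis times
    wavelengths = np.linspace(220.0, 310.0, 46)   # assumed 2 nm grid

    sigma_t = 4.0                                 # assumed peak width
    t1 = 24.0                                     # assumed first retention time
    t2 = t1 + 4.0 * sigma_t * resolution          # R_s = (t2 - t1) / (4 sigma)
    c1 = conc_ratio * np.exp(-0.5 * ((times - t1) / sigma_t) ** 2)
    c2 = np.exp(-0.5 * ((times - t2) / sigma_t) ** 2)

    sigma_w = 15.0                                # assumed spectral band width
    s1 = np.exp(-0.5 * ((wavelengths - 240.0) / sigma_w) ** 2)
    s2 = np.exp(-0.5 * ((wavelengths - 260.0) / sigma_w) ** 2)

    clean = np.column_stack([c1, c2]) @ np.vstack([s1, s2])
    # Homoscedastic noise: one standard deviation for the whole matrix.
    noise = rng.normal(0.0, noise_fraction * clean.mean(), size=clean.shape)
    return clean + noise, clean

X_sim, X_clean = simulate_lc_dad()
```

Because the noise standard deviation is the same for every element, this matrix has no heteroscedastic component; it is meant only for illustrating the geometry, as in Fig. 1.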

3.2. Experimental data

The performance of ETA was studied on an HPLC-DAD data set published by Keller et al. [12]. The samples consisted of hydrocortisone and different concentrations of prednisone. The latter was considered as the impurity and its relative concentration ranged from 0.1 to 100%. The chromatographic separation quality between the two analytes was varied from R_s = 0.1 to 1.

4. Results and discussion

The first step that should be performed before the application of any chemometrical tool, including ETA, is a visual inspection of the raw data. For instance, this can reveal the shape of the baseline, which helps in interpreting the results obtained.

To interpret the plots obtained in the ETA procedure for the assessment of peak purity, one starts by plotting the first singular values versus the window number. The flow chart shown in Fig. 2 is then followed:

Fig. 2. Scheme for the interpretation of the ETA plots (for further details see text). [Flow chart; legible box labels: "Two compounds", "The 2nd SV plot shows the overlapping region", "One compound".]

(i) If the shape is unimodal there are two possibilities: (a) the peak is pure, or (b) the minor compound is embedded under the main peak. In the latter case the minor peak does not have selective chromatographic regions and the second singular value indicates the location of the minor compound. The presence of a second compound should be confirmed by inspection of the plots of the second and third singular values. If the second singular value plot shows the same trend as the third singular value plot, only one compound is present. Otherwise, if the second singular value plot shows a peak over the noise level that is not present in the third singular value plot, two compounds are coeluting.

(ii) If the shape is not unimodal but shows a second maximum, two rather well separated compounds (R_s ≈ 0.5-1, depending on the concentration of the minor compound) are present. The second singular value indicates the overlapping region.

In Fig. 3 the singular value plots for the raw data of pure hydrocortisone are shown, together with a line indicating the noise level. The noise level is determined by the highest second singular value in the zero-component regions. If only one compound is eluting, the first singular value depends essentially on the length of the spectra belonging to each window: the larger the length of the spectra, the higher the first singular value. The first singular value plot shows the elution of the hydrocortisone. The plot is unimodal. In the zero-component region, i.e., the time interval where no compound is eluting, the spectra are located at different angular distances and so is PC1 of each window. Their scores on PC1, and the corresponding singular values, are small because of the small length of the spectra. The second singular values of the spectra in the zero-component regions are smaller than the first singular values because the cosine of the angle between the spectra and the corresponding PC1 is larger than the sine. In theory, the second singular values for the windows with the spectra corresponding to the elution of the compound should be of the same order as the second singular values of the spectra in the zero-component regions, because the larger length of the spectra is compensated by the small value of the sine of the angle between the spectra and PC1.

The second singular value plot for pure hydrocortisone (Fig. 3) nevertheless shows a peak over the noise level at the same position as the first singular value plot. This means that the angle of the spectra with respect to PC1 and/or the length of the spectra is higher than it should be. This can be due to the presence of another compound or to heteroscedastic noise. If it is due to heteroscedastic noise, the singular value is directly related to the absorbance; therefore, the peak maximum in the second singular value plot will be located at the same position as the peak maximum in the first singular value plot and the peak will be symmetric. The same effect shows up in the third singular value. The comparison of the second and third singular values allows a decision to be made. Two factors should be considered: (a) the location and symmetry of the peak, and (b) the similarity with the third singular value plot. If the peak of the second evolving singular values is located at the same position as that of the first singular value, there is a high probability that it is due to heteroscedastic noise. If, moreover, the peak is symmetric and the third singular value plot also shows a peak at the same position, the presence of heteroscedastic noise is confirmed. Both conditions are fulfilled by the peak of the second singular value in Fig. 3. Thus, only one compound is eluting.

Fig. 3. ETA plot for the raw data of pure hydrocortisone. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.
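The noise level defined above (the highest second singular value in the zero-component regions) is simple to compute once the zero-component windows have been marked. In the sketch below the function names and the example values are invented, and the choice of which windows count as zero-component is left to the user; the text does not prescribe how to select them.

```python
import numpy as np

def noise_level(log_sv, zero_component_windows):
    """Highest second singular value among the windows marked as zero-component.

    log_sv has one row per window and one column per singular value;
    zero_component_windows is a list of window indices chosen by the user.
    """
    log_sv = np.asarray(log_sv, dtype=float)
    return log_sv[zero_component_windows, 1].max()

def windows_above_noise(log_sv, level, which=1):
    """Indices of the windows whose singular value number `which` (0-based)
    exceeds the given noise level."""
    return np.flatnonzero(np.asarray(log_sv, dtype=float)[:, which] > level)

# Invented log singular values for 10 windows (columns: 1st, 2nd, 3rd).
log_sv = np.array([
    [-2.9, -3.20, -3.40], [-2.8, -3.10, -3.50], [-2.5, -3.30, -3.40],
    [-1.5, -2.50, -3.30], [-1.0, -2.00, -3.35], [-1.4, -2.40, -3.30],
    [-2.4, -3.20, -3.45], [-2.8, -3.15, -3.40], [-2.9, -3.30, -3.50],
    [-3.0, -3.25, -3.45],
])
level = noise_level(log_sv, zero_component_windows=[0, 1, 2, 7, 8, 9])
second_sv_peak = windows_above_noise(log_sv, level, which=1)
third_sv_peak = windows_above_noise(log_sv, level, which=2)
```

Since the logarithm is monotonic, the same comparison holds on raw or log-transformed singular values. Whether a peak found this way reflects an impurity or heteroscedastic noise still has to be judged from its position, its symmetry and the third singular value plot, as described in the text.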

To illustrate the effect of a completely overlapped impurity on the second singular value plot, the ETA plot for R_s = 0.3 and 5% of impurity is shown in Fig. 4. As before (Fig. 3), the first singular value plot is unimodal. The second singular value plot shows a peak over the noise limit. The peak follows the same trend as the first singular value, but the peak maximum is not exactly at the same position, and the peak is not symmetric. The third singular value plot also shows a peak over the noise limit, but it is lower than that of the second singular value. Moreover, the peak is located under the maximum of the first singular value plot and is symmetric. The peak in the second singular value plot is due to the minor compound and to heteroscedastic noise, while the peak in the third singular value plot is only due to heteroscedastic noise.

Fig. 4. ETA plot for the raw data with R_s = 0.3 and 5% of impurity. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

Fig. 5. ETA plot for the raw data with R_s = 1 and 0.5% of impurity. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

When the chromatographic separation between the main compound and the impurity is large (R_s ≈ 1-0.5), the impurity can be detected in the first singular value plot. The ETA plot for R_s = 1 and 0.5% of impurity is shown in Fig. 5. Initially the minor compound starts to elute. All the spectra are situated on a straight line, which coincides with the direction of PC1. As explained above, the first singular value of each window depends on the length of the object vectors corresponding to the spectra belonging to that window. When the main compound starts to elute, each spectrum will be situated at a different angular distance and so will the PC1 of each window. The cosines of the angles between the spectra and PC1 will be smaller, since the angles are larger, and the first singular values of the windows containing mixture spectra will also become smaller. When only the main compound is left, the first singular value depends on the length of the object vectors representing the spectra of the main compound. In the selective regions the second singular values are similar to those in the zero-component regions, since the increasing length of the spectra is compensated by the decreasing value of the sine of the angle of each spectrum with PC1. In the mixture region, the sines of the angles of the spectra with PC1 are higher and so are the second singular values. Therefore, if two rather well chromatographically separated compounds elute, the second singular value plot indicates the overlapped region. The second singular values (Fig. 5) also give a peak with a maximum around window 30. This peak shows the same pattern as the first singular values, is symmetric and is also present in the third singular value plot, so it indicates heteroscedastic noise.

The presence of heteroscedastic noise makes the detection of a minor compound more difficult, especially when the relative concentration of the minor compound is low and accompanied by low resolution. Liang et al. [13] proposed to normalize the spectra with large absorbance to constant sum to remove the effect of heteroscedastic noise. In this case we have fixed z equal to 1 (Eq. 3). As the data matrix consists of 46 wavelengths, this means that the spectra with a mean absorbance higher than 0.022 (1/46) are normalized. The normalization only modifies the length of the spectra. All the normalized spectra have similar, but not identical, lengths, since they are normalized to constant sum rather than to constant length. As a consequence, the first singular values of the windows of the normalized spectra have practically the same value. The ETA plot for pure hydrocortisone is shown in Fig. 6. The main difference introduced by the normalization is found in the second singular value plot. After normalization the second singular value indicates the angle between each spectrum and the corresponding PC1. Around the retention time of the hydrocortisone (time 40) the direction of the spectra is less influenced by the noise, and the angular distance of the spectra with respect to PC1 is lower. The third singular values show the same trend as the second, which indicates that the peak is pure. A problem appears when the chromatographic resolution between the impurity and the main compound is small, since the normalization may suppress the second compound together with the heteroscedastic noise. This is the case for R_s = 0.3 and 5% of impurity (Fig. 7). The second singular value plot differs from the third singular value plot around windows 30-35, but the peak is just at the noise level. One cannot decide whether the difference is due to a second compound or to, e.g., the baseline.

Fig. 6. ETA plot for pure hydrocortisone after selective spectrum normalization to constant sum. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

Fig. 7. ETA plot for R_s = 0.3 and 5% of impurity after selective spectrum normalization to constant sum. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

If the chromatographic resolution between the two compounds is large, the normalization emphasizes the presence of the minor compound. The ETA plot for R_s = 1 and 0.5% of impurity is given in Fig. 8. The mixture spectra are left unchanged; therefore the singular values are equal to those in Fig. 5. Only the spectra selective for the main compound are normalized. The peak in the second singular value plot around window 30, due to heteroscedastic noise, is suppressed. After the normalization there is only one peak in the second singular value plot, which indicates the overlapping region.

The normalization of the large spectra to constant sum is useful when the chromatographic separation between the two compounds is rather large and the relative concentration of the impurity is low (≤ 0.5%). Otherwise, if the mixture spectra are also normalized, the increase of the angle between the spectra and PC1 due to the elution of the minor compound is masked by the reduction of the length after normalization. A drawback of this normalization is the loss of information in the first singular value plot: as all the spectra with large absorbance have practically the same length, one cannot see the location of the maximum. A possible solution to these problems is to normalize the spectra with large absorbance to a sum close to one and those representing noise to a sum close to zero (Eq. 4). This is done by dividing each element of a row by the sum of all the elements in the row plus an offset. The offset is a percentage of the maximum value of the sums of the rows. In our case 20% of the maximum sum seems to be the most adequate. It is important to introduce a high offset to reduce the length of the spectra in the zero-component regions enough to compensate for the high value of the sine of their angle with respect to PC1. The ETA plot for pure hydrocortisone is shown in Fig. 9. The first singular value plot is very similar to the one obtained without normalization (Fig. 3), but the peak is wider due to the normalization. The second singular value plot has a stochastic structure and shows the same trend as the third singular value plot. The values of the second and third singular values for windows 35-40 are practically the same as with the normalization to constant sum (Fig. 6). The main difference introduced by the normalization with an offset is that the length of the spectra in the zero-component region is smaller and so are the singular values. The smaller the length of the spectra, the higher is the effect of the offset. In each case it will be necessary to select the correct offset.

Fig. 8. ETA plot for R_s = 1 and 0.5% of impurity after selective spectrum normalization to constant sum. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

Fig. 9. ETA plot for pure hydrocortisone after spectrum normalization with an offset. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

Fig. 11. ETA plot for R_s = 1 and 0.5% of impurity after spectrum normalization with an offset. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.
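The effect of the offset on the sum of a normalized spectrum can be made explicit with a short calculation. The symbols $s_i$ (sum of spectrum $i$), $s_{\max}$ (largest of these sums) and $f$ (offset expressed as a fraction of $s_{\max}$) are introduced here only for this illustration. Summing Eq. 4 over the wavelengths gives

$$\sum_{j=1}^{n} y_{ij} = \frac{s_i}{s_i + f\,s_{\max}}, \qquad s_i = \sum_{j=1}^{n} x_{ij}, \qquad s_{\max} = \max_i s_i .$$

With the 20% offset used here ($f = 0.2$), the most intense spectrum ($s_i = s_{\max}$) is scaled to a sum of $1/1.2 \approx 0.83$, whereas a spectrum with $s_i = 0.01\,s_{\max}$ is scaled to a sum of $0.01/0.21 \approx 0.048$. A larger offset therefore shrinks the low-absorbance spectra more strongly, at the cost of moving the sum of the intense spectra further below 1.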

The ETA plot obtained for R_s = 0.3 and 5% of impurity is given in Fig. 10. The second singular value plot shows a peak over the noise limit. This peak is not present in the third singular value plot. The comparison of Figs. 7 and 10 shows that normalization of the spectra with an offset significantly reduces the length of the spectra in the zero-component regions. This makes it possible to see, in the second singular value plot, the increase in dissimilarity between the spectra where the impurity is eluting and the corresponding PC1. The effect of the new normalization is clear when the chromatographic separation between the two compounds is small.

Fig. 10. ETA plot for R_s = 0.3 and 5% of impurity after spectrum normalization with an offset. From top to bottom: the first, second and third singular values. Line (- - -) indicates the noise level.

The ETA plot for R_s = 1 and 0.5% of impurity is shown in Fig. 11. The impurity shows up already in the first singular values. As with the normalization to constant sum, in the second singular value plot there is only one peak over the noise limit, which indicates the overlapping region. The effect of the overcorrection found in the ETA plots after normalization to constant sum is not present with the new normalization. When the separation between the compounds is large enough, the advantage of adding the offset is more in the presentation of the plots than in the capability of detecting the impurity.

The relative amount of impurity detected by ETA as a function of the chromatographic resolution is given in Table 1. The table shows the detection limits reached with untransformed data (ETA-raw), data after selective normalization of spectra to constant sum (ETA-norm1) and data after normalization of spectra with an offset (ETA-norm2). When the chromatographic separation between the two compounds is between 1 and 0.4, the ETA approach performs as well or better after normalization than with the raw data. This is due to the effect of heteroscedastic noise. At lower resolutions the normalization of the spectra to constant sum is worse, since the minor compound is suppressed together with the heteroscedastic noise. In all cases the normalization of the spectra with an offset seems to be the best one.

Table 1
Detection limits (relative amount of impurity, %) reached with untransformed data (ETA-raw), data after selective normalization of spectra to constant sum (ETA-norm1) and data after normalization of spectra with an offset (ETA-norm2). For comparison, the reported results for FSW EFA, the orthogonalization method and SIMPLISMA are included.

R_s    ETA-raw   ETA-norm1   ETA-norm2   FSW EFA   Orthogonal   SIMPLISMA
1      0.5       0.1         0.1         0.5       0.1          0.5
0.8    0.5       0.5         0.5         0.5       0.1          0.5
0.5    1         0.5         0.5         0.5       0.5          0.5
0.4    2         0.5         0.5         2         0.5          0.5
0.3    5         10          5           5         2            2
0.2    5         10          5           5         2            2
0.1    10        20          5           20        2            5

The next step is to compare the best results obtained by the ETA approach with those obtained with other methods such as FSW EFA [3], the orthogonalization method [6,7] and SIMPLISMA [9,10] (see Table 1). The only difference between ETA and FSW EFA for peak purity determination is the size of the moving window. The reduction of the size of the window from seven to three spectra makes the method more sensitive for the detection of small amounts of impurity.

In the orthogonalization method each spectrum is compared with a "base" vector by means of the dissimilarity, i.e., the sine of the angle between each spectrum and the "base" vector. The detection limits shown for the orthogonalization method correspond to the variant that performs best [8], i.e., the data are untransformed and the "base" vector is the mean spectrum. This method seems to be the most sensitive.
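The dissimilarity used in this comparison, the sine of the angle between a spectrum and a base vector, can be computed as follows. This sketch covers only that single quantity and is not the full orthogonalization procedure of [6,7]; the function name and the example vectors are assumptions, and spectra of zero length are not handled.

```python
import numpy as np

def dissimilarity_to_base(X, base=None):
    """Sine of the angle between each spectrum (row of X) and a base vector.

    If no base vector is given, the mean spectrum is used, as in the variant
    described in the text. Illustrative helper, not the published method.
    """
    X = np.asarray(X, dtype=float)
    if base is None:
        base = X.mean(axis=0)
    base = np.asarray(base, dtype=float)
    cos = (X @ base) / (np.linalg.norm(X, axis=1) * np.linalg.norm(base))
    cos = np.clip(cos, -1.0, 1.0)   # guard against rounding slightly beyond 1
    return np.sqrt(1.0 - cos ** 2)

# Invented two-wavelength "spectra": parallel, orthogonal and at 45 degrees
# to the base vector [1, 0].
X_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
d = dissimilarity_to_base(X_demo, base=[1.0, 0.0])
```

A dissimilarity of 0 means the spectrum has the same direction as the base vector, and 1 means it is orthogonal to it; here `d` is approximately `[0, 1, 0.707]`.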

SIMPLISMA is based on the selection of "pure spectra", i.e., spectra corresponding to only one compound. When the first "pure spectrum" has been selected, the information contained in this spectrum is subtracted in a way similar to the orthogonalization approach. This method also performs rather well. Its limitation is mostly due to the concentration of the impurity, more than to the chromatographic separation.

5. Conclusions

The best results with the ETA approach are obtained after normalization of the spectra with an offset. When ETA is applied to the raw data it is not possible to detect low concentrations of impurity (0.1%) due to the presence of heteroscedastic noise. On the other hand, the normalization of the spectra to constant sum suppresses the impurity together with the heteroscedastic noise when the chromatographic resolution is low (R_s ≤ 0.3).

In general terms, the methods compared here show a similar performance. They should be applied to other kinds of data, with a different noise structure, to establish their limitations and to arrive at general conclusions.

References

[1] H. Gampp, M. Maeder, C.J. Meyer and A.D. Zuberbühler, Talanta, 32 (1985) 1133; 33 (1986) 943.

[2] M. Maeder and A. Zilian, Chemom. Intell. Lab. Syst., 3 (1988) 205.

[3] H.R. Keller and D.L. Massart, Anal. Chim. Acta, 246 (1991) 379.

[4] J. Toft and O.M. Kvalheim, Chemom. Intell. Lab. Syst., 19 (1993) 65.

[5] Y.-z. Liang, O.M. Kvalheim, A. Rahmani and R.G. Brereton, J. Chemom., 7 (1993) 15.

[6] F. Cuesta Sanchez, M.S. Khots, D.L. Massart and J.O. De Beer, Anal. Chim. Acta, 285 (1994) 181.

[7] F. Cuesta Sanchez, M.S. Khots and D.L. Massart, Anal. Chim. Acta, 290 (1994) 249.

[8] F. Cuesta Sanchez, P.J. Lewi and D.L. Massart, Chemom. Intell. Lab. Syst., in press.

[9] W. Windig and J. Guilment, Anal. Chem., 63 (1991) 1425.

[10] W. Windig, C.E. Heckler, F.A. Agblevor and R.J. Evans, Chemom. Intell. Lab. Syst., 14 (1992) 195.

[11] F. Cuesta Sanchez and D.L. Massart, Anal. Chim. Acta, in press.

[12] H.R. Keller, D.L. Massart and J.O. De Beer, Anal. Chem., 65 (1993) 471.

[13] Y.-z. Liang, O.M. Kvalheim, H.R. Keller, D.L. Massart, P. Kiechle and F. Erni, Anal. Chem., 64 (1992) 946.

[14] H.R. Keller, D.L. Massart, Y.-z. Liang and O.M. Kvalheim, Anal. Chim. Acta, 267 (1992) 63.