Model Predictive Controller Monitoring Based on Pattern Classification and PCA

Fred Loquasto III∗ Dale E. Seborg†

 Department of Chemical Engineering, University of California, Santa Barbara, CA 93106 

Abstract

A pattern classification-based methodology is presented as a practical tool for monitoring Model Predictive Control (MPC) systems. Principal component analysis (PCA), in particular PCA and distance similarity factors, is used to classify a window of current MPC operating data into one of several classes. Pattern classifiers are developed using a comprehensive, simulated database of closed-loop MPC system behavior that includes a wide variety of disturbances and/or plant changes. The pattern classifiers can then be employed to classify current MPC performance by determining whether the behavior is normal or abnormal, whether an unusual plant disturbance is present, or whether a significant plant change has occurred. The methodology is successfully applied in an extensive case study for the Wood-Berry distillation column model.

1 Introduction

Model predictive control is widely used in the petrochemical industries to control complex processes that have operating constraints on the input and output variables. The MPC controller uses a process model and a constrained, on-line optimization to determine the optimal future control move sequence. The first control move is implemented and the calculations are then repeated at the next control calculation interval, the so-called receding horizon approach. Excellent overviews of MPC and comparisons of commercial MPC controllers are available [1,2].

Although MPC has been widely applied for over 25 years, the problem of monitoring MPC system performance has received relatively little attention until very recently [3–9].

The objective of this research is to develop an MPC monitoring technique that helps plant personnel answer the following questions: (1) Is the MPC system operating normally? (2) If not, is its poor performance due to an abnormal disturbance or to an inaccurate process model (for the current conditions)? The proposed MPC monitoring technique is based on a pattern classification approach. This approach was selected because the goal is to identify plant changes, in addition to disturbances, without performing a full model re-identification, which would require significant process excitation. Identifying plant changes in this context is therefore an extremely difficult task.

In a previous paper [10], an MPC monitoring strategy was developed using multi-layer perceptron neural networks as the pattern classifiers. In this paper, the classification is instead based on a novel application of principal component analysis, in particular PCA similarity factors and distance similarity factors. The proposed MPC monitoring technique is evaluated in a simulation case study for the Wood-Berry distillation column model.

∗E-mail: [email protected]
†E-mail: [email protected], Corresponding author

2 PCA and Dynamic PCA Methodology

Principal component analysis is a multivariate statistical technique that has been widely used for both academic research and industrial applications of process monitoring. Its ability to create low-order, data-driven models by accounting for the collinearity of the process data, its modest computational requirements, and its sound theoretical basis make PCA a highly desirable technique upon which to base tools for monitoring processes. Traditional PCA monitoring techniques use the Q and T² statistics [11,12] to determine how well a single sample agrees with the PCA model. The monitoring strategy proposed in this paper is based on a different approach: it uses several PCA-based similarity factors [13,14] to compare current operating data with a simulated, closed-loop database in order to classify the current operating data. The dataset is a matrix X with dimensions (n × m), where m is the number of measured variables and n is the number of samples for each variable.

2.1 PCA Models

In PCA, a set of uncorrelated variables, the principal components, is calculated from linear combinations of the original (correlated) variables. The principal components are the eigenvectors of the covariance matrix of the data; they correspond to the directions of the data that possess the highest degree of variability [11,15]. For example, PCA models can be calculated using the pcacov.m function in the Statistics Toolbox in MATLAB [16].
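
As an illustration of this step, the following sketch (in Python/NumPy rather than the MATLAB tools used in this work) computes a PCA model, i.e. the matrix of the first k principal component loading vectors, from an already-scaled data matrix; the function and variable names are illustrative only.

    import numpy as np

    def pca_model(X, k):
        """Return the (m x k) matrix of the first k principal component
        loading vectors of an (n x m) data matrix X that has already been
        scaled to zero mean and unit variance."""
        sigma = np.cov(X, rowvar=False)            # (m x m) covariance matrix
        eigvals, eigvecs = np.linalg.eigh(sigma)   # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]          # re-sort in descending order
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]
        explained = eigvals[:k].sum() / eigvals.sum()
        return eigvecs[:, :k], explained

The explained-variance fraction returned here supports the threshold-based choice of k described below.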

For dynamic PCA (DPCA), the X data matrix is augmented with previous "lagged" data [17,12]. Thus, the process dynamics are accounted for by effectively constructing an auto-regressive with exogenous inputs (ARX) model of the process from this data. If l lags are considered, the DPCA matrix X(l) has dimension ((n − l) × (ml + m)):

X(l) = [X(k), X(k − 1), · · · , X(k − l)]    (1)

Note that the number of rows of X(l) is reduced by the number of lags, l, compared to the original matrix X.

There are various methods available in the literature for choosing k, the number of principal components (PCs) for the PCA model, as well as the number of lags l for the DPCA model. In this paper, k is chosen by two methods: (i) specifying a threshold for the cumulative explained variance, or (ii) using parallel analysis [15]. For the first approach, k is selected to be the minimum number of principal components whose cumulative explained variance exceeds a specified threshold. The parallel analysis (PA) technique compares the eigenvalues λ of Σ, the covariance matrix of X, with the eigenvalues µ from the covariance matrix of a similarly-sized data matrix with independent, normally distributed random elements [15]. Usually, the λ values begin with larger magnitudes than the µ values, and the number of PCs is chosen to be the smallest value of k for which λ(k) ≤ µ(k). However, in some cases λ(1) ≤ µ(1), and then k is selected as k = 1.
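
A minimal sketch of the parallel analysis rule just described, again in Python/NumPy; the number of random reference matrices (n_trials) and the random number generator are illustrative choices, not taken from the paper.

    import numpy as np

    def parallel_analysis_k(X, n_trials=10, seed=None):
        """Choose the number of PCs as the smallest k with lambda(k) <= mu(k),
        where lambda are the eigenvalues of cov(X) and mu are the average
        eigenvalues of same-sized standard-normal random data matrices."""
        rng = np.random.default_rng(seed)
        n, m = X.shape
        lam = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]
        mu = np.zeros(m)
        for _ in range(n_trials):
            R = rng.standard_normal((n, m))
            mu += np.sort(np.linalg.eigvalsh(np.cov(R, rowvar=False)))[::-1]
        mu /= n_trials
        crossings = np.nonzero(lam <= mu)[0]
        if crossings.size == 0:
            return m                      # no crossing: retain all PCs
        return int(crossings[0]) + 1      # 1-based k; lambda(1) <= mu(1) gives k = 1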


In order to specify the number of lags l for DPCA, different methods have been presented in the literature [17,12]. The method used here is based on the method presented in [17].
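
For reference, a short sketch of how the lagged data matrix X(l) of Eq. (1) can be assembled; the function name is illustrative.

    import numpy as np

    def lagged_matrix(X, l):
        """Build the DPCA matrix [X(k), X(k-1), ..., X(k-l)] of Eq. (1).
        X is (n x m); the result is ((n - l) x m(l + 1))."""
        n, _ = X.shape
        blocks = [X[l - j : n - j, :] for j in range(l + 1)]
        return np.hstack(blocks)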

2.2 PCA and Distance Similarity Factors

The PCA similarity factor, S_PCA, provides a useful characterization of the degree of similarity of two datasets. It is based on the similarity of the directions of the principal component vectors for the two corresponding PCA models. A PCA model is defined to be the matrix that has the first k principal component vectors as its columns. Define T to be the (m × k) PCA model for a training dataset, and C to be the (m × k) PCA model for the current dataset of interest. Define θ_ij as the angle between the ith principal component of T and the jth principal component of C. For k principal components, the PCA similarity factor, S_PCA, is then defined as [13,14,18]

S_{PCA} = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{k}\cos^2\theta_{ij} = \frac{\operatorname{trace}\left(C^T T\, T^T C\right)}{k} \qquad (2)
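
A sketch of Eq. (2) for two PCA models whose columns are the (orthonormal) principal component vectors; it assumes both models are (m × k) matrices, as in the definition above.

    import numpy as np

    def s_pca(T, C):
        """PCA similarity factor of Eq. (2) for two (m x k) PCA models."""
        k = T.shape[1]
        M = C.T @ T                      # entries are cos(theta_ij)
        return np.trace(M @ M.T) / k     # (1/k) * sum of cos^2(theta_ij)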

Although the PCA similarity factor compares the directions of the principal components of two datasets, it is also beneficial to have a similarity measure based on the "distance" between the two datasets in the m-dimensional data space. This measure can be obtained by using the distance similarity factor, S_dist, proposed in [14]. First, the variables in the T dataset are scaled to zero mean and unit variance. These same scaling values are then used to scale dataset C. Denote the vectors of sample means as x̄_C and x̄_T. The Mahalanobis distance is [19]

\Phi = \sqrt{(\bar{x}_T - \bar{x}_C)^T\, \Sigma_T^{\dagger}\, (\bar{x}_T - \bar{x}_C)} \qquad (3)

where Σ_T^† is the pseudo-inverse of the covariance matrix Σ_T of the T dataset. The distance similarity factor, S_dist, is defined to be [14]

S_{dist} = \frac{2}{\sqrt{2\pi}} \int_{\Phi}^{\infty} e^{-z^2/2}\, dz \qquad (4)
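
A sketch of Eqs. (3)-(4), assuming both datasets have already been scaled with the training-set mean and standard deviation as described above; SciPy's standard-normal tail probability is used to evaluate the integral with the normalization shown in Eq. (4).

    import numpy as np
    from numpy.linalg import pinv
    from scipy.stats import norm

    def s_dist(X_train, X_curr):
        """Distance similarity factor of Eqs. (3)-(4)."""
        diff = X_train.mean(axis=0) - X_curr.mean(axis=0)
        sigma_t = np.cov(X_train, rowvar=False)
        phi = np.sqrt(diff @ pinv(sigma_t) @ diff)   # Mahalanobis distance, Eq. (3)
        return 2.0 * norm.sf(phi)                    # 2 * P(Z > phi), Eq. (4)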

For the purposes of classifying MPC operation, similarity factors were calculated for two types of data. The first type is the standard, time-series measurement data (the "X" data). The second type of data consists of sample autocorrelation (ACF) and partial autocorrelation (PACF) function coefficients [20,21] (the "CORR" data). The X data is comprised of the inputs, outputs, and one-step-ahead residuals, in order to take advantage of the model used in the controller. The CORR data is defined to be the first n/4 correlation coefficients (as per the recommendations in [20]) of the residuals, control error, and differenced inputs. The inputs were differenced to help increase the similarity measure's sensitivity to higher-frequency dynamic behavior, while reducing its sensitivity to low-frequency disturbances.
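
The correlation features can be computed per series; the sketch below forms sample ACF coefficients with NumPy (the PACF coefficients used in the CORR data could be obtained analogously, e.g. via the Durbin-Levinson recursion or a statistics library). Names are illustrative.

    import numpy as np

    def acf_coefficients(x, n_lags):
        """Sample autocorrelation coefficients r(1), ..., r(n_lags) of a series."""
        x = np.asarray(x, dtype=float)
        x = x - x.mean()
        denom = x @ x
        return np.array([(x[lag:] @ x[:-lag]) / denom
                         for lag in range(1, n_lags + 1)])

    # Example: first n/4 ACF coefficients of a differenced input trajectory u
    # r_u = acf_coefficients(np.diff(u), n_lags=len(u) // 4)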

The overall classification of MPC performance is based on a composite similarity factor, SF, a combination of four individual similarity factors:

1. S_PCA^X, the PCA similarity factor for the X data
2. S_PCA^CORR, the PCA similarity factor for the CORR data
3. S_dist^X, the distance similarity factor for the X data
4. S_dist^CORR, the distance similarity factor for the CORR data

The composite similarity factor is defined by the following equation:

SF = α1 S_PCA^X + α2 S_PCA^CORR + α3 S_dist^X + α4 S_dist^CORR    (5)

where

α1 + α2 + α3 + α4 = 1    (6)

The α_i values are tuning parameters for the classifier that can be used to weight the individual similarity factors differently.

3 Classification Strategy

In summary, the basis for the proposed PCA approach is to use the composite similarity factor SF to determine the similarity between a current dataset and a group of training datasets that contain a wide variety of closed-loop process responses. The training datasets that are most similar to the current dataset are collected into a candidate pool, and based on an analysis of the training datasets in the pool, the current dataset is classified.

An important aspect of the classification is how the different operating classes are defined. It is proposed to classify each dataset as being a member of one of four mutually exclusive operating classes:

1. Normal operation (N)
2. An abnormal disturbance is present (D)
3. A plant change has occurred; thus there is significant model-plant mismatch, or MPM (P)
4. Both a plant change and a disturbance are present (P+D)

We use the term "fault" to refer to a member of the D, P, or P+D classes. With these definitions, the classification can be performed using two types of classifiers:

I. Binary classifiers. Three binary classifiers are used. Each one classifies a dataset as being in one of two categories:

AB: Normal (N) vs. Abnormal (D, P, or P+D)
DIST: Abnormal disturbance (D or P+D) vs. none (N or P)
MPM: Plant change (P or P+D) vs. none (N or D)

II. Four-class classifier. This classifier assigns a dataset to one of the four mutually exclusive operating classes (#1–4) above.

For each classifier, the classification label for the dataset is determined using one of two alternative techniques. A candidate pool of size N_p is constructed by selecting the N_p training datasets that are most similar to the current dataset, based on the SF metric. The first option calculates class-average similarity values by averaging the SF values for each class present in the pool, where the classes are those of the binary classifiers or of the four-class classifier, as appropriate. For example, the two classes for the AB classifier are "Normal" and "Abnormal". The class in the candidate pool with the highest class-average SF value is selected as the classification for the current dataset. The second option is based on the frequency (or number of times) that each class is present in the candidate pool. The class with the highest frequency is selected as the classification label for the current dataset.
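
To make the two options concrete, the sketch below combines the individual similarity factors into SF (Eq. 5) and labels a current dataset from its candidate pool; the data structures and names are illustrative, not from the paper.

    import numpy as np
    from collections import Counter

    def composite_sf(s_x_pca, s_corr_pca, s_x_dist, s_corr_dist, alphas):
        """Composite similarity factor SF of Eq. (5); alphas should sum to 1 (Eq. 6)."""
        return (alphas[0] * s_x_pca + alphas[1] * s_corr_pca
                + alphas[2] * s_x_dist + alphas[3] * s_corr_dist)

    def classify_from_pool(sf_values, labels, n_pool, method="frequency"):
        """Label a current dataset from the N_p most similar training datasets.

        sf_values : SF between the current dataset and each training dataset
        labels    : class label of each training dataset (e.g. 'N', 'D', 'P', 'P+D')
        """
        pool = np.argsort(sf_values)[::-1][:n_pool]        # most similar first
        if method == "frequency":                          # option 2: most frequent class
            return Counter(labels[i] for i in pool).most_common(1)[0][0]
        # option 1: class with the highest class-average SF in the pool
        sums, counts = {}, {}
        for i in pool:
            sums[labels[i]] = sums.get(labels[i], 0.0) + sf_values[i]
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        return max(sums, key=lambda c: sums[c] / counts[c])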

The performance of the individual binary classifiers can be improved by using a combination of them in an exclusion strategy. In general, our experience has been that the AB and DIST classifiers perform very well, while the MPM classifier is less successful.


However, the AB and DIST classifiers can also be used to detect plant changes with a low false alarm rate when they are combined in the exclusion strategy. This strategy is shown in Fig. 1. For example, if the AB classifier indicates an 'Abnormal' condition and the DIST classifier indicates a 'No Disturbance' condition, the current dataset is classified as a 'Plant Change' condition. The disadvantage of this technique is that, even with perfect classifiers, a P+D condition will be classified as a 'Disturbance' rather than as 'P+D'. However, one could argue that this result is acceptable because it is partially correct. Also, a comparison of the case study results shows that the benefit of this approach is that it significantly reduces the number of false 'P' classifications (i.e., "false alarms").
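
A direct encoding of the exclusion logic of Fig. 1 (the combination of the AB and DIST outputs); the string labels are illustrative.

    def exclusion_label(ab, dist):
        """Combine the AB and DIST binary classifier outputs (Fig. 1)."""
        if ab == "Normal":
            return "Normal" if dist == "No Disturbance" else "Not Classified"
        return "Plant Change" if dist == "No Disturbance" else "Disturbance"

With perfect binary classifiers, a P+D dataset would then be labeled 'Disturbance', as noted above.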

3.1 Generation of the Training Database

The first step in developing these pattern classifiers is to create the simulated databases. The databases are made up of individual datasets, where each dataset consists of closed-loop response data. It is suggested that the length of each dataset be approximately the longest open-loop settling time of the process so that all important dynamics can be represented. An important aspect of the proposed approach is that it does not require that a specially-designed perturbation signal be applied to the plant. However, in the case study, better classification occurs if each dataset contains the same known excitation (such as a setpoint change or measured disturbance) as in the current dataset. In this paper, both the training data and the "current" data are assumed to be setpoint response data for a single setpoint change.

After the simulated data has been generated, it must be scaled appropriately. For this paper, "class scaling" was performed; each dataset in the training database was scaled to zero mean and unit variance based on the overall mean and standard deviation of the class to which that dataset belonged (i.e., N, D, P, or P+D). Then several design and tuning parameters must be specified to design the classifiers: the number of principal components in the PCA models (k), the number of lags for DPCA (l), the pool size (N_p), and the SF weights (the α_i's). The optimal values of these design parameters were determined by classifying a second, independent database containing validation data.

4 MPC Case Study

This section presents specific details about the simulated case study, the Wood-Berry distillation column model, and its MPC system. The generation of the simulated databases and the classifiers is also described. Results are then presented for both the PCA- and DPCA-based classifiers.

4.1 Wood-Berry Distillation Column Model

The Wood-Berry model is a 2×2 transfer function model of a pilot-plant distillation column that separates methanol and water [22]. The system outputs are the distillate and bottoms compositions, x_D and x_B [wt%], which are controlled by the reflux and steam flow rates, R and S [lb/min]. The unmeasured feed flow rate, F, acts as a process disturbance. The column model is shown in Eq. (7). The Wood-Berry model is a classical example used in many previous publications and in the MATLAB MPC Toolbox [23].

\begin{bmatrix} x_D \\ x_B \end{bmatrix} =
\begin{bmatrix} \dfrac{12.8\,e^{-s}}{16.7s+1} & \dfrac{-18.9\,e^{-3s}}{21.0s+1} \\[1ex] \dfrac{6.6\,e^{-7s}}{10.9s+1} & \dfrac{-19.4\,e^{-3s}}{14.4s+1} \end{bmatrix}
\begin{bmatrix} R \\ S \end{bmatrix} +
\begin{bmatrix} \dfrac{3.8\,e^{-8s}}{14.9s+1} \\[1ex] \dfrac{4.9\,e^{-3s}}{13.2s+1} \end{bmatrix} F \qquad (7)

MPC tuning parameters. The closed-loop MPC simulation was performed using MATLAB and the Model Predictive Control Toolbox [23]. The manipulated variables (MVs) were R and S, and the controlled variables (CVs) were x_D and x_B. High and low saturation limits were imposed on the inputs. In order to reduce the simulation time needed to explore a variety of design options and design parameters, unconstrained MPC was employed. However, the classification strategy presented in this paper is identical for both constrained and unconstrained MPC. The MPC controller was tuned by trial-and-error to give a reasonably fast setpoint response. The prediction horizon was P = 30 and the control horizon was M = 5. The error weighting and move suppression matrices, Q and R, were chosen to be identity matrices. The control execution period was ∆t = 1 min. No special disturbance modeling was used; the common "DMC", constant step-disturbance approach was employed [23,1].

4.2 Database Generation for the Case Study

Three types of databases are generated for different purposes: training, validation, and test. The purpose of the training database was discussed earlier. The validation and test databases represent the "current data". The validation data is used to obtain the optimal design and tuning parameters for the classifier. The test database takes the place of actual process data for the purpose of evaluating the proposed monitoring approach. These databases consist of many individual datasets. Each dataset contained 100 one-minute samples of setpoint response data. A unit x_D setpoint change was made at t = 0. The choice of the variable whose setpoint was stepped was based on the fact that x_D is the faster responding of the two controlled variables. The saturation limits on the inputs were −0.833 ≤ R ≤ +0.833 lb/min and −0.516 ≤ S ≤ +0.516 lb/min.

Training and validation databases. The training and validation data were created using input disturbances (added to u) and output disturbances (added to y) in order to mimic a wide range of actual process operation when unknown disturbances are present. The feed flow rate disturbance model in Eq. (7) was assumed to be unknown, and thus was not used in the generation of either the training or validation databases. (It was used to generate the test database, as described below.) Input and output disturbances were chosen randomly and independently from four types of disturbances: step, ramp, sinusoidal, and stochastic. It should be noted that the step and ramp output disturbances could also be used to represent sensor biases and drifts. Step disturbances had random starting times between t = 25 and t = 75 min. The ramp disturbances began between t = 25 and t = 50 min and had a randomly chosen duration between 25 and 50 min. The magnitude of the step disturbance, the final magnitude of the ramp disturbance, and the amplitude of the sinusoidal oscillations were chosen randomly from the range ±{0.25 − 3.25} wt%.


The magnitudes of the input disturbances were adjusted so that the open-loop changes in the outputs were approximately the same as for the output disturbance cases.

Two ARMA transfer functions were used in creating the stochastic sequences in the training data, and a third transfer function was used for the validation data. For "stochastic" input and output disturbances, the magnitude of the input sequence to the transfer functions, and the subsequent output magnitude, were scaled to give approximately the same effect on the output as the deterministic disturbances. For the sinusoidal oscillations, the frequency had a uniformly random distribution in the range of {0.03 − 1.03} rad/min. Gaussian measurement noise with an approximate standard deviation of 0.05 wt% was added to each output, and Gaussian process noise was added to the two inputs with approximate standard deviations of 0.006 lb/min and 0.003 lb/min, respectively. These values were specified so that the process noise had approximately the same open-loop magnitude effect on the outputs as the measurement noise.
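
For illustration, a sketch of how one randomized deterministic disturbance trajectory of the kinds described above (step, ramp, or sinusoid) could be generated for a 100-sample window; the timing and magnitude ranges follow the text, while the random number generator, the equal probability of the three types, and a sinusoid starting at t = 0 are assumptions.

    import numpy as np

    def random_output_disturbance(n=100, seed=None):
        """One randomized deterministic output disturbance over n one-minute samples."""
        rng = np.random.default_rng(seed)
        t = np.arange(n, dtype=float)
        mag = rng.choice([-1.0, 1.0]) * rng.uniform(0.25, 3.25)   # wt%
        kind = rng.choice(["step", "ramp", "sine"])
        if kind == "step":
            return np.where(t >= rng.uniform(25, 75), mag, 0.0)   # random start time
        if kind == "ramp":
            t0, dur = rng.uniform(25, 50), rng.uniform(25, 50)    # random start and duration
            return mag * np.clip((t - t0) / dur, 0.0, 1.0)
        return mag * np.sin(rng.uniform(0.03, 1.03) * t)          # frequency in rad/min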

Test database. In order to make the test case more realistic, the unmeasured feed flow rate disturbance model (that was ignored in creating the training and validation data) was utilized to generate disturbances. For the feed disturbances, steps, ramps, sinusoids, and stochastic sequences were used as inputs to the disturbance model. A different ARMA transfer function was used to generate the stochastic sequences for the test database. The magnitudes for the steps, ramps, and sinusoids were chosen randomly in the range ±{12.5 − 37.5}% of the nominal feed flow rate (2.45 lb/min).

Plant changes. Random plant changes were included in the simulated database in the same manner for the training, validation, and test databases. The four transfer functions in the Wood-Berry model of Eq. (7) contain a total of 12 model parameters (K, τ, and θ for each transfer function). To create the plant change datasets, the number of datasets in which j process models are to be perturbed, for j = 1, . . . , 4, must be specified. For the case study, this was based on the number of possible combinations in which j models could be chosen out of the four models. Then, for each plant change dataset, the following parameters were chosen in a uniformly random manner:

1. Specify which j specific models will be perturbed.
2. For each perturbed model, choose how many model parameters (i) will be perturbed.
3. Determine the i specific model parameters (K, τ, θ) to perturb for each model.
4. Specify ∆, the magnitude of the additive parameter perturbation, where ∆ = ±{12.5% − 37.5%}. For example, K = K0(1 + ∆), where K0 is the nominal value.

Thus, an individual dataset could have contained as many as 12 parameter changes that were chosen randomly. These changes occur at the beginning of the simulated setpoint change and remain in effect for the entire window.
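
A sketch of steps 1-4 for a single plant change dataset; the nominal parameter values would come from Eq. (7), and j (the number of perturbed models) is assumed to have been chosen beforehand according to the combination-based allocation described above. Data structures and names are illustrative.

    import numpy as np

    def perturb_models(nominal, j, seed=None):
        """Randomly perturb j of the four Wood-Berry transfer function models.

        nominal : dict of model name -> {'K': ..., 'tau': ..., 'theta': ...}
        Returns a perturbed copy, with Delta drawn from +/-[12.5%, 37.5%]."""
        rng = np.random.default_rng(seed)
        perturbed = {name: dict(pars) for name, pars in nominal.items()}
        for model in rng.choice(list(nominal), size=j, replace=False):          # step 1
            n_pars = rng.integers(1, 4)                                         # step 2: 1-3 parameters
            for par in rng.choice(["K", "tau", "theta"], size=n_pars, replace=False):  # step 3
                delta = rng.choice([-1.0, 1.0]) * rng.uniform(0.125, 0.375)            # step 4
                perturbed[model][par] = nominal[model][par] * (1.0 + delta)
        return perturbed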

Operating class distribution. Next, we consider the distribution of the four mutually exclusive operating classes (N, D, P, P+D) present in the training database. Increasing the relative amount of a particular operating class can sometimes result in better performance for one classifier at the potential expense of another. But in this case study, we assume no a priori knowledge of the likelihood of any of the fault conditions (D, P, or P+D) and use the same number of datasets for each of the four operating conditions in all three databases. The total amount of training data, N = 2000 datasets, was based on a neural network design criterion from a previous study [10]. To evaluate the effect of using less training data for the PCA technique, smaller training databases with N = 500 or 200 datasets were created by randomly choosing a subset of the original 2000 datasets. The numbers of validation and test datasets, 1000 and 800, were chosen arbitrarily to be less than the number of training datasets.

4.3 Case Study Results

This section presents classification results for both the PCA and DPCA techniques. To analyze the classifier performance, the following performance measures (in %) are defined for the test datasets:

η: efficiency; the percentage of all datasets correctly classified.
η_i: the percentage of all class i datasets classified correctly (i = N, D, P, P+D).
p: accuracy of the datasets classified as a plant change. For an accurate classification, a dataset classified as a plant change must belong to either (i) the P class, for the p(P only) metric, or (ii) the P or P+D class, for the p(P, P+D) metric.
False Alarm Rate: 100% − p(P, P+D).

Binary classifiers:

T1 = Type I errors; the percentage of all test datasets in which an error was caused by missing a fault condition.
T2 = Type II errors; the percentage of all test cases in which an error was caused by falsely identifying a non-fault condition as a fault, i.e., a "false positive".

Note that for the binary classifiers, η + T1 + T2 = 100%. For the binary Type I and Type II errors, the null hypotheses are defined as follows:

AB: µ0: True Class ∈ {D, P, or P+D}
DIST: µ0: True Class ∈ {D or P+D}
MPM: µ0: True Class ∈ {P or P+D}

Four-class classifier:

T1 = (1/4) Σ_{j=1}^{4} T1_j, where T1_j represents the percentage of all cases in which class j, as indicated by the classifier, is the correct classification.
T2 = 100% − T1, the percentage of all cases in which the indicated class is not correct.
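
The overall and per-class efficiencies and the plant-change accuracy defined above can be computed as in the following sketch; labels and helper names are illustrative.

    import numpy as np

    def efficiencies(true, pred, classes=("N", "D", "P", "P+D")):
        """Overall efficiency eta and per-class efficiencies eta_i, in percent."""
        true, pred = np.asarray(true), np.asarray(pred)
        eta = 100.0 * np.mean(pred == true)
        eta_i = {c: 100.0 * np.mean(pred[true == c] == c) for c in classes}
        return eta, eta_i

    def plant_change_accuracy(true, pred, accept=("P",)):
        """p metric: of the datasets classified as 'P', the percentage whose true
        class is in `accept`; use accept=("P", "P+D") for the p(P, P+D) metric.
        The false alarm rate is then 100 - p(P, P+D)."""
        true, pred = np.asarray(true), np.asarray(pred)
        flagged = (pred == "P")
        if not flagged.any():
            return float("nan")
        return 100.0 * np.mean(np.isin(true[flagged], accept))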

PCA Results. Because the classifiers based on the frequency-pool method almost always performed better than the pool-average method, only the frequency-based classifier results are presented. Note that in some cases, multiple pool sizes or weight values may give the same classification accuracy; in these cases, an average of the multiple results is presented. The classification results on the validation database (not shown) indicated that using PCA models explaining 85% of the variance leads to the best performance for


the full training data (N = 2000) case. Table 1 presents the classification performance for the test database. The best validation classifiers for the full data (85% variance explained) are underlined. The best overall results (based on the test database) are shown in boldface. When the best validation classifiers were used to evaluate the test database, an abnormal operating condition was detected with over 98% accuracy, while the disturbance classifier achieved over 92% correct classifications. The plant change classifier detected plant changes correctly 73% of the time, while the four-class classifier achieved an overall correct classification rate of 64%. It appears that classifier performance tends to deteriorate when less training data is used. However, the performance of all classifiers was still reasonable for reduced amounts of training data.

Table 2 presents more detailed test database results for the best test database classifiers from Table 1 (i.e., the ones in boldface). The classification accuracy for each type of operating condition and the Type I and II errors are shown. The performance is reasonable, but the MPM classifier has a quite large Type II error value.

The exclusion strategy of Section 3 can be applied in order to detect plant changes more accurately (with fewer false positives). Table 3 summarizes these results for the test database. Although the 75% explained variance cases achieved the best p values for the validation database, if one "cheats" and looks at the test database results, it is apparent that, by far, the best p values occurred for parallel analysis. The parallel analysis design achieves very high p values of p(P only) = 94.5% and p(P, P+D) = 97.6%, at the expense of a lower η value. Use of this classifier thus results in a very low false alarm rate, less than 3%, for identifying plant changes. Table 4 presents a class distribution table that shows how the true classes of the test database are classified when the parallel analysis exclusion classifiers are used.

DPCA Results. Analysis results for the DPCA classifiers are now presented, for the full training database. Once again, the best performing classifiers for the validation database (not shown) were obtained based on 85% variance explained. A comparison of the PCA and DPCA validation database classification performance indicates that the DIST and 4-class classifiers performed slightly better using DPCA than the PCA classifiers discussed earlier. However, the AB and MPM results were worse. Test database results using the best validation classifiers (i.e., the α_i, l, and k values giving the best performance) are shown in Table 5, where the best validation classifiers are underlined. Again, the best overall test database classifiers are shown in boldface. Comparing the test database performance of the optimal validation classifiers that used DPCA (Table 5) with the PCA-based classifiers (Table 1), with both at a threshold of 85%, indicates that the binary classifiers (AB, DIST, MPM) performed worse for DPCA, while the 4-class classifier was slightly better. Table 6 shows detailed results for the best overall test database classifiers (those in boldface in Table 5).

Test database results for the exclusion strategy are presented in Table 7. Now, the choice of which design to select for this task is not as clear-cut as for the PCA case; based on the validation database results (not shown), both the 85% and 95% thresholds were acceptable. The highest test database p values for the DPCA case occur for 95% variance explained, which was one of the indicated best validation cases. Although the η values for DPCA in Table 7 are similar to the PCA results in Table 3, the p values are significantly lower for DPCA; thus, because the goal of this strategy is to obtain high p values, it seems that one should choose the PCA method over DPCA for best performance.

5 Summary

Several PCA-based pattern classifiers have been developed for monitoring model predictive control systems. These PCA-based classifiers gave accurate classifications for a 2 × 2 simulated case study, the MPC-controlled Wood-Berry distillation column model. For example, the AB classifier, used to detect abnormal operating conditions, could accurately classify over 98% of the independent test datasets. Using the best exclusion classifiers, over 77% of the plant changes in the test database were identified as a plant change, with a false alarm rate of less than 3%, an important consideration in industrial applications. The classifiers performed slightly better when they were based on PCA as opposed to DPCA. In general, a smaller amount of training data had a detrimental effect on the classification results. Choosing the number of PCs to retain by parallel analysis gave mixed results but is still recommended because it is an automatic technique, and it produced reasonably accurate AB and DIST classifiers along with a very accurate exclusion-based plant change classifier.

Acknowledgements

Funding for the research presented in this paper, provided by the ChevronTexaco Research and Technology Company (CTRTC), is greatly appreciated. The authors would like to thank Ron Sorensen (CTRTC) and Jim Gunderman (ChevronTexaco) for discussions about industrial MPC systems. Lastly, the first author would like to thank the Control Engineering Laboratory at the Helsinki University of Technology for providing him with a visiting researcher position, where many of the ideas in this paper were initially developed.

References

[1] J. M. Maciejowski, Predictive Control with Constraints. Harlow, England: Prentice Hall, 2002.

[2] S. J. Qin and T. A. Badgwell, "A survey of industrial model predictive control technology," Control Eng. Prac. (in press), 2002.

[3] Y. Zhang and M. A. Henson, "A performance measure for constrained model predictive controllers," in Proc. European Control Conf., (Karlsruhe, Germany), 1999.

[4] B.-S. Ko and T. F. Edgar, "Performance assessment of constrained model predictive control systems," AIChE J., vol. 47, no. 6, pp. 1363–1371, 2001.

[5] D. J. Kozub, "Controller performance monitoring and diagnosis: an industrial perspective," in Proc. 15th IFAC Triennial World Congress, (Barcelona, Spain), 2002.

[6] T. J. Harris and C. T. Seppala, "Recent developments in controller performance monitoring and assessment techniques," Chem. Proc. Control-VI, AIChE Symposium Series No. 326, Vol. 98, pp. 208–222, 2002.

[7] S. L. Shah, R. Patwardhan, and B. Huang, "Multivariate controller performance analysis: Methods, applications, and challenges," Chem. Proc. Control-VI, AIChE Symposium Series No. 326, Vol. 98, pp. 190–207, 2001.


[8] B. Huang and E. C. Tamayo, "Model validation for industrial model predictive control systems," Chem. Eng. Sci., vol. 55, pp. 2315–2327, 2000.

[9] B. Huang, "Multivariable model validation in the presence of time-variant disturbance dynamics," Chem. Eng. Sci., vol. 55, pp. 4583–4595, 2000.

[10] F. Loquasto, III and D. E. Seborg, "Monitoring model predictive control systems using a novel neural network approach," in AIChE Annual Meeting, (Reno, NV), 2001.

[11] B. M. Wise and N. B. Gallagher, "The process chemometrics approach to process monitoring and fault detection," J. Proc. Control, vol. 6, pp. 329–348, 1996.

[12] L. H. Chiang, E. L. Russell, and R. D. Braatz, Fault Detection and Diagnosis in Industrial Systems. London: Springer, 2001.

[13] W. J. Krzanowski, "Between-groups comparison of principal components," J. Am. Stat. Assoc., vol. 74, pp. 703–707, 1979.

[14] A. Singhal and D. E. Seborg, "Pattern matching in multivariate time series databases using a moving-window approach," Ind. & Eng. Chem. Res., vol. 41, pp. 3822–3838, 2002.

[15] J. E. Jackson, A User's Guide to Principal Components. New York: John Wiley, 1991.

[16] The MathWorks, Inc., Statistics Toolbox Ver. 3.0, 2000.

[17] W. Ku, R. H. Storer, and C. Georgakis, "Disturbance detection and isolation by dynamic principal component analysis," Chemom. Int. Lab. Sys., vol. 30, pp. 179–196, 1995.

[18] M. C. Johannesmeyer, A. Singhal, and D. E. Seborg, "Pattern matching in historical data," AIChE J., vol. 48, pp. 2022–2038, 2002.

[19] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: John Wiley, 2001.

[20] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[21] G. E. P. Box and A. Luceno, Statistical Control by Monitoring and Feedback Adjustment. New York: John Wiley, 1997.

[22] R. K. Wood and M. W. Berry, "Terminal composition control of a binary distillation column," Chem. Eng. Sci., vol. 28, pp. 1707–1717, 1973.

[23] M. Morari and N. L. Ricker, Model Predictive Control Toolbox User's Guide. Natick, MA: The MathWorks, Inc., 1995.

Table 1. PCA results: η values for different training database sizes (N) and variance thresholds.

                    75%                    85%                    95%                    PA
N =          2000   500   200       2000   500   200       2000   500   200       2000   500   200
AB           98.7   98.3  97.8      98.3   97.0  97.1      98.5   98.0  98.1      97.1   96.6  97.8
DIST         91.1   93.0  87.1      92.3   94.3  84.0      80.4   89.6  88.6      94.5   93.8  92.3
MPM          74.0   73.9  70.5      73.0   71.5  74.6      73.4   74.6  75.1      72.8   73.3  76.0
4-class      68.5   68.6  58.8      63.9   58.9  59.5      64.4   59.8  62.6      65.1   63.6  60.1

Table 2. Detailed results for the best test database PCA classifiers.

              η     η_N    η_D    η_P   η_P+D    T1     T2
AB           98.7   99.0  100.0   95.8  100.0    1.1    0.3
DIST         94.5  100.0   99.5   81.0   97.5    0.8    4.8
MPM          76.0   99.0   22.0   95.0   88.0    4.3   19.8
4-class      68.5   97.5   37.0   72.5   67.0   71.0   29.0

Table 3. PCA exclusion strategy plant change classification results.

% Var. Expl.      η     p (P only)   p (P, P+D)
75               82.8      77.9         90.6
85               80.5      81.3         93.4
95               82.3      55.3         79.5
Par. Anal.       77.5      94.5         97.6

Table 4. True and indicated classifications using the PCA exclusion classifiers (parallel analysis case).

                          Classified as (%)
Actual Dataset       N       D       P     Not Classified
N                   98.0     0.0     2.0       0.0
D                    0.5    94.0     0.0       5.5
P                    3.5    19.0    77.5       0.0
P+D                  0.0    97.5     2.5       0.0

Table 5. DPCA test results: η values for different variance thresholds.

              75%    85%    95%    PA
AB           96.8   95.8   96.5   88.9
DIST         90.4   88.4   89.1   85.8
MPM          68.6   72.6   74.5   73.0
4-class      61.2   64.6   61.1   65.4

Table 6. Detailed results for the best test database DPCA classifiers.

              η     η_N    η_D    η_P   η_P+D    T1     T2
AB           96.8   96.5   98.5   92.0  100.0    2.5    0.8
DIST         90.4  100.0   87.0   90.5   84.0    7.3    2.4
MPM          74.5  100.0   19.0   86.0   93.0    5.3   20.3
4-class      65.4   84.0   26.0   67.0   84.5   68.6   31.4

Table 7. DPCA exclusion strategy plant change classification results.

% Var. Expl.      η     p (P only)   p (P, P+D)
75%              82.5      72.7         86.8
85%              83.0      68.0         82.4
95%              84.0      76.0         92.3
PA               76.0      62.8         78.5

Figure 1. Illustration of the exclusion strategy, which combines the outputs of the Normal/Abnormal (AB) classifier and the Disturbance (DIST) classifier:

AB = Normal,   DIST = No Disturbance:  classify as Normal
AB = Normal,   DIST = Disturbance:     not classified
AB = Abnormal, DIST = No Disturbance:  classify as Plant Change
AB = Abnormal, DIST = Disturbance:     classify as Disturbance
