Model Predictive Controller Monitoring Based on Pattern Classification and PCA
Fred Loquasto III∗ Dale E. Seborg†
Department of Chemical Engineering, University of California, Santa Barbara, CA 93106
Abstract
A pattern classification-based methodology is presented as a prac-
tical tool for monitoring Model Predictive Control (MPC) systems.
Principal component analysis (PCA), especially PCA and distance
similarity factors, is used to classify a window of current MPC
operating data into one of several classes. Pattern classifiers are de-
veloped using a comprehensive, simulated database of closed-loop
MPC system behavior that includes a wide variety of disturbances
and/or plant changes. The pattern classifiers can then be employed
to classify current MPC performance by determining if the behav-
ior is normal or abnormal, if an unusual plant disturbance is present,
or if a significant plant change has occurred. The methodology is
successfully applied in an extensive case study for the Wood-Berry
distillation column model.
1 Introduction
Model predictive control is widely used in the petrochemical indus-
tries to control complex processes that have operating constraints
on the input and output variables. The MPC controller uses a pro-
cess model and a constrained, on-line optimization to determine the
optimal future control move sequence. The first control move is
implemented and the calculations are then repeated at the next con-
trol calculation interval, the so-called receding horizon approach.
Excellent overviews of MPC and comparisons of commercial MPC
controllers are available [1,2].
Although MPC control has been widely applied for over 25 years, the problem of monitoring MPC system performance has received
relatively little attention until very recently [3–9].
The objective of this research is to develop an MPC monitoring
technique that will help plant personnel to answer the following
questions: (1) Is the MPC system operating normally? (2) If not,
is its poor performance due to an abnormal disturbance or an in-
accurate process model (for the current conditions)? The proposed
MPC monitoring technique is based on a pattern classification ap-
proach. This approach was selected because it is desirable to identify
plant changes, in addition to disturbances, without performing a full
model re-identification, which would require significant process
excitation. Without such excitation, identifying plant changes
is an extremely difficult task.
In a previous paper [10], an MPC monitoring strategy was developed
using multi-layer perceptron neural networks as the pattern
classifiers. In this paper, the classification is instead based
on a novel application of principal component analysis, especially
PCA similarity factors and distance similarity factors. The proposed
MPC monitoring technique is evaluated in a simulation case study
for the Wood-Berry distillation column model.
∗E-mail: [email protected]. †E-mail: [email protected] (corresponding author).
2 PCA and Dynamic PCA Methodology
Principal component analysis is a multivariate statistical technique
that has been widely used for both academic research and industrial
applications of process monitoring. Its ability to create low-order,
data-driven models by accounting for the collinearity of the process
data, its modest computational requirements, and its sound theoret-
ical basis make PCA a highly desirable technique upon which to
base tools for monitoring processes. Traditional PCA monitoring
techniques use the Q and T² statistics [11,12] to determine how
well a single sample agrees with the PCA model. The monitoring
strategy proposed in this paper is based on a different approach; it
uses several PCA-based similarity factors [13,14] to compare cur-
rent operating data with a simulated, closed-loop database in order
to classify the current operating data. The dataset is a matrix X with
dimensions (n×m), where m is the number of measured variables
and n is the number of samples for each variable.
2.1 PCA models
In PCA a set of uncorrelated variables, the principal components,
are calculated from linear combinations of the original (correlated)
variables. The principal components are the eigenvectors of the co-
variance matrix of the data. They correspond to the directions of the
data that possess the highest degree of variability [11,15]. For exam-
ple, PCA models can be calculated using the pcacov.m function
in the Statistics Toolbox in MATLAB [16].
For dynamic PCA (DPCA), the X data matrix is augmented
with previous "lagged" data [17,12]. Thus, the process dynamics
are accounted for by effectively constructing an auto-regressive
with exogenous inputs (ARX) model of the process from this data.
If l lags are considered, the DPCA matrix X(l) has dimension
((n − l) × (ml + m)):

X(l) = [X(k), X(k − 1), · · · , X(k − l)]   (1)

Note that the number of rows of X(l) is reduced by the number of
lags, l, compared to the original matrix X.
There are various methods available in the literature for choosing
k, the number of principal components (PCs) for the PCA model,
as well as the number of lags l for the DPCA model. In this paper,
k is chosen by two methods: (i) specifying a threshold for the cu-
mulative explained variance or (ii) using parallel analysis [15]. For the first approach, k is selected to be the minimum number of prin-
cipal components whose cumulative explained variance exceeds a
specified threshold. The parallel analysis (PA) technique compares
the eigenvalues λ of Σ, the covariance matrix of X , with the eigen-
values µ from the covariance matrix of a similarly-sized data matrix
with independent, normally distributed random elements [15]. Usu-
ally, the λ values begin with larger magnitudes than the µ values,
and the number of PCs is chosen to be the smallest value of k for
which λ(k) ≤ µ(k). However, in some cases, λ(1) ≤ µ(1), and k is selected as k = 1.
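The two rules for selecting k can be sketched in Python (illustrative only; the paper's own computations used MATLAB, and the function names here are our own):

```python
import numpy as np

def choose_k_cumvar(X, threshold=0.85):
    """Smallest k whose cumulative explained variance meets the threshold."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # descending eigenvalues
    cumvar = np.cumsum(lam) / np.sum(lam)
    return int(np.searchsorted(cumvar, threshold) + 1)

def choose_k_parallel(X, n_trials=20, seed=0):
    """Parallel analysis: first k (1-based) with lambda(k) <= mu(k), following
    the rule stated in the text; if lambda(1) <= mu(1), this yields k = 1."""
    n, m = X.shape
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    rng = np.random.default_rng(seed)
    mu = np.mean([np.linalg.eigvalsh(np.cov(rng.standard_normal((n, m)),
                                            rowvar=False))[::-1]
                  for _ in range(n_trials)], axis=0)
    idx = np.nonzero(lam <= mu)[0]
    return int(idx[0] + 1) if idx.size else m
```

Averaging the random-data eigenvalues over several trials, as done here, is one common way to estimate the µ values.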
0-7803-7896-2/03/$17.00 ©2003 IEEE
Proceedings of the American Control Conference, Denver, Colorado, June 4-6, 2003
In order to specify the number of lags l for DPCA, different meth-
ods have been presented in the literature [17,12]. The method used
here is based on the method presented in [17].
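The lag augmentation of Eq. (1) amounts to stacking shifted copies of the data matrix; a minimal numpy sketch (the helper name is our own):

```python
import numpy as np

def augment_lags(X, l):
    """Build the DPCA matrix of Eq. (1): each row is [x(k), x(k-1), ..., x(k-l)].
    For an (n x m) matrix X the result is ((n - l) x m(l + 1))."""
    n, _ = X.shape
    return np.hstack([X[l - j : n - j] for j in range(l + 1)])
```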
2.2 PCA and Distance Similarity Factors
The PCA similarity factor, S_PCA, provides a useful characteriza-
tion of the degree of similarity for two datasets. It is based on the
similarity of the directions of the principal component vectors for
the two corresponding PCA models. A PCA model is defined to
be the matrix that has the first k principal component vectors as
its columns. Define T to be the (m × k) PCA model for a train-
ing dataset, and C to be the (m × k) PCA model for the current
dataset of interest. Define θij as the angle between the ith principal
component of T and the jth principal component of C . For k prin-
cipal components, the PCA similarity factor, S_PCA, is then defined
as [13,14,18]
S_PCA = (1/k) Σ_{i=1}^{k} Σ_{j=1}^{k} cos² θ_ij = trace(C^T T T^T C) / k   (2)
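A sketch of Eq. (2) in Python, assuming each PCA model is the loading matrix built from that dataset's covariance matrix (function names are illustrative):

```python
import numpy as np

def pca_model(X, k):
    """(m x k) matrix whose columns are the first k PC directions of X."""
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    return vecs[:, np.argsort(vals)[::-1][:k]]

def s_pca(T_data, C_data, k):
    """PCA similarity factor of Eq. (2); equals 1 when the two
    k-dimensional PC subspaces coincide."""
    L = pca_model(T_data, k)
    M = pca_model(C_data, k)
    # sum_ij cos^2(theta_ij) = ||L^T M||_F^2 = trace(M^T L L^T M)
    return np.trace(M.T @ L @ L.T @ M) / k
```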
Although the PCA similarity factor compares the similarity of
two datasets, it is also beneficial to have a similarity measure based
on the "distance" between two datasets in the m-dimensional data
space. This measure can be obtained by using the distance simi-
larity factor, S_dist, proposed in [14]. First, the variables in the T
dataset are scaled to zero mean and unit variance. These same val-
ues are then used to scale dataset C. Denote the vectors of sample
means as xC and xT . The Mahalanobis distance is [19]
Φ = [ (x_T − x_C)^T Σ† (x_T − x_C) ]^(1/2)   (3)

where Σ† is the pseudo-inverse of the covariance matrix Σ of the T
dataset. The distance similarity factor, S_dist, is defined to be [14]:

S_dist = √(2/π) ∫_Φ^∞ e^(−z²/2) dz   (4)
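Eqs. (3)-(4) can be computed directly. The sketch below assumes the scaling described above has already been applied, and uses the identity √(2/π) ∫_Φ^∞ e^(−z²/2) dz = erfc(Φ/√2) to evaluate the Gaussian tail with the standard library:

```python
import numpy as np
from math import erfc, sqrt

def s_dist(T_data, C_data):
    """Distance similarity factor of Eqs. (3)-(4); assumes both datasets were
    already scaled with the T dataset's mean and standard deviation."""
    d = T_data.mean(axis=0) - C_data.mean(axis=0)
    # Mahalanobis distance using the pseudo-inverse of the T covariance matrix
    phi = sqrt(max(d @ np.linalg.pinv(np.cov(T_data, rowvar=False)) @ d, 0.0))
    return erfc(phi / sqrt(2.0))  # = sqrt(2/pi) * integral_phi^inf exp(-z^2/2) dz
```

By construction S_dist = 1 when the two sample means coincide, and it decays toward zero as the datasets move apart.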
For the purposes of classifying MPC operation, similarity fac-
tors were calculated for two types of data. The first type is the
standard, time-series measurement data (the “ X ” data). The second
type of data consists of sample autocorrelation (ACF) and partial
autocorrelation (PACF) function coefficients [20,21] (the “CORR”
data). The X data comprises the inputs, outputs, and one-step
ahead residuals, in order to take advantage of the model used in
the controller. The CORR data is defined to be the first n/4 cor-
relation coefficients (as per the recommendations in [20]) of the
residuals, control error, and differenced inputs. The inputs were
differenced to help increase the similarity measure's sensitivity to
higher-frequency dynamic behavior, while reducing the sensitivity
to low-frequency disturbances.
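For illustration, the sample ACF coefficients used in the CORR data can be computed as follows (a minimal sketch; the PACF coefficients, also used in the paper, are omitted here):

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation coefficients r(1), ..., r(nlags)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / c0 for k in range(1, nlags + 1)])

# Per the text, each signal contributes its first n/4 coefficients, e.g.:
# corr_features = acf(np.diff(u), len(u) // 4)
```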
The overall classification of MPC performance is based on a
composite similarity factor , SF , a combination of four individual
similarity factors:
1. S_PCA^X, the PCA similarity factor for the X data
2. S_PCA^CORR, the PCA similarity factor for the CORR data
3. S_dist^X, the distance similarity factor for the X data
4. S_dist^CORR, the distance similarity factor for the CORR data
The composite similarity factor is defined by the following equation:
SF = α1 S_PCA^X + α2 S_PCA^CORR + α3 S_dist^X + α4 S_dist^CORR   (5)
where
α1 + α2 + α3 + α4 = 1 (6)
The αi values are tuning parameters for the classifier that can be
used to weight the individual similarity factors differently.
3 Classification Strategy
In summary, the basis for the proposed PCA approach is to use the
composite similarity factor SF to determine the similarity between
a current dataset and a group of training datasets that contain a wide
variety of closed-loop process responses. The training datasets that
are most similar to the current dataset are collected into a candidate
pool, and based on an analysis of the training datasets in the pool,
the current dataset is classified.
An important aspect of the classification is how the different op-
erating classes are defined. It is proposed to classify each dataset as
being a member of one of four mutually exclusive operating classes:
1. Normal operation (N)
2. An abnormal disturbance is present (D)
3. A plant change has occurred; thus there is significant model-
plant mismatch, or MPM (P)
4. Both a plant change and a disturbance are present (P+D)
We use the term “fault” to refer to a member of the D, P, or P+D
classes. With these definitions, the classification can be performed
using two types of classifiers:
I. Binary classifiers: Three binary classifiers are used. Each one
classifies a dataset as being in one of two categories:
AB: Normal (N) vs. Abnormal (D, P, or P+D)
DIST: Abnormal disturbance (D or P+D) vs. none (N or P)
MPM: Plant change (P or P+D) vs. none (N or D)
II. Four-class classifier: This classifier assigns a dataset to one of
the four exclusive operating classes (#1–4), above.
For each classifier, the classification label for the dataset is deter-
mined by using one of two alternative techniques. A candidate pool
of size N p is constructed by selecting the N p training datasets that
are most similar to the current dataset, based on the SF metric.
Then, the first option involves calculating class-average similarity
values by finding the average of the SF values for each class present
in the pool, where the classes depend on whether the binary
classifiers or the four-class classifier is used. For example,
the two classes for the AB classifier are "Normal" and "Abnormal".
The class in the candidate pool with the highest class-average SF
value is selected as the classification for the current dataset. The
second option is based on the frequency (or number of times) that
each class is present in the candidate pool. The class with the high-
est frequency is selected as the classification label for the current
dataset.
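Both labeling options can be sketched as follows, assuming the composite SF value of the current dataset against every training dataset has already been computed (function name and arguments are illustrative):

```python
import numpy as np
from collections import Counter

def classify_by_pool(sf_values, labels, n_pool, method="frequency"):
    """Label a current dataset from the N_p most similar training datasets,
    by class frequency or by class-average SF within the candidate pool."""
    pool = np.argsort(sf_values)[::-1][:n_pool]        # most similar first
    pool_labels = [labels[i] for i in pool]
    if method == "frequency":
        return Counter(pool_labels).most_common(1)[0][0]
    # class-average option: highest mean SF over the classes present in the pool
    classes = set(pool_labels)
    return max(classes,
               key=lambda c: np.mean([sf_values[i] for i in pool if labels[i] == c]))
```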
The performance of the individual binary classifiers can be im-
proved by using a combination of them in an exclusion strategy.
In general, our experience has been that the AB and DIST classi-
fiers perform very well while the MPM classifier is less successful.
However, the AB and DIST classifiers can also be used to detect
plant changes with a low false alarm rate when used as part of the
exclusion strategy. This strategy is shown in Fig. 1. For example, if
the AB classifier indicates an ‘Abnormal’ condition, and the DIST
classifier indicates a ‘No Disturbance’ condition, the current dataset
is classified as a ‘Plant Change’ condition. The disadvantage of
this technique is that a P+D condition will be classified as a ‘Distur-
bance’ if perfect classifiers are used, rather than as ‘P+D’. However,
one could argue that this result is acceptable because it is partially
correct. Also, upon comparing the case study results, it is apparent
that the benefit of this approach is that it significantly reduces the
number of false 'P' classifications (i.e., "false alarms").
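Since Fig. 1 is not reproduced here, the following sketch encodes the exclusion logic exactly as described in the text:

```python
def exclusion_label(ab_result, dist_result):
    """Exclusion strategy combining the AB and DIST binary classifiers,
    as described in the text (the MPM classifier is bypassed)."""
    if ab_result == "Normal":
        return "N"
    # Abnormal: attribute it to a disturbance if DIST says so, else a plant change
    return "D" if dist_result == "Disturbance" else "P"
```

Note that, as stated above, a true P+D condition is mapped to "D" by this logic even with perfect binary classifiers.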
3.1 Generation of the Training Database
The first step in developing these pattern classifiers is to create
the simulated databases. The databases are made up of individual
datasets, where each dataset consists of closed-loop response data.
It is suggested that the length of the dataset be approximately
longest open-loop settling time of the process so that all important
dynamics can be represented. An important aspect of the proposed
approach is that it does not require that a specially-designed pertur-
bation signal be applied to the plant. However, in the case study,
better classification occurs if each dataset contains the same known
excitation (such as a setpoint change or measured disturbance) as
in the current dataset. In this paper, both the training data and the
“current” data are assumed to be setpoint response data for a single
setpoint change.
After the simulated data has been generated, it must be scaled
appropriately. For this paper, “class scaling” was performed; each
dataset in the training database was scaled to zero mean and unit
variance based on the overall mean and standard deviation of the
class to which that dataset belonged (i.e. N, D, P, or P+D). Then
several design and tuning parameters must be specified to design the
classifiers: the number of principal components in the PCA models
(k), the number of lags for DPCA (l), the pool size (N p), and the
SF weights (αi’s). The optimal values of these design parameters
were determined by classifying a second independent database con-
taining validation data.
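The "class scaling" step described above can be sketched as follows (an illustrative helper; each dataset is an (n × m) array):

```python
import numpy as np

def class_scale(datasets, labels):
    """Scale each dataset to zero mean / unit variance using the overall mean
    and standard deviation of its operating class (N, D, P, or P+D)."""
    stats = {}
    for cls in set(labels):
        stacked = np.vstack([d for d, c in zip(datasets, labels) if c == cls])
        stats[cls] = (stacked.mean(axis=0), stacked.std(axis=0))
    return [(d - stats[c][0]) / stats[c][1] for d, c in zip(datasets, labels)]
```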
4 MPC Case study
This section presents specific details about the simulated case study,
the Wood-Berry distillation column model and its MPC system. The
generation of the simulated databases and the classifiers is also de-
scribed. Results are then presented for both the PCA and DPCA-
based classifiers.
4.1 Wood-Berry distillation column model
The Wood-Berry model is a 2×2 transfer function model of a pilot-
plant distillation column that separates methanol and water [22].
The system outputs are the distillate and bottoms compositions, xD and xB [wt %], which are controlled by the reflux and steam flow
rates, R and S [lb/min]. The unmeasured feed flow rate, F , acts as
a process disturbance. The column model is shown in Eq. (7). The
Wood-Berry model is a classical example used in many previous
publications and in the MATLAB MPC Toolbox [23].
[xD]   [ 12.8 e^(−s)/(16.7s + 1)    −18.9 e^(−3s)/(21.0s + 1) ] [R]   [ 3.8 e^(−8s)/(14.9s + 1) ]
[xB] = [ 6.6 e^(−7s)/(10.9s + 1)    −19.4 e^(−3s)/(14.4s + 1) ] [S] + [ 4.9 e^(−3s)/(13.2s + 1) ] F   (7)
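Each entry of Eq. (7) is a first-order-plus-time-delay (FOPTD) transfer function, whose open-loop unit-step response has a closed form; a small sketch (illustrative only, not the closed-loop MPC simulation used in the case study):

```python
import numpy as np

def foptd_step(K, tau, theta, t):
    """Open-loop unit-step response of K * exp(-theta*s) / (tau*s + 1)."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= theta, K * (1.0 - np.exp(-(t - theta) / tau)), 0.0)

t = np.arange(0.0, 101.0)                    # one-minute samples, as in Sec. 4.2
xD_from_R = foptd_step(12.8, 16.7, 1.0, t)   # (1,1) entry of Eq. (7)
xB_from_S = foptd_step(-19.4, 14.4, 3.0, t)  # (2,2) entry
```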
MPC tuning parameters. The closed-loop MPC simulation was
performed using MATLAB and the Model Predictive Control Tool-
box [23]. The manipulated variables (MV’s) were R and S , and
the controlled variables (CV’s) were xD and xB . High and low
saturation limits were imposed on the inputs. In order to reduce
the simulation time needed to explore a variety of design options
and design parameters, unconstrained MPC was employed. How-
ever, the classification strategy presented in this paper is identical
for both constrained and unconstrained MPC. The MPC controller
was tuned by trial-and-error to give a reasonably fast setpoint re-
sponse. The prediction horizon was P = 30 and the control horizon
was M = 5. The error weighting and move suppression matrices,
Q and R, were chosen to be the identity matrix. The control execu-
tion period was ∆t = 1 min. No special disturbance modeling was
used; the common “DMC”, constant step-disturbance approach was
employed [23,1].
4.2 Database Generation for the Case Study
Three types of databases are generated for different purposes: train-
ing, validation, and test. The purpose of the training database
was discussed earlier. The validation and test databases repre-
sent the “current data”. The validation data is used to obtain the
optimal design and tuning parameters for the classifier. The test
database takes the place of actual process data for the purpose
of evaluating the proposed monitoring approach. These databases
consist of many individual datasets. Each dataset contained 100
one-minute samples of setpoint response data. A unit xD set-
point change was made at t = 0. The choice of the variable
whose setpoint was stepped was based on the fact that xD is the
faster responding of the two controlled variables. The satura-
tion limits on the inputs were −0.833 ≤ R ≤ +0.833 lb/min and
−0.516 ≤ S ≤ +0.516 lb/min.
Training and validation databases. The training and validation
data were created using input disturbances (added to u) and output
disturbances (added to y) in order to mimic a wide range of ac-
tual process operation when unknown disturbances are present. The
feed flow rate disturbance model in Eq. (7) was assumed to be un-
known, and thus was not used in the generation of either the training
or validation databases. (It was used to generate the test database,
as described below.) Input and output disturbances were chosen
randomly and independently from four types of disturbances: step,
ramp, sinusoidal, and stochastic. It should be noted that the step
and ramp output disturbances could also be used to represent sen-
sor biases and drifts. Step disturbances had random starting times
between t = 25 and t = 75 min. The ramp disturbances began
between t = 25 and t = 50 min and had a randomly chosen dura-
tion between 25 and 50 min. The magnitude of the step disturbance,
the final magnitude of the ramp disturbance, and the amplitude of
the sinusoidal oscillations were chosen randomly from the range
±{0.25 − 3.25} wt%. The magnitudes of the input disturbances
were adjusted so that the open-loop changes in the outputs were
approximately the same as for the output disturbance cases.
Two ARMA transfer functions were used in creating the stochas-
tic sequences in the training data, and a third transfer function was
used for the validation data. For “stochastic” input and output dis-
turbances, the magnitude of the input sequence to the transfer func-
tions, and the subsequent output magnitude were scaled to give the
same approximate effect on the output as the deterministic distur-
bances. For the sinusoidal oscillations, the frequency had a uni-
formly random distribution in the range of {0.03 − 1.03} rad/min.
Gaussian measurement noise with an approximate standard devia-
tion of 0.05 wt% was added to each output, and Gaussian process
noise was added to the two inputs with approximate standard devi-
ations of 0.006 lb/min and 0.003 lb/min, respectively. These val-
ues were specified so that the process noise had approximately the
same open-loop magnitude effect on the outputs as the measurement
noise.
Test database. In order to make the test case more realistic, the
unmeasured feed flow rate disturbance model (that was ignored in
creating the training and validation data) was utilized to generate
disturbances. For the feed disturbances, steps, ramps, sinusoids, and stochastic sequences were used as inputs to the disturbance
model. A different ARMA transfer function was used to generate
the stochastic sequences for the test database. The magnitudes for
the steps, ramps, and sinusoids were chosen randomly in the range
±{12.5 − 37.5}% of the nominal feed flow rate (2.45 lb/min).
Plant changes. Random plant changes were included in the sim-
ulated database in the same manner for the training, validation, and
test databases. The four transfer functions in the Wood-Berry model
of Eq. (7) contain a total of 12 model parameters (K , τ , and θ for
each transfer function). To create the plant change datasets, the
number of datasets in which j process models are to be perturbed,
for j = 1, . . . , 4, must be specified. For the case study, this was
based on the number of possible combinations in which j models could be chosen out of the four models. Then, for each plant change
dataset, the following parameters were chosen in a uniformly ran-
dom manner:
1. Specify which j specific models will be perturbed.
2. For each perturbed model, choose how many model parame-
ters (i) will be perturbed.
3. Determine the i specific model parameters (K , τ , θ) to perturb
for each model.
4. Specify ∆, the magnitude of the additive parameter pertur-
bation, where ∆ = ±{12.5% − 37.5%}. For example,
K = K 0(1 + ∆), where K 0 is the nominal value.
Thus, an individual dataset could have contained as many as 12 pa-
rameter changes that were chosen randomly. These changes occur
at the beginning of the simulated setpoint change and remain in ef-
fect for the entire window.
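The four-step perturbation procedure can be sketched as follows; for simplicity, j and i are drawn uniformly here, whereas the paper weights the choice of j by the number of model combinations (the helper and model names are our own):

```python
import numpy as np

def perturb_parameters(nominal, rng):
    """Steps 1-4: pick j models, then i parameters per model, and apply a
    perturbation Delta with |Delta| in [12.5%, 37.5%]."""
    params = {mdl: dict(p) for mdl, p in nominal.items()}  # copy the nominal values
    j = rng.integers(1, 5)                              # how many of the 4 models
    for mdl in rng.choice(list(params), size=j, replace=False):
        names = list(params[mdl])                       # ['K', 'tau', 'theta']
        i = rng.integers(1, len(names) + 1)             # how many parameters
        for name in rng.choice(names, size=i, replace=False):
            delta = rng.uniform(0.125, 0.375) * rng.choice([-1.0, 1.0])
            params[mdl][name] *= 1.0 + delta            # e.g. K = K0 * (1 + Delta)
    return params
```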
Operating class distribution. Next, we consider the distribution
of the four independent operating classes (N, D, P, P+D) present
in the training database. Increasing the relative amount of a par-
ticular operating class can sometimes result in better performance
for one classifier at the potential expense of another. But in this
case study, we assume no a priori knowledge of the likelihood
of any of the fault conditions (D, P, or P+D) and have the same
number of datasets for each of the four operating conditions for
all three databases. The total amount of training data, N = 2000
datasets, was based on a neural network design criterion for a previ-
ous study [10]. To evaluate the effect of using less training data for
the PCA technique, smaller training databases with N =500 or 200
datasets were created by randomly choosing a subset of the origi-
nal 2000 datasets. The numbers of validation and test datasets,
1000 and 800, were chosen arbitrarily to be less than the number of
training datasets.
4.3 Case Study Results
This section presents classification results for both the PCA and
DPCA techniques. To analyze the classifier performance, the
following performance measures (in %) are defined for the test
datasets:
η = efficiency; the percentage of all datasets correctly classified.
ηi = percentage of all class i datasets classified correctly (i = N, D, P, P+D).
p = accuracy of the datasets classified as a plant change.
For an accurate classification, a dataset classified to be
a plant change must belong to either (i) the P class for
the p(P only) metric, or (ii) either the P or P+D class for
the p(P, P+D) metric.
False Alarm Rate = 100% − p(P, P+D).
Binary classifiers:
T1 = Type I errors; the percentage of all test datasets in which
an error was caused by missing a fault condition.
T2 = Type II errors; the percentage of all test cases in which
an error was caused by falsely identifying a non-fault
condition as a fault, i.e., a "false positive".
Note that for the binary classifiers, η + T1 + T2 = 100%. For
the binary Type I and Type II errors, the null hypotheses are
defined as follows:
AB µ0: True Class ∈ {D, P, or P+D}
DIST µ0: True Class ∈ {D or P+D}
MPM µ0: True Class ∈ {P or P+D}
Four-class classifiers:
T1 = (1/4) Σ_{j=1}^{4} T1j, where T1j represents the percentage of
all cases in which class j, as indicated by the classifier,
is the correct classification.
T2 = 100% − T1, the percentage of all cases in which the
indicated class is not correct.
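For the binary case, the performance measures can be computed as follows (a minimal sketch; `True` marks a fault condition):

```python
import numpy as np

def binary_metrics(true_fault, pred_fault):
    """Efficiency and Type I / II error rates (%) for one binary classifier.
    Type I: a true fault is missed; Type II: a non-fault is flagged as a fault."""
    true_fault = np.asarray(true_fault, dtype=bool)
    pred_fault = np.asarray(pred_fault, dtype=bool)
    n = true_fault.size
    t1 = 100.0 * np.sum(true_fault & ~pred_fault) / n
    t2 = 100.0 * np.sum(~true_fault & pred_fault) / n
    eta = 100.0 * np.sum(true_fault == pred_fault) / n
    return eta, t1, t2
```

The three quantities sum to 100%, consistent with the relation stated above.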
PCA Results. Because the classifiers based on the frequency-
pool method almost always performed better than the pool-average
method, only the frequency-based classifier results are presented.
Note that in some cases, multiple pool sizes or weight values may
give the same classification accuracy. In these cases, an average of
the multiple results is presented. The classification results on the
validation database (not shown) indicated that using PCA models
explaining 85% of the variance leads to the best performance for
the full training data (N =2000) case. Table 1 presents the classifi-
cation performance for the test database. The best validation classi-
fiers for the full data (85% variance explained) are underlined. The
best overall results (based on the test database) are shown in bold-
face. When the best validation classifiers were used to evaluate the
test database, an abnormal operating condition was detected with
over 98% accuracy, while the disturbance classifier achieved over
92% correct. The plant change classifier detected plant changes
correctly 73% of the time, while the four-class classifier achieved
an overall correct classification rate of 64%. It appears that classi-
fier performance tends to deteriorate when less training data is used.
However, the performance of all classifiers was still reasonable for
reduced amounts of training data.
Table 2 presents more detailed test database results for the best
test database classifiers from Table 1 (i.e. the ones in boldface). The
classification accuracy for each type of operating condition and the
Type I and II errors are shown. The performance is reasonable, but
the MPM classifier has a quite large Type II error value.
The exclusion strategy of Section 3 can be applied in order to de-
tect plant changes more accurately (with fewer false positives). Table
3 summarizes these results for the test database. Although the 75%
explained variance cases achieved the best p values for the valida-
tion database, if one “cheats” and looks at the test database results, it
is apparent that, by far, the best p values occurred for parallel analy-
sis. Parallel analysis design achieves very high p values of p(P only)
= 94.5% and p(P, P+D) = 97.6%, at the expense of a lower η value.
Use of this classifier thus results in a very low false alarm rate for
identifying plant changes of less than 3%. Table 4 presents a class
distribution table that shows how the true classes of the test database
are classified when using the parallel analysis exclusion classifiers.
DPCA Results. Analysis results for the DPCA classifiers are now
presented, for the full training database. Once again, the best per-
forming classifiers for the validation database (not shown) were ob-
tained based on 85% variance explained. A comparison of the PCA
and DPCA validation database classification performance indicates
that the DIST and 4-class classifiers performed slightly better us-
ing DPCA than the PCA classifiers discussed earlier. However, the
AB and MPM results were worse. Test database results using the
best validation classifiers (i.e. the αi, l, and k values giving the best
performance) are shown in Table 5, where the best validation classi-
fiers are underlined. Again, the best overall test database classifiers
are shown in boldface. Comparing the test database performance of
the optimal validation classifiers that used DPCA (Table 5) with the
PCA-based classifiers (Table 1), with both at a threshold of 85%,
indicates that the binary classifiers (AB, DIST, MPM) performed
worse for DPCA, while the 4-class classifier was slightly better. Ta-
ble 6 shows detailed results for the best overall test database classi-
fiers (those in boldface in Table 5).
Test database results for the exclusion strategy are presented in
Table 7. Now, the choice of which design to select for this task is not
as clear cut as for the PCA case; based on the validation database
results (not shown), both 85% and 95% thresholds were acceptable.
The highest test database p values for the DPCA case occur for 95%
variance explained, which was one of the indicated best validation
cases. Although the η values for DPCA in Table 7 are similar to
the PCA results in Table 3, the p values are significantly lower for
DPCA; thus, because the goal of this strategy is to obtain high p
values, it seems that one should choose the PCA method over DPCA
for best performance.
5 Summary
Several PCA-based pattern classifiers have been developed for mon-
itoring model predictive control systems. These PCA-based classi-
fiers gave accurate classifications for a 2 × 2 simulated case study,
the MPC-controlled Wood-Berry distillation column model. For ex-
ample, the AB classifier, used to detect abnormal operating condi-
tions, could accurately classify over 98% of the independent test
datasets. Using the best exclusion classifiers, over 77% of the plant
changes in the test database were identified as a plant change with
a false alarm rate of less than 3%, an important consideration in in-
dustrial applications. The classifiers performed slightly better when
they were based on PCA as opposed to DPCA. In general, a smaller
amount of training data had a detrimental effect on the classification
results. Choosing the number of PCs to retain by parallel analysis
gave mixed results but is still recommended because it is an auto-
matic technique, and it produced reasonably accurate AB and DIST
classifiers along with a very accurate exclusion-based, plant change
classifier.
Acknowledgements
Funding for the research presented in this paper, provided by the
ChevronTexaco Research and Technology Company (CTRTC), is
greatly appreciated. The authors would like to thank Ron Sorensen
(CTRTC) and Jim Gunderman (ChevronTexaco) for discussions
about industrial MPC systems. Lastly, the first author would like
to thank the Control Engineering Laboratory at the Helsinki Uni-
versity of Technology for providing him with a visiting researcher
position, where many of the ideas in this paper were initially devel-
oped.
References
[1] J. M. Maciejowski, Predictive Control with Constraints. Harlow, England: Prentice Hall, 2002.
[2] S. J. Qin and T. A. Badgwell, “A survey of industrial model
predictive control technology,” Control Eng. Prac. (in press),
2002.
[3] Y. Zhang and M. A. Henson, “A performance measure for
constrained model predictive controllers,” in Proc. European
Control Conf., (Karlsruhe, Germany), 1999.
[4] B.-S. Ko and T. F. Edgar, “Performance assessment of con-
strained model predictive control systems,” AIChE J., vol. 47,
no. 6, pp. 1363 – 1371, 2001.
[5] D. J. Kozub, "Controller performance monitoring and diagnosis: an industrial perspective," in Proc. 15th IFAC Triennial World Congress, (Barcelona, Spain), 2002.
[6] T. J. Harris and C. T. Seppala, “Recent developments in con-
troller performance monitoring and assessment techniques,”
Chem. Proc. Control-VI, AIChE Symposium Series No. 326,
Vol. 98 , pp. 208 – 222, 2002.
[7] S. L. Shah, R. Patwardhan, and B. Huang, “Multivariate con-
troller performance analysis: Methods, applications, and chal-
lenges,” Chem. Proc. Control-VI, AIChE Symposium Series
No. 326, Vol. 98 , pp. 190 – 207, 2001.
Proceedings of the American Control Conference, Denver, Colorado, June 4-6, 2003 (p. 1972).
[8] B. Huang and E. C. Tamayo, “Model validation for industrial model predictive control systems,” Chem. Eng. Sci., vol. 55, pp. 2315–2327, 2000.
[9] B. Huang, “Multivariable model validation in the presence of time-variant disturbance dynamics,” Chem. Eng. Sci., vol. 55, pp. 4583–4595, 2000.
[10] F. Loquasto, III and D. E. Seborg, “Monitoring model predictive control systems using a novel neural network approach,” in AIChE Annual Meeting, (Reno, NV), 2001.
[11] B. M. Wise and N. B. Gallagher, “The process chemometrics approach to process monitoring and fault detection,” J. Proc. Control, vol. 6, pp. 329–348, 1996.
[12] L. H. Chiang, E. L. Russell, and R. D. Braatz, Fault Detection and Diagnosis in Industrial Systems. London: Springer, 2001.
[13] W. J. Krzanowski, “Between-groups comparison of principal components,” J. Am. Stat. Assoc., vol. 74, pp. 703–707, 1979.
[14] A. Singhal and D. E. Seborg, “Pattern matching in multivariate time series databases using a moving-window approach,” Ind. & Eng. Chem. Res., vol. 41, pp. 3822–3838, 2002.
[15] J. E. Jackson, A User’s Guide to Principal Components. New York: John Wiley, 1991.
[16] The MathWorks, Inc., Statistics Toolbox Ver. 3.0. 2000.
[17] W. Ku, R. H. Storer, and C. Georgakis, “Disturbance detection and isolation by dynamic principal component analysis,” Chemom. Int. Lab. Sys., vol. 30, pp. 179–196, 1995.
[18] M. C. Johannesmeyer, A. Singhal, and D. E. Seborg, “Pattern matching in historical data,” AIChE J., vol. 48, pp. 2022–2038, 2002.
[19] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: John Wiley, 2001.
[20] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.
[21] G. E. P. Box and A. Luceno, Statistical Control by Monitoring and Feedback Adjustment. New York: John Wiley, 1997.
[22] R. K. Wood and M. W. Berry, “Terminal composition control of a binary distillation column,” Chem. Eng. Sci., vol. 28, pp. 1707–1717, 1973.
[23] M. Morari and N. L. Ricker, Model Predictive Control Toolbox User’s Guide. Natick, MA: The MathWorks, Inc., 1995.
Table 1. PCA results: η values for different training set sizes (N) and
variance thresholds.

              75% Var. Expl.      85% Var. Expl.      95% Var. Expl.      Parallel Analysis
N =         2000   500   200    2000   500   200    2000   500   200    2000   500   200
AB          98.7  98.3  97.8    98.3  97.0  97.1    98.5  98.0  98.1    97.1  96.6  97.8
DIST        91.1  93.0  87.1    92.3  94.3  84.0    80.4  89.6  88.6    94.5  93.8  92.3
MPM         74.0  73.9  70.5    73.0  71.5  74.6    73.4  74.6  75.1    72.8  73.3  76.0
4-class     68.5  68.6  58.8    63.9  58.9  59.5    64.4  59.8  62.6    65.1  63.6  60.1
Table 2. Detailed results for the best test database PCA classifiers.
             η     ηN     ηD     ηP   ηP+D    T1    T2
AB         98.7   99.0  100.0   95.8  100.0   1.1   0.3
DIST       94.5  100.0   99.5   81.0   97.5   0.8   4.8
MPM        76.0   99.0   22.0   95.0   88.0   4.3  19.8
4-class    68.5   97.5   37.0   72.5   67.0  71.0  29.0
Table 3. PCA exclusion strategy plant change classification results.
% Var. Expl. η p (P only) p (P, P+D)
75 82.8 77.9 90.6
85 80.5 81.3 93.4
95 82.3 55.3 79.5
Par. Anal. 77.5 94.5 97.6
Table 4. True and Indicated classifications using the PCA exclusion
classifiers (parallel analysis case).
Classified as (%)
Actual Dataset N D P Not Classified
N 98.0 0.0 2.0 0.0
D 0.5 94.0 0.0 5.5
P 3.5 19.0 77.5 0.0
P+D 0.0 97.5 2.5 0.0
Table 5. DPCA test results: η values for different variance
thresholds.
Variance Explained
75% 85% 95% PA
AB 96.8 95.8 96.5 88.9
DIST 90.4 88.4 89.1 85.8
MPM 68.6 72.6 74.5 73.0
4-class 61.2 64.6 61.1 65.4
Table 6. Detailed results for the best test database DPCA classifiers.
             η     ηN     ηD     ηP   ηP+D    T1    T2
AB         96.8   96.5   98.5   92.0  100.0   2.5   0.8
DIST       90.4  100.0   87.0   90.5   84.0   7.3   2.4
MPM        74.5  100.0   19.0   86.0   93.0   5.3  20.3
4-class    65.4   84.0   26.0   67.0   84.5  68.6  31.4
Table 7. DPCA exclusion strategy plant change classification
results.
% Var. Expl. η p (P only) p (P, P+D)
75% 82.5 72.7 86.8
85% 83.0 68.0 82.4
95% 84.0 76.0 92.3
PA 76.0 62.8 78.5
Figure 1. Illustration of the exclusion strategy. Each data window is
evaluated by both the normal/abnormal (AB) classifier and the
disturbance (DIST) classifier, and the two verdicts are combined:

                          DIST: No Disturbance       DIST: Disturbance
AB: Normal                Classify as Normal         Not Classified
AB: Abnormal              Classify as Plant Change   Classify as Disturbance
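The decision logic of the exclusion strategy in Figure 1 can be sketched as follows. This is a minimal illustration: the function name and label strings are not from the paper, and treating a conflicting normal-plus-disturbance verdict as "Not Classified" is an assumption based on our reading of the figure. A window flagged as abnormal that matches no known disturbance class is attributed, by exclusion, to a plant change:

```python
def exclusion_classify(ab_label, dist_label):
    """Combine the verdicts of the two classifiers (Figure 1).

    ab_label:   "normal" or "abnormal" (AB classifier verdict)
    dist_label: "disturbance" or "no disturbance" (DIST classifier verdict)
    """
    if ab_label == "normal":
        # A "normal" verdict that coincides with a detected disturbance
        # is contradictory, so the window is left unclassified (assumed).
        if dist_label == "no disturbance":
            return "Normal"
        return "Not Classified"
    # Abnormal behavior with no matching disturbance class is
    # attributed to a plant change by exclusion.
    if dist_label == "disturbance":
        return "Disturbance"
    return "Plant Change"
```

Because the plant change class is reached only by elimination, no explicit training data for plant changes is required by this final classification step.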