Manchester University
Electrical and Electronic Engineering
Control Systems Centre (CSC)
A COMPARATIVE STUDY BETWEEN A DATA BASED MODEL OF LEAST
SQUARES AND PARTIAL LEAST SQUARES ALGORITHMS
Prepared by AWAD R. SHAMEKH
Under supervision of Dr. BARRY LENNOX
10-5-2006
The Presentation Contents
• MODELLING AND IDENTIFICATION
• LEAST SQUARES ALGORITHMS
• MULTIVARIATE STATISTICAL METHODS
• SIMULATION RESULTS AND CONCLUSIONS
• FURTHER WORK
Modelling and Identification
Modelling is a useful way to consolidate information about a system and to explore its characteristics. A model can be constructed either theoretically or by identification.
The steps in identifying a model of a process are as follows:
1. Collection of data
2. Selection of an identification algorithm
3. Selection of a model structure
4. Specification of criteria
Least squares algorithms
Since its invention in 1795 by Gauss, the least squares technique has remained the most popular tool in the identification field.
The reasons for its popularity are that: 1. it does not require high-level mathematical analysis; 2. it is easy to implement; and 3. modifications and extensions have been made to it that make it extremely robust and widely applicable.
Recursive Least Squares algorithm (RLS)
In the recursive computation technique, the current parameters are identified from the previously estimated parameters and the newest data sample, so the memory storage requirement is significantly reduced.
Summary of the recursive least squares
1. It is commonly used in on-line control systems. It can be performed explicitly, as in self-tuning regulators, or implicitly, as in the case of model reference control.
2. It does not require a large memory size.
3. It can provide an overview of the system's behaviour; this advantage arises when the system is subjected to drastic changes in the operating conditions.
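The recursive update summarised above can be sketched as follows. This is a minimal sketch of the standard textbook RLS recursion with a forgetting factor, not the U-D factorised form used in the simulations; the function name `rls_update`, the forgetting factor `lam` and the initial covariance `1e6 * I` are illustrative assumptions that do not come from the slides.

```python
import numpy as np

def rls_update(theta, P, x, y, lam=1.0):
    """One RLS step: refine theta from a single (x, y) sample.

    theta : current parameter estimate, shape (m,)
    P     : current covariance-like matrix, shape (m, m)
    x     : regressor vector for this sample, shape (m,)
    y     : scalar measured output
    lam   : forgetting factor (assumed; 1.0 means no forgetting)
    """
    x = np.asarray(x, dtype=float)
    Px = P @ x
    gain = Px / (lam + x @ Px)               # gain vector
    theta = theta + gain * (y - x @ theta)   # correct with the prediction error
    P = (P - np.outer(gain, Px)) / lam       # only P and theta are stored
    return theta, P

# Usage on noise-free synthetic data (hypothetical parameters)
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
P = 1e6 * np.eye(3)
for _ in range(200):
    x = rng.normal(size=3)
    y = x @ true_theta
    theta, P = rls_update(theta, P, x, y)
```

Only `theta` and `P` are carried from one sample to the next, which is why the memory requirement does not grow with the number of samples.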
Multivariate Statistical Methods
Multivariate statistics is a modern data analysis technique that has been widely used in industry with good results. By using multivariate statistical algorithms, the data can be compressed in a manner that retains the essential information in a small number of factors, which describe how the variables are related to each other. Principal component analysis (PCA) and partial least squares (PLS) are the dominant techniques in multivariate statistics.
Partial Least squares PLS
PLS regression was originated in the social sciences by Herman Wold in 1966 and was then introduced into chemometrics by his son Svante. PLS decomposes the X and Y data into orthogonal sets of scores (T, U), loadings (P, Q) and weights (W, C), which are evaluated to maximize the covariance between the scores of X and Y.
Nonlinear Iterative Partial Least Squares (NIPALS)
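Select the first column of Y as $u_1$ (in the case of a multi-output system), then compute:
\[ w_1 = \frac{X^T u_1}{\|X^T u_1\|}, \qquad t_1 = X w_1 \]
\[ q_1 = \frac{Y^T t_1}{\|Y^T t_1\|}, \qquad u_1 = Y q_1 \]
\[ p_1 = \frac{X^T t_1}{t_1^T t_1} \]
\[ p_{1,new} = \frac{p_{1,old}}{\|p_{1,old}\|}, \qquad t_{1,new} = t_{1,old}\,\|p_{1,old}\|, \qquad w_{1,new} = w_{1,old}\,\|p_{1,old}\| \]
The regression coefficient b for the inner relation is:
\[ b_1 = \frac{u_1^T t_1}{t_1^T t_1} \]
The X and Y block residuals are calculated as follows:
\[ E_1 = X - t_1 p_1^T, \qquad F_1 = Y - b_1 t_1 q_1^T \]
The same procedure should be repeated for all Y columns, which results in the PLS parameter vector
\[ \theta = W (P^T W)^{-1} B Q^T \]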
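The NIPALS steps above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the function name `nipals_pls`, the inner convergence test on the score vector (with `tol` and `max_iter`) and the absence of mean-centring or scaling are illustrative choices, since the slides list the update equations but not the stopping rule or the preprocessing.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS PLS; returns theta such that Y is approximated by X @ theta."""
    E = np.asarray(X, dtype=float).copy()   # X-block residual
    F = np.asarray(Y, dtype=float).copy()   # Y-block residual
    if F.ndim == 1:
        F = F[:, None]
    m, k = E.shape[1], F.shape[1]
    W = np.zeros((m, n_components))
    P = np.zeros((m, n_components))
    Q = np.zeros((k, n_components))
    b = np.zeros(n_components)
    for a in range(n_components):
        u = F[:, 0].copy()                  # start from the first Y column
        t_prev = None
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)          # X weights
            t = E @ w                       # X scores
            q = F.T @ t
            q /= np.linalg.norm(q)          # Y loadings
            u = F @ q                       # Y scores
            if t_prev is not None and \
                    np.linalg.norm(t - t_prev) <= tol * np.linalg.norm(t):
                break                       # assumed stopping rule
            t_prev = t
        p = E.T @ t / (t @ t)               # X loadings
        p_norm = np.linalg.norm(p)
        p, t, w = p / p_norm, t * p_norm, w * p_norm   # rescaling step
        b[a] = (u @ t) / (t @ t)            # inner-relation coefficient
        E = E - np.outer(t, p)              # deflate X block
        F = F - b[a] * np.outer(t, q)       # deflate Y block
        W[:, a], P[:, a], Q[:, a] = w, p, q
    # theta = W (P^T W)^{-1} B Q^T
    return W @ np.linalg.solve(P.T @ W, np.diag(b) @ Q.T)

# Usage on synthetic data (hypothetical): with all latent variables
# retained, the PLS coefficients should coincide with the OLS solution.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(30, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=30)
theta_pls = nipals_pls(X_demo, y_demo, n_components=3)
theta_ols = np.linalg.lstsq(X_demo, y_demo, rcond=None)[0]
```

Retaining fewer latent variables than the number of X columns is what distinguishes the PLS estimate from the OLS one.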
Recursive Partial Least Squares (RPLS)
The recursive PLS algorithm is developed as in recursive least squares, by updating the covariance matrices $(X^T X)$ and $(X^T Y)$. As new data become available, the old data can be exponentially discounted by the forgetting factor, $\lambda$, depending on the rate of data changes.
In this study the modified kernel PLS algorithm is implemented to develop an RPLS model. As introduced by Dayal and MacGregor, the algorithm contains the following steps:
The covariance matrices should be updated as
\[ (X^T X)_t = \lambda_t (X^T X)_{t-1} + x_t^T x_t \]
\[ (X^T Y)_t = \lambda_t (X^T Y)_{t-1} + x_t^T y_t \]
If Y contains more than one variable, then $q_a$ is computed as the eigenvector corresponding to the largest eigenvalue of $(Y^T X X^T Y)_a$, and
\[ w_a = (X^T Y)_a q_a, \qquad w_a = \frac{w_a}{\|w_a\|} \]
so that $w_a$ is the eigenvector corresponding to the largest eigenvalue of $(X^T Y Y^T X)_a$.
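For a = 1:
\[ r_1 = w_1 \]
For a > 1:
\[ r_a = w_a - (p_1^T w_a)\, r_1 - (p_2^T w_a)\, r_2 - \dots - (p_{a-1}^T w_a)\, r_{a-1} \]
\[ t_a = X r_a \]
\[ q_a^T = \frac{r_a^T (X^T Y)_a}{r_a^T (X^T X)\, r_a} \]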
The output covariance matrix is deflated as
\[ (X^T Y)_{a+1} = (X^T Y)_a - p_a q_a^T (t_a^T t_a) \]
The RPLS model coefficients are calculated by
\[ \theta_{RPLS} = W (P^T W)^{-1} Q^T = R Q^T \]
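The recursive kernel procedure can be sketched as follows. This is a sketch under stated assumptions, not the exact implementation used in the study: the function names `update_covariances` and `kernel_pls` are hypothetical, the forgetting factor is held constant, and the X loading is taken as $p_a = (X^T X) r_a / (r_a^T X^T X r_a)$, which follows from $p_a = X^T t_a / (t_a^T t_a)$ with $t_a = X r_a$.

```python
import numpy as np

def update_covariances(XtX, XtY, x, y, lam=1.0):
    """Discount the old covariances by lam and add the new sample (x, y)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))   # row vector, shape (1, m)
    y = np.atleast_2d(np.asarray(y, dtype=float))   # row vector, shape (1, k)
    return lam * XtX + x.T @ x, lam * XtY + x.T @ y

def kernel_pls(XtX, XtY, n_components):
    """Kernel PLS computed from the covariance matrices only."""
    XtY = XtY.copy()
    m, k = XtY.shape
    R = np.zeros((m, n_components))
    P = np.zeros((m, n_components))
    Q = np.zeros((k, n_components))
    for a in range(n_components):
        if k == 1:
            w = XtY[:, 0].copy()
        else:
            # eigh returns eigenvalues in ascending order, so the last
            # eigenvector belongs to the largest eigenvalue of Y^T X X^T Y
            _, vecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ vecs[:, -1]
        w /= np.linalg.norm(w)
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]
        tt = r @ XtX @ r                    # t_a^T t_a without forming t_a
        p = XtX @ r / tt
        q = XtY.T @ r / tt
        XtY = XtY - tt * np.outer(p, q)     # deflate the output covariance
        R[:, a], P[:, a], Q[:, a] = r, p, q
    return R @ Q.T                          # theta = R Q^T

# Usage on synthetic data (hypothetical), one sample at a time
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(50, 4))
B_true = rng.normal(size=(4, 2))
Y_demo = X_demo @ B_true + 0.05 * rng.normal(size=(50, 2))
XtX, XtY = np.zeros((4, 4)), np.zeros((4, 2))
for x_t, y_t in zip(X_demo, Y_demo):
    XtX, XtY = update_covariances(XtX, XtY, x_t, y_t, lam=1.0)
theta_rpls = kernel_pls(XtX, XtY, n_components=4)
theta_ols = np.linalg.lstsq(X_demo, Y_demo, rcond=None)[0]
```

With `lam=1.0` and all latent variables retained, the result should agree with batch OLS; choosing `lam` below 1 discounts old samples.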
Simulation Results
Three cases are undertaken to demonstrate the performance of the four identification algorithms: OLS, PLS, U-D RLS and RPLS.
1. Correlated data
A set of highly correlated variables, denoted by the matrix X, and observations of the vector y are used to test four identification methods: Ordinary Least Squares (OLS), NIPALS Partial Least Squares (PLS), U-D Recursive Least Squares (RLS), and modified kernel Recursive Partial Least Squares (RPLS). The X-data and y-outputs are
\[
X = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} =
\begin{bmatrix}
-1.0 & -1.0 & 0.4958 \\
 0.0 &  1.0 & 1.5034 \\
-1.0 &  1.0 & 3.4944 \\
-1.0 &  0.0 & 1.9985
\end{bmatrix},
\qquad
y = \begin{bmatrix} 1.2417 \\ 2.0067 \\ 5.2389 \\ 3.2475 \end{bmatrix}
\]
Table (4.1.a), parameters of the estimated models from X & y (LV=3)
Model        a1        a2        a3
Actual  -0.5800    0.0000    1.3330
OLS      0.8166   -1.0503    2.0335
PLS      0.8166   -1.0503    2.0335
RLS     -0.7459    0.1273    1.2493
RPLS     0.8166   -1.0503    2.0335

Table (4.1.b), parameters of the estimated models from X & y (LV=2)
Model        a1        a2        a3
Actual  -0.5800    0.0000    1.3330
OLS      0.8166   -1.0503    2.0335
PLS     -0.6287    0.0362    1.3095
RLS     -0.7459    0.1273    1.2493
RPLS    -0.6287    0.0362    1.3095
Table (4.2.a), parameters of the estimated models from X & 0.001y (LV=3)
Model        a1         a2        a3
Actual  -0.5800     0.0000    1.3330
OLS      0.0007   -0.04364    1.6246
PLS      0.0007   -0.04364    1.6246
RLS     -0.7455     0.1277    1.2493
RPLS     0.0007   -0.04364    1.6246

Table (4.2.b), parameters of the estimated models from X & 0.001y (LV=2)
Model        a1         a2        a3
Actual  -0.5800     0.0000    1.3330
OLS      0.0007   -0.04364    1.6246
PLS     -0.6284     0.0366    1.3094
RLS     -0.7455     0.1277    1.2493
RPLS    -0.6284     0.0366    1.3094
2. Artificial data
The following process transfer function has been excited by a GBN (generalized binary noise) signal, and its output is estimated by means of OLS, PLS, U-D RLS and RPLS:
\[ y(t) = G(q)\,u(t) + v(t) = \frac{q^{-1} + 0.5\,q^{-2}}{1 - 1.5\,q^{-1} + 0.7\,q^{-2}}\,u(t) + v(t) \]
\[ v(t) = \frac{1}{1 - 0.9\,q^{-1}}\,e(t) \]
The objective is to identify the ARX model for the considered process, with the structure below, at different signal-to-noise ratios.
\[ y(t) = -a_1\,y(t-1) - a_2\,y(t-2) + b_1\,u(t-1) + b_2\,u(t-2) \]
Remark: In the case of recursive identification, the output error criterion is applied
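Building the ARX regression for this process can be sketched as follows. This is a minimal sketch under assumptions: the helper names `gbn`, `simulate_process` and `arx_regressors`, the switching probability `p_switch` and the sample count are hypothetical, the sign convention $y(t) = -a_1 y(t-1) - a_2 y(t-2) + b_1 u(t-1) + b_2 u(t-2)$ is assumed, and the data are generated without the noise term v(t) so that the batch least-squares estimate can be compared directly with the actual parameters.

```python
import numpy as np

def gbn(n, p_switch=0.2, amplitude=1.0, seed=0):
    """Binary signal that flips sign with probability p_switch per sample."""
    rng = np.random.default_rng(seed)
    flips = rng.random(n) < p_switch
    flips[0] = False
    signs = np.where(np.cumsum(flips) % 2 == 0, 1.0, -1.0)
    return amplitude * signs

def simulate_process(u):
    """Noise-free response of (q^-1 + 0.5 q^-2) / (1 - 1.5 q^-1 + 0.7 q^-2)."""
    y = np.zeros(len(u))
    for t in range(2, len(u)):
        y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2]
    return y

def arx_regressors(u, y):
    """Rows [-y(t-1), -y(t-2), u(t-1), u(t-2)] and targets y(t), t >= 2."""
    Phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
    return Phi, y[2:]

u_demo = gbn(500)
y_demo = simulate_process(u_demo)
Phi, target = arx_regressors(u_demo, y_demo)
# Estimated parameters in the order [a1, a2, b1, b2]
theta_arx = np.linalg.lstsq(Phi, target, rcond=None)[0]
```

The same regressor matrix `Phi` is what each of the four algorithms would receive; adding the coloured noise v(t) to `y_demo` before building the regressors would reproduce the noisy setting.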
Table (4.2.a), the estimated parameters compared with the actual ones at a signal-to-noise ratio of 0.5. All four latent variables (LV=4) are considered in the PLS estimation.
Model        b1        b2        a1        a2
Actual   1.0000    0.5000   -1.5000    0.7000
OLS      0.9884    0.9065   -1.1547    0.3422
PLS      0.9884    0.9065   -1.1547    0.3422
RLS      0.9816    0.4952   -1.5037    0.7050
RPLS     0.9828    0.4920   -1.5041    0.7053
Table (4.2.b), the estimated parameters compared with the actual ones at a signal-to-noise ratio of 0.05. The PLS estimation is performed with LV=4.
Model        b1        b2        a1        a2
Actual   1.0000    0.5000   -1.5000    0.7000
OLS      0.9949    0.6197   -1.4219    0.6259
PLS      0.9949    0.6197   -1.4219    0.6259
RLS      0.9934    0.5001   -1.5010    0.7015
RPLS     0.9937    0.4993   -1.5010    0.7015
3. Non-isothermal Continuous Stirred Tank Reactor (CSTR)
The process is an irreversible reaction A → B carried out in a perfectly mixed CSTR.
[Figure: CSTR schematic with stream labels F_o, C_Ao, T_o, F, C_A, T, F_J, T_Jo]
An ARX model of each output variable is identified individually, where the system is driven by a random walk of the Arrhenius rate constant. The desired model has the following structure:
\[ y(t) = -a_1\,y(t-1) - \dots - a_6\,y(t-6) + b_1\,u_1(t-1) + \dots + b_6\,u_1(t-6) + c_1\,u_2(t-1) + \dots + c_6\,u_2(t-6) \]
The prediction is carried out recursively by U-D RLS and RPLS.
Conclusions
The study reveals some notes about the studied and applied algorithms, which are summarized as follows:
1. The motivation for using the ARX model is its simplicity; to ensure model accuracy, the number of lags should be increased. However, this leads to a very large regression vector, especially for "fat" system data with many variables, as in the case of a distillation column.
2. A system can be identified perfectly with the OLS algorithm, assuming that the variables of the regression matrix are independent. In many situations this condition is not guaranteed, which can lead to an unstable model.
3. As documented in the literature, the importance of PLS appears in heavily multivariable systems.
4. The results show no apparent difference in model accuracy between U-D RLS and RPLS. However, some studies have shown that an RPLS-based model performs better than its competitor, U-D RLS, when used in control design.
Further work
1. Survey and analysis of Generalized Predictive Control (GPC).
2. Design of Dynamic Matrix Control for the CSTR case study.
3. Comparison between U-D RLS and RPLS using the error optimization technique.
Thank you