System Modeling and Identification
Lecture Note #3 (Chap. 6) · 2010. 4. 19.
CHBE 702, Korea University
Prof. Dae Ryook Yang
Chap. 6 Identification of Time-Series Model
• Identification of time series model
  – Model structure
  – Parametric estimation
• Least square method
  – Excellent properties when disturbances are uncorrelated
  – Otherwise, there may be systematic errors and bias
• More sophisticated methods are needed
  – Handling correlated disturbances
  – Extension of linear regression
Model structure
• Time series
$$x_k = [\,x_{k1}\;\; x_{k2}\;\; \cdots\;\; x_{km}\,]^T$$
  – Multivariable time series
  – Multidimensional time series (temporal + spatial)
• Model classification
  – SISO/MIMO
  – Linear/Nonlinear
  – Deterministic/Stochastic
  – For linear discrete-time systems
    • Difference equation and ARMAX models
    • Transfer function model
    • State-space model
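ARMAX Models and Difference Equations
• ARMAX (AutoRegressive Moving Average with eXogenous input) model
$$A(z^{-1})\,y_k = z^{-d}B(z^{-1})\,u_k + C(z^{-1})\,w_k$$
where d is time delay and
$$A(z^{-1}) = 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{n_A} z^{-n_A}$$
$$B(z^{-1}) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{n_B} z^{-n_B}$$
$$C(z^{-1}) = 1 + c_1 z^{-1} + c_2 z^{-2} + \cdots + c_{n_C} z^{-n_C}$$
with unknown parameters
$$\theta = [\,a_1 \cdots a_{n_A}\;\; b_1 \cdots b_{n_B}\;\; c_1 \cdots c_{n_C}\,]^T \quad \text{and} \quad \sigma_w^2$$
• Special cases
$$A(z^{-1})\,y_k = w_k \quad \text{(AR)}$$
$$y_k = C(z^{-1})\,w_k \quad \text{(MA or FIR)}$$
$$A(z^{-1})\,y_k = z^{-d}B(z^{-1})\,u_k + w_k \quad \text{(ARX)}$$
$$A(z^{-1})\,y_k = C(z^{-1})\,w_k \quad \text{(ARMA)}$$
$$A(z^{-1})\,\Delta y_k = B(z^{-1})\,\Delta u_k + C(z^{-1})\,w_k, \quad \Delta = 1 - z^{-1} \quad \text{(ARIMAX or CARIMA)}$$
(In ARIMAX, "I" stands for Integrated; in CARIMA, "C" stands for Controlled.)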
• ARMAX: General model
• ARX: Controlled autoregressive model, and linear regression model when the disturbance is measured
• AR: Models harmonics compounded with noise
• MA: Truncated impulse response (FIR) model
• ARMA: Model-based spectrum analysis
• ARIMAX: For nonzero means and drift, or nonstationary disturbance cases
$$A(z^{-1})\,\Delta y_k = B(z^{-1})\,\Delta u_k + C(z^{-1})\,w_k$$
$$A(z^{-1})\,y_k = B(z^{-1})\,u_k + C(z^{-1})(1 - z^{-1})^{-1} w_k$$
  – The term $(1 - z^{-1})^{-1} w_k$ is integrated white noise: nonstationary (random walk)
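Prediction Error Method (PEM)
– Methods to predict y based on previous data and the identified model
– PEM
$$\min_\theta \sum_{k=1}^{N} E\{(y_k - \hat{y}_{k|k-1})^2\} = \min_\theta \sum_{k=1}^{N} E\{\varepsilon_k^2\}$$
• It minimizes the variance of the error in predicting the output y ahead in time, where the prediction is based on the data available at present.
• Derivation (A, B, C denote $A(z^{-1})$, $B(z^{-1})$, $C(z^{-1})$)
$$y_k = (1 - A)\,y_k + B u_k + C w_k$$
$$y_k = (1 - A)\,y_k + B u_k + (C - 1)\,w_k + w_k$$
$$\;\;\;\, = \underbrace{(1 - A)\,y_k + B u_k + (C - 1)\,C^{-1}(A y_k - B u_k)}_{z_k} + w_k = z_k + w_k$$
  – Since A and C are monic, $z_k$ depends only on data up to time k-1, and $w_k$ is uncorrelated with it. For any predictor $\hat{y}_{k|k-1}$:
$$E\{(y_k - \hat{y}_{k|k-1})^2\} = E\{(z_k + w_k - \hat{y}_{k|k-1})^2\} = E\{(z_k - \hat{y}_{k|k-1})^2\} + E\{w_k^2\} \ge \sigma_w^2$$
  – Minimum attainable variance: $\sigma_w^2$, reached for $\hat{y}_{k|k-1} = z_k$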
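The minimum-variance argument above can be checked numerically. The following is a minimal sketch, assuming a first-order ARMAX system y_k = a y_{k-1} + b u_{k-1} + w_k + c w_{k-1}; the parameter values, the function name `one_step_predictions`, and the white-noise input are illustrative assumptions, not taken from the lecture. The predictor replaces the unknown past noise by the past prediction error.

```python
import numpy as np

def one_step_predictions(y, u, a, b, c):
    """One-step-ahead predictor for y_k = a*y_{k-1} + b*u_{k-1} + w_k + c*w_{k-1}.

    The unknown w_{k-1} is replaced by the previous prediction error eps_{k-1}.
    """
    y_hat = np.zeros_like(y)
    eps = np.zeros_like(y)
    for k in range(1, len(y)):
        y_hat[k] = a * y[k - 1] + b * u[k - 1] + c * eps[k - 1]
        eps[k] = y[k] - y_hat[k]
    return y_hat, eps

# Simulated data (all numbers are assumptions for illustration).
rng = np.random.default_rng(0)
N, a, b, c, sigma_w = 5000, 0.9, 0.1, 0.7, 0.5
u = rng.standard_normal(N)
w = sigma_w * rng.standard_normal(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = a * y[k - 1] + b * u[k - 1] + w[k] + c * w[k - 1]

# Predictor with the true noise model versus one that ignores C (c = 0).
_, eps_true = one_step_predictions(y, u, a, b, c)
_, eps_wrong = one_step_predictions(y, u, a, b, 0.0)
var_true = np.var(eps_true[100:])    # skip the initial transient
var_wrong = np.var(eps_wrong[100:])
print(var_true, var_wrong, sigma_w**2)
```

With the correct noise model the prediction-error variance should settle near sigma_w^2, while ignoring the C polynomial leaves a larger variance.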
Transfer Function Models
• Transfer function models
$$y_k = H_u(z)\,u_k + H_v(z)\,v_k \qquad (E\{v_i v_j^*\} = \Sigma_v \delta_{ij})$$
$$A(z^{-1})\,y_k = \frac{B(z^{-1})}{F(z^{-1})}\,u_k + \frac{C(z^{-1})}{D(z^{-1})}\,w_k$$
$$y_k = \frac{B(z^{-1})}{F(z^{-1})}\,u_k + v_k \quad \text{(Output error (OE) model)}$$
$$y_k = \frac{B(z^{-1})}{F(z^{-1})}\,u_k + \frac{C(z^{-1})}{D(z^{-1})}\,w_k \quad \text{(Box-Jenkins (BJ) model)}$$
  – OE model: No assumption on disturbance sequence $\{v_k\}$
  – BJ model: White noise sequence $\{w_k\}$ filtered by C/D
• Difference between output error and prediction error
$$\hat{y}_k = a\,\hat{y}_{k-1} + b\,u_{k-1} \quad \text{(Output error method)}$$
$$\hat{y}_k = a\,y_{k-1} + b\,u_{k-1} \quad \text{(Prediction error method)}$$
  – Output error relies more on the accuracy of future output modeling.
  – Prediction error uses actual output.
  – Output error identification is a nonlinear estimation problem.
  – Prediction error identification is a linear estimation problem.
• Algorithm for OE identification
  1. Least square identification to find initial estimates of F and B.
$$M:\; F(z^{-1})\,y_k = B(z^{-1})\,u_k + v_k$$
  2. Filter the data according to (prewhitening filter)
$$y_k^F = y_k / \hat{F}(z^{-1}), \qquad u_k^F = u_k / \hat{F}(z^{-1})$$
  3. Subsequent estimation of F and B from the model
$$M^F:\; F(z^{-1})\,y_k^F = B(z^{-1})\,u_k^F + v_k$$
  4. Repeat 2-3 until the estimate converges.
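The four-step OE algorithm can be sketched for a first-order model. This is a hedged illustration, assuming F(z^-1) = 1 + f z^-1 and B(z^-1) = b z^-1; the helper names (`filter_by_inverse_F`, `ls_first_order`, `oe_identify`), the fixed iteration count, and the simulated data are assumptions for the example, not part of the lecture.

```python
import numpy as np

def filter_by_inverse_F(x, f):
    """Compute x^F = x / F(z^-1) with F = 1 + f z^-1, i.e. xF_k = x_k - f * xF_{k-1}."""
    xf = np.zeros_like(x)
    xf[0] = x[0]
    for k in range(1, len(x)):
        xf[k] = x[k] - f * xf[k - 1]
    return xf

def ls_first_order(y, u):
    """Least squares for F y_k = B u_k + v_k with F = 1 + f z^-1, B = b z^-1."""
    Phi = np.column_stack([-y[:-1], u[:-1]])
    theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
    return theta  # [f, b]

def oe_identify(y, u, n_iter=10):
    f, b = ls_first_order(y, u)              # step 1: initial LS estimate
    for _ in range(n_iter):                  # step 4: repeat (fixed count here)
        yF = filter_by_inverse_F(y, f)       # step 2: filter the data by 1/F
        uF = filter_by_inverse_F(u, f)
        f, b = ls_first_order(yF, uF)        # step 3: re-estimate F and B
    return f, b

# Simulated OE data: x_k = 0.8 x_{k-1} + u_{k-1}, y_k = x_k + v_k (assumed values).
rng = np.random.default_rng(1)
N, f_true, b_true = 5000, -0.8, 1.0
u = rng.standard_normal(N)
x = np.zeros(N)
for k in range(1, N):
    x[k] = -f_true * x[k - 1] + b_true * u[k - 1]
y = x + 0.5 * rng.standard_normal(N)

f_ls, b_ls = ls_first_order(y, u)
f_oe, b_oe = oe_identify(y, u)
print("LS:", f_ls, b_ls, " OE:", f_oe, b_oe)
```

Plain least squares treats the filtered output noise as white and is therefore biased here; the iterated filtering is expected to move the estimate of f toward the true value.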
• Comparison of error models for identification
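Maximum-Likelihood Method
• Select the estimate so that the observation Y is most probable.
$$\hat{\theta} = \arg\max_\theta\, p(Y|\theta), \qquad L(\theta) = p(Y|\theta) \quad \text{(Likelihood function)}$$
• Example 6.4
  – Assume that
$$Y = \Phi\theta + v \qquad (E\{v\} = 0 \text{ and } E\{vv^T\} = \Sigma_v)$$
  – Likelihood function:
$$p(v) = ((2\pi)^N \det \Sigma_v)^{-1/2} \exp(-0.5\, v^T \Sigma_v^{-1} v)$$
$$p(Y|\theta) = ((2\pi)^N \det \Sigma_v)^{-1/2} \exp(-0.5\,(Y - \Phi\theta)^T \Sigma_v^{-1} (Y - \Phi\theta))$$
$$\log L(\theta) = \log p(Y|\theta) = -0.5 \log((2\pi)^N \det \Sigma_v) - 0.5\,(Y - \Phi\theta)^T \Sigma_v^{-1} (Y - \Phi\theta)$$
  – If the model is linear in parameters with normally distributed white noise, the maximum likelihood estimate is the same as the Markov estimate.
• Cramer-Rao lower bound
$$\mathrm{Cov}(\hat{\theta}) \ge \left( E\left\{ \left(\frac{\partial \log L}{\partial \theta}\right)^T \left(\frac{\partial \log L}{\partial \theta}\right) \right\} \right)^{-1} = -\left( E\left\{ \frac{\partial^2 \log L}{\partial \theta^2} \right\} \right)^{-1}$$
  – The matrix inside the inverse is the Fisher information matrix.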
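As a short worked step for the linear Gaussian case of Example 6.4 (model $Y = \Phi\theta + v$ with known covariance $\Sigma_v$), setting the gradient of the log-likelihood to zero gives the Markov (weighted least squares) estimate, and the Fisher information matrix follows from the second derivative. This is a sketch of the standard argument, written with the column-gradient convention:

```latex
\begin{aligned}
\frac{\partial \log L}{\partial \theta}
  &= \Phi^T \Sigma_v^{-1} (Y - \Phi\theta) = 0
  \;\Longrightarrow\;
  \hat{\theta} = (\Phi^T \Sigma_v^{-1} \Phi)^{-1} \Phi^T \Sigma_v^{-1} Y \\
-E\left\{ \frac{\partial^2 \log L}{\partial \theta^2} \right\}
  &= \Phi^T \Sigma_v^{-1} \Phi
  \;\Longrightarrow\;
  \mathrm{Cov}(\hat{\theta}) \ge (\Phi^T \Sigma_v^{-1} \Phi)^{-1}
\end{aligned}
```

Since $\mathrm{Cov}(\hat{\theta}) = (\Phi^T \Sigma_v^{-1} \Phi)^{-1}$ for this estimate, the bound is attained in this case.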
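• For ARMAX model
$$A(z^{-1})\,y_k = z^{-d}B(z^{-1})\,u_k + C(z^{-1})\,w_k$$
$$\log L(\theta) = -0.5 \log((2\pi)^N \det \Sigma_v) - 0.5\,\varepsilon(\theta)^T \Sigma_v^{-1} \varepsilon(\theta)$$
where
$$\varepsilon_k(\theta) = y_k - \varphi_k^T \theta$$
$$y_k = -a_1 y_{k-1} - \cdots - a_{n_A} y_{k-n_A} + b_1 u_{k-d-1} + \cdots + b_{n_B} u_{k-d-n_B} + c_1 v_{k-1} + \cdots + c_{n_C} v_{k-n_C} + v_k = \varphi_k^T \theta + v_k$$
$$\varphi_k = [\,-y_{k-1} \cdots -y_{k-n_A}\;\; u_{k-d-1} \cdots u_{k-d-n_B}\;\; v_{k-1} \cdots v_{k-n_C}\,]^T$$
$$\theta = [\,a_1 \cdots a_{n_A}\;\; b_1 \cdots b_{n_B}\;\; c_1 \cdots c_{n_C}\,]^T$$
  (the noise terms $v_{k-1}, \ldots, v_{k-n_C}$ in $\varphi_k$ are unknown)
  – The empirical likelihood function when $\Sigma_v = \sigma_v^2 I$ ($\sigma_v^2$ unknown)
$$\log L(\theta, \sigma_v^2) = -(N/2)\log(2\pi) - (N/2)\log(\sigma_v^2) - (1/2\sigma_v^2)\sum_{k=1}^{N} \varepsilon_k^2(\theta)$$
$$\qquad\qquad\;\; = -(N/2)\log(2\pi) - (N/2)\log(\sigma_v^2) - (1/\sigma_v^2)\,V_N(\theta), \qquad V_N(\theta) = \tfrac{1}{2}\sum_{k=1}^{N} \varepsilon_k^2(\theta)$$
$$\frac{\partial}{\partial\theta} \log L(\theta, \sigma_v^2) = -(1/\sigma_v^2)\,\frac{\partial V_N(\theta)}{\partial\theta} = 0 \;\Rightarrow\; \frac{\partial V_N(\theta)}{\partial\theta} = 0$$
$$\frac{\partial}{\partial\sigma_v^2} \log L(\theta, \sigma_v^2) = (1/\sigma_v^4)\,V_N(\theta) - (N/2\sigma_v^2) = 0 \;\Rightarrow\; \hat{\sigma}_v^2 = (2/N)\,V_N(\hat{\theta})$$
$$\hat{\theta}^{(i+1)} = \hat{\theta}^{(i)} - \left(V_N''(\hat{\theta}^{(i)})\right)^{-1} V_N'(\hat{\theta}^{(i)}) \quad \text{(Newton-Raphson method)}$$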
• Example 6.5 LS and ML identification
$$S:\; y_k = a\,y_{k-1} + b\,u_{k-1} + w_k + c\,w_{k-1}$$
  [Figure annotation: Local minimum]
  – For colored noise, ML identification performs better than LS.
  – LS can estimate only a and b.
• Example 6.6 Pseudolinear regression
$$S:\; A(z^{-1})\,y_k = B(z^{-1})\,u_k + C(z^{-1})\,w_k$$
  – Estimate high-order polynomials A and B by least squares.
  – The computed residual sequence $\{\varepsilon_k\}$ yields a good approximation of the white noise sequence $\{w_k\}$.
  – Extend the regressor with $\{\varepsilon_k\}$ and then estimate the polynomials A, B and C using least squares identification.
  – It is also called two-step linear regression.
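The two-step linear regression of Example 6.6 can be sketched as follows. The system y_k = a y_{k-1} + b u_{k-1} + w_k + c w_{k-1}, its parameter values, and the high-order ARX order of 10 are assumptions chosen for illustration.

```python
import numpy as np

# Simulated first-order ARMAX data (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
N, a, b, c, sigma_w = 5000, 0.7, 1.0, 0.5, 0.5
u = rng.standard_normal(N)
w = sigma_w * rng.standard_normal(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = a * y[k - 1] + b * u[k - 1] + w[k] + c * w[k - 1]

# Step 1: high-order ARX model by least squares; its residuals approximate w_k.
order = 10
Phi1 = np.array([
    np.concatenate([y[k - order:k][::-1], u[k - order:k][::-1]])
    for k in range(order, N)
])
theta1, *_ = np.linalg.lstsq(Phi1, y[order:], rcond=None)
eps = np.zeros(N)
eps[order:] = y[order:] - Phi1 @ theta1

# Step 2: extend the regressor with the residuals and estimate a, b, c.
Phi2 = np.column_stack([y[order:-1], u[order:-1], eps[order:-1]])
theta2, *_ = np.linalg.lstsq(Phi2, y[order + 1:], rcond=None)
a_hat, b_hat, c_hat = theta2
print(a_hat, b_hat, c_hat)
```

Both steps are ordinary least-squares problems, which is why the method avoids the nonlinear optimization needed for ML.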
Kalman Filter
• State-space model
$$x_{k+1} = \Phi x_k + \Gamma u_k + v_k$$
$$y_k = C x_k + D u_k + e_k$$
$$E\{v_k\} = 0,\quad E\{e_k\} = 0,\quad E\{vv^T\} = R_1,\quad E\{ee^T\} = R_2,\quad P(0) = E\{x_0 x_0^T\} = R_0$$
• Optimal estimate of $x_k$ based on the input-output data
$$\min J(\hat{x}_{k+1|k}) = E\{\|x_{k+1} - \hat{x}_{k+1|k}\|^2\}$$
• Kalman filter (Kalman-Bucy filter)
  – The Kalman filter minimizes the above criterion when $v_k$ and $e_k$ are independent and normally distributed.
$$\hat{x}_{k+1|k} = \Phi \hat{x}_{k|k-1} + \Gamma u_k + K_k (y_k - C \hat{x}_{k|k-1} - D u_k)$$
$$K_k = \Phi P_k C^T (R_2 + C P_k C^T)^{-1}$$
$$P_{k+1} = \Phi P_k \Phi^T + R_1 - \Phi P_k C^T (R_2 + C P_k C^T)^{-1} C P_k \Phi^T$$
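A minimal sketch of the one-step Kalman predictor recursion (gain, state update, covariance update) is given below. The function name `kalman_predictor`, the scalar test system, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def kalman_predictor(y, u, Phi, Gamma, C, D, R1, R2, P0, x0):
    """One-step-ahead Kalman predictor.

    Returns x_pred, where x_pred[k] is the estimate of x_k given data up to k-1,
    and the final covariance P.
    """
    x_hat = x0.copy()
    P = P0.copy()
    x_pred = np.zeros((len(y) + 1, Phi.shape[0]))
    x_pred[0] = x_hat
    for k in range(len(y)):
        S = R2 + C @ P @ C.T                      # innovation covariance
        K = Phi @ P @ C.T @ np.linalg.inv(S)      # Kalman gain K_k
        innov = y[k] - C @ x_hat - D @ u[k]
        x_hat = Phi @ x_hat + Gamma @ u[k] + K @ innov
        P = Phi @ P @ Phi.T + R1 - K @ S @ K.T    # Riccati recursion
        x_pred[k + 1] = x_hat
    return x_pred, P

# Scalar example system (assumed values).
rng = np.random.default_rng(0)
Phi = np.array([[0.9]]); Gamma = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
R1 = np.array([[0.1]]); R2 = np.array([[1.0]])
N = 5000
u = rng.standard_normal((N, 1))
x = np.zeros((N + 1, 1))
y = np.zeros((N, 1))
for k in range(N):
    y[k] = C @ x[k] + D @ u[k] + np.sqrt(R2[0, 0]) * rng.standard_normal(1)
    x[k + 1] = Phi @ x[k] + Gamma @ u[k] + np.sqrt(R1[0, 0]) * rng.standard_normal(1)

x_pred, P = kalman_predictor(y, u, Phi, Gamma, C, D, R1, R2, np.eye(1), np.zeros(1))
mse = np.mean((x[:N] - x_pred[:N]) ** 2)
print("steady-state P:", P[0, 0], " empirical MSE:", mse)
```

The term `K @ S @ K.T` equals the last term of the covariance update written with the gain substituted, so the recursion matches the Riccati form; the empirical mean square prediction error should be close to the converged P.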
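• Derivation
  – The prediction error
$$\tilde{x}_{k+1} = x_{k+1} - \hat{x}_{k+1|k}$$
  – The prediction error dynamics
$$\tilde{x}_{k+1} = (\Phi - K_k C)\,\tilde{x}_k + v_k - K_k e_k$$
  – The mean prediction error
$$E\{\tilde{x}_{k+1}\} = (\Phi - K_k C)\,E\{\tilde{x}_k\}$$
  – The mean square prediction error
$$E\{\tilde{x}_{k+1}\tilde{x}_{k+1}^T\} = E\{[(\Phi - K_k C)\tilde{x}_k + v_k - K_k e_k][(\Phi - K_k C)\tilde{x}_k + v_k - K_k e_k]^T\}$$
$$\qquad = (\Phi - K_k C)\,E\{\tilde{x}_k\tilde{x}_k^T\}\,(\Phi - K_k C)^T + \Sigma_v + K_k \Sigma_e K_k^T$$
  – Let $P_k = E\{\tilde{x}_k\tilde{x}_k^T\}$ and $Q_k = \Sigma_e + C P_k C^T$ (with $\Sigma_v = R_1$, $\Sigma_e = R_2$)
$$P_{k+1} = \Phi P_k \Phi^T - K_k C P_k \Phi^T - \Phi P_k C^T K_k^T + \Sigma_v + K_k Q_k K_k^T$$
$$\qquad = \Phi P_k \Phi^T + \Sigma_v - \Phi P_k C^T Q_k^{-1} C P_k \Phi^T + (K_k - \Phi P_k C^T Q_k^{-1})\,Q_k\,(K_k - \Phi P_k C^T Q_k^{-1})^T$$
  – Minimization of $P_{k+1}$ gives
$$K_k = \Phi P_k C^T (\Sigma_e + C P_k C^T)^{-1}$$
$$P_{k+1} = \Phi P_k \Phi^T + \Sigma_v - \Phi P_k C^T (\Sigma_e + C P_k C^T)^{-1} C P_k \Phi^T \quad \text{(Riccati equation)}$$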
• Cases for time-varying parameters
$$\theta_{k+1} = \theta_k + v_k, \qquad y_k = \varphi_k^T \theta_k + e_k, \qquad E\{v_k\} = 0,\; E\{e_k\} = 0$$
$$\hat{\theta}_{k+1} = \hat{\theta}_k + K_k (y_k - \varphi_k^T \hat{\theta}_k)$$
$$K_k = P_k \varphi_k (R_2 + \varphi_k^T P_k \varphi_k)^{-1}$$
$$P_{k+1} = P_k + R_1 - P_k \varphi_k (R_2 + \varphi_k^T P_k \varphi_k)^{-1} \varphi_k^T P_k$$
  – Excellent for time-varying systems
  – $R_1$ and $R_2$ are important design parameters that should match the temporal variations of $\theta_k$ and the observation noise, respectively.
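The parameter-tracking recursion can be sketched for a scalar output as follows. The function name `track_parameters`, the step change in the first parameter, and the choices of R1 and R2 are illustrative assumptions.

```python
import numpy as np

def track_parameters(y, phi, R1, R2, P0, theta0):
    """Kalman-filter tracking of theta_k in y_k = phi_k^T theta_k + e_k (scalar y)."""
    theta = theta0.copy()
    P = P0.copy()
    history = np.zeros((len(y), len(theta0)))
    for k in range(len(y)):
        ph = phi[k]
        s = R2 + ph @ P @ ph                     # scalar R2 + phi^T P phi
        K = P @ ph / s                           # gain K_k
        theta = theta + K * (y[k] - ph @ theta)  # parameter update
        P = P + R1 - np.outer(K, K) * s          # equals P + R1 - P phi s^-1 phi^T P
        history[k] = theta
    return history

# Simulated data: the first parameter jumps from 1.0 to 2.0 halfway (assumed scenario).
rng = np.random.default_rng(0)
N = 1000
phi = rng.standard_normal((N, 2))
theta_true = np.tile([1.0, 0.5], (N, 1))
theta_true[N // 2:, 0] = 2.0
y = np.sum(phi * theta_true, axis=1) + 0.1 * rng.standard_normal(N)

hist = track_parameters(y, phi, R1=1e-3 * np.eye(2), R2=0.01,
                        P0=np.eye(2), theta0=np.zeros(2))
print("before jump:", hist[N // 2 - 1], " at end:", hist[-1])
```

A larger R1 lets the estimate follow parameter changes faster at the cost of noisier estimates, which is the design trade-off mentioned above.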
Instrumental Variable Method
• Correlation between the regressors and the prediction error leads to bias of the parameter estimates obtained from least-square solutions to the linear regression problem.
• Replace the regressor by some other variable Z: IV method
$$\hat{\theta}^z = (Z^T \Phi)^{-1} Z^T Y$$
  – In order to make the estimator consistent
$$E\{Z^T v\} = 0, \qquad \mathrm{rank}(Z^T \Phi) = p$$
$$\mathrm{Cov}(\hat{\theta}^z) = E\{(\hat{\theta}^z - \theta)(\hat{\theta}^z - \theta)^T\} = (Z^T \Phi)^{-1} Z^T \Sigma_v Z\, (\Phi^T Z)^{-1}$$
  – The instrumental variables should be chosen so that they are simultaneously uncorrelated with v and highly correlated with $\Phi$.
• Example 6.8
$$S:\; y_k = 0.9\,y_{k-1} + 0.1\,u_{k-1} + w_k + 0.7\,w_{k-1} \qquad (E\{w_k\} = 0,\; E\{w_k^2\} = \sigma_w^2)$$
  – Biased least-square estimate of parameters
$$[\hat{a}\;\; \hat{b}]^T = (\Phi^T \Phi)^{-1} \Phi^T Y = [0.957\;\; 0.047]^T, \qquad \Phi = \begin{bmatrix} y_1 & u_1 \\ \vdots & \vdots \\ y_{N-1} & u_{N-1} \end{bmatrix}$$
  – Instrumental variable
$$z_k = \hat{a}\,z_{k-1} + \hat{b}\,u_{k-1}, \qquad Z = \begin{bmatrix} z_1 & u_1 \\ \vdots & \vdots \\ z_{N-1} & u_{N-1} \end{bmatrix}$$
$$\hat{\theta}^z = (Z^T \Phi)^{-1} Z^T Y = [0.918\;\; 0.075]^T$$
  – Shows reduced bias
• Example 6.9
  – For a choice of IV
$$Z = \begin{bmatrix} u_0 & u_1 \\ \vdots & \vdots \\ u_{N-2} & u_{N-1} \end{bmatrix}$$
$$\hat{\theta}^z = (Z^T \Phi)^{-1} Z^T Y = [0.413\;\; 0.047]^T$$
  – Gives a very poor estimate.
  – It might be difficult to choose appropriate instrumental variables.
  – Thus, an iterative procedure is usually used.
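The construction of Example 6.8 can be sketched numerically. The system coefficients follow the example, but the noise level sigma_w = 0.1, the white-noise input, the sample size, and the seed are assumptions, so the numbers printed will not reproduce the values quoted on the slide.

```python
import numpy as np

# System of Example 6.8; sigma_w, input, N and seed are assumptions.
rng = np.random.default_rng(0)
N, a, b, c, sigma_w = 20000, 0.9, 0.1, 0.7, 0.1
u = rng.standard_normal(N)
w = sigma_w * rng.standard_normal(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = a * y[k - 1] + b * u[k - 1] + w[k] + c * w[k - 1]

# Least squares: regressors [y_{k-1}, u_{k-1}] are correlated with the colored noise.
Phi = np.column_stack([y[:-1], u[:-1]])
Y = y[1:]
theta_ls = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)

# Instruments: noise-free model output z_k driven by u, using the LS estimates.
a_ls, b_ls = theta_ls
z = np.zeros(N)
for k in range(1, N):
    z[k] = a_ls * z[k - 1] + b_ls * u[k - 1]
Z = np.column_stack([z[:-1], u[:-1]])

# IV estimate: theta = (Z^T Phi)^{-1} Z^T Y.
theta_iv = np.linalg.solve(Z.T @ Phi, Z.T @ Y)
print("LS:", theta_ls, " IV:", theta_iv)
```

Because z_k depends only on the input, it is uncorrelated with the noise while staying correlated with y_{k-1}, which is the property required of an instrument.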
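• Example 6.10 (The Yule-Walker equations)
  – Consider the AR process
$$S:\; A(z^{-1})\,y_k = w_k; \qquad (E\{w_k\} = 0,\; E\{w_k^2\} = \sigma_w^2)$$
$$C_{yy}(\tau) = E\{y_k y_{k-\tau}^T\} = E\Big\{\Big(-\sum_{i=1}^{n_A} a_i y_{k-i} + w_k\Big) y_{k-\tau}^T\Big\} = -\sum_{i=1}^{n_A} a_i E\{y_{k-i} y_{k-\tau}^T\} + E\{w_k y_{k-\tau}^T\}$$
$$C_{yy}(\tau) = \begin{cases} -\sum_{i=1}^{n_A} a_i C_{yy}(\tau - i) + \sigma_w^2, & \tau = 0 \\ -\sum_{i=1}^{n_A} a_i C_{yy}(\tau - i), & \tau > 0 \end{cases}$$
  – Choosing numbers $M > n_A$ and $p > n_A$, with the regressor
$$\varphi_k = [\,-y_{k-1}\; \cdots\; -y_{k-n_A}\,]^T$$
    and an instrument vector $z_k$ built from delayed outputs, the covariance relations stack into
$$\underbrace{-\begin{bmatrix} C_{yy}(\tau-1) & C_{yy}(\tau-2) & \cdots & C_{yy}(\tau-n_A) \\ C_{yy}(\tau) & C_{yy}(\tau-1) & \cdots & C_{yy}(\tau+1-n_A) \\ \vdots & \vdots & & \vdots \\ C_{yy}(\tau+p-1) & C_{yy}(\tau+p-2) & \cdots & C_{yy}(\tau+p-n_A) \end{bmatrix}}_{(Z^T \Phi)} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_{n_A} \end{bmatrix} = \underbrace{\begin{bmatrix} C_{yy}(\tau) \\ C_{yy}(\tau+1) \\ \vdots \\ C_{yy}(\tau+p) \end{bmatrix}}_{(Z^T Y)}$$
$$\hat{\theta}^z = (Z^T \Phi)^{-1} Z^T Y$$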
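A minimal sketch of Yule-Walker estimation for an AR process follows. It uses the basic square system built from lags 1 to n_A (not the larger instrument set with M and p), and the AR(2) coefficients, sample size, and function names are assumptions for illustration. The sign convention is A(z^-1) = 1 + a_1 z^-1 + a_2 z^-2.

```python
import numpy as np

def autocov(y, lag):
    """Sample covariance C_yy(lag) of a (mean-removed) sequence."""
    y = y - y.mean()
    return np.dot(y[lag:], y[:len(y) - lag]) / len(y)

def yule_walker(y, n_a):
    """Solve C(tau) = -sum_i a_i C(tau - i), tau = 1..n_a, for a_1..a_{n_a}."""
    C = np.array([autocov(y, t) for t in range(n_a + 1)])
    R = np.array([[C[abs(t - i)] for i in range(1, n_a + 1)]
                  for t in range(1, n_a + 1)])
    a = np.linalg.solve(R, -C[1:])
    sigma_w2 = C[0] + a @ C[1:]      # from the tau = 0 equation
    return a, sigma_w2

# Simulated AR(2) process: y_k = 1.5 y_{k-1} - 0.7 y_{k-2} + w_k (assumed values).
rng = np.random.default_rng(0)
N = 20000
a_true = np.array([-1.5, 0.7])
w = rng.standard_normal(N)
y = np.zeros(N)
for k in range(2, N):
    y[k] = -a_true[0] * y[k - 1] - a_true[1] * y[k - 2] + w[k]

a_hat, sigma_w2_hat = yule_walker(y, 2)
print(a_hat, sigma_w2_hat)
```

The delayed outputs act as instruments here: they are correlated with the regressor but uncorrelated with the current white noise w_k, so no noise model is needed.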
Some Aspects of Application
• Prefiltering, smoothing, prewhitening
$$Y^f(z) = F(z^{-1})\,Y(z), \qquad U^f(z) = F(z^{-1})\,U(z)$$
$$M:\; A(z^{-1})\,y_k^f = B(z^{-1})\,u_k^f + (v_0 + w_k)$$
  – For periodic variation, use $F(z^{-1}) = 1 - z^{-d}$, where d is the period of the trend.
• Bias reduction
  – Trend elimination
$$\bar{y} = \sum_{k=1}^{N} y_k / N, \qquad \bar{u} = \sum_{k=1}^{N} u_k / N$$
$$M:\; A(z^{-1})(y_k - \bar{y}) = B(z^{-1})(u_k - \bar{u}) + v_k$$
  – Differencing of data
$$M:\; A(z^{-1})\,\Delta y_k = B(z^{-1})\,\Delta u_k + \Delta v_k$$
    It introduces new noise correlation.
  – Offset estimation via an extra parameter
$$M:\; A(z^{-1})\,y_k = B(z^{-1})\,u_k + (v_0 + w_k)$$
    Here $v_0$ is the extra parameter. It gives improved accuracy.
Convergence and Consistency
• Convergence in $L_p$, $0 < p < \infty$
$$\lim_{k\to\infty} E\{|x_k - x|^p\} = 0$$
• Convergence almost surely
$$\lim_{n\to\infty} P\{|x_k - x| < \varepsilon,\; k \ge n,\; \varepsilon > 0\} = 1$$
• Convergence in probability
$$\lim_{k\to\infty} P\{|x_k - x| > \varepsilon,\; \varepsilon > 0\} = 0$$
• Central limit theorem
  – Let $\{x_k\}$ be a sequence of independent random variables with common distribution function F with finite mean $\mu$ and variance $\sigma^2$.
  – $X_N$ has a limiting normal distribution with mean 0 and variance 1 as $N \to \infty$.
$$\text{If } S_N = \sum_{k=1}^{N} x_k, \text{ then } X_N = \frac{S_N - N\mu}{\sigma\sqrt{N}} \xrightarrow{\text{dist}} \text{Normal}(0, 1)$$
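• Efficient estimate
$$E\{(\hat{\theta} - \theta)^2\} \le E\{(\tilde{\theta} - \theta)^2\} \quad \text{for any other estimate } \tilde{\theta}$$
• Consistent estimate
$$\lim_{N\to\infty} E\{(\hat{\theta}_N - \theta)^2\} = 0$$
$$\lim_{N\to\infty} P\{|\hat{\theta}_N - \theta| > \varepsilon,\; \varepsilon > 0\} = 0 \;\Leftrightarrow\; \operatorname{plim}_{N\to\infty} \hat{\theta}_N = \theta \quad \text{(Probability limit)}$$
• Unbiased and asymptotically unbiased estimates
$$E\{\hat{\theta}\} = \theta \quad \text{(Unbiased estimate)}$$
$$\lim_{N\to\infty} E\{\hat{\theta}_N\} = \theta \quad \text{(Asymptotically unbiased estimate)}$$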