Data Reconciliation & Gross Error Detection: An Intelligent Use of Process Data

Shankar Narasimhan and Cornelius Jordache

Gulf Publishing Company, Houston, Texas




Copyright © 2000 by Gulf Publishing Company, Houston, Texas. All rights reserved. This book, or parts thereof, may not be reproduced in any form without express written permission of the publisher.

Gulf Publishing Company, Book Division, P.O. Box 2608, Houston, Texas 77252-2608

To our guru Professor Richard S. H. Mah, who played the roles of an initiator and a catalyst.

"Since all measurements and observations are nothing more than approximations to the truth, the same must be true of all calculations resting upon them, and the highest aim of all computations made concerning concrete phenomena must be to approximate, as nearly as practicable, to the truth. But this can be accomplished in no other way than by a suitable combination of more observations than the number absolutely requisite for the determination of the unknown quantities."


Contents

Acknowledgments, xiii

Preface, xv

Chapter 1: The Importance of Data Reconciliation and Gross Error Detection, 1
    Process Data Conditioning Methods, 1
    Industrial Examples of Steady-State Data Reconciliation, 5
        Crude Split Optimization in a Preheat Train of a Refinery, 5
        Minimizing Water Consumption in Mineral Beneficiation Circuits, 7
    Data Reconciliation Problem Formulation
    Examples of Simple Reconciliation Problems, 11
        Systems With All Measured Variables, 11
        Systems With Unmeasured Variables, 14
        Systems Containing Gross Errors, 17
    Benefits from Data Reconciliation and Gross Error Detection
    A Brief History of Data Reconciliation and Gross Error Detection, 21
    Scope and Organization of the Book
    Summary
    References

Chapter 2: Measurement Errors and Error Reduction Techniques, 32
    Classification of Measurement Errors, 32
        Random Errors, 32
        Gross Errors, 37
    Error Reduction Methods, 38
        Exponential Filters, 40
        Moving Average Filters, 48
        Polynomial Filters
        Hybrid Filters, 54
    Summary, 56
    References, 57

Chapter 3: Linear Steady-State Data Reconciliation, 59
    Linear Systems With All Variables Measured, 59
        General Formulation and Solution, 59
        Statistical Basis of Data Reconciliation, 61
    Linear Systems With Both Measured and Unmeasured Variables, 63
        The Construction of a Projection Matrix, 66
        Observability and Redundancy, 69
        Matrix Decomposition Methods, 70
        Graph Theoretic Method, 72
        Other Classification Methods, 77
    Estimating Measurement Error Covariance Matrix, 77
    Simulation Technique for Evaluating Data Reconciliation
    Summary
    References

Chapter 4: Steady-State Data Reconciliation for Bilinear Systems, 83

Chapter 5: Nonlinear Steady-State Data Reconciliation
    Solution Techniques for Equality Constrained Problems, 122
        Methods Using Lagrange Multipliers, 122
        Method of Successive Linear Data Reconciliation, 124
    Nonlinear Programming (NLP) Methods for Inequality Constrained Reconciliation Problems, 128
        Sequential Quadratic Programming (SQP), 129
        Generalized Reduced Gradient (GRG), 132
    Variable Classification for Nonlinear Data Reconciliation, 134
    Comparison of Nonlinear Optimization Strategies for Data Reconciliation, 136
    Summary, 138
    References, 138

Chapter 6: Data Reconciliation in Dynamic Systems, 142
    The Need for Dynamic Data Reconciliation, 142
    Linear Discrete Dynamic System Model, 143
    Optimal State Estimation Using Kalman Filter, 148
        Analogy Between Kalman Filtering and Steady-State Data Reconciliation, 153
        Optimal Control and Kalman Filtering, 155
        Kalman Filter Implementation, 157
    Dynamic Data Reconciliation of Nonlinear Systems, 160
        Nonlinear State Estimation, 160
        Nonlinear Data Reconciliation Methods, 161
    Summary, 171
    References, 171

Chapter 7: Introduction to Gross Error Detection, 173
    Problem Statement, 174
    Basic Statistical Tests for Gross Error Detection, 175
        The Global Test (GT), 178
        The Constraint or Nodal Test, 180
        The Measurement Test (MT), 181
        The Generalized Likelihood Ratio (GLR) Test
        Comparison of the Power of Basic Gross Error Detection Tests, 190
    Gross Error Detection Using Principal Component (PC) Tests, 195
        Principal Component Tests for Residuals of Process Constraints, 196
        Principal Component Tests on Measurement Adjustments, 197
        Relationship Between Principal Component Tests and Other Statistical Tests, 198
    Statistical Tests for General Steady-State Models, 200
    Techniques for Single Gross Error Identification, 203
        Serial Elimination Strategy for Identifying a Single Gross Error, 204
        Identifying a Single Gross Error by Principal Component Tests, 207
    Detectability and Identifiability of Gross Errors, 209
        Detectability of Gross Errors, 210
        Identifiability of Gross Errors, 214
    Proposed Problems, 217
    Summary, 223
    References, 224

Chapter 8: Multiple Gross Error Identification Strategies for Steady-State Processes, 226
    Strategies for Multiple Gross Error Identification in Linear Processes
        Simultaneous Strategies
        Serial Strategies
        Combinatorial Strategies
    Performance Measures for Evaluating Gross Error Identification Strategies
    Comparison of Multiple Gross Error Identification Strategies
    Gross Error Identification in Nonlinear Processes Using the GLR Method
    Proposed Problems
    Summary
    References

Chapter 9: Gross Error Detection in Dynamic Systems
    Problem Formulation for Detection of Measurement Biases, 282
    Statistical Properties of Innovations and the Global Test
    Generalized Likelihood Ratio Method, 289
    Fault Diagnosis Techniques
    The State of the Art
    Summary, 298
    References, 298

Chapter 10: Design of Sensor Networks, 300
    Estimation Accuracy of Data Reconciliation, 301
    Sensor Network Design, 302
        Methods Based on Matrix Algebra, 303
        Methods Based on Graph Theory, 311
        Methods Based on Optimization Techniques, 322
    Developments in Sensor Network Design, 323
    Summary, 323
    References, 325

Chapter 11: Industrial Applications of Data Reconciliation and Gross Error Detection Technologies, 327
    Process Unit Balance Reconciliation and Gross Error Detection
    Parameter Estimation and Data Reconciliation
    Plant-Wide Material and Utilities Reconciliation
    Case Studies
        Reconciliation of a Refinery Crude Preheat Train
        Reconciliation of Ammonia Plant Data
    Summary
    References

Appendix A: Linear Algebra Fundamentals
    Vectors and Their Properties
    Matrices and Their Properties
    References

Appendix B: Graph Theory Fundamentals, 373
    Graphs, Process Graphs, and Subgraphs, 374
    Paths, Cycles, and Connectivity
    Spanning Trees, Branches, and Chords, 380
    Graph Operations
    Cutsets, Fundamental Cutsets, and Fundamental Cycles, 381
    Reference, 383

Appendix C: Fundamentals of Probability and Statistics, 384
    Random Variables and Probability Density Functions, 384
    Statistical Properties of Random Variables, 389
    Hypothesis Testing, 391
    References, 393

Acknowledgments

The authors are indebted to several people who have contributed to the preparation of this book. The main contributions came from Prof. Narasimhan's students at the Indian Institute of Technology (IIT) Madras. T. Renganathan and J. Prakash, currently doing their doctoral programs, prepared the solutions for all examples with assistance from Sreeram Mriguluri. Marukuria Rajamouli spent hours in the night preparing the tables and figures in different chapters. The successful completion of this book was due to their efforts. Drs. Sachin Patwardhan and S. Pushpavanam, faculty at IIT Madras, provided critical inputs to improve the focus and clarity of the text.

Thanks are also due to the RAGE software development team at Engineers India Limited, R&D Center, consisting of Dr. Madhukar Garg, Dr. V. Ravikumar, and Mrs. Sheoraj Singh, from whom Prof. Narasimhan gained valuable practical experience in implementing data reconciliation in industrial processes. Prof. Narasimhan also thanks Prof. Jose Ragot and Dr. Didier Maquin at CRAN-INPL in Nancy, France, for arranging a summer visit for him, during which a significant part of the text was completed.

Dr. Jordache thanks all colleagues and managers from Chemshare Corp., Raytheon Process Automation, and especially Simulation Sciences Inc., for challenging him with practical issues that helped him clarify many implementation details for data reconciliation technology. Dr. Tom Clinkscales provided valuable input on polynomial filters. Drs. Mieh-Jiing Lin and Ricardo Dunia helped with clarifying some theoretical derivations mentioned in the text.

Special thanks to the R&D management of Simulation Sciences Inc. (SIMSCI™) for the encouragement and support given to Dr. Jordache during the writing of this manuscript and for allowing him to use Datacon™ software and the training book for preparing a detailed industrial example included in this book.

Very special thanks are also due to our respective wives, Jaishree Narasimhan and Doina Jordache, who shielded us from all the problems on the home front during the past two years.

Moreover, the authors express heartfelt gratitude to Debbie Markley of Gulf Publishing Company, who patiently reviewed every detail of the manuscript and worked long hours to help with finishing the index and to make sure this book is as accurate as possible.

Finally, the authors want to acknowledge and thank Prof. Miguel Bagajewicz for his excellent review and very useful suggestions on additional material to be included.

Preface

The quality of process data in chemical/petrochemical industries significantly affects the performance and the profit gained from using various software for process monitoring, online optimization, and control. Unfortunately, plant measurements often contain errors that invalidate the process model used for optimization and control. Data reconciliation and gross error detection are techniques developed in the past 30 years for improving the accuracy of data so that they satisfy the plant model. During the last decade, they have been widely applied in refineries, petrochemical plants, mineral processing industries, and so forth, in order to achieve more accurate plant-wide accounting and superior profitability of plant operations.

Although commercial software for data reconciliation and gross error detection is available, the accompanying manuals usually give little theoretical background. In order to be able to select the best methods and gain the most benefits from their implementation, one needs a good understanding of the fundamental concepts. This book explains the basic concepts in data reconciliation and gross error detection using many illustrative examples. It also contains descriptions of different techniques that have been developed for these purposes and presents a unified perspective of these diverse methods. Certain criteria for selecting various techniques and guidelines for their practical implementation are also indicated.

The focus in this entire text is on classical methods that make use of process model equations, such as conservation laws, equilibrium relations, and equipment performance equations, to reconcile data and to detect and identify gross errors. In recent years, the use of artificial neural networks for data reconciliation and gross error detection has been proposed. These methods have not been included because they have not attained maturity, although they may become important in the future.

This book is organized in such a way that it is useful for both industrial personnel and academia. The book will be a valuable tool to engineers and managers dealing with the selection and implementation of data reconciliation software, or those involved in the development of such software. The book will also be useful as a supplementary reference for an undergraduate/graduate level course in chemical process instrumentation and control in which basic concepts can be taught, or as a text for a full graduate level course in these topics. Unlike this book, the other books that are currently available on these topics do not present an in-depth analysis of the different techniques available, their limitations, or their interrelationships.

The book is organized as follows: The first chapter motivates the need for data reconciliation and gross error detection and introduces the major concepts involved. Chapter 2 introduces the statistical characterization of measurement errors and various filtering techniques used for error reduction that can form part of an overall data processing strategy. The next four chapters deal with the subject of data reconciliation in increasing order of complexity. Steady-state linear data reconciliation is described in Chapter 3. Decomposition techniques for linear models with both measured and unmeasured variables are described here. The techniques related to the classification of variables as observable and redundant are also presented. Methods for estimating the measurement error variances from measured data are also described in this chapter.

Chapter 4 deals with steady-state data reconciliation for bilinear systems consisting of component balances and, in certain cases, energy balances. The motivation for considering such processes is illustrated using examples from mineral industries as well as utility distribution networks in chemical industries. Chapter 5 treats nonlinear data reconciliation. Nonlinear models are often used to accurately describe chemical processes. The most efficient and widely used solution procedures for nonlinear reconciliation are presented in this chapter. Handling inequality constraints, such as bounds on variables, is also analyzed in this chapter. Data reconciliation techniques for dynamic systems are discussed in Chapter 6. Both Kalman filtering methods and general optimization techniques designed for dynamic, nonlinear problems are presented.

Chapters 7 through 9 deal with the problem of gross error detection. Chapter 7 introduces the issues involved in gross error detection and describes the basic statistical tests that can be used to detect gross errors. The underlying assumptions, characteristics, and relative advantages and disadvantages of various statistical tests are also discussed. For identifying multiple gross errors, complex strategies are required. A plethora of strategies have been proposed and evaluated in the research literature. A special effort has been made in Chapter 8 to give a unified perspective by classifying the different strategies on the basis of their core principles. We also describe in detail a typical strategy from each of these classes. Chapter 9 treats the problem of gross error identification in dynamic systems.

The efficacy of data reconciliation and gross error detection depends significantly upon the location of measured variables. Recent attempts to optimally design the sensor network for maximizing the accuracy of the data reconciliation solution are described in Chapter 10.

Several industrial applications and existing software systems for data reconciliation and gross error detection are also discussed in Chapter 11. Various aspects related to the benefits of offline and online data reconciliation, the methods mostly used, and their performances are analyzed here.

In order to make this book self-sufficient with respect to the mathematical background required for a good understanding, appendices are included that describe the necessary basic concepts from linear algebra, graph theory, and probability and statistical hypothesis testing.


The Importance of Data Reconciliation and Gross Error Detection

PROCESS DATA CONDITIONING METHODS

In any modern chemical plant, petrochemical process, or refinery, hundreds or even thousands of variables, such as flow rates, temperatures, pressures, levels, and compositions, are routinely measured and automatically recorded for the purpose of process control, online optimization, or process economic evaluation. Modern computers and data acquisition systems facilitate the collection and processing of a great volume of data, often sampled with a frequency of the order of minutes or even seconds.

The use of computers not only allows data to be obtained at a greater frequency, but has also resulted in the elimination of errors present in manual recording. This in itself has greatly improved the accuracy and validity of process data. The increased amount of information, however, can be exploited for further improving the accuracy and consistency of process data through systematic data checking and treatment.

Process measurements are inevitably corrupted by errors during the measurement, processing, and transmission of the measured signal. The total error in a measurement, which is the difference between the measured value and the true value of a variable, can be conveniently represented as the sum of the contributions from two types of errors: random errors and gross errors.

The term random error implies that neither the magnitude nor the sign of the error can be predicted with certainty. In other words, if the measurement is repeated with the same instrument under identical process conditions, a different value may be obtained depending on the outcome of the random error. The only possible way these errors can be characterized is by the use of probability distributions.

These errors can be caused by a number of different sources such as power supply fluctuations, network transmission and signal conversion noise, analog input filtering, changes in ambient conditions, and so on. Since these errors can arise from different sources (some of which may be beyond the control of the design engineer), they cannot be completely eliminated and are always present in any measurement. They usually correspond to the high frequency components of a measured signal, and are usually small in magnitude except for some occasional spikes.

On the other hand, gross errors are caused by nonrandom events such as instrument malfunctioning (due to improper installation of measuring devices), miscalibration, wear or corrosion of sensors, and solid deposits. The nonrandom nature of gross errors implies that at any given time they have a certain magnitude and sign, which may be unknown. Thus, if the measurement is repeated with the same instrument under identical conditions, the contribution of a systematic gross error to the measured value will be the same.

By following good installation and maintenance procedures, it is possible to ensure that gross errors are not present in the measurements, at least for some time. Gross errors caused by sensor miscalibration may occur suddenly at a particular time and thereafter remain at a constant level or magnitude. Other gross error causes, such as the wear or fouling of sensors, can occur gradually over a period of time, and so the magnitude of the gross error increases slowly over a relatively long time period. Thus, gross errors occur less frequently, but their magnitudes are typically larger than those of random errors.

Errors in measured data can lead to significant deterioration in plant performance. Small random and gross errors can lead to deterioration in the performance of control systems, whereas larger gross errors can nullify gains achievable through process optimization. In some cases, erroneous data can also drive the process into an uneconomical, or even worse, an unsafe operating regime. It is therefore important to reduce, if not completely eliminate, the effect of both random and gross errors. Several data processing techniques can be used together to achieve this objective. In this text, we describe methods which can play an important role as part of an integrated data processing strategy to reduce errors in measurements made in continuous process industries.
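The contrasting behavior of the two error types can be sketched with a small simulation. In this hypothetical example (Python with NumPy; the sensor value, noise level, and bias are invented for illustration), the random contribution changes with every repetition of the measurement, while a calibration bias shifts all readings by the same fixed amount:

```python
import numpy as np

rng = np.random.default_rng(0)

true_flow = 100.0  # true value of the process variable (hypothetical)
sigma = 2.0        # standard deviation of the random error
bias = 5.0         # constant gross error, e.g. from miscalibration

# Repeating the measurement: the random contribution changes with every
# reading, while the gross error shifts all readings by the same amount.
readings = true_flow + bias + rng.normal(0.0, sigma, size=1000)

print(readings.mean())  # close to true_flow + bias, not to true_flow
print(readings.std())   # close to sigma
```

Averaging many repetitions suppresses the random error but leaves the bias untouched, which is why gross errors require the detection techniques described below rather than filtering alone.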

Research and development in the area of signal conditioning have led to the design of analog and digital filters which can be used to attenuate the effect of high frequency noise in the measurements. Large gross errors can be initially detected by using various data validation checks. These include checking whether the measured data, and the rate at which it is changing, are within predefined operational limits. Today, smart sensors are available which can perform diagnostic checks to determine whether there is any hardware problem in measurement and whether the measured data is acceptable.

More sophisticated techniques include statistical quality control (SQC) tests, which can be used to detect significant errors (outliers) in process data. These techniques are usually applied to each measured variable separately. Thus, although these methods improve the accuracy of the measurements, they do not make use of a process model and hence do not ensure consistency of the data with respect to the interrelationships between different process variables. Nevertheless, these techniques must be used as a first step to reduce the effect of random errors in the data and to eliminate obvious gross errors.
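A minimal sketch of such a univariate check is a Shewhart-style 3-sigma control-limit test (the readings and limits below are hypothetical, not drawn from any particular SQC package):

```python
import numpy as np

def sqc_outliers(samples, mean, sigma, k=3.0):
    """Flag samples falling outside the mean +/- k*sigma control limits."""
    samples = np.asarray(samples, dtype=float)
    return np.abs(samples - mean) > k * sigma

# Hypothetical flow readings; control limits built from historical data.
readings = [99.5, 101.2, 100.4, 112.8, 98.9]
flags = sqc_outliers(readings, mean=100.0, sigma=2.0)
print(flags)  # only the 112.8 reading exceeds the 3-sigma limits
```

Note that the test looks at one variable in isolation: it would not notice a set of readings that are individually plausible but jointly violate a material balance, which is exactly the gap data reconciliation fills.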

It is possible to further reduce the effect of randolii erroi- and ;:]so eliili- inate systematic gl-nss er-I-or- i l l the dar:i hy exploirins tiie i-ei:i:inns!iir~.; tiiat arc k;~:)wn to exi~C itzr\vce11 diffel-ent \,ariahies <if s process. tech~iq;li.s of tlara ~-~cc;tlc.iIiclfion ar:d gi-ox e r 1 - 0 r d~fe( . r io i : tixt !I;\\.&:

'Jeer: d c v r i o p ~ i i~: ihe field of chcmica! e~igineeririf dtirins tile past 75 y7;ars fc~r ttlis parpose are :iie pi-inci;~al focus of this book.

Data reconci!iation (IIR) is a technique that hai heen deve:op-,d to impre>\ e ~ h c accilracy of 1iieasurclnei:~s i)y r ed~~c i~ i? the effc>ci cf r;tlictoin err-or. iri ilic dats. The priilcipal dilli.rence he~n,eitn data i - c ~ < > i ; ~ i i i ~ t i ~ ; ; ariii otlrer fillelin: tcch~?iq:ies i, ihnt da:a recancili:itioii cipiiciti\~ rnahe. 1i.e oi' process niodi.1 constraints and obtains es:inlat.-5 of process :,ariahies hy adju:;ting ~ ~ r ( x e s s nicasuu'-ernznts sc; th:it tlic. eslirnate.; x;it15f?~ :he c(>i~si~~:~iiiix.

The reconciled estimates are expected to be more accurate than the measurements and, more importantly, are also consistent with the known relationships between process variables as defined by the constraints. In order for data reconciliation to be effective, there should be no gross errors either in the measurements or in the process model constraints. Gross error detection is a companion technique to data reconciliation that has been developed to identify and eliminate gross errors. Thus, data reconciliation and gross error detection are applied together to improve the accuracy of measured data.


Data reconciliation and gross error detection both achieve error reduction only by exploiting the redundancy property of measurements. Typically, in any process the variables are related to each other through physical constraints such as material or energy conservation laws. Given a set of such system constraints, a minimum number of error-free measurements is required in order to calculate all of the system parameters and variables. If there are more measurements than this minimum, then redundancy exists in the measurements that can be exploited. This type of redundancy is usually called spatial redundancy and the system of equations is said to be overdetermined.

Data reconciliation cannot be performed without spatial redundancy. With no extra measured information, the system is just determined and no correction to erroneous measurements is possible. Further, if fewer variables than necessary to determine the system are measured, the system is underdetermined and the values of some variables can be estimated only through other means or if additional measurements are provided.

A second type of redundancy that exists in measurements is temporal redundancy. This arises due to the fact that measurements of process variables are made continually in time at a sampling rate, producing more data than necessary to determine a steady-state process. If the process is at a steady state, then temporal redundancy can be exploited by simply averaging the measurements and applying steady-state data reconciliation to the averaged values.

If the process state is dynamic, however, the evolution of the process state is described by differential equations corresponding to mass and energy balances, which inherently capture both the temporal and spatial redundancy of measured variables. For such a process, dynamic data reconciliation and gross error detection techniques have been developed to obtain accurate estimates consistent with the differential model equations of the process.

Signal processing and data reconciliation techniques for error reduction can be applied to industrial processes as part of an integrated strategy referred to as data conditioning or data rectification. Figure 1-1 illustrates the various operations and the position occupied by data reconciliation in data conditioning for on-line industrial applications.


Figure 1-1. Online data collection and conditioning system.

INDUSTRIAL EXAMPLES OF STEADY-STATE DATA RECONCILIATION

Here we will briefly describe two examples of industrial applications of steady-state data reconciliation drawn from our experience in order to illustrate the need for such a technique and the benefits that can be derived from it.

Crude Split Optimization in a Preheat Train of a Refinery

In any refinery, the crude oil is initially heated by passing it through an interconnected set of heat exchangers called the crude preheat train before being fractionated. In a crude preheat train, typically the crude is split into one or more parallel streams, each of which is heated by passing it through a train of heat exchangers before being merged and sent to a furnace for further heating. The process streams that are used for heating the crude are the various product and pump-around streams from a downstream atmospheric or vacuum crude distillation column.

In order to maximize energy recovery from these process streams, the optimal flows of the crude splits through the different parallel heat exchanger trains should be determined online, say every few hours. For determining the optimal flows, the total inlet flow of crude and all hot process streams along with their inlet temperatures have to be specified. Moreover, details of all heat exchangers, such as heat exchanger areas and overall heat transfer coefficients, also have to be specified.

Generally, in a crude preheat train, all the stream flows, as well as all intermediate temperatures, are measured. Thus there are more measurements than those required for performing the optimization. It is possible to ignore some of the measurements and use only the measurements of inlet flows and inlet temperatures of all streams for determining the optimal crude split flows. However, since all measurements contain errors, any optimization exercise carried out using such measurements will not necessarily result in the predicted gains.

In order to overcome this, steady-state reconciliation and gross error detection is applied to measured data to eliminate measurements containing gross errors and obtain reconciled estimates of all stream flows and temperatures which satisfy the flow and enthalpy balances of the crude preheat train. As part of the reconciliation, the overall heat transfer coefficients of all exchangers are also estimated.

These estimated heat transfer coefficients will more correctly reflect the actual current performance of the heat exchangers than the original design values. The reconciled estimates of inlet flows and temperatures of all streams, and the estimated overall heat transfer coefficients of the exchangers, are used to determine the optimal values of the crude splits, which are then implemented in the plant. Use of reconciled estimates in the optimization is likely to result in actual energy recovery from the process being closer to the predicted optimal values.

It should be noted that the time periods selected for reconciliation and optimization are selected based on the time constants of the system. Since a change in the crude split flows has an effect on the downstream crude distillation column, and hence affects all the distillate streams which are used for pre-heating the crude, it takes two hours for all the stream flows and temperatures to reach a new steady state after a new set of optimal crude split values is implemented. The process is operated at this steady state for an additional two hours, after which the optimization of the crude split flows is repeated. The measurements made during the preceding two hours of steady-state operation are averaged and used as data for the reconciliation problem. This example is described in greater detail in the concluding chapter on industrial case studies.

Minimizing Water Consumption in Mineral Beneficiation Circuits

In a mineral beneficiation circuit, crushed ore is washed with water along with other additives in an interconnected network of classifiers or flotation cells in order to liberate the particles containing the minerals from the gangue material. In order to minimize the water consumption for a desired concentration of the beneficiated ore, the performance of the flotation cells has to be simulated for different flow conditions. The simulation model in turn requires data on the feed characteristics as well as on parameters such as pulp densities.

Generally, the flows of the feed stream and pure water streams are measured. Using samples drawn from different streams, the concentrations of different minerals in each stream and their pulp densities are also measured in the laboratory. These measurements contain errors and are also not consistent with the flow and component balances of the mineral beneficiation circuit. Steady-state reconciliation and gross error detection can be applied to the measurements in order to obtain reconciled estimates of all stream flows, pulp densities, and mineral concentrations such that they satisfy the material balances. The reconciled estimates are used in the detailed simulation of the flotation cells and to determine the minimal amount of water to be added. In one such exercise, it was possible to reduce water consumption by five percent.

DATA RECONCILIATION PROBLEM FORMULATION

As stated in the preceding sections, data reconciliation improves the accuracy of process data by adjusting the measured values so that they satisfy the process constraints. The amount of adjustment made to the measurements is minimized, since the random errors in the measurements are expected to be small. In the general case, not all variables of the process are measured, due to economic or technical limitations.

The estimates of unmeasured variables as well as model parameters are also obtained as part of the reconciliation problem. The estimation of unmeasured values based on the reconciled measured values is also known as data coaptation. In general, data reconciliation can be formulated as the following constrained weighted least-squares optimization problem:

Min Σi wi(yi - xi)²    (1-1)

subject to

g(x, u) = 0    (1-2)

The objective function 1-1 defines the total weighted sum of squares of adjustments made to measurements, where wi are the weights, yi is the measurement and xi is the reconciled estimate for variable i, and ui are the estimates of unmeasured variables. Equation 1-2 defines the set of model constraints. The weights wi are chosen depending on the accuracy of different measurements.

The model constraints are generally material and energy balances, but could include inequality relations imposed by feasibility of process operations. The deterministic natural laws of conservation of mass or energy are typically used as constraints for data reconciliation because they are usually known. Empirical or other types of equations involving many unmeasured parameters are not recommended, since they are at best known only approximately. Forcing the measured variables to obey inexact relations can cause inaccurate data reconciliation solutions and incorrect gross error diagnosis.

Any mass or energy conservation law can be expressed in the following general form [1]:

input - output + generation - depletion - accumulation = 0    (1-3)

The quantity for which the above equation is written could be the overall material flow, the flow of individual components, or the flow of energy. If there is no accumulation of any of these quantities, then these constraints are algebraic in character and define a steady-state model.

For a dynamic process, however, the accumulation terms cannot be neglected and the constraints are differential equations. For most process units, there is no generation or depletion of material. In the case of reactors, though, the generation or depletion of individual components due to reaction should be taken into account.

For some simple units such as splitters, there is no change either in the composition or temperature of streams. For such units, the component and energy balances reduce to a simple form such as

xi = xj    (1-4)

where the variable xi represents either the temperature or composition of stream i. The above equation is also useful when two or more sensors are used to measure the same variable, say flow rate or temperature of a stream.

The type of constraints that are imposed in reconciliation depends on the scope of the reconciliation problem and the type of process units. Furthermore, the complexity of the solution techniques used depends strongly on the constraints imposed. For example, if we are interested in reconciling only the flow rates of all streams, then the material balance constraints are linear in the flow variables and a linear data reconciliation problem results. On the other hand, if we wish to reconcile composition, temperature, or pressure measurements along with flows, then a nonlinear data reconciliation problem occurs.

An issue to be addressed is the kind of constraints that we can legitimately impose in a data reconciliation application. Since data reconciliation forces the estimates of all variables to satisfy the imposed constraints, this issue assumes great importance. Usually, material and energy balance constraints are included because they are valid physical laws. It should be noted, however, that these equations are generally written assuming that there is no loss of material or energy from the process unit to the environment. While this may be valid for material flow, significant losses of energy may occur, for example from improperly insulated heat exchangers. In such cases, it is better not to impose the energy balances, or alternatively to include an unknown loss term in the balance equation that can be estimated as part of the reconciliation.

Other than material and energy conservation constraints, a model of a process unit can contain equations involving the unit parameters. For example, a heat exchanger model can include a rating equation relating the heat duty to the overall heat transfer coefficient, the exchanger area available for heat transfer, and the stream flows and temperatures. Equation 1-5 describes this relationship:

Q = U A ΔTlm    (1-5)

where Q is the heat duty, U is the overall heat transfer coefficient, A is the exchanger area, and ΔTlm is the logarithmic mean temperature difference.

Should this equation be included as a constraint when applying data reconciliation to processes involving heat exchangers? Generally, since the overall heat transfer coefficient is unknown and has to be estimated from the measured data, this equation may be included and U estimated as part of the reconciliation problem. If there is no prior information about U, however, and no feasibility restrictions on it, then inclusion of this constraint does not provide any additional information, and estimates of all other variables will be the same regardless of whether this constraint is included or not. Thus, the data reconciliation problem can as well be solved without this constraint, and U can subsequently be estimated by the above equation using the reconciled values of flows and temperatures.

On the other hand, if U has to be within specified bounds, or if there is a good estimate for U from a previous reconciliation exercise (as in the crude preheat train example discussed in the previous section, where the estimates of U from the reconciliation solution of the most recent time period can be used as good a priori estimates), then the constraint should be included along with the additional information about U as part of the reconciliation problem. The overall heat transfer coefficient can also be related to the physical properties of the streams, their flows, temperatures, and the heat exchanger characteristics using correlations. It is not advisable to use such an equation in the reconciliation model, since the correlations themselves can be quite erroneous and forcing the flows and temperatures to fit this equation may increase the inaccuracy of the estimates.
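The post-reconciliation estimation of U described above is a direct rearrangement of Equation 1-5. The sketch below computes the duty from the cold-side energy balance and backs out U; all stream values, the area, the counter-current configuration, and the units are hypothetical assumptions for illustration, not data from the text.

```python
import math

# Hypothetical reconciled values for one counter-current exchanger.
# Every number below is an illustrative assumption.
m_cold = 25.0        # crude-side mass flow, kg/s
cp_cold = 2.1        # crude-side specific heat, kJ/(kg K)
t_cold_in, t_cold_out = 120.0, 180.0   # deg C
t_hot_in, t_hot_out = 250.0, 200.0     # deg C
area = 60.0          # exchanger area, m^2 (from the data sheet)

# Heat duty from the cold-side energy balance: Q = m * cp * dT  (kW)
q = m_cold * cp_cold * (t_cold_out - t_cold_in)

# Logarithmic mean temperature difference (terminal differences unequal)
dt1 = t_hot_in - t_cold_out
dt2 = t_hot_out - t_cold_in
dt_lm = (dt1 - dt2) / math.log(dt1 / dt2)

# Equation 1-5 rearranged: U = Q / (A * dT_lm), in kW/(m^2 K)
u = q / (area * dt_lm)
print(f"Q = {q:.0f} kW, dT_lm = {dt_lm:.1f} K, U = {u:.3f} kW/(m^2 K)")
# -> Q = 3150 kW, dT_lm = 74.9 K, U = 0.701 kW/(m^2 K)
```

Because U is computed after reconciliation, the balances are untouched by any error in this rating equation, which is exactly the point made above.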

Another important question is whether to perform reconciliation using a steady-state or a dynamic model of the process. Practically, a process is never truly at a steady state. However, a plant is normally operated for several hours or days in a region around a nominal steady-state operating point. For applications such as online optimization (as in the case of the crude split optimization example), where reconciliation is performed once every few hours, it is appropriate to employ steady-state reconciliation on measurements averaged over the time period of interest.

During transient conditions (such as during a changeover to a new crude type in a refinery), when the departure from steady state is significant, steady-state reconciliation should not be applied because it will result in large adjustments to measured values. Measurements taken during such transient periods can be reconciled, if necessary, using a dynamic model of the process. Similarly, for process control applications where reconciliation needs to be performed every few minutes, dynamic data reconciliation is appropriate.

Data reconciliation is based on the assumption that only random errors, which follow a normal (Gaussian) distribution, are present in the measurements. If a gross error due to a measurement bias is present in some measurement, or if a significant process leak is present which has not been accounted for in the model constraints, then the reconciled data may be very inaccurate. It is therefore necessary to identify and remove such gross errors. This is known as the gross error detection problem.

Gross errors can be detected based on the extent to which the measurements violate the constraints, or on the magnitude of the adjustments made to measurements in a preliminary data reconciliation. Although gross error detection techniques were developed primarily to improve the accuracy of reconciled estimates, they are also useful in identifying measuring instruments that need to be replaced or recalibrated.

EXAMPLES OF SIMPLE RECONCILIATION PROBLEMS

In order to obtain a good understanding of the issues and underlying assumptions in data reconciliation, some of the simplest possible cases are introduced here. We assume a process operating at a steady state, constrained by a set of linear equations.

Systems With All Measured Variables

Let us first consider the simplest data reconciliation problem: the reconciliation of the stream flows of a process. Initially, all flow rates are assumed to be directly measured. The flow measurements contain unknown random errors. For that reason, the material input and output of each process unit and of the overall process do not balance. The aim of reconciliation is to make minor adjustments to the measurements in order to make them consistent with the material balances. The adjusted measurements, which are referred to as estimates, are expected to be more accurate than the measurements. Although the problem considered here is simple, it does have important industrial applications in accurate accounting for the material flows as, for example, in a lube blending plant, in the steam and water distribution subsystem of a plant, or in a complete refinery.

Example 1-1

Figure 1-2. Heat exchanger system with bypass.

Let us consider a simple process of a heat exchanger with a bypass as shown in Figure 1-2. Let us also ignore the energy flows of this process and focus only on the mass flows. It is assumed that the flows of all six streams of this process are measured and that these measurements contain random errors. If we denote the true value of the flow of stream i by the variable xi and the corresponding measured value by yi, then we can relate them by the following equations

yi = xi + εi,   i = 1, ..., 6    (1-6)

where εi is the random error in measurement yi. The flow balances around the splitter, exchanger, valve, and mixer can be written as

x1 - x2 - x3 = 0    (1-7a)

x2 - x4 = 0    (1-7b)

x3 - x5 = 0    (1-7c)

x4 + x5 - x6 = 0    (1-7d)

The measured values (given in Table 1-1) do not satisfy the above equations, since they contain random errors. It is desired to derive estimates of the flows that satisfy the above flow balances. Intuitively, we can impose the condition that the differences between the measured and estimated flows, also referred to as adjustments, should be as small as possible. As a first choice, we can represent this objective as

Min Σi (yi - xi)²    (1-8)


The above function is the familiar least-squares criterion used in regression. Since it is immaterial whether the adjustments are positive or negative, the square of the adjustment is minimized. Although other types of criteria may be used, such as minimizing the sum of absolute adjustments, they do not have a statistical basis and also make the solution of the problem more difficult.

The least-squares criterion is acceptable if all measurements are equally accurate. The adjustment made to one measurement is given the same importance as any other. In practice, however, it is likely that some measurements are more accurate than others, depending on the instrument being used and the process environment under which it operates. In order to account for this, we can use a weighted least-squares objective as a more general criterion, given by

Min Σi wi(yi - xi)²    (1-9)

where the weights wi are chosen to reflect the accuracy of the respective measurements. More accurate measurements are given larger weights in order to force their adjustments to be as small as possible. Generally, it is assumed that the error variances for all the measurements are known and that the weights are chosen to be the inverse of these variances.

The reconciliation problem is thus a constrained optimization problem with the objective function given by Equation 1-9 and the constraints given by Equations 1-7a through 1-7d. The solution of this optimization problem can be obtained analytically for flow reconciliation. Table 1-1 shows the true, measured, and reconciled flows for the process of Figure 1-2. The reconciled flows shown in column four of this table are obtained by assuming that all measurements are equally accurate (weights are all equal). It can be easily verified that while the measured values do not satisfy the flow balances, Equations 1-7a through 1-7d, the reconciled flows satisfy them.


Table 1-1
Flow Reconciliation for a Completely Measured Process

Stream   True Flow   Measured      Reconciled
Number   Values      Flow Values   Flow Values

Systems With Unmeasured Variables

In the previous example, we have assumed that all variables are measured. However, usually only a subset of the variables is measured. The presence of unmeasured variables not only complicates the problem solution, but also introduces new questions, such as whether an unmeasured variable can be estimated, or whether a measured variable can be reconciled, as illustrated by the following example.

Example 1-2

Let us consider the flow reconciliation problem of the simple process shown in Figure 1-2. However, we will not assume that all the flows are measured as before. Instead, we will assume that only selected flows are measured, and in each case discuss the issues and problems involved in partially measured systems.

Case 1. Flows of streams 1, 2, 5, and 6 are measured, while the other two stream flows are unmeasured.

The objective in this case is to not only reconcile the measured flows, but also to estimate all the unmeasured flows as part of the reconciliation problem. As in Equation 1-6, we relate the measured and true stream flows:

yi = xi + εi,   i = 1, 2, 5, 6    (1-10)

The constraints are still given by Equations 1-7. It should be noted that the constraints involve both measured and unmeasured flow variables. The objective function is the weighted sum of squares of adjustments made to measured variables, and is given by


Min  w1(y1 - x1)² + w2(y2 - x2)² + w5(y5 - x5)² + w6(y6 - x6)²    (1-11)
x1, x2, x5, x6

Since the unmeasured variables are present only in the constraint set, the simplest strategy for solving the problem is to eliminate them from the constraints. This will not affect the objective function, since it does not involve unmeasured variables. Variable x3 can be eliminated by combining Equations 1-7a and 1-7c, while variable x4 can be eliminated by combining Equations 1-7b and 1-7d. Thus, we obtain a reduced set of constraints which involves only measured variables:

x1 - x2 - x5 = 0    (1-12a)

x2 + x5 - x6 = 0    (1-12b)

The reduced data reconciliation problem is now to minimize 1-11 subject to the constraints of Equations 1-12a and 1-12b. It can be observed that this reduced problem involving the variables x1, x2, x5, and x6 is similar to the completely measured case, and an analytical solution can be used to obtain the reconciled values of the measured variables. Using the same measured values for x1, x2, x5, and x6 as given in Table 1-1, and assuming all measurements to be equally accurate, the reconciled values which are obtained are shown in Table 1-2 in the column under Case 1. Once the reconciled values for the measured variables are obtained, the estimates of the unmeasured variables can be calculated using the original constraints.

Thus the estimate of x4 is equal to that of x2, and the estimate of x3 is equal to that of x5. These values are also indicated in Table 1-2. By comparing with the results of Table 1-1, it can be observed that since there are fewer measured variables in this case, the estimates of some variables are less accurate than those derived for the completely measured system. The central idea gained from this case is that the reconciliation problem can be split or decomposed into subproblems: the first being a reduced reconciliation problem involving only measured variables, followed by an estimation or coaptation problem for calculating the estimates of unmeasured variables.
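The two-step decomposition just described can be sketched for Case 1 as follows. The reduced constraints combine the unit balances as explained above; the measurement values are illustrative, chosen to reproduce the Case 1 column of Table 1-2 under equal weights.

```python
import numpy as np

# Reduced constraints over the measured flows (x1, x2, x5, x6):
#   x1 - x2 - x5 = 0    (combine 1-7a and 1-7c)
#   x2 + x5 - x6 = 0    (combine 1-7b and 1-7d)
A_red = np.array([[1.0, -1.0, -1.0,  0.0],
                  [0.0,  1.0,  1.0, -1.0]])

# Measurements of streams 1, 2, 5, 6 (illustrative values, chosen to be
# consistent with Table 1-2 under equal weights)
y = np.array([101.91, 64.45, 36.44, 98.87])

# Step 1: reduced reconciliation, equal weights:
#   x = y - A^T (A A^T)^-1 A y
x1, x2, x5, x6 = y - A_red.T @ np.linalg.solve(A_red @ A_red.T, A_red @ y)

# Step 2: coaptation -- back out the unmeasured flows from the original
# unit balances: x4 = x2 (exchanger), x3 = x5 (valve)
x4, x3 = x2, x5

print(np.round([x1, x2, x3, x4, x5, x6], 2))  # matches Table 1-2, Case 1
```

Running this reproduces the Case 1 column of Table 1-2: streams 1 and 6 at 100.49, streams 2 and 4 at 64.25, and streams 3 and 5 at 36.24.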


Table 1-2
Flow Reconciliation of Partially Measured Process

Reconciled Flow Values

Stream   Case 1: Streams 3    Case 2: Streams          Case 3: Streams
Number   and 4 unmeasured     3, 4, 5, 6 unmeasured    2, 3, 4, 5 unmeasured
1        100.49               101.91                   100.39
2         64.25                64.45                   -
3         36.24                37.46                   -
4         64.25                64.45                   -
5         36.24                37.46                   -
6        100.49               101.91                   100.39

Case 2. Only flows of streams 1 and 2 are measured.

In this case, only Equations 1-7a and 1-7b contain measured variables and are useful in the reconciliation problem. The objective function is set up as before to minimize the adjustments made to measured variables, and is given by

Min  w1(y1 - x1)² + w2(y2 - x2)²    (1-13)
x1, x2

As in Case 1, we try to eliminate the unmeasured variables from the constraints 1-7a and 1-7b. Our attempt to produce an equation involving only measured variables by suitably combining the original constraints ends in failure. Thus, the reconciliation problem we obtain is to minimize 1-13 without any constraints. It is immediately obvious that the best estimates of x1 and x2 are given by their respective measured values, which results in the least adjustment of zero for 1-13. The estimates of the unmeasured variables can now be calculated using the constraints. The estimate of x6 is equal to x1, the estimate of x4 is equal to x2, and the estimates of x3 and x5 are both equal to the difference between x1 and x2. These values are all given in Table 1-2 under Case 2.

Two important observations can be made in this case. First, no adjustment is made to the two measured variables x1 and x2. This is due to the fact that there is no additional information in the form of constraints that relate only the measured variables which can be exploited for adjusting their measurements. Such measured variables are also known as nonredundant variables. Second, a unique estimate for every unmeasured variable is obtained using the constraints and estimates of measured variables.


These unmeasured variables are also known as observable. A formal definition of the concepts of observability and redundancy is given in Chapter 3. It is sufficient at present to note that while the partially measured process in Case 1 gives a redundant and observable system, Case 2 gives rise to a nonredundant, observable system.

Case 3. Only flows of streams 1 and 6 are measured.

The reduced reconciliation problem we obtain for this case is

Min  w1(y1 - x1)² + w6(y6 - x6)²    (1-14)
x1, x6

such that

x1 - x6 = 0    (1-15)

Equation 1-15 is obtained by adding all the constraints 1-7a through 1-7d. Assuming that the measurements of x1 and x6 are equally accurate, their reconciled values are given in Table 1-2 under Case 3. We now attempt to calculate the estimates of the remaining four variables. We will not be successful, however, in obtaining unique estimates for these variables. In other words, there are many solutions, in fact an infinite number, which can satisfy the constraints.

For example, one possible solution is to take the estimates of x3 and x5 to be both equal to that of x1, and the estimates of x2 and x4 to be equal to zero. Alternatively, we can choose the estimates of x2 and x4 to be equal to that of x1, while the estimates of x3 and x5 are chosen to be zero.

Without additional information, there is no way of determining which of these myriad possible solutions is more accurate. The variables x2, x3, x4, and x5 are denoted as unobservable in this case. An interesting feature of this case is that though there are some unmeasured variables which cannot be uniquely estimated, reconciliation of the variables x1 and x6 can still be performed utilizing the available measurements. Therefore, Case 3 is a redundant, unobservable system.
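For linear balances, the observability distinction among the three cases can be checked mechanically: given the reconciled measured flows, the unmeasured flows are uniquely determined exactly when the columns of the constraint matrix belonging to the unmeasured streams are linearly independent. A minimal sketch, assuming the balance matrix of Figure 1-2 written as A x = 0:

```python
import numpy as np

# Balance matrix for Figure 1-2 (rows: splitter, exchanger, valve, mixer)
A = np.array([
    [1, -1, -1,  0,  0,  0],
    [0,  1,  0, -1,  0,  0],
    [0,  0,  1,  0, -1,  0],
    [0,  0,  0,  1,  1, -1],
], dtype=float)

def unmeasured_observable(unmeasured_streams):
    """True iff the unmeasured flows are uniquely determined, i.e. the
    corresponding columns of A are linearly independent."""
    Au = A[:, [s - 1 for s in unmeasured_streams]]   # 1-based stream numbers
    return bool(np.linalg.matrix_rank(Au) == len(unmeasured_streams))

print(unmeasured_observable([3, 4]))        # Case 1 -> True  (observable)
print(unmeasured_observable([3, 4, 5, 6]))  # Case 2 -> True  (observable)
print(unmeasured_observable([2, 3, 4, 5]))  # Case 3 -> False (unobservable)
```

In Case 3 the columns for streams 2, 3, 4, and 5 are linearly dependent, which is the algebraic counterpart of the infinite family of solutions exhibited above. (This is only a quick numerical check; the formal treatment of observability and redundancy is the subject of Chapter 3.)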

System Containing Gross Errors

In all the cases considered in Example 1, the measurements did not contain any systematic error or bias. In such cases, data reconciliation does reduce the error in measurements. We will now examine the case when one of the measurements contains a systematic bias or gross error and demonstrate the need to perform gross error detection along with data reconciliation.

Example 1-3

We reconsider the flow process shown in Figure 1-2, for which the true stream flows are as given in Table 1-1. We will assume that all flows are measured with measurements as given in Table 1-1, except that the measurement of stream 2 contains a positive bias of 4 units, so that its measured value is 68.45 instead of 64.45. As before, we reconcile these measurements and obtain estimates, which are shown in Table 1-3, in column 2, when all the measurements are used.

A comparison of these estimates with those listed in Table 1-1 clearly shows that the accuracy of the estimates has decreased due to the presence of the gross error. Furthermore, although only the flow measurement of stream 2 contains a gross error, the accuracy of all the flow estimates has decreased. This is known as a smearing effect, and it occurs because reconciliation exploits the spatial constraint relations between different variables.

In order for data reconciliation to be effective, it is therefore necessary to identify those measurements containing gross errors and either eliminate them or make appropriate compensation. The last column of Table 1-3 shows the reconciled estimates obtained when the flow measurement of stream 2 is discarded and not used in the reconciliation process. Clearly, the accuracy of the reconciled estimates has improved considerably, even though the redundancy has decreased by discarding the measurement.

Table 1-3
Flow Reconciliation When Stream 2 Flow Measurement Contains a Gross Error

                         Reconciled Flow Values
Stream    All measurements used    Stream 2 measurement eliminated
1         100.89                   100.23
2         65.83                    64.53
3         35.05                    35.71
4         65.83                    64.53
5         35.05                    35.71
6         100.89                   100.23
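The smearing effect described above can be reproduced with a short weighted least-squares calculation. The sketch below is a hypothetical reconstruction: it assumes the network structure x1 = x2 + x3, x2 = x4, x3 = x5, x4 + x5 = x6 implied by the paired values in Table 1-3, illustrative measured values (Table 1-1 is not reproduced in this excerpt), and equal unit variances for all measurements:

```python
import numpy as np

# Node balances A x = 0 for the assumed flow network of Figure 1-2
A = np.array([[1.0, -1, -1, 0, 0, 0],   # x1 = x2 + x3
              [0, 1, 0, -1, 0, 0],      # x2 = x4
              [0, 0, 1, 0, -1, 0],      # x3 = x5
              [0, 0, 0, 1, 1, -1]])     # x4 + x5 = x6

def reconcile(y, cov):
    """Weighted least squares: minimize (y-x)' cov^-1 (y-x) subject to A x = 0."""
    adj = cov @ A.T @ np.linalg.solve(A @ cov @ A.T, A @ y)
    return y - adj

cov = np.eye(6)                                        # equal measurement accuracy
y = np.array([101.1, 64.45, 35.3, 64.8, 36.1, 99.8])   # illustrative measurements
y_biased = y.copy()
y_biased[1] += 4.0                                     # +4 bias on stream 2, as in the text

x_good = reconcile(y, cov)
x_bad = reconcile(y_biased, cov)
print("without bias:", np.round(x_good, 2))
print("with bias:   ", np.round(x_bad, 2))
# Every reconciled flow changes, not just stream 2: the balance
# constraints smear the single bias across the whole network.
```

Printing the difference x_bad - x_good shows a nonzero shift in every component even though only one measurement changed, which is exactly the smearing effect.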

Thus far, we have not considered the important question of how to identify the measurement containing a gross error based only on the knowledge of the measured values and constraint relations between variables. There are several ways of tackling this problem, and in this example we illustrate one approach. Given a set of measurements, we can initially reconcile them assuming that there are no gross errors in the data. In the flow process example considered here, the reconciled estimates obtained under this assumption have already been shown in the second column of Table 1-3. From these reconciled estimates we can compute the differences between the measured and reconciled values (measurement adjustments) for all measured variables, and these are shown in Table 1-4.

Table 1-4
Measurement Adjustments for Flow Process

Stream    Measurement adjustment

If the constraints are linear, as in this example, the expected variance of the adjustments can be analytically derived; it is a function of the constraint matrix and the measurement error variances. For the flow process example considered here, it can be shown that the standard deviation of the measurement adjustments for every variable is 0.8165. A simple statistical test can be applied to determine if the computed measurement adjustments fall within a confidence interval, say within a ±2σ interval. In this example, the ±2σ interval (95% confidence interval) is [-1.6, 1.6].

From Table 1-4, we can observe that the measurement adjustments for the flows of streams 2, 4, and 6 fall outside this interval, and as a first cut the measurements of these streams can be suspected of containing a gross error. Among these, the measurement adjustment of stream 2 has the largest magnitude and can be identified to contain a gross error. After discarding the measurement of stream 2, we can again reconcile the data and compute the measurement adjustments to examine if any more gross errors are present.
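The check just described can be sketched in a few lines. Assuming unit measurement variances and the same hypothetical network and illustrative measured values used above (the book's Table 1-1 data are not reproduced in this excerpt), the covariance of the adjustment vector a = ΣAᵀ(AΣAᵀ)⁻¹Ay follows directly from the constraint matrix, and under this assumed network it reproduces the 0.8165 standard deviation quoted in the text:

```python
import numpy as np

A = np.array([[1.0, -1, -1, 0, 0, 0],
              [0, 1, 0, -1, 0, 0],
              [0, 0, 1, 0, -1, 0],
              [0, 0, 0, 1, 1, -1]])
S = np.eye(6)   # measurement error covariance (unit variances assumed)

# Covariance of the measurement adjustments, and their standard deviations
V = S @ A.T @ np.linalg.solve(A @ S @ A.T, A @ S)
sigma_adj = np.sqrt(np.diag(V))
print(np.round(sigma_adj, 4))   # 0.8165 for every stream in this network

# Illustrative measurements with a +4 bias on stream 2
y = np.array([101.1, 68.45, 35.3, 64.8, 36.1, 99.8])
a = S @ A.T @ np.linalg.solve(A @ S @ A.T, A @ y)   # measurement adjustments

flagged = np.where(np.abs(a) > 2 * sigma_adj)[0] + 1   # ~95% (±2 sigma) test
suspect = np.argmax(np.abs(a) / sigma_adj) + 1          # largest standardized adjustment
print("streams outside +/-2 sigma:", flagged)
print("prime suspect: stream", suspect)
```

With these illustrative numbers only stream 2 is flagged; after dropping it, the same calculation on the remaining measurements would implement the sequential procedure the text describes.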


The procedure used above is a sequential procedure for gross error detection and makes use of the statistical test known as the measurement test. A variety of statistical tests and methods for identifying one or more gross errors have been developed and are described in Chapters 7 and 8. Although in this example we have only considered a gross error in measurements, it is possible for a gross error to be present in the constraints due to an unaccounted leak or loss of material. Some of the methods described in Chapters 7 and 8 can also be used to identify such gross errors. The example also clearly demonstrates that data reconciliation and gross error detection have to be applied together for obtaining accurate estimates.

BENEFITS FROM DATA RECONCILIATION AND GROSS ERROR DETECTION

Development of a data reconciliation and gross error detection package for a system and its practical implementation is a difficult and costly task and cannot be justified without its benefits for a particular industrial application. The justification for data reconciliation and gross error detection may come from the many important applications for improving process performance shown in Figure 1-1, which require accurate data for achieving the expected benefits, as outlined below:

1. A direct application of data reconciliation is in evaluating process yields or in assessing consumption of utilities in different process units. Reconciled values provide more accurate estimates as compared to the use of raw measurements. For example, refinery-wide material balance reconciliation aids in a better estimate of overall refinery yields. Similarly, a plant-wide energy audit using reconciled flows and temperatures helps in a better identification of energy-inefficient processes and equipment.

2. Applications such as simulation and optimization of existing process equipment rely on a model of the equipment. These models usually contain parameters which have to be estimated from plant data. This is also known as model tuning, for which accurate data is essential. The use of erroneous measurements in model tuning can give rise to incorrect model parameters which can nullify the benefits achievable through optimization. There are two possible ways in which data reconciliation can be used for such applications, which we illustrate using a simple example.

Let us consider the problem of optimizing the performance of an existing distillation column. From the operating data, measurements of flows, temperatures, and compositions of all inlet and outlet streams of the column can be obtained. One possible way is to reconcile these measurements using only overall material and energy balances around the column. The reconciled data can now be used along with a detailed tray-to-tray model of the column in order to estimate parameters such as tray efficiencies. The tuned model can then be used to optimize the performance of the column.

Alternatively, a simultaneous data reconciliation and parameter estimation can be performed using the detailed tray-to-tray model of the column. In this case, if measurements of tray temperatures and/or compositions are available, they can also be used and reconciled as part of the problem. Obviously, the second approach leads to a significant increase in effort and computation time. This approach is also referred to as rigorous on-line modeling and has been incorporated in many commercial steady-state simulators.

3. Data reconciliation can be very useful in scheduling maintenance of process equipment. Reconciled data can be used to accurately estimate key performance parameters of process equipment. For example, the heat transfer coefficient of heat exchangers or the level of catalyst activity in reactors can be estimated and used to determine whether the heat exchanger should be cleaned or whether the catalyst should be replaced/regenerated, respectively.

4. Many advanced control strategies such as model-based control or inferential control require accurate estimates of controlled variables. Dynamic data reconciliation techniques can be used to derive accurate estimates for better process control.

5. Gross error detection not only improves the estimation accuracy of data reconciliation procedures but is also useful in identifying instrumentation problems which require special maintenance and correction. Incipient detection of gross errors can reduce maintenance costs and provide a smoother plant operation. These methods can also be extended to detect faulty equipment.

A BRIEF HISTORY OF DATA RECONCILIATION AND GROSS ERROR DETECTION

The problem of data reconciliation was first introduced in 1961, and during the past four decades more than 200 research publications in the two areas of data reconciliation and gross error detection have appeared. Our purpose in this section is to trace some of the significant contributions that spurred developments in these two areas.

Interestingly, the problem of data reconciliation was first posed by Kuehn and Davidson [2], who were then working in the systems engineering division of IBM Corporation. They derived the analytical solution for a linear material balance problem for the case when all variables are measured. In a series of papers between 1968 and 1976 [3, 4], several important ideas in data reconciliation and the optimal selection of measurements, particularly in linear processes, were introduced. These included the treatment of unmeasured variables, and the decomposition of the reconciliation and coaptation problems using a graph-theoretic approach.

The key concepts of observability and redundancy were also introduced in these papers. The classic paper by Mah et al. in 1976 [5] also treated the general linear data reconciliation problem, including estimation of unmeasured variables. The interrelationship between linear algebraic and graph-theoretic approaches was brought out in this paper. More importantly, the paper clearly demonstrated through simulation of a refinery process that data reconciliation does substantially improve accuracy, especially when sufficient redundancy exists in the measurements. The problem of detecting gross errors caused by measurement biases and process leaks was also tackled in this work.

The next major contribution was the concept of a projection matrix, introduced by Crowe et al. [6]. These authors decomposed the reconciliation and coaptation problems by using a projection matrix to eliminate the unmeasured variables. This approach is more general and can be used even if some of the unmeasured variables are unobservable. The use of the QR factorization in obtaining the projection matrix and in the solution of unmeasured variables was proposed by Swartz [7] and more recently by Sanchez and Romagnoli [8].

Data reconciliation for nonlinear processes was first addressed by Knepper and Gorman [9], who used the iterative technique proposed by Britt and Luecke [10] for parameter estimation in nonlinear regression. Their approach has some limitations as compared to the approach of successive linearization and use of a projection matrix to solve the linearized subproblem, proposed by Pai and Fisher [11]. In general, to solve the nonlinear data reconciliation problem, which involves bounds and other inequality constraints, a constrained nonlinear optimization method has to be used. Tjoa and Biegler [12] made use of successive quadratic programming (SQP) for solving a combined data reconciliation and gross error detection problem, as did Ravikumar et al. [13].

In parallel, methods for steady-state data reconciliation were being developed in the mineral processing area. One of the earliest applications of data reconciliation to a mineral processing circuit was published by Wiegel [14]. A representative sample of publications in this field are by Hodouin and Everell [15], Simpson et al. [16], and Heraud et al. [17], among others. A survey of computer packages for material balancing in mineral processing industries was published by Reid et al. [18].

The problem of data reconciliation in dynamic processes has received attention only recently, although it was first tackled using an extended Kalman filter by Stanley and Mah [19], who used a simple random walk model to describe the process dynamics. Almasy [20] used steady-state reconciliation techniques for dynamic balancing of a linear time-invariant dynamic model of the process by considering the equivalent discrete input-output formulation. For a linear dynamic system, the optimal estimates are obtained using a Kalman filter which, however, cannot handle inequality constraints.

Dynamic data reconciliation has only recently been extended to nonlinear, constrained problems. Liebman et al. [21] have transformed the system of differential-algebraic equations describing a dynamic model into a standard nonlinear program (NLP) and reconciled the data using constrained nonlinear optimization methods. As compared to steady-state reconciliation, which is increasingly being applied to industrial processes, it may take a few more years of development before dynamic data reconciliation is also ready and commercially available for industrial applications.

Within a few years after Kuehn and Davidson's paper on data reconciliation appeared, the problem of identifying gross errors in data and its importance in data reconciliation was pointed out by Ripps [22]. Ripps also proposed the procedure of measurement elimination as a technique for identifying the measurement containing a bias. This has now become one of the standard strategies in multiple gross error identification. Although statistical tests for gross error detection were proposed by Reilly and Carpani [23] as early as 1963, they did not attract much attention since they were presented in a conference paper. The global test and measurement test were proposed by Almasy and Sztano [24] in 1975, and the nodal or constraint test by Mah et al. [5] a year later.

More than a decade later, the generalized likelihood ratio (GLR) test was proposed by Narasimhan and Mah [25], the Bayesian test by Tamhane et al. [26], and more recently the principal component test by Tong and Crowe [27]. Although strategies for identifying the location of one or more gross errors were developed by Mah et al. [5] and Romagnoli and Stephanopoulos [28], a variety of serial elimination strategies using one or more statistical tests for multiple gross error identification were developed by Serth and Heenan [29] and Rosenberg et al. [30]. More importantly, they also compared the performance of these strategies through simulation for determining the best among them.

The method of simulation and its use in evaluating the performance of gross error detection tests and strategies was first clearly explained by Jordache et al. [31]. Different measures for evaluating the performance of gross error detection strategies were also introduced in the above three papers. A different strategy, called the serial compensation strategy, for multiple gross error identification was proposed by Narasimhan and Mah [25]. Simultaneous strategies for multiple gross error detection have also been proposed by Rosenberg et al. [30] and more recently by Rollins and Davis [32]. The investigation for determining the best gross error detection method and for improving its performance is still being pursued.

Applications of data reconciliation to single process units, either in the laboratory or in an operating plant, were reported by Murthy [33], Madron et al. [34], Wang and Stephanopoulos [35], Crowe [36], and Sheel and Crowe [37], among others, who applied it to reactors, and by MacDonald and Howat [38] to a nonequilibrium isothermal flash unit. Reconciliation of flows in industrial processes was reported by Mah et al. [5] and Serth and Heenan [29], though it is not clear whether these were implemented in actual practice. Applications of data reconciliation to actual industrial processes were reported by Ravikumar et al. [13] and many other papers mentioned in Chapter 11. Development of commercial software for industrial applications of data reconciliation and gross error detection began in the late 1980s.

Excellent reviews of data reconciliation and gross error detection have been written at regular intervals by Hlavacek [39], Mah [40], Tamhane and Mah [41], Mah [42], and recently by Crowe [43]. The book by Mah [44] contains a chapter on this topic, as does the book by Bodington [45]. Currently the only book wholly devoted to this area is by Madron [46], which has been revised and expanded recently by Veverka and Madron [47].

SCOPE AND ORGANIZATION OF THE BOOK

This book provides a summarized analysis of the various approaches to data reconciliation and gross error detection. Certain criteria for selecting various techniques and guidelines for their practical implementation are also indicated.

In Chapter 1, we have presented the need for data conditioning in process monitoring. Various signal processing and error reduction techniques were briefly mentioned. Data reconciliation, which provides a model-based error analysis and correction, was introduced and illustrated by a simple example. Major concepts in data reconciliation, such as redundancy and observability, were also defined.

Chapter 2 introduces the statistical characterization of measurement errors and various univariate error reduction techniques. Data filtering, which is widely used for data conditioning, is described in more detail. Various filtering techniques are presented and compared.

Proceeding from Chapter 3 onward, the material is presented in increasing level of complexity. Chapter 3 describes the problem of steady-state linear reconciliation. Both theoretical and computational issues related to linear data reconciliation are elucidated. Decomposition techniques for linear models with both measured and unmeasured variables are described here. Observability and redundancy are important issues for this case. Variable classification techniques related to the observability and redundancy concepts are therefore presented. Both graph-theoretical and matrix-based approaches are described.

Chapter 4 deals with steady-state data reconciliation for bilinear systems. Bilinear constraints, such as component material balances and certain heat balance equations, occur frequently in many industrial reconciliation applications. Bilinear equations contain terms that are products of two random variables. Specialized reconciliation solution methods have been proposed for bilinear constraints. This chapter presents some of them along with their associated benefits and shortcomings.

Chapter 5 treats nonlinear data reconciliation. Nonlinear models are often used to accurately describe most chemical processes. Various techniques used for solving the nonlinear reconciliation problem are discussed. Some are based on successive linearization, while others are derived from general nonlinear programming techniques. The most efficient and widely used solution methods are presented in this chapter. Decomposition techniques for nonlinear problems are also analyzed. Inequality constraints such as bounds on variables are often imposed with nonlinear models in order to obtain a feasible solution. The treatment of inequality constraints is finally analyzed in this chapter.

In the previous chapters only steady-state processes are considered. Data reconciliation techniques for dynamic systems are discussed in Chapter 6. The reconciliation problem for a linear dynamic process becomes a state estimation problem which can be solved via Kalman filtering methods. General optimization techniques have to be used for dynamic nonlinear problems, which are described as part of nonlinear dynamic data reconciliation techniques.

While data reconciliation attempts to eliminate inaccuracies caused by random errors in measurements, gross error detection deals with the identification and removal of systematic biases in measurements and leaks. Chapter 7 introduces the issues involved in gross error detection and describes the basic statistical tests that can be used to detect gross errors. The underlying assumptions, characteristics, and relative advantages and disadvantages of various statistical tests are also discussed. Interaction between gross error detection and data reconciliation is also highlighted.

In any industrial application using process data, it is very important to identify all gross errors, so they can be removed or appropriately accounted for. None of the statistical tests described in the previous chapter provides satisfactory gross error identification for all practical scenarios, and more complex strategies are required. Chapter 8 describes some of the most successful such strategies. The applicability of these methods to nonlinear processes is further discussed. Finally, the effect of bounds or inequality constraints on gross error detection is analyzed.

The previous two chapters describe gross error detection methods for steady-state processes. Chapter 9 treats gross error identification for dynamic systems. The dynamic feature of a process introduces new issues, such as combining information from measurements collected over a period of time, and on-line implementation.

The efficacy of data reconciliation and gross error detection depends significantly upon the location of measured variables. Recent attempts to optimally design the sensor network for maximizing the accuracy of the data reconciliation solution at minimum cost are described in Chapter 10.

Several large-scale industrial applications and existing software systems for data reconciliation and gross error detection are discussed in Chapter 11. Various aspects, such as the context of the industrial application, the problems associated with each type of application, and the methods used to solve them, are analyzed in this last chapter.

SUMMARY

- Measurement errors occur frequently in process instrumentation. Some errors are small and random (random errors); others are large and systematic (gross errors).
- Data validation and data filtering are used to reduce the errors in process data. Filtered data, however, usually do not satisfy the plant model.
- Data reconciliation exploits redundancy in process data in order to determine the necessary measurement adjustments used to create a set of data consistent with the plant model.
- No data reconciliation is possible without data redundancy (more measurements available than the minimum needed to solve the simulation problem).
- A data reconciliation solution obtained from data with gross errors is not reliable, because a large error spreads over other variables, causing unreasonable data adjustments.
- Data reconciliation and gross error detection are closely interrelated. They need to be implemented together in order to obtain a reliable data reconciliation.
- Statistical tests are useful tools for gross error detection.
- Location of instrumentation is important for both data reconciliation and gross error detection. An optimal sensor placement can be predetermined.
- Unmeasured variables and model parameters can be estimated by data reconciliation, provided that enough measured data is available to make them observable.
- Only natural material and energy conservation laws are acceptable for plant models used in data reconciliation. Correlations or approximate relations among process variables are not recommended, since they introduce additional sources of error.
- Data reconciliation can be applied to both steady-state and dynamic processes.


REFERENCES

1. Reklaitis, G. V. Introduction to Material and Energy Balances. New York: John Wiley & Sons, 1983.

2. Kuehn, D. R., and Davidson, H. "Computer Control. II. Mathematics of Control." Chem. Eng. Progress 57 (1961): 44-47.

3. Vaclavek, V. "Studies on System Engineering. I. On the Application of the Calculus of Observations in Calculations of Chemical Engineering Balances." Coll. Czech. Chem. Commun. 34 (1968): 3653.

4. Vaclavek, V., and Loucka, M. "Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plant." Chem. Eng. Sci. 31 (1976): 1199-1205.

5. Mah, R.S.H., G. M. Stanley, and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chem. Proc. Des. Dev. 15 (1976): 175-183.

6. Crowe, C. M., Y. A. G. Campos, and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection. I: Linear Case." AIChE Journal 29 (1983): 881-888.

7. Swartz, C.L.E. "Data Reconciliation for Generalized Flowsheet Applications." American Chemical Society National Meeting, Dallas, Tex. (1989).

8. Sanchez, M., and J. Romagnoli. "Use of Orthogonal Transformations in Data Classification-Reconciliation." Computers Chem. Engng. 20 (1996): 483-493.

9. Knepper, J. C., and J. W. Gorman. "Statistical Analysis of Constrained Data Sets." AIChE Journal 26 (1980): 260-264.

10. Britt, H. I., and R. H. Luecke. "The Estimation of Parameters in Nonlinear, Implicit Models." Technometrics 15 (1973): 233-247.

11. Pai, C. C. D., and G. D. Fisher. "Application of Broyden's Method to Reconciliation of Nonlinearly Constrained Data." AIChE Journal 34 (1988): 873-876.

12. Tjoa, I. B., and L. T. Biegler. "Simultaneous Strategies for Data Reconciliation and Gross Error Detection of Nonlinear Systems." Computers Chem. Engng. 15 (1991): 679-690.

13. Ravikumar, V., S. R. Singh, M. O. Garg, and S. Narasimhan. "RAGE-A Software Tool for Data Reconciliation and Gross Error Detection," in Foundations of Computer-Aided Process Operations (edited by D.W.T. Rippin, J. C. Hale, and J. F. Davis). Amsterdam: CACHE/Elsevier, 1994, 429-436.

14. Wiegel, R. L. "Advances in Mineral Processing Material Balances." Canad. Metall. Q. 11 (1972): 413-424.

15. Hodouin, D., and M. D. Everell. "A Hierarchical Procedure for Adjustment and Material Balancing of Mineral Processes Data." Int. J. Miner. Proc. 7 (1980): 91-116.

16. Simpson, D. E., V. R. Voller, and M. G. Everett. "An Efficient Algorithm for Mineral Processing Data Adjustment." Int. J. Miner. Proc. 31 (1991): 73-96.

17. Heraud, N., D. Maquin, and J. Ragot. "Multilinear Balance Equilibration: Application to a Complex Metallurgical Process." Min. Metall. Proc. 11 (1991): 197-204.

18. Reid, K. J., K. A. Smith, V. R. Voller, and M. Cross. "A Survey of Material Balance Computer Packages in the Mineral Industry," in 17th Applications of Computers and Operations Research in the Mineral Industry (edited by T. B. Johnson and R. J. Barnes). New York: AIME, 1982.

19. Stanley, G. M., and R.S.H. Mah. "Estimation of Flows and Temperatures in Process Networks." AIChE Journal 23 (1977): 642-650.

20. Almasy, G. A. "Principles of Dynamic Balancing." AIChE Journal 36 (1990): 1321-1330.

21. Liebman, M. J., T. F. Edgar, and L. S. Lasdon. "Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programming Techniques." Computers Chem. Engng. 16 (1992): 963-986.

22. Ripps, D. L. "Adjustment of Experimental Data." Chem. Eng. Progr. Symp. Ser. No. 55, 61 (1965): 8-13.

23. Reilly, P. M., and R. E. Carpani. "Application of Statistical Theory to Adjustment of Material Balances." Presented at the 13th Can. Chem. Eng. Conf., Montreal, Quebec, 1963.

24. Almasy, G. A., and T. Sztano. "Checking and Correction of Measurements on the Basis of Linear System Model." Prob. Control Inform. Theory 4 (1975): 57-69.

25. Narasimhan, S., and R.S.H. Mah. "Generalized Likelihood Ratio Method for Gross Error Identification." AIChE Journal 33 (1987): 1514-1521.

26. Tamhane, A. C., C. Jordache, and R.S.H. Mah. "A Bayesian Approach to Gross Error Detection in Chemical Process Data. Part I: Model Development." Chemometrics and Intel. Lab. Sys. 4 (1988): 33.

27. Tong, H., and C. M. Crowe. "Detection of Gross Errors in Data Reconciliation by Principal Component Analysis." AIChE Journal 41 (1995): 1712-1722.

28. Romagnoli, J. A., and G. Stephanopoulos. "Rectification of Process Measurement Data in the Presence of Gross Errors." Chem. Eng. Sci. 36 (1981): 1849-1863.

29. Serth, R. W., and W. A. Heenan. "Gross Error Detection and Data Reconciliation in Steam-Metering Systems." AIChE Journal 32 (1986): 733-742.

30. Rosenberg, J., R.S.H. Mah, and C. Jordache. "Evaluation of Schemes for Detecting and Identification of Gross Errors in Process Data." Ind. & Eng. Chem. Proc. Des. Dev. 26 (1987): 555-564.

31. Jordache, C., R.S.H. Mah, and A. C. Tamhane. "Performance Studies of the Measurement Test for Detection of Gross Errors in Process Data." AIChE Journal 31 (1985): 1187-1201.

32. Rollins, D. K., and J. F. Davis. "Unbiased Estimation Technique for Identification of Gross Errors." AIChE Journal 38 (1992): 563-571.

33. Murthy, A.K.S. "Material Balance around a Chemical Reactor, II." Ind. Eng. Chem. Process Des. Dev. 13 (1974): 347.

34. Madron, F., V. Veverka, and V. Vanecek. "Statistical Analysis of Material Balance of a Chemical Reactor." AIChE Journal 23 (1977): 482-486.

35. Wang, N. S., and G. Stephanopoulos. "Application of Macroscopic Balances to the Identification of Gross Measurement Errors." Biotechnol. Bioeng. 25 (1983): 2177-2208.

36. Crowe, C. M. "Reconciliation of Process Flow Rates by Matrix Projection. II. The Nonlinear Case." AIChE Journal 32 (1986): 616-623.

37. Sheel, J. P., and C. M. Crowe. "Simulation and Optimization of an Existing Ethylbenzene Dehydrogenation Reactor." Can. J. Chem. Eng. 47 (1969): 183-187.

38. MacDonald, R. J., and C. S. Howat. "Data Reconciliation and Parameter Estimation in Plant Performance Analysis." AIChE Journal 34 (1988): 1-8.

39. Hlavacek, V. "Analysis of a Complex Plant-Steady State and Transient Behavior. I. Plant Data Estimation and Adjustment." Computers Chem. Engng. 1 (1977): 75-81.

40. Mah, R.S.H. "Design and Analysis of Performance Monitoring Systems," in Chemical Process Control II (edited by D. E. Seborg and T. F. Edgar). New York: Engineering Foundation, 1982.

41. Tamhane, A. C., and R.S.H. Mah. "Data Reconciliation and Gross Error Detection in Chemical Process Networks." Technometrics 27 (1985): 409-422.

42. Mah, R.S.H. "Data Screening," in Foundations of Computer-Aided Process Operations (edited by G. V. Reklaitis and H. D. Spriggs). Amsterdam: CACHE/Elsevier, 1987, 67-94.

43. Crowe, C. M. "Data Reconciliation-Progress and Challenges." J. Proc. Cont. 6 (1996): 89-98.

44. Mah, R.S.H. Chemical Process Structures and Information Flows. Boston: Butterworths, 1990.

45. Bodington, C. E. Planning, Scheduling and Control Integration in Process Industries. New York: McGraw-Hill, 1995.

46. Madron, F. Process Plant Performance: Measurement and Data Processing for Optimization and Retrofits. Chichester, West Sussex, England: Ellis Horwood Limited Co., 1992.

47. Veverka, V. V., and Madron, F. Material and Energy Balancing in Process Industries: From Microscopic Balances to Large Plants. Amsterdam, The Netherlands: Elsevier, 1997.


Measurement Errors and Error Reduction Techniques

CLASSIFICATION OF MEASUREMENT ERRORS

As mentioned in Chapter 1, there are many sources of instrument error, and a measurement error is present in virtually all measured process data. Some of the measurement errors are random and small (random errors), while others are systematic and large (gross errors). Some authors, such as Madron [1] and Liebman et al. [2], prefer to define a separate class called systematic errors, which is distinguished from the gross error category. In their classification, systematic errors are consistent measurement biases, while the gross error class includes large measurement-related errors (biases and outliers), complete instrument failure, or process-related errors such as process leaks.

In this text, in order to simplify the notation and terminology, we classify all instrument and process errors in two categories: random errors and gross errors (as defined above). Any significant systematic bias is included in the gross error category.

Random Errors

It is generally observed that if the measurement of a process variable is repeated under identical conditions, the same value is not obtained. This is due to the presence of random errors in measurements. Random errors can neither be predicted nor accurately explained. We choose to model the effect of random errors on measurements as additive contributions.

Thus, the relation between the measured value, true value and random error in the measurement of a variable i is expressed by Equation 1-6. In this chapter, unless otherwise required, we drop the subscript i and rewrite Equation 1-6 as

    y = x + ε    (2-1)

where y is the measured value, x is the true value and ε is the random error. The random error usually oscillates around zero. Its characteristics can be described using statistical properties of random variables, which are described in Appendix C. Its mean or expected value is therefore given by

    E(ε) = 0    (2-2)

and its variance by

    var(ε) = E(ε²) = σ²    (2-3)

where σ is the standard deviation of the measurement error. Standard deviation is a measure of the measurement precision. The smaller the standard deviation, the more precise is the measurement and the higher the probability that the random error will be close to zero.

If the random errors in the measurements of two different variables i and j are also considered statistically independent, then they have zero correlation, that is,

    cov(ε_i, ε_j) = E(ε_i ε_j) = 0    (2-4)

Although statistical independence does not always represent reality, this assumption is widely used in the data reconciliation literature because it offers a simpler mathematical description of the measurement errors. Measurements obtained from two different instruments can be correlated if they share a common source of error (for example, a change in the ambient conditions affecting accuracy in a group of measuring devices). This type of correlation is known as spatial correlation. The degree of association between errors ε_i and ε_j is expressed by means of a correlation coefficient, r_ij:

    r_ij = cov(ε_i, ε_j) / (σ_i σ_j)    (2-5)


Mra.sirr-e~~irrt: Err-01-s urtd Error Reducrior; Tc.chttiql<e.s 35

Equation 2-5 can be used to estimate cov(ε_i, ε_j), because r_ij can be obtained from statistical analysis of a set of repeated measurements [3].
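The spatial correlation just described can be estimated directly from paired series of repeated measurements. The following minimal Python sketch (the function name and interface are ours, not from the book) computes the sample covariance and the correlation coefficient of Equation 2-5:

```python
def correlation(e1, e2):
    """Sample covariance and correlation coefficient of two error series.

    Implements r_ij = cov(e_i, e_j) / (s_i * s_j) using sample
    statistics (N - 1 in the denominators).
    """
    n = len(e1)
    m1, m2 = sum(e1) / n, sum(e2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(e1, e2)) / (n - 1)
    s1 = (sum((a - m1) ** 2 for a in e1) / (n - 1)) ** 0.5
    s2 = (sum((b - m2) ** 2 for b in e2) / (n - 1)) ** 0.5
    return cov, cov / (s1 * s2)
```

For two perfectly proportional error series the returned coefficient is 1, while for statistically independent errors it approaches 0 as the sample grows.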

Another type of correlation occurs when the same source of the error persists for a number of measurement periods. In that case, the measurement errors of the same variable at different time instants are serially correlated. Serial correlation is also produced by delay time in control operations, due to unit capacity or inertia. For instance, if there is a delay of k time periods, an output-measured value y_out(t) can be correlated with an input value at time t-k, say y_in(t-k). Kao et al. [4] showed that the effect of neglecting serial correlations can be significant for gross error detection and suggested remedies to account for serially correlated data. A summarized statistical description of serially correlated data, based on time series analysis, can be found in Mah [5].

Estimating the standard deviation. The standard deviation of a measurement error plays an important role in data reconciliation and various other error reduction techniques. Since the true standard deviation is never known, an estimate of the standard deviation can be obtained by using a sample standard deviation, according to the following formula [3]:

    s = [ (1/(N - 1)) Σ_{i=1}^{N} (y_i - ȳ)² ]^(1/2)    (2-6)

where s is the estimated value of the standard deviation, y_i is the i-th observation and ȳ is the arithmetic average of N observations of the same variable. This formula provides an unbiased estimate of the standard deviation. Sample size N is important for the reliability of the estimate. The more observations, the more reliable the estimate. Madron [1] indicates that a minimum of 15 observations (for a steady process variable) should be used.
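As a minimal sketch, Equation 2-6 translates directly into code (the function name is ours):

```python
def sample_std(y):
    """Unbiased sample standard deviation of repeated observations
    (Equation 2-6): s = sqrt(sum((y_i - ybar)^2) / (N - 1))."""
    n = len(y)
    ybar = sum(y) / n
    return (sum((yi - ybar) ** 2 for yi in y) / (n - 1)) ** 0.5
```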

An important requirement for estimating the standard deviation of a measurement error from a sample of measurements using the above equation is that all the measurements of the variable should be drawn from the same statistical population. Practically, this implies that if we use a sample of N measurements of a variable made at successive time instants for estimating the standard deviation of the measurement error, then it is implicitly assumed that during this time interval the true value of the variable has not changed. Moreover, it is also assumed that the measurement errors at different time instants all have the same standard deviation. Alternative ways to estimate the standard deviations (in fact, the entire covariance matrix of measurement errors) when the true values are not constant and when gross errors also occur are described in Chapter 3.

A complete mathematical description and statistical treatment of the random errors requires a probability density function (see Appendix C). This implies knowledge or an assumption concerning the distribution type. The usual assumption in the data reconciliation literature is that process data follow a normal distribution. Madron [1] summarizes the main reasons for selecting the normal distribution as follows:

1. It was found that the normal distribution approximates well the behavior of measurements in the natural sciences, particularly within the range mean ± 3σ.

2. An error is often the sum of a large number of single, elementary errors. According to the central limit theorem, under certain generally acceptable conditions, the distribution of such a sum approaches the normal distribution (for a large number of elementary errors).

3. The theory of the normal error model is well developed and easy to treat mathematically. The values of the probability density and distribution functions for the standard normal distribution are available in tabulated form in any statistical textbook, which facilitates the solving of practical problems.

One immediate practical use of the probability density function for the normal distribution is in estimating the standard deviation of a function of random variables. This problem is important for estimating the standard deviation of a secondary random variable which is calculated from some other directly measured variables, denoted as primary variables. It is assumed that the probability properties (mean value, standard deviation) of the directly measured variables are known. For example, a flow rate F can be estimated by using measurements of an orifice gauge according to the formula:

    F = k (Δp p₀ / T)^(1/2)    (2-7)

where k is the orifice gauge constant, Δp is the pressure difference across the orifice, p₀ is the inlet orifice pressure and T is the fluid temperature [1].


The mean value of a function z = f(x) of random variables x is defined as:

    E(z) = ∫ f(x) p(x) dx    (2-8)

where p(x) is the m-variate probability density function of the vector of random variables x. If f(x) is a linear function, i.e.,

    f(x) = a_1 x_1 + a_2 x_2 + . . . + a_m x_m    (2-9)

then the mean value of f(x) is also linear:

    E(z) = a_1 E(x_1) + a_2 E(x_2) + . . . + a_m E(x_m)    (2-10)

The variance of a function f(x) of random variables is defined as

    var(z) = ∫ [f(x) - E(z)]² p(x) dx    (2-11)

If f(x) is linear and the primary errors are uncorrelated, Equation 2-11 reduces to

    σ_z² = a_1² σ_1² + a_2² σ_2² + . . . + a_m² σ_m²    (2-12)

If f(x) is a nonlinear function, such as Equation 2-7 above, the problem becomes more complex and a solution can be obtained either by integrating Equation 2-11 or by linearization of f(x) (Taylor expansion), which enables using Equation 2-12 for an approximate solution [1]. The latter approach gives rise to the following general approximation formula:

    σ_z² ≈ (∂f/∂x_1)² σ_1² + (∂f/∂x_2)² σ_2² + . . . + (∂f/∂x_m)² σ_m²    (2-13)

A practical aspect of the estimation problem of the standard deviation for linear functions deals with computing an overall accuracy for a measurement system. For example, let us assume that three devices contribute to produce a measured value: a sensor, a transmitter, and a recorder. Each component has its own error and standard deviation. The overall error and standard deviation can be obtained by a linear combination of each component error and standard deviation. An overall standard deviation is obtained by using Equation 2-12. Zalkind and Shinskey [6] used similar analytical derivations as Madron [1] and provide examples of estimating instrument error and standard deviation by combining component information.
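The approximation of Equation 2-13 can also be sketched numerically. The helper below is our illustration (not from the book); it estimates the partial derivatives by central finite differences instead of analytic differentiation, so it applies equally to nonlinear functions such as the orifice formula of Equation 2-7:

```python
def propagate_std(f, x, sigmas, h=1e-6):
    """Approximate std dev of z = f(x) by linearization (Equation 2-13):
    sigma_z^2 ~ sum_i (df/dx_i)^2 * sigma_i^2.
    Partial derivatives are estimated by central differences."""
    var = 0.0
    for i, (xi, si) in enumerate(zip(x, sigmas)):
        step = h * max(abs(xi), 1.0)
        xp, xm = list(x), list(x)
        xp[i], xm[i] = xi + step, xi - step
        d = (f(xp) - f(xm)) / (2.0 * step)   # df/dx_i
        var += (d * si) ** 2
    return var ** 0.5

# Hypothetical orifice example: F = k * sqrt(dp * p0 / T), with k = 2.0
# and the operating point and sigmas chosen for illustration only.
orifice = lambda v: 2.0 * (v[0] * v[1] / v[2]) ** 0.5
sigma_F = propagate_std(orifice, [100.0, 10.0, 300.0], [1.0, 0.1, 2.0])
```

For a linear function the result coincides with Equation 2-12 exactly; for the orifice formula it corresponds to the linearization approach of Madron [1].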

An extension of this type of calculation is given in Nair and Jordache [7]. They applied the linear combination rule to estimate the effective standard deviation of a measurement system based on the information obtained from the accuracy and precision of the measurement system. The accuracy of the system is a measure of the agreement between the instrument reading and the true value. This information is provided by the instrument vendor and is usually estimated by the linear combination rule applied to measuring and processing components, as presented above. The precision of a measurement is a measure of the agreement of several repeated readings of the same measurement. The sample standard deviation estimated by Equation 2-6 is a measure of the repeatability of a measurement in steady conditions. An overall (effective) standard deviation can be estimated as

    s_eff = (s_a² + s_p²)^(1/2)    (2-14)

where s_a is the instrument accuracy (usually given as a percentage of the instrument range) and s_p is the precision of the instrument (the repeatability standard deviation).
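Assuming the effective standard deviation is the root-sum-of-squares combination of the accuracy and repeatability terms (our reading of the formula above), the calculation is a one-liner:

```python
def effective_std(s_a, s_p):
    """Effective standard deviation from instrument accuracy s_a and
    repeatability (precision) s_p, combined in quadrature."""
    return (s_a ** 2 + s_p ** 2) ** 0.5
```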

Gross Errors

A detailed definition of the gross errors was given in Chapter 1 and at the beginning of this chapter. Usually gross errors are associated with sensor faults. Figure 2-1, reproduced from Dunia et al. [8], illustrates graphically the most common types of instrument faults: bias, complete failure, drifting, and precision degradation.

If a gross error exists in a measured value, the measurement equation (Equation 2-1) changes to:

    y = x + δ + ε    (2-15)

where δ is the magnitude of the gross error. Note that process leaks, which are also categorized as gross errors, cannot be modeled by Equation 2-15. They represent model errors and therefore affect the constraint equations, as shown in Chapter 7.

Gross errors significantly affect the accuracy of any industrial application using process data. They have to be detected and removed. Some of


Figure 2-1. Instrument types of faults. Reproduced with permission of the American Institute of Chemical Engineers. Copyright © 1996 AIChE. All rights reserved.

them, such as occasional outliers (spikes), can be detected by using special filtering techniques or statistical quality control (also known as statistical process control). Other types might be more difficult to detect without a physical model. Data reconciliation is the appropriate tool in most cases.

ERROR REDUCTION METHODS

Analog and digital filters have been widely used to reduce random errors (high-frequency noise) in process values. Inadequate sampling frequency converts a high-frequency signal into an artificial low-frequency signal. This phenomenon is known as signal aliasing. Analog filters are used to prefilter process data before sampling and prevent aliasing. Digital filters are used afterward to further attenuate high-frequency noise. Seborg et al. [9] provide a summarized presentation of both analog and digital filters for process data used in control applications. This text includes a review of various digital filters, which are very helpful tools for data conditioning before data reconciliation.

Various classical digital filters have been designed. Each filter type has its own advantages, as well as related shortcomings. Some are able to significantly reduce noise, but they introduce a sizable delay in the filtered response. There are other filtering procedures that do not add a long delay but do not produce satisfactory noise removal. Other types of filters give both satisfactory noise removal and time delay in some cases, but perform poorly for measurements having variable frequencies or noise associated with fast dynamics in the process variables. Overshooting/undershooting is a common problem in the last case.

In general, a trade-off between the amount of noise attenuation and the time delay in the filtered results is required in order to achieve the best performance for any type of filter. This can be accomplished by tuning the filter parameters which, unfortunately, is not an easy task. The random noise is often combined with instrument bias, slow drifts, fast process changes, or other disturbances such as cycling in the feedback control loops. To distinguish between a true process change and random noise, the dynamics of the process must be well known, or a diagnostic system, such as an expert system, should be used. In the absence of such information, it is advisable to avoid excessive filtering. Too much filtering tends to mask significant changes in the process variables.

A brief description and analysis of the most widely used classical digital filters is given below. The discussion is restricted to data filtering, which means noise removal in the most recent measurements. Data filtering is to be distinguished from data smoothing, which deals with past data. The former estimates the current value based on the current and past measurements and is of primary concern in process control. The latter estimates the value of the central point from past and recent measurements (values from both sides of the central point) and is mainly used for fault diagnosis and steady-state process optimization. Many authors, however, do not distinguish between the two terms and use the data "smoothing" term for data filtering as well.

An integral of absolute errors (IAE) similar to that of Kim and Lee [10] will be used to compare various filtering techniques in this text. Here the IAE is the summation of the absolute differences of the filtered values and the corresponding true values over a specified number of time steps. Note that Kim and Lee used smoothed values instead of true values, but using simulated true values gives a cleaner measure for the


amount of filtering and delay. The lower the IAE, the more filtering (with reduced delay) is obtained.
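The IAE metric described above can be sketched as follows (the function name is ours):

```python
def iae(filtered, true_values):
    """Integral of absolute errors: sum of |filtered - true| over all
    time steps. Lower IAE means better combined noise removal and delay."""
    return sum(abs(f - t) for f, t in zip(filtered, true_values))
```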

Exponential Filters

This filter is by far the most commonly used in industrial applications. It is a discrete-time filter, equivalent to the first-order lag in a continuous system, and is a standard filter incorporated in many DCS systems. It is also well known in the control area. It can be analytically described by the following equation:

    y_k = θ x_k + (1 - θ) y_{k-1}    (2-16)

where x_k = raw (unfiltered) measurement at time t_k
      y_k = filtered value at time t_k
      θ = filter parameter

The exponential filter requires filter initialization: at t_k = 0, y_0 = x_0. The filter parameter θ is a tuning parameter with a range 0 < θ ≤ 1. If θ is close to zero, significant filtering is obtained, while for θ close to 1, very little filtering is done. Note that the exponential filter is of the infinite impulse response (IIR) type, which means that the effect of any input signal is felt forever, but with diminishing effect.

Exercise 2-1. Derive Formula 2-16 from the first-order differential equation used in control literature for the first-order lag:

    τ_f dy(t)/dt + y(t) = x(t)

where τ_f = filter time constant, in units of time. Discuss the relationship between the filter time constant τ_f and the filter parameter θ.

Hint: Express the derivative dy(t)/dt as an approximation by a backward difference:

    dy(t)/dt ≈ (y_k - y_{k-1}) / Δt

The exponential filter has several advantages. The effect of an impulse (spike) input x_k is immediately reduced to θx_k. It is computationally efficient and is easy to use and tune for steady-state or slow dynamic signals (single parameter tuning). It does not overshoot and ultimately approaches the proper steady-state value. For these reasons, the exponential filter is used in many control systems. It has a problem, however, in that significant measurement noise attenuation is accompanied by a relatively large delay in the filtered signal.
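A minimal sketch of the exponential filter of Equation 2-16, with the initialization y_0 = x_0 described above:

```python
def exponential_filter(x, theta):
    """First-order (exponential) filter: y_k = theta*x_k + (1-theta)*y_{k-1},
    initialized with y_0 = x_0. Small theta filters heavily; theta = 1
    passes the raw signal through unchanged."""
    y = [x[0]]
    for xk in x[1:]:
        y.append(theta * xk + (1.0 - theta) * y[-1])
    return y
```

Applied to a step from 0 to 10 with theta = 0.2, the first filtered value after the step is only 2.0, illustrating the delay that accompanies heavy filtering.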

Example 2-1

Figure 2-2 illustrates the filtering provided by the exponential filter for two different parameters (θ = 0.2 and θ = 0.4) applied to the data set presented in Table 2-1. This particular data set contains two step signals, steady-state noise, and a spike (outlier). The true values are plotted with the dark solid line. Raw values were simulated by adding random errors (generated with a selected standard deviation) to the true values. A spike was also simulated in one data point.

The integral absolute error (IAE) defined above was included in the chart for performance comparison. As shown by the individual filter plots, the exponential filter with the lower filter parameter (θ = 0.2) indeed does more filtering in steady-state situations than the filter with θ = 0.4.

Figure 2-2. Exponential filters.


However, the overall IAE for θ = 0.2 is higher than the IAE for θ = 0.4 because of the increased delay after step changes and the spike. Tuning the exponential filter for noisy data accompanied by frequent step changes and occasional spikes is a challenging task.

Table 2-1. True Values and Raw Data for Example 2-1

Time True Value Raw Value


95 80.0000 80.0000
96 70.0000 70.0000
97 70.0000 70.6065
98 70.0000 67.9913
99 70.0000 68.9857
100 70.0000 67.1895
101 70.0000 72.4049
102 70.0000 71.0063
103 70.0000 68.7471
104 70.0000 68.7806
105 70.0000 72.0691
106 70.0000 67.9624
107 70.0000 64.4622
108 70.0000 66.7040
109 70.0000 70.2901
110 70.0000 69.5325
111 70.0000 70.9931
112 70.0000 76.0272
113 70.0000 69.8086
114 70.0000 74.4137
115 70.0000 69.8983
116 70.0000 72.3616
117 70.0000 72.6424
118 70.0000 65.6905
119 70.0000 73.2806
120 70.0000 67.8898
121 70.0000 71.5254
122 70.0000 69.5290
123 70.0000 68.2254
124 70.0000 68.8187
125 70.0000 55.8263
126 70.0000 71.0346
127 70.0000 65.5383
128 70.0000 74.9540
129 70.0000 72.4687
130 70.0000 57.6565
131 70.0000 69.2903
132 70.0000 69.5289
133 70.0000 72.3914
134 70.0000 65.6845
135 70.0000 71.2673
136 70.0000 70.7821
137 70.0000 71.9197
138 70.0000 68.4219
139 70.0000 72.0655
140 70.0000 69.9292
141 70.0000 71.9856
142 70.0000 67.6774
143 70.0000 68.8003
144 70.0000 66.7914
145 70.0000 68.7419
146 70.0000 71.8966
147 70.0000 70.5520
148 70.0000 71.8154
149 70.0000 71.9483

Various modifications have been proposed to enhance the performance of the exponential filter:

1. Rhinehart [11] developed a method for automatic tuning of the first-order filter. The method assumes that the sampling period is small in comparison to the time for real process changes and that the noise is a random error which follows a normal distribution with zero mean. Instead of specifying the filter parameter θ or the filter time constant τ_f, the user needs to specify the desired confidence interval for the filtered value (usually 95%). The method adjusts the filter time constant to minimize the time lag while maintaining the desired accuracy.

2. The double exponential filter, or second-order filter, is equivalent to two first-order filters in series, where the second one filters the output from the first exponential filter. This type of filter was used by Tham and Parr [12] for signal reconstruction when an outlier is detected by a validation test. The classical derivation of this filter, coming from time series analysis, is given in Seborg et al. [9].

3. The nonlinear exponential filter is another variation of the exponential filter (Weber [13]). This filter heavily filters the noise while reducing the delay. The nonlinear filter uses a design noise band, determined as a multiple R of standard deviations. The form of the filter is as in Equation 2-16, except that the filter parameter θ is defined as a function of the deviation Δx = x_k - y_{k-1} of the current measurement from the previous filtered value:

    θ = |Δx| / (Rσ) if |Δx| < Rσ, and θ = 1 otherwise

where R = tuning parameter
      σ = standard deviation of the measurement error


The relationship between the nonlinear exponential filter and the exponential filter is easy to obtain. Equation 2-16 can be written as

    y_k = y_{k-1} + θ (x_k - y_{k-1}) = y_{k-1} + θ Δx

Then

    θ = |Δx| / (Rσ) if |Δx| < Rσ, and θ = 1 if |Δx| ≥ Rσ

Since the true standard deviation is never known, a sample standard deviation can be used as an estimate. The filter parameter θ is then used in Equation 2-16 (as before, 0 < θ ≤ 1). Typically, R lies between 3 and 5. If R is less than 3, little noise reduction is achieved. On the other hand, if R is greater than 5, significant delay results with a marginal improvement in the noise filtering.

The nonlinear filter acts like an exponential filter with a filter parameter θ that varies depending on the magnitude of the difference between filtered and raw measurements. The filter parameter θ is low for signals close to the previous filtered value and high for signals far from the previous filtered value. Measurements far from the previous filtered value have θ = 1 (no filtering at all outside the design noise band). Therefore, delay is eliminated for situations when there is a rapid, significant measurement change. Since the nonlinear exponential filter is tuned for the frequency of a particular noise level, it performs optimally for signals whose noise level is steady. It is not recommended for filtering signals with spikes since such signals are not sufficiently filtered or not filtered at all.
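A sketch of the nonlinear exponential filter (our reading of the definition above: θ grows in proportion to the deviation inside the Rσ noise band and saturates at 1 outside it):

```python
def nonlinear_exp_filter(x, sigma, R):
    """Nonlinear exponential filter (after Weber [13]): theta = |dx|/(R*sigma)
    inside the noise band, theta = 1 outside it, so large genuine changes
    pass with no delay while small noise is heavily damped."""
    y = [x[0]]
    band = R * sigma
    for xk in x[1:]:
        dx = xk - y[-1]
        theta = min(abs(dx) / band, 1.0)
        y.append(y[-1] + theta * dx)
    return y
```

A measurement far outside the band is passed through unchanged (no delay), which is also why isolated spikes are not filtered at all.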

Example 2-2

The filtering performance of the nonlinear exponential filter (with σ = 1 and R = 4) for the same data set as in the previous examples is shown in Figure 2-3. As noticed from the overall IAE for the filtered values, the performance of this filter is higher than that of the simple exponential filters presented in Example 2-1. The only problem with this filter is that it does not filter out spikes at all. Therefore, this filter type is not appropriate for signals with significant outliers.

Exercise 2-3. Reverse nonlinear exponential filter. Modify the definition of the filter parameter θ for the nonlinear exponential filter so it filters more data outside a chosen noise band and less (or no filtering at all) inside the noise band. Using the data in Table 2-1, calculate filter values and IAE and plot the results as in Example 2-2. Describe a situation where the reverse nonlinear exponential filter can be useful. For what kind of signal is this filter type the most inappropriate?

Figure 2-3. Nonlinear exponential filter.


Moving Average Filters

This is another common class of filters. The general analytical expression is:

    y_k = w_1 x_1 + w_2 x_2 + . . . + w_N x_N    (2-22)

where N = number of data points in the filter
      w_i = weight for the measurement x_i (x_N being the most recent measurement, at time t_k)

In the absence of past data for the previous N time points, a history initialization is required.

The moving average is a finite impulse response (FIR) filter, which means that the effect of any input lasts only for N steps. For the common form, all input data is given equal weight, i.e., w_i = 1/N. The equal-weight moving average cancels out periodic noise. As in the case of the exponential filter, the moving average is easy to tune for steady-state or quasi-steady-state signals, requiring only the adjustment of the number of input values used to calculate the average. Furthermore, as with the exponential filter, the moving average does not overshoot and reaches the correct steady state after a step change. The moving average is also easy to implement and fast to compute, although it requires more storage and calculation than the exponential filter. Moreover, being an FIR filter, the moving average requires special initialization as shown above.

The moving average is most effective when estimating the center point rather than the current value. It is particularly useful for estimating a fixed value or a linear trend.
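A sketch of the equal-weight moving average (Equation 2-22 with w_i = 1/N). The start-up initialization used here, averaging however many points are available, is one reasonable choice and is our assumption:

```python
def moving_average(x, N):
    """Equal-weight moving average over the N most recent measurements;
    during start-up the window simply holds fewer points."""
    y = []
    for k in range(len(x)):
        window = x[max(0, k - N + 1): k + 1]
        y.append(sum(window) / len(window))
    return y
```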

Example 2-3

Figure 2-4 illustrates the filtering provided by the moving average filter with equal weights for two different parameters (N = 10 and N = 20) applied to the data set presented in Table 2-1. As noticed from the individual filter plots, the moving average filter with the higher number of data points (N = 20) does more filtering in steady-state situations than the filter with N = 10. However, the overall IAE for N = 20 is much higher than the IAE for N = 10, because of the increased delay after step changes and the spike. The previous history persists longer in the filtered values for the case with a larger number of data points. For this reason, the moving average filter with equal weights is not recommended for signals with step changes or spikes.

For dynamic data, a better performance can be obtained by using a moving average with unequal weights. As in the case with equal weights, the summation of all w_i weights over i = 1, . . ., N must be equal to 1. One possible set of such weights can be provided by the following formula [14]:

where the weights w_i increase exponentially with i and satisfy the summation condition. Two tuning parameters are involved: the number of data points N and an exponent r. Usually 0 < r ≤ 10. For a fixed number of points N, a higher exponent r results in a higher weight for the more recent measurement, i.e., less filtering. A lower value for the exponent r provides more filtering but also adds more delay.

Figure 2-4. Moving average filters.


Exercise 2-4. Prove that the summation over N of w_i as given by Equation 2-23 is equal to 1.

Exercise 2-5. Repeat the filter calculation for the data presented in Table 2-1 with the moving average filter with unequal weights as given by Equation 2-23. Choose N = 20 and r = 4. Compare the results with the results in Figure 2-4. Is there any incentive to use the more complex filter weights given by Equation 2-23 rather than a simple moving average filter with equal weights?

Note that the filter with unequal exponential weights described above by Equations 2-22 and 2-23 is to be distinguished from the exponentially weighted moving average (EWMA) filter, which is often used in the statistical process control area. The EWMA filter is analytically described as [3]:

    y_k = λ x̄_k + (1 - λ) y_{k-1}    (2-24)

where x̄_k = sample mean (moving average with equal weights) at time t_k
      y_k = filtered value at time t_k
      λ = filter parameter; 0 < λ < 1

Initially, y_0 is taken as the control target μ_0 (y_0 = μ_0). This filter can also be expressed as a weighted average of past sample means, i.e.,

    y_k = λ Σ_{i=0}^{k-1} (1 - λ)^i x̄_{k-i} + (1 - λ)^k y_0    (2-25)

Equation 2-25 indicates that the weights assigned to the sample means decrease geometrically with age. For that reason, this filter is sometimes referred to as the geometric moving average filter [3]. Some authors [5, 15] prefer to use the current measurement, x_k, instead of the sample mean x̄_k. A recommended range for λ is 0.05 < λ < 0.5 [3]. MacGregor [15] indicates that a common choice is λ = 0.2.
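The EWMA recursion of Equation 2-24 can be sketched as follows (the sample means are supplied by the caller, and y0 is the control target):

```python
def ewma(sample_means, lam, y0):
    """EWMA filter: y_k = lam * xbar_k + (1 - lam) * y_{k-1}, started
    from the control target y0; the weights on older sample means decay
    geometrically, hence the name geometric moving average."""
    y, prev = [], y0
    for xbar in sample_means:
        prev = lam * xbar + (1.0 - lam) * prev
        y.append(prev)
    return y
```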


Polynomial Filters

Polynomial filters can be derived from the least-squares filters, which have been designed for data smoothing. While least-squares polynomials are widely used in data smoothing, they are also suitable for data filtering [10, 14]. The general form of the polynomial filter is shown by the following equation:

    y_k = b_m t_k^m + b_{m-1} t_k^(m-1) + . . . + b_2 t_k² + b_1 t_k + b_0    (2-26)

where t_k = current time
      m = filter order (a nonzero, positive integer)
      b_0, . . ., b_m = filter parameters chosen such that:

    Min over b_0, . . ., b_m of Σ_{i=1}^{N} (x_i - y_i)²    (2-27)

where N = number of time steps (data points) included in the filter.

Polynomial filters are FIR-type filters since they use a limited history of inputs. They provide good noise reduction while following the overall trend of the measurement data. The amount of filtering and the delay depend on the number of data points used and the order of the polynomial.

The disadvantage with Equation 2-26 for the polynomial filter is that the filter parameters b_0, . . ., b_m vary with each time step and must be recalculated by solving the least-squares problem at each time step. Because of excessive computations, initial commercial applications were generally limited to first-order filters.

An alternative form of the polynomial filter using time-invariant filter parameters can be derived for measurement signals sampled at uniform intervals. Uniform sampling intervals are typical with digital sampled-data control systems. In this form, the polynomial filter becomes:

    y_N = c_1 x_1 + c_2 x_2 + . . . + c_N x_N    (2-28)


where x_1, . . ., x_N = unfiltered signal values at times t_1, . . ., t_N
      c_1, . . ., c_N = time-invariant filter factors
      t_N - t_{N-1} = . . . = t_2 - t_1 = Δt, the uniform sampling interval

The important characteristic of this polynomial filter form is that the filter factors c_1, . . ., c_N are not functions of the time step, as are the b_0, . . ., b_m parameters in the conventional formulation shown in Equation 2-26. Once the polynomial order m and the number of points N are selected, the c_1, . . ., c_N filter factors are constants. For a particular set m and N, the filter factors c_1, . . ., c_N can be calculated by a procedure described in Exercise 2-6.
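Following the derivation outlined in Exercise 2-6, the time-invariant filter factors can be computed once and reused at every step. The sketch below (our implementation, using numpy and assuming t_i = i, i.e., a unit sampling interval) evaluates the least-squares polynomial fit at the most recent point:

```python
import numpy as np

def poly_filter_factors(N, m):
    """Filter factors c_1..c_N for an order-m least-squares polynomial
    over N uniformly sampled points, from c = [1, N, ..., N^m] (P'P)^-1 P'
    with P rows [1, i, ..., i^m] (see Exercise 2-6)."""
    i = np.arange(1, N + 1)
    P = np.vander(i, m + 1, increasing=True)
    tN = np.vander(np.array([N]), m + 1, increasing=True)[0]
    return tN @ np.linalg.inv(P.T @ P) @ P.T

def poly_filter(x, N, m):
    """Filtered estimate y_N = c_1 x_1 + ... + c_N x_N over the N most
    recent samples (Equation 2-28)."""
    c = poly_filter_factors(N, m)
    return float(c @ np.asarray(x[-N:], dtype=float))
```

The factors sum to 1 (a constant signal is reproduced exactly), and for noiseless linear data a first-order filter returns the last true value with no delay.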

This form of the polynomial filter was developed and applied to industrial processes in the early 1970s and was recently published in first-order form [10, 14].

Exercise 2-6. Prove Equation 2-28 for uniform sampling intervals and find an expression for calculating the filter factors c_1, . . ., c_N. Hint: Derive the least-squares objective function 2-27 with respect to b_0, . . ., b_m and first find a regression equation for the vector b, which is of the general form b = (PᵀP)⁻¹Pᵀx, where x is the vector of unfiltered signal values and P is some regression matrix (see [16]). Next, use Equation 2-26 written as y_k = [1 t_k t_k² . . . t_k^m] b, and by replacing vector b with the previous regression, then taking k = N (for the last or current data point) and assuming that t_i = iΔt, where Δt is a constant sampling interval, get an expression for y_N similar to Equation 2-28.

When using a large number of points in a polynomial filter, it may be more convenient to use an analytical representation of the filter factors [15]. These forms of the filters can be derived by recognizing that the filter factors can be represented analytically as

    c_i = a_0 + a_1 i + . . . + a_m i^m    (2-29)

Substituting the c_i factors from Equation 2-29 into Equation 2-28 gives a filter form in the a_i's, which greatly reduces data storage requirements. For example, a polynomial filter using 100 points in the form of Equation 2-28 requires storage of 100 c factors. The same filter in the form of Equation 2-29 for the c factors requires storage of only two a coefficients (for a first-order polynomial). The filter factor coefficients a_0, . . ., a_m can be determined by a least-squares solution of the non-square system of equations formed by writing Equation 2-29 repeatedly, for i = 1, 2, . . ., N.

For a given number of filter data points, the higher the polynomial order, the more closely the filtered response follows the measurement data. The high-frequency noise is not removed, however. A low-order polynomial is usually preferred for filtering, although the lower the order, the larger the delay. Typically, a first or second order is used for most polynomial filters in process control systems. For a selected polynomial order, the only tuning parameter is the number of data points N. A high number of points gives a smoother output but also more delay. Overshooting is another negative behavior of polynomial filters. It occurs when there is a fast rate of change in a signal value and affects the result even after the signal becomes stabilized. The larger the number of points used by the filter, the more overshooting occurs.

where a0, . . ., am = filter factor coefficients

Figure 2-5. First-order polynomial filters.


Example 2-4

Figure 2-5 illustrates the filtering performance provided by a first-order polynomial filter with two different numbers of data points (N = 10 and N = 30) applied to the data set presented in Table 2-1. As shown by the two individual filter plots, the polynomial filter with the higher number of data points (N = 30) does more filtering in steady-state situations than the filter with N = 10, but it takes longer to reach steady state after a step change. It also does more overshooting after a step change or a spike than the filter with the lower number of data points. Tuning the polynomial filter is relatively easy for steady-state signals but, as with the other filter types, it is more cumbersome for signals with step changes or spikes.

Hybrid Filters

As seen in the filter examples presented above, none of the individual filters behaves in a satisfactory way for unsteady-state signals. The performance of classical digital filters can be enhanced by creating hybrids that combine the features of different filters. The simplest hybrid one can build is an arithmetic average of the filtered values from two types of filters. To create a better filter, the two participating individual filters should have opposite features. For example, if one filter is able to significantly reduce noise but introduces a long delay, the other filter needs to create a much shorter delay although it might follow the noisy data too closely. For that reason, the most appropriate combinations for such hybrids are: polynomial/moving average, polynomial/exponential, and exponential/nonlinear exponential.

A more complex hybrid that eliminates overshooting but follows the process dynamics is presented in Clinkscales and Jordache [14]. To achieve this behavior, their filter first detects the type of the change in the process value by analyzing the trend. A modified Shewhart test used in statistical process control [3] has been used to detect a change in the state of a variable as follows:

Zi = (Xi - Xs) / σ

where Xi is the current (ith) raw value, Xs is a long-term average for the most recent steady-state situation, and σ is the steady-state standard deviation of the measured variable X. If |Zi| > SCL, where SCL is a selected Shewhart control limit, a significant change in process data is detected. This test is much simpler than the usual CUSUM tests used in statistical process control. It is easier to implement, and, with proper tuning, enables faster state-change detection than the CUSUM tests. For this reason, the Shewhart test is often used in association with CUSUM tests in statistical process control [17].

Five major signal types can be determined with their algorithm: steady state, step change, ramp, impulse (spike), and undetermined transient state. The filter is forced to follow the process dynamics more closely in the case of a true process change and to do more filtering in the case of random noise (steady state) or spikes. To eliminate overshooting/undershooting, the filter value is limited to the maximum/minimum short-term unequally weighted moving average (more weight on the most recent measurement).

A similar approach has been used by Tham and Parr [12]. They also apply statistical tests to determine if there is a trend in the data. They classify the trends in three categories: (a) a distinct trend; (b) a trend that, due to noise, is not immediately discernible; and (c) no trend. Additional tests are applied to detect outliers for each type of trend. If an outlier is found, a signal reconstruction formula is used to estimate a value which is used to replace the outlier.

Exercise 2-7. Create hybrids (simple arithmetic averages) for polynomial/exponential and polynomial/moving average using the data from Table 2-1. Use α = 0.2 for the exponential filter, N = 10 for the moving average, and N = 20 for the first-order polynomial filter. For the first-order polynomial filter and N = 20, the filter factor coefficients are as follows: a0 = -0.1 and a1 = 0.01429; the filter factors ci can be calculated with Equation 2-29. Compare the results with those obtained with the individual filters.
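A minimal sketch of the polynomial/exponential hybrid of Exercise 2-7 follows. Table 2-1's data is not reproduced in this chunk, so a constant test signal stands in; the plain-average start-up for the polynomial filter (before N points are available) is an assumption of this sketch, not a prescription from the text:

```python
def exp_filter(x, alpha=0.2):
    """Exponential filter: y_k = alpha*x_k + (1 - alpha)*y_{k-1}."""
    y, out = x[0], []
    for xi in x:
        y = alpha * xi + (1 - alpha) * y
        out.append(y)
    return out

def poly1_filter(x, N=20):
    """First-order polynomial filter: c_i = a0 + a1*i with a0 = -0.1,
    a1 = 1/70 (the Exercise 2-7 coefficients for N = 20)."""
    a0, a1 = -0.1, 1.0 / 70.0
    out = []
    for k in range(len(x)):
        w = x[max(0, k - N + 1):k + 1]      # the most recent (up to N) points
        if len(w) < N:
            out.append(sum(w) / len(w))     # assumed start-up: plain average
        else:
            out.append(sum((a0 + a1 * (i + 1)) * wi for i, wi in enumerate(w)))
    return out

def hybrid(x):
    """Simple hybrid: arithmetic average of the two individual filter outputs."""
    return [(a + b) / 2 for a, b in zip(exp_filter(x), poly1_filter(x))]
```

On a steady signal both filters pass the value through unchanged (the polynomial factors sum to one), so the hybrid does as well; the interesting comparison, as the exercise suggests, is on data with steps and spikes.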


There are many other error reduction techniques, but analyzing all of them is beyond the scope of this book. Statistical process control represents an area of special interest for data validation and data conditioning in connection with process control applications [3, 15, 17, 18]. Stanley [19] provided an excellent review of almost all known error reduction methods, including data reconciliation. The focus of this textbook, however, is data reconciliation, which exploits redundancy in process data to more accurately adjust the process values and detect gross errors.

SUMMARY

- Random errors can be naturally described by a normal probability distribution, which is suitable for most measurements associated with the physical sciences. Other distribution types can also be used.
- Standard deviations of secondary random variables can be estimated from the standard deviations of the primary random variables.
- Individual filtering techniques can be used for error reduction in process measurements, but they are not easy to tune. Some reduce the errors significantly, but with large delay. Others have less delay, but overshoot/undershoot after a true step process change.
- Hybrid filters perform better than individual digital filters, but for best performance they need to be able to recognize the type of process signal.

REFERENCES

1. Madron, F. Process Plant Performance: Measurement and Data Processing for Optimization and Retrofits. Chichester, West Sussex, England: Ellis Horwood Limited Co., 1992.

2. Liebman, M. J., T. F. Edgar, and L. S. Lasdon. "Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programming Techniques." Computers Chem. Engng. 16 (no. 10/11, 1992): 963-986.

3. Wadsworth, H. M. Handbook of Statistical Methods for Engineers and Scientists. New York: McGraw-Hill, 1990.

4. Kao, C. S., A. C. Tamhane, and R.S.H. Mah. "Gross Error Detection in Serially Correlated Process Data." Ind. & Eng. Chem. Research 29 (no. 6, 1990): 1004-1012.

5. Mah, R.S.H. Chemical Process Structures and Information Flows. Boston: Butterworths, 1990.

6. Zalkind, C. S., and F. G. Shinskey. "Statistical Methods for Computing Over-All System Accuracy." ISA Journal (Oct. 1963): 63-66.

7. Nair, P., and C. Jordache. "Rigorous Data Reconciliation is Key to Optimal Operations." Control for the Process Industries, Vol. IV, no. 10, pp. 118-123. Chicago: Putman, 1991.

8. Dunia, R., S. J. Qin, T. F. Edgar, and T. J. McAvoy. "Identification of Faulty Sensors Using Principal Component Analysis." AIChE Journal 42 (no. 10, 1996): 2797-2812.

9. Seborg, D. E., T. F. Edgar, and D. A. Mellichamp. Process Dynamics and Control. New York: John Wiley & Sons, 1989.

10. Kim, Y. H., and J. M. Lee. "Improve Process Measurements with a Least Squares Filter." Hydrocarbon Processing (Aug. 1992): 143-146.

11. Rhinehart, R. R. "Method for Automatic Adaptation of the Time Constant for a First-Order Filter." Ind. & Eng. Chem. Research 30 (no. 1, 1991): 275-277.

12. Tham, M. T., and A. Parr. "Succeed at On-line Validation and Reconstruction of Data." Chem. Eng. Progress (May 1994): 46-56.

13. Weber, K. "Measurement Smoothing with a Nonlinear Exponential Filter." AIChE Journal 26 (no. 1, 1980): 132-133.

14. Clinkscales, T. A., and C. Jordache. "Hybrid Digital Filtering Techniques for Process Data Noise Attenuation with Reduced Delay," presented at the AIChE Spring National Meeting, Atlanta, Ga., April 1994.

15. MacGregor, J. F. "On-line Statistical Process Control." Chem. Engng. Progress (Oct. 1988): 21-31.

16. Montgomery, D. C., and E. A. Peck. Introduction to Linear Regression Analysis. New York: John Wiley & Sons, 1982.

17. Lucas, J. M. "Combined Shewhart-CUSUM Quality Control Schemes." Journal of Quality Technology 14 (no. 2, 1982): 51-59.

18. Rhinehart, R. R. "A CUSUM Type On-line Filter." Process Control and Quality (Amsterdam: Elsevier) (no. 2, 1992): 169-176.

19. Stanley, G. M. "Avoiding the Chattering Rule: Filtering and Other Techniques for Ignoring Noisy Data but Noticing Instrument Problems," presented at Gensym Diagnostics Working Group, Woodlands, Tex., Oct. 1, 1992.

Linear Data Reconciliation

Linear data reconciliation for steady-state systems has already been introduced in Chapter 1. The examples analyzed in Chapter 1 are instances of a linear data reconciliation problem. The general formulation and solution of linear data reconciliation problems is discussed in this chapter. Vector notation is used in this and subsequent chapters because it provides a compact representation and allows powerful concepts from linear algebra and matrix theory to be exploited. Appendix A provides an introduction to some basic concepts of vectors and matrices.

LINEAR SYSTEMS WITH ALL VARIABLES MEASURED

As shown in Chapter 1, the simplest data reconciliation problem involves a linear model with all variables directly measured. We also assume that the measurements do not contain any systematic biases.

General Formulation and Solution

The model for the measurements described by Equation 1-6 can be written as

y = x + ε   (3-1)

where y is a vector of n measurements, x is the corresponding vector of true values of the measured variables, and ε is the vector of unknown random errors. Although in Equation 3-1 we have assumed that the measurements and variables have a one-to-one correspondence, this does not impose any limitation on the applicability of the method. Other forms of the measurement model, in which the variables are assumed to be indirectly measured, can be converted to the above model using appropriate transformations. These issues are discussed in Chapter 7 along with gross error detection strategies.

The constraints described by Equations 1-7a through 1-7d can be represented in general by

A x = 0   (3-2)

where A is a matrix of dimension m x n, and 0 is a m x 1 vector whose elements are all zero. Each row of Equation 3-2 corresponds to a constraint. It can be easily verified that for a flow reconciliation problem, the elements of each row of matrix A are either +1, -1 or 0, depending on whether the corresponding stream flow is an input, an output or, respectively, not associated with the process unit for which the flow balance is written. In general, if some of the variables are known exactly, the RHS of Equation 3-2 is a constant nonzero vector c.

The objective function, Equation 1-9, can be represented in general by

Min (y - x)^T W (y - x)   (3-3)

The n x n matrix W is usually a diagonal matrix, the diagonal elements representing the weights as in Equation 1-9. However, in general, it can also contain nonzero off-diagonal elements. The interpretation of the elements of W in terms of the statistical properties of the errors ε is discussed in the next section.

The analytical solution to the above problem can be obtained using the method of Lagrange multipliers [1, 2]:

x̂ = y - W^-1 A^T (A W^-1 A^T)^-1 A y   (3-4)

where we have denoted the solution for the estimates using the notation x̂. In deriving the above solution it is assumed that the matrix A is of full row rank, which implies that there are no linearly dependent constraints in Equation 3-2. If the RHS of Equation 3-2 is not identically zero, but a known constant vector c, then the estimates are obtained by replacing the vector Ay in Equation 3-4 by Ay - c.
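Equation 3-4 is directly computable. The sketch below applies it to a hypothetical one-unit flow balance (stream 1 in, streams 2 and 3 out); the function name, the measured values, and the equal-weight matrix are illustrative assumptions, not data from the book:

```python
import numpy as np

def reconcile(y, A, W):
    """Reconciled estimates x_hat = y - W^-1 A^T (A W^-1 A^T)^-1 A y (Eq. 3-4).

    Assumes A has full row rank and the constraints are A x = 0."""
    Winv = np.linalg.inv(W)
    K = A @ Winv @ A.T
    return y - Winv @ A.T @ np.linalg.solve(K, A @ y)

# Hypothetical one-unit flow balance: x1 - x2 - x3 = 0.
A = np.array([[1.0, -1.0, -1.0]])
y = np.array([10.2, 5.1, 4.6])   # raw measurements (x1 should equal x2 + x3)
W = np.eye(3)                     # equal weights for simplicity
x_hat = reconcile(y, A, W)
# The reconciled flows satisfy the balance exactly: A @ x_hat is zero
# to machine precision, while each value moves only slightly from y.
```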

Statistical Basis of Data Reconciliation

So far we have described the formulation of the data reconciliation problem from a purely intuitive viewpoint, especially with regard to the selection of the objective function weights to be used for different measurements. The data reconciliation problem can also be explained using a statistical theoretical basis, which not only helps in understanding this subject better, but also provides useful quantitative information about the improvement in the accuracy of the data obtained through reconciliation and the statistical properties of the resulting estimates. These can be used to identify grossly incorrect data or to design sensor networks as described in Chapter 10.

The statistical basis for data reconciliation arises from the properties that are assumed for the random errors in the measurements. Generally, as mentioned in Chapter 2, it is assumed that the random errors follow a

multivariate normal distribution with zero mean and a known variance-covariance matrix C. However, it should be kept in mind that sometimes the primary measured signal is transformed into the final indicated variable of interest. If the transformation is nonlinear, such as Equation 2-7, then the error in the indicated variable need not be normally distributed.

As indicated in Chapter 2, only the linearized form can be approximated by a normal distribution. Thus, if possible, the variables x in the measurement model of Equation 3-1 should represent the primary measured variables, and relationships between the primary measured variables and the variables of interest should be included as constraints. If the constraints are nonlinear, then a nonlinear data reconciliation technique as described in Chapter 5 can be used to solve the problem.

The matrix C contains information about the accuracy of the measurements and the correlations between them. The diagonal element of C, σi^2, is the variance of measured variable i, and the off-diagonal element σij is the covariance of the errors in variables i and j. If the measured values are given by the vector y, then the most likely estimates for x are obtained by maximizing the likelihood function of the multivariate normal distribution:

Max [1 / ((2π)^(n/2) |C|^(1/2))] exp{-0.5 (y - x)^T C^-1 (y - x)}   (3-5)

where |C| is the determinant of C. The above maximum likelihood estimation problem is equivalent to minimizing the function

(y - x)^T C^-1 (y - x)   (3-6)

The estimates are also required to satisfy the constraints, Equation 3-2. Comparing Equations 3-6 and 3-3, we note that the formulation of the data reconciliation problem from a statistical viewpoint simply requires that the weight matrix W be chosen to be the inverse of the covariance matrix C. This choice is also reasonable if we consider the matrix C to be diagonal. In this case, Equation 3-6 becomes

Σi (yi - xi)^2 / σi^2   (3-7)

where σi is the standard deviation of the error in measurement i. Equation 3-7 shows that the weight factor for each measurement is inversely proportional to the variance of its error. Since a higher value of standard deviation implies that the measurement is less accurate, the

above choice gives larger weights to more accurate measurements. Another advantage of using Equation 3-7 is that it is dimensionless, since the standard deviation of a measurement error has the same units as the measurement. The estimates can now be obtained using Equation 3-4 by replacing W with C^-1.

It is also now possible to derive the statistical properties of the estimates obtained through data reconciliation. Consider the case when all the variables are measured. The estimates are given by

x̂ = y - C A^T (A C A^T)^-1 A y = [I - C A^T (A C A^T)^-1 A] y = B y   (3-8)

Equation 3-8 shows that the estimates are obtained using a linear transformation of the measurements. The estimates, therefore, are also normally distributed, with expected value and covariance matrix given by

E(x̂) = x   (3-9)

Cov(x̂) = B C B^T = C - C A^T (A C A^T)^-1 A C   (3-10)

Equation 3-9 implies that the estimates are unbiased, which is a property of maximum likelihood estimates for linear systems. Equation 3-10 gives a measure of the accuracy of the estimates. In the case where some of the variables are unmeasured, it is possible to derive similar properties. These statistical properties are exploited to identify measurements with gross errors as well as to design sensor networks.
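These properties can be checked numerically. The snippet below forms B from Equation 3-8 for the same hypothetical one-unit balance as before (the constraint matrix and error variances are illustrative assumptions) and verifies that the estimate covariance of Equation 3-10 is never larger, element by element on the diagonal, than the raw measurement variances:

```python
import numpy as np

# Numerical check of Equations 3-8 and 3-10 on a hypothetical one-unit
# balance A x = 0 (stream 1 in, streams 2 and 3 out), with diagonal C.
A = np.array([[1.0, -1.0, -1.0]])
C = np.diag([0.04, 0.02, 0.02])      # assumed measurement error variances

K = np.linalg.inv(A @ C @ A.T)
B = np.eye(3) - C @ A.T @ K @ A      # estimates are x_hat = B y (Eq. 3-8)
cov_est = B @ C @ B.T                # covariance of the estimates (Eq. 3-10)

# Reconciliation never degrades accuracy here: each estimate's variance
# is no larger than the corresponding raw measurement variance.
improved = np.diag(cov_est) <= np.diag(C) + 1e-12
```

The identity B C B^T = C - C A^T (A C A^T)^-1 A C used in Equation 3-10 also holds to machine precision in this check.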

LINEAR SYSTEMS WITH BOTH MEASURED AND UNMEASURED VARIABLES

For partially measured systems, the reconciliation problem is usually solved by decomposing it into two subproblems [3, 4]. In the first subproblem, the redundant measured variables are reconciled, followed by a coaptation problem in which the observable unmeasured variables are estimated. This strategy is more efficient than an attempt to estimate all the variables simultaneously. The general formulation and solution of the reconciliation problem for partially measured systems is now described.

Let the number of unmeasured variables be p. The variables are classified into two sets, the vector x of measured variables and the vector u of unmeasured variables. The measurement model is still given by Equation 3-1 and the objective function by 3-6. However, the constraints have to be recast in terms of both the measured and unmeasured variables. Equation 3-2 is written as

A_x x + A_u u = 0   (3-11)

where the columns of A_x correspond to the measured variables and those of A_u correspond to the unmeasured variables. Matrices A_x and A_u are of dimensions m x n and m x p, respectively.

The unmeasured variables u have to be eliminated from Equation 3-11 using suitable linear combinations of the constraints. This is equivalent to premultiplying the constraints by a matrix P, also known as a projection matrix [4]. The matrix P should satisfy the property

P A_u = 0   (3-12)

Premultiplying Equation 3-11 by matrix P, we get the reduced set of constraints involving only measured variables as

P A_x x = 0   (3-13)

The number of columns of P should clearly be equal to the number of constraints, m. As many independent rows as possible are constructed for P which satisfy Property 3-12. The number of such rows, t, is linked to the observability of the unmeasured variables. If all the unmeasured variables are observable, as in Cases 1 and 2 of Example 1-2, then t is equal to m - p. This can be easily inferred by noting that t is equal to the number of constraints in the reduced constraints of Equation 3-13. If all p unmeasured variables can be uniquely estimated, then this requires p of the constraint equations. Thus, only the remaining m - p constraints are available for reconciling the measured variables. It can also be proved that for all the unmeasured variables to be observable, the p columns of A_u should be independent.

Exercise 3-3. Prove that if the columns of matrix A_u are linearly independent, then unique estimates for the variables u exist.

Exercise 3-4. Solve the reconciliation problem for the case of linear constraints with constant terms: A_x x + A_u u = c, when the columns of A_u may or may not be linearly independent.

If not all unmeasured variables are observable, then t is equal to m - s, where s is the number of independent columns of A_u. The interpretation of this result is as follows. Since the unmeasured variables cannot be uniquely estimated, the estimates of a few of the unmeasured variables have to be additionally specified in order to uniquely solve for the remaining unmeasured variables. If p - s is the number of unmeasured variables whose estimates have to be additionally specified, then to solve for the other s unmeasured variables requires s of the constraint equations, resulting in m - s remaining constraints for reconciliation.

A comparison with Case 3 of Example 1-2 shows that m = 4, n = 6, and p = 4. However, only three of the columns of A_u are independent, and an estimate of one of the unmeasured flows among x2 to x5 has to be additionally specified in order to estimate all the other variables. Thus, p - s = 1 or s = 3, and the number of constraints in the reduced set, m - s, is equal to 1, as observed from Equation 1-15. Note that the number of constraints in the reduced set is also known as the degrees of redundancy.

The reduced data reconciliation problem is to minimize 3-6 subject to the constraints, Equation 3-13. Since the constraints are similar to Equation 3-2, the reconciled values for x can be obtained using Equation 3-8, with the matrix A being replaced by the reduced matrix P A_x:

x̂ = [I - C (P A_x)^T (P A_x C (P A_x)^T)^-1 P A_x] y   (3-14)

Using Equation 3-14, we can now substitute for x in Equation 3-11 and obtain the estimates û for the variables u, provided all the variables are observable (or the columns of A_u are independent). Since A_u is a m x p matrix with p < m, a least-squares approximate solution can be used. From the theory of generalized inverses [5], the least-squares solution is given by

û = -(A_u^T A_u)^-1 A_u^T A_x x̂   (3-15)



The general solution for û when not all the variables are observable is developed in the next section. The decomposition strategy described above is also useful for data reconciliation of processes with nonlinear constraints, as described in Chapter 5. The only additional issue to be discussed is the construction of the projection matrix P, which follows.

The Construction of a Projection Matrix

There are several different matrix methods for the construction of the projection matrix. One such method is given by Crowe [4]. However, probably the most efficient method is to use the QR factorization [5] of the matrix A_u. Such a method was first applied to data reconciliation by Swartz [6] and recently utilized by Sanchez and Romagnoli [7] to decompose and solve linear and bilinear data reconciliation problems.

Consider the case when the columns of the m x p matrix A_u are linearly independent. Then it is possible to factorize A_u as

A_u Π_u = Q R = [Q_1 Q_2] [R_1; 0] = Q_1 R_1   (3-16)

(the semicolon denotes vertical stacking, so R_1 sits above a block of zeros)

where Π_u is a permutation matrix (that is, the columns of Π_u are the permuted columns of the identity matrix), R_1 is a nonsingular p x p upper triangular matrix, and Q is a m x m orthogonal matrix, that is,

Q^T Q = I   (3-17)

In essence, the columns of Q form a basis for the m-dimensional space, while the matrix R_1 represents the p columns of A_u in terms of the first p basis vectors, Q_1. Since Q is orthogonal, the matrix Q_2 has the property

Q_2^T A_u = 0   (3-18)

From Equation 3-18, it is clear that the matrix Q_2^T is the desired projection matrix P.

The QR factorization is also useful in estimating the unmeasured variables easily. Using the QR factorization, Equation 3-11 can be written as

A_x x + Q R (Π_u^T u) = 0   (3-19)

where Π_u^T u is a reordered vector u. Premultiplying Equation 3-19 by Q^T we get

Q^T A_x x + R (Π_u^T u) = 0   (3-20)

or, rearranging

R (Π_u^T u) = -Q^T A_x x   (3-21)

Using Equation 3-16 for R in Equation 3-21, we get

[R_1; 0] (Π_u^T u) = -[Q_1^T; Q_2^T] A_x x   (3-22)

or,

R_1 (Π_u^T u) = -Q_1^T A_x x   (3-23)

Since R_1 is a p x p upper triangular matrix, Equation 3-23 can be easily solved by backward substitution to give the estimates of u. The solution can be formally expressed as:

Π_u^T û = -R_1^-1 Q_1^T A_x x̂   (3-24)

By substituting for the estimates of x (obtained using Q_2^T for P in Equation 3-14) in the above equation, we obtain the estimates for u (since Π_u^T u is a reordered form of the original vector u).
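The whole decomposition can be sketched on a hypothetical two-unit series process (stream 1 into unit I, stream 2 between the units, stream 3 out of unit II; streams 1 and 3 measured, stream 2 unmeasured). This is not the book's Example 3-1, whose matrices are not reproduced in this excerpt; the flowsheet, measurements, and unit variances below are assumptions for illustration:

```python
import numpy as np

# Hypothetical series process: two balance constraints, one unmeasured flow.
Ax = np.array([[1.0, 0.0],     # measured variables: streams 1 and 3
               [0.0, -1.0]])
Au = np.array([[-1.0],         # unmeasured variable: stream 2
               [1.0]])
y = np.array([10.3, 9.8])      # assumed raw measurements

Q, R = np.linalg.qr(Au, mode="complete")   # full QR: Q is m x m
p = Au.shape[1]
Q1, Q2 = Q[:, :p], Q[:, p:]
P = Q2.T                                   # projection matrix: P @ Au = 0 (Eq. 3-12)

G = P @ Ax                                 # reduced constraints G x = 0  (Eq. 3-13)
C = np.eye(2)                              # unit error variances for simplicity
x_hat = y - C @ G.T @ np.linalg.solve(G @ C @ G.T, G @ y)   # Eq. 3-14

# Estimate the unmeasured flow by back-substitution (Eqs. 3-23 / 3-24).
u_hat = np.linalg.solve(R[:p, :], -Q1.T @ Ax @ x_hat)
```

Here the reduced constraint simply says the inlet and outlet flows must agree, so the two measurements split their 0.5 discrepancy, and the unmeasured intermediate flow is estimated as the common reconciled value.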

In the case when only s of the columns of A_u are independent, the QR factorization takes the form

A_u Π_u = Q [R_1 R_2; 0 0]   (3-25)

where R_1 now is a s x s nonsingular upper triangular matrix, and R_2 is a s x (p - s) matrix. The projection matrix is still given by Q_2^T. In the same way, the unmeasured variables can be partitioned into two subsets of s and p - s variables:

Π_u^T u = [u_1; u_2]   (3-26)


In order to use the QR factorization for estimating the unmeasured variables, we substitute for R in Equation 3-21 using Equation 3-25 and for Π_u^T u using Equation 3-26 and obtain

[R_1 R_2; 0 0] [u_1; u_2] = -[Q_1^T; Q_2^T] A_x x   (3-27)

The upper part of the matrix Equation 3-27 involves only the unmeasured variables:

R_1 u_1 + R_2 u_2 = -Q_1^T A_x x   (3-28)

which, since R_1 is nonsingular, gives the solution

u_1 = -R_1^-1 Q_1^T A_x x - R_1^-1 R_2 u_2   (3-29)

Equation 3-29 indicates that the solution for the first s (reordered) unmeasured variables can be obtained only if estimates of the remaining p - s unmeasured variables are specified. This is also consistent with the fact that not all unmeasured variables are observable. The QR factorization described here is also useful in identifying which of the unmeasured variables are unobservable, as described in the next section.

Example 3-1

We illustrate the construction of the projection matrix by QR factorization, and its utility in determining observable and unobservable variables, by using the flow reconciliation problem of Case 3 of Example 1-2, where the flows of streams 1 and 6 are the only variables measured. From the constraint Equations 1-7a through 1-7d for this process, we can obtain the matrices corresponding to measured and unmeasured variables, and these are given by

The QR factorization of matrix A_u gives

From the matrix R, it can be inferred that s = 3, and that the submatrix corresponding to the first three columns and first three rows is R_1. The projection matrix is the transpose of the last column of Q. The reduced constraint matrix is given by

The reduced constraint matrix can be seen to be equivalent to Equation 1-15, which was obtained using simple algebraic manipulation.

Observability and Redundancy

In Chapter 1, we introduced the concepts of observability and redundancy without formally defining them. In this section, we define these terms clearly and discuss different techniques for variable classification.

The concepts of observability and redundancy are intimately linked with the solvability and estimability of variables. In medium and large scale process plants, there are hundreds of variables, and, for technical and economic reasons, it is not possible to measure all of them. It is thus important to know, for a given process and a set of measured variables, which of the unmeasured variables can be estimated. The concept of observability deals with this issue.

It is also useful to know whether a measured variable can be estimated even if its sensor fails for some reason. Redundancy deals with this question. Observability and redundancy analysis can be exploited for adding new measuring instruments or for altering the choice of the set of variables to be measured. It can also play a useful role in efficient decomposition and solution of the data reconciliation problem.


Definition of Observability: A variable is said to be observable if it can be estimated by using the measurements and steady-state process constraints.

Definition of Redundancy: A measured variable is said to be redundant if it is observable even when its measurement is removed.

From the above definition of observability, it is obvious that a measured variable is observable, since its measurement provides an estimate of the variable. However, an unmeasured variable is observable only if it can be indirectly estimated by exploiting process constraint relationships and measurements of other variables. Measured variables are redundant if they can also be estimated indirectly through other measurements and constraints.

The observability and redundancy of variables depend both on the measurement structure (also called the sensor network) as well as on the nature of the constraints. We have already seen how the measurement selection affects the observability of flow variables in the three cases of the example shown in Chapter 1. A systematic approach is necessary for determining which of the unmeasured variables are observable. There are broadly two approaches that have been followed for solving this problem. One class of methods is based on the use of linear algebra and matrix theory, while the other uses principles of graph theory. Both approaches are discussed here since they provide valuable insights.

Observability and redundancy classification of variables can be carried out as part of the solution of the data reconciliation problem [6, 7]. We first describe how unobservable variables can be identified during the construction of the projection matrix.

Unobservable variables are present only when the columns of A_u are not linearly independent. In such cases, the QR factorization of A_u has the form shown in Equation 3-25, which can be used to rearrange the constraints in the form of Equation 3-27. The solution for unmeasured variables given by Equation 3-29 can be written as

û_1 = -R_1^-1 Q_1^T A_x x̂ - R_u u_2   (3-30)

where

R_u = R_1^-1 R_2   (3-31)

The matrix R_u contains all the necessary information to classify unmeasured variables. If a row of R_u has no nonzero element, then the corresponding unmeasured variable on the LHS of Equation 3-30 can be estimated purely from the estimates of x and is therefore observable. If, on the other hand, a row of R_u contains a nonzero element, then the corresponding unmeasured variable on the LHS of Equation 3-30 is unobservable, since it depends on the estimates chosen for the p - s unmeasured variables on the RHS of this equation. All the p - s unmeasured variables on the RHS of Equation 3-30 are also unobservable, since their estimates have to be specified.

Redundant measured variables can be identified either by looking at their reconciled estimates or by considering the reduced constraint matrix Q_2^T A_x. A nonredundant measured variable will not be adjusted, since it is not possible to estimate this variable indirectly through other variables. Hence, its reconciled value will be identical to its measured value. Corresponding to this variable, the elements in the column of matrix Q_2^T A_x will all be zero.
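The two classification rules can be sketched together. This routine (the function name and test matrices are illustrative; the rank detection assumes the dependent columns of A_u come last, since numpy's plain QR does no column pivoting, unlike the permuted factorization of Equation 3-25) flags unobservable unmeasured variables via the rows of R_u and nonredundant measured variables via zero columns of Q_2^T A_x:

```python
import numpy as np

def classify(Ax, Au, tol=1e-10):
    """Classify variables per the QR approach of this section.

    Unmeasured variables are unobservable if they sit among the p-s dependent
    (reordered) columns or if their row of R_u = R_1^-1 R_2 has a nonzero
    entry; measured variables are nonredundant if their column of Q_2^T A_x
    is entirely zero. Returns (unobservable indices, nonredundant indices)."""
    m, p = Au.shape
    Q, R = np.linalg.qr(Au, mode="complete")
    s = int(np.sum(np.abs(np.diag(R[:min(m, p), :])) > tol))   # rank of Au
    unobservable = list(range(s, p))           # the p-s dependent variables
    if s < p:
        Ru = np.linalg.solve(R[:s, :s], R[:s, s:])             # Eq. 3-31
        unobservable += [i for i in range(s) if np.any(np.abs(Ru[i]) > tol)]
    G = Q[:, s:].T @ Ax                        # reduced constraint matrix
    nonredundant = [j for j in range(Ax.shape[1])
                    if np.all(np.abs(G[:, j]) < tol)]
    return sorted(unobservable), nonredundant

# Two parallel unmeasured streams between the same pair of units are
# indistinguishable: both columns of Au are dependent, so both are
# unobservable, while both measured boundary streams stay redundant.
Ax = np.array([[1.0, 0.0], [0.0, -1.0]])
Au = np.array([[-1.0, -1.0], [1.0, 1.0]])
unobs, nonred = classify(Ax, Au)
```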

Example 3-2

In order to classify the measured and unmeasured variables of Example 3-1, we can make use of the QR factorization already computed in that example. The matrix R_u, which is used for classifying unmeasured variables, can be computed as

Since all the rows of R_u contain a nonzero element, this implies that all unmeasured variables are unobservable. In order to classify the measured variables, we make use of the matrix Q_2^T A_x computed in Example 3-1. Since both the columns of this matrix contain a nonzero element, we infer that both measurements (the flows of streams 1 and 6) are redundant. These results can be counter-checked with the results of Example 1-2, Case 3.


Observability and redundancy classification using the projection matrix was also used by Crowe [8]. Crowe applied theoretical rules and derived algorithms to classify flows and concentrations in material balance data reconciliation. The procedure allows the inclusion of chemical reactions, flow splitters and pure energy flows. Fewer classification rules, however, are required by the QR projection matrix approach described above. Matrix methods for observability and redundancy classification in bilinear processes were developed by Ragot et al. [9].

Graph Theoretic Method

The use of graph theoretic concepts for observability and redundancy classification when only overall flows are considered has been developed by Vaclavek [10] and Mah et al. [11]. Later, Vaclavek and Loucka [12] extended their classification ideas to multicomponent systems, while Stanley and Mah [13] developed classification algorithms for energy systems. Kretsovalis and Mah [14, 15, 16] developed graph theoretic algorithms for classifying flows, temperatures and composition variables in general processes, while Meyer et al. [17] developed a simpler algorithm applicable to bilinear processes. In this section, we focus only on overall flows of the process.

In order to use graph theory, the process under consideration should be represented as a process graph. The process graph can be simply derived from the flowsheet of the process by adding an extra node denoted as the environment node and connecting all process feeds and products to it. Thus, for the process of Figure 1-2 in Chapter 1, the process graph is shown in Figure 3-1, where the directions of the streams are not indicated since they are irrelevant for the present analysis. The following simple yet powerful result is obtained from graph theory for identifying unobservable flows:

An unmeasured flow is unobservable if and only if it forms part of some cycle consisting solely of unmeasured flow streams of the process graph.

As an example, consider Case 3 of Example 1-2 for which the unmeasured flows of streams 2 to 5 were shown to be unobservable. The process graph for this case is shown again for convenience in Figure 3-2 with the measured streams marked by a cross. It can be easily observed from Figure 3-2 that these streams form a cycle. On the contrary, for Case 2, the unmeasured flows of streams 3 through 6 do not form any cycle among them and are therefore observable. This can be verified from the process graph for this case shown in Figure 3-3, in which the measured edges are marked.

Figure 3- 1. Process graph of heat exchanger with bypass

Figure 3-2. Heat exchanger process graph with unobservable variables.

Redundant measured variables can also be identified by using the following simple procedure. We merge every pair of nodes which are linked by an unmeasured stream, obtaining in the process a reduced graph which contains only measured streams. All the measured streams of this reduced graph are redundant and will be reconciled. Any measured stream that gets eliminated as a result of the merging process is nonredundant. For example, we apply the merging process to Figure 3-2.

The reduced graphs obtained after merging, in sequence, the nodes linked by streams 2, 3, 4, and 5 are shown in Figures 3-4a to 3-4c, respectively.
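The cycle rule and the merging procedure can be sketched in a few lines of plain Python. The process graph below is hypothetical (its node names and topology are invented for illustration and are not the heat exchanger of Figure 3-2):

```python
# Hypothetical process graph: each stream maps to (node_u, node_v, measured?).
# 'ENV' is the environment node added to close the graph.
streams = {
    's1': ('ENV', 'A', True),
    's2': ('A',   'B', False),
    's3': ('B',   'C', False),
    's4': ('C',   'A', False),
    's5': ('C',   'ENV', True),
}

def connected(u, v, edges):
    """Depth-first connectivity over an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    seen, stack = {u}, [u]
    while stack:
        n = stack.pop()
        if n == v:
            return True
        for m in adj.get(n, []):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return False

# An unmeasured stream is unobservable iff it lies on a cycle of the
# unmeasured subgraph, i.e. its endpoints stay connected after removing it.
unmeasured = [(s, u, v) for s, (u, v, m) in streams.items() if not m]
unobservable = sorted(s for s, u, v in unmeasured
                      if connected(u, v, [(a, b) for t, a, b in unmeasured
                                          if t != s]))

# Merge the endpoints of every unmeasured stream (union-find); a measured
# stream that collapses to a self-loop in the merged graph is nonredundant.
parent = {}
def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

for _, u, v in unmeasured:
    parent[find(u)] = find(v)

redundant = sorted(s for s, (u, v, m) in streams.items()
                   if m and find(u) != find(v))
```

In this hypothetical graph, streams s2, s3, and s4 form an unmeasured cycle and come out unobservable, while both measured streams survive the merging and are redundant.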



Figure 3-3. Heat exchanger process graph with observable variables.

The final reduced graph of Figure 3-4c contains the measured edges 1 and 6, which implies that they are redundant and will be reconciled. This can be compared with the results of Case 3 of Example 1-2, which shows that the flows of streams 1 and 6 are present in the reduced data reconciliation problem. It can also be observed that the reduced data reconciliation problem can be obtained by writing the constraints based on the reduced process graph of Figure 3-4c. Thus, for flow reconciliation of processes containing unmeasured variables, the reduced data reconciliation problem can be formulated using a reduced graph instead of using a projection matrix technique.

This example is used to illustrate the presence of observable/unobservable unmeasured variables and redundant/nonredundant measured variables coexisting in the same process. The process graph for this example is drawn from Mah [11] and is shown in Figure 3-5. The measured flows of this process are indicated in the figure. For classifying the unmeasured variables easily, the measured edges from Figure 3-5 can be deleted, resulting in the graph shown in Figure 3-6a. From this figure it is observed that streams 8, 11, and 14 form a cycle and are therefore unobservable. The remaining unmeasured flows are observable. In order to identify the redundant measurements, the nodes linked by unmeasured edges are merged, resulting in the reduced graph shown in Figure 3-6b. All the measured flows present in Figure 3-6b are redundant, but the measured flow of edge 1 is nonredundant since it is eliminated during the merging process.

Figure 3-4. Heat exchanger graph after merging unobservable streams: (a) stream 2, (b) stream 3, (c) streams 4 and 5. Reprinted with permission from [11]. Copyright © 1976 American Chemical Society.

Figure 3-5. Process graph of a refinery subsection. Reprinted with permission from [11]. Copyright © 1976 American Chemical Society.



Figure 3-6a. Subgraph of unmeasured variables of refinery process. Reprinted with permission from [11]. Copyright © 1976 American Chemical Society.

Figure 3-6b. Reconciliation subgraph of refinery process. Reprinted with permission from [11]. Copyright © 1976 American Chemical Society.

Other Classification Methods

Besides the two major variable classification methods previously described, a different approach was developed by Romagnoli and Stephanopoulos [18, 19]. They used an output set assignment of the mass and energy balance equations. This approach has been more recently revised and used in a data reconciliation computer package (PLADAT [20]). The most important merit of this approach is a classification of process constraints which enables building a reduced set of constraints for data reconciliation (redundant subset).

If such a redundant set exists, the reconciliation problem can be decomposed into a redundant subproblem (solving the reconciliation problem with the redundant set of constraints) and a computation subproblem (solving for the observable unmeasured variables). But a QR decomposition approach for both variable classification and solving of the data reconciliation problem is more straightforward. Recently, Sanchez and Romagnoli [7] also used it for reconciliation problems involving linear and bilinear constraints.

ESTIMATING MEASUREMENT ERROR COVARIANCE MATRIX

Hitherto, we have assumed that the measurement error covariance matrix Σ is completely known. One possible method of obtaining the error variances and covariances is from the error characteristics of the different components (such as sensor, transmitter, recorder, etc.) as explained in the preceding chapter. In order to use this approach, we need information about the standard deviations of the errors committed by the different components, as well as the transformations used in processing and transmitting the data.

It is generally difficult to obtain this information, although the Instrument Engineers' Handbook by Liptak [21] is a good source for such data. It should be noted that the data given in this handbook is under ideal laboratory conditions and may not be valid under actual process conditions. If nonlinear transformations are involved in processing the raw measured data, then the standard deviation of the measurement error can be computed only by using linear approximations of the transforming functions as in Equation 2-13. Bagajewicz [22] has shown that the measurement error obtained in this manner can be considered to be normally distributed, if the range of the measuring instrument is large compared to the standard deviation of the measurement error.

An alternative way of estimating the covariance matrix is from a sample of measurements made in a time window. If yi, i = 1...N, is the vector of measurements made (typically at successive sampling times), then an estimate of the covariance matrix can be obtained analogous to Equation 2-6 as

Σ̂ = [1/(N−1)] Σᵢ₌₁ᴺ (yᵢ − ȳ)(yᵢ − ȳ)ᵀ    (3-32)

where ȳ is the sample mean given by

ȳ = (1/N) Σᵢ₌₁ᴺ yᵢ    (3-33)

The above method of estimating the covariance matrix is known as the direct method. An important requirement for estimating Σ using Equation 3-32 is that the true values of all variables should be constant during the time interval in which the above measurements are made or, in other words, the process should truly be at a steady state. In actual practice, the true values change continuously, and the estimates obtained will be poor if these changes are comparable in magnitude to the measurement errors. On the other hand, if the measurements contain a gross error, the estimate of the covariance matrix is not affected provided the magnitude of the gross error is constant in this time interval.
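As a sketch of the direct method (Equations 3-32 and 3-33), the following fragment estimates the covariance from N repeated measurements of a hypothetical 3-variable steady state; the true values and noise levels are made up for illustration, and NumPy is assumed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical steady state: constant true values plus independent noise.
true = np.array([100.0, 60.0, 40.0])
sigma = np.array([2.0, 1.5, 1.0])
N = 5000
Y = true + rng.normal(size=(N, 3)) * sigma      # rows are the y_i

ybar = Y.mean(axis=0)                           # sample mean (Eq. 3-33)
S = (Y - ybar).T @ (Y - ybar) / (N - 1)         # sample covariance (Eq. 3-32)

print(np.round(np.sqrt(np.diag(S)), 2))         # compare against sigma
```

With a large N and a truly constant steady state, the diagonal of S recovers the error variances; if the true values drift during the window, this estimate degrades, which is the motivation for the indirect method that follows.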

Almasy and Mah [23] first proposed an indirect method of estimating the covariance matrix when the true values of the process are undergoing constant changes. Their method exploits the constraint model given by Equation 3-2. For this purpose, we define the constraint residuals r as

r = Ay    (3-34)

Using Equations 3-1 and 3-2 and the assumption that the measurement errors follow a normal distribution with zero mean and covariance matrix Σ, we can prove that r follows a normal distribution with zero mean and covariance matrix V given by

V = AΣAᵀ    (3-35)

It can be observed from Equation 3-34 that the constraint residuals do not depend on the true values of the variables and do not need any information concerning their behavior. From a sample of measurements, we can obtain an estimate of the covariance matrix of the constraint residuals as

V̂ = (1/N) Σᵢ₌₁ᴺ rᵢrᵢᵀ    (3-36)

Note that in obtaining an estimate of V using Equation 3-36 we do not make use of an estimate for the mean of the constraint residuals from the sample of measurements, since we know its true mean to be zero. The estimate of V can be used in Equation 3-35 to back-calculate an estimate for Σ.

We first note that the matrices V and Σ are square symmetric matrices of dimensions m and n, respectively. The use of Equation 3-35 to estimate Σ from V, therefore, implies that we have to solve for n(n + 1)/2 parameters from m(m + 1)/2 equations. Since n is greater than m, several possible solutions for Σ can be obtained. In order to obtain a unique solution, Almasy and Mah [23] suggested that the sum of squares of the off-diagonal elements of Σ be minimized subject to satisfying Equation 3-35. This is based on the argument that, in practice, Σ is usually diagonal or diagonally dominant. An analytical solution can be obtained for this problem as follows:

Let the vector d consist of the diagonal elements and the vector t consist of the off-diagonal elements of Σ (which can be formed by placing the columns one below the other, considering only the elements below the diagonal). We can similarly arrange the diagonal and off-diagonal elements of V as a vector p and rewrite Equation 3-35 as

p = Md + Wt    (3-37)

where the matrices M and W are mapping matrices that can be constructed from the elements of the constraint matrix A [24]. The solutions for the diagonal and off-diagonal elements of Σ are given by



where

Keller et al. [24] extended the above indirect method for obtaining an estimate of Σ which has only a few nonzero elements in specified locations.
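A minimal sketch of the indirect method is given below, under the simplifying assumption that Σ is diagonal (the general off-diagonal case needs the mapping matrices M and W of Equation 3-37). The constraint matrix, the drift pattern, and all numbers are hypothetical; NumPy is assumed:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical constraints A x = 0 and true (diagonal) error variances.
A = np.array([[1., -1., -1.,  0.],
              [0.,  1.,  0., -1.],
              [0.,  0.,  1., -1.]])
d_true = np.array([4.0, 2.25, 1.0, 0.25])

# True values drift along the null space of A, so A x = 0 always holds,
# which is exactly the regime the indirect method is designed for.
N = 20000
null_vec = np.array([2., 1., 1., 1.])        # spans null(A)
x0 = np.array([20., 10., 10., 10.])
X = x0 + rng.normal(size=(N, 1)) * null_vec
Y = X + rng.normal(size=(N, 4)) * np.sqrt(d_true)   # noisy measurements

R = Y @ A.T                   # constraint residuals r_i = A y_i (Eq. 3-34)
V = R.T @ R / N               # sample estimate of V (Eq. 3-36)

# V = A diag(d) A.T is linear in the variances d: stack the independent
# entries of V and solve a small least-squares problem for d.
m = A.shape[0]
rows = [A[p] * A[q] for p in range(m) for q in range(p, m)]
rhs = [V[p, q] for p in range(m) for q in range(p, m)]
d_est, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

print(np.round(d_est, 2))
```

Note that the variances are recovered even though the true values never stay constant; the direct method of Equation 3-32 would fail badly on the same data, since the drift would be absorbed into the covariance estimate.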

In order to obtain a good estimate of Σ using the indirect method, the following two conditions have to be met:

(1) The true values of process variables corresponding to each measurement should satisfy the constraint Equation 3-2.

(2) The measurements should not contain any gross errors. If a gross error exists in any measurement, then the constraint residuals will not have zero mean.

Typically, when the true values of process variables undergo changes, we cannot ignore accumulation terms in the material and energy conservation constraints, and the first condition may not be met. In such cases, it has to be questioned whether the indirect method offers an advantage over the direct method. In order to tackle this problem, Almasy and Mah [23] recommend that each yi used in Equation 3-35 to obtain an estimate of V should be the average of the measurements made within a time interval in which the process is operating around a nominal steady state. A set of N such time periods should be chosen to obtain a sample of averaged measurement values to be used in Equation 3-36.

A justification for this recommendation can be given using the fact that, in practice, steady-state data reconciliation is applied to measurements averaged over a time interval in which the process operates around a nominal steady state (see the industrial examples discussed in Chapter 1). Even if the true values in this time period are randomly fluctuating about the nominal steady-state values, we can expect the average of the true values to satisfy the steady-state conservation constraints, as also assumed in data reconciliation.

In order to tackle the problem of gross errors in the indirect method, Chen et al. [25] proposed a robust method of estimation in which the different constraint residual vectors are given appropriate weights when computing the estimate of V using Equation 3-36. A small weight is assigned to a constraint residual vector if it is not consistent with the other vectors in the sample set. The estimation algorithm is iterative and is described by Chen et al. [25]. This procedure is useful if only some of the measurement vectors have gross errors.

If a gross error of constant magnitude is present in all measurements, then the above procedure will not eliminate the problem. One practical solution is to choose the data from time periods that are widely separated in time so that they do not share common features such as the same gross error with the same magnitude present in all the samples. This can be done if a large historical database of operating data is available.

One can also choose to combine the estimates obtained from different methods in a judicious manner. The indirect estimation method, however, has not yet been extended to treat nonlinear constraints.

SIMULATION TECHNIQUE FOR EVALUATING DATA RECONCILIATION

We conclude this chapter by describing a simulation technique that can be used for evaluating the effectiveness of data reconciliation and to estimate the error reduction that can be achieved. For the purposes of simulation, the following input data have to be obtained:

(i) The process flowsheet, which indicates the number of process units, the process streams and their connectivity. The type of process unit need not be specified if simulation of only overall flow reconciliation is to be performed.

(ii) The "true" or "nominal" steddy-state flolr~ values of ail strrams. These true values must be consisteri: with the f l o \ ~ baiances of the process. These tzue values are usefir1 fcr judging [he improvement in accuracy achieved through data reconcilia~ion.

(iii) The set of measured flows of the process and the standard deviation of the error in each measurement. The standard deviation may be expressed as a fraction of the true value or specified as an absolute value.

At first, random errors are generated which follow a normal distribution with zero mean and the given standard deviations. These are added to the true values to obtain the simulated "measurements." The constraint matrix, A, is obtained based on the process connectivity information, and the submatrices, Ax and Au, are also obtained corresponding to the measured and unmeasured variables. The projection matrix is now computed using a QR factorization of Au. The estimates can now be computed using Equations 3-14 and 3-24 or 3-29. The errors in the estimates can now be computed by comparison with the true values.


In order to obtain a statistically accurate estimate of the error reduction achievable through data reconciliation, it is necessary to perform several simulation trials with different random measurements generated in each trial. The reduction in error due to data reconciliation is computed and averaged over all the trials. Typically, about 1,000-10,000 simulation trials are used to obtain this estimate.

Many software packages like MATLAB and mathematical libraries such as IMSL or HARWELL have pseudo-random number generators that can be used for simulation purposes. It should be noted, however, that it is implicitly assumed in the simulation that there are no model errors (that is, the true values of variables satisfy the constraints) and that the measurement errors are normally distributed with known variances. In practice, since these assumptions may be violated, the error reduction that can be achieved will be less than that estimated through simulation. Ultimately, the benefits of data reconciliation should be evaluated in practice through the actual improvement in process performance.
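The simulation procedure above can be sketched as follows for a hypothetical, fully measured flow network, using the linear reconciliation estimate x̂ = y − ΣAᵀ(AΣAᵀ)⁻¹Ay for the all-measured case; the topology, flows, and noise levels are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 2-unit flow network; every stream is measured.
A = np.array([[1., -1., -1., 0.],
              [0.,  0., 1., -1.]])
x_true = np.array([100., 60., 40., 40.])     # consistent: A @ x_true = 0
sd = 0.02 * x_true                           # 2% standard deviations
S = np.diag(sd**2)

G = S @ A.T @ np.linalg.inv(A @ S @ A.T) @ A   # reconciliation gain

trials = 2000
err_meas, err_rec = 0.0, 0.0
for _ in range(trials):
    y = x_true + rng.normal(size=4) * sd     # simulated measurements
    x_hat = y - G @ y                        # reconciled estimates
    err_meas += np.sum((y - x_true)**2)
    err_rec += np.sum((x_hat - x_true)**2)

reduction = 1.0 - np.sqrt(err_rec / err_meas)
print(f"average error reduction: {reduction:.1%}")
```

Averaging over many trials, as the text recommends, smooths out the randomness of any single trial; the reconciled root-mean-square error is always below the raw measurement error for this linear, correctly specified model.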

SUMMARY

An analytical solution is available for linear data reconciliation with all variables measured. Unmeasured variables can be eliminated from the reconciliation model by a projection matrix. A reduced model is obtained, which can be used to reconcile the measured variables. Variable classification (redundant/nonredundant measured variables and observable/unobservable unmeasured variables) can be performed using matrix methods, while solving the data reconciliation problem, or by a separate graph-theoretic algorithm.

REFERENCES

1. Kuehn, D. R., and H. Davidson. "Computer Control. II. Mathematics of Control." Chem. Eng. Progress 57 (1961): 44-47.

2. Seber, G.A.F. Linear Regression Analysis. New York: John Wiley & Sons, 1977.

3. Mah, R.S.H. Chemical Process Structures and Information Flows. Boston: Butterworths, 1990.

4. Crowe, C. M., Y.A.G. Campos, and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection. I: Linear Case." AIChE Journal 29 (1983): 881-888.

5. Noble, B., and J. W. Daniel. Applied Linear Algebra. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1977.

6. Swartz, C.L.E. "Data Reconciliation for Generalized Flowsheet Applications." American Chemical Society National Meeting, Dallas, Tex., 1989.

7. Sanchez, M., and J. Romagnoli. "Use of Orthogonal Transformations in Data Classification-Reconciliation." Computers Chem. Engng. 20 (1996): 483-493.

8. Crowe, C. M. "Observability and Redundancy of Process Data for Steady State Reconciliation." Chem. Eng. Sci. 44 (1989): 2909-2917.

9. Ragot, J., D. Maquin, G. Bloch, and W. Gomolka. "Observability and Variables Classification in Bilinear Processes." Benelux Quarterly J. Automatic Control (Journal A) 31 (1990): 17-23.

10. Vaclavek, V. "Studies on System Engineering III. Optimal Choice of the Balance Measurements in Complicated Chemical Engineering Systems." Chem. Eng. Sci. 24 (1969): 947-955.

11. Mah, R.S.H., G. M. Stanley, and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chem. Proc. Des. Dev. 15 (1976): 175-183.

12. Vaclavek, V., and M. Loucka. "Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plant." Chem. Eng. Sci. 31 (1976): 1199-1205.


13. Stanley, G. M., and R.S.H. Mah. "Observability and Redundancy Classification in Process Networks. Theorems and Algorithms." Chem. Eng. Sci. 36 (1981): 1941-1954.


14. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in Multicomponent Process Networks." AIChE Journal 33 (1988): 70-82.

15. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. I: Theorems." Computers Chem. Engng. 12 (1988): 671-688.

16. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. II: Algorithms." Computers Chem. Engng. 12 (1988): 689-703.

17. Meyer, M., B. Koehret, and M. Enjalbert. "Data Reconciliation on Multicomponent Network Process." Computers Chem. Engng. 17 (1993): 807-817.

18. Romagnoli, J., and G. Stephanopoulos. "On Rectification of Measurement Errors for Complex Chemical Plants." Chem. Eng. Science 35 (1980): 1067-1081.

19. Romagnoli, J., and G. Stephanopoulos. "A General Approach to Classify Operational Parameters and Rectify Measurement Errors for Complex Chemical Processes." Comp. Appl. to Chem. Eng. (1980): 153-174.

20. Sanchez, M., A. Bandoni, and J. Romagnoli. "PLADAT: A Package for Process Variable Classification and Plant Data Reconciliation." Computers Chem. Engng. 16 (Suppl.) (1992): S499-S506.

21. Liptak, B. G. (ed.). Instrument Engineers' Handbook.

22. Bagajewicz, M. "On the Probability Distribution and Reconciliation of Process Plant Data." Comput. Chem. Eng. 20 (1996): 813-819.

23. Almasy, G. A., and R.S.H. Mah. "Estimation of Measurement Error Variances from Process Data." Ind. Eng. Chem. Process Des. Dev. 23 (1984): 779-784.

24. Keller, J. Y., M. Zasadzinski, and M. Darouach. "Analytical Estimator of Measurement Error Variances in Data Reconciliation." Computers Chem. Engng. 16 (1992): 185-188.

25. Chen, J., A. Bandoni, and J. A. Romagnoli. "Robust Estimation of Measurement Error Variance/Covariance from Process Sampling Data." Computers Chem. Engng. 21 (1997): 593-600.

Steady-State Data Reconciliation for Bilinear Systems

BILINEAR SYSTEMS

In a chemical plant, the process streams contain several species or components. Besides stream flow rates, the compositions of some of the streams are also measured. Since composition analyzers are comparatively more expensive, on-line analyzers may not be used in many cases, and these measurements are obtained from a laboratory, which may also increase the errors in the reported data. Neither the overall flow balance nor the component balances are generally satisfied by the measurements. It is therefore necessary to reconcile both flow and composition measurements simultaneously.

The constraints of the data reconciliation problem are linear if we consider only overall flow balances. However, if we wish to simultaneously reconcile flow and composition measurements, then component balances also have to be included as constraints of the data reconciliation problem. These constraints contain component flow rate terms which are products of the flow rate and composition variables. Since these constraints are nonlinear, it is possible to obtain the solution using a nonlinear data reconciliation technique. It is also possible to solve the multicomponent data reconciliation problem more efficiently by exploiting the fact that the nonlinear terms in the constraints are at most products of two variables.

The term bilinear data reconciliation is used to refer to problems containing this specific form of constraints. The reasons for developing special techniques for solving bilinear data reconciliation problems are twofold. First, these techniques will be more efficient as compared to techniques used for solving nonlinear data reconciliation problems. This becomes especially important when plant-wide data reconciliation is performed. Second, a significant number of industrial applications of data reconciliation is for multicomponent systems.

An important example is the mineral beneficiation circuit where mineral concentration measurements and flows are reconciled. Other typical examples are reconciliation of flows and compositions around a single distillation column or a sequence of columns such as a chill-down train of a petrochemical complex. In several cases, reconciliation of flows and temperatures of energy flow subsystems are also bilinear problems if the specific enthalpy is only a function of temperature. A crude-preheat train of a refinery and a steam distribution network of a chemical process are important examples. It should be kept in mind, however, that these special techniques only solve the problem efficiently, but do not give any additional benefits.

In this chapter, we describe two methods that have been specifically developed for reconciling data of bilinear systems. While these methods are more efficient than nonlinear programming techniques, they have the disadvantage that, at present, neither of the methods can rigorously handle inequality constraints, such as simple bounds on variables. Thus, in certain cases, it is possible that these methods may give rise to negative estimates of flows and compositions.

DATA RECONCILIATION OF BILINEAR SYSTEMS

In order to illustrate a typical bilinear data reconciliation problem, we consider a simple example of reconciling the flows and compositions of a binary distillation column as shown in Figure 4-1. We will assume that the flows and component mole fractions of the feed, distillate, and bottom streams are measured. A typical set of measured values is shown in the last column of Table 4-1. The discrepancies in the material flow and normalization equations are shown in Table 4-2. It is observed from this table that the measured flows and compositions do not satisfy the material flow or normalization equations.

Figure 4-1. Binary distillation column.

Table 4-1 Operating Data of Binary Distillation Column

Stream | Variables | Measured Values
F | Flow; Component 1 %; Component 2 %
D | Flow; Component 1 %; Component 2 %
B | Flow; Component 1 %; Component 2 %

Table 4-2 Constraint Balance Residuals before Reconciliation

Balance Type | Residuals


The data reconciliation objective is formulated as in Equation 2-3 and is given by

where the W's are the weights, and the xjk's are the mole fractions of the components. The first three terms in the above objective function are the weighted sums of squared adjustments made to the stream flows, and the other terms involve the adjustments made to the mole fraction measurements.

The reconciled estimates have to satisfy the material balances around the column. The different types of constraints that can be imposed are

(i) Overall flow balance around the column
(ii) Component flow balances for all the components
(iii) Normalization equations for mole fractions of each stream

All of the above constraints need not be imposed since they are not all independent. For a separator, such as the distillation column considered in this example, a complete set of independent constraints is the component flow balances and the normalization equations. The overall flow balance can be derived using these two types of equations and, thus, it need not be imposed. One common mistake is to assume that by imposing the overall flow balance and component flow balances, the reconciled mole fraction estimates will automatically satisfy the normalization constraint for all streams. This is not the case, however, as shown later through this example.

Thus, the constraints for this example are

F xF1 - D xD1 - B xB1 = 0    (4-2)
F xF2 - D xD2 - B xB2 = 0    (4-3)
xF1 + xF2 = 1    (4-4)
xD1 + xD2 = 1    (4-5)
xB1 + xB2 = 1    (4-6)

The component balance constraints, Equations 4-2 and 4-3, contain products of the flow rate and composition variables, which make the data reconciliation problem more difficult to solve as compared to the linear case considered in the preceding chapter. The objective function 4-1, along with the constraints of Equations 4-2 through 4-6, can be treated as a nonlinear equality constrained optimization problem and can be solved using a constrained nonlinear optimization program. However, efficient methods to solve these types of problems have been developed. Using any of these methods, the reconciled data for this example may be obtained and are given in Table 4-3.

In Table 4-3, the second column shows the reconciled estimates when the component flow balances and normalization equations are imposed, while the third column gives the estimates when the overall flow and component flow balances are used.

Table 4-3 Reconciled Data of Binary Distillation Column

Variables | Reconciled Values (With Normalization Constraints) | Reconciled Values (Without Normalization Constraints)
F | Flow; Component 1 %; Component 2 %
D | Flow; Component 1 %; Component 2 %
B | Flow; Component 1 %; Component 2 %

The constraint imbalances after reconciliation for these two cases are given in Table 4-4. The results in this table clearly demonstrate the necessity of including normalization constraints in multicomponent data reconciliation problems.


Table 4-4 Constraint Balance Residuals After Reconciliation

Residual Values

Balance Type | Normalization Constraints Imposed | Normalization Constraints Not Imposed
Overall Flow | 0.0000E+00 | 0.0000E+00
Component balance 1 | 8.5453E-13 |
Component balance 2 | < 1.0E-13 |
Normalization F | 0.0000E+00 |
Normalization D | 0.0000E+00 |
Normalization B | 0.0000E+00 |
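The reconciliation of this example can be sketched with a successive-linearization (Gauss-Newton) scheme on the equality-constrained weighted least-squares problem. This is one of several possible solution methods, and the measured values and weights below are invented for illustration rather than taken from Table 4-1; NumPy is assumed:

```python
import numpy as np

# Hypothetical measurements: F, D, B flows, then mole fractions
# (xF1, xF2, xD1, xD2, xB1, xB2). Weights are inverse variances.
y = np.array([100., 60., 40.,
              0.50, 0.50,
              0.78, 0.22,
              0.10, 0.90])
w = 1.0 / np.array([2., 1.5, 1., .01, .01, .01, .01, .01, .01])**2
W = np.diag(w)

def c(z):
    """Constraints 4-2 through 4-6: component balances + normalizations."""
    F, D, B, xF1, xF2, xD1, xD2, xB1, xB2 = z
    return np.array([F*xF1 - D*xD1 - B*xB1,
                     F*xF2 - D*xD2 - B*xB2,
                     xF1 + xF2 - 1.0,
                     xD1 + xD2 - 1.0,
                     xB1 + xB2 - 1.0])

def jac(z):
    """Jacobian of c; the bilinear terms make it depend on z."""
    F, D, B, xF1, xF2, xD1, xD2, xB1, xB2 = z
    return np.array([
        [xF1, -xD1, -xB1, F, 0, -D, 0, -B, 0],
        [xF2, -xD2, -xB2, 0, F, 0, -D, 0, -B],
        [0, 0, 0, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1, 1]])

# Repeatedly linearize c(z) = 0 about the current point and solve the
# resulting KKT system for the constrained weighted least-squares step.
z = y.copy()
for _ in range(20):
    J = jac(z)
    kkt = np.block([[W, J.T], [J, np.zeros((5, 5))]])
    rhs = np.concatenate([W @ y, J @ z - c(z)])
    z = np.linalg.solve(kkt, rhs)[:9]

print("max constraint residual:", np.max(np.abs(c(z))))
```

At a fixed point of this iteration the bilinear constraints are satisfied exactly, and because the normalizations are imposed, the overall flow balance F - D - B = 0 follows automatically, mirroring the independence argument made above.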

General Problem Formulation

The preceding example shows that multicomponent data reconciliation for a distillation column is a bilinear problem. In a similar manner, the data around a sequence of separation columns can be reconciled, which also gives rise to a bilinear problem. In the mineral processing industry, a common application of bilinear data reconciliation is the reconciliation of flows and mineral compositions of a beneficiation circuit. We first present the general formulation for multicomponent data reconciliation of such typical processes.

Depending on the process and the subsystem that is considered, several different types of process units may be encountered. In chemical process industries, the different units where the flow or compositions of streams undergo changes may be classified as mixers, splitters, separators, and reactors. The type of constraints that can be imposed depends on the nature of the process unit. It is therefore important to have a clear understanding of the complete set of independent constraints that can be written for each unit and, hence, for the entire sub-process. Although different combinations of independent constraints can be written for each process unit, usually the independent equations are chosen as described below.

Mixers

A mixer has two or more input streams and one output stream, as shown schematically in Figure 4-2a. If the streams are single phase, then the constraints imposed for these units are

(i) Component flow balances:

(ii) Normalization equations:

Splitters

A splitter splits an input stream into two or more output streams, as shown schematically in Figure 4-2b. The constraints that can be written for this unit are

(i) Component flow balances (equality of compositions):

(ii) Overall flow balance:


(iii) Normalization equation for input stream


Figure 4-2b. Splitter unit.

Figure 4-2a. Mixer unit.

All other constraints, such as component balances and normalization constraints for the output streams, can be derived by appropriate combinations of the above equations. It should also be kept in mind that if a splitter is a part of a subsystem, then the normalization equation should be written only for the input stream of the splitter and not for the output streams of the splitter.


Exercise 4-1. Show that for the splitter shown in Figure 4-2b, the number of independent equations is CS+2. Also show that the component balances and normalization equations of all output streams can be derived using the above CS+2 equations imposed for a splitter.

(i) Coinponent tlow bala~ices:

Fj xjk - αj Fi xik = 0,  j = 1...S; k = 1...C

(ii) Overall flow balance:

(iii) Normali~ation equation for input stream:

(iv) Split fraction definitions:

The use of split fraction variables introduces as many additional variables as the number of output streams, and thus the number of independent equations that must be written for a splitter using split fraction variables is equal to CS+S+2. The use of split fraction variables also complicates the problem further, since the component balances are no longer bilinear but are trilinear (products of three variables).


Exercise 4-2. Demonstrate that by eliminating the split fraction variables from the above alternative set of CS+S+2 equations, it is possible to obtain the CS+2 independent equations of the first formulation for a splitter.

An alternative formulation which makes use of the definition of split fractions is sometimes more convenient. Let α_j be the ratio of the flow rate of outlet stream j to that of the inlet stream of the splitter. Then the following equations also constitute a complete nonredundant set of equations for the splitter.

Figure 4-2c. Separator unit.

Page 56: 0884152553

Separators

A separator, which is the inverse of a mixer, takes an input stream and separates it into two or more streams of different compositions, as shown in Figure 4-2c. If all streams are single phase, the equations for this unit are similar to those of a mixer.

(i) Component flow balances:

(ii) Normalization equations:

Reactors

We consider a reactor with a single feed stream and a single product stream, as shown in Figure 4-2d. Reactors with multiple feed or product streams can be modeled by using a mixer before the reactor and a separator after the reactor. Due to the reactions that occur, neither the overall molar flow nor the molar flows of components are conserved. There are two alternative choices of model equations for a reactor. In the first approach, we assume that the independent reactions which occur in the reactor are specified. Let n_kj be the stoichiometric coefficient of component k in reaction j, and let the unknown extents of reaction be ξ_j, j = 1…R, where R is the number of independent reactions specified. Using the extents of reaction we can write the following equations.

Figure 4-2d. Reactor unit.

(i) Component balances:

(ii) Normalization equations:

The alternative set of model equations is obtained by using the fact that each elemental species is conserved. If we denote the number of atoms of element j in component k by a_jk, then the following equations can be written for a reactor.

(i) Elemental balances:

(ii) Normalization equations:

As shown by Reklaitis [1], these sets of equations are equivalent and give identical results only if a complete set of independent reactions that can occur among the components present is specified. In the absence of any information regarding the reactions that occur, the elemental balance model can be used. However, if energy balances also have to be included as part of the reconciliation, then the extent-of-reaction model is convenient, as shown subsequently.
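The elemental-balance idea can be checked numerically. The following sketch uses a hypothetical reaction (2 H2 + O2 → 2 H2O, not an example from the text) to show that each element is conserved across the reactor even though the molar component flows, and the total molar flow, are not.

```python
# Illustrative check that elemental balances hold across a reactor.
# Components [H2, O2, H2O], elements [H, O];
# a[j][k] = number of atoms of element j in component k.

a = [[2, 0, 2],   # H atoms in H2, O2, H2O
     [0, 2, 1]]   # O atoms in H2, O2, H2O

N_in  = [4.0, 3.0, 0.0]   # molar component flows entering the reactor
N_out = [0.0, 1.0, 4.0]   # flows leaving after 2 mol extent of 2H2 + O2 -> 2H2O

# Elemental balance residuals: sum_k a_jk * (N_out_k - N_in_k) for each element j
residuals = [sum(a[j][k] * (N_out[k] - N_in[k]) for k in range(3))
             for j in range(2)]
print(residuals)               # [0.0, 0.0] -- each element is conserved
print(sum(N_out) - sum(N_in))  # -2.0 -- total molar flow is NOT conserved
```

The same residuals, written with the unknown flows as variables, are the constraints used in the elemental-balance reconciliation model.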

The model equations of the various units can be classified as either process-unit-type or stream-type relations. Overall flow and compo-


The objective is to determine estimates of all flows and compositions such that the total weighted sum of squares of adjustments made to flow and composition measurements is minimized. The objective function is given by

The above formulation of the DR problem is in terms of flow rate and mole fraction variables. Alternatively, we can also formulate the problem in terms of overall flow and component flow variables, where the component flow N_jk of component k in stream j is defined as

Using these variables, the component balances can be written as

The normalization equations can also be written as

It can be observed from Equations 4-11 and 4-12 that the constraints are linear in the flow variables, and this feature can be exploited in the solution procedure. Although the constraints are now in terms of flow variables, the objective function still contains mole fraction variables, since these are the measured quantities. In order to overcome this problem, Crowe [3] proposed a modified objective for the DR problem, which is to minimize the sum of squares of adjustments made to flow and component flow variables. In this case, the modified objective function is

Since the component flows are not the measured quantities, in the above objective function it is necessary to clarify the notion of the measured value of the component flow variables and the weight factors to be used for these variables. A component flow N_jk is taken to be a measured quantity if both the flow F_j and the composition x_jk are measured.

In the previous chapter, it was shown that the weight factor of a measured variable can be chosen to be the inverse of the variance of its measurement error. An estimate of the variance of the error in the product N_jk is obtained by linearizing it in terms of the flow rate and composition measurements as

The variance σ²_Njk of the error in N_jk can be obtained by applying the rule for a linear combination of independent normally distributed variables.

The weight W_Njk can be taken to be equal to the inverse of the variance σ²_Njk.
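This weight computation can be sketched in a few lines. Linearizing N_jk = F_j x_jk about the measured values and applying the variance rule for independent errors gives var(N_jk) ≈ x_jk² var(F_j) + F_j² var(x_jk); the weight is its inverse. The numerical values below are illustrative only.

```python
# Minimal sketch of the weight for a "measured" component flow N_jk = F_j * x_jk.
# var(N) ~= x**2 * var(F) + F**2 * var(x) from first-order error propagation
# with independent flow and composition errors; the weight is 1/var(N).

def component_flow_weight(F, x, var_F, var_x):
    var_N = x**2 * var_F + F**2 * var_x
    return 1.0 / var_N

F, x = 100.0, 0.4            # measured flow and mole fraction (illustrative)
var_F, var_x = 25.0, 1e-4    # measurement error variances (illustrative)
print(component_flow_weight(F, x, var_F, var_x))  # ~0.2 = 1/(0.16*25 + 10000*1e-4)
```

In practice these weights populate the diagonal weight matrices used in the modified objective function.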

The choice of a modified objective function for DR and the weight factors for the "measured" component flows can lead to larger adjustments being made to the measurements. However, the modified objective still indirectly attempts to minimize the total adjustment made to the measured variables.

The modified objective function 4-13, subject to the constraint Equations 4-11 and 4-12, gives rise to a linear DR problem in the flow variables. For the special case considered here, all variables are measured, and we can immediately obtain the estimates of all flows using the analytical solution of Equation 3-4. From these estimates, the reconciled values of the mole fractions can be obtained as x̂_jk = N̂_jk / F̂_j (Equation 4-16).
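The analytical linear DR solution referred to above can be sketched as follows. This is a generic illustration of the Equation 3-4 form for constraints A x = 0 with measurement covariance V, namely x̂ = y − V Aᵀ(A V Aᵀ)⁻¹ A y; the tiny one-unit flow network and its numbers are hypothetical.

```python
import numpy as np

# Sketch of the analytical linear DR solution for constraints A*x = 0:
#   xhat = y - V A^T (A V A^T)^{-1} A y
# where y are measurements and V is the (diagonal) error covariance.

A = np.array([[1.0, 1.0, -1.0]])   # single balance: F1 + F2 - F3 = 0
y = np.array([10.2, 5.1, 14.7])    # measured flows (slightly inconsistent)
V = np.diag([0.25, 0.25, 0.25])    # measurement error variances

r = A @ y                                           # constraint residual
xhat = y - V @ A.T @ np.linalg.solve(A @ V @ A.T, r)
print(xhat)        # [10.  4.9 14.9] -- reconciled flows
print(A @ xhat)    # ~[0.] -- the balance is now satisfied
```

For the bilinear problem, y would contain the measured flows and the "measured" component flows, with weights computed as described above.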

Example 4-3

Crowe's method is applied to reconcile the data of the binary distillation column discussed in Example 4-1. The measured flows and compositions are as given in Table 4-1. The true flows and compositions and the reconciled values obtained using Crowe's method are given in Table 4-5. In order to obtain the reconciled values, the measurement error variances in flows are taken as 5% of the true values, and for the compositions they are taken as 1% of the true values. As compared to the reconciled values shown in Table 4-3, which are obtained using a nonlinear optimization technique, Crowe's method gives more accurate flow estimates at the expense of greater inaccuracy in the composition estimates. This is due to the fact that Crowe's method adjusts the component flows rather than the compositions.

Table 4-5
Reconciled Data of Binary Distillation Column Using Crowe's Method

Variables          True Values    Reconciled Values

Flow
Component 1 %
Component 2 %

Flow
Component 1 %
Component 2 %

Flow
Component 1 %
Component 2 %

Treatment of Unmeasured Variables

The presence of unmeasured flow or composition variables introduces subtle complications in Crowe's method. Depending on the measurements made, the streams can be classified into two categories:

(i) Streams with measured flows and some or all compositions unmeasured

(ii) Streams with unmeasured flows and some or all compositions unmeasured

A measured value for the component flow of a stream cannot be obtained if the corresponding composition variable is unmeasured, or if the stream flow is unmeasured, or both. Since there is a one-to-one correspondence between composition variables and their component flows, it is appropriate to consider the component flow as unmeasured if the corresponding composition variable is unmeasured, regardless of whether the stream flow is measured or not. However, if a stream flow is unmeasured, then treating all component flows of this stream as unmeasured will result in a loss of the information in the measured compositions of this stream. In order to avoid this, Crowe's method classifies the stream flows and component flows into the following three categories:

(i) Category I consists of all measured stream flow variables and "measured" component flow variables. Thus, this category consists of measured variables only.

(ii) Category II consists of all component flows corresponding to measured compositions but unmeasured stream flows. It also contains all the unmeasured stream flow variables. Thus, this category consists of a mixture of measured compositions and unmeasured stream flow variables.

(iii) Category III consists of all component flows corresponding to unmeasured compositions, for which the stream flow may or may not be measured. Thus, this category consists of unmeasured variables only.

The flow and component flow variables in the different categories are denoted by superscripts I, II, and III. The objective function for the DR problem can now be formulated as

The above objective function has been expressed compactly using vectors F, N_k, and x_k, corresponding to the overall flows, component k flows, and compositions of all streams in each category, respectively. The weight matrices W are diagonal matrices, with the diagonal entries being the weights for the appropriate variables of all streams in each category.

The constraints of the DR problem are the material balances for each unit, written as described earlier. These equations can be cast in terms of the variables in the three categories. For solving this problem, Crowe [3] proposed a two-stage decomposition strategy for eliminating unmeasured variables from the constraint equations. In the first stage, unmeasured component flows in Category III are eliminated by using a projection


matrix. For this, the procedure used in linear DR can be followed, because the constraints are linear in the component flows. In the second stage, the unmeasured flow variables in Category II are eliminated by using a second projection matrix. This requires some algebraic manipulation of the constraint equations, which is described in [3].
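The projection idea can be sketched generically (this is not the book's exact algebra): with constraints A_m x_m + A_u x_u = 0, any matrix P whose rows span the left null space of A_u satisfies P A_u = 0, so P A_m x_m = 0 involves measured variables only. Here P is built from an SVD of A_u; the small constraint matrices are hypothetical.

```python
import numpy as np

# Sketch of eliminating unmeasured variables with a projection matrix P
# such that P @ A_u = 0 (rows of P span the left null space of A_u).

def projection_matrix(A_u, tol=1e-10):
    U, s, Vt = np.linalg.svd(A_u)
    rank = int(np.sum(s > tol))
    return U[:, rank:].T          # rows orthogonal to the columns of A_u

A_m = np.array([[1.0, -1.0, 0.0],   # balances involving measured flows
                [0.0,  1.0, -1.0]])
A_u = np.array([[ 1.0],             # one unmeasured flow appearing in
                [-1.0]])            # both balances with opposite signs

P = projection_matrix(A_u)
print(np.allclose(P @ A_u, 0))   # True: unmeasured variable eliminated
print(P @ A_m)                   # reduced constraints on measured flows only
```

The reduced system P A_m x_m = 0 then plays the role of the constraints in the linear DR solution.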

The reduced DR problem still requires an iterative procedure to solve for the reconciled compositions of Category II and component flows of Category I, starting with guesses of the Category II flow variables. It can be verified that if estimates of the Category II flows are given, then the reduced reconciliation problem becomes a linear DR problem which can be solved analytically. These reconciled estimates are used to back-calculate the unmeasured flows of Category II using a procedure similar to that described in Chapter 3, and these are used as starting guesses for the next iteration until convergence.

After the estimates of the variables in Categories I and II are obtained, they can be used to back-calculate the estimates of the unmeasured component flows of Category III as described in Chapter 3. Since Crowe's method directly gives the estimates of component flows in Categories I and III, the mole fraction estimates are obtained using Equation 4-16.

Example 4-4

We consider the mineral flotation process analyzed by Smith and Ichiyen [5], shown in Figure 4-4. The process consists of three flotation cells (separators) and a mixer, and eight streams, each consisting of two minerals, copper and zinc, in addition to gangue material. The flow of stream 1 is taken to be unit mass (basis), while the other stream flows are unmeasured. The mineral concentrations of all streams except 8 are measured. These values are shown in the first row of Table 4-6. Based on this information, the flow and component variables can be classified as

Category I: F1, N11, N12
Category II: F2, N21, N22, ..., F7, N71, N72

Steady-State Data Reconciliation for Bilinear Systems

Figure 4-4. Mineral flotation process [5]. Reproduced with permission of the Canadian Society for Chemical Engineering.

Although the flow rate of stream 8, being an unmeasured variable, should be classified as a Category II variable, it can also be classified as a Category III variable because all its compositions are unmeasured.

The constraints imposed for this process are the flow balances and component flow balances around each unit. Normalization equations are not imposed, because we have eliminated the composition variables corresponding to the unmeasured gangue in each stream. This leads to a reduced number of variables and constraints in the data reconciliation problem.

We start with an initial guess for the flows in streams 2 to 7 as given in Table 4-6, and apply the iterative procedure to obtain the reconciled values. Row 2 of Table 4-6 shows the reconciled estimates of the flows and mineral concentrations obtained using Crowe's method. It is observed that the estimate for the zinc concentration in stream 8 is negative. For comparison, the reconciled estimates obtained using a nonlinear programming (NLP) technique (described in the next chapter) are also listed in the last row of Table 4-6. Again, the estimate for the zinc concentration in stream 8 is infeasible. This points to the need for imposing bound constraints in the data reconciliation problem, which will be discussed in the next chapter. The maximum difference between the (feasible) mineral concentrations of the two solutions is about 2.7%. Since Crowe's method uses an objective function different from the standard DR problem, its estimates will be less accurate than those obtained using the NLP approach.

Table 4-6
Measured and Reconciled Data of Mineral Flotation Process

Method    Var.      Stream
                    1        2        3        4        5        6        7        8
Measured  F*        1        0.5      0.25     0.125    0.5      0.75     0.125    0.25
          yCu %     1.928    0.450    0.128    0.090    19.88    21.43    0.513    35.36
          yZn %     3.81     4.72     5.36     0.41     7.09     4.95     52.10    -
Crowe     F         1        0.9229   0.9147   0.8324   0.0771   0.0853   0.0823   0.0081
          yCu %     1.9451   0.4498   0.1285   0.0906   19.834   21.431   0.512    0.2976
          yZn %     5.0356   4.8617   5.0461   0.4099   7.1167   4.9235   51.930   -15.91
NLP       F         1        0.9253   0.9164   0.8287   0.0747   0.0836   0.0877   0.0089
          yCu %     1.9122   0.4509   0.1301   0.0899   20.00    21.44    0.5098   35.554
          yZn %     4.2759   4.7058   5.3583   0.41     6.9694   4.95     52.116   -130.1

* Initial values of flows for streams 1 through 8 are listed in this row.

Simpson's Technique

The application of data reconciliation in the mineral processing industries, especially to mineral beneficiation circuits, was investigated about 30 years ago. Several methods for specifically solving the DR problems arising in these industries have been developed. Among these, the method developed by Simpson et al. [6] is very efficient.

Before describing Simpson's technique, it is instructive to examine some of the process units encountered in the mineral processing industries and see in what respects they differ from the corresponding units in chemical process industries. In mineral beneficiation processes [7], the ore is first crushed to obtain particle sizes in the range 16-20 cm. The crushed particles are further reduced in size to between 10-300 microns in grinders. Generally, grinding is carried out with the addition of water and/or recycled slurry. The particles containing the minerals are separated from the gangue particles in separation units that are either classifiers or flotation cells.

The slurry containing the mineral particles is referred to as the concentrate, and that containing the gangue as tailings. Water may also be added to the separation units in order to maintain a desired pulp density. Figure 4-5a shows a schematic of such units, where the feed (F), tailings (O), and concentrate (U) contain both solids and liquid, but the stream denoted by W is a pure water stream. Although these process units are similar to separators, all streams (except pure water streams) are two-phase streams, and conservation equations have to be written for the overall slurry flow as well as for the solid or liquid flows through each of the units. We refer to these process units as two-phase separators. Similarly, there are two-phase mixers where mixing of these streams occurs.

Figure 4-5a. Two-phase separator.

The type of equations written for these units differs from the simple separator in three respects. First, separate conservation relations have to be written for the overall slurry flow and the solid flow. Second, laboratory measurements of the pulp density of the slurry streams may also be available, which need to be reconciled. Third, the solids also contain gangue material, which is not measured; only the mineral concentrations of the solids are measured, so normalization equations are not imposed since they are irrelevant. In order to take the above aspects into account in the model equations, the following variables are associated with each stream j:

(1) The slurry flow rate (F_j)
(2) The flow rate of solids (S_j)
(3) The mineral concentrations (x_jk), expressed as weight fractions of the slurry flow
(4) The pulp density (ρ_j), which is the ratio of the solid flow to the total slurry flow in a stream.

Page 62: 0884152553

Using the above variables and the notation in Figure 4-5a, the model equations for these units can be written as follows.

(i) Overall flow balance:

(ii) Solids flow balance:

S_F - S_O - S_U = 0

(iii) Pulp density relations:

ρ_F F - S_F = 0

ρ_O O - S_O = 0

ρ_U U - S_U = 0

(iv) Mineral component balances:

Two-Phase Mixers

These units are similar to mixers except that two-phase streams are involved, as shown in Figure 4-5b. As in the case of the two-phase separator, the stream W is a pure water stream. The equations are similar to those for a two-phase separator.


Ball Mills

In crushers and grinders, the size distributions of the input and output streams are measured. Balance equations are written for the ore quantity within each size range. If a slurry stream is also recycled to the mill or water is added, then, as in the case of two-phase mixers, a slurry balance and pulp density relations have to be written. Figure 4-5c shows a schematic of a general mill.

Figure 4-5b. Two-phase mixer unit.

Figure 4-5c. Ball mill unit.

Let w_ij be the weight fraction of solids of size range i in stream j, and n_s be the number of size ranges. The following constraints are imposed for a mill in general.

(i) Overall flow balance:

(ii) Overall solids flow balance:

(iii) Solids flow balance for each size range:

where k_i is the increase (which could be negative or positive) in the weight of solids in size range i due to grinding. These are typically unknown quantities and have to be estimated as part of the reconciliation problem.

(iv) Pulp density relation:

(v) Normalization equations (for weight fractions):

It can be observed that the model equations for the different units are bilinear. The method developed by Simpson et al. [6] exploits this feature for efficient solution of the DR problem. We describe this method for the simple case when the mineral beneficiation circuit consists only of two-phase mixers and separators. We assume that the measurements made are the slurry flows (or liquid flows for pure liquid streams), pulp densities, and mineral concentrations given in terms of the mass fraction of mineral in wet solids. The objective function of the DR problem in this case is given by

where F_j is the overall flow rate of stream j, s1 is the total number of streams, and s2 is the number of slurry streams. It should be noted that all variables, whether measured or not, are included in the objective function.


Since there are no measured values associated with unmeasured variables, an initial estimate of these variables can be used in the objective function. Moreover, the weights for all unmeasured variables are chosen to be zero, so that the objective function is identical to the standard DR objective function, which minimizes the weighted sum of squares of adjustments made to measured variables.

In Simpson's method, the nonlinear data reconciliation problem is approximated by a linear data reconciliation problem through a suitable choice of working variables and linearization.

The pulp density relations are first used to substitute for the variables ρ_j in terms of the flow rates.

With this substitution, the pulp density relations need not be considered in the DR problem. As seen before, the component balances are linear if we express them in terms of component flow rates. The variables x_jk in the objective function can also be expressed in terms of component flow variables as

Using Equations 4-19 and 4-20, the objective function can be written as

The second and third terms in the objective function are no longer quadratic in the flow variables. The objective function can be approximated by a quadratic function by using a first-order approximation of the flow ratios around some estimates F_j^0, S_j^0, and N_jk^0

where a_jk and b_jk are respectively equal to 1/F_j^0 and -N_jk^0/(F_j^0)^2, and

Page 64: 0884152553

where c_j is equal to -S_j^0/(F_j^0)^2. The quadratic approximation of the objective function is therefore given by

Since the constraints are linear in the flow and component flow variables, we now have an approximate linear DR problem corresponding to the objective 4-24 and the flow balance constraints for the overall slurry flows, solids flows, and component flows. This linear DR problem can be solved using the techniques described in the preceding chapter. This DR problem, however, can be solved more efficiently by reducing it to an unconstrained optimization problem, eliminating all the constraints together with a suitable choice of dependent variables. The dependent variables are so chosen that their relation to the independent variables is obtained easily. Graph theoretic concepts are exploited to achieve this.

Let us first consider the overall flows of streams. From the preceding chapter, it may be recalled that the concept of a spanning tree of a process graph is useful in determining the observability or estimability of unmeasured flow variables. A fundamental cutset of a stream which is a branch of the spanning tree provides a flow balance equation which relates the flow of that stream with the flows of streams which are chords of the fundamental cutset. These ideas can be used to conveniently choose the dependent variables. We construct a spanning tree of the process graph and choose the flows of the branch streams of the spanning tree as dependent variables and the chord stream flows as independent variables. It can be immediately deduced that the fundamental cutsets of the spanning tree can be used to relate the dependent and independent flow variables. These relationships can be expressed as

where F_bi is the flow of branch stream i, F_cj is the flow of chord stream j, and p_ij is a coefficient which is 0 if chord j is not part of the fundamental cutset of branch stream i, and +1 or -1 if chord j is in the fundamental cutset of branch i and the flow directions of chord j and branch i are opposite or the same with respect to each other, respectively. Thus, the dependent branch flow variables can be eliminated from the objective function, Equation 4-24, by using Equation 4-25.

If we consider the solids flows, the above ideas can again be applied, since the solids flows are related in exactly the same manner as the overall flows. This is true also of the component flows of streams for each component. Thus, the solids flows and component flows of branch streams can be chosen as dependent variables and related to the corresponding chord flows similarly to Equation 4-25. The solution of the resulting unconstrained optimization problem can be obtained by setting the derivatives of the objective function with respect to the chord flows to zero and solving the resulting linear equations. Complete details of the linear equations to be solved at each iteration are given in Simpson et al. [6]. It should be noted that in this technique only initial estimates of the chord flows (total, slurry, and component flows) have to be guessed, since the branch flow estimates can always be calculated using Equation 4-25.
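The branch-from-chord computation of Equation 4-25 can be sketched with a toy network. The P matrix below is illustrative (two branches, two chords, made-up cutset signs), not the flotation example's matrix.

```python
import numpy as np

# Sketch of Equation 4-25: branch flows of a spanning tree computed from
# chord flows via fundamental-cutset coefficients p_ij.
# Hypothetical network: branches b1, b2; chords c1, c2.

P = np.array([[ 1.0, 1.0],    # cutset of b1 contains c1 and c2 (same direction)
              [-1.0, 1.0]])   # cutset of b2 contains c1 (opposed) and c2

F_chord = np.array([2.0, 3.0])   # guessed independent (chord) flows
F_branch = P @ F_chord           # dependent (branch) flows follow directly
print(F_branch)                  # [5. 1.]
```

The same matrix-vector product, applied per phase and per component, gives the solids and component branch flows, so only the chord flows appear as free variables in the unconstrained problem.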

We illustrate Simpson's method for the mineral flotation process considered in Example 4-4. The process graph of Figure 4-4 is constructed by including the environment node and is shown in Figure 4-6. A spanning tree of this process graph is constructed and is shown in Figure 4-7.

Figure 4-6. Graph of mineral flotation process.

Page 65: 0884152553


Figure 4-7. Spanning tree of mineral flotation process.

The choice of this spanning tree implies that the flows of branch streams 1, 2, 3, and 8 are the dependent variables. The fundamental cutsets with respect to this spanning tree can be easily obtained and are given by the sets [1, 4, 6, 7], [2, 3, 5, 6, 7], [3, 4, 7], and [8, 5, 6].

Based on the stream flow directions, the matrix of coefficients p_ij can now be constructed and is given by

where the rows of P correspond to the branch streams 1, 2, 3, and 8, while the columns of P correspond to the chord streams 4, 5, 6, and 7, arranged in order. We start with initial estimates of the chord flow variables and compute the branch flows using Equation 4-25; these are shown in Table 4-7. For these estimates, the coefficients a_jk and b_jk are computed. The solution of the reduced unconstrained DR problem gives new estimates for the chord flows, which are used for the next iteration. The iterations are carried out until convergence. The final reconciled estimates obtained are shown in Table 4-7.

Table 4-7
Reconciled Data of Mineral Flotation Process Using Simpson's Method

Method     Var.     Stream
                    1        2        3        4        5        6        7        8
Initial    F        1.25     1.15     1.1      0.9      0.1      0.15     0.2      0.05
Estimates  NCu      3.399    1.413    0.183    0.0810   1.986    3.216    0.102    1.23
           NZn      11.532   10.823   10.789   0.369    0.709    0.7425   10.42    0.0335
Simpson    F        1        0.9246   0.9160   0.8424   0.0754   0.0840   0.0736   0.0087
           yCu %    1.9189   0.4505   0.1258   0.0918   19.932   21.462   0.515    34.773
           yZn %    4.5575   4.3564   4.5278   0.4097   7.0231   4.8803   51.657   -13.77

Generalization of Bilinear Data Reconciliation Techniques

In describing the above methods, we have reconciled all measured values. It is more common in industrial applications to hold some of the measured variables constant during the process of reconciliation. A simple way to accomplish this is to assign a very high weight (or very small standard deviation) in the objective function to the measurements that have to be kept constant. This will force the adjustments made to these measurements to be negligibly small.
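The high-weight device can be demonstrated on the same generic linear DR solution form used earlier (x̂ = y − V Aᵀ(A V Aᵀ)⁻¹ A y); the one-balance network and variances are hypothetical.

```python
import numpy as np

# Sketch of holding a measurement essentially constant by assigning it a
# very small error variance (i.e., a very large weight) in linear DR.

A = np.array([[1.0, 1.0, -1.0]])   # balance: F1 + F2 - F3 = 0
y = np.array([10.2, 5.1, 14.7])    # measured flows

V = np.diag([1e-8, 0.25, 0.25])    # F1 "fixed" via a tiny variance
xhat = y - V @ A.T @ np.linalg.solve(A @ V @ A.T, A @ y)
print(xhat)                        # adjustment to F1 is negligible
print(abs(xhat[0] - y[0]) < 1e-6)  # True: F1 held (almost) constant
```

The entire residual is absorbed by the other two measurements, while the constraint is still satisfied.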

Both Crowe's method and Simpson's method have been described for processes involving primarily mixers and separators. If splitters, reactors, or grinding mills are present in the process, then these methods have to be suitably modified, because the type of equations imposed for these units does not conform to those of other units such as mixers and separators. Crowe [3] has outlined the modifications necessary to take splitters into account. For this purpose, the splitter equations are formulated using split-fraction variables.

As pointed out earlier, the use of split-fraction variables leads to a trilinear structure for the component balances. In order to use Crowe's method for the bilinear problem, the split-fraction variables are estimated in an outer iterative loop. For each guess of the split-fraction variables, a bilinear problem results which can be solved using Crowe's method. In general, a constrained optimization technique is required to obtain updated estimates of the split-fraction variables at each iteration, which robs Crowe's method of much of its efficiency. If only one splitter is present in the process, however, then a univariate optimization method such as the golden section search [8] can be used for this purpose.

Treatment of Enthalpy Flows

Although both Crowe's method and Simpson's method were developed to solve multicomponent data reconciliation problems, it is possible to extend these techniques to take into account enthalpy balances and to reconcile temperature variables. In general, the enthalpy of a stream is a nonlinear function of the stream temperature and composition. However, if the enthalpy of a stream can be assumed to be a linear function of temperature and independent of composition, then simultaneous material and energy balance reconciliation also gives rise to a bilinear problem. Even if the enthalpy of a stream is a nonlinear function of temperature but is independent of composition, the methods discussed in this chapter can be used with minor modifications.

An important subsystem that satisfies this assumption is the crude preheat train of a refinery, where the enthalpy of a petroleum stream is related to the temperature and physical properties such as API gravity and normal boiling point of the stream. For the purposes of this chapter, we will make this assumption and describe the modifications necessary to apply Crowe's method or Simpson's method for simultaneous material and energy balance reconciliation. As before, we first describe the energy balances for the different types of process units.

where H(T) is the specific enthalpy of the stream, which is assumed to be only a function of temperature.

Splitter Enthalpy Balance

Heat Exchanger

By definition, we assume for a heat exchanger that the data for both the cold-side and hot-side fluids need to be reconciled or estimated. We also assume that the streams are single-phase fluids.

Figure 4-8. Heat exchanger.

The equations for this unit, shown in Figure 4-8, are

(i) Flow balances for hot and cold side fluids:

(ii) Enthalpy balance:

(iii) Component flow balances:

or in terms of specific enthalpies of streams



(iv) Normalization equations for outlet streams:

Heaters or Coolers

A heater or a cooler is a heat exchanger for which the data of only the process stream are reconciled, while the data of the utility stream are assumed to be unavailable or unimportant. The constraints for this unit are a subset of the constraints of a heat exchanger. The material, energy, and normalization equations are written for the process stream only.

Crowe's method can be easily extended to include enthalpy balances and temperature variables in the reconciliation problem, as suggested by Sanchez and Romagnoli [9]. The specific enthalpy variables can be treated in a similar manner as composition variables. The enthalpy flows of the different streams can be classified into the three categories in a similar manner as component flow variables. The objective function will now contain terms for the adjustments made to enthalpy flows of Category I streams and specific enthalpies of Category II streams. The two-step Crowe's projection technique described earlier can be applied to also obtain the reconciled values of the specific enthalpies of all streams. If the specific enthalpy is a nonlinear function of temperature, then the temperature estimate of each stream can be recovered from the specific enthalpy. In general, this may require the solution of a one-dimensional nonlinear equation for each stream.

Simpson's method has also been extended to include splitters, as well as to treat enthalpy balances along with flow and component balances [10]. It should be cautioned, however, that generalizing these methods to include other types of process units such as a flash drum (which can also be described by bilinear equations for ideal thermodynamics) may not be a trivial exercise. A significant disadvantage of these methods is that at present they cannot take into account simple bounds on process variables. This can seriously limit the use of these methods in industrial applications where it is required to obtain feasible estimates of process variables.

SUMMARY

A complete set of independent constraints has to be imposed for each process unit in formulating data reconciliation problems. Different independent sets of equations can be imposed for a process unit, but some are more convenient than others. It is important to include normalization constraints on compositions to ensure that the reconciled estimates satisfy them. The constraints of bilinear data reconciliation problems contain products of two variables (flow and composition, or flow and temperature). Special methods have been developed for solving bilinear data reconciliation problems. These are efficient but cannot handle all types of process units, and they also cannot take into account feasibility constraints such as bounds on variables. Nonlinear data reconciliation techniques can be used to solve bilinear problems. These are less efficient, but do not have any other limitations.

REFERENCES

2. Meyer, M., M. Enjalbert, and B. Koehret. "Data Reconciliation on Multicomponent Networks Using Observability and Redundancy Classification," in Computer Applications in Chemical Engineering (edited by H. Th. Bussemaker and P. D. Iedema). Amsterdam: Elsevier, 1990.

3. Crowe, C. M. "Reconciliation of Process Flow Rates by Matrix Projection. II: The Nonlinear Case." AIChE Journal 32 (1986): 616-623.

4. Rao, R. R., and S. Narasimhan. "Comparison of Techniques for Data Reconciliation of Multicomponent Processes." Ind. & Eng. Chem. Res. 35 (1996): 1362-1368.

5. Smith, H. W., and N. Ichiyen. "Computer Adjustment of Metallurgical Balances." CIM Bull. (1973): 97-100.


6. Simpson, D. E., V. R. Voller, and M. G. Everett. "An Efficient Algorithm for Mineral Processing Data Adjustment." Int. J. Miner. Proc. 31 (1991): 73-96.

7. Wills, B. A. Mineral Processing Technology, 4th ed. Oxford: Pergamon, 1989.

8. Edgar, T. F., and D. M. Himmelblau. Optimization of Chemical Processes. New York: McGraw-Hill, 1988.

9. Sanchez, M., and J. Romagnoli. "Use of Orthogonal Transformations in Data Classification-Reconciliation." Computers Chem. Engng. 20 (1996): 483-493.

10. Siraj, S. M. An Efficient Decomposition Strategy for General Data Reconciliation. M.Tech thesis, IIT Kanpur, India, 1995.

Nonlinear Steady-State Data Reconciliation

The steady-state conservation constraints that are used to describe most chemical processes are nonlinear in nature. If we are interested in only overall flow balance reconciliation of such processes, then the linear data reconciliation techniques described in Chapter 3 are sufficient. Moreover, under some restrictions some of these processes can be solved using bilinear data reconciliation techniques, as described in Chapter 4.

If we wish, however, to take into consideration thermodynamic equilibrium relationships and complex correlations for thermodynamic and physical properties, then nonlinear data reconciliation techniques must be used. Moreover, in Chapters 3 and 4 we have considered only equality constraints corresponding to material and energy conservation, and have not imposed even simple bounds on the variables. The reconciled estimates of variables can therefore become infeasible. For example, negative reconciled estimates for flows or compositions can be obtained. If we impose bounds on the estimates of variables or other feasibility constraints, then these give rise to inequality constraints in the data reconciliation problem, which can be solved only using a nonlinear data reconciliation solution technique.



FORMULATION OF NONLINEAR DATA RECONCILIATION PROBLEMS

Equilibrium Flash Data Reconciliation Example

We will use a simplified version of the example of a single-stage flash unit, drawn from MacDonald and Howat [1], to illustrate the formulation and solution of nonlinear data reconciliation problems. Figure 5-1 shows an isothermal flash unit with a feed stream containing propane, n-butane, and n-pentane. The steady-state constraint equations for this unit are given below:

Component Balances: F z_i - L x_i - V y_i = 0,  i = 1, 2, 3   (5-1)

Normalization Equations: x_1 + x_2 + x_3 - 1 = 0,  y_1 + y_2 + y_3 - 1 = 0

Equilibrium Relations: y_i - P_i^sat(T) x_i / P = 0,  i = 1, 2, 3   (5-5)

For simplicity, Raoult's law has been used to describe the equilibrium relations. The saturation pressure is obtained using the Antoine equation, which is given by:

Antoine Equation: ln P_i^sat = A_i + B_i / (T + C_i),  i = 1, 2, 3   (5-6)

Figure 5-1. Equilibrium flash unit.

The nonlinear data reconciliation problem is to reconcile the measurements of the flow rate, temperature, pressure, and compositions of the feed, liquid, and vapor product streams so as to satisfy constraints 5-1 through 5-6.

General Problem Formulation

As in the linear case, it is assumed that the random measurement errors follow a normal distribution with zero mean and a variance-covariance matrix Σ. The general nonlinear data reconciliation problem can be formulated as a least-squares minimization problem as follows:

Min over x, u:  (y - x)^T Σ^-1 (y - x)   (5-7)

subject to

f(x, u) = 0   (5-8)

g(x, u) ≤ 0   (5-9)

where

f : m x 1 vector of equality constraints;
g : q x 1 vector of inequality constraints;
Σ : n x n variance-covariance matrix;
u : p x 1 vector of unmeasured variables;
x : n x 1 vector of measured variables;
y : n x 1 vector of measured values of the variables x.
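As a sketch of how such a problem can be posed numerically, the flash reconciliation can be handed to a general-purpose solver, here SciPy's SLSQP. All numbers (measured values, standard deviations, equilibrium ratios) are hypothetical, and for brevity the ratios K_i = P_i^sat(T)/P are frozen at assumed values instead of being computed from the Antoine equation (5-6).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical measurements for the flash unit of Figure 5-1.
y_meas = np.array([10.0, 6.1, 4.1,        # flows F, L, V
                   0.30, 0.40, 0.30,      # feed compositions z
                   0.15, 0.40, 0.45,      # liquid compositions x
                   0.55, 0.35, 0.10])     # vapor compositions y
sigma = 0.05 * np.abs(y_meas)             # assumed standard deviations
K = np.array([3.7, 1.2, 0.35])            # assumed K-values: propane, n-butane, n-pentane

def eq_constraints(v):
    F, L, V = v[:3]
    z, x, yv = v[3:6], v[6:9], v[9:12]
    return np.concatenate([
        F * z - L * x - V * yv,                           # component balances (5-1)
        [z.sum() - 1.0, x.sum() - 1.0, yv.sum() - 1.0],   # normalization
        yv - K * x,                                       # equilibrium relations (5-5)
    ])

def objective(v):                          # Eq. 5-7 with a diagonal covariance
    r = (y_meas - v) / sigma
    return r @ r

res = minimize(objective, y_meas, method="SLSQP",
               constraints={"type": "eq", "fun": eq_constraints})
print(res.x)                               # reconciled estimates
```

The reconciled estimates satisfy all nine equality constraints while staying as close as the weights allow to the raw measurements.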


The equality constraints defined by Equation 5-8 typically include all material and energy conservation relations, thermodynamic equilibrium constraints, and constitutive equations for material behavior, similar to Equations 5-1 through 5-6 of the equilibrium flash example. The inequality constraints given by Equation 5-9 may be as elementary as upper and lower bounds on variables, or complex feasibility constraints related to equipment operation.

In the above formulation, we have tacitly assumed that the variables x are directly measured. However, this does not impose any limitation. If measurements are functions (linear or nonlinear) of the variables (for example, pH is a function of the H+ concentration), then we can always define a new state variable for pH which is directly measured, and the relationship between pH and the H+ concentration can be included as part of the equality constraint set. We will first consider the solution techniques for nonlinear data reconciliation problems in which only equality constraints are present in the process model. Two solution techniques and their variants are discussed in the following section.
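The pH device above can be sketched in a few lines, with hypothetical numbers and SciPy's SLSQP standing in for the solver; the concentration is expressed in units of 1e-4 mol/l to keep the problem well scaled.

```python
import numpy as np
from scipy.optimize import minimize

# pH is measured; the state variable is the H+ concentration c, and the
# relation pH = -log10([H+]) is added to the equality constraints.
pH_meas, c_meas = 4.02, 1.05            # measured pH and [H+] (in 1e-4 mol/l)
sig_pH, sig_c = 0.05, 0.10              # assumed standard deviations

def objective(v):                        # v = (pH, c)
    return ((pH_meas - v[0]) / sig_pH) ** 2 + ((c_meas - v[1]) / sig_c) ** 2

# pH + log10(c * 1e-4) = 0  <=>  pH = -log10([H+])
link = {"type": "eq", "fun": lambda v: v[0] + np.log10(v[1] * 1e-4)}

res = minimize(objective, [pH_meas, c_meas], method="SLSQP", constraints=link)
print(res.x)                             # reconciled (pH, c)
```

The two slightly inconsistent measurements are pulled onto the curve defined by the linking constraint, each in proportion to its assumed precision.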

SOLUTION TECHNIQUES FOR EQUALITY CONSTRAINED PROBLEMS

The minimization of Equation 5-7 subject to the equality constraints of Equation 5-8 can be achieved by using a general-purpose nonlinear optimization technique. However, since the objective function is quadratic in nature, efficient techniques have been developed to solve the problem. The estimates obtained by solving this optimization problem can be shown to be maximum likelihood estimates (MLE). It should be noted, however, that these estimates may be biased, whereas in the linear case the estimates are unbiased.

Methods Using Lagrange Multipliers

The equality constrained nonlinear data reconciliation problem can be solved by using the classical method of Lagrange multipliers [2]. The Lagrangian for the problem is given by

The solution of the data reconciliation problem can be obtained by setting the partial derivatives of Equation 5-10 with respect to the variables x, u, and λ to zero (the necessary conditions for an optimal solution of the problem defined by Equations 5-7 and 5-8) and solving the resulting equations. The following equations are obtained:

where

are the Jacobian matrices containing the partial derivatives of the nonlinear functions f with respect to x and u, respectively.

Since the constraints are nonlinear, solving for x, u, and λ involves an iterative numerical procedure. The system of normal equations 5-11 through 5-13 can be solved by any simultaneous equation solver [3]. Stephenson and Shewchuk [4] used a Newton-Raphson iterative method which is based on a quasi-Newton linearization of the nonlinear model. Their algorithm takes advantage of the sparsity of the Jacobian matrix and the invariance of the partial derivatives of the linear terms in the model equations, which makes the computations more efficient for large systems. Serth et al. [5] reported a similar approach but with a different nonlinear equation solver.

Madron [6] suggested an iterative approach for solving the normal equations 5-11 through 5-13 based on successive linearization. Let x_k and u_k represent the estimates of the variables obtained at the start of iteration k. A linear approximation can be obtained for the nonlinear constraint from the Taylor expansion of the function f(x,u) in Equation 5-8, retaining only the constant term and the first-order derivative term.


where the Jacobian matrices J_x^k and J_u^k are as defined by Equations 5-14 and 5-15, with the superscript k indicating that they are evaluated at the estimates x_k, u_k. The Jacobian matrices that appear in Equations 5-11 and 5-12 are also replaced by their estimated values at iteration k. The resulting set of equations is now linear. In Madron's procedure, these linear equations are de-coupled by eliminating the vector x from Equations 5-12 and 5-13 using Equation 5-11.

From Equation 5-11,

Using Equations 5-16 and 5-17 in Equations 5-12 and 5-13, and rearranging, we obtain the following linear equations involving u and λ

Equation 5-18 can be solved to obtain the new estimates for u and λ. The estimates for λ are used in Equation 5-17 to obtain the new estimates for x. This procedure is repeated using these new estimates as initial guesses for the next iteration. A disadvantage with all these methods is that the inclusion of the Lagrange multipliers λ in the solution increases the size of the problem, which requires more computational time.
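One pass of this linearize-and-solve scheme per iteration can be sketched on a small hypothetical example: a mixer with measured flows F1 + F2 - F3 = 0 and an energy balance F1*h1 + F2*h2 - F3*u = 0 in which the outlet specific enthalpy u is unmeasured. All numbers are made up, and the factor of 2 from differentiating Equation 5-7 is absorbed into the multipliers.

```python
import numpy as np

h1, h2 = 100.0, 250.0
y = np.array([10.2, 4.7, 15.6])            # measured flows
Sigma = np.diag([0.25, 0.09, 0.36])        # measurement variances

def f(x, u):
    return np.array([x[0] + x[1] - x[2],
                     x[0] * h1 + x[1] * h2 - x[2] * u])

def jacobians(x, u):
    Jx = np.array([[1.0, 1.0, -1.0],
                   [h1, h2, -u]])
    Ju = np.array([[0.0], [-x[2]]])
    return Jx, Ju

x_k, u_k = y.copy(), 150.0                 # initial estimates
for _ in range(20):
    Jx, Ju = jacobians(x_k, u_k)
    # Linearized constraint Jx x + Ju u = b (Taylor expansion, Eq. 5-16):
    b = Jx @ x_k + Ju @ np.array([u_k]) - f(x_k, u_k)
    # Eliminate x with x = y - Sigma Jx^T lam (Eq. 5-17) and solve the
    # de-coupled linear system in (lam, u) (Eq. 5-18):
    A = np.block([[Jx @ Sigma @ Jx.T, -Ju],
                  [Ju.T, np.zeros((1, 1))]])
    rhs = np.concatenate([Jx @ y - b, [0.0]])
    sol = np.linalg.solve(A, rhs)
    lam, u_new = sol[:2], sol[2]
    x_new = y - Sigma @ Jx.T @ lam
    done = np.max(np.abs(x_new - x_k)) < 1e-10 and abs(u_new - u_k) < 1e-10
    x_k, u_k = x_new, u_new
    if done:
        break

print(x_k, u_k)                            # reconciled flows, estimated u
```

At convergence the original bilinear constraints, not just their linearizations, are satisfied by the estimates.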

To reduce the size of the problem, Madron [6] proposed a Gauss-Jordan elimination of the original linear/linearized constraint matrices ([J_x | J_u] for the nonlinear case). The structure of the resulting matrix provides useful information for variable classification.

Method of Successive Linear Data Reconciliation

A simpler way to handle the nonlinear data reconciliation is to successively solve a series of linear data reconciliation problems by linearization of the nonlinear constraints. A linear approximation to the nonlinear constraints is obtained as in Equation 5-16. We thus obtain a linear data reconciliation problem for minimizing 5-7 subject to the linear equality constraints of Equation 5-16, which can be solved using the technique described in Chapter 3. Britt and Luecke [2] proposed an alternative solution procedure for the linearized problem. Their solution for the estimates to be used at the next iteration is given by

where

Equations 5-19 and 5-20 were derived by Britt and Luecke [2] for parameter estimation in nonlinear regression. They were adapted for nonlinear data reconciliation by Knepper and Gorman [7] and also used later by MacDonald and Howat [1]. The algorithm requires initial estimates for all variables contained in the vectors x and u. The measured values y can be used to initialize the variables x. Britt and Luecke also designed a simplified algorithm that can be used to initialize the unmeasured parameters u. At each iteration, the function f(x,u) and the Jacobian matrices J_x and J_u are re-evaluated with the new estimates. The iterations are continued until ||u_k+1 - u_k|| and ||x_k+1 - x_k|| satisfy a small tolerance criterion. Even if convergence is achieved, the solution might not be a global minimum. This difficulty is common to most nonlinear least-squares estimation problems.

A variant of the above algorithm, suggested by Knepper and Gorman [7] in order to reduce the computational time, is to hold the Jacobian matrices constant at the initial estimates and re-compute them only after the constraints are satisfied (constant direction approach). This approach, however, is characterized by slow convergence.

Another variant, suggested by MacDonald and Howat [1], is a de-coupled procedure in which the estimates for u are held constant and Equation 5-20 is repeatedly used until the estimates for x converge. Equation 5-19 is then used to obtain new estimates for u, and the procedure is repeated until all the estimates converge. MacDonald and Howat [1] demonstrated through application to a non-equilibrium flash unit that the coupled algorithm provides marginally more accurate estimates at the expense of a greater computational time. The de-coupled procedure can be a useful computational scheme when the nonlinear equations are implicit in the parameters.

The method of Britt and Luecke [2] and its variants described above have some limitations. Equations 5-19 and 5-20 involve the inverses of two matrix products, R and (J_u^k)^T R^-1 (J_u^k). In order for these inverses to exist, the following conditions should be satisfied:

(i) The matrix J_x^k should have full row rank.
(ii) The matrix J_u^k should have full column rank.

The second of the above conditions implies that all the unmeasured variables should be observable. This is identical to the condition seen in the case of linear systems in Chapter 3, where it was shown that in order for all unmeasured variables to be observable, the columns of the constraint matrix corresponding to these variables must be linearly independent. Even if this condition is met, the first condition may not be satisfied by some processes, depending on which of the variables are measured (see Exercise 5-1). Thus, the above methods cannot be applied in general to all processes.

Exercise 5-1. Consider the flotation process described in Example 4-4. Generate the structure of the Jacobian corresponding to measured variables (this can be done analytically). Show that the row corresponding to the flow balance for node 4 in this matrix consists only of zeros, and hence prove that this matrix does not have full row rank.

An approach that can be used in general is based on the use of Crowe's projection matrix technique [8] to solve the linear data reconciliation problem in each iteration. The basic steps involved in this approach are as follows:

Step 1. Start with the measured values as initial estimates for the variables x, and initial estimates for u which are provided by the user.

Step 2. Evaluate the Jacobian matrices of the nonlinear constraints with respect to variables x and u at the current estimates.

Step 3. Compute the projection matrix P^k such that it satisfies

The projection matrix can be obtained using a QR factorization of the matrix J_u^k, as described in Chapter 3.

Step 4. Compute new estimates for x using

Step 5. Compute the new estimates for u through Equation 3-34, utilizing the QR factorization of matrix J_u^k.

Step 6. Stop if the new estimates are not significantly different from those obtained in the preceding iteration. Otherwise, repeat the procedure starting from Step 2 using these new estimates.
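A single linearized pass through Steps 3 to 5 can be sketched as follows. The Jacobians J_x, J_u, the linearization constant b of Equation 5-16, and all numbers are illustrative stand-ins, and SciPy's QR routine plays the role of the factorization from Chapter 3.

```python
import numpy as np
from scipy.linalg import qr

Jx = np.array([[1.0, 1.0, -1.0],
               [100.0, 250.0, -180.0]])   # Jacobian w.r.t. measured x
Ju = np.array([[0.0], [-15.5]])           # Jacobian w.r.t. unmeasured u
b = np.array([0.0, 0.0])                  # linearization constant (Eq. 5-16)
y = np.array([10.2, 4.7, 15.6])           # measured values
Sigma = np.diag([0.25, 0.09, 0.36])       # measurement variances

# Step 3: projection matrix P with P @ Ju = 0, from the full QR of Ju.
Q, R = qr(Ju)
r = np.linalg.matrix_rank(Ju)
P = Q[:, r:].T

# Step 4: solve the projected linear reconciliation problem for x
# (the linear-case formula of Chapter 3, with constraint matrix P @ Jx):
G = P @ Jx
x_hat = y - Sigma @ G.T @ np.linalg.solve(G @ Sigma @ G.T, G @ y - P @ b)

# Step 5: recover u from the remaining rows of the QR factorization:
Q1 = Q[:, :r]
u_hat = np.linalg.lstsq(Q1.T @ Ju, Q1.T @ (b - Jx @ x_hat), rcond=None)[0]
print(x_hat, u_hat)
```

Together, x_hat and u_hat satisfy the full linearized constraint J_x x + J_u u = b; in the iterative procedure the Jacobians would now be re-evaluated at these estimates and the pass repeated.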

Pai and Fisher [9] were the first to use a procedure similar to the algorithm described above. The additional modifications that their algorithm incorporates are:

(a) A Broyden update procedure [10] for updating the Jacobian matrices, rather than recalculating them at each iteration, in order to reduce the computational effort involved.

(b) A line search procedure after Step 5, based on a penalty function method, to compute the estimates to be used for starting the next iteration. The penalty function ||f(x,u)|| + α(y - x)^T Σ^-1 (y - x) was used, where α is an arbitrary number, 0 ≤ α ≤ 1.

The modification described in (a) can improve the computational efficiency of the algorithm and has been demonstrated for small problems. However, the modification in (b) is of questionable utility. This is due to the fact that the objective function of data reconciliation is quadratic, and the solution for the estimates obtained in Steps 4 and 5 is optimal, although they do not satisfy the nonlinear constraints. A line search procedure for modifying these estimates can improve feasibility with respect to the nonlinear constraints, but at the cost of sacrificing optimality. This may not lead to an overall reduction in the computational effort. The right choice for the parameter α, which plays the role of a subjective weight for the least-squares objective function, is also difficult to come up with. Pai and Fisher used α = 0.1.

Swartz [11] first recommended the use of QR factorization for separating the estimation of the measured and unmeasured variables at each iteration. If the problem is highly nonlinear and the size of the problem is large, this can become computationally inefficient. Ramamurthi and Bequette [12] reported an increase in the computational time with the noise level (magnitude of gross errors) in the measurements. Because of the successive linearization process, more iterations are usually required to converge a problem with large errors in the data. This procedure is very popular, however, for data reconciliation problems involving overall mass and energy balance equations.

In the preceding chapter, the reconciled values for the binary distillation column and the reconciled values reported in the last row of Table 4-6 were obtained using the successive linear data reconciliation algorithm described above. It can be observed from the results of Table 4-6 that the reconciled estimate of the zinc concentration in stream 8 has a high negative value, which is absurd. Thus, this method cannot be guaranteed to give feasible estimates in all cases.

NONLINEAR PROGRAMMING (NLP) METHODS FOR INEQUALITY CONSTRAINED RECONCILIATION PROBLEMS

The major limitation of the methods described in the preceding section is their inability to handle inequality constraints. In many situations, especially when significant gross errors exist, standard data reconciliation can cause the effect of the gross errors to spread over all estimates. If sufficient redundancy does not exist, the estimates for variables which have small values contain significant errors. In some cases, infeasible estimates such as negative values for flow rates or compositions are obtained. In order to tackle this problem, it is necessary to impose limits or bounds on the variables. These inequality constraints on measured and unmeasured variables take the form

x_min ≤ x ≤ x_max   (5-24)

u_min ≤ u ≤ u_max   (5-25)

In extreme cases, other types of feasibility constraints have to be imposed. For example, when data reconciliation is applied to heat exchanger networks for reconciling flows and temperatures, it is possible that the estimates violate thermodynamic feasibility, such as the temperature estimate of the hot stream being lower than the corresponding cold stream temperature estimate. In order to combat this, an inequality constraint should be imposed that forces the temperature of the hot stream to be greater than the corresponding cold stream temperature at both ends of every exchanger. These types of feasibility constraints can be cast in the form of Equation 5-9.

The solution to the nonlinear data reconciliation problem when constraints such as Equation 5-9 or 5-24 through 5-25 are imposed can be obtained by using general-purpose nonlinear programming techniques. A detailed description of such techniques is beyond the scope of this text, and the reader is referred to excellent books on this subject such as Gill et al. [13] and Edgar and Himmelblau [14]. We discuss below the broad features of two popular nonlinear programming techniques, especially with reference to their application for solving nonlinear data reconciliation problems.

Sequential Quadratic Programming (SQP)

The sequential quadratic programming technique [15, 16, 17] solves a nonlinear optimization problem by successively solving a series of quadratic programming problems. At each iteration, a quadratic program approximation of the general optimization problem is obtained by a quadratic function approximation of the objective function and a linear approximation of the constraints, both using Taylor series expansions around the current estimates. In the case of the data reconciliation problem defined by Equations 5-7 through 5-9, the objective function is already quadratic and only the constraints have to be linearized. The resulting quadratic program at iteration k+1 is formulated as



subject to

where

z is the vector of original variables (x, u);
s = z - z_k is the direction of search for iteration k+1;
∇O, ∇f_i, and ∇g_j are, respectively, the gradients (derivatives with respect to the variables z) of the objective function, equality constraint i, and inequality constraint j, all evaluated at the current estimate z_k;
B is the Hessian (the matrix of second-order derivatives of the objective function with respect to the variables z) evaluated at the current estimate z_k.

In the quadratic program formulation, all variables are included in the objective function. Comparing with the data reconciliation objective function, Equation 5-7, it can be deduced that

Note that the Hessian matrix is constant, and is also singular if there are unmeasured variables in the process.

The solution of the quadratic program gives the search direction for obtaining the estimates. A one-dimensional search is performed in direction s_k at each iteration k, so that the new value for z at the next iteration is

where α_k is a step-length parameter between 0 and 1. The step length is obtained by minimizing a penalty function (similar to the Lagrangian). The procedure is repeated using the new estimates until convergence.

There are several issues of particular interest in solving data reconciliation problems using SQP. Generally, in SQP the exact Hessian matrix (or its inverse) is not computed at each iteration because of the computational burden it entails. Instead, an approximate inverse of the Hessian matrix (or its square root) is obtained by a symmetric Broyden update technique. In the case of data reconciliation, Equation 5-29 shows that the Hessian matrix is constant and, therefore, there is no need to update it. Secondly, Equation 5-29 also shows that the Hessian matrix is positive semi-definite if unmeasured variables are present. Therefore, the QP solver that is used should be capable of handling positive semi-definite Hessian matrices. Thirdly, the solution obtained using the QP is used as a search direction, and the optimum step length in this direction is obtained by minimizing a penalty function. If the objective function contains nonlinear terms of higher order than quadratic, then this line minimization gives estimates which are more optimal and less infeasible. In the case of data reconciliation, however, since the objective function is quadratic, a step length of unity gives the optimal estimates that satisfy the linearized approximation of the constraints.

In this case, line minimization will improve the feasibility with respect to the nonlinear constraints by sacrificing optimality. It is debatable whether this can lead to an overall reduction in the number of iterations required for convergence. Thus, even though a general-purpose SQP technique can be used, by exploiting the special features discussed above it is possible to develop a more efficient tailor-made SQP technique for solving nonlinear data reconciliation problems.

Methods for solving quadratic programs are available in the HARWELL library, SQPHP [17], or QPSOL [18]. An efficient SQP solver, denoted as RND-SQP, has been recently developed by Vasantharajan and Biegler [19]. In this technique, a reduced QP is solved at each iteration by partitioning the variables into a dependent and an independent set, the number of dependent variables being equal to the number of equality constraints. Using the linearized equality constraints, the dependent variables are expressed in terms of the independent variables and are thus eliminated from the QP subproblem. Furthermore, the QP subproblem contains the inequality constraints only. The solution for the independent variables obtained by solving the reduced QP is used to obtain the solution for the dependent variables. This technique can also be adapted for data reconciliation to obtain a very efficient method.

Generalized Reduced Gradient (GRG)

The GRG optimization technique solves a nonlinear optimization problem essentially by solving a series of successive linear programming problems. At each iteration, a linear program (LP) approximation is obtained by linearizing the objective function and constraints. The LP subproblem is formulated as in Equations 5-26 through 5-28, with the difference that the second (quadratic) term in Equation 5-26 is not present.

The GRG technique differs from SQP in one fundamental aspect. At each iteration, the GRG method requires estimates which satisfy the nonlinear constraints, whereas in SQP the estimates at each iteration need not be feasible with respect to the nonlinear constraints. Instead of a line minimization as in SQP, the solution of the LP subproblem is adjusted using an iterative procedure such as Newton-Raphson in order to obtain estimates that satisfy the nonlinear constraints.

The LP subproblem is itself solved using a standard algorithm by partitioning the variables into dependent (basic) variables and independent (nonbasic) variables; the dependent variables are implicitly determined by the independent variables, making the objective function a function of only the nonbasic variables. The nonbasic variables are split further into superbasic variables, which lie between their bounds, and nonbasic variables which are at their bounds. A one-dimensional search is performed in the direction of the gradient of the superbasic variables (hence the term "reduced gradient"). Various commercial GRG algorithms differ in the methods they use to carry out the search and to regain a feasible point with respect to the nonlinear constraints [20, 21].

An interesting issue concerns the manner in which unmeasured variables are handled in both SQP and GRG. In both these approaches, no distinction is made between measured and unmeasured variables. A question worth considering is whether Crowe's projection matrix method for decoupling the estimation of measured and unmeasured variables can be gainfully exploited in each iteration of the nonlinear programming techniques. The answer is that Crowe's projection technique cannot be utilized for eliminating unmeasured variables if there are bounds on these variables, because the estimates for unmeasured variables obtained using this technique may violate the bounds. Furthermore, both reduced successive quadratic programming (RND-SQP) and GRG employ a projection technique to eliminate not just unmeasured variables but a set of dependent variables (equal in number to the equality constraints, which is usually more than the number of unmeasured variables). An analogy between this and Simpson's technique discussed in Chapter 4 may also be drawn.

Example 5-1

We illustrate the necessity of including bounds in data reconciliation for obtaining feasible estimates using the mineral flotation process considered in Example 4-4, for which the measured data are listed in the first row of Table 4-6. The last row in this table also gives the reconciled estimates obtained using a successive linear data reconciliation solution procedure along with Crowe's projection (the method of Pai and Fisher [9] discussed in the preceding section). Since this method cannot handle bounds, they were not imposed. These reconciled estimates show that an absurdly large negative value is obtained for the estimate of the zinc concentration in stream 8. The same problem was solved using SQP by imposing a lower bound of 0.1% and an upper bound of 100% on all concentrations, and a lower bound of 0 and an upper bound of 1 on all flows. The reconciled estimates obtained are shown in Table 5-1.

Table 5-1
Reconciled Data of Mineral Flotation Process Using SQP

Method  Var.     Stream 1   2        3        4        5        6        7        8
SQP     F        0.926      0.9157   0.8448   0.0733   0.08     0.0803   0.0111
        Cu, %    1.9012     0.4526   0.1316   0.1      20.267   21.163   6.5080   17.126
        Zn, %    4.5589     4.3728   1.4212   0.4101   6.9002   4.95     51.255   0.1

In this problem, we are able to obtain feasible estimates by including bounds in the nonlinear DR problem. The concentrations of Cu in stream 4 and Zn in stream 8 are at their lower bounds in the reconciled solution. By comparing these results with Table 4-6, we note that the reconciled estimates for all other variables are not significantly different.
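The effect of bounds can be reproduced on a toy problem (not the flotation data of Table 4-6): three measured mass fractions that must sum to one, with hypothetical measurements and standard deviations, solved with SciPy's SLSQP both without and with the bounds of Equation 5-24.

```python
import numpy as np
from scipy.optimize import minimize

y = np.array([0.02, 0.95, 0.08])                        # measured fractions
sigma = np.array([0.05, 0.02, 0.05])                    # assumed std. devs.
cons = {"type": "eq", "fun": lambda x: x.sum() - 1.0}   # normalization

def obj(x):
    r = (y - x) / sigma
    return r @ r

free = minimize(obj, y, method="SLSQP", constraints=cons)
bounded = minimize(obj, y, method="SLSQP", constraints=cons,
                   bounds=[(0.001, 1.0)] * 3)
print(free.x, bounded.x)
```

The unbounded solve drives the first fraction slightly negative (the analog of the zinc concentration in stream 8), while the bounded solve pins it at its lower bound and redistributes the adjustment over the remaining fractions.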



VARIABLE CLASSIFICATION FOR NONLINEAR DATA RECONCILIATION

In Chapter 3, we have reviewed the methods of observability and redundancy variable classification for linear data reconciliation. Many of those methods [11, 22-29] are also applicable to nonlinear data reconciliation (particularly for problems with bilinear constraints).

For problems with higher levels of nonlinearity, a usual procedure is to perform a model linearization first, and then apply the variable classification methods for linear models. Albuquerque and Biegler [30] recently described such a procedure. Although designed for variable classification in connection with dynamic data reconciliation problems, their method can be used for steady-state nonlinear data reconciliation problems as well. In this approach, an LU decomposition is used to build a projection matrix, in order to separate the unmeasured variables from the measured ones. The variable classification rules are very similar to those described by Swartz [11] with a QR decomposition algorithm. Since the LU decomposition is part of some NLP solution methods for data reconciliation, we briefly describe the Albuquerque and Biegler algorithm for variable classification here.

Equation 5-16, which describes the linearized model, can also be written in an abbreviated form such as:

where we have grouped all constant terms from the model linearization into a global constant vector c. We have also dropped the iteration subscript, by assuming that the linearization is performed about the final solution point. In order to eliminate the unmeasured variables u, a projection matrix P such that PJu = 0 should be constructed. Let us assume that an LU decomposition of matrix Ju is performed as follows:

where the pre- and post-multiplying factors are permutation matrices, L1 is a lower triangular matrix, U1 is an upper triangular matrix of rank r (the column rank of matrix Ju), and U2 is some rectangular matrix. If Ju is of full row rank, the zero rows in the upper triangular matrix would not exist. Furthermore, if Ju

is of full column rank, U2 would not exist and no unobservable variables would exist. A projection matrix P for matrix Ju can be created as follows:

Similar to the rules derived by Swartz [11] or Crowe [22], observability of an unmeasured variable requires a zero row of U1^-1 U2. Furthermore, the nonredundant measured variables will have zero columns in the matrix PJx. All other measured variables are declared redundant. It should be noted that these methods depend on the actual values of the measurements and may give rise to incorrect classification due to numerical problems.
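As a minimal numerical sketch of this classification, a projection matrix annihilating the unmeasured variables can be built from the left null space of Ju (here via an SVD-based routine rather than the LU factorization described above, which yields an equivalent projection). The two-flow example and all matrices below are illustrative assumptions, not taken from the text:

```python
import numpy as np
from scipy.linalg import null_space

def classify_variables(Jx, Ju, tol=1e-8):
    """Classify variables for the linearized model Jx @ x + Ju @ u = c."""
    # Rows of P span the left null space of Ju, so P annihilates the
    # unmeasured variables: P @ Ju = 0.
    P = null_space(Ju.T).T
    PJx = P @ Jx
    # A measured variable is redundant iff its column of P @ Jx is nonzero;
    # a zero column marks it as nonredundant.
    redundant = [j for j in range(Jx.shape[1])
                 if np.linalg.norm(PJx[:, j]) > tol]
    # All unmeasured variables are observable iff Ju has full column rank.
    all_observable = np.linalg.matrix_rank(Ju, tol=tol) == Ju.shape[1]
    return P, redundant, all_observable

# Two measured flows x1, x2 joined by one unmeasured flow u:
# constraints x1 - u = 0 and u - x2 = 0.
Jx = np.array([[1.0, 0.0], [0.0, -1.0]])
Ju = np.array([[-1.0], [1.0]])
P, redundant, all_observable = classify_variables(Jx, Ju)
```

Here both measured flows turn out redundant (each can be checked against the other through u), and the single unmeasured flow is observable.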

Alternatively, graph-theoretic theorems and algorithms have been developed by Kretsovalis and Mah [23, 24] for observability and redundancy classification in general processes, which are based on algebraic solvability of the constraint equations rather than an actual solution of the DR problem. However, this method has the limitation that it does not take into account all types of process units (such as flash units) in its analysis. Specialized graph-theoretic observability and redundancy classification algorithms for bilinear (multicomponent) processes have also been developed by other researchers, as indicated in Chapter 3.

Exercise 5-2. Prove the observability and redundancy rules for the LU decomposition approach described above.

Several issues regarding variable classification in connection with the SQP solution algorithm are important and will be briefly mentioned here. First, SQP requires initial estimates of all variables (measured and unmeasured). If there are unobservable variables, SQP will still be able to provide estimates for all variables. It uses the initial estimates of unmeasured variables (just a sufficient number to make all other unmeasured variables observable) as "specifications" and performs the data reconciliation.

Which of the unmeasured variables will be chosen as the specified ones is implicit in the numerical method of solution (basically when the choice of independent and dependent variables is made based on the



columns of the linearized constraint matrix at each iteration). The only way we will get to know which of the unobservable unmeasured variables have been chosen as specified is by examining the final results. If the reconciled estimate of an unmeasured variable is equal to the initial estimate provided by the user, then the unmeasured variable is unobservable. (Note that there may be other unobservable variables that may have been back-calculated based on the choice of specified variables, which we cannot figure out from the results.)

Similarly, a redundant measurement can be identified by examining the reconciled values. If the reconciled value of a measured variable is equal to the measured value, then the measurement is nonredundant. In some rare cases, the initial estimate and final reconciled estimate may be the same due to numerical values (small variance, etc.). Again, we may not be able to pinpoint the cause for zero adjustment precisely. Thus, the only reliable way to classify variables for observability and redundancy is through the comprehensive algorithms cited above.

The RND-SQP algorithm [19] automatically generates a reduced quadratic program by eliminating all equality constraints (mass, energy, and component balances) and an equal number of variables from the original problem. Therefore, this gives the smallest reconciliation problem. There is no need to identify redundant variables, because the RND-SQP uses an LP-type technique to separate the variables into dependent and independent variables and eliminates all dependent variables (a mixture of measured and unmeasured variables) from the problem to construct a reduced QP at each iteration. But if a redundancy analysis is required for sensor placement or other reasons, a separate redundancy analysis by one of the methods mentioned above can be performed.

COMPARISON OF NONLINEAR OPTIMIZATION STRATEGIES FOR DATA RECONCILIATION

Nonlinear programming codes are already commercially available, and they have proved to be numerically robust and reliable for large-scale industrial problems. They perform best when rigorous models are used [31, 32]. Nonlinear programming allows a complete formulation for data reconciliation, as described by Equations 5-7 through 5-9.

Tjoa and Biegler [33] have developed an efficient hybrid SQP method specifically tailored to solve nonlinear data reconciliation problems. The data reconciliation software package RAGE developed by Ravikumar et al. [34] also uses an SQP solver which has been specially adapted to

solve data reconciliation problems. Liebman and Edgar [35] compared the generalized reduced gradient method (the GRG2 version of Lasdon and Waren [21]) with the successive linearization (SL) data reconciliation solution method, and found that the NLP method was more robust at the expense of computational time.

While GRG2, a feasible path method, requires the convergence of the constraints at each iteration, SQP, an infeasible path method, satisfies the constraints only at the end when convergence is achieved. SQP and other infeasible path methods (such as MINOS, another generalized reduced gradient method developed by Murtagh and Saunders [36]) usually require less computational time than feasible path methods. Ramamurthi and Bequette [12] compared the SQP, GRG, and SL methods for data reconciliation purposes. Their findings are summarized as follows:

1. Successive linearization yields significant biases, particularly in the unmeasured variables, while the NLP approaches yield little bias in both measured and unmeasured estimates.

2. Computational time increases with the magnitude of the measurement error for SL, but not for SQP or GRG.

3. Computational time is a strong function of the desired accuracy for SL, but not for SQP or GRG.

4. NLP algorithms are more efficient and more robust for highly nonlinear problems. SQP is more efficient, while GRG is more reliable.
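As a small self-contained sketch of the SQP route, a nonlinear reconciliation problem can be posed directly to an off-the-shelf SQP solver such as scipy's SLSQP. The measurements, weights, bounds, and bilinear constraint below are illustrative assumptions, not values from the text:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative measurements and standard deviations for three variables.
y = np.array([2.1, 2.9, 6.5])
sigma = np.array([0.1, 0.1, 0.2])

def objective(x):
    # Weighted least-squares adjustment of the measurements.
    return np.sum(((x - y) / sigma) ** 2)

# Bilinear equality constraint x1*x2 - x3 = 0, standing in for a
# component balance; bounds enforce physical feasibility.
cons = [{"type": "eq", "fun": lambda x: x[0] * x[1] - x[2]}]
bounds = [(0.0, None)] * 3

res = minimize(objective, y, method="SLSQP",
               constraints=cons, bounds=bounds)
x_hat = res.x  # reconciled estimates satisfying the constraint
```

Starting the solver from the raw measurements, as done here, is the usual choice for reconciliation problems, since the measurements are already close to a feasible point.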


SUMMARY

• The constraints of a nonlinear data reconciliation problem can contain equality constraints (material balances, energy balances, equilibrium constraints, and property correlations) and inequality constraints (bounds, thermodynamic feasibility constraints).

• Nonlinear data reconciliation problems which contain only equality constraints can be solved using iterative techniques based on successive linearization and analytical solution of the linear data reconciliation problem.

• Nonlinear data reconciliation problems containing inequality constraints can be solved only using nonlinear constrained optimization techniques.

• The Generalized Reduced Gradient (GRG) and Successive Quadratic Programming (SQP) methods are two competitive nonlinear optimization techniques used for solving nonlinear data reconciliation problems.

• If bounds on unmeasured variables are imposed, then unmeasured variables should not be eliminated using Crowe's projection technique to obtain a reduced problem.

• It is necessary to impose bounds on variables in certain problems to obtain feasible estimates.

REFERENCES

1. MacDonald, R. J., and C. S. Howat. "Data Reconciliation and Parameter Estimation in Plant Performance Analysis." AIChE Journal 34 (no. 1, Jan. 1988): 1-8.

2. Britt, H. I., and R. H. Luecke. "The Estimation of Parameters in Nonlinear Implicit Models." Technometrics 15 (no. 2, 1973): 233-247.

3. Dennis, J. E., Jr., and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs, N.J.: Prentice-Hall, 1983.

4. Stephenson, G. R., and C. F. Shewchuck. "Reconciliation of Process Data with Process Simulation." AIChE Journal 32 (no. 2, Feb. 1986): 247-254.

5. Serth, R. W., C. M. Valero, and W. A. Heenan. "Detection of Gross Errors in Nonlinearly Constrained Data: A Case Study." Chem. Eng. Comm. 51 (1987): 89-104.

6. Madron, F. Process Plant Performance: Measurement and Data Processing for Optimization and Retrofits. Chichester, West Sussex, England: Ellis Horwood Limited, 1992.

7. Knepper, J. C., and J. W. Gorman. "Statistical Analysis of Constrained Data Sets." AIChE Journal 26 (no. 2, Mar. 1980): 260-264.

8. Crowe, C. M., Y. A. G. Campos, and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection. I: Linear Case." AIChE Journal 29 (no. 6, 1983): 881-888.

9. Pai, C. C. D., and G. Fisher. "Application of Broyden's Method to Reconciliation of Nonlinearly Constrained Data." AIChE Journal 34 (no. 5, 1988): 873-876.

10. Broyden, C. G. "A Class of Methods for Solving Nonlinear Simultaneous Equations." Math. Comp. 19 (1965): 577.

11. Swartz, C. L. E. "Data Reconciliation for Generalized Flowsheet Applications," presented at the Amer. Chem. Society National Meeting, Dallas, Tex., 1989.

12. Ramamurthi, Y., and B. W. Bequette. "Data Reconciliation of Systems with Unmeasured Variables Using Nonlinear Programming Techniques," presented at the AIChE Spring National Meeting, Orlando, Fla., 1990.

13. Gill, P. E., W. Murray, and M. H. Wright. Practical Optimization. London and New York: Academic Press, 1981.

14. Edgar, T. F., and D. M. Himmelblau. Optimization of Chemical Processes. New York: McGraw-Hill, 1988.

15. Han, S. P. "A Globally Convergent Method for Nonlinear Programming." J. Optimization Theory Applic. 22 (1977): 297.

16. Powell, M. J. D. "A Fast Algorithm for Nonlinearly Constrained Optimization Calculations." Dundee Conf. Numer. Analysis, 1977.

17. Chen, H.-S., and M. A. Stadtherr. "Enhancements of the Han-Powell Method for Successive Quadratic Programming." Computers Chem. Engng. 8 (no. 3/4, 1984): 229-234.

18. Gill, P. E., W. Murray, M. A. Saunders, and M. H. Wright. User's Guide for SOL/QPSOL: A Fortran Package for Quadratic Programming. Technical Report SOL 83-7, 1983.

19. Vasantharajan, S., and L. T. Biegler. "Large-Scale Decomposition for Successive Quadratic Programming." Computers Chem. Engng. 12 (no. 11, 1988): 1087-1101.


20. Abadie, J. "The GRG Method for Nonlinear Programming," in Design and Implementation of Optimization Software, H. Greenberg, Ed. Sijthoff and Noordhoff, Holland, 1978.

21. Lasdon, L. S., and A. D. Waren. "Generalized Reduced Gradient Software for Linearly and Nonlinearly Constrained Problems," in Design and Implementation of Optimization Software, H. Greenberg, Ed. Sijthoff and Noordhoff, Holland, 1978.

22. Crowe, C. M. "Observability and Redundancy of Process Data for Steady State Reconciliation." Chem. Eng. Science 44 (no. 12, 1989): 2909-2917.

23. Kretsovalis, A., and R. S. H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. I: Theorems." Computers Chem. Engng. 12 (1988): 671-688.

24. Kretsovalis, A., and R. S. H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. II: Algorithms." Computers Chem. Engng. 12 (1988): 689-703.

25. Meyer, M., B. Koehret, and M. Enjalbert. "Data Reconciliation on Multicomponent Network Process." Computers Chem. Engng. 17 (no. 8, 1993): 807-817.

26. Romagnoli, J., and G. Stephanopoulos. "On Rectification of Measurement Errors for Complex Chemical Plants." Chem. Eng. Science 35 (1980): 1067-1081.

27. Romagnoli, J., and G. Stephanopoulos. "A General Approach to Classify Operational Parameters and Rectify Measurement Errors for Complex Chemical Processes." Comp. Appl. to Chem. Engng. (1983): 153-174.

28. Sanchez, M., A. Bandoni, and J. Romagnoli. "PLADAT: A Package for Process Variable Classification and Plant Data Reconciliation." Computers Chem. Engng. 16 (Suppl., 1992): S499-S506.

29. Sanchez, M., and J. Romagnoli. "Use of Orthogonal Transformations in Data Classification-Reconciliation." Computers Chem. Engng. 20 (no. 5, 1996): 483-493.

30. Albuquerque, J. S., and L. T. Biegler. "Data Reconciliation and Gross Error Detection for Dynamic Systems." AIChE Journal 42 (no. 10, 1996): 2841-2856.

31. Nair, P., and C. Jordache. "On-line Reconciliation of Steady-State Process Plants Applying Rigorous Model-Based Reconciliation," presented at the AIChE Spring National Meeting, Orlando, Fla., 1990.

32. Nair, P., and C. Jordache. "Rigorous Data Reconciliation is Key to Optimal Operations." Control for the Process Industries, Vol. IV, no. 10, pp. 118-123. Chicago: Putman Publ., 1991.

33. Tjoa, I. B., and L. T. Biegler. "Simultaneous Solution and Optimization Strategies for Parameter Estimation of Differential-Algebraic Equation Systems." Ind. & Eng. Chem. Research 30 (no. 2, 1991): 376-385.

34. Ravikumar, V., S. R. Singh, M. O. Garg, and S. Narasimhan. "RAGE-A Software Tool for Data Reconciliation and Gross Error Detection," in Foundations of Computer-Aided Process Operations (edited by D. W. T. Rippin, J. C. Hale, and J. F. Davis). Amsterdam: CACHE/Elsevier, 1994, 429-435.

35. Liebman, M. J., and T. F. Edgar. "Data Reconciliation for Nonlinear Processes," presented at the AIChE Annual Meeting, Washington, D.C., 1988.

36. Murtagh, B. A., and M. A. Saunders. MINOS 5.0 User's Guide. Report SOL 83-20, Dept. of Operations Research, Stanford University, Calif., 1983.


Data Reconciliation in Dynamic Systems

THE NEED FOR DYNAMIC DATA RECONCILIATION

In the preceding chapters, data reconciliation has been applied to a single vector of measurements of process variables. This vector could be the measurements made at any time instant, corresponding to a single snapshot of the process. It is more likely, however, that steady-state data reconciliation is applied to a vector containing the average values of the measurements made over a period of time of, say, a few hours. This approach is satisfactory if the reconciled data is required for applications such as steady-state simulation, or on-line optimization where the optimal set points are calculated once every few hours.

If we consider applications such as regulatory control, which require accurate estimates of process variables frequently, then data reconciliation may have to be applied to measurements made at every sampling instant. In this case, it can no longer be assumed that the variables obey steady-state material and energy balance relationships. Storage capacities and transportation lags should be taken into account, and dynamic material and energy balances that relate the variables must be used.

Techniques for estimating process variables using measurements and dynamic relationships between the variables were developed long before the subject of data reconciliation was born. We discuss some of these important estimation techniques, along with recent advances in dynamic data reconciliation, under the broad umbrella of dynamic data reconciliation in this chapter. Since the area of dynamic data reconciliation is fairly

nascent and will continue to evolve, the intent of this chapter is only to introduce the reader to this topic.

Before we proceed further, it is useful to explicitly describe what we mean by a dynamic state of a process. Two features characterize a dynamic state of a process:

1. The true values of process variables change with time, and thus the measurements of these variables are also functions of time, even if we entertain the extreme possibility that measurement errors are absent.

2. Due to continuously changing inputs, the accumulation within a process unit also changes continuously and has to be taken into account.

The above features characterize both operation around a nominal steady state and process transients that take the process from one nominal steady state to another.

Different techniques are available for developing a dynamic model of a process. These techniques are described under model identification in several textbooks [1, 2]. We will consider only discrete time models, as opposed to continuous models, because we will be dealing with measurements made at discrete time instants which are conveniently treated using digital computers. Furthermore, we will consider state space models as opposed to input-output models, due to their inherent advantages. We will

begin our description with linear discrete system models before moving on to nonlinear systems.

LINEAR DISCRETE DYNAMIC SYSTEM MODEL

A linear, discrete, state-space dynamic model of a process is usually described by the following equations:

xk+1 = Ak xk + Bk uk + wk    (6-1)

yk = Hk xk + vk    (6-2)

where
xk : n x 1 vector of state variables
uk : p x 1 vector of manipulated inputs
wk : n x 1 vector of random disturbances
yk : m x 1 vector of measurements
vk : m x 1 vector of random errors in measurements

Page 81: 0884152553

The subscript k represents time instant t = kT when the variables are sampled or measured, T being the sampling period. The matrices Ak, Bk, and Hk are matrices of appropriate dimensions whose coefficients are known at all times. If the coefficients of these matrices do not change with time, then the resulting model is known as a linear time-invariant (LTI) system model. It is also customary to use deviation variables rather than actual variables in the model equations. Thus, the state variables, xk, represent the differences between the true values of the variables and their nominal steady-state values. Similarly, the variables uk and yk also represent deviation variables. In this chapter, we implicitly assume that all the variables are deviation variables.
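A model of this form is straightforward to simulate. The sketch below uses illustrative matrices and noise covariances (all numerical values are assumptions, not taken from the text) to generate state trajectories and noisy measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-state LTI model: x_{k+1} = A x_k + B u_k + w_k,
# y_k = H x_k + v_k. Only the first state is measured.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])
H = np.array([[1.0, 0.0]])
Qw = 0.01 * np.eye(2)    # covariance of the disturbances w_k
Rv = np.array([[0.04]])  # covariance of the measurement errors v_k

def simulate(n_steps, u):
    x = np.zeros(2)
    xs, ys = [], []
    for k in range(n_steps):
        # State evolution with process noise
        x = A @ x + B @ u[k] + rng.multivariate_normal(np.zeros(2), Qw)
        # Measurement model with measurement noise
        y = H @ x + rng.multivariate_normal(np.zeros(1), Rv)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

u = [np.array([1.0])] * 50   # constant manipulated input
xs, ys = simulate(50, u)
```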

Equation 6-1 describes the dynamic evolution of the state variables, while Equation 6-2 is the measurement model, which describes the relationship between the measurements and the state variables. The standard assumptions made about the random disturbances wk and the random errors vk are that they are normally distributed variables with statistical properties given by

Equations 6-3 and 6-4 imply that the random variables wk and vk have zero mean and covariance matrices given by Rk and Qk, respectively. Equations 6-5 and 6-6 imply that the disturbances at different times are not correlated, and similarly the measurement errors at different time instants are not correlated. Furthermore, Equation 6-7 stipulates that the disturbances and measurement errors are not cross-correlated.

The random errors in measurements, vk, arise due to several reasons, as

explained in Chapters 1 and 2. On the other hand, the causes of random disturbances, wk, in the state evolution equation can best be explained only if we consider a first principles model derived from the differential mass and energy balances of the process. In this case, random fluctuations in the process feed characteristics, such as its flow, temperature, pressure, and composition, can be modeled as disturbances.

Any randorn errors in the control inputs arising due to electrical noise in the transmission lines of the controller or due to imprecise actuator positioning can also be modeled as random disturbances. On the other hand, if an input-output model of the process is identified from the process data, then it may not be possible to separate the effects of random measurement errors and random disturbances. In this case, the differences between the model predictions and actual measurements can be attributed to the combined effect of measurement errors, process feed disturbances, and errors between the actual and computed manipulated inputs.

A linear system model of the form given by Equations 6-1 and 6-2 can be derived for any process from the differential equations that describe the mass and energy conservation relations of a process (also known as a first principles model). Alternatively, model identification techniques may be used for obtaining a dynamic model from the outputs or response of a process to given inputs. We illustrate the development of a first principles dynamic model for a simple level control process taken from Hellingham and Lees [3].

A simple level control process is shown in Figure 6-1, which has a feed (F1) and two outlets (F2 and F3). The valve V1 is kept open at a fixed position, while valve V2 is manipulated to control the level of the tank (instead of directly computing the new valve position, it is assumed that the adjustment, ak, to the valve position x is computed at each time). The tank level and the position of valve V2 are measured, denoted by measurements Z1 and Z2, respectively.

The differential equation describing the mass balance for this process is given by

The outlet flow rates are related to the tank level and valve positions by



Figure 6-1. Level control process.

Substituting the above relations in the mass balance equation, we get

If we assume a uniform sampling interval T between measurements and use the subscript k to represent the variables at sampling instant kT, then the discrete equivalent of the above differential equation can be obtained using the method described in [4].

where

In deriving the above discrete representation, it is implicitly assumed that the valve position x is constant at the value xk during the time interval kT to (k+1)T. It is also assumed that the random disturbance in the feed F1 is piecewise constant (of constant magnitude within each sampling interval, but the magnitude being random from interval to interval). If the adjustment to valve position ak (computed by the controller after measurements are made at sampling instant k) is implemented at the beginning of the next sampling interval, then the valve position at each sampling instant is given by

where ek+1 is the random error in positioning the valve.

Table 6-1 Parameter Values for Level Control Process

Parameter Value Units

Table 6-1 gives the values for the different constants used for this process. Using these values, the state-space model of the level control process is obtained as

where v1,k+1 and v2,k+1 are the random errors in the measurements of the level and valve position (in terms of volts), respectively.



The manipulated inputs uk are obtained using a control law, which is generally a function of the measurements when the variables that need to

be controlled are also measured. For simple proportional control, we can write the control law as

where ysp,k represents the deviation or change from the current operating set points, and is equal to 0 if there is no change from the current set points. In some cases, when it is difficult or expensive to measure the controlled variables, an inferential control strategy is used. The manipulated inputs in this case are a function of the state estimates. Even in the case when controlled variables are measured, it may be better to base the control law on estimates of these variables, since these are likely to be more accurate if the estimator is designed properly. As mentioned in the preceding section, a primary reason for dynamic data reconciliation is to derive estimates which can be used for better control. We therefore assume a control law of the form

where x̂k are estimates of the true values of the state variables and xsp,k are changes in the set points of the state variables from current set points. In order to achieve good control, it is therefore required to estimate the state variables as accurately as possible.

OPTIMAL STATE ESTIMATION USING KALMAN FILTER

We first deal with the problem of optimal estimation of state variables for a process that can be described by a linear model of the form given by Equations 6-1 and 6-2, and which satisfies the assumptions of Equations 6-3 through 6-7. We will also assume that the manipulated inputs at each time are known constant values, and ignore for the moment the fact that these are functions of the state estimates. (We will address this issue in a subsequent section.) The optimal linear state estimator, called the Kalman filter, which we describe in this section, can be derived using different theoretical formulations, and an excellent treatment may be found in Sage and Melsa [5]. We use a least squares formulation approach because it helps us to readily compare this with data reconciliation.

Initial estimates of state variables are assumed to be available which possess the following statistical properties:

Given a set of measurements, Yk = (y1, y2, . . ., yk), it is desired to obtain estimates of the state variables xk which are best in some sense. We will denote these estimates using the notation x̂k|k, which are interpreted as the state estimates at time k obtained using all measurements from time t = 1 to time t = k. It should be noted that by using all the measurements from the initial time to derive the estimates, we are automatically exploiting temporal redundancy in the measured data.

The estimation problem that we are considering here is a special case of a more general problem in which it is desired to obtain estimates of the state variables xj for time j, using all measurements made from the initial time to time k. The estimates so derived are denoted as x̂j|k. The estimation problem is referred to as a prediction problem if j > k, a filtering problem if j = k, and a smoothing problem if j < k. Here we are concerned with the filtering problem.

Due to the presence of random disturbances in Equation 6-1, the true values of the state variables at every time instant are themselves random variables. Therefore, a probabilistic measure has to be used for determining the best estimates of the state variables. The best estimates of the state variables at time k are obtained by minimizing the following function:

Equation 6-12 is the expected sum of squares of the differences between the estimates and true values of the state variables, and is thus an extension of the well-known deterministic least squares objective function. The solution to this problem was first obtained by Kalman [6, 7] in a convenient recursive form and is generally now referred to as the Kalman filter. The Kalman filter equations are given by

where



The matrices Pk|k and Pk|k-1 are the covariance matrices of the estimates x̂k|k and x̂k|k-1, respectively. Starting with initial estimates x̂0 and P0, the above equations can be applied in the reverse order of Equations 6-16 to 6-13 to obtain the state estimates at each time k. The derivation of the Kalman filter equations is described clearly in Gelb [8] and Sage and Melsa [5]. The book edited by Sorenson [9] contains several papers on Kalman filtering and its applications, including the original papers by Kalman.
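The recursion can be sketched in a generic textbook form as a single predictor-corrector step (this is the standard Kalman filter, not code from this book; following the common convention, Q below denotes the disturbance covariance and R the measurement error covariance, which may differ from the text's naming):

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, u, A, B, H, Q, R):
    """One predictor-corrector step of the standard Kalman filter."""
    # Prediction: propagate the previous estimate through the model.
    x_pred = A @ x_prev + B @ u              # x_{k|k-1}
    P_pred = A @ P_prev @ A.T + Q            # P_{k|k-1}
    # Correction: gain times the innovation y - H x_{k|k-1}.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman filter gain
    x_filt = x_pred + K @ (y - H @ x_pred)   # x_{k|k}
    P_filt = (np.eye(len(x_prev)) - K @ H) @ P_pred
    return x_filt, P_filt
```

Starting from x̂0 and P0, the step is applied once per sampling instant, so the effort spent at time k-1 is reused directly at time k.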

Exercise 6-1. Derive the Kalman filter equations by minimizing Equation 6-12 for the process model of Equations 6-1 and 6-2 and utilizing the statistical properties of Equations 6-3 through 6-7, 6-10, and 6-11.

The recursive form of the estimator equations considerably reduces the computational effort involved in obtaining the estimates. It can be observed from these equations that the effort spent in obtaining the state estimates at one time is effectively utilized to obtain the estimates at the next time instant. Equation 6-13 can also be interpreted as a predictor-corrector method for obtaining the estimates. The estimates x̂k|k-1 are the predicted estimates of the state variables at time k based on all the measurements until time k-1.

The second term in this equation is the correction to this estimate based on the measurement at time k. The matrix Kk is known as the Kalman filter gain, and the difference (yk - Hk x̂k|k-1) is known as the innovation. The innovations are equivalent to the measurement residuals of steady-state processes which were defined in Chapter 3, and play a crucial role in gross error detection, as will be seen in Chapter 9. The Kalman filter estimates possess the desirable statistical property of being unbiased, that is,

and also have the minimum variance among all unbiased estimators. Furthermore, it can also be shown that the Kalman filter estimates are the maximum-likelihood estimates (Sage and Melsa [5]). For a linear time-invariant process, the Kalman gain matrix becomes constant after some time; this constant gain is known as the steady-state Kalman gain.


Exercise 6-2. Prove that the Kalma~z filter estimates are unbiased.

Applications of the Kalman filter in chemical engineering have been discussed by several authors. Fisher and Seborg [10] have applied Kalman filtering to a pilot scale multiple effect evaporator in the context of investigating various types of control strategies. Stanley and Mah [11] applied it to a subsection of a refinery for estimating flows and temperatures. The dynamic model used in this application is a heuristic random walk model for the state variables, which is appropriate for describing processes that operate for long periods around a nominal steady state with occasional slow transitions to a new nominal steady state. The state variables were also forced to satisfy the steady-state material and energy balances. Through this approach, they attempted to exploit both spatial and temporal redundancy in the data for reconciliation purposes.

Makni et al. [12] recently used a similar technique for estimating the flows and concentrations of a mineral beneficiation circuit. A first-order, identified-transfer-function model was used to describe the dynamics, and

the steady-state material balances were also imposed, although the estimates were expected to satisfy them in a minimum least-squares sense.

Example 6-2

We illustrate the application of the Kalman filter for obtaining optimal estimates for the level control process described in Example 6-1. In this process, the fluctuations in the feed flow rate and the random error in positioning of the control valve are taken as state disturbances. The standard deviations of these random disturbances are assumed to be 250 cm3/min and 0.05, respectively. The standard deviations of the errors in measurements of level and valve position are taken as 0.01 volts each. Based on the state space model derived in Example 6-1, the measurements corresponding to


the closed loop behavior of the process are simulated. The control law used for this simulation is based on observations and is given by

where Z1,k and Z2,k are the measured values of the level and valve position (in cm), obtained by dividing the actual measurements in volts by 0.631 and 1.57, respectively (see Example 6-1). A Kalman filter is used to estimate the states using the steady-state Kalman gain, since we have used a time-invariant model. The steady-state Kalman gain, obtained by solving a matrix Riccati equation [9], is obtained as

Figure 6-2 shows the true, measured and estimated values of the level. It can be observed from Figure 6-2 that the estimates are closer to the true values as compared to the measurements. The measurement error variance, calculated from the sample data over the time period of 200 seconds, is 0.0234 cm2, whereas the variance of the error in the estimate is 0.0023 cm2. The variance of the difference between the true values and the set point (which is 0 in this case) is an indicator of the control performance. For this case it was computed to be 0.0068 cm2. The variance of the valve position required to achieve this control is 0.1057 cm2.

Figure 6-2. Measured and estimated states for level control process using control law based on measurements.
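The steady-state Kalman gain mentioned above can be obtained by simply iterating the discrete Riccati recursion until the prediction covariance converges. A minimal sketch, using an assumed two-state system for illustration (the matrices of the level process in Example 6-1 are not reproduced here):

```python
import numpy as np

def steady_state_kalman_gain(A, H, Q, R, iters=500):
    """Iterate the discrete Riccati recursion until the prediction
    covariance P converges, then return the steady-state gain K."""
    n = A.shape[0]
    P = np.eye(n)  # any positive definite initialization works
    for _ in range(iters):
        # measurement update followed by time update
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        P_upd = (np.eye(n) - K @ H) @ P
        P = A @ P_upd @ A.T + Q
    return K

# illustrative (assumed) system, not the book's level process parameters
A = np.array([[0.95, 0.1], [0.0, 0.9]])
H = np.eye(2)
Q = 0.01 * np.eye(2)   # state disturbance covariance
R = 0.04 * np.eye(2)   # measurement error covariance
K = steady_state_kalman_gain(A, H, Q, R)
```

Because the recursion is run to convergence, the returned gain no longer depends on the initialization of P.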

Analogy Between Kalman Filtering and Steady-State Data Reconciliation

Data reconciliation techniques were developed primarily for steady-state processes, whereas the Kalman filter was developed independently for linear dynamic processes. Both these techniques can be derived using a weighted least-squares estimation procedure. In order to bring out the link between these two approaches, we prove that steady-state data reconciliation can be regarded as a special case of a Kalman filter. It can be recalled from Chapter 3 that for steady-state processes, the material and energy conservation relations are written as algebraic constraints. The differential dynamic form of these conservation relations can also be used to derive a discrete linear state-space model of the form of Equation 6-1, as shown in Example 6-1. As a special case, we can consider a disturbance-free form of this equation by setting wk to be identically equal to zero for all time to get:

Let us define a new state vector x composed of both xk and xk-1 (where we have deliberately chosen to omit the time index k for ready comparison with steady-state reconciliation). Equation 6-19 can be rewritten as

where

If all the state variables are assumed to be measured, then Equation 6-2 can be written as


Let us also assume that we have unbiased estimates x̂k-1|k-1 of the state variables xk-1 obtained from the preceding time instant and that their covariance matrix is Pk-1|k-1. These estimates can be treated as additional measurements and can be written as

where εk-1 are the random errors in the estimate of xk-1 with zero mean and covariance matrix Pk-1|k-1. From the assumed properties of vk, we can easily prove that they are not correlated with εk-1. Combining Equations 6-24 and 6-25 we can write the modified measurement model as

where

where v is the vector of random errors with zero mean and covariance matrix C defined by

Equations 6-26 and 6-20 are similar to the steady-state measurement and constraint models, Equations 3-1 and 3-2. Applying the steady-state reconciliation solution to this model, we can obtain the estimates of x using Equation 3-5. Substituting for the different variables as defined by Equations 6-21 through 6-23, 6-27 and 6-28 in this solution we get

Considering only the estimates of xk in Equation 6-30 and using the predicted estimates defined by Equation 6-14, we obtain

The estimates given by Equation 6-31 can be shown to be identical to the Kalman filter estimates given by Equation 6-13 with Rk = 0 and Hk = I, as in the case of the simplified model considered in this section.

Since the Kalman filter also accounts for random disturbances in the process model, it may be regarded as an extension of the linear steady-state reconciliation technique to dynamic processes. An interesting by-product of the analysis carried out in this section is that it can also be proved that the estimates x̂k-1 given in Equation 6-30 are identical to the optimal smoothed estimates x̂k-1|k of the simplified model considered in this section.

Linear dynamic data reconciliation techniques have been applied for estimating the flows of a process by Darouach and Zasadzinski [13] and Rollins and Devanathan [14]. These authors have converted the linear differential equations to algebraic equations by replacing the derivative by a forward difference. The problem can now be solved using linear data reconciliation solution techniques similar to the procedure discussed in this section.

Since the problem dimension increases with time, efficient techniques for obtaining the estimates have been proposed by these authors. Bagajewicz and Jiang [15] also considered the problem of dynamic data reconciliation of process flows and tank holdups. Assuming the flows and tank holdups to be polynomial functions of time, these authors convert the differential equations into algebraic equations. Using a window of measurements, the coefficients of the polynomials are estimated.

Optimal Control and Kalman Filtering

Let us first consider the optimal control problem for a deterministic linear process which evolves according to Equation 6-1, but without any state disturbances. Let us also assume that the state variables are directly measured without any errors (that is, the true values of the state variables are available). We wish to determine the optimal values of the manipulated inputs which minimize the performance index

min_{u_i}  J = sum_{i=1}^{n} (x_i' E_i x_i + u_i' F_i u_i)    (6-32)

where Ei and Fi are specified weight matrices (Ei are assumed to be nonnegative definite symmetric matrices and Fi are assumed to be positive definite symmetric matrices). The first term in Equation 6-32 attempts to keep the state variables (deviations of the state variables from their current set points) on their target value of zero, while the second term penalizes large values of the manipulated inputs. The weight matrices can be chosen based on the relative importance of the state variables and manipulated inputs.

The solution to the above problem [16] leads to a linear control law of the form

where the gain matrix is dependent on the weight matrices used in Equation 6-32 and the system matrices.
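These gains can be generated by the standard backward Riccati recursion of dynamic programming. A sketch under assumed data (a double-integrator system, not an example from the text), with the gain written as L_k so that u_k = -L_k x_k:

```python
import numpy as np

def lqr_gains(A, B, E, F, n_steps):
    """Backward Riccati recursion for the finite-horizon problem of
    Equation 6-32; returns the feedback gains L_k with u_k = -L_k x_k."""
    S = E.copy()                       # terminal cost weight
    gains = []
    for _ in range(n_steps):
        L = np.linalg.solve(F + B.T @ S @ B, B.T @ S @ A)
        S = E + A.T @ S @ (A - B @ L)
        gains.append(L)
    gains.reverse()                    # gains ordered k = 0 .. n-1
    return gains

# assumed double-integrator example
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
E = np.eye(2)           # state weight (nonnegative definite)
F = np.array([[1.0]])   # input weight (positive definite)
L = lqr_gains(A, B, E, F, 50)[0]
```

For a long horizon the early gains converge to the constant steady-state gain, and the closed-loop matrix A - B L is stable.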

Let us now consider the optimal control problem for a linear stochastic system which evolves according to Equation 6-1. In this case, the performance index for the optimal control problem can be written as

The optimal values of the manipulated inputs, which minimize 6-34, can be obtained as follows:

(1) Compute x̂k|k-1, the predicted estimates of the state variables at time k, using the Kalman filter equations by treating the manipulated inputs prior to time k as known deterministic inputs.

(2) Compute the manipulated inputs at time k by using the estimates x̂k|k-1 instead of xk in Equation 6-33.

Despite the fact that the manipulated inputs are themselves functions of the state estimates, it is assumed that they are known deterministic inputs when deriving the state estimates using the Kalman filter. On the other hand, the optimal control law has been obtained for a deterministic system for which the true values of the state variables are assumed to be available, but it is used for the stochastic system also. Essentially, this implies that the optimal estimation problem and the optimal control problem have been separated. The proof that this procedure gives the optimal manipulated inputs for minimizing 6-34 follows from the Separation Theorem or Certainty Equivalence Principle [16].

Example 6-3

In order to investigate the effect of using a control law based on estimated states, the level control process described in the preceding two examples was simulated using a control law similar to that used in Example 6-2. The control law, however, was based on the estimates of the level and valve position obtained using a Kalman filter. The control law for this case is given by

The true, measured and estimated values of the level for this case are shown in Figure 6-3. We can compare the true values obtained in this case with those obtained in Example 6-2 and find that there is a marginal improvement in control performance. The variance of the error between the true values and set point is 0.0065 cm2, which is marginally lower than that obtained when the control law is based on measured values. However, the variance in the valve position in this case is only 0.0523 cm2, which is about half of that obtained in the preceding example. This implies that we are able to achieve as good control as before with less change in the manipulated variable. This is due to the fact that the estimated states are more accurate than the measured values.

Kalman Filter Implementation

Figure 6-3. Measured and estimated states for level control process for control law based on estimates.

The matrices Pk|k-1 and Pk|k occurring in the Kalman filter equations, by virtue of being covariance matrices, should normally be nonnegative definite, or in other words their eigenvalues should be nonnegative. If this is ensured, then the Kalman filter will be stable. However, if the Kalman filter is implemented as given by Equations 6-13 through 6-17, these matrices tend to lose their nonnegative definiteness character and the estimates tend to diverge, due to numerical inaccuracies in the computation.

A form known as the square-root covariance filter can be used to implement the Kalman filter. Equation 6-16, which is used to obtain Pk|k-1, preserves the symmetry and positive definiteness character of this matrix, but Equation 6-17, used for obtaining the updated covariance matrix, can cause numerical problems since it involves the inversion of a matrix in the computation of the filter gain matrix. It is this equation which is recast in terms of square roots of the covariance matrices.

Furthermore, the computational efficiency is also increased by processing the measurements in a sequential manner rather than simultaneously, thus avoiding the need to compute the inverse of a matrix in Equation 6-15. We describe the steps involved in the implementation and refer the reader to Bagchi [17] and Borrie [18] for a detailed derivation of the algorithm.

Step 1. Starting with estimates x̂k-1|k-1 and Pk-1|k-1, apply Equations 6-14 and 6-16 to compute the one-step-ahead predictions x̂k|k-1 and Pk|k-1.

Step 2. Obtain the square roots Sk|k-1 and Ck of the covariance matrices Pk|k-1 and Qk, respectively, defined by

The square roots of the matrices can be obtained using Cholesky factorization [19].

Step 3. Compute the transformed measurements and transformed measurement matrix defined by

Since Ck is upper triangular, the transformed measurements and measurement matrix can be computed without the explicit need to invert Ck.

Step 4. Since the transformed measurements are not correlated (see Exercise 6-4), they can be processed sequentially using the following equations:

where Sk|k,i is the square root of the updated covariance matrix Pk|k after processing the first i measurements, and h'k,i is row i of the transformed measurement matrix H'k.

We initialize the computations of this step using


Thus, after all n measurements are processed, the updated covariance matrix is obtained as

Pk|k = Sk|k,n Sk|k,n'    (6-43)

Although the Kalman gain matrix is not explicitly computed in the above sequential procedure, it can be computed if necessary using

where Kk,i is column i of the gain matrix.
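The benefit of sequential processing can be checked numerically: when the measurement error covariance Qk is diagonal (uncorrelated errors), processing the measurements one row at a time reproduces the batch update without any matrix inversion. A sketch in ordinary covariance form (the square-root bookkeeping of Steps 2 through 4 is omitted for brevity), with assumed numerical values:

```python
import numpy as np

def batch_update(xp, Pp, H, Q, z):
    """Standard batch measurement update of the Kalman filter."""
    K = Pp @ H.T @ np.linalg.inv(H @ Pp @ H.T + Q)
    return xp + K @ (z - H @ xp), (np.eye(len(xp)) - K @ H) @ Pp

def sequential_update(xp, Pp, H, Q, z):
    """Process one scalar measurement at a time; valid when Q is
    diagonal, so only scalar divisions are needed (no inversion)."""
    x, P = xp.copy(), Pp.copy()
    for i in range(len(z)):
        h = H[i:i + 1, :]                     # row i as a 1 x n matrix
        s = float(h @ P @ h.T) + Q[i, i]      # scalar innovation variance
        k = (P @ h.T) / s                     # n x 1 gain column
        x = x + k.ravel() * (z[i] - float(h @ x))
        P = P - k @ (h @ P)
    return x, P

# assumed predicted estimates, covariances, and measurements
xp = np.array([1.0, -0.5])
Pp = np.array([[0.5, 0.1], [0.1, 0.3]])
H = np.eye(2)
Q = np.diag([0.04, 0.09])
z = np.array([1.2, -0.3])
xb, Pb = batch_update(xp, Pp, H, Q, z)
xs, Ps = sequential_update(xp, Pp, H, Q, z)
```

The two routines give identical updated estimates and covariances; the gain columns accumulated in the loop correspond to the columns Kk,i of Equation 6-44.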

DYNAMIC DATA RECONCILIATION OF NONLINEAR SYSTEMS

Nonlinear State Estimation

The treatment of nonlinear processes presents several difficulties which are not encountered in linear systems. First, it is generally not possible to analytically obtain a discrete form representation of the process analogous to Equation 6-1 starting with a set of nonlinear differential equations describing the process. Secondly, it is mathematically difficult to treat random noise if the state transition equations or measurement equations are nonlinear functions of the noise (see Borrie [18] for a detailed explanation).

Thus, the effect of noise in a nonlinear process is modeled as a linear additive term. Thirdly, even if the random noises are assumed to be normally distributed, neither the state variables nor the measurements follow a Gaussian distribution, due to the nonlinearity of the equations. Thus, a probabilistic framework can be used only under some approximations (see Jazwinski [20] for a more complete treatment). A least-squares formulation, however, can always be used to derive the estimates.

Under the above limitations, the evolution of the state variables for a general nonlinear process is modeled by the following differential equation:

where w(t) is a white noise process with mean function zero and covariance matrix function R(t)δ(t-τ), where δ(t-τ) is the Dirac delta function.

The variables are assumed to be sampled at discrete times t = kT, and the relation between the measurements and state variables is represented as

where vk are the random measurement errors which are assumed to be Gaussian with mean zero and covariance matrix Qk. As in the linear case, we assume that w(t) and vk are not correlated with each other. Equations 6-45 and 6-47 describe a nonlinear continuous stochastic process with discrete measurements.

Based on a linear approximation of Equations 6-45 and 6-47 at each time around the current state estimates, an extended Kalman filter can be used to obtain the state estimates recursively using the following equations, which are analogous to Equations 6-45 and 6-47:

where


Equations 6-49 and 6-51 are nonlinear differential equations that have to be numerically integrated to obtain the predicted estimates of the state variables and the predicted covariance matrix of estimates. Equation 6-51, which involves the solution of n2 coupled differential equations, can be computationally demanding. These computations can be avoided by computing a state transition matrix Ak at each time, based on a linear approximation of the nonlinear functions, and assuming it to be constant during each sampling period (Wishner et al. [21]). With this additional approximation, Equation 6-16 can be used to obtain the predicted covariance matrix. The method described here represents one of many different approaches for developing recursive estimation techniques, and these are described in Muske and Edgar [22].
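For a scalar state, the covariance prediction of Equation 6-51 reduces to a single differential equation, and the whole continuous-discrete recursion can be sketched compactly. The dynamics, noise levels, and tuning below are assumed for illustration, and simple Euler steps stand in for a proper integrator:

```python
import numpy as np

def f(x):                  # assumed nonlinear dynamics (illustrative only)
    return -0.5 * x - 0.1 * x**3

def jac(x):                # derivative of f, used for local linearization
    return -0.5 - 0.3 * x**2

def ekf(z_seq, x0, p0, q, r, T):
    """Continuous-discrete extended Kalman filter for a scalar state:
    Euler integration of the state and covariance predictions, with f
    linearized at the current estimate, then a scalar update z = x + v."""
    x, p, est = x0, p0, []
    for z in z_seq:
        a = jac(x)                      # local linearization A_k
        x = x + T * f(x)                # predicted state
        p = p + T * (2.0 * a * p + q)   # predicted covariance
        k = p / (p + r)                 # Kalman gain
        x = x + k * (z - x)             # updated state
        p = (1.0 - k) * p               # updated covariance
        est.append(x)
    return np.array(est)

# simulate a noise-free true trajectory and noisy measurements of it
rng = np.random.default_rng(2)
T, n = 0.1, 100
truth = np.empty(n)
x = 2.0
for i in range(n):
    x = x + T * f(x)
    truth[i] = x
meas = truth + rng.normal(0.0, 0.2, n)
est = ekf(meas, x0=1.5, p0=1.0, q=0.001, r=0.04, T=T)
```

Even with a deliberately wrong initial guess, the filtered estimates track the true trajectory far more closely than the raw measurements do.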

Example 6-4

A continuous stirred tank reactor (CSTR) with external heat exchange [23], in which a first-order exothermic reaction (decomposition of a reactant A) occurs, is used to illustrate the application of state estimation for a nonlinear process. The differential equations describing the change in concentration (of reactant A) and temperature in the reactor are given by

where A0 and T0 are the feed concentration and temperature, respectively, and A, T are the reactor concentration and temperature, respectively. The concentration and temperature variables are scaled using appropriate reference factors. The reaction rate constant is given by

Table 6-2 Parameter Values for CSTR

Parameter Value Units

The values of all parameters are listed in Table 6-2. Corresponding to these values, it can be verified that the steady-state reactor concentration is 0.1531 and the steady-state reactor temperature is 4.6091.

It is assumed that for this process, the reactor concentration and temperature are measured using a sampling period of 2.5 s, and that the standard deviations of random errors in these measurements are 0.0077 and 0.2305 (5% of the steady-state values), respectively. The open-loop response of this process is simulated for a step change in the feed concentration from 6.5 to 7.5, and an extended Kalman filter is used to estimate the reactor concentration and temperature. In the implementation of the extended Kalman filter, the predicted state estimates and predicted covariance matrix of estimation errors at each sampling instant are obtained by integrating the differential equations (Equations 6-49 and 6-51) using a 4th-order Runge-Kutta method.

The true, measured and estimated values of reactor concentration and temperature, respectively, are shown in Figures 6-4 and 6-5. It can be observed that the estimated values are very close to the true values (in the figures, the estimated values almost coincide with the true values). The variances of errors in the measurements of reactor concentration and temperature, calculated from the sample data over the time period of 250 s, are 5.9 x 10-5 and 0.0534, respectively. In comparison, the variances of errors in the estimated concentration and temperature are only 1.36 x 10-5 and 2.52 x 10-3, respectively.




Figure 6-4. Estimated concentration of CSTR using extended Kalman filter.


Figure 6-5. Estimated temperature of CSTR using extended Kalman filter.

Nonlinear Data Reconciliation Methods

It was demonstrated earlier that the Kalman filter is equivalent to data reconciliation if we assume that the state transition equations are not corrupted by noise or random disturbances. A similar progression from nonlinear filtering to data reconciliation can be made by neglecting the random noise term in Equation 6-45. There are other key differences, however, in the formulation and solution of nonlinear dynamic data reconciliation problems as compared to nonlinear filtering problems. Liebman et al. [23] and later Ramamurthi et al. [24] formulated the nonlinear dynamic data reconciliation (NDDR) problem and also proposed solution strategies. We first start with a general statement of the problem as posed by Liebman et al. [23] before discussing these solution techniques.

The NDDR problem may be formulated as

min  sum_{j=t0}^{t} [ (y_j - x̂_j)' Q^{-1} (y_j - x̂_j) + (u_cj - û_j)' U^{-1} (u_cj - û_j) ]    (6-55)

subject to

ẋ = f(x),  x(t0) = x0    (6-56)

h(x, u) = 0    (6-57)

g(x, u) ≤ 0    (6-58)

There are several features of the above formulation that need elaboration. Firstly, the manipulated input variables u are included as part of the objective function and are estimated at each time step, although they are assumed to be constant within each sampling period. The computed values of the manipulated inputs, ucj at each time j, using Equation 6-9 or any other control law, are different from the actual manipulated inputs to the process due to inherent errors in the actuators. Thus, the computed values of the manipulated inputs serve as measurements, and the true values of these variables have to be estimated.

This formulation is more general as compared to the model used in filtering, where the manipulated inputs are assumed to be known exactly. Secondly, the state variables are assumed to be directly measured (or equivalently, the matrix Hk is assumed to be an identity matrix). This does not impose any limitation because, by using a simple transformation, the problem can still be formulated as above.

If a measurement is a nonlinear function of the state variables, then we can introduce a new artificial state variable corresponding to this measurement, and the nonlinear relation between the artificial state variable and the actual state variables can be included as part of the equality constraints of Equation 6-57. This transformation is similar to the treatment of indirectly measured variables in steady-state data reconciliation (see Chapter 7). Lastly, the inequality constraints of Equation 6-58 allow bounds on state variables and other feasibility constraints to be included as well. It should be noted that filtering methods cannot handle inequality constraints and can therefore give rise to infeasible estimates, much like linear steady-state reconciliation methods. Thus, the formulation given by Equations 6-55 through 6-58 is extremely general and practically useful.

The general formulation of the NDDR problem comes with a price. It is no longer possible to develop a recursive solution technique as in filtering. Furthermore, a close look at the objective function reveals that all the state variables from the initial time up to the current time are being simultaneously estimated at each sampling instant. This leads to an ever-growing increase in the number of variables with time that have to be estimated, which is not practically acceptable.

In order to reduce the computational burden, a moving window approach was adopted [23, 24]. In this approach, at each time t only a window of measurements from time t-N to time t is used to estimate all the state variables within this time window of size N. The objective function to be minimized is the weighted sum of squared differences between the measurements and state estimates within this time window. The estimates obtained for the state variables at time t from this optimization are used to compute the manipulated inputs. The procedure is repeated at the next sampling instant, giving rise to the term "moving window."

The solution strategy used to solve the estimation problem at each step of the moving window approach requires some explanation, due to the presence of the nonlinear differential equations, Equation 6-56, along with the algebraic equations, Equations 6-57 and 6-58. Liebman et al. [23] converted the differential equations into algebraic equality constraints by discretizing them using orthogonal collocation. In this technique, the state variable functions (of time) within each sampling period are expressed as a weighted sum of the state variable values at different time instants within this sampling period, representing the collocation nodal points. The weights used in this representation are orthogonal polynomials. Although a sampling period can be subdivided into several elements, for convenience one element is used per sampling interval. With this choice, the state variable functions within each sampling interval j can be written as

where li(t) are orthogonal basis polynomials, nc is the order, and xij are the state variable values at the ith collocation point in sampling interval j. The end points of this interval, 1 and nc, correspond to the sampling instants. Using Equation 6-59, the derivatives can also be expressed in terms of the state variable values at different time instants. Equations 6-56 can now be forced to be satisfied at all the collocation points, resulting in the following algebraic equations for each sampling interval j.

where xj is the vector of all state variables at all collocation points in sampling interval j. Equation 6-60 can be written for each of the N sampling intervals in the chosen window, with the additional stipulation that the variable values at the end of a sampling interval are equal to those at the beginning of the next interval. A nonlinear optimization technique such as GRG or SQP, discussed in Chapter 5, can be used to minimize 6-55 subject to 6-57, 6-58, and 6-60. It should be noted that the number of variables in this optimization problem is more than the number of state and input variables at the N sampling instants within the window, since we are also simultaneously estimating the state variables at the collocation points within each interval. More details on the type of orthogonal polynomial used, the size of the problem, and the structure of the derivative matrix D are available in Liebman et al. [23].
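The central computational object in this discretization is the derivative matrix D, whose entries are the derivatives of the Lagrange basis polynomials at the collocation nodes; with it, the differential equation becomes the algebraic system D x = f(x) at the nodes. A sketch with assumed node locations (not the orthogonal collocation points of Liebman et al.):

```python
import numpy as np

def lagrange_diff_matrix(nodes):
    """D[i, j] is the derivative of the j-th Lagrange basis polynomial
    evaluated at node i, so that D @ x approximates dx/dt at the nodes."""
    n = len(nodes)
    D = np.zeros((n, n))
    for j in range(n):
        for i in range(n):
            if i == j:
                D[i, j] = sum(1.0 / (nodes[j] - nodes[m])
                              for m in range(n) if m != j)
            else:
                num = np.prod([nodes[i] - nodes[m]
                               for m in range(n) if m not in (i, j)])
                den = np.prod([nodes[j] - nodes[m]
                               for m in range(n) if m != j])
                D[i, j] = num / den
    return D

# assumed collocation nodes on a unit sampling interval, endpoints included
nodes = np.array([0.0, 0.3, 0.7, 1.0])
D = lagrange_diff_matrix(nodes)

# with 4 nodes, D differentiates cubics exactly; check on x(t) = t^3 - 2t
x_vals = nodes**3 - 2.0 * nodes
dx_exact = 3.0 * nodes**2 - 2.0
```

Stacking one such matrix per sampling interval, and equating interval end points, yields exactly the algebraic constraint set 6-60 used in the optimization.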

In order to reduce the computational effort required by the nonlinear programming strategy described above, Ramamurthi et al. [24] proposed a successively linearized horizon estimation (SLHE) method in which Equations 6-56 and 6-57 are both linearized around a given reference trajectory for the state variables. The reference values at each sampling instant j are used to obtain the linearized form of these equations for the sampling period j. If inequality constraints are not included, then an analytical solution for the estimates of the state variables at the beginning of the time window can be obtained, which is then used to numerically integrate the differential equations to obtain the state estimates at other sampling instants within the window. Although this method is efficient, it can give rise to infeasible estimates because it cannot handle inequality constraints.

In the above discussion, we have not explicitly included unmeasured variables or parameters as part of the model equations. The nonlinear programming methods can also be used to simultaneously estimate both the measured states and unmeasured parameters. Simultaneous state and parameter estimation in dynamic processes has been considered by Kim et al. [25, 26], who refer to it as the error-in-variables method (EVM) of estimation.

In summary, nonlinear dynamic data reconciliation strategies have several advantages over classical filtering techniques as discussed in this section, but they do not address the problem of random noise in the state equations, which can be caused by unmeasured disturbances to the process. They are also computationally more demanding because a recursive form of the estimator has not been developed. Currently, these techniques have not been applied to industrial processes, and further developments are required before they can be applied in practice.

The nonisothermal CSTR described in Example 6-4 is used to illustrate the application of the nonlinear dynamic data reconciliation technique. Measurements corresponding to the open-loop response of this process for a step change in the feed concentration from 6.5 to 7.5 at the initial time were simulated as in Example 6-4. Using a window length of 10 sampling periods, a nonlinear dynamic data reconciliation technique is applied to estimate the concentration and temperature in the reactor.

Lower and upper bounds on concentration were imposed as 0.01 and 0.2, respectively, and on temperature as 4.0 and 5.0, respectively. Since an open-loop simulation is performed in this example, the objective function of data reconciliation is the weighted sum of squares of differences between measured and estimated values over the past 10 sampling periods; that is, the second term in Equation 6-55 is absent. The optimization at every sampling instant t is carried out by making initial guesses of the temperature and concentration at time t-10.

The differential equations describing the CSTR are integrated from time t-10 to time t by using a 4th-order Runge-Kutta method to obtain the estimates of the state variables at all sampling instants within this time period. The objective function value is computed corresponding to these estimates, and the state estimates at time t-10 are iterated upon until a minimum value of the objective function is obtained, subject to the constraints of upper and lower bounds on the initial estimates. This approach differs from the method of Ramamurthi et al. [24] in that the nonlinear differential equations are not linearized, but are explicitly integrated.
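The integrate-and-iterate scheme just described can be sketched for an assumed scalar system, with a bounded golden-section search standing in for the constrained optimizer and simple Euler integration standing in for the 4th-order Runge-Kutta method:

```python
import numpy as np

def f(x):                          # assumed scalar dynamics
    return -0.8 * x

def simulate(x0, n, T):
    """Explicitly integrate the model from the start of the window
    (Euler steps used here in place of Runge-Kutta for brevity)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(xs[-1] + T * f(xs[-1]))
    return np.array(xs)

def window_estimate(meas, T, lo, hi, iters=60):
    """Choose the initial state within the bounds [lo, hi] minimizing
    the sum of squared measurement residuals over the window, using a
    golden-section search on the (unimodal) objective."""
    g = (np.sqrt(5.0) - 1.0) / 2.0
    obj = lambda x0: np.sum((simulate(x0, len(meas), T) - meas) ** 2)
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if obj(c) < obj(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

# window of 10 sampling periods: noisy measurements of a decaying state
rng = np.random.default_rng(3)
T, n = 0.25, 10
truth = simulate(1.0, n, T)
meas = truth + rng.normal(0.0, 0.05, n)
x0_hat = window_estimate(meas, T, lo=0.0, hi=2.0)
```

Once the initial state of the window is estimated, re-integrating the model from it yields the reconciled states at all sampling instants in the window, mirroring the procedure of this example.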

It should be noted that in this approach bounds are imposed only on the state estimates at the start of the time window, and it is possible that the state estimates at other sampling instants, obtained by explicit integration, may violate the bounds. It has the advantage, however, that it is more efficient than the method of Liebman et al. [23]. The estimated concentration and temperatures obtained using this approach are shown in Figures 6-6 and 6-7, respectively.

It can be observed that the estimated states are very close to the true values.

Figure 6-6. Estimated concentration of CSTR using dynamic data reconciliation for window length of 10.

Figure 6-7. Estimated temperature of CSTR using dynamic data reconciliation for window length of 10.

In order to ensure convergence of the optimization problem at each time, it was found that bounds on variables had to be imposed. Comparing these estimates with those obtained using the extended Kalman filter in Example 6-4, it is observed that the extended Kalman filter gives better results and is also computationally more efficient. Thus, it is better to use the NDDR technique for estimating the states only when extended Kalman filtering techniques do not give estimates that satisfy bounds on variables.


SUMMARY

Dynamic data reconciliation is important for process control applications. In order to exploit temporal redundancy in data, dynamic models for the evolution of the state variables have to be used in conjunction with measurements. The Kalman filter can be used to estimate state variables in linear dynamic systems. If disturbances in state variables are ignored, then the Kalman filter is equivalent to data reconciliation. Use of estimated states instead of measurements can lead to better control. State estimation in nonlinear dynamic systems can be performed using extended Kalman filters or their variants. Feasibility restrictions on variables cannot be handled by these methods. Nonlinear optimization methods can be used for dynamic data reconciliation in nonlinear processes. They can account for inequality constraints but are less efficient than extended Kalman filters.

REFERENCES

1. Ljung, L. System Identification: Theory for the User. Englewood Cliffs, N.J.: Prentice-Hall, 1987.

2. Soderstrom, T., and P. Stoica. System Identification. Englewood Cliffs, N.J.: Prentice-Hall, 1989.

3. Bellingham, B., and F. P. Lees. "The Detection of Malfunction Using a Process Control Computer: A Kalman Filtering Technique for General Control Loops." Trans. Inst. Chem. Eng. 55 (1977): 253-265.

4. Franklin, G. F., J. D. Powell, and M. L. Workman. Digital Control of Dynamic Systems. Reading, Mass.: Addison-Wesley, 1980.

5. Sage, A. P., and J. L. Melsa. Estimation Theory with Applications to Communications and Control. New York: McGraw-Hill, 1971.

6. Kalman, R. E. "A New Approach to Linear Filtering and Prediction Problems." Trans. ASME J. Basic Eng. 82D (1960): 35-45.


7. Kalman, R. E. "New Results in Linear Filtering and Prediction Problems." Trans. ASME J. Basic Eng. 83D (1961): 95-108.

8. Gelb, A. Applied Optimal Estimation. Cambridge, Mass.: MIT Press, 1974.

9. Sorenson, H. W. Kalman Filtering: Theory and Applications. New York: IEEE Press, 1985.

10. Fisher, D. G., and D. E. Seborg. Multivariable Computer Control: A Case Study. Amsterdam: North Holland, 1976.

11. Stanley, G. M., and R. S. H. Mah. "Estimation of Flows and Temperatures in Process Networks." AIChE Journal 23 (1977): 642-650.

12. Makni, S., D. Hodouin, and C. Bazin. "A Recursive Node Imbalance Method Incorporating a Model of Flowrate Dynamics for On-line Material Balance of Complex Flowsheets." Minerals Eng. 8 (1995): 753-766.

13. Darouach, M., and M. Zasadzinski. "Data Reconciliation in Generalized Linear Dynamic Systems." AIChE Journal 37 (1991): 193-201.

14. Rollins, D. K., and S. Devanathan. "Unbiased Estimation in Dynamic Data Reconciliation." AIChE Journal 39 (1993): 1330-1331.

15. Bag;!je\vicz. M.. and Q. Jiang. "An Integral Approach to Dynainic Data Rec- onciliation." AlC'i~f< . / ~ : : I ~ I N ! 1 3 ( 19971: 2546-2558.

I f ? Andel-son. 3.D.O.. and 1 . B. Moor-e. Ol~riii7cll C ~ l l t i ? ~ ! : I>i11eor (21!adt-nric ;M~~tiv~ti\-. Engleirood Cii t f s . N.J.: !'ren~ice-Hall. 1989.

17. R;ischi, A. #/?iilil(ll c ~ , ~ ~ i ~ - o i 0f.('toc17ii,sti~ Si..ster!l.~. Ilertf~rdshire, C'K: Prz11- tice-Tiall. 1993.

18. Borrie, :. A. Sroc.11asric. 5>.stri;7.s ,fi):- E n g i t : e r r s - ~ o i Esti11lcctioiz cind Coiltrol. Heitf~)rdshire. LiK: Frcntice-Hail. 1992.

19. P IXS~. IF\'. H., l3. F'. F'lai;11cry, S. .A. Tc:~~k:olsky. a ~ d W7. 3.. \iette~-li~is. !\/~!IIICIF

ic.ci1 Rt,ci!jc~s. Kc\\' kork: Csnibi-idge Li~ivcrsity Press, 1986.

20. Jaz\vinski, A. H. Sro~.!~i:,sti(. PI.OCC.TS~.C (1;zc1 Fi!reri~l~ 7%eo?. New York: Acadzrnic Pi-=.;.;. 1050.

21. Wishncr, R. P.. J . A. Tabaczynski, and M. Atlians. "Comparat~ve Study c f Three Nonlinear Fiitclr." i ~ ~ , ~ , ~ ~ ~ ~ i i c t r 5 (i909): 47Z495.

23. Liebinan, M. J., T. I;. Edgar, and L. S. Lastlon. '.Efficierit Data Reconcilia- tion and Estimation for Dynamic Processes Using Nonlinear Prograrnniing Technqiues." Cot~~l~ufer-s Clzf~f??. Engtzg. 16 (no. 10/1 1 . 1992): 961-986.

24. Ramamurthi, Y., P. B. Sistu, and B. W. Bequettc. "Control Relevant Ilynanl- ic Data Reconciliation and Parametei- Estiri~ation." Cot7zpurer.s Client. G I ~ I I , ~ . 17 (no. 1, 1993): 41-59.

25. Kim, I. W., M. J. Liebman, and T. F. Edgar. "Robust Error in Variables Eati- mation using Nonlinear- Programming Tcchniqnes." AlChE .lour-ncil 36 (1 990): 985-993.

26. Kim, I. W.. M. J. L-iebman. and T. F. Edgar. "A Scqriential Error in Vari- ables Estimation Method for Nonlinear Dynamic Systems." Coi?~pur~.r.~ Chem. Ef?gng. I5 ( 199 1 ): 663-670.

22. Muske. K. R., and T. F. Eugar. "Nonlineai- Stare Estimation." In Xclir/i~lrat. Pi-occ.\s Coiztro! (edited by M. A. Henson arid D. E. Seborg), Kcw Jersey: Pren- rice-Hall. 1907. -1 1 1-370.


Introduction to Gross Error Detection

PROBLEM STATEMENTS

The technique of data reconciliation crucially depends on the assumption that only random errors are present in the data and that systematic errors, either in the measurements or the model equations, are not present. If this assumption is invalid, reconciliation can lead to large adjustments being made to the measured values, and the resulting estimates can be very inaccurate and even infeasible. Thus it is important to identify such systematic or gross errors before the final reconciled estimates are obtained.

In the first chapter, it was pointed out that reconciliation can be performed only if constraints are present. The same statement can be made with regard to the detection of gross errors. Without the availability of constraints as a counter-check of the measurements, gross error detection cannot be carried out. Therefore, both data reconciliation and gross error detection techniques exploit the same information available from measurements and constraints. These techniques, therefore, go hand-in-hand in the processing of data.

There are two major types of gross errors, as indicated in Chapter 2. One is related to instrument performance and includes measurement bias, drifting, miscalibration, and total instrument failure. The other is constraint model-related and includes unaccounted loss of material and energy resulting from leaks from process equipment, or model inaccuracies due to inaccurate parameters. Various techniques have been designed for the detection and elimination of these two types of gross errors. Before describing these techniques, it is better at the outset to clearly state the requirements of a gross error detection strategy. This also leads to a better understanding of the variety of techniques that have been proposed, their interrelationships, and the results achievable through their use.

Any comprehensive gross error detection strategy should preferably possess the following capabilities:

- Ability to detect the presence of one or more gross errors in the data (the detection problem)
- Ability to identify the type and location of the gross error (the identification problem)
- Ability to locate and identify multiple gross errors which may be present simultaneously in the data (the multiple gross error identification problem)
- Ability to estimate the magnitude of the gross errors (the estimation problem)

Not all gross error detection strategies may fulfill all of the above requirements. The last of the above requirements, although useful, is not absolutely necessary. A gross error detection strategy can be analyzed in terms of the component methods it uses to tackle the three main problems of detection, identification, and multiple gross error identification, and the performance of the strategy is a strong function of these component methods. In this chapter, we focus on the first two components of a gross error detection strategy, the detection and identification of a single gross error. Methods for multiple gross error detection are discussed in the following chapter.

BASIC STATISTICAL TESTS FOR GROSS ERROR DETECTION

This component of a gross error detection strategy simply attempts to answer the question of whether gross errors are present in the data or not. It does not provide any clues on either the number of gross errors, their types, or their locations. We reiterate the fact that all detection methods either directly or indirectly utilize the fact that gross errors in measurements cause them to violate the model constraints. If measurements do not contain any random errors, then a violation of any of the model constraints by the measured values can be immediately interpreted as due to the presence of gross errors. This is a purely deterministic method.


l r~ i r , r r f~ l c r / i o r~ to GI-(I.\.\ Error J)PJP(I;OI: 177

We have assumed, however, and rightly so, that all measurements do contain random errors, due to which we cannot expect the measurements to strictly satisfy any of the model constraints even if gross errors are absent. Thus, an allowance has to be made for the violation of the constraints due to random errors. Under an assumed probability distribution for the random errors, a probabilistic approach is used for resolving this problem. Some basics of probability distributions and statistical hypothesis testing are explained in Appendix C.

The basic principle in gross error detection is derived from the detection of outliers in statistical applications. The random error inherently present in any measurement is assumed to follow a normal distribution with zero mean and known variance. The normalized error (the difference between the measured value and the expected mean value, divided by its standard deviation) follows a standard normal distribution. Most normalized errors fall inside a (1 - α) confidence interval at a chosen level of significance α. Any normalized error which falls outside that confidence region is declared an outlier or a gross error.

A number of statistical tests are derived from this basic statistical principle and are able to detect gross errors. But not all statistical tests are able to identify different types and locations of gross errors. Some basic statistical tests are able to detect only measurement errors (biases). Other statistical tests can only detect process model errors or leaks. On the other hand, the generalized likelihood ratio test, which is derived from the maximum likelihood estimation principle in statistics, can be used to detect both instrument problems and process leaks.

The next two sections describe the two basic classes of statistical tests used for gross error detection. Next, a derived class of statistical tests known as the principal component tests is also presented and compared with the basic statistical tests. These tests are based on a special type of linear transformation of the residual vectors used in the basic tests. For the sake of clarity, the principal component tests are presented in a separate section of this chapter.

The most commonly used statistical techniques for detecting gross errors are based on hypothesis testing. In a gross error detection case, the null hypothesis, H0, is that no gross error is present, and the alternative hypothesis, H1, is that one or more gross errors are present in the system. All statistical techniques for choosing between these two hypotheses make use of a test statistic which is a function of the measurements and the constraint model. The test statistic is compared with a prespecified threshold value and the null hypothesis is rejected or accepted, depending on whether the statistic exceeds the threshold or not. The threshold value is also known as the test criterion or the critical value of the test.

The outcome of hypothesis testing is not perfect. A statistical test may declare the presence of gross errors when in fact there is no gross error (H0 is true). In this case, the test commits a Type I error or gives rise to a false alarm. On the other hand, the test may declare the measurements to be free of error when in fact one or more gross errors exist (Type II error). The power of a statistical test, which is the probability of correct detection, is equal to 1 minus the Type II error probability. The power and Type I error probability of any statistical test are intimately related.

By allowing a larger Type I error probability, the power of a statistical test can be increased. Therefore, in designing a statistical test, the power of the test must be balanced against the probability of false detection. If the probability distribution of the test statistic can be obtained under the assumption of the null hypothesis, then the test criterion can be selected so that the probability of Type I error is less than or equal to a specified value α. The parameter α is also referred to as the level of significance for the statistical test.

The different statistical tests for gross error detection and the choice of the test criterion are described in the following sections. For the sake of simplicity, we will analyze the basic statistical tests assuming steady-state conditions and linear models. The applicability of such gross error tests to nonlinear models will be further discussed in Chapter 8.

We will assume that the linear constraint model is given by

    Ax = c                                                    (7-1)

where A is the linear constraint matrix and the vector c contains known coefficients. Typically, for linear flow processes, c is a zero vector unless some of the variables are known exactly. We have deliberately included this vector in Equation 7-1 for ease of comparison with the linearized form of nonlinear constraints, which will be treated later. As in the previous chapters, the measurement errors are assumed to be normally distributed with known covariance matrix Σ.

Four basic statistical tests have been developed and widely applied for gross error detection. To simplify the description of these tests, a linear model with all variables measured will be assumed first. This does not exclude the application of such statistical tests to linear models with unmeasured variables, since, as shown in Chapter 3, linear models with


unmeasured variables can be reduced to linear models with all measured variables by using a projection matrix.

The first two tests are based on the vector of balance residuals, r, which is given by

    r = Ay - c                                                (7-2)

In the absence of gross errors, the vector r follows a multivariate normal distribution with zero mean value and variance-covariance matrix V given by

    V = Cov(r) = A Σ A^T                                      (7-3)

Therefore, under H0, r ~ N(0, V). In the presence of gross errors, the elements of the residual vector r reflect the degree of violation of the process constraints (material and energy conservation laws). On the other hand, the matrix V contains information on the process structure (matrix A) and the measurement variance-covariance matrix, Σ. The two quantities, r and V, can be used to construct statistical tests which can detect the existence of gross errors.

The Global Test (GT)

The global test, which was the first test proposed [1, 2, 3], uses the test statistic given by

    γ = r^T V^-1 r                                            (7-4)

Under H0, the above statistic follows a χ²-distribution with ν degrees of freedom, where ν is the rank of matrix A. If the test criterion is chosen as χ²_{1-α,ν}, the critical value of the χ² distribution at the chosen level of significance α, then H0 is rejected and a gross error is detected if γ ≥ χ²_{1-α,ν}. This choice of the test criterion ensures that the probability of Type I error for this test is less than or equal to α. The global test combines all the constraint residuals in obtaining the test statistic, and therefore gives rise to a multivariate or collective test.

A point worth mentioning here is that the global test statistic given by Equation 7-4 is also equal to the minimum objective function value of the data reconciliation problem. This can be verified easily by substituting the solution for the reconciled estimates, given by Equation 3-8, in the objective function given by Equation 3-6. This result is used later in analyzing the techniques used for gross error identification.

Exercise 7-1. Prove that the global test statistic and the optimal data reconciliation objective function values are equal.

Example 7-1

Consider the flow reconciliation of the heat exchanger with bypass process shown in Figure 1-2. Let us assume that all flows are measured and that the true, measured, and reconciled values (assuming no gross errors) are as given in Table 7-1, where the flow measurement of stream 2 contains a positive bias of 4 units. The standard deviations of all measurement errors are assumed to be unity.

Table 7-1
Reconciliation of Data Containing a Gross Error for Process of Figure 1-2

Stream Number   True Flow Values   Measured Flow Values   Reconciled Flow Values
1               103                101.91                 100.89
[rows for streams 2-6 are illegible in this copy]

The constraint matrix for this process is given by

    A = [ 1  -1  -1   0   0   0
          0   1   0  -1   0   0
          0   0   1   0  -1   0
          0   0   0   1   1  -1 ]


where the rows correspond to flow balances for the splitter, heat exchanger, bypass valve, and mixer, in order, and the columns correspond to the six streams in order. The constraint residuals for the given measurements can be computed as [-1.19, 4.25, -1.79, 1.76]. The covariance matrix of the constraint residuals is given by

    V = [ 3  -1  -1   0
         -1   2   0  -1
         -1   0   2  -1
          0  -1  -1   3 ]

Using Equation 7-4, the global test statistic is computed to be 16.674. This can be verified to be equal to the sum of squares of the differences between the reconciled and measured values (the optimum DR objective function value). The test criterion at the 5% level of significance, drawn from the chi-square distribution with 4 degrees of freedom, is equal to 9.488. Thus the global test rejects the null hypothesis and a gross error is detected.
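The computation in Example 7-1 can be reproduced directly from the reported constraint residuals. The sketch below assumes unit measurement variances (Σ = I), so that V = A Aᵀ; the small linear solver is included only to keep the example self-contained.

```python
# Global test (Equation 7-4) for the bypass network of Example 7-1.
# With Sigma = I, the residual covariance is V = A A^T.

def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for k in range(col + 1, n):
            f = aug[k][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[k][j] -= f * aug[col][j]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))) / aug[i][i]
    return z

# Balances for splitter, exchanger, bypass valve, and mixer (streams 1..6).
A = [[1, -1, -1, 0, 0, 0],
     [0, 1, 0, -1, 0, 0],
     [0, 0, 1, 0, -1, 0],
     [0, 0, 0, 1, 1, -1]]
r = [-1.19, 4.25, -1.79, 1.76]      # constraint residuals from Example 7-1

V = [[sum(ai[k] * aj[k] for k in range(6)) for aj in A] for ai in A]
gamma = sum(ri * zi for ri, zi in zip(r, solve(V, r)))   # gamma = r^T V^-1 r

print(round(gamma, 3))   # 16.674, exceeding the 5% criterion of 9.488
```

The value agrees with the sum of squared adjustments quoted in the text, so the test rejects H0 at the 5% level.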

The Constraint or Nodal Test (NT)

The vector r can also be used to derive test statistics, one for each constraint i, given by

    z_{r,i} = r_i / sqrt(V_ii)                                (7-5)

where diag(V) is a diagonal matrix whose diagonal elements are V_ii. The nodal or constraint test [4, 5] uses the test statistics z_{r,i} for gross error detection. It can be proved that z_{r,i} follows a standard normal distribution, N(0, 1), under H0. If any of the test statistics z_{r,i} (or, equivalently, the maximum test statistic) exceeds the test criterion Z_{1-α/2}, where Z_{1-α/2} is the critical value of the standard normal distribution for a level of significance α (for the two-sided test), a gross error is detected.

Unlike the global test, the constraint test processes each constraint residual separately and gives rise to m univariate tests. Since multiple tests are performed using the same critical value, the probability increases that one of the tests may be rejected even if no gross errors are present. In other words, the probability of Type I error will be more than the specified value of α. If we wish to control the Type I error probability, the following modified level of significance β, proposed by Mah and Tamhane [6] (derived from the Sidak inequality [7]), can be used:

    β = 1 - (1 - α)^(1/m)                                     (7-7)

For any specified value of α, the modified value β can be computed using Equation 7-7, and the test criterion for all the constraint tests can be chosen as Z_{1-β/2}. This will ensure that the probability that any one of the constraint tests will be rejected under H0 is less than or equal to α. It should be noted that α is only an upper bound on the Type I error probability, and in order to ensure that the Type I error probability is exactly equal to α, the test criterion has to be chosen by trial and error using simulation. Alternatively, Rollins and Davis [8] proposed the use of a critical value based on the Bonferroni confidence interval, which is given by

    β = α / m                                                 (7-8)

For large values of m, Equation 7-7 reduces to Equation 7-8.
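The two corrections can be compared numerically; a quick sketch, with Equation 7-7 in the Sidak form and Equation 7-8 in the Bonferroni form:

```python
# Modified significance levels for m simultaneous univariate tests.

def sidak(alpha, m):         # Equation 7-7
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

def bonferroni(alpha, m):    # Equation 7-8
    return alpha / m

for m in (4, 6, 100):
    print(m, round(sidak(0.05, m), 6), round(bonferroni(0.05, m), 6))
```

For m = 6 these give approximately 0.0085 and 0.0083, the values quoted later in Example 7-3, and the two corrections converge as m grows.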

Exercise 7-2. Prove that by using a test criterion based on the modified level of significance given by Equation 7-7, the Type I error probability of the constraint test will be less than or equal to α.

It is possible to obtain other forms of the constraint test by using a linear transformation of the constraint residuals. However, not all of these forms possess the same power to detect gross errors. Crowe [9] obtained a particular form of the constraint test which has the maximum power. The test statistics of the maximum power constraint test are given by

    z_{mp,i} = (V^-1 r)_i / sqrt((V^-1)_ii)                   (7-9)


or, written in vector form,

    z_mp = [diag(V^-1)]^(-1/2) V^-1 r                         (7-10)

The test criterion is chosen to be the same as in the case of the standard constraint test. If there is a gross error in the process, then it can be shown that the expected value of the maximum among the test statistics given by Equation 7-9 is greater than the expected value of the maximum among the test statistics given by Equation 7-5. This implies that if there is a gross error, then the constraint test based on the test statistics of Equation 7-9 has a greater probability of detecting it than the test based on the statistics of Equation 7-5. If the constraint test statistics are derived using any other linear transformation of the residuals, we can show that they do not possess this property. Thus, the constraint test based on the statistics of Equation 7-9 has the maximal power (MP) property.

Exercise 7-3. Prove that the expected value of z_{mp,i} is greater than or equal to the expected value of z_{r,i} if a gross error of any magnitude b is present in constraint i. Also prove for this case that the expected value of z_{mp,i} is greater than or equal to that of z_{mp,j} for all j. Extend this result to show that the statistics given by Equation 7-9 have more power for detecting a gross error in the constraints than constraint test statistics derived using any other linear transformation of the constraint residuals. Hint: The expected value of r in this case is b e_i, where e_i is a unit vector with value 1 in position i and zero elsewhere. Use the Cauchy-Schwarz inequality: |v^T w| ≤ (v^T v)^(1/2) (w^T w)^(1/2).

Example 7-2

For the flow process considered in Example 7-1, the constraint residuals and their covariance matrix were computed. From these, the constraint test statistics can be obtained as [0.687, 3.0052, 1.2657, 1.0161]. The standard normal test criterion at the 5% level of significance is 1.96. Thus, only the test for constraint residual 2 is rejected.
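The nodal test statistics of this example follow directly from the residuals and the diagonal of V; a sketch, again with Σ = I so that diag(V) = diag(A Aᵀ) = (3, 2, 2, 3) for this network:

```python
import math

# Nodal (constraint) test, Equation 7-5: z_i = r_i / sqrt(V_ii).
r = [-1.19, 4.25, -1.79, 1.76]   # residuals from Example 7-1
v_diag = [3.0, 2.0, 2.0, 3.0]    # diagonal of V = A A^T for this network

z = [ri / math.sqrt(vii) for ri, vii in zip(r, v_diag)]
flagged = [i + 1 for i, zi in enumerate(z) if abs(zi) > 1.96]

# Magnitudes reproduce [0.687, 3.0052, 1.2657, 1.0161]; only constraint 2
# exceeds the 5% criterion of 1.96.
print([round(abs(zi), 4) for zi in z], flagged)
```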

The Measurement Test (MT)

The third test is based on the vector of measurement adjustments,

    a = y - x̂                                                 (7-11)

where x̂ are the reconciled estimates obtained using Equation 3-8. Using this solution, the measurement adjustments can also be written as

    a = Σ A^T V^-1 (Ay - c) = Σ A^T V^-1 r                    (7-12)

which, under H0, follows a multivariate normal distribution, N(0, W), where

    W = Cov(a) = Σ A^T V^-1 A Σ                               (7-13)

The following test statistics,

    z_{a,j} = a_j / sqrt(W_jj)                                (7-14)

known as the measurement test statistics, follow a standard normal distribution, N(0, 1), under H0. Tamhane [10] has shown that for a nondiagonal covariance matrix Σ, a vector of test statistics with maximal power for detecting a single gross error is obtained by premultiplying a by Σ^-1, which gives

    d = Σ^-1 a = A^T V^-1 r                                   (7-15)

Under H0, d is also normally distributed with zero mean and a covariance matrix

    W_d = Cov(d) = A^T V^-1 A                                 (7-16)

Mah and Tamhane [6] proposed the following test statistics,

    z_{d,j} = d_j / sqrt((W_d)_jj)                            (7-17)



known as the maximum power (MP) measurement test, which follows a standard normal distribution, N(0, 1), under H0. Similar to the constraint test, the measurement test also involves multiple univariate tests. Using similar arguments as before, we can show that the probability of Type I error will be less than or equal to α if the test criterion is chosen as Z_{1-β/2}, where β is given by Equation 7-7 or 7-8 with m replaced by n, the number of univariate measurement tests.


Exercise 7-4. Prove that z_{d,j} has the maximum power for detecting a gross error in one of the measurements. Hint: Follow a similar proof as used for solving Exercise 7-3.

Exercise 7-5. Show that for diagonal Σ, z_{a,j} = z_{d,j}.

Exercise 7-6. Let a_i and a_j be two columns of matrix A. If there is a constant c such that a_i = c a_j, show that |z_{d,i}| = |z_{d,j}|.

Example 7-3

From the measured and reconciled values listed in Table 7-1, the measurement adjustments can be computed as [1.0233, 2.6167, -0.4033, -1.6333, 1.3867, -2.0067]. The covariance matrix of the measurement adjustments is W of Equation 7-13; for this process its diagonal elements, which are all that the test statistics require, are each equal to 2/3.

The measurement test statistics are therefore obtained as [1.2533, 3.2047, 0.4940, 2.0004, 1.6983, 2.4577]. Because, in this example, the measurement error covariance matrix is diagonal, the MP measurement test statistics are the same. For a 5% level of significance, the standard normal test criterion is 1.96. From these we observe that the measurement tests for measurements 2, 4, and 6 are rejected. The modified level of significance given by Sidak's inequality (Equation 7-7) is equal to 0.0085, while that based on the Bonferroni confidence interval (Equation 7-8) is equal to 0.0083. Corresponding to these modified significance levels, the test criteria are 2.6315 and 2.6396, respectively. Thus, if we use the modified levels of significance, only the test for measurement 2 is rejected.

Additional examples for the GT, NT, and MT tests are found in Crowe et al. [11] and Tamhane and Mah [12].

The Generalized Likelihood Ratio (GLR) Test

A fourth test for detecting gross errors in steady-state processes is the generalized likelihood ratio (GLR) test, based on the maximum likelihood ratio principle used in statistics. In contrast to the other tests, the formulation of this test requires a model of the process in the presence of a gross error, also known as the gross error model. As shown in the next section, this test can identify different types of gross errors for which a gross error model is provided. The procedure has been illustrated for gross errors caused by measurement biases and process leaks by Narasimhan and Mah [13].

The gross error model for a bias of unknown magnitude b in measurement j is given by

    y = x + b e_j + ε                                         (7-18)

where e_j is a unit vector with value 1 in position j and zero elsewhere. On the other hand, leakage of material should be modeled as part of the constraints. A mass flow leak in a process node i of unknown magnitude b can be modeled by

    Ax - b m_i = c                                            (7-19)

The elements of vector m_i are relatively easy to define when only total flow balances are involved. If the leak is from a process unit i, then only the flow constraint for this unit is affected and thus m_i is identical to e_i. However, if the constraints also include component balances and energy balances (with precisely known composition and temperature values), then the vector m_i can only be defined approximately using engineering judgment. A recommendation made by Narasimhan and Mah [13] is to choose the elements of m_i as follows:

(a) Corresponding to the total mass flow constraint of unit i, m_i has a value of unity in the ith position.

(b) Corresponding to the energy flow constraint associated with node i, the value of the ith element in the vector m_i can be chosen as the average specific enthalpy of the streams incident to node i. The same can be applied to a component flow constraint for node i, by replacing specific enthalpy with concentration.

(c) The elements in m_i not associated with constraints of node i are chosen to be zero.

Pure energy or component flow losses in node i can also be modeled by Equation 7-19 by choosing the corresponding element in m_i to be unity and all other elements to be zero.

Using the gross error models, it is possible to derive the statistical distribution of the constraint residuals under H1, when a gross error either in the measurements or constraints is present. It has already been proved that under H0, the constraint residuals follow a normal distribution with zero mean and covariance matrix given by Equation 7-3. Under H1, the constraint residuals still follow a normal distribution with covariance matrix given by Equation 7-3, but the expected value depends on the type of gross error present. If a gross error due to a bias of magnitude b is present in measurement j, then we can show that

    E[r] = b A e_j                                            (7-20)

On the other hand, if a gross error due to a process leak is present in node i, then we can show that

    E[r] = b m_i                                              (7-21)

Therefore, when a gross error due to a bias or a process leak is present, we can write

    E[r] = b f_k                                              (7-22)

where

    f_k = A e_j   for a bias in measurement j
    f_k = m_i     for a process leak in node i                (7-23)

The vectors f_k are also referred to as gross error signature vectors. If we define μ as the unknown expected value of r, we can formulate the hypotheses for gross error detection as

    H0: μ = 0
    H1: μ = b f_k,  b ≠ 0                                     (7-24)

where H0 is the null hypothesis that no gross errors exist and H1 is the alternative hypothesis that either a process leak or a measurement bias is present. The alternative hypothesis has two unknowns, b and f_k. The parameter b can be any real number and f_k can be any vector from the set F, which is given by

    F = {A e_j, j = 1, ..., n;  m_i, i = 1, ..., m}           (7-25)

where m is the number of nodes or process units, and n is the number of measured variables.

In order to test the two hypotheses given by Equation 7-24, one can use the likelihood ratio test. The likelihood ratio test statistic in our case is given by

    λ = sup Pr{r | H1} / Pr{r | H0}                           (7-26)

where Pr{r | H0} and Pr{r | H1} are the probabilities of obtaining the residual vector r under the H0 and H1 hypotheses, respectively; the supremum ("sup" in Equation 7-26) is computed over all possible values of the parameters present in the hypotheses. Using the normal probability density function for r, we can write Equation 7-26 as

    λ = sup_{b,f_k} exp{-0.5 (r - b f_k)^T V^-1 (r - b f_k)} / exp{-0.5 r^T V^-1 r}    (7-27)

Since the expression on the right-hand side of Equation 7-27 is always positive, we can simplify the calculation by choosing as the test statistic

    T = 2 ln λ = sup_{b,f_k} [r^T V^-1 r - (r - b f_k)^T V^-1 (r - b f_k)]             (7-28)


The computation of T proceeds as follows. For any vector f_k we compute the estimate b* of b which gives the supremum in Equation 7-28. Thus, we obtain the maximum likelihood estimate

    b* = (f_k^T V^-1 r) / (f_k^T V^-1 f_k)                    (7-29)

Substituting b* in Equation 7-28 and denoting the corresponding value of T by T_k, we get

    T_k = d_k^2 / C_k                                         (7-30)

where

    d_k = f_k^T V^-1 r                                        (7-31)

    C_k = f_k^T V^-1 f_k                                      (7-32)

This calculation is performed for every vector f_k in the set F, and the test statistic T is therefore obtained as

    T = sup_k T_k                                             (7-33)

Let f* be the vector that leads to the supremum in Equation 7-33. The test statistic T is compared with a prespecified threshold T_c, and a gross error is detected if T exceeds T_c. We can interpret T_k as a test statistic for the presence of gross error k. Since T is the maximum among the T_k, the GLR test detects a gross error if any of the test statistics T_k exceeds the critical value. Thus the GLR test, like the measurement test and the constraint test, performs multiple univariate tests to detect a gross error. The distribution of T_k under H0 can be shown to be a central chi-square distribution with one degree of freedom. Therefore, in order to maintain the Type I error probability of the GLR test less than or equal to a given value α, we can choose the test criterion as χ²_{1-β,1}, the upper 1 - β quantile of the chi-square distribution with one degree of freedom, where β is given by

    β = 1 - (1 - α)^(1/(n+m))                                 (7-34)

It can be easily observed by comparing Equations 7-7 and 7-34 that β and α are related by similar expressions, with the exponent being equal to the reciprocal of the number of multiple univariate tests being performed as part of the test to detect a gross error.

Exercise 7-7. Prove that T_k follows a central chi-square distribution with one degree of freedom.

Exercise 7-8. Prove that the square roots of the GLR test statistics T_k can be obtained using the linear transformation of the constraint residuals F^T V^-1 r, where the columns of matrix F are the gross error vectors f_k.

Example 7-4

If we consider gross errors caused by measurement biases for the simple flow process used in the preceding examples, the gross error signature vector for a bias in measurement i is the ith column of the constraint matrix, which is given in Example 7-1. The GLR test statistics computed from the constraint residuals and their covariance matrix given in Example 7-2 are [1.5708, 10.2704, 0.2440, 4.0017, 2.8843, 6.0401]. It can be verified that the GLR test statistics are the squares of the MP measurement test statistics computed in Example 7-3. The test criteria at the 5% level of significance and at the two modified levels of significance (Sidak and Bonferroni) are simply the squares of the standard normal test criteria: 3.8416, 6.9248, and 6.9675, respectively. Hence, the GLR tests for measurements 2, 4, and 6 are rejected at the 5% level of significance, while only the test for measurement 2 is rejected at the modified levels of significance.

If we also wish to test for leaks in all four nodes, then the signature vectors for these four gross errors are simply the unit vectors. The GLR test statistics for these four gross errors are given by [1.5708, 13.2496, 0.3844, 6.0401]. The GLR tests for leaks in nodes 2 and 4 are rejected at the 5% level of significance, while the test for a leak in node 2 alone is rejected at the modified levels of significance. It can be observed that the

GLR test statistic for a leak in node 1 (the splitter node) is the same as the


test statistic for a bias in measurement 1. This is due to the fact that the gross error signature vectors for these two gross errors are identical. The same observation can be made concerning a leak in node 4 and a bias in measurement 6.
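Example 7-4 can be reproduced by evaluating T_k = (f_kᵀV⁻¹r)² / (f_kᵀV⁻¹f_k) for each candidate signature vector: the six columns of A for biases and the four unit vectors for leaks. A sketch with Σ = I; the solver is repeated for self-containment:

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for k in range(col + 1, n):
            f = aug[k][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[k][j] -= f * aug[col][j]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))) / aug[i][i]
    return z

A = [[1, -1, -1, 0, 0, 0],
     [0, 1, 0, -1, 0, 0],
     [0, 0, 1, 0, -1, 0],
     [0, 0, 0, 1, 1, -1]]
r = [-1.19, 4.25, -1.79, 1.76]
V = [[sum(ai[k] * aj[k] for k in range(6)) for aj in A] for ai in A]
zr = solve(V, r)                       # V^-1 r, shared by all T_k

def glr_stat(f):
    """T_k = (f^T V^-1 r)^2 / (f^T V^-1 f), per Equations 7-30 to 7-32."""
    num = sum(fi * zi for fi, zi in zip(f, zr))
    den = sum(fi * ui for fi, ui in zip(f, solve(V, f)))
    return num * num / den

bias_T = [glr_stat([A[i][j] for i in range(4)]) for j in range(6)]    # f_k = A e_j
leak_T = [glr_stat([1.0 if i == k else 0.0 for i in range(4)]) for k in range(4)]
# bias_T ~ [1.5708, 10.2704, 0.2440, 4.0017, 2.8843, 6.0401]
# leak_T ~ [1.5708, 13.2496, 0.3844, 6.0401]
```

Note how the bias-1/leak-1 and bias-6/leak-4 statistics coincide, reflecting the identical signature vectors discussed above.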

Comparison of the Power of Basic Gross Error Detection Tests

As described in the preceding sections, several statistical tests have been developed for detecting gross errors in measurements caused by biases in the measuring instruments, or gross errors in steady-state conservation constraints due to unknown leaks. In order to obtain the best performance, it is important to apply the test which has the maximum power (the probability of detecting the presence of a gross error when one is actually present) without increasing the probability of Type I error (the probability of wrongly detecting a gross error when none is present).

Thus, an important question that can be asked is which among the above four tests gives the maximum power for detecting a single gross error in the data. This question has not been adequately addressed so far. Most of the works which compare the performance of different gross error detection strategies consider only the overall performance, which includes all the components of detection, identification, and multiple error detection, but do not compare the detection component of the strategy in isolation. We provide some results that partially answer this question.

In making this comparison, we have to consider only the MP test for the constraint and measurement tests, besides the global test and the GLR test. We can further simplify our task by making use of theoretical results that have been derived by Crowe [9] and Narasimhan [14] to show that among the constraint, measurement, and GLR tests, the GLR test has the maximum power to detect a single gross error. The proof of this result follows.

Lemma 7-1: The GLR test is more powerful than an MP measurement test or an MP constraint test, based on any singular or nonsingular linear transformation of the constraint residuals, for detecting a single gross error.

It can be observed from Equations 7-12, 7-15, and 7-17 that the MP measurement test statistics are obtained using a linear transformation of

constraint residuals. If we consider the positive square root of the GLR test statistics (without loss of generality), then we can show that the GLR test statistics are also obtained using a linear transformation of the constraint residuals (see Exercise 7-8).

Therefore, the MP constraint test, the MP measurement test, and the GLR test all derive test statistics based on a linear transformation of the constraint residuals. The question can then be posed as to which linear transformation of the constraint residuals gives the most powerful test. In order to answer this question, we can consider an arbitrary linear transformation of the constraint residuals given by

Let a gross error of magnitude b, either due to a measurement bias or a leak, be present with corresponding gross error vector fk. Then, using Equation 7-22, the expected value of the transformed constraint residuals is obtained as

The covariance matrix of the transformed constraint residuals is given by

A test can then be devised based on the transformed constraint residuals, with test statistics given by

Equation 7-38 can also be written as

It can be easily verified that by choosing Y to be V^-1, A^T V^-1, or F^T V^-1 (where F is a matrix of vectors fk defined by Equation 7-23), respectively, the MP constraint test, the MP measurement test, or the GLR test statistics are obtained.

In order to prove that the GLR test has the maximum power, we have to prove that the maximum among the expected values of the GLR test statistics is greater than or equal to the expected value of any of the test statistics given by Equation 7-39. For this, we first prove that the maximum among the expected values of the GLR test statistics is attained by Tk, that is,

where, from Equations 7-30 and 7-35 through 7-37 with Y = F^T V^-1, the expected values of Tk and Tj are respectively

The above results can be easily established using the Cauchy-Schwartz inequality,

and by defining vectors v and w given by

v = Rfk and w = Rfl (7-44)

where R is a matrix such that R^T R = V^-1.

In order to prove E[√Tk] ≥ E[zi] over all i, we first need to define E[zi], which according to Equations 7-39, 7-36, and 7-37 is

Then, we can again make use of the Cauchy-Schwartz inequality by defining matrices

and identifying the vectors v and w to be

Since the MP constraint and MP measurement tests are obtained using a particular linear transformation of the constraint residuals, based on the above results we can claim that, on average, we can expect the GLR test to give higher power for detecting the presence of a single gross error than either the MP constraint test or the MP measurement test. If we assume that only gross errors due to measurement biases can be present in the system, then the GLR test becomes identical to the MP measurement test.

On the other hand, if we assume that only gross errors which affect a single constraint (for example, leaks in overall flow balance constraints) can be present, then the GLR test becomes identical to the MP constraint test. However, if we allow for both types of gross errors to be present in the system, then the GLR test is more powerful than either of the other two tests. It should be cautioned, though, that an implicit assumption has been made that we precisely know the gross error vectors fk for the different types of gross errors which can occur in the process. This assumption may not be valid if there are uncertainties in the distribution model or gross error model. Moreover, these results are valid only if we assume that, at most, one gross error is present.
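The three transformations discussed above can be sketched numerically. In the following code the network, covariance, and measurements are illustrative assumptions (not the book's example): the statistics of Equation 7-39 are built for Y = V^-1, Y = A^T V^-1, and Y = F^T V^-1, and the stated identities appear directly — the bias columns of F reproduce the MP measurement test, and the leak columns reproduce the MP constraint test.

```python
import numpy as np

# Assumed 3-stream, 2-node network; A, Sigma, and y are illustrative only.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
Sigma = np.diag([0.04, 0.09, 0.04])
V = A @ Sigma @ A.T                       # covariance of residuals r = A y
y = np.array([10.1, 10.6, 10.0])          # a bias deliberately placed on stream 2
r = A @ y

def stats(Y):
    """Standardized statistics z_i^2 for the transformation z = Y r."""
    z = Y @ r
    return z ** 2 / np.diag(Y @ V @ Y.T)

Vinv = np.linalg.inv(V)
mp_constraint = stats(Vinv)               # Y = V^-1
mp_measurement = stats(A.T @ Vinv)        # Y = A^T V^-1
F = np.hstack([A, np.eye(2)])             # bias signatures A e_k, then leak signatures e_j
glr = stats(F.T @ Vinv)                   # Y = F^T V^-1

assert np.allclose(glr[:3], mp_measurement)   # biases-only GLR = MP measurement test
assert np.allclose(glr[3:], mp_constraint)    # leaks-only GLR = MP constraint test
assert int(np.argmax(glr)) == 1               # largest statistic: bias in stream 2
```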

Exercise 7-9. Prove that the MP measurement test statistic zi = (Ti)^1/2 when only gross errors in measurements are allowed.

Exercise 7-10. Prove that the MP constraint test statistic zi = (Ti)^1/2 when only gross errors in constraints which affect a single constraint are allowed. Hint: The gross error vectors for these types of gross errors are the unit vectors ei.

R^T R = V^-1 and P = YR^-1 (7-46)

Page 106: 0884152553


It is now only necessary to examine whether the global test or the GLR test gives higher power to detect the presence of a single gross error. We note from Equation 7-4 that the global test is also based on the constraint residuals, and thus it uses the same information as the GLR test

for detecting gross errors. But there is a fundamental difference in the manner in which this information is processed. The global test performs a single multivariate test for detecting a gross error, whereas the GLR test performs multiple univariate tests, one for each possible gross error hypothesized, to detect if any one of them is present. The question is which one of these processing schemes gives a higher power. This problem has also been studied in the statistical literature. Unfortunately, it is difficult to obtain a unique answer to this question theoretically.

We can perform simulation studies of selected processes to evaluate the power of the two tests. Before attempting such a comparison, however, it must be ensured that both these tests give the same Type I error probability. This implies that the criterion for each test has to be chosen to give a specified value of the Type I error probability. This is possible only in the case of the global test. As explained before, since the GLR test performs multiple univariate tests, the test criterion has to be chosen by trial and error using simulation. The results obtained through such simulation can at best be used to make some broad conclusions and strictly cannot be generalized to all processes.
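The simulation-based choice of the GLR criterion can be sketched as follows. The network and covariance below are assumed for illustration; under the no-gross-error hypothesis, residuals are sampled, the maximum GLR statistic is recorded, and its empirical (1 − α) quantile becomes the test criterion that yields the target Type I error for this process.

```python
import numpy as np

# Assumed process matrices; illustrative stand-ins, not the book's example.
rng = np.random.default_rng(1)
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
Sigma = np.diag([0.04, 0.09, 0.04])
V = A @ Sigma @ A.T
Vinv = np.linalg.inv(V)
F = A                                       # bias signature vectors f_k = A e_k
denom = np.einsum('ki,ij,jk->k', F.T, Vinv, F)   # f_k^T V^-1 f_k for each k

alpha = 0.05
Y = rng.multivariate_normal(np.zeros(3), Sigma, size=20000)  # no gross errors
R = Y @ A.T                                  # simulated residuals, one row each
T = (R @ Vinv @ F) ** 2 / denom              # GLR statistics, shape (20000, 3)
crit = np.quantile(T.max(axis=1), 1 - alpha) # criterion for the max-statistic test
```

Because the maximum of several correlated chi-square statistics is tested, the simulated criterion lands between the single-test chi-square critical value and the conservative Bonferroni bound.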

It is seen from the above discussion that one test does not have uniformly higher power than the other. However, we recommend that for the purpose of detecting whether one or more gross errors are present, the global test (GT) should be used. This recommendation is based on the following considerations:

(1) The computation of the GT statistic is more efficient, since a single test statistic is computed.

(2) In practice, to instill confidence among process operators, it is necessary to keep the false alarm probability below a specified limit. The test criterion for GT can be chosen to precisely obtain this allowable limit. Note that any higher value of the test criterion can satisfy this limit but will result in a lower power. In the case of the GLR test, the lowest test criterion value that satisfies this limit can be chosen only through simulation.

(3) The GLR test requires knowledge about the gross error vectors for the different gross errors that can occur in the process for obtaining the test statistics. The global test does not require any information

regarding the gross errors for the purpose of detection. This may be an important practical consideration since complete knowledge regarding all possible gross errors that can occur in a process is not generally available.

It may be argued that GT is inferior since it can only detect the presence or absence of gross errors, but for identifying the nature and location of a gross error an identification strategy is required. (In the case of the GLR test, the test statistics can be directly used for identification, as described in the next section.) This argument is shown to be without merit from two considerations:

(1) The use of GT for detection does not preclude the use of the GLR test statistics for identifying the type and location of the gross error. In this case, it is necessary to construct the GLR test statistics only if a gross error is detected by the global test.

(2) It is demonstrated in the next section that the identification strategy inherent in the GLR test is the standard serial elimination technique that was proposed and first used by Ripps [1] in combination with the global test for identifying measurement biases. Thus, the GLR test for detecting and identifying a single gross error can be viewed as a gross error detection strategy which has as its components the GT for detection and serial elimination for identification.

GROSS ERROR DETECTION USING PRINCIPAL COMPONENT (PC) TESTS

The variance-covariance matrices of the constraint residuals, V, and of the measurement adjustments, W, are always dense. This implies that even if measurements are independent or weakly correlated, the reconciled data are always strongly correlated. The reconciled values and, hence, the measurement adjustments are correlated because they are related to each other via the process model. The same is true for the constraint residuals.

However, not all basic tests exploit the entire information contained in the matrices V or W. The non-MP constraint test and the univariate measurement test (MP or non-MP) described earlier in this chapter use only the diagonal terms of matrices V or W, respectively. Alternatively, the principal component tests use the entire matrices. It is expected that



such tests will be able to detect more subtle gross errors, since they are multivariate tests. It was found that multivariate tests such as the global test often detect gross errors that are not detected by the univariate tests. This aspect is very important, because failure to detect all gross errors could result in an unsuccessful data reconciliation (the reconciled solution is infeasible or questionable).

Principal component tests are related to the univariate constraint and measurement tests, because they use a linear transformation of the constraint or measurement residual vectors. The following is a brief description of the principal component tests as given in Tong and Crowe [15]. As with the previous tests, we restrict this analysis to linear models with no unmeasured variables. The case with unmeasured variables can be handled with a projection matrix. Two basic types of principal component tests can be derived as follows:

Principal Component Tests for Residuals of Process Constraints

Let us consider a set of linear combinations of vector r

where the columns of Wr are the eigenvectors of V, satisfying

Matrix Λr is diagonal, consisting of the eigenvalues λri, i = 1, ..., q, of V on its diagonal, and satisfies

Matrix Ur consists of the orthonormalized eigenvectors of V, so that

The vector pr consists of principal components of constraint residuals, and its elements are principal component scores.

If gross errors are not present, then r ~ N(0, V) and it can be shown that pr ~ N(0, I). Therefore, a set of correlated variables, r, is transformed into a new set of uncorrelated variables, pr. The principal components are numbered in descending order of the magnitudes of the corresponding eigenvalues.

Exercise 7-11. If r ~ N(0, V), show that pr ~ N(0, I).

On the other hand, Equations 7-48 and 7-49 can be combined and rewritten as

which means that the residual vector r can be uniquely reconstructed from its principal components if all of the principal components are retained, that is, pr ∈ R^m, where m is the number of equations (balance residuals). However, if fewer than m principal components are retained, we get

with pr ∈ R^k and k < m. Equation 7-54 is referred to as the principal component model of vector r. Equation 7-53 indicates that the residuals in the vector r can be decomposed into the contributions from the principal components term and the residuals of the principal component model. This means that, for gross error detection, instead of using statistical tests for r, we can perform hypothesis testing on pr and the principal component model residuals.

Since each element of vector pr is distributed as a standard normal variable, a detection rule similar to the univariate constraint test can be used, and the test for the ith principal component is rejected if |pri| exceeds Z1-β/2. Similar to the univariate tests, to limit the Type I error to level α, β can be chosen as in Equation 7-7, where the exponent in this equation is replaced by the number of retained principal components, k.
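The standardization behind this test can be verified in a few lines. The matrices below are illustrative stand-ins (not the book's example): the scores Λ^(-1/2) U^T r have identity covariance whenever cov(r) = V, so each score can be compared against a standard normal critical value.

```python
import numpy as np

# Assumed balance matrix and measurement covariance; illustrative only.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
Sigma = np.diag([0.04, 0.09, 0.04])
V = A @ Sigma @ A.T                       # covariance of constraint residuals r

lam, U = np.linalg.eigh(V)                # eigenvalues and orthonormal eigenvectors
scale = np.diag(1.0 / np.sqrt(lam))

# Scores p_r = Lambda^(-1/2) U^T r have identity covariance:
cov_scores = scale @ U.T @ V @ U @ scale
assert np.allclose(cov_scores, np.eye(2))

r = A @ np.array([10.1, 10.6, 10.0])      # residuals from sample measurements
p_r = scale @ U.T @ r                     # principal component scores
suspect = np.abs(p_r) > 1.96              # compare each score against Z_{1-beta/2}
```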

Principal Component Tests on Measurement Adjustments

Similar to the principal component test statistics based on constraint residuals, principal component measurement test statistics can be defined as


where the columns of Wa are the eigenvectors of W, and k is the number of retained principal components. In general, k < n, where n is the number of measurements.

Exercise 7-12. If a ~ N(0, W), show that pa ~ N(0, I).

If gross errors are not present, then it can be shown that pa ~ N(0, I); therefore, the principal components of the measurement adjustments are also uncorrelated. Similar to the measurement test, we can conduct a test on every pai by comparing it against a threshold Z1-β/2.

Relationship between Principal Component Tests and Other Statistical Tests

The principal component tests are also based on a linear transformation of the constraint residuals, as in Equation 7-35. It can be verified that the transformation matrix used for deriving the principal component constraint test is Y = Wr^T, and for the principal component measurement test the transformation matrix is Y = Wa^T Σ A^T V^-1. Tong and Crowe [15] implied that, since the number of retained principal components is usually less than the total number of principal components, the modified levels of significance for principal component tests are smaller.

Therefore, we expect in general to reduce the overall Type I error in detecting gross errors by the principal component test. But this argument is without merit, because the Type I error probability for any test can always be reduced by simply choosing a smaller value of α. Furthermore, because principal component tests do not directly identify the gross error, it is possible that the strategy used for identifying the gross error commits additional Type I errors. Some later examples in this chapter illustrate this problem.

Analogous to the global test, a collective global test based on principal components can also be proposed for which the test statistic is defined by

The collective principal component test in Exercise 7-13 is called a truncated chi-square test if not all of the principal components are retained. Another important collective test statistic is defined by

known as the Q statistic or the squared prediction error and, sometimes, the Rao statistic. It can be shown that Qk is a weighted sum of squares of the last m − k principal components.

The two quantities, γk and Qk, are complementary. The former examines the retained and the latter examines the unretained principal components collectively; γk accounts for the amount of variance explained by the principal component model, while Qk accounts for the amount of the variance unexplained. Tests based on these quantities can be conducted to examine whether a gross error is present in the retained or unretained principal components. For more information on collective principal component tests, see Tong and Crowe [15].
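This complementarity can be checked numerically in the standardized score space. In the sketch below (matrices are illustrative assumptions, not the book's example) the global test statistic r^T V^-1 r equals the sum of all squared principal component scores, so the retained part and the remainder always add up to the global statistic.

```python
import numpy as np

# Assumed balance matrix, covariance, and measurements; illustrative only.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
V = A @ np.diag([0.04, 0.09, 0.04]) @ A.T
r = A @ np.array([10.1, 10.6, 10.0])

lam, U = np.linalg.eigh(V)
lam, U = lam[::-1], U[:, ::-1]            # descending eigenvalue order
p = (U.T @ r) / np.sqrt(lam)              # principal component scores

GT = r @ np.linalg.inv(V) @ r             # global test statistic
k = 1                                     # number of retained components
gamma_k = np.sum(p[:k] ** 2)              # explained (retained) part
Q_rest = np.sum(p[k:] ** 2)               # unexplained (unretained) part

assert np.isclose(np.sum(p ** 2), GT)     # all scores together give the GT statistic
assert np.isclose(gamma_k + Q_rest, GT)   # the two collective statistics are complementary
```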

As previously stated, a major difference between the univariate tests and the multivariate chi-square tests is that the former do not take the correlation among the residuals into account and hence tend to be less reliable when correlation increases. However, the GLR test (or MP measurement test) and the MP constraint test do incorporate the correlation by transforming the residuals using the inverse of the covariance matrix. This leads to maximum power for correctly detecting a gross error, over all other tests, but only when there is a single gross error.

When multiple gross errors are present, these tests no longer possess the maximum power. Tong and Crowe [15] indicated that the multivariate principal component tests not only provide better detection of subtle gross errors, but also have more power to correctly identify the variables in error over other tests. Again, this statement was not generally confirmed by an extensive comparison study [16].


Exercise 7-13. Collective Principal Component χ² Tests. Similar to the global test (Equation 7-4), devise a test using the statistic defined by Equation 7-56, which includes the contribution of the k retained principal components of the balance residual vector r.

Example 7-6

Let us apply the principal component tests based on measurement adjustments to the process considered in the preceding examples. The nonzero eigenvalues of the matrix W computed in Example 7-3 are all unity. The matrix Ua, whose columns are the corresponding normalized eigenvectors, is given by

Ua = [ -0.7475   0.1517  -0.0067   0.0287
        0.4077  -0.6881   0.0500  -0.4232
        0.3843  -0.4449  -0.5698   0.2554
       -0.0157   0.4950  -0.6011  -0.0597
        0.0067   0.2523   0.3188  -0.7382
        0.3564   0.0773   0.5578   0.4542 ]

Four principal components were retained in this example. The principal components are computed as [-0.5317, -2.1181, 0.2422, -3.0187]. At the 5% level of significance, the tests for principal components 2 and 4 are rejected, while at the modified levels of significance only the test for the last principal component is rejected.

STATISTICAL TESTS FOR GENERAL STEADY-STATE MODELS

In the preceding sections, the different statistical tests for detecting gross errors were described for the simplest case when all the variables are measured directly. In general, unmeasured variables may be present, and the measurements may be indirectly related to the variables. Narasimhan and Mah [17] described simple transformations by which general steady-state models can be converted to the above simple steady-state model.

Using these transformations, all the statistical tests can be derived as described below.

If unmeasured variables exist, the constraint model is described by

Ax x + Au u = c (7-59)

where x: n × 1 is the vector of measured variables, u: p × 1 is the vector of unmeasured variables, and Au is assumed to be of full column rank, p. As shown in Chapter 3, the unmeasured variables can be eliminated by pre-multiplying the constraints by a projection matrix P: (m−p) × m of rank m−p, where m is the number of constraints, to give the reduced constraints:

P Ax x = P c (7-60)

The constraint residuals for the reduced constraints can be defined exactly analogous to Equation 7-2:

p = P(Ax y − c) (7-61)

One can show that the variance-covariance matrix of vector p is:

Vp = cov(p) = P Ax Σ (P Ax)^T (7-62)
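A projection matrix with the required properties can be built from the left null space of Au. The sketch below uses a made-up system (the matrices are illustrative assumptions): the rows of P are orthonormal vectors satisfying P Au = 0, so the reduced residuals and their covariance Vp follow directly.

```python
import numpy as np

# Assumed system with m = 3 constraints, 3 measured and 1 unmeasured variable.
A_x = np.array([[1.0, -1.0, 0.0],
                [0.0, 1.0, -1.0],
                [0.0, 0.0, 1.0]])
A_u = np.array([[0.0],
                [1.0],
                [-1.0]])                  # full column rank, p = 1
c = np.zeros(3)
Sigma = np.diag([0.04, 0.09, 0.04])

# Left null space of A_u via SVD: the trailing left singular vectors.
U, s, _ = np.linalg.svd(A_u)
P = U[:, A_u.shape[1]:].T                 # (m - p) x m, rows orthonormal
assert np.allclose(P @ A_u, 0.0)          # unmeasured variables are eliminated

y = np.array([10.1, 10.6, 10.0])
p = P @ (A_x @ y - c)                     # reduced constraint residuals (Eq. 7-61)
V_p = P @ A_x @ Sigma @ (P @ A_x).T       # their covariance (Eq. 7-62)
```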

Exercise 7-14. Using the rules of linear transformations in multivariate statistics, prove that the reduced constraint residuals follow a normal distribution with covariance matrix defined by Equation 7-62.

The statistics of the global, constraint, and measurement tests can be obtained by using P Ax, p, and Vp, respectively, for A, r, and V in the appropriate equations. For deriving the GLR test statistics, we note that the gross error signature vectors for biases and leaks are also transformed due to the use of the projection matrix. These transformed signature vectors are given by


fpk = P fk

where fk are given by Equation 7-23. The GLR test statistics are now obtained using Equations 7-30 to 7-33 by substituting fpk, Vp, and p for fk, V, and r, respectively. It can be proved that the GLR test gives MP test statistics for detecting single gross errors even when unmeasured variables are present.

In some cases, the measurements may not be directly related to the variables as in Equation 3-1. An example of this was given in Chapter 2, where the pressure drop measurement is related to the square of the flow rate variable. Another example is the relationship between a pH measurement and the concentration of hydrogen ions, and perhaps the temperature of the process. These relationships are typically nonlinear, but for simplicity we represent them by the following linear equations.

Let us assume that the constraints are given by

Ax = c (7-65)

We define artificial variables xa as

xa = Dx

Then Equation 7-64 becomes

Equations 7-65 and 7-66 can be jointly written as

Equations 7-67 and 7-68 represent an equivalent alternative model of the process in which the variables x are like "unmeasured variables" and the variables xa are like directly measured variables. Therefore, the method described for treating unmeasured variables can now be used to derive the statistics for all tests.

The technique described above can be applied even when the measurements are related to variables by nonlinear equations. However, the resulting modified constraint equations will be nonlinear, and nonlinear data reconciliation and gross error detection techniques have to be used to solve this problem.

TECHNIQUES FOR SINGLE GROSS ERROR IDENTIFICATION

The second component of a gross error detection strategy deals with the problem of correctly identifying the type and location of a gross error which is detected by a test. It should be noted that the identification problem arises only if the detection test rejects the null hypothesis. Not all detection tests described in the preceding section are designed to distinguish between different gross error types. Only the GLR test is suitable for distinguishing between different types of gross errors, because it also uses information regarding the effect of each type of gross error on the process model.

In order to compare the different techniques developed in conjunction with the different tests for gross error identification, and to obtain a good understanding of the interrelationships, we initially restrict our consideration to gross errors caused by biases in measurements. In this section, we also consider only the problem of identifying a single gross error in the measurements. In this case, the identification problem reduces to simply identifying correctly the measurement which contains the gross error.

The techniques for identifying the measurement containing the gross error can be a simple rule or a complex strategy, depending on the test that is used. The measurement test and the GLR test, by virtue of the manner in which they derive the test statistics, use a simple rule to identify the gross error. It has already been pointed out in the preceding section that the GLR test and the MP measurement test are identical if we restrict our consideration to gross errors caused by measurement biases only. In this case, there is a test statistic corresponding to each measurement. The identification rule used in these tests can be stated as follows:


Identify the gross error in the measurement that corresponds to the maximum test statistic exceeding the test criterion.

Because of the simplicity of the above rule, it is commonly stated that the measurement test or GLR test does not require a separate strategy for identifying gross errors. We have deliberately chosen to refer to the above rule as the identification component of these tests, because we demonstrate later in this section that this rule is equivalent to the serial elimination strategy used in conjunction with the global test for identifying a gross error. It should also be noted that if we expect other types of gross errors to occur, such as leaks, then the GLR test, which constructs a test statistic corresponding to each type and location of a gross error, simply extends the above rule to identify the gross error which corresponds to the maximum test statistic [11].

Serial Elimination Strategy for Identifying a Single Gross Error

If we use the global test for detecting the presence of gross errors, a comparatively more complex strategy has to be applied to identify the measurement containing the gross error. Ripps [1] first outlined a procedure which was later studied and refined by Serth and Heenan [18] and Rosenberg et al. [19]. This procedure is known as the serial elimination procedure.

In the serial elimination procedure, each measurement is deleted in turn and the global test statistic is recomputed. By eliminating a measurement, we make the corresponding variable unmeasured. Hence, the global test statistic has to be recomputed using the reduced constraint residuals as explained in the preceding section.

Due to the increase in the number of unmeasured variables, the objective function value, and thus the global test statistic, will decrease. Ripps [1] suggested that the gross error can be identified in that measurement whose deletion leads to the greatest reduction in the objective function value. Although this strategy is called serial elimination, it should strictly be called measurement elimination.

Only in the context of multiple gross error detection described in the next chapter does the implication of serial elimination become clear. Instead of repeatedly solving the data reconciliation problem or computing the projection matrices for deleting each measurement in turn, Crowe [20] derived a simplified expression for the reduction in the data reconciliation objective function value due to the deletion of a measurement i. It is given by

It can readily be verified that the reduction in objective function value due to elimination of measurement i is equal to the GLR test statistic (or the square of the measurement test statistic) for variable i. This implies that if the rule used in conjunction with the global test is to identify the gross error in that measurement which gives the maximum ΔJi, then this is precisely the same rule used in the GLR test (or MP measurement test) for identifying the gross error in the measurement corresponding to the maximum test statistic. In other words, the global test in combination with the serial elimination strategy is equivalent to the GLR test.
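This equivalence can be checked numerically. The sketch below uses an assumed three-stream, two-node network (not the book's example): for each measurement i, the drop in the global test statistic caused by deleting i (making its variable unmeasured and projecting the constraints) equals the GLR statistic for a bias in i.

```python
import numpy as np

# Assumed network, covariance, and measurements; illustrative only.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
Sigma = np.diag([0.04, 0.09, 0.04])
V = A @ Sigma @ A.T
Vinv = np.linalg.inv(V)
y = np.array([10.1, 10.6, 10.0])
r = A @ y
GT = r @ Vinv @ r                          # global test statistic, all measured

def gt_after_deleting(i):
    """Global test statistic after measurement i becomes unmeasured."""
    a_u = A[:, [i]]                        # column of the now-unmeasured variable
    U, s, _ = np.linalg.svd(a_u)
    P = U[:, 1:].T                         # left null space: P @ a_u = 0
    p = P @ r                              # reduced constraint residuals
    Vp = P @ V @ P.T
    return p @ np.linalg.solve(Vp, p)

for i in range(3):
    a = A[:, i]
    T_i = (a @ Vinv @ r) ** 2 / (a @ Vinv @ a)   # GLR (squared measurement test) statistic
    dJ_i = GT - gt_after_deleting(i)             # reduction from deleting measurement i
    assert np.isclose(dJ_i, T_i)                 # Crowe's result: the two coincide
```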

Another interesting and useful result is obtained by interpreting the principle involved in the GLR test from the viewpoint of the data reconciliation objective function value. If we consider Equation 7-28, which defines the GLR test statistic, the two terms within parentheses on the RHS of this equation can be interpreted as the optimal objective values of data reconciliation problems. We have already noted that the first term in this expression is the optimal objective function value for the standard data reconciliation problem (see Exercise 7-1). The second term is the optimal objective function value of the following reconciliation problem, in which the estimate of the gross error in measurement i is also obtained as part of the solution.

Problem P1:

subject to Ax = c

Thus, the GLR test statistic can also be interpreted as the maximum difference between the optimal objective function value of data reconciliation under the assumption that there is no gross error, and the optimal objective function values obtained by solving Problem P1. It should be noted that in Problem P1, all the measurements are retained even if measurement i is assumed to contain a gross error; instead of eliminating measurement i, all measurements are used to obtain an estimate of the gross error. Based on this interpretation and the result given by Equation 7-69, it can be concluded that the optimal objective function values are equal, whether we choose to eliminate the measurement i (hypothesized to contain a gross error), or we choose to retain it and obtain an estimate of the gross error. This result is used in the following chapter to establish equivalence between different multiple gross error identification strategies.

Similar to the global test, the constraint test does not completely identify the type or location of the gross error. Although more informative, since it identifies the node (or equation) in gross error, the nodal test also requires additional identification. Mah et al. [21] developed an algorithm (for a mass flow network case) for identifying the measurements contributing to nodal imbalances. If no measurements are found in error, any significant node imbalance is attributed to a leak or a model error. A major problem with the nodal test is the possibility of error cancellations, which makes it difficult to find the right location of the gross error. Such techniques are described in the next chapter.

Exercise 7-17. Prove that the optimal objective function value of Problem P1 is equal to the optimal objective function value obtained by solving the data reconciliation problem in which measurement i is eliminated and the corresponding variable is treated as unmeasured.

From the results of Examples 7-3 and 7-4, we observe that if we choose to identify a gross error in the measurement corresponding to the maximum test statistic, then the gross error in measurement 2 is correctly identified by both the MP measurement test and the GLR test. In fact, even if we consider gross errors due to leaks, the maximum GLR test statistic corresponds to a bias in measurement 2. The global test also rejects the null hypothesis, and we can use the measurement elimination strategy to identify the location of the gross error. Table 7-2 shows the reduction in the global test statistic when different measurements are eliminated.

Table 7-2
Reduction in GT Statistic for Deletion of Different Measurements

Measurement Eliminated    Reduction in GT Statistic

Since the maximum reduction of the global test statistic is obtained when measurement 2 is eliminated, the measurement elimination procedure along with the GT identified the gross error correctly. This is not surprising, because this procedure is similar to the use of the GLR test. It can be verified that the reduction in the GT statistic due to measurement elimination is identical to the GLR test statistics computed in Example 7-4.

Identifying a Single Gross Error by Principal Component Tests

We can identify the constraints in gross error by inspecting the contribution from the ith residual in r, ri, to a suspect principal component, say prj, which can be calculated by

where wrj is the jth column vector of matrix Wr.

Let us define g = (g1, . . . , gm)^T, and let g' be the same as g except that its elements are sorted in descending order of their absolute values. In general, the contributions of different residuals to the suspect principal component are different and are dominated by the first few elements. These are the major contributors to the suspect principal component. The



major contributors are directly related to the constraints that should also be suspected. The number of major contributors, k, can be chosen as the smallest number for which

(sum from i=1 to k of |g'_i|) / (sum from i=1 to n of |g'_i|) >= 1 - ε          (7-71)

where ε is a prescribed tolerance, such as 0.1.

Note that since the signs of these contributions can be either plus or minus, as can the signs of the elements of w_j and r, the cancellation effect among the elements of g' should be taken into account in identifying the suspect constraints. This is done in Equation 7-71.

Similar to the nodal test, the principal component test on balance residuals only indicates which of the constraint residuals are major contributors to the suspect principal component. An additional strategy is required for identifying the source of the error (leak or measurement bias) and which of the measurements contains gross errors. We can, however, always use a principal component test on measurement adjustments in order to identify a measurement in gross error. That can be done by inspecting the contribution from the jth adjustment in a, say a_j, to a suspect principal component i.

The jth adjustment contribution can be calculated by

g_j = (w_a,i)_j a_j,   j = 1, . . . , n          (7-72)

where w_a,i is the ith eigenvector of W_a and n is the total number of measurements. We can study the contributions by checking the signs and magnitudes of the elements in g. In general, as with the principal component test for balance residuals, the contributions vary and are dominated by a few elements. The identification rule for the principal component measurement test is the following:

Identify the gross error in the measurement that corresponds to the major contributor to the maximum principal component exceeding the test criterion.


Example 7-7

In order to identify the gross error using PC tests, we have to examine the contributions to the rejected principal component. Let us consider only the last principal component, which is rejected at the modified level of significance (see Example 7-6). The contributors (measurement adjustments) to this principal component can be analyzed by computing the vector g (Equation 7-72). This vector is given by [0.0293, -1.1073, -0.1030, 0.0975, -1.0237, -0.9114]. The major contributor to the suspect principal component is measurement adjustment 2, and therefore a gross error is identified in this measurement.
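The contribution analysis in this example is mechanical once g is available. The sketch below takes the g vector quoted above and ranks the contributors; the cumulative cutoff used to pick the number of major contributors is our reading of the tolerance criterion, not a formula taken verbatim from the book:

```python
import numpy as np

# Contribution vector g from Example 7-7 (elementwise product of the
# suspect eigenvector and the measurement adjustments, Equation 7-72).
g = np.array([0.0293, -1.1073, -0.1030, 0.0975, -1.0237, -0.9114])

order = np.argsort(-np.abs(g))        # g': indices sorted by |g_i|, descending
suspect = int(order[0]) + 1           # 1-based measurement number

# Major contributors: smallest k whose absolute contributions cover
# at least (1 - eps) of the total (assumed cutoff rule, eps = 0.1).
eps = 0.1
frac = np.cumsum(np.abs(g[order])) / np.sum(np.abs(g))
k = int(np.searchsorted(frac, 1.0 - eps)) + 1
major = sorted(int(i) + 1 for i in order[:k])

print(suspect)   # -> 2 (measurement adjustment 2, as in the example)
print(major)     # -> [2, 5, 6]
```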

Tong and Crowe [15] carried out an extensive analysis of the principal component tests and outlined some practical guidelines for implementing a gross error detection and identification strategy based on them. Most of their recommendations, such as making use of the collective chi-square tests first, using an accurate variance-covariance matrix of the measurement errors, and using the proper error distribution, are valid for all strategies involving univariate tests.

In the end, they recommend that the PC tests should be used in combination with other statistical tests, since there is no guarantee that such tests will detect all gross errors. They also warn the user about the increased computational time in calculating the eigenvalues and eigenvectors, and in the contribution analysis of the PC test statistics for gross error identification. The bottom line is that the PC tests are effective in certain situations, but they are not generally superior to the basic statistical tests described in this chapter.

DETECTABILITY AND IDENTIFIABILITY OF GROSS ERRORS

We close this chapter with a discussion of two important questions in gross error detection. The first is whether it is possible to detect gross errors in all measurements; the second is whether gross errors in two or more measurements can be distinguished from each other. The concepts of detectability proposed by Madron [22], and identifiability discussed by different researchers [23, 24, 25], are used to answer these questions.

Page 114: 0884152553

Detectability of Gross Errors

Similar to data reconciliation, an essential prerequisite for gross error detection is redundancy in measurements. Theoretically, it is possible to detect gross errors only in redundant measurements. A gross error in a nonredundant measurement cannot be detected. This is due to the fact that a nonredundant measurement is eliminated along with unmeasured variables and does not participate in the reduced reconciliation problem. Hence, no test statistic can be derived for a nonredundant measurement, and a gross error in such a measurement cannot be detected.

In Chapter 3, methods were presented for observability and redundancy classification of process variables. Only redundant measurements are adjusted by reconciliation, and only observable unmeasured variables can be estimated. By adding sensors to measure new variables or by including additional constraints (if available), it is possible to eliminate unobservability and nonredundancy.

Both matrix and graphical approaches can be used for observability and redundancy classification. In practice, however, many redundant variables behave as nonredundant ones. We refer to such measurements as practically nonredundant measurements. Iordache et al. [23], Crowe [20], Madron [22], and Charpentier et al. [25] have reported difficulties in the reconciliation and gross error detection of such measurements. Similarly, even if some unmeasured variables are observable, their estimates may have such high standard deviations that they may be considered as practically unobservable variables.

If a measurement of a redundant variable contains a gross error, then data reconciliation should theoretically make a large adjustment to this measurement in order to obtain an estimate as close as possible to the true value of the variable. In some cases, however, due to the nature of the constraints and the standard deviations of variables, reconciliation may make an insignificant adjustment to the erroneous redundant measurement and instead adjust other fault-free measurements to satisfy the constraints. Such a measurement is not truly redundant even if it is classified as redundant theoretically, and it is difficult to identify the gross error in such measurements.

Madron [22] defines a practically redundant measurement as one whose adjustability is greater than a selected threshold value. This condition is expressed analytically as

a_i = 1 - σ_x̂,i / σ_y,i >= q_r          (7-73)

where a_i is the adjustability, σ_x̂,i is the standard deviation of the reconciled value x̂_i, and σ_y,i is the standard deviation of the measurement error. The critical limit q_r is a value from the interval (0,1). For example, if q_r is chosen as 0.1, all measurements i having a_i < 0.1 are considered practically nonredundant. For such measurements σ_x̂,i / σ_y,i > 0.9 and, therefore, the adjustment made to the measured value is insignificant. The adjustability a_i is also a measure of the improvement in the accuracy of a measured value that can be achieved through data reconciliation.

Charpentier et al. [25] suggested using the ratio

d_i = sqrt(σ_y,i² - σ_x̂,i²) / σ_y,i          (7-74)

for identifying the measurements with weak redundancy. This factor is a measure of the detectability of an error. Since constraint imbalances indicate the existence of gross errors, the detectability of a gross error depends on its contribution to the imbalances of constraints. The contribution of a measurement to a constraint residual depends on the process constraint and the relative accuracy of the measurements (relative standard deviations). The contribution of an error to the constraint imbalances is proportional to the detectability factor. The larger the detectability factor, the more likely the gross error is to be detected. This also implies that if the detectability factor d_i is large, then gross errors of small magnitudes in the corresponding measurement can be detected relatively easily.

A complete practical redundancy analysis is useful in identifying all measured variables with weak redundancy. For linear processes, the standard deviations of the reconciled estimates can be computed analytically as described in Chapter 3, and the adjustability or detectability measures can then be computed. For nonlinear problems, however, these measures can be computed only after solving the reconciliation problem for a given set of measurements and linearizing the constraints around the reconciled estimates. Equation 2-13 can be used to calculate the standard deviation of a reconciled value by a summation rule, as explained in Chapter 2.
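Both measures are straightforward to compute once the standard deviations of the reconciled estimates are known. A minimal sketch (the numerical values below are illustrative):

```python
import numpy as np

def adjustability(sigma_rec, sigma_meas):
    # Equation 7-73: a_i = 1 - sigma_xhat,i / sigma_y,i
    return 1.0 - sigma_rec / sigma_meas

def detectability(sigma_rec, sigma_meas):
    # Equation 7-74: d_i = sqrt(sigma_y,i^2 - sigma_xhat,i^2) / sigma_y,i
    return np.sqrt(sigma_meas**2 - sigma_rec**2) / sigma_meas

# Illustrative values: unit measurement variances and reconciled
# variances of 1/3, as arise for some simple flow networks.
sigma_y = np.ones(3)
sigma_x = np.sqrt(np.full(3, 1.0 / 3.0))

print(np.round(adjustability(sigma_x, sigma_y), 4))   # -> [0.4226 0.4226 0.4226]
print(np.round(detectability(sigma_x, sigma_y), 4))   # -> [0.8165 0.8165 0.8165]
```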

Simulation studies have also been conducted, and variables with the following characteristics have been identified as practically nonredundant:


• Variables with relatively small standard deviations in comparison with the standard deviations of the other measurements belonging to the same balance. This is usually the case with measurements whose order of magnitude is also small relative to the other variables in the same balance equation (for instance, flows of small streams that appear in balances with flows of large streams). The required ratio of error to standard deviation for gross error detection is much larger for variables with small standard deviations than for those with large standard deviations [23].
• Parallel streams (for instance, outlet flows from a splitter that are not constrained by any other balance [23]).
• Flows that appear in an enthalpy balance but not in a mass balance [25]. These are typically pumparound flows that are used in the main column enthalpy balance and associated heat exchanger balances, but are not included in the tower mass balance. An overall mass balance using measured feed and product rates for the entire fractionation is usually chosen in order to avoid the large number of unmeasured flows around the column itself.
• Temperatures of small streams in the same balance with temperatures of large streams (even if the order of magnitude and standard deviation of such temperatures is similar).
• The inlet temperature of the first heat exchanger in the preheat train [25]. It usually appears in only one enthalpy balance, while the following temperatures enjoy extra redundancy by being part of at least two heat balances.

• Measured variables that appear in only one equation with an unmeasured variable which is not constrained by any other balance equation or bounded. A gross error in the measured variable is usually transferred to the unmeasured variable, which has more freedom to adjust.

There is no simple solution for data reconciliation and gross error detection in such variables. Extra constraints and extra instrumentation would certainly help, but that is not always possible. Sometimes artificial "measured" variables can be created from calculated values in order to enhance redundancy [25, 26]. The information about weak redundancy points can be provided to the users of a particular data reconciliation package in order to enable them to recognize the limitations in accuracy of the gross error detection methods and to trigger decisions for improving their instrumentation.

Knowledge of practical variable classification is important information that can be included in gross error detection algorithms. For instance, the detectability factor of a gross error can be used as a tie breaker when more than one measurement shares the same value of the statistical test [16].

Example 7-8

For the process considered in the preceding examples, the covariance matrix of measurement errors is taken as the identity matrix. The covariance matrix for the estimates of all these variables can be computed using Equation 3-10. The diagonal elements of this matrix are the variances of the estimates. For this process, the variances of all the estimates turn out to be equal to 0.3333. Using these values in Equations 7-73 and 7-74, we obtain the adjustability to be equal to 0.4226 and the detectability to be 0.8165 for all variables. This implies that gross errors in all measurements have an equal chance of being detected. On the other hand, if we take the true values of the flow variables to be [100, 99, 1, 99, 1, 100] and assume the measurement error standard deviations to be 1% of the true values, then the adjustability and detectability values for the different measurements are given in Table 7-3.

Table 7-3
Adjustability and Detectability Values for Process in Figure 1-2

Measurement    Error Variance    Adjustability    Detectability

From the results given in the above table, we can conclude that it is relatively more difficult to identify gross errors in the measurements of streams 3 and 5 compared to the others. Note that this can also be inferred from the first observation made in the discussion preceding this example about measurements with small standard deviations. In order to


verify this observation, about 20 simulation trials were made, in each of which a gross error of magnitude 5 to 15 times the standard deviation was simulated in flow measurement 1 and the GLR test was applied for identifying the gross error. Similarly, 20 trials were made with a gross error in measurement 2, and so on for each position of the gross error. The results showed that while gross errors in streams 1, 2, and 6 were identified correctly in all trials, only 60% of the gross errors in stream 3 and 30% of the gross errors in stream 5 were identified correctly. Although the number of trials made is small, the trend of the results corroborates the observations.

Identifiability of Gross Errors

Even if a measurement has a high detectability, it is important to determine if a gross error in this measurement can be identified or distinguished from a gross error in any other measurement. For linear processes, this question can be answered in different ways, which we describe below.

Iordache et al. [23] pointed out that the test statistics of two different measurements are identical if the columns of matrix A corresponding to the two measured variables are proportional to each other. One special case of this occurs when two parallel streams link the same two nodes of a process. This implies that it is not possible to distinguish between gross errors that occur in these measurements. In the context of the GLR test, Narasimhan and Mah [13] indicated that if the signature vectors of two gross errors are proportional, then these cannot be distinguished from each other. If we restrict consideration to measurement biases only, then this observation is the same as the one made by Iordache et al. [23]. By using signature vectors, identifiability problems between different types of gross errors can be discovered.

Recently, Bagajewicz and Jiang [24] proposed the concept of equivalent sets of gross errors. A set of gross errors is equivalent to another set of gross errors if the two sets cannot be distinguished from each other. For the case of measurement biases, Bagajewicz and Jiang [24] proved that if a set of measurements of k variables forms a cycle of the process graph, then gross errors in any combination of k-1 measurements from this set cannot be distinguished from gross errors in any other combination. This can be easily verified if we note that in serial elimination a measurement suspected to contain a gross error is eliminated, making the corresponding variable unmeasured.

Choosing to eliminate any set of k-1 measurements from a cycle of k measurements will automatically make the remaining measurement nonredundant and will eliminate it from the reconciliation problem. This implies that the solution of the reduced reconciliation problem will be the same regardless of which combination of k-1 measurements is eliminated. Thus, it is not possible to identify which set of k-1 measurements from this cycle contains gross errors, and all such sets are equivalent. For the same reason, it is not possible to distinguish gross errors in the measurements of all k variables of a cycle from any set of gross errors in the measurements of k-1 variables of this cycle.

As a special case, if we consider a cycle formed by two streams (parallel streams), then it is not possible to distinguish a gross error in one stream from one in the other. It is also not possible to distinguish whether both of the parallel stream measurements contain gross errors or only one of them does. Furthermore, if the number of independent constraints is equal to m, then all sets of m linearly independent gross errors are also equivalent. This is due to the fact that the reduced reconciliation problem will have no redundancy left, and there is no information available to make any distinction between them.

We refer to equivalent sets of gross errors as belonging to an equivalency class. Equivalency classes can also be obtained in terms of the signature vectors of gross errors, which allows other types of gross errors, such as leaks, to be considered as well. The following principle can be derived:

If the signature vectors for a set of k gross errors form a linearly dependent set of rank k-1, then it is not possible to theoretically distinguish between one combination of k-1 gross errors and any other combination of k-1 gross errors chosen from this set. It is also not possible to distinguish whether k gross errors or k-1 gross errors from this set are present in the process. (It is possible, however, to distinguish a combination of fewer than k-1 gross errors from other combinations.) As a special case, if the maximum number of independent signature vectors is m, then any set of m gross errors with linearly independent signature vectors is equivalent to any other such set.

Example 7-9

The process graph of the flow process considered in the preceding examples is shown in Figure 3-1. The following three cycles can be identified in this graph:



Cycle 1, consisting of streams 2, 3, 4, and 5
Cycle 2, consisting of streams 1, 2, 4, and 6
Cycle 3, consisting of streams 1, 3, 5, and 6

Thus, the following equivalency classes of sets of biases are obtained:

Class 1. [2, 3, 4]; [2, 3, 5]; [2, 4, 5]; [3, 4, 5]; and [2, 3, 4, 5]
Class 2. [1, 2, 4]; [1, 2, 6]; [1, 4, 6]; [2, 4, 6]; and [1, 2, 4, 6]
Class 3. [1, 3, 5]; [1, 3, 6]; [1, 5, 6]; [3, 5, 6]; and [1, 3, 5, 6]
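The construction of these classes from the cycles is mechanical: for a cycle of k streams, the class consists of every (k-1)-subset together with the full set. A small sketch (the helper name is ours) that reproduces the classes above:

```python
from itertools import combinations

def equivalency_class(cycle):
    """Equivalent bias sets for a cycle of k streams: all (k-1)-subsets
    of the cycle plus the full set of k streams."""
    k = len(cycle)
    sets = [frozenset(c) for c in combinations(sorted(cycle), k - 1)]
    sets.append(frozenset(cycle))
    return sets

cycles = {1: (2, 3, 4, 5), 2: (1, 2, 4, 6), 3: (1, 3, 5, 6)}
for label, cyc in cycles.items():
    print(f"Class {label}:", [sorted(s) for s in equivalency_class(cyc)])
# Class 1: [[2, 3, 4], [2, 3, 5], [2, 4, 5], [3, 4, 5], [2, 3, 4, 5]]
```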

If we additionally consider, say, a process leak in node 1 (the splitter), then we use the signature vectors to identify equivalent sets. The signature vectors for measurement biases are the columns of matrix A. The signature vector for a leak in node 1 is the first column of matrix A, which is identical to that for a bias in measurement 1. Thus, a leak in node 1 cannot be distinguished from a bias in measurement 1. In addition, we also obtain the same equivalent sets as obtained using cycles of a graph, because the signature vectors for measurement biases in streams 2, 3, 4, and 5 are linearly dependent with rank 3, and so on.

It should also be kept in mind that if a set G of gross errors contains a subset of gross errors, say g_C, belonging to an equivalency class C, then other equivalent sets of G can be obtained by replacing g_C with other sets which belong to C. Thus, for example, we can derive the equivalent sets for the combination [1, 2, 3, 4] by replacing the subset [2, 3, 4] by other sets of Class 1. Similarly, we can replace [1, 2, 4] by other sets of Class 2. Thus, we obtain another equivalency class given by

Class 4. [1, 2, 3, 4]; [1, 2, 3, 5]; [1, 2, 4, 5]; [1, 3, 4, 5]; [1, 2, 3, 6]; [1, 3, 4, 6]; [2, 3, 4, 6]; [1, 2, 5, 6]; [2, 3, 5, 6]; [1, 4, 5, 6]; [2, 4, 5, 6]; [3, 4, 5, 6]

The last five sets are added to Class 4 because they are equivalent to the sets [1, 2, 3, 5]; [1, 2, 4, 5]; and [1, 3, 4, 5], which belong to Class 4. Equivalency Class 4 can also be generated using the fact that this process has 4 independent constraints and all sets of 4 gross errors with linearly independent signature vectors are equivalent.

Although identifiability problems can occur in linear processes, in general, this is not a problem in nonlinear processes. If nonlinear constraints are linearized around the reconciled estimates, it is highly unlikely that the columns of the linearized constraint matrix will become dependent. Even if this occurs, it has to be interpreted as a numerical problem rather than as an identifiability problem.

PROPOSED PROBLEMS

NOTE: The proposed problems that are included in this chapter require more extensive calculations. A computer program or a mathematical tool such as MATLAB is required in order to obtain solutions to these problems.

Problem 7-1. A mass flow network from Rosenberg et al. [19] is represented in Figure 7-1. The true mass flow rates (in lb/sec), in stream order, are given by the vector: [15 15 25 10 5 10 10 5 5 5 10 5 5 10 10 10]. All mass flow rates are considered measured. The standard deviation for each measurement is 2% of the measured value. Following the procedure explained at the end of Chapter 3, simulate random measured values and find the reconciliation solution. Next, simulate a single gross error (bias or leak) and

Figure 7-1. Mass flow network for Problem 7-1. Reprinted with permission from [19]. Copyright © 1987 American Chemical Society.


apply appropriate statistical tests (at the α = 0.05 level of significance) for gross error detection and identification. For a more extensive study, simulate gross errors of various magnitudes and in different locations. Also, calculate the detectability factors by Equation 7-74 and explain why some gross errors can be detected and correctly identified, while others cannot.
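For a linear problem, the simulate-and-reconcile loop asked for here has a closed form: x̂ = y - Σ Aᵀ (A Σ Aᵀ)⁻¹ A y. The sketch below uses a small illustrative network, not the 16-stream network of Figure 7-1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative serial network: stream 1 -> node A -> stream 2 -> node B -> stream 3.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
x_true = np.array([25.0, 25.0, 25.0])
std = 0.02 * x_true                        # 2% standard deviations
sigma = np.diag(std**2)

# Simulate measurements and add a 10-sigma bias to stream 2.
y = x_true + rng.normal(0.0, std)
y[1] += 10.0 * std[1]

# Linear data reconciliation: x_hat = y - Sigma A' (A Sigma A')^-1 A y.
V = A @ sigma @ A.T
Vinv = np.linalg.inv(V)
r = A @ y
x_hat = y - sigma @ A.T @ Vinv @ r

# Global test plus per-measurement GLR statistics for identification.
gt = r @ Vinv @ r
glr = np.array([(A[:, j] @ Vinv @ r) ** 2 / (A[:, j] @ Vinv @ A[:, j])
                for j in range(3)])
print(np.allclose(A @ x_hat, 0.0, atol=1e-8))  # reconciled values satisfy the balances
print(int(np.argmax(glr)) + 1)                 # most likely biased stream
```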

Problem 7-2. The steam-metering system for a methanol synthesis unit [18] is represented in Figure 7-2. The correct values of the steam flow rates are listed in Table 7-4. The values in Table 7-4 are 8-hour averages of the plant data, except that they have been adjusted to balance the system. All flow rates are considered measured. The standard deviation for each measurement is 2.5% of the measured value. Repeat the operations indicated in Problem 7-1, and explain the behavior of the various statistical tests for gross errors simulated in streams with different detectability factors. Choose the level of significance α = 0.05 for all statistical tests.

Figure 7-2. Steam metering system of a methanol synthesis plant [18]. Reproduced with permission of the American Institute of Chemical Engineers. Copyright © 1986 AIChE. All rights reserved.

Table 7-4
Correct Values of Flow Rates for the Steam System of Figure 7-2

Stream No.    Flow Rate (1,000 kg/h)    Stream No.    Flow Rate (1,000 kg/h)
1             0.86                      15            60.00
2             1.16                      16            21.64
3             111.82                    17            32.73
4             109.95                    18            16.21
5             53.27                     19            7.95
6             112.27                    20            10.50
7             2.32                      21            87.27
8             164.05                    22            5.45
9             0.86                      23            2.59
10            52.41                     24            46.64
11            14.86                     25            85.45
12            67.27                     26            81.32
13            111.27                    27            70.77
14            91.86                     28            72.23

Problem 7-3. A simplified diagram for an ammonia synthesis process from Crowe et al. [11] is represented in Figure 7-3. By using Crowe's projection matrix method, all unmeasured variables have been eliminated and a reduced model involving only measured variables is obtained. The constraint matrix for the reduced model is:

and the measured values for the measured component flow rates are indicated in Table 7-5. A nondiagonal variance-covariance matrix for the measurement errors was used for this problem, as follows:


Figure 7-3. Flow diagram for a simplified ammonia plant [11]. Reproduced with permission of the American Institute of Chemical Engineers. Copyright © 1983 AIChE. All rights reserved.

At least one gross error exists in the given measured data. Apply various statistical tests at level α = 0.05 to find the most likely location of the gross error.

Table 7-5
Measured Values for the Ammonia System in Figure 7-3

Species    Measured Flows (mol/s)
H2         117
N2         84
Ar         4
?          101
?          20.7
?          69
NH3        62
H2O        205

Figure 7-4. Flow diagram for a chemical extraction plant [27]. Reproduced with permission of the Canadian Society for Chemical Engineering.

Problem 7-4. A flow sheet for a chemical extraction plant from Holly et al. [27] is represented in Figure 7-4. The measured flow rates, in lb/hr, averaged over a 13-hour period, are given in Table 7-6. After eliminating the unmeasured flows by a projection matrix, the following reduced model is obtained:

Page 120: 0884152553


The variances for the measurement errors are also given in Table 7-6. Apply the global test and the measurement test (both at α = 0.05) to identify the measurements suspected of gross error. Eliminate the suspected gross errors (one at a time) and recalculate the statistical tests. Which measurement is more likely to contain a gross error? Repeat the problem with the nondiagonal covariance matrix given by Holly et al. [27]. Explain any difference from the results with the diagonal variance-covariance matrix.


Table 7-6
Measured Values and Variances for the Chemical Extraction Process in Figure 7-4

Stream    Variance
1         8.35E+05
2         1.07E+07
3         1.66E+07
4         4.95E+07
5         1.09E+07
6         6.3855
7         6.77E+07
8         1.05E+08
9         4.70E+06
10        0.4301
11        41096
12        8.27E+06
13        9.49E+05
14        23680
15        771.5
16        6.547
17        8.50E+07
18        7438
19        8371
20        4001
21        8.27E+05

SUMMARY

• Any gross error strategy needs to detect and also identify the location of gross errors.
• There are two types of errors associated with any statistical test: a Type I error (when the test detects a nonexistent error) and a Type II error (when the test fails to detect an existent error).
• Only the measurement test and the GLR test can directly identify the location of gross errors (by a simple identification rule).
• The GLR test is the only test which can identify both measurement biases and leaks by the same type of test.
• The gross error detection strategy by the GLR test involves estimation of the magnitudes of gross errors.
• Maximum power tests can be derived for the measurement test and for the nodal (constraint) tests, but the GLR test is more powerful than both of them for the single gross error case.
• The power of the GLR test is the same as that of the measurement test for a single measurement bias. Equivalently, the GLR test statistic is equivalent to the measurement test statistic for a single measurement bias.
• A principal component test is based on linear combinations of the eigenvectors of the variance-covariance matrix of constraint residuals or measurement adjustments.
• A principal component test cannot directly identify the location of a gross error. It requires additional analysis in order to find the major constraints or measurements contributing to the principal components that failed the test.
• Serial elimination can be used to identify gross errors detected by the global test.
• The reduction in the global test statistic after elimination of a measurement is equal to the GLR test statistic.
• The detectability of a gross error depends mainly on its magnitude and its location. Some gross errors can be detected, but not always properly identified.


REFERENCES

1. Ripps, D. L. "Adjustment of Experimental Data." Chem. Eng. Progress Symp. Series 61 (1965): 8-13.

2. Almasy, G. A., and T. Sztano. "Checking and Correction of Measurements on the Basis of Linear System Model." Problems of Control and Information Theory 4 (1975): 57-69.

3. Madron, F. "A New Approach to the Identification of Gross Errors in Chemical Engineering Measurements." Chem. Eng. Sci. 40 (1985): 1855-1860.

4. Reilly, P. M., and R. E. Carpani. "Application of Statistical Theory of Adjustment to Material Balances," presented at the 13th Canadian Chem. Eng. Conference, Ottawa, Canada, 1963.

5. Mah, R.S.H., G. M. Stanley, and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chem. Proc. Des. Dev. 15 (1976): 175-183.

6. Mah, R.S.H., and A. C. Tamhane. "Detection of Gross Errors in Process Data." AIChE Journal 28 (1982): 828-830.

7. Sidak, Z. "Rectangular Confidence Regions for the Means of Multivariate Normal Distributions." Journal of Amer. Statis. Assoc. 62 (1967): 626-633.

8. Rollins, D. K., and J. F. Davis. "Unbiased Estimation of Gross Errors in Process Measurements." AIChE Journal 38 (1992): 563-572.

9. Crowe, C. M. "Test of Maximum Power for Detection of Gross Errors in Process Constraints." AIChE Journal 35 (1989): 869-872.

10. Tamhane, A. C. "A Note on the Use of Residuals for Detecting an Outlier in Linear Regression." Biometrika 69 (1982): 488-489.

11. Crowe, C. M., Y. A. Garcia Campos, and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection. Part I: Linear Case." AIChE Journal 29 (1983): 881-888.

12. Tamhane, A. C., and R.S.H. Mah. "Data Reconciliation and Gross Error Detection in Chemical Process Networks." Technometrics 27 (1985): 409-422.

13. Narasimhan, S., and R.S.H. Mah. "Generalized Likelihood Ratio Method for Gross Error Identification." AIChE Journal 33 (1987): 1514-1521.

14. Narasimhan, S. "Maximum Power Tests for Gross Error Detection Using Likelihood Ratios." AIChE Journal 36 (1990): 1589-1591.

15. Tong, H., and C. M. Crowe. "Detection of Gross Errors in Data Reconciliation by Principal Component Analysis." AIChE Journal 41 (1995): 1712-1723.

16. Jordache, C., and B. Tilton. "Gross Error Detection by Serial Elimination: Principal Component Measurement Test versus Univariate Measurement Test," presented at the AIChE Spring National Meeting, Houston, Tex., March 1999.

17. Narasimhan, S., and R.S.H. Mah. "Treatment of General Steady State Models in Gross Error Detection." Computers & Chem. Engng. 13 (1989): 851-853.

18. Serth, R. W., and W. A. Heenan. "Gross Error Detection and Data Reconciliation in Steam-Metering Systems." AIChE Journal 32 (1986): 733-742.

19. Rosenberg, J., R.S.H. Mah, and C. Iordache. "Evaluation of Schemes for Detecting and Identification of Gross Errors in Process Data." Ind. & Eng. Chem. Proc. Des. Dev. 26 (1987): 555-564.

20. Crowe, C. M. "Recursive Identification of Gross Errors in Linear Data Reconciliation." AIChE Journal 34 (1988): 541-550.

21. Mah, R.S.H., G. M. Stanley, and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chem. Proc. Des. Dev. 15 (1976): 175-183.

22. Madron, F. Process Plant Performance: Measurement and Data Processing for Optimization and Retrofits. Chichester, West Sussex, England: Ellis Horwood Limited, 1992.

23. Iordache, C., R.S.H. Mah, and A. C. Tamhane. "Performance Studies of the Measurement Test for Detecting Gross Errors in Process Data." AIChE Journal 31 (1985): 1187-1201.

24. Bagajewicz, M. J., and Q. Jiang. "Gross Error Modeling and Detection in Plant Linear Dynamic Reconciliation." Computers & Chem. Engng. 22 (1998): 1789-1809.

25. Charpentier, V., J. J. Chang, G. M. Schwenter, and K. C. Bardin. "An On-line Data Reconciliation System for Crude and Vacuum Units," presented at the NPRA Computer Conf., Houston, Tex., 1991.

26. Kneile, R. "Wring More Information out of Plant Data." Chem. Engng. (Mar. 1995): 110-116.

27. Holly, W., R. Cook, and C. M. Crowe. "Reconciliation of Mass Flow Rate Measurements in a Chemical Extraction Plant." Can. J. of Chem. Eng. 67 (Aug. 1989): 595-601.


Multiple Gross Error Identification Strategies for Steady-State Processes

In the preceding chapter, the different statistical tests for detecting the presence of gross errors and the methods for identifying a single gross error in the data were described. For a well-maintained plant, we should generally not expect more than one gross error to be present in the data. Therefore, a fundamental prerequisite of any gross error detection strategy is that it should have good ability to detect and identify correctly a single gross error.

If data reconciliation is applied to a large subsystem consisting of many measurements, however, or if the sensors are operating in a hostile environment, and/or plant maintenance procedures are inadequate, then it is possible for several gross errors to be simultaneously present in the data. Thus, there is a need to add a third component to the gross error detection strategy which provides the capability to detect and identify multiple gross errors.

Generally in the research literature, a gross error detection strategy is presented as a single entity without clearly distinguishing the different components of detection, identification of a single gross error, and identification of multiple gross errors. As mentioned in the preceding chapter, we have chosen to separately analyze the three different components used in a gross error detection strategy to gain a better insight into the similarities and differences between the various methods proposed. Our main focus in this chapter is the component of the gross error detection strategy that deals with multiple gross error identification.

Multiple gross error identification strategies have been proposed by different researchers over the past four decades and it is not our aim to describe in detail every strategy in this chapter. In order to gain some perspective, we have attempted to classify these strategies into different classes depending on the core principle on which they are based. Within each of these categories we have chosen to describe one of the strategies in detail, depending on ease of description, and then indicate the different variants of this strategy proposed by other researchers.

All the techniques developed for multiple gross error identification can be broadly classified as either simultaneous strategies or serial strategies. They may also differ in the type of information exploited for identification. For example, some of the strategies also make use of information about lower and upper bounds on the variables to enhance the identification process. Lastly, as we have noted in the methods used for single gross error identification, not all strategies are designed to distinguish between different types of faults. For ease of comparison and description, we will restrict our considerations to the identification of gross errors caused by sensor biases, and indicate wherever pertinent the extension that can be made to include other types of gross errors. Linear systems are initially considered, followed by the treatment of nonlinear processes.

STRATEGIES FOR MULTIPLE GROSS ERROR IDENTIFICATION IN LINEAR PROCESSES

Simultaneous Strategies

Identification Using Single Gross Error Test Statistics

Simultaneous strategies for multiple gross error identification attempt to identify all gross errors present in the data simultaneously or in a single iteration. In the case of the measurement test or GLR test, this strategy is a simple extension of the identification rule used for identifying a single gross error and can be stated as follows:

Identify gross errors in all measurements whose corresponding test statistics exceed the test criterion.

In the case of the GLR test, the above rule can be easily extended to identify other types of gross errors by making use of the corresponding


test statistics. The effectiveness of this rule was investigated by Serth and Heenan [1] and was generally found to result in too many mispredictions. The main reason why the above rule does not work well is due to what has been referred to as the smearing effect.

Since the variables are all related through the constraints and the constraint residuals are used in deriving the test statistics, a gross error in one measurement may cause the test statistics of good measurements to exceed the test criterion but not the one which contains the gross error. In other cases, the test statistic of the measurement containing the gross error and those of other measurements can simultaneously exceed the test criterion. The degree of smearing depends on many factors, such as the level of redundancy, differences in standard deviations of measurement errors, and magnitude of the gross error [2].

Identification Using Combinatorial Hypotheses

Another simultaneous gross error identification strategy is based on explicitly postulating all the alternative hypotheses not only for a single gross error but also for the different combinations of two or more gross errors in the data. A test statistic can be derived for each of these alternatives and the most probable one chosen. This strategy is especially suited for application along with the GLR test. It can be recalled from the preceding chapter that the GLR test considers all possible single gross error hypotheses as part of the alternative hypothesis. We can extend this to include hypotheses of all possible combinations of two or more gross errors. In other words, the hypotheses can be formulated as follows.

H0 (no gross error in the data):

E[r] = 0

H1 (composed of alternatives H_i, H_i1,i2, . . ., H_i1,i2,...,ik, . . .)

where

H_i (single gross error alternatives):

E[r] = b f_i

H_i1,i2 (two gross error combination alternatives):

E[r] = b1 f_i1 + b2 f_i2

H_i1,i2,...,ik (k gross error combination alternatives):

E[r] = b1 f_i1 + b2 f_i2 + . . . + bk f_ik        (8-1)

where the indices i1, i2, and so on are chosen to exhaustively consider all possible combinations of gross errors. Thus, if t_max is the maximum number of gross errors considered to be simultaneously present in the data, then 2^t_max alternatives are present in the composite alternative hypothesis. Corresponding to each of these alternatives, the GLR test statistic can be derived. In general, let us consider the alternative hypothesis of k gross errors in the data corresponding to the gross error vectors f_i1, f_i2, . . ., f_ik. The expected values of the constraint residuals under this alternative hypothesis can be written as

E[r] = F_k b        (8-2)

where b is a column vector of unknown magnitudes of the gross errors and F_k is a matrix whose columns are the gross error vectors f_i1, f_i2, . . ., f_ik corresponding to the k gross errors hypothesized (see Chapter 7). We can obtain the likelihood ratio for this hypothesis and, following the same procedure as in deriving the test statistic T_k in Equation 7-30, we can obtain the test statistic for this hypothesis as

T_k = r^T V^{-1} F_k [F_k^T V^{-1} F_k]^{-1} F_k^T V^{-1} r        (8-3)

The maximum likelihood estimates of the corresponding gross error magnitudes are given by

b = [F_k^T V^{-1} F_k]^{-1} F_k^T V^{-1} r        (8-4)

In order to determine the number and type of gross errors, however, we cannot apply the simple rule of choosing the maximum test statistic among all alternative hypotheses. This is due to the fact that all the test statistics do not follow the same distribution because of the differences in the number of degrees of freedom. The test statistic for k gross errors given by Equation 8-3 can be shown to follow a central chi-square distribution with k degrees of freedom under the null hypothesis. In order to choose the most probable hypothesis, we can compute the Type I error probabilities for each of the test statistics, given by

beta_k = Pr(chi2_k >= T_k)        (8-5)

where chi2_k is the random variable which follows a central chi-square distribution with k degrees of freedom. We can now choose the minimum


among the Type I error probabilities given by Equation 8-5. If this is less than the modified level of significance beta for a chosen allowable probability of Type I error alpha, then we can conclude that gross errors are present, and the hypothesis corresponding to the minimum Type I error probability gives the number and type of gross errors present. Furthermore, the estimates of the gross errors are given by Equation 8-4. It can be easily verified that this method is equivalent to the choice of the maximum likelihood ratio test statistic for the case of identifying a single gross error described in the preceding chapter.

It must be noted that in any case it is not possible to detect and identify more gross errors than the number of constraint equations; that is, t_max can at most be equal to m. This is because for every gross error hypothesized, either the corresponding measurement is eliminated or an extra unknown parameter corresponding to the magnitude of the gross error has to be estimated, which reduces the degree of redundancy by one.

Furthermore, due to gross error identifiability problems as discussed in the preceding chapter, only one combination of gross errors belonging to each equivalency class needs to be considered in the alternative hypothesis. For example, in the case of two parallel streams, only one of the hypotheses for a gross error in one of these stream flow measurements has to be considered. Similarly, it is not possible to simultaneously identify gross errors in flow measurements of streams forming a cycle of the process. In other words, only those combinations of gross errors can be considered whose gross error signature vectors are linearly independent, so that the matrix F_k will be of full column rank and the inverse of the matrix F_k^T V^{-1} F_k in Equation 8-4 is guaranteed to exist.
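The combinatorial hypothesis testing of Equations 8-1 through 8-5, together with the linear independence screening just described, can be sketched in a few lines of numpy. This is purely our illustrative code, not from the book: the function names, the chi-square survival helper, and the rank-based screening are all assumptions.

```python
import itertools
import math

import numpy as np

def chi2_sf(x, k):
    # P(chi-square with k dof >= x) for integer k, via the standard recurrence
    # Q_{k+2}(x) = Q_k(x) + (x/2)^(k/2) exp(-x/2) / Gamma(k/2 + 1)
    q = math.erfc(math.sqrt(x / 2.0)) if k % 2 else math.exp(-x / 2.0)
    j = 1 if k % 2 else 2
    while j < k:
        q += (x / 2.0) ** (j / 2.0) * math.exp(-x / 2.0) / math.gamma(j / 2.0 + 1.0)
        j += 2
    return q

def glr_combination_test(r, V, f_cols, t_max):
    """Evaluate the GLR statistic (Eq. 8-3), the magnitude estimates
    (Eq. 8-4), and the Type I error probability (Eq. 8-5) for every
    combination of up to t_max gross errors whose signature vectors are
    linearly independent; return the most probable hypothesis."""
    Vinv = np.linalg.inv(V)
    results = []
    for k in range(1, t_max + 1):
        for combo in itertools.combinations(range(len(f_cols)), k):
            Fk = np.column_stack([f_cols[i] for i in combo])
            if np.linalg.matrix_rank(Fk) < k:
                continue  # dependent signatures: member of an equivalency class
            b_hat = np.linalg.solve(Fk.T @ Vinv @ Fk, Fk.T @ Vinv @ r)  # Eq. 8-4
            Tk = float(r @ Vinv @ Fk @ b_hat)                           # Eq. 8-3
            results.append((combo, Tk, chi2_sf(Tk, k), b_hat))
    # the hypothesis with the minimum Type I error probability
    return min(results, key=lambda res: res[2])
```

For a small flow network with signature vectors taken as the columns of the incidence matrix, a single exact bias reproduces itself as the winning single-error hypothesis with the correct estimated magnitude.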

It was proved in the preceding chapter that for a single gross error the GLR test is equivalent to the use of the global test combined with a measurement elimination strategy. This result is valid even when multiple measurements are simultaneously eliminated. In other words, the GLR test statistic given by Equation 8-3 for the hypothesis of gross errors being simultaneously present in k measurements is identical to the reduction in the global test statistic due to elimination of the k measurements suspected of containing gross errors. We describe below a strategy proposed by Rosenberg et al. [3] based on the global test which is essentially based on this principle.

The strategy proposed by Rosenberg et al. [3] makes use of the global test along with elimination of measurements. Corresponding to each alternative hypothesis of Equation 8-1, the measurements which are suspected to contain gross errors are eliminated and the global test statistic is computed. Using these statistics, gross errors in measurements corresponding to the most probable hypothesis (the lowest Type I error probability) are identified. Instead of comparing all the alternative hypotheses simultaneously, however, Rosenberg et al. [3] considered in sequence single gross error alternatives, followed by two gross error alternatives and so on, in increasing order of the number of gross errors hypothesized.

If at any stage in this sequence the global test statistic computed after eliminating the suspect measurements is found to be less than the critical value at the alpha level of significance drawn from the appropriate chi-square distribution, then the procedure is terminated. This approach therefore attempts to identify as few gross errors as necessary to accept the null hypothesis that no more gross errors are present in the remaining measurements. Rosenberg et al. [3] performed simulation studies to compare the performance of this strategy with other strategies. In particular, they found that this simultaneous strategy performs better than the simple simultaneous strategy based on measurement tests described in the preceding section. However, their comparison was limited to cases where only one gross error was present in the measurements.

The simultaneous strategy described above has not been used much due to the combinatorial increase in the number of alternative hypotheses to be tested, for each of which a statistic has to be computed, leading to excessive computational burden. However, the speed and power of computers have been increasing rapidly, and it is worthwhile to study whether the simultaneous strategy described above gives better performance than serial strategies and has acceptable computational requirements.
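A rough sketch of this sequential global-test procedure (our own code and naming, under the same numpy assumptions as before): the reduced global test after eliminating a set of suspect measurements is obtained here by subtracting the best-fit gross error contribution from the residuals, which by the equivalence noted above gives the same statistic as actually projecting the measurements out.

```python
import itertools
import math

import numpy as np

def chi2_sf(x, k):
    # chi-square survival function for integer dof (cf. Eq. 8-5)
    q = math.erfc(math.sqrt(x / 2.0)) if k % 2 else math.exp(-x / 2.0)
    j = 1 if k % 2 else 2
    while j < k:
        q += (x / 2.0) ** (j / 2.0) * math.exp(-x / 2.0) / math.gamma(j / 2.0 + 1.0)
        j += 2
    return q

def chi2_crit(dof, alpha):
    # critical value by bisection on the survival function
    lo, hi = 0.0, 200.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if chi2_sf(mid, dof) > alpha else (lo, mid)
    return 0.5 * (lo + hi)

def sequential_global_test(r, V, f_cols, alpha=0.05):
    """Test hypotheses of 0, 1, 2, ... gross errors in increasing order;
    stop as soon as the reduced global test accepts the null hypothesis."""
    m = len(r)
    Vinv = np.linalg.inv(V)
    if float(r @ Vinv @ r) <= chi2_crit(m, alpha):
        return ()  # no gross error detected
    for c in range(1, m):
        best = None
        for combo in itertools.combinations(range(len(f_cols)), c):
            Fc = np.column_stack([f_cols[i] for i in combo])
            if np.linalg.matrix_rank(Fc) < c:
                continue  # dependent signature vectors
            b = np.linalg.solve(Fc.T @ Vinv @ Fc, Fc.T @ Vinv @ r)
            red = r - Fc @ b  # residuals after eliminating the suspects
            gt_red = float(red @ Vinv @ red)
            if best is None or gt_red < best[1]:
                best = (combo, gt_red)
        if best is not None and best[1] <= chi2_crit(m - c, alpha):
            return best[0]
    return None  # every measurement is suspect
```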

Identification Using Simultaneous Estimation of Gross Error Magnitudes

The method due to Rosenberg et al. [3] implicitly assumes that a minimum number of gross errors are present in the system, because it considers hypotheses of 2 or more gross errors only if the hypothesis of one gross error is not statistically acceptable. This is also true of all serial


strategies that we will discuss later in this chapter. Such methods generally do not perform well when many gross errors are present. An alternative strategy is to initially presume as many gross errors to be present in the data as can be identified and then to discard some of these possibilities based on additional criteria. Rollins and Davis [4] developed a simultaneous strategy called the unbiased estimation technique (UBET) which is based on this principle.

It has been pointed out earlier that the maximum number of gross errors that can be identified is equal to the number of process constraints, m. Moreover, the signature vectors for these gross errors must be linearly independent in order that their magnitudes can be uniquely estimated. In the UBET method, it is initially assumed that m gross errors (whose signature vectors are linearly independent) are present. Furthermore, it is assumed that the types and locations of these candidate gross errors are specified. (We will describe later methods that can be used to choose this initial candidate list of gross errors using the basic statistical tests.)

Let F be an m x m matrix whose columns are the signature vectors of the assumed gross errors. The magnitudes of these gross errors can be simultaneously estimated using F instead of F_k in Equation 8-4. The estimated magnitudes of gross errors will be unbiased if all these gross errors are actually present; hence, the name UBET for this method.

In order to decide whether an assumed gross error is actually present, we can test the hypothesis that the magnitude of the assumed gross error is equal to zero or not. The estimated magnitudes of gross errors can be used for this purpose. If no gross errors are present, then it can be proved that the estimate b_i is normally distributed with mean zero and variance d_ii, which are the diagonal elements of matrix D, where D = [F^T V^{-1} F]^{-1}. We can conclude that the magnitude of gross error i is non-zero if |b_i / sqrt(d_ii)| exceeds Z_{1-alpha/2}, where alpha is a chosen level of significance; otherwise, we can conclude that gross error i is not present.

In order to select an initial candidate set of m gross errors, we can make use of the basic statistical tests described in Chapter 7. For example, one simple method is to compute the GLR test statistics for each gross error using Equation 7-30 and to pick the first m gross errors with the largest statistics, after taking into account equivalency considerations to ensure that the signature vectors of selected gross errors are linearly independent. Rollins and Davis [4] made use of the nodal test in order to select an initial candidate set of gross errors, but this requires nodal tests to be performed on combinations of nodes as well.
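The UBET magnitude test just described can be sketched as follows (a minimal numpy illustration of our own; the candidate signature matrix F and the data are assumed, not taken from the book):

```python
import numpy as np

def ubet_test(r, V, F, z_crit=1.96):
    """Simultaneously estimate the magnitudes of the candidate gross errors
    (Eq. 8-4 with the full candidate matrix F), then flag candidate i as a
    genuine gross error when |b_i| / sqrt(d_ii) exceeds the normal critical
    value, where D = (F' V^-1 F)^-1 is the covariance of the estimates."""
    Vinv = np.linalg.inv(V)
    D = np.linalg.inv(F.T @ Vinv @ F)
    b = D @ F.T @ Vinv @ r          # unbiased magnitude estimates
    z = np.abs(b) / np.sqrt(np.diag(D))
    return b, z, z > z_crit
```

As the text requires, F must have linearly independent columns (and at most m of them), otherwise the inverse defining D does not exist.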

Jiang et al. [5] were the first to point out the need for taking into account equivalency of gross errors when choosing the candidate set, and they appropriately modified the UBET strategy. They also made use of principal component tests instead of nodal tests to choose the candidate set of gross errors. Their simulation results showed that the overall performance of the gross error detection strategy did not significantly depend on whether the nodal or principal component tests were used to choose the initial candidate set of gross errors.

Since the UBET assumes that a maximum identifiable number of gross errors may be present, it can be expected to perform well when many gross errors are present and poorly when only a few gross errors are actually present. This was also confirmed through simulation studies [4].

Other simultaneous strategies were proposed by Jiang et al. [5], which have similarities with the method of Rosenberg et al. [3], and also by Sanchez et al. [6], who designed various strategies for simultaneous identification and estimation of measurement biases and leaks.

The above three simultaneous strategies are illustrated using the simple flow process example considered in Chapter 1.

Example 8-1

We consider the simple heat exchanger with bypass process shown in Figure 1-2 for the case when all the flow variables are measured. We assume that three gross errors of +5, +4, and -3 units in measurements of streams 1, 2, and 5, respectively, are present. The true, measured, and reconciled values of all flows for this case are shown in Table 8-1.

Table 8-1
Data for Heat Exchanger with Bypass Process Containing Three Gross Errors

Stream Number    True Flow Values    Measured Flow Values    Reconciled Flow Values
1                100                 106.91                  102.0533
2                 64                  68.45                   67.1667
3                 36                  35.74                   34.8867
4                 64                  64.70                   67.1667
5                 36                  32.41                   34.8867
6                100                  98.95                  102.0533

Let us compute the GLR test statistics for all allowable gross error combination hypotheses using Equation 8-3. The maximum number of


gross errors that can be identified for this process is equal to 4, since there are only 4 constraints. Table 8-2 shows the GLR test statistics and the corresponding Type I error probabilities for different combinations up to a maximum of 4 gross errors. Since streams 2, 3, 4, and 5 form a cycle, it is not possible to distinguish gross errors in all four of these measurements from a combination of gross errors in any three of these four stream measurements (see Example 7-9). For the same reason, it is also not possible to distinguish gross errors in any 3 of these streams from any other combination of 3 streams chosen out of these four streams. Thus, in Table 8-2 the test statistics for all these equivalent combinations are listed in the same row. Similarly, other equivalent combinations are listed together in the same row of Table 8-2.

Consider the GLR test statistics for a single gross error given in Table 8-2. The GLR test criterion at 5% level of significance is 3.84. If we apply the simple identification strategy based on the GLR test, then gross errors in measurements of streams 1, 4, and 6 are identified. Thus, Type I errors for measurements 4 and 6 are committed. Furthermore, the gross error in stream 2 is not identified. These identification errors are caused by the smearing effect of the gross errors.

We will now apply the simultaneous strategy of testing all possible hypotheses of one or more gross errors. From the results of Table 8-2, it is observed that a gross error in stream 1 has the highest test statistic among all single gross error hypotheses. Similarly, the combination [1, 2] has the largest test statistic among all 2 gross error hypotheses, combination [1, 2, 5] among 3 gross error hypotheses, and all four gross error hypotheses have the same test statistic (since the reduced problem does not have any redundancy for any of these hypotheses).

Among these different combinations, the hypothesis of gross errors in measurements 1 and 2 gives the least Type I error probability of 1.4126E-10. Hence, the simultaneous strategy based on testing all possible hypotheses identifies gross errors in measurements 1 and 2. Thus, this strategy identifies the gross errors in measurements 1 and 2 correctly, although it does not identify the gross error in stream 5 (in this example, it is tacitly assumed that the Type I error probabilities for different hypotheses are computed accurately, even though they are very small). The gross error in measurement 5 is not identified; therefore, a Type II error with respect to that measurement is committed.

Table 8-2
GLR Test Statistics for All Hypotheses of One or More Gross Errors

Measurement Combination                                    GLR Test Statistic    Type I Error Probability

[1]                                                        35.381                2.7114E-09
[2]                                                         2.470                0.1160
[3]                                                         0.084                0.7719
[4]                                                        13.202                2.7970E-04
[5]                                                         3.139                0.0764
[6]                                                        15.105                1.0169E-04
[1, 2]                                                     45.361                1.4126E-10
[1, 3]                                                     36.910                9.6644E-09
[1, 4]                                                     40.295                1.7786E-09
[1, 5]                                                     35.467                1.9878E-08
[1, 6]                                                     36.491                1.1915E-08
[2, 3]                                                      2.968                0.2267
[2, 4]                                                     13.282                0.0013
[2, 5]                                                      7.469                0.0239
[2, 6]                                                     15.489                4.3307E-04
[3, 4]                                                     13.61                 0.0011
[3, 5]                                                      4.982                0.0828
[3, 6]                                                     16.803                2.2459E-04
[4, 5]                                                     13.997                9.1329E-04
[4, 6]                                                     37.725                6.4279E-09
[5, 6]                                                     23.133                9.4733E-06
[1, 2, 3]                                                  45.742                6.4362E-10
[1, 2, 4]; [1, 2, 6]; [1, 4, 6]; [2, 4, 6]; [1, 2, 4, 6]   45.522                7.1653E-10
[1, 2, 5]                                                  46.254                5.0055E-10
[1, 3, 4]                                                  43.234                2.1918E-09
[1, 3, 5]; [1, 3, 6]; [1, 5, 6]; [3, 5, 6]; [1, 3, 5, 6]   37.223                4.1275E-08
[1, 4, 5]                                                  40.318                9.1230E-09
[2, 3, 4]; [2, 3, 5]; [2, 4, 5]; [3, 4, 5]; [2, 3, 4, 5]   14.014                0.00289
[2, 3, 6]                                                  17.610                5.2933E-04
[2, 5, 6]                                                  24.533                1.9331E-05
[3, 4, 6]                                                  37.854                3.0349E-08
[4, 5, 6]                                                  41.415                5.3382E-09
All combinations of 4 measurements out of 6
  except [1, 2, 4, 6]; [1, 3, 5, 6], and [2, 3, 4, 5]      46.254                2.1804E-09


If we apply the technique of Rosenberg et al. [3], we would first compute the GT and check whether a gross error exists. For these process data, the GT statistic is 46.254 while the test criterion at 5% level of significance is 9.488. Since the GT is rejected, we compare all single gross error hypotheses and find that a gross error in measurement 1 has the least Type I error probability. If we eliminate this measurement and recompute the GT, we find it is equal to 10.873 while the test criterion at 5% level of significance is 7.815.

We consider all two gross error hypotheses and find that the combination [1, 2] has the least Type I error probability. We therefore eliminate these two measurements and recompute the GT statistic to be 0.893. Since the test criterion is now 5.991, the GT is not rejected and we terminate the procedure. In this case also, two of the gross errors are identified correctly without committing a Type I error. As with the GLR test, however, a Type II error with respect to measurement 5 was committed.

In order to apply the UBET method, we have to first choose a candidate set of 4 gross errors. Using the GLR test statistics for single gross errors listed in Table 8-2, we choose biases in measurements 1, 6, 4, and 5 as candidates. Note that the signature vectors for these gross errors are linearly independent. The estimates of these gross errors are obtained as [3.81, -4.25, -1.22, -4.22] and the variances of these estimates are computed as [3, 2, 2, 3]. Hence, the test statistics for the magnitudes of these gross errors are computed as [2.2, 3.0, 0.86, 2.44]. For alpha = 0.05, the critical value is 1.96. Based on these test statistics, we therefore conclude that the magnitudes of gross errors in measurements 1, 6, and 5 are non-zero. Thus, UBET commits a Type I error in 6 and a Type II error in 2.

It should also be noted that, although we have restricted our considerations to identification of measurement biases, we can use the above simultaneous strategies for identifying leaks if we combine these component strategies with statistical tests such as the GLR or nodal tests that have the capability to detect and identify leaks.

Serial Strategies

As opposed to simultaneous strategies, serial techniques identify gross errors serially, one by one. Many serial strategies in combination with different statistical tests have been proposed and their performances studied [1, 3, 7]. They differ by the type of statistical tests that are used, the manner in which the gross errors are identified, the tie-breaking rules used when the criteria for gross error identification are identical for different gross errors, and the criteria used in the algorithm for terminating the serial strategy.

Furthermore, some serial strategies also make use of information regarding upper and lower bounds on variables. A better understanding of these techniques can be obtained by focusing on the core principle used in the strategies, ignoring the processing details and the statistical test used for gross error detection. We first describe the two main principles exploited in serial strategies before describing the different algorithms developed using these principles.

Principle of Serial Elimination

The principle of serial elimination, first suggested by Ripps [8], has been described in Chapter 7 for identifying a single gross error. This principle is useful in identifying gross errors caused by measurement biases only, because it relies on eliminating measurements suspected of containing a bias. This basic principle can as well be utilized for multiple gross error identification by identifying gross errors serially. At each stage of the serial procedure, a gross error is identified in one measurement (based on some criteria) and the corresponding measurement is eliminated before proceeding to the next stage. The major advantage of serial elimination is that it does not require any prior knowledge about the existence or location of gross errors.

Principle of Serial Compensation

The principle of serial compensation was first suggested by Narasimhan and Mah [9] in conjunction with the use of the GLR test. Unlike the serial elimination technique, this principle can be used to identify other types of gross errors besides measurement biases. At each stage of the serial procedure, a gross error is identified (based on some criteria) and the measurements or model are compensated using the identified type and location of the gross error and the estimated magnitude of the gross error, before proceeding to the next stage.
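A compensation loop of this kind might look as follows (an illustrative numpy sketch of our own, restricted to measurement biases; at each stage the largest single gross error GLR statistic is found and, if significant, its estimated contribution is subtracted from the constraint residuals):

```python
import numpy as np

def serial_compensation(r, V, f_cols, crit=3.84):
    """Identify gross errors one at a time; after each identification,
    compensate the residuals with the estimated gross error magnitude."""
    Vinv = np.linalg.inv(V)
    identified = []
    for _ in range(len(r)):  # at most m gross errors are identifiable
        best_T, best_j, best_b = -1.0, None, 0.0
        for j, f in enumerate(f_cols):
            num, den = f @ Vinv @ r, f @ Vinv @ f
            T = num * num / den  # single gross error GLR statistic
            if T > best_T:
                best_T, best_j, best_b = T, j, num / den
        if best_T <= crit:
            break  # no remaining statistic is significant
        identified.append((best_j, best_b))
        r = r - best_b * f_cols[best_j]  # compensate the residuals
    return identified
```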

The criteria used for identifying a single gross error in each stage of the serial procedure can be based on one of the statistical tests. Historically, the serial elimination principle has been used in conjunction with


the global test or measurement test, while the serial compensation principle has been used with the GLR test. However, it was established in the preceding chapter that the measurement test and GLR test are identical tests for identifying measurement biases. Thus the use of the serial elimination principle in conjunction with the GLR test is the same as using it in conjunction with the measurement test.

Furthermore, it was also proved in Chapter 7 that if the global test is used in conjunction with the serial elimination strategy, then this is precisely the same as using the GLR test. The former identifies a gross error in the measurement whose elimination gives the maximum reduction in the global test statistic, while the latter identifies a gross error in the measurement corresponding to the maximum test statistic. Thus, identical results are obtained by using the serial elimination strategy in conjunction with the global test, measurement test or GLR test, provided they use the same principle for single gross error identification at each stage of the serial procedure.
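This equivalence is easy to verify numerically. The sketch below (our own construction on an assumed flow network and noise model) checks that the drop in the global test statistic caused by eliminating measurement j equals the single gross error GLR statistic for that measurement:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical 4-node flow network and measurement standard deviations
A = np.array([[1, -1, -1, 0, 0, 0],
              [0, 1, 0, -1, 0, 0],
              [0, 0, 1, 0, -1, 0],
              [0, 0, 0, 1, 1, -1]], float)
sigma = np.full(6, 1.5)
x_true = np.array([100.0, 64, 36, 64, 36, 100])
y = x_true + rng.normal(0.0, sigma)          # noisy measurements
r = A @ y                                    # constraint residuals
V = A @ np.diag(sigma ** 2) @ A.T
Vinv = np.linalg.inv(V)
gt = float(r @ Vinv @ r)                     # global test statistic

for j in range(6):
    f = A[:, j]                              # signature of a bias in y_j
    T_j = (f @ Vinv @ r) ** 2 / (f @ Vinv @ f)   # single-bias GLR statistic
    b = (f @ Vinv @ r) / (f @ Vinv @ f)          # best-fit bias magnitude
    gt_red = float((r - b * f) @ Vinv @ (r - b * f))
    assert abs((gt - gt_red) - T_j) < 1e-9   # reduction equals GLR statistic
```

The identity holds exactly because eliminating a measurement is equivalent to minimizing the global test statistic over the magnitude of a bias in it.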

The serial elimination and serial compensation principles were historically derived from different viewpoints. However, it is proved subsequently that if a modified serial compensation strategy as developed by Keller et al. [10] is used, then this is exactly identical to the serial elimination strategy. This result unifies all the different approaches. The reader is urged to keep these results in mind when going through the description of some of the most efficient strategies described below.

Serial Elimination Strategies That Do Not Use Bounds

Among the serial elimination strategies which do not use bound information, the version proposed by Serth and Heenan [1] known as the iterative measurement test (IMT) provides the basic structure. This strategy makes use of the measurement test for gross error detection, and the rule of identifying a single gross error in the measurement corresponding to the maximum test statistic at each stage of the iterative procedure. For multiple gross error detection, it uses the serial elimination strategy. The algorithm terminates when the maximum of the test statistics for all remaining measurements does not exceed the test criterion. The details of the algorithm are as follows:

Algorithm 1. Iterative Measurement Test (IMT) Method [1]

For ease of understanding the algorithm, the following sets are defined: Let S be the set of all original measured variables. Let C be the set of measurements which are identified as containing gross errors (initially set C is empty). At each stage of the iterative procedure, the measurements in set C are eliminated and the variables corresponding to these measurements are treated as unmeasured and projected out of the reconciliation problem as described in Chapter 3. Let T be the set of measured variables in the reduced reconciliation problem after projection of all unmeasured variables. Initially, T is the set of measurements which occur in the reduced reconciliation problem after projecting out all unmeasured variables which are present.

Step 1. Solve the initial reconciliation problem. Compute the vectors x (Equation 3-8), a (Equation 7-11), and d (Equation 7-15).

Step 2. Compute the measurement test statistics z_dj (Equation 7-17) for all measurements in set T.

Step 3. Find z_max, the maximum absolute value among all z_dj from Step 2, and compare it with the test criterion Z_c = Z_{1-beta/2}. If z_max <= Z_c, proceed to Step 5. Otherwise, select the measurement j corresponding to z_max and add it to set C. If two or more measurement test statistics are equal to z_max, select the measurement with the lowest index j to add to set C.

Step 4. Remove the measurements contained in C from set S. Solve the data reconciliation problem treating the variables corresponding to set C also as unmeasured. Obtain T, the set of measurements in the reduced data reconciliation problem, and the vectors a and d corresponding to these measurements. (Note that Serth and Heenan [1] designed this elimination scheme for a mass flow balance problem; therefore, in their algorithm, removing measurements is equivalent to removing streams from the network by nodal aggregation. In general, this can be achieved by a projection matrix, as described in Chapter 3.) Return to Step 2.

Step 5. The measurements y_j, j in C, are suspected of containing gross errors. The reconciled estimates after removal of these measurements are those obtained in Step 4 of the last iteration.
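The IMT loop can be sketched as follows (an illustrative numpy implementation of our own, for the fully measured linear case; the projection in Step 4 uses an SVD-based left null space rather than the nodal aggregation of the original paper, and a fixed criterion is used in place of the modified Z_{1-beta/2} of the text):

```python
import numpy as np

def mt_statistics(A, S, y):
    # measurement adjustments a = S A' V^-1 r and their variances
    Vinv = np.linalg.inv(A @ S @ A.T)
    a = S @ A.T @ Vinv @ (A @ y)
    var = np.diag(S @ A.T @ Vinv @ A @ S)
    return np.abs(a) / np.sqrt(np.maximum(var, 1e-12))

def imt(A, S, y, z_crit=1.96):
    """Iterative measurement test (sketch): repeatedly flag the measurement
    with the largest test statistic and project it out of the constraints,
    until the maximum statistic no longer exceeds the criterion."""
    n = A.shape[1]
    gross = []                      # set C of suspect measurements
    keep = list(range(n))           # set T of remaining measurements
    Ar, Sr, yr = A, S, y
    while Ar.shape[0] > 0 and keep:
        z = mt_statistics(Ar, Sr, yr)            # Step 2
        j = int(np.argmax(z))
        if z[j] <= z_crit:
            break                                # Step 5: no more gross errors
        gross.append(keep.pop(j))                # Step 3: flag worst measurement
        # Step 4: project the flagged columns out of the constraints
        u, s, _ = np.linalg.svd(A[:, gross])
        rank = int(np.sum(s > 1e-10))
        P = u[:, rank:].T                        # rows span the left null space
        Ar, Sr, yr = P @ A[:, keep], S[np.ix_(keep, keep)], y[keep]
    return gross
```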

Note that the test criterion Z1-11,2 should strictly be recrllculated at each Step 3, since depends on the number of measured \-ar-iables in the model (Equation 7-7. where nz is replaced by the numbel- of measure- ments in thi. model). The more measured variables are eliminated. the

Page 129: 0884152553

fewer the number of simultaneous multiple tests, and therefore the lower the overall probability of Type I error. It is possible, however, to simply use a global test at each stage to first detect whether any additional gross errors are present, and to use the measurement test statistics for identifying the gross error location. The level of significance at each stage k can be maintained at α, and the critical value can be chosen from the chi-square distribution with degrees of freedom m-k+1, where k-1 is the number of gross errors identified so far. This ensures that the Type I error probability is maintained at α.

The measurement test used in Steps 2 and 3 can be replaced by a different statistical test. The principal component measurement test [11] can be used instead, but it requires more computational time. Principal component analysis should be used only if it provides superior power and a lower Type I error, but that is not easily achieved without additional identification steps [5, 12, 13].
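The elimination loop of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names, the Šidák-type adjustment of the test level, and the three-stream example are our own assumptions, while the left-null-space projection used to remove eliminated measurements follows the Chapter 3 description.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.stats import norm

def measurement_test_stats(A, S, r):
    # d = S A'(A S A')^-1 r (Eq. 7-15); z_dj = d_j / sqrt(W_jj) (Eq. 7-17)
    B = A.T @ np.linalg.inv(A @ S @ A.T)
    d = S @ B @ r
    W = S @ B @ A @ S
    return d / np.sqrt(np.clip(np.diag(W), 1e-12, None))

def serial_elimination(y, A, S, alpha=0.05):
    """Repeatedly eliminate the measurement with the largest |z_dj| above
    the adjusted critical value; return the suspect set C (Algorithm 1)."""
    active = list(range(A.shape[1]))      # set T of remaining measurements
    C = []                                # suspect measurements
    while active:
        Am = A[:, active]
        if C:
            P = null_space(A[:, C].T).T   # rows span left null space of A_C
            if P.size == 0:               # every constraint absorbed: stop
                break
            Am = P @ Am
        r = Am @ y[active]
        z = np.abs(measurement_test_stats(Am, S[np.ix_(active, active)], r))
        beta = 1.0 - (1.0 - alpha) ** (1.0 / len(active))  # adjusted level
        if z.max() <= norm.ppf(1.0 - beta / 2.0):
            break
        C.append(active.pop(int(np.argmax(z))))  # ties -> lowest index j
    return C

# Three serially connected streams (two node balances); stream 2 is biased.
A = np.array([[1., -1., 0.], [0., 1., -1.]])
y = np.array([5.0, 6.0, 5.0])
S = 0.01 * np.eye(3)
suspects = serial_elimination(y, A, S)
```

On this toy network the loop flags the biased stream 2 (index 1) in the first stage and stops at the second.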

Example 8-2

The serial elimination strategy is illustrated using the same process as in Example 8-1. We will use the global test at each stage for detecting whether gross errors are present and, if so, use the GLR test statistics (which are equivalent to the measurement test) for identifying the gross error. The level of significance for the global test is chosen as 0.05 for all stages of the serial strategy.

For the same measured data as in Example 8-1, the global test statistics for each stage of the serial elimination procedure and the corresponding test criteria are listed in Table 8-3. It is observed that the global test rejects the null hypothesis in stages 1 and 2, but not in stage 3. Thus, two gross errors are identified by this algorithm. In the first stage, since the maximum GLR test statistic is attained for measurement 1 (refer to Table 8-2), a gross error in this measurement is identified. In the second stage, after eliminating measurement 1, the GLR test statistic is maximum for stream 2 among all remaining measurements. This can be verified by comparing the test statistics for all combinations of two gross errors which contain stream 1. Thus, two of the three gross errors are correctly identified without a Type I error being committed.

Table 8-3
Gross Error Identification Using Serial Elimination Strategy

Stage k | Global Test Statistic | Degrees of Freedom ν = m-k+1 | Test Criterion χ²ν,0.05 | Gross Error Identified in Measurement

Serial Compensation Strategies That Do Not Use Bounds

The serial compensation strategy was developed for use with the GLR test [9] for multiple gross error identification. It exploits the capability of the GLR test to detect different types of gross errors, and also makes use of the estimates of the gross errors obtained as part of this method.

Again, in this method gross errors are detected serially, one at each stage. The component method used in this strategy to identify a gross error at each stage is the simple rule of identifying the gross error corresponding to the maximum GLR test statistic.

Since the gross error could be associated either with a measurement or with the model (in the case of leaks), elimination of measurements is not appropriate. Instead, the estimated magnitude is used to compensate the corresponding measurement or model constraints. The GLR test is applied to the compensated constraint residuals to detect and identify any other gross errors present. The procedure stops when the maximum of the test statistics among all remaining gross error possibilities does not exceed the test criterion. We refer to this algorithm as the simple serial compensation strategy (SSCS) since a modified version is described later. The details of the algorithm are as follows:

Algorithm 2. Simple Serial Compensation Strategy (SSCS) [9]

Step 1. Compute the constraint residuals r and the covariance matrix of the constraint residuals V using Equations 7-2 and 7-3 if no unmeasured variables are present. (Otherwise, the method for treating unmeasured variables described in Chapter 7 can be used to obtain the projected constraint residuals and their covariance matrix.)



Step 2. Compute the GLR test statistics Tk (Equation 7-30) for all gross errors.

Step 3. Find T, the maximum value among all Tk from Step 2, and compare it with the test criterion χ²1-β. If T ≤ χ²1-β, proceed to Step 5. Otherwise, identify the gross error corresponding to T. If two or more gross error test statistics are equal to T, arbitrarily select one. Let f* be the gross error vector corresponding to T and b* be the estimated magnitude (Equation 7-29).

Step 4. If the gross error identified in Step 3 is a bias in measurement j, then compensate the measurements using the estimated magnitude of the bias. The compensated measurements are given by

yc = y - b*ej

where ej is the unit vector with a one in position j.

On the other hand, if the gross error identified is associated with the model constraints, for example a leak i corresponding to leak vector f*, then the constraint model is compensated using the estimated magnitude. The compensated model is given by

Ax = b*f*

in eitilcr case. the compensated constraint residuals a ~ - e give^; by

Return to Step 2, replacing the constraint residuals with the compensated constraint residuals.

Step 5. Compute the reconciled estimates for x using the compensated measurements and the compensated model. Equivalently, the estimates can be obtained using the compensated constraint residuals instead of the original constraint residuals in Equation 3-8.
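Steps 2 through 4 of the SSCS reduce to a short loop over candidate signature vectors. The sketch below follows Equations 7-29, 7-30, and 8-8; the function names, the cap on the number of stages, and the example network are illustrative assumptions rather than the book's code.

```python
import numpy as np
from scipy.stats import chi2

def glr_stats(r, V, F):
    # T = (f'V^-1 r)^2 / (f'V^-1 f) (Eq. 7-30); b_hat = f'V^-1 r / f'V^-1 f
    Vinv = np.linalg.inv(V)
    T = np.zeros(F.shape[1])
    b = np.zeros(F.shape[1])
    for k in range(F.shape[1]):
        f = F[:, k]
        denom = f @ Vinv @ f
        b[k] = (f @ Vinv @ r) / denom
        T[k] = denom * b[k] ** 2
    return T, b

def sscs(r, V, F, crit):
    """Identify the max-statistic signature, subtract b_hat * f from the
    residuals (the compensation of Eq. 8-8), and repeat until T <= crit."""
    identified, rc = [], r.copy()
    for _ in range(F.shape[1]):           # at most one error per signature
        T, b = glr_stats(rc, V, F)
        k = int(np.argmax(T))
        if T[k] <= crit:
            break
        identified.append((k, b[k]))
        rc = rc - b[k] * F[:, k]          # compensated residuals
    return identified

# For measurement biases the signatures are the columns of A (f_j = A e_j).
A = np.array([[1., -1., 0.], [0., 1., -1.]])
S = 0.01 * np.eye(3)
y = np.array([5.0, 6.0, 5.0])             # stream 2 carries a bias of +1
found = sscs(A @ y, A @ S @ A.T, A, crit=chi2.ppf(0.95, 1))
```

Here `found` holds a single pair: the bias location (index 1, i.e., stream 2) and an estimated magnitude close to 1.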

The principle of compensating the measurements or the model at each stage of the above strategy implicitly assumes that the gross errors identified in the preceding stages and their estimated magnitudes are correct. In order to understand this clearly, it is instructive to consider the hypotheses that are tested by this algorithm at each stage of the serial procedure.

Let us assume that at the beginning of stage k+1 we have already identified k gross errors corresponding to gross error vectors f1*, f2*, . . ., fk*, and that their estimated magnitudes are given by b1*, b2*, . . ., bk*. The hypotheses for stage k+1, under the assumption that all gross errors identified in the previous stages as well as their estimated magnitudes are correct, can be stated as

H0^(k+1) (only gross errors identified in previous stages are present):

E[r] = b1*f1* + b2*f2* + . . . + bk*fk*    (8-9a)

H1^(k+1) (one additional gross error is present):

E[r] = b1*f1* + b2*f2* + . . . + bk*fk* + bfi    (8-9b)

where fi in the alternative hypothesis can be any one of the gross error vectors corresponding to the remaining gross error possibilities not identified in the preceding k stages. It can be noted that, in terms of the compensated constraint residuals, this hypotheses formulation is similar to the hypotheses for detecting and identifying a single gross error, and thus the GLR test statistics for the remaining gross error possibilities can be computed by using the compensated residuals in Equations 7-30 to 7-32.

If multiple gross errors are present in the data, the serial procedure of trying to identify them can result in mispredicting the type or location of the gross error. Even if the gross error is correctly identified, the estimate of its magnitude may be grossly incorrect. Thus, the compensated residuals can contain spurious large errors not present in the original residuals, which can impair the accuracy of identification of the remaining gross errors. The serial compensation strategy may thus give rise to a large number of mispredictions, especially when many gross errors are present and when the magnitudes of the gross errors are large in comparison with the standard deviations of the random errors in the measurements. This was demonstrated through simulation studies by Rollins and Davis [4].

On the other hand, if only a few gross errors are present in the data, then the SSCS strategy performs as well as the IMT, as shown through simulation studies [9], and typically requires about one-fifth to one-half the computing time required by the serial elimination methods. This is due to the fact that serial elimination requires recomputation of the


test statistics at each elimination step, which in turn involves recalculation of all the matrices involved in order to get a new solution. Construction of a projection matrix is required after each deletion of a measurement from the network. Only for diagonal Σ and mass flow balance equations can certain efficient elimination procedures be developed [14].

Algorithm 3. Modified Serial Compensation Strategy (MSCS) [10]

In order to obviate the problem of too many mispredictions by the SSCS strategy due to incorrect compensation, a modified procedure was proposed by Keller et al. [10]. In their modified strategy, only the types and locations of gross errors identified in the previous iterations are assumed to be correct, and the estimates of the gross errors are not used in compensation. The modified strategy still uses the GLR test for detecting gross errors and the rule of identifying a gross error corresponding to the maximum test statistic. However, the hypotheses that are tested at each stage of the serial procedure are different from 8-9a and 8-9b, and can be stated as follows using the same notation:

H0^(k+1) (only gross errors identified in previous stages are present):

E[r] = b1f1* + b2f2* + . . . + bkfk*    (8-10a)

H1^(k+1) (one additional gross error is present):

E[r] = b1f1* + b2f2* + . . . + bkfk* + bfi    (8-10b)

It should be noted that in the null and alternative hypotheses the magnitudes of the gross errors are assumed to be unknown, and their maximum likelihood estimates are computed as part of the computation of the GLR test statistics. However, it can be observed that in the hypotheses formulation the locations of the gross errors identified in the preceding k stages are assumed to be correct. The GLR test statistic at stage k+1 for a gross error corresponding to vector fi (not identified in the preceding stages) is given by

Ti = min over b1, . . ., bk of (r - Σj bjfj*)'V⁻¹(r - Σj bjfj*) - min over b1, . . ., bk, b of (r - Σj bjfj* - bfi)'V⁻¹(r - Σj bjfj* - bfi)    (8-11)

Keller et al. [10] demonstrated through simulation that the MSCS strategy commits fewer mispredictions than SSCS, especially when a large number of gross errors are present.

Although the MSCS strategy was devised as a modification of the SSCS strategy, a look at hypotheses 8-10a and 8-10b shows that in reality no compensation is being applied, and the original constraint residuals are utilized for gross error detection in all stages. If we restrict our consideration to gross errors caused by sensor biases, then this strategy is in fact exactly equivalent to the serial elimination strategy. The proof of this follows from the interpretation of the GLR test statistic as the difference between the optimal objective function values of two data reconciliation problems (see Chapter 7).

The GLR test statistic for stage k+1 of the MSCS strategy, given by Equation 8-11, can also be interpreted in a similar manner. The first term in the RHS of 8-11 is the optimal objective function of the data reconciliation problem in which the magnitudes of the gross errors identified in the preceding k stages are simultaneously estimated as part of the data reconciliation problem. The formulation of this problem is an extension of Problem P1 of Chapter 7.

Similarly, the second term in the RHS of 8-11 is equal to the optimal objective function of the data reconciliation problem in which the magnitudes of the gross errors identified in the preceding k stages and the magnitude of the gross error hypothesized in stage k+1 are simultaneously estimated. In addition, it was also pointed out in Chapter 7 that the optimal objective function value is the same whether we choose to retain the measurement containing a gross error and estimate the magnitude of the gross error, or we choose to eliminate the measurement containing a gross error and treat it as an unmeasured variable in the data reconciliation problem.

This result is also true when there are several measurements containing gross errors. In other words, we can also interpret the RHS of 8-11 as the difference in the optimal objective function values of two data reconciliation problems: one in which the measurements identified as containing gross errors in the preceding k stages are eliminated, and the other in which an additional measurement in which a gross error is hypothesized



at stage k+1 is also eliminated. But this is precisely the strategy used in IMT. Thus, the MSCS and IMT are identical strategies for identifying gross errors in measurements. However, the MSCS has the additional capability of identifying other types of gross errors. We can therefore regard MSCS as a serial elimination strategy that can handle different types of gross errors.
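The objective-difference interpretation above can be checked numerically: deleting a measurement (treating it as unmeasured) and comparing the optimal reconciliation objectives before and after the deletion reproduces the GLR statistic for a bias in that measurement. The sketch below uses our own naming; the projection of the constraints follows Chapter 3.

```python
import numpy as np
from scipy.linalg import null_space

def wls_objective(y, A, S, C):
    """Optimal reconciliation objective J(C) with the measurements in C
    deleted (treated as unmeasured): project the constraints onto the
    left null space of the deleted columns, then J = r'V^-1 r."""
    keep = [j for j in range(A.shape[1]) if j not in C]
    Am = A[:, keep]
    if C:
        P = null_space(A[:, list(C)].T).T
        if P.size == 0:
            return 0.0                    # deletion absorbs every constraint
        Am = P @ Am
    r = Am @ y[keep]
    V = Am @ S[np.ix_(keep, keep)] @ Am.T
    return float(r @ np.linalg.solve(V, r))

A = np.array([[1., -1., 0.], [0., 1., -1.]])
S = 0.01 * np.eye(3)
y = np.array([5.0, 6.0, 5.0])
V = A @ S @ A.T
f = A[:, 1]                               # bias signature for stream 2
T = (f @ np.linalg.solve(V, A @ y)) ** 2 / (f @ np.linalg.solve(V, f))
diff = wls_objective(y, A, S, []) - wls_objective(y, A, S, [1])
```

Here `diff` and `T` agree, illustrating that eliminating the measurement and estimating its bias magnitude are interchangeable.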


Example 8-3

The serial compensation strategy is applied for gross error detection on the same measured data for the flow process considered in the preceding examples. Again, the global test is used for detecting whether gross errors are present, and the GLR test statistics are used for identifying the gross error. The main difference between this and the preceding example is that the residuals at each stage are compensated using the magnitude of the estimated bias and Equation 8-8. The results are presented in Table 8-4, which shows that two gross errors are detected, in measurements 1 and 2.

Table 8-4
Gross Error Identification Using Modified Serial Compensation Strategy

Stage k | Global Test Statistic | Degrees of Freedom ν = m-k+1 | Test Criterion χ²ν,0.05 | Gross Error Identified in Measurement | Estimated Magnitude of Gross Error

Serial Strategies That Use Bounds

The algorithms described in the preceding section do not ensure the feasibility of the reconciled solution. Some of the flow rates in the final solution may be negative or may have large, unreasonable values. Furthermore, if reliable upper and lower bounds on both measured and unmeasured variables are known but are not imposed, the reconciliation can provide a solution that violates certain bounds (an infeasible solution). This very fact may be indicative of other undetected gross errors in the data, or that the identified gross errors may be incorrect.

The information about upper and lower bounds on variables can be exploited in the gross error detection strategy. Serth and Heenan [1] proposed a heuristic method of utilizing bound information in gross error detection by modifying the IMT method. The modified IMT method (MIMT) still uses the measurement test for detection and the serial elimination strategy for multiple gross error detection. However, the component strategy used for identifying a gross error at each stage is not the simple rule of choosing the measurement with the maximum test statistic. Instead, the MIMT method identifies gross errors in measurements only if their deletion from the original set S gives a data reconciliation solution that satisfies the bounds for all measured variables.

Algorithm 4. Modified Iterative Measurement Test (MIMT) Method [1]

Step 1. Solve the initial reconciliation problem. Compute the vectors x, a, and d.

Step 2. Compute the measurement test statistics zdj for all measurements in set T.

Step 3. Find zmax, the maximum absolute value among all zdj from Step 2, and compare it with the test criterion Zc = Z1-β/2. If zmax ≤ Zc, proceed to Step 6. Otherwise, select the measurement corresponding to zmax and temporarily add it to set C. If two or more measurement test statistics are equal to zmax, select the measurement with the lowest index j to add to set C.

Step 4. Remove the measurements contained in C from set S. Solve the data reconciliation problem treating the variables corresponding to set C also as unmeasured. Obtain T, the set of measurements in the reduced data reconciliation problem, and the vectors a and d corresponding to these measurements.

Step 5. Determine if the reconciled values for all variables in set T and set C are within their prescribed lower and upper bounds. If all reconciled values are within the bounds, store the current solution and return to Step 2. Otherwise, delete the last entry in C, replace it by the measured variable corresponding to the next largest value of |zdj| > Zc, and return to Step 4. If |zdj| ≤ Zc for all remaining variables, delete the last entry in set C and go to Step 6.


Step 6. The measurements yj, j ∈ C are suspected of containing gross errors. The reconciled estimates after removal of these measurements are those obtained in Step 4 of the last iteration.
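The bound-checking rule of Step 5 can be sketched as follows. One deliberate simplification is flagged in the comments: instead of deleting a suspect by projection, the sketch inflates its variance so the reconciliation effectively ignores it; this surrogate, the names, and the example bounds are our assumptions, not part of the MIMT method itself.

```python
import numpy as np

def reconcile(y, A, S):
    # Linear WLS reconciliation: x_hat = y - S A'(A S A')^-1 A y (Eq. 3-8)
    return y - S @ A.T @ np.linalg.solve(A @ S @ A.T, A @ y)

def mimt_pick(y, A, S, ordered_suspects, lo, hi):
    """Walk the suspects in decreasing |z_dj| order and accept the first
    whose deletion yields a within-bounds solution (MIMT Step 5 sketch).
    'Deletion' is approximated here by inflating the suspect's variance,
    a surrogate for the projection-based removal in the actual method."""
    for j in ordered_suspects:
        S2 = S.copy()
        S2[j, j] *= 1e8                   # measurement j ~ uninformative
        x = reconcile(y, A, S2)
        if np.all(x >= lo) and np.all(x <= hi):
            return j
    return None                           # no deletion satisfies the bounds

A = np.array([[1., -1., 0.], [0., 1., -1.]])
S = 0.01 * np.eye(3)
y = np.array([5.0, 6.0, 5.0])
# Suppose stream 1 had the largest statistic: deleting it leaves x near 5.5,
# violating the 5.4 upper bound, so the rule falls through to stream 2.
picked = mimt_pick(y, A, S, [0, 1, 2], np.full(3, 4.5), np.full(3, 5.4))
```

Here `picked` is 1: the bound check vetoes the first candidate and accepts the second.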

Exercise 8-2. Rewrite Algorithm 4 for a problem with both measured and unmeasured variables, solved by QR decomposition as described in Chapter 3.

There are several limitations in the manner in which bounds are treated in the MIMT algorithm, as listed below:

(1) The algorithm only checks for bound violations in the measured variables (more specifically, only in the measurements of sets T and C). Bounds on unmeasured variables are ignored in the algorithm.

(2) The algorithm may terminate even if the test statistics for some of the measurements exceed the test criterion, due to the modifications in Step 5 of the algorithm. In essence, the method relies on bound information at the expense of the information provided by the statistical test.

(3) As an extreme case, the algorithm can also terminate with all the test statistics below the test criterion, but with the reconciled solution violating the bounds for some of the measured variables. This can happen if the test statistics of the initial reconciled solution do not exceed the test criterion in Step 2 of the algorithm.

(4) Since the method is based on serial elimination of measurements, it has the same limitation as IMT of being applicable only for identifying biases in measurements and not other types of gross errors.

Rosenberg et al. [3] proposed the extended measurement test (EMT) and dynamic measurement test (DMT) strategies that also exploit bound information in gross error detection. Strictly, EMT cannot be classified as a serial strategy because it involves the simultaneous elimination of two or more measurements, similar to the simultaneous strategy using the global test described earlier. The EMT strategy initially creates a candidate set of measurements suspected of containing gross errors by using the measurement test and selecting those for which the test is rejected.

From this candidate set, combinations of one or more measurements are eliminated in order. Gross errors in the eliminated set of measurements are identified provided the following two conditions are met:

(a) The measurement tests for all remaining measurements are not rejected.
(b) The reconciled estimates of all variables (including those that are eliminated) satisfy the bounds.

In extreme cases, if it is not possible to identify a set of measurements which, when deleted, satisfies the above two conditions, then all measurements are suspected of containing gross errors.

The DMT algorithm is very similar to MIMT, except that it checks for bound violations in the estimates of both measured and unmeasured variables (including variables whose measurements are eliminated). Moreover, DMT initializes the set C with the measurement having the largest MT statistic, and enlarges the set C at each iteration by the measurement with the largest rejected MT statistic. For details of EMT and DMT, the reader is referred to the paper by Rosenberg et al. [3]. These algorithms also have the limitation that they can be used for identifying measurement biases only.

Algorithm 5. Bounded GLR (BGLR) Method [15]

The above serial strategies suffer from some limitations, described in the preceding section, due to the heuristic manner in which bound information is utilized in gross error detection. If reliable bounds on variables are available and the reconciled estimates of variables are expected to satisfy these bounds, then it seems more appropriate to include them as inequality constraints in the data reconciliation problem. Due to these inequality constraints, the solution to the data reconciliation problem cannot be obtained analytically, even if the model constraints are linear.

A quadratic programming optimization technique has to be used, as described in Chapter 5, for solving the resulting data reconciliation problem. (In general, if the model constraints are nonlinear, then a successive quadratic programming technique as described in Chapter 5 has to be employed [15].) Despite the complexity, this method offers an elegant and theoretically more rigorous approach for including bounds in data reconciliation and gross error detection. Such an approach is used in the BGLR method, as described in the following steps:


250 L ) ~ M ~<:'co~~i. i / ir~~iv, , n11d C;roz Error I)creciiori Mirlfipl<, G~nsr Ert-or- Ide17rijic-urion Sif-ufe~i<,s for Jlr<tdy-S~arr Pr-oc<,sses 25 1

Step 1. Solve the bounded reconciliation problem, including the bounds on both measured and unmeasured variables in the reconciliation problem. (Use a nonlinear optimization technique described in Chapter 5.)

Step 2. Identify the active constraints at the solution of the reconciliation problem. (The active constraints include all the conservation constraints as well as any bound constraints that are satisfied as equalities at the optimal solution.) Denote the matrix formed by the active constraints by Ā and the RHS of the constraints by c̄.

Step 3. Compute the constraint residuals r̄ and the covariance matrix of these residuals V̄ using all the active constraints identified in Step 2 (use Equations 7-2 and 7-3 with A replaced by Ā and c replaced by c̄).

Step 4. Compute the GLR test statistics using Equations 7-30 to 7-32 with r and V replaced by r̄ and V̄, respectively.

Step 5. If the maximum among the GLR test statistics T ≤ χ²1-β, go to Step 6. Otherwise, apply the simple serial compensation strategy using Ā, r̄, and V̄ instead of A, r, and V.

Step 6. Solve the bounded data reconciliation problem using the compensated measurements and the compensated model to obtain the reconciled estimates of all variables.

The strategy described above is a simpler and generalized version of the original method developed by Narasimhan and Harikumar [15], in which separate tests are applied to the variables which are at their bounds (also referred to as the restricted variables) and to the other, unrestricted variables.

A few points are worth mentioning with respect to the above strategy. In Step 5, it is implicitly assumed that the same set of constraints will be active in all stages of the serial compensation strategy. The theoretically correct procedure is to solve the bounded data reconciliation problem using the compensated measurements and compensated model constraints at each stage after a gross error is identified, in order to determine the new set of active constraints. Since this would increase the computational burden significantly, it has not been used in the above strategy. If there is no limitation on computing power, however, then it is advisable to solve the bounded data reconciliation problem at each stage of the serial procedure.

Secondly, in the BGLR method described above, the simple serial compensation strategy is used for multiple gross error detection. Instead, the serial elimination strategy or the modified serial compensation strategy can be used as well. Lastly, the above strategy does not have the limitations of the MIMT and other methods. The estimates that are obtained at the end of the procedure will satisfy the bounds, and the test statistics of all measurements will be below the threshold value (provided the bounded data reconciliation problem is solved at each stage of the serial procedure).

In the extreme case, if an infeasible solution is obtained in Step 1, this indicates that the bounds imposed on the variables are too restrictive and have to be relaxed. One important advantage that this method offers is that it can be used even when more complex inequality constraints (such as thermodynamic feasibility restrictions imposed on the temperatures of heat exchangers) are used. If an inequality constraint is active at the optimal data reconciliation solution, then it is simply included as an equality constraint in the constraint set when gross error identification is applied.
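Steps 1 through 3 of the BGLR method can be sketched with a general-purpose SQP solver standing in for the QP/SQP machinery of Chapter 5. The active-bound tolerance, the function names, and the single-node example are assumptions; the augmentation of the constraint matrix with active bounds follows Step 2.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def bglr_stage(y, A, S, lo, hi, alpha=0.05, tol=1e-6):
    """Solve the bounded reconciliation, fold active bounds into the
    constraint matrix, and evaluate the global test on the augmented
    residuals r_bar = A_bar y - c_bar (BGLR Steps 1-3 sketch)."""
    Sinv = np.linalg.inv(S)
    res = minimize(lambda x: (y - x) @ Sinv @ (y - x), y, method='SLSQP',
                   constraints=[{'type': 'eq', 'fun': lambda x: A @ x}],
                   bounds=list(zip(lo, hi)))
    x = res.x
    rows, rhs = [A], [np.zeros(A.shape[0])]
    for j in range(len(x)):               # active bounds -> equality rows
        if min(abs(x[j] - lo[j]), abs(x[j] - hi[j])) < tol:
            e = np.zeros(len(x))
            e[j] = 1.0
            rows.append(e[None, :])
            rhs.append(np.array([x[j]]))
    Ab, cb = np.vstack(rows), np.concatenate(rhs)
    rbar = Ab @ y - cb
    Vbar = Ab @ S @ Ab.T
    gamma = float(rbar @ np.linalg.solve(Vbar, rbar))  # global test statistic
    return x, gamma, chi2.ppf(1.0 - alpha, df=Ab.shape[0])

# One node (x1 + x2 = x3) with wide bounds, so none is active at the optimum.
A = np.array([[1., 1., -1.]])
y = np.array([3.0, 3.0, 5.0])
S = 0.01 * np.eye(3)
x, gamma, crit = bglr_stage(y, A, S, np.zeros(3), np.full(3, 10.0))
```

Here `gamma` exceeds `crit`, so a GLR identification stage using the augmented Ā, r̄, and V̄ would follow.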

Exercise 8-3. Develop the BGLR algorithm for the case when (a) the bounded data reconciliation problem is solved after each stage of the serial compensation procedure, (b) modified serial compensation is used as the strategy for multiple gross error detection instead of serial compensation, and (c) both changes (a) and (b) are made.

Simulation studies conducted by different researchers [1, 3, 15] show that bound information enhances the performance of gross error detection strategies only if the measured values are close to the bounds. The simultaneous and serial algorithms described above are not limited to linear flow processes: if nonlinear constraints are involved, they can be linearized before applying any of these detection schemes. Gross error detection in nonlinear processes is discussed in greater detail later in this chapter.

Example 8-4

We apply the BGLR method to the process considered in the preceding examples. For this purpose we will assume that lower bounds and upper bounds on variables are specified as in Table 8-5.


Tight bounds on variables 1 and 2 are specified to illustrate the effect of bounds on gross error detection. The data reconciliation solution obtained without using these bounds is shown in Table 8-1.

Table 8-5
Lower and Upper Bounds on Flows of Heat Exchanger with Bypass Process

Measurement | Lower Bound | Upper Bound
1 | 99 | 102
2 | 60 | 67
3 | 30 | 40
4 | 55 | 75
5 | 30 | 40
6 | 90 | ::3

The reconciled estimates violate the upper bounds on streams 1 and 2. Therefore, the data reconciliation problem is solved using the bounds, and the reconciled estimates are obtained as [101.97, 67.00, 34.97, 67.00, 34.97, 101.97]. It is observed that the upper bound on the stream 2 flow is active at the optimal solution. By including this constraint along with the flow balances we get the expanded constraint matrix and RHS of constraints.

The last of the above constraints is the active upper bound constraint on the flow of stream 2. Using the enlarged constraint set, we compute the global test statistic, which is equal to 46.338. This is greater than the test criterion of 11.075 (χ²5,0.05), and the GT is rejected. The maximum GLR test statistic is obtained for measurement 1, and so a gross error is identified in this measurement in stage 1. We continue this procedure by solving the bounded data reconciliation problem at each stage to determine the active constraints, and the results are given in Table 8-6. Again, only two of the three gross errors are correctly identified. The final reconciled values are [99.0, 64.7033, 34.2967, 64.7033, 34.2967, 99.0], which satisfy the bounds.

Table 8-6
Gross Error Identification Using Bounded GLR Method

Stage k | Active Constraints s | Global Test Statistic | DOF ν = m-k+1+s | Test Criterion χ²ν,0.05 | Gross Error Identified
1 | 1 (UB on 2) | 46.338 | 5 | 11.075 | 1
2 | None | 10.873 | 3 | 7.815 | 2
3 | 1 (LB on 1) | 2.61 | 3 | 7.815 | None

Combinatorial Strategies

There are several gross error identification strategies which make use of the nodal test. As pointed out in the previous chapter, the use of the nodal test requires a strategy even for identifying a single gross error. Since these strategies cannot be easily classified as either serial or simultaneous, we have chosen to categorize them separately. Most of these strategies are specifically tailored to flow processes and cannot be applied to nonlinear processes.

The basic principle used in the gross error identification strategy based on the nodal test was first proposed by Mah et al. [16]. If a gross error is present in any flow measurement, then it affects the constraint residuals in which the measurement occurs. Thus, we can expect the nodal tests for the two nodes on which the corresponding stream is incident to be rejected. If these two nodes are merged, however, then the corresponding interconnecting stream is eliminated, and the nodal test for this combination node (also called a pseudo-node) will most probably not be rejected. In order to exploit this principle, nodal tests are conducted on the residuals around single nodes as well as combinations of two or more nodes which are connected by streams. If the nodal test for any combination is not rejected, then the flow measurements of all streams which are incident on the pseudo-node may be assumed to be free of gross errors.

It should be noted that no decisions can be made concerning the flow measurements of streams interconnecting any two nodes forming the pseudo-node. On the other hand, if the nodal test is rejected, then one or more measurements incident on the pseudo-node can contain gross errors; no direct statement, however, can be made regarding any of these measurements. By selecting suitable combinations of nodes on which nodal tests are performed, it is possible to identify a set of measurements which are likely to be free of gross errors. The remaining measurements are suspected of containing gross errors.



Further screening of this candidate set can be performed using other serial or simultaneous strategies to identify the measurements containing gross errors. These strategies can also be used to identify leaks in nodes along with measurement biases [1, 7]. We describe below the linear combination technique (LCT) algorithm by Rollins et al. [18], which uses the above strategy.

Algorithm 6. The Linear Combination Technique [18]

In the LCT method proposed by Rollins et al. [18], nodal tests are conducted on constraint residuals around single nodes as well as certain combinations of nodes. In general, for any of these nodal tests, the hypotheses can be expressed as follows:

H0i: li'μr = 0   versus   H1i: li'μr ≠ 0    (8-12)

where μr is the expected value of the constraint residuals and li is a vector of zeros and ones representing the linear combination in the ith test. At a level of significance α, H0i is rejected if

|li'r| / (li'Vli)^(1/2) > Z1-α/2    (8-13)

If H0i is not rejected, all measurements of streams incident on the pseudo-node are considered to be free of gross errors. After all linear combination tests are performed, two sets of measured variables are obtained. One set, SET1, contains variables whose measurements are not suspected to contain gross errors. The complementary set, SET2, consists of variables whose measurements are suspected of containing gross errors. Of course, the algorithm may result in incorrectly classifying the measurements, giving rise to Type I errors (when good measurements are placed in SET2) or Type II errors (when faulty measurements are placed in SET1). The chosen level of significance α for the tests plays an important role in balancing the two types of errors.
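The nodal test of Equations 8-12 and 8-13 and the SET1/SET2 classification can be sketched directly. The helper names, the choice of node combinations, and the example network are illustrative assumptions; the bookkeeping mirrors the merging of nodes into pseudo-nodes described above.

```python
import numpy as np
from scipy.stats import norm

def nodal_test_rejected(l, r, V, alpha=0.05):
    # Reject H0: l'mu_r = 0 when |l'r| / sqrt(l'Vl) > Z_{1-alpha/2}
    return abs(l @ r) / np.sqrt(l @ V @ l) > norm.ppf(1.0 - alpha / 2.0)

def classify_streams(A, r, V, combos, alpha=0.05):
    """Streams incident on any pseudo-node whose nodal test is NOT rejected
    go to SET1 (taken as error-free); the remaining streams form SET2."""
    set1 = set()
    for nodes in combos:                  # e.g. [(0,), (1,), (0, 1)]
        l = np.zeros(A.shape[0])
        l[list(nodes)] = 1.0
        if not nodal_test_rejected(l, r, V, alpha):
            merged = l @ A                # incidence row of the pseudo-node
            set1 |= {j for j in range(A.shape[1]) if abs(merged[j]) > 1e-9}
    return set1, set(range(A.shape[1])) - set1

A = np.array([[1., -1., 0.], [0., 1., -1.]])  # two nodes, three streams
S = 0.01 * np.eye(3)
y = np.array([5.0, 6.0, 5.0])                 # bias on stream 2 (index 1)
set1, set2 = classify_streams(A, A @ y, A @ S @ A.T, [(0,), (1,), (0, 1)])
```

Both single-node tests are rejected and the merged pseudo-node (0, 1) is not, so streams 1 and 3 land in SET1 while the biased stream 2 stays in SET2. The internal stream cancels out of the merged balance, matching the caveat that no decision is made about interconnecting streams.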

In order to reduce the number of linear combinations for hypothesis testing, Rollins et al. [18] adopted the following rules:

a. Conduct m nodal (constraint) tests on individual nodes. If H_{0k} for node (balance) k (k = 1, ..., m) is not rejected, no nodal test on combination nodes containing node k is conducted.

b. A gross error in a small flow is generally difficult to detect. If it is implicitly assumed that such stream measurements do not contain gross errors, then no nodal test is conducted on combinations of nodes connected by a low flow rate stream.

c. No nodal test is performed on node combinations which are not connected.

d. No nodal test is performed on a nodal combination containing two nodes which are connected by a stream whose measurement has been classified to be free of a gross error.

The above rules essentially avoid conducting nodal tests on pseudo-node combinations which do not provide any additional information for identifying the good measurements. Mah et al. [16] used a similar strategy and made use of the above rules except rule (b). In addition, their procedure also attempted to identify leaks in nodes as a last resort, when the nodal test for a node is rejected but all the streams incident on this node are classified as good (placed in SET1). Serth and Heenan [1] proposed three different variants of the above strategy, in one of which information about bounds on variables was also exploited. Yang et al. [27] used a combination of measurement and nodal tests in which the measurement test was used to identify an initial candidate list of suspect measurements, while the nodal tests were used to counter-check whether each suspect measurement does contain a gross error.
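The nodal test that these strategies build on can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard form of the nodal test statistic, |l^T r| / (l^T V l)^{1/2} with r = Ay and V = AΣA^T, and the small two-node network in the usage example is hypothetical.

```python
import numpy as np
from scipy.stats import norm

def nodal_test(A, y, cov, l, alpha=0.05):
    """Nodal test on the single node or pseudo-node selected by vector l.

    A    : m x n incidence matrix of the flow network (node balances A x = 0)
    y    : n-vector of measured flows
    cov  : n x n covariance matrix of the measurement errors
    l    : m-vector of zeros and ones combining node balances
    Returns (rejected, statistic): rejected is True when the combined
    balance residual is statistically significant at level alpha.
    """
    r = A @ y                       # constraint (balance) residuals
    V = A @ cov @ A.T               # covariance of the residuals
    stat = abs(l @ r) / np.sqrt(l @ V @ l)
    return stat > norm.ppf(1 - alpha / 2), stat
```

For a consistent flow set the statistic is near zero; biasing a stream incident on the tested node drives it past the critical value. Note that a test on the pseudo-node formed by merging two nodes cancels their interconnecting stream, which is why no decision about that stream can be made, as discussed above.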

Although strategies based on nodal tests reduce the number of mispredictions (Type I errors) as compared to serial strategies [1], they suffer from the following drawbacks:

(1) If multiple gross errors are present in the data, then due to partial or complete cancellation of these errors, nodal tests for node combinations on which these streams are incident may not be rejected, resulting in incorrect classification (Type II errors).

(2) They are designed for linear flow processes, and it is difficult to extend them to nonlinear processes.

Example 8-5

The LCT algorithm is applied on the data for the heat exchanger with bypass process considered in the preceding examples. Nodal tests (using α = 0.05) are performed on single nodes and combinations of appropriate nodes, and the streams are classified as shown in Table 8-7.


Table 8-7
Gross Error Detection Using LCT Algorithm

Status          Measurements Classified Free of Gross Errors
Rejected        -
Rejected        -
Not Rejected    3 and 5
Not Rejected    4 and 6
Rejected        -
Not performed   Stream 3 is good
Not performed   disconnected
Not performed   disconnected
Not performed   Stream 4 is good
Not performed   Stream 5 is good

[Node Combination and NT Statistic columns of the original table not recoverable]

Other than measurements 1 and 2, the rest are classified as good, and thus two of the three gross errors are correctly identified by LCT.

Other collective methods used to simultaneously identify leaks and measurement biases and estimate their error magnitudes have been recently proposed by Jiang and Bagajewicz [19].

PERFORMANCE MEASURES FOR EVALUATING GROSS ERROR IDENTIFICATION STRATEGIES

The performance of the serial elimination and compensation methods described above can be compared by means of computer simulation experiments in which known errors are introduced into the data and the ability of the schemes to identify and correct the errors is evaluated. Such comparisons have been frequently reported [3, 5, 6, 9, 10, 15, 18, 19]. Since the error detection procedure is stochastic in nature, the performance is averaged over a suitably large number of trials. A minimum of 1,000 simulation trials is recommended.

Given the set of true values for all process variables, for each simulation trial a measurement vector is generated as

y = x + ε + δ   (8-14)

where δ is the vector of gross errors. Using random numbers from the standard normal distribution, a vector ε of random errors is first added to the true values. For this, the standard deviations are usually taken as some fixed percentage of the true values. A specified number of gross errors are then added to obtain the measurement vector. The locations of the gross errors are uniformly and randomly selected over the set of measured variables, while the magnitudes of the gross errors are uniformly and randomly chosen between specified upper and lower bounds. The sign of each gross error is also chosen randomly. The magnitudes of the gross errors are constrained by

l |x_i + ε_i| ≤ |δ_i| ≤ u |x_i + ε_i|   (8-15)

where l is a lower fraction and u is an upper fraction of the random value (not including the gross error). For instance, l = 0.05 and u = 0.59.

For the purpose of gross error detection, the level of significance α for the statistical test is also required. This value is frequently chosen so that the average Type I error when no gross errors are present is 0.1.
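The trial-generation step just described can be sketched as follows. The function and parameter names are ours; the bounds on |δ_i| are taken as lower and upper fractions of the random value (true value plus random error), as stated above.

```python
import numpy as np

def simulate_trial(x_true, rel_std, n_gross, low, high, rng):
    """Generate one simulated measurement vector y = x + eps + delta.

    x_true  : vector of true values of the process variables
    rel_std : standard deviation as a fixed fraction of the true value
    n_gross : number of gross errors introduced in this trial
    low/high: lower/upper fractions bounding |delta_i| relative to the
              random value (true value plus random error)
    Returns the measurement vector and the set of biased positions.
    """
    n = len(x_true)
    eps = rng.standard_normal(n) * (rel_std * x_true)   # random errors
    y = x_true + eps                                    # no gross errors yet
    locs = rng.choice(n, size=n_gross, replace=False)   # random locations
    for i in locs:
        mag = rng.uniform(low, high) * abs(y[i])        # random magnitude
        y[i] += rng.choice([-1.0, 1.0]) * mag           # random sign
    return y, set(int(i) for i in locs)
```

With low = 0.05 and high = 0.59 this reproduces the bounds quoted above; each call produces one trial for the simulation experiments.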

Different performance measures have been used to evaluate gross error detection performance [3, 9], as follows:

1. The overall power (OP) of the method to identify gross errors correctly is given by

OP = (Number of gross errors correctly identified) / (Number of gross errors simulated)   (8-16)

The overall power is computed only for simulation trials in which gross errors are simulated.

Rollins and Davis [4] defined an overall power function (OPF) as follows, which is a more conservative measure:

OPF = (Number of simulation trials with perfect identification) / (Number of simulation trials)   (8-16a)

By perfect identification, it is implied that all gross errors and their locations are correctly identified and no mispredictions are made. Obviously, an ideal strategy is one which results in an OPF of unity. Note that it is always possible to get an OP value of 1 by predicting all measurements to contain gross errors. This is not satisfactory because too many mispredictions are also made in the bargain. In this case, however, the OPF value will be zero if some of the measurements do not contain gross errors.

Sanchez et al. [6] have modified the definition of OPF, taking into account equivalency of gross errors, and have denoted their performance measure as OPFE. This measure is also computed like OPF, except that "perfect identification" is interpreted to account for equivalency of gross errors. Perfect identification of gross errors is achieved if the set of gross errors identified in a simulation trial belongs to the same equivalency class as the set of gross errors actually present in the measured data. This also implies that no mispredictions are made in the simulation trial.

2. The average number of Type I errors (AVTI), which defines the number of incorrect identifications made by a method, is given by

AVTI = (Number of gross errors wrongly identified) / (Number of simulation trials made)   (8-17)

The measure AVTI is computed separately for each simulation run, whether or not gross errors are simulated. The interpretation of "wrong identifications" can be suitably made after taking equivalency of gross errors into consideration.

3. Another measure of performance is the selectivity, defined as

Selectivity = (Number of gross errors correctly identified) / (Total number of gross errors detected)   (8-18)

It may be noted that the denominator includes only those simulation trials where a gross error is simulated.

4. Average error of estimation (AEE) is the fourth type of performance measure. It gives the accuracy of estimation of the bias magnitude on the average. It is used to compare serial compensation methods, where estimates of the gross error magnitudes are also provided.

1 " estimated value - actual valr~e AER = ~1 (8 - 19)

actual value
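Given recorded trial outcomes, the first four measures can be tallied as below. This sketch is ours: it represents each trial by the set of simulated gross error locations and the set identified by the strategy, and it ignores equivalency classes of gross errors. AEE is omitted, since it would additionally need the estimated and actual bias magnitudes.

```python
def performance_measures(trials):
    """Tally OP, OPF, AVTI and selectivity over simulation trials.

    trials : list of (simulated, identified) pairs, each a set of
             measurement indices with gross errors.
    """
    n_trials = len(trials)
    correct = sum(len(s & d) for s, d in trials)      # correctly identified
    simulated = sum(len(s) for s, d in trials)        # gross errors simulated
    detected = sum(len(d) for s, d in trials)         # gross errors detected
    wrong = sum(len(d - s) for s, d in trials)        # wrong identifications
    perfect = sum(1 for s, d in trials if s == d)     # perfect identifications
    return {
        "OP": correct / simulated,
        "OPF": perfect / n_trials,
        "AVTI": wrong / n_trials,
        "selectivity": correct / detected,
    }
```

For instance, two trials in which {1, 2} is identified perfectly and {3} is identified along with a spurious flag on 4 give OP = 1.0, OPF = 0.5, AVTI = 0.5, and selectivity = 0.75.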

COMPARISON OF MULTIPLE GROSS ERROR IDENTIFICATION STRATEGIES

Different simulation studies have been conducted comparing the performance of gross error detection strategies. Among the different strategies proposed for multiple gross error identification, we would like to determine the best in terms of the measures described above. Unfortunately, the performance studies conducted so far do not provide a definite answer. Nevertheless, some conclusions can be drawn from these studies.

Before making a comparison it is important to ensure that different strategies are compared on the same basis. Since it is always possible to improve the power of a strategy at the expense of a greater Type I error probability, it is important to ensure that the different strategies have the same Type I error probability, so that a judgment can be made based on their power and selectivity measures.

Secondly, it does not matter whether the GT, MT or GLR test is used for detection, because the same performance can be obtained with all these tests by using the same component strategies for single and multiple gross error identification. This follows from the equivalency results between these three tests proved in the previous chapter. However, strategies that make use of the nodal tests or principal component tests are distinct and cannot be combined with other tests. Our comparison is focused on the component strategy used for multiple gross error detection.

Since modified serial compensation is identical to serial elimination and also has the ability to handle gross errors other than biases, among these two only MSCS needs to be considered. Keller et al. [10] showed that MSCS performs better than SSCS because it commits fewer mispredictions when multiple gross errors are present in the data. The reason for this is that SSCS uses the estimated magnitudes of gross errors identified in preceding stages to compensate the measurements, which can itself introduce errors due to incorrect estimates. MSCS avoids this by collectively estimating the magnitudes of the gross errors hypothesized. Rollins and Davis [4] also showed that SSCS commits an unacceptably high number of Type I errors when the standard deviations are very small or when many gross errors are present in the data. Based on these studies, it appears that MSCS is the best among the serial strategies.

Studies to determine the best simultaneous strategy for multiple gross error identification have not yet been performed. The simultaneous strategy proposed by Rosenberg et al. [3] was compared with serial elimination and was found to perform equally well. However, in this comparison the equivalency between different gross errors was not taken into consideration properly, since the effect of equivalency of gross errors was not completely known at the time of their study. The simultaneous strategy based on testing all possible gross error combination hypotheses has not been evaluated in any study so far, since it was considered to be computationally intensive.

A comparison between LCT and SSCS was made by Rollins and Davis [4], who showed that LCT performs much better, especially when standard deviations of measurement errors are small and when a large number of gross errors are present in the data. However, since MSCS is better than SSCS, a comparison between LCT and MSCS is more appropriate. A major problem with LCT is that at present it is applicable only to linear flow processes.

The strategy based on principal component tests was recently compared with MSCS and other strategies by Jiang et al. [5]. Their study is the only one where equivalency between different gross error sets is properly taken into account before computing the performance measures. Their study did not demonstrate superior performance of the PC test strategy. This is also confirmed by recent studies by Jordache and Tilton [12] and Bagajewicz et al. [13].

GROSS ERROR DETECTION IN NONLINEAR PROCESSES

All statistical tests and serial elimination or serial compensation techniques presented in this chapter can be used for detection and identification of gross errors in processes described by nonlinear models. The usual procedure is first to perform a linearization of the process model, followed by an identification method designed for linear equations. Typically, the measured data are reconciled under the assumption that no gross errors are present, and the constraint equations are linearized around the reconciled estimates.

This strategy, although very popular, may not be suitable for highly nonlinear processes with data corrupted by significant gross errors. The reconciled estimates, which are obtained based on the assumption that no gross errors exist in the data, may be far from the true values, and the resulting linearized model may not be a good approximation. Consequently, this approach may fail to identify the true gross errors. There is no guarantee of successful gross error detection and identification even for purely linear models. Nonlinear processes are much more complex, and they require special methods of gross error detection.

One step forward was provided by Kim et al. [20]. They tailored the MIMT serial elimination algorithm to fit the nonlinear data reconciliation problem. Their enhanced algorithm differs from the MIMT algorithm in two ways. First, in Step 1, the data reconciliation problem is solved using nonlinear programming (NLP) techniques. Second, in Step 5, the reconciled values x and the measurement adjustments are also calculated based on the nonlinear solution.

The variance-covariance matrix of the adjustment vector, however, is calculated from a linearized model as proposed by Serth and Heenan [7]. Their method, tested on a CSTR reactor model, showed superior performance in comparison with the MIMT algorithm used with successive linearization, especially when the number of gross errors increases. The NLP solver is more robust and provides more reliable estimates for reconciled values and gross errors, which enhances the performance of gross error detection and identification. If large gross errors exist in the data, however, they need to be screened out prior to application of this technique. Moreover, the computational time can be very high for large-scale industrial problems.

A new strategy for detection of gross errors in nonlinear processes was recently proposed by Renganathan and Narasimhan [22]. In their approach, a test strategy analogous to the GLR method was proposed that does not require linearization of the constraints. A brief description of this method follows.

Gross Error Identification in Nonlinear Processes Using the Nonlinear GLR Method

It was proved in the preceding chapter that the GLR test statistic for a measurement i is identical to the difference in the objective function (OF) values of two data reconciliation problems, one of which assumes that no gross errors are present, whereas the other assumes that a gross error is present in measurement i. The formulation of the data reconciliation problem when a gross error is assumed in measurement i was also described in the preceding chapter (Problem P1 in Chapter 7).

For identifying a single gross error in the GLR method, the maximum difference between the OF values over all the gross errors hypothesized is obtained. If this difference exceeds a critical value, then a gross error is detected and is identified in the variable which gives the maximum OF difference. In other words, the gross error model that gives the minimum least squares OF value is selected, which means that the gross error model that best fits the observed data is selected as the most likely possibility.

For nonlinear processes, a gross error detection test can be obtained by applying the above principle of the GLR test; that is, the test statistic is obtained as the maximum difference in OF values between the no gross error model and the gross error model for variable i. The test statistic T is given by

T = max_i T_i   (8-21)

where

T_i = (OF for no gross error model) − (OF for ith gross error model)   (8-22)

The OF for the no gross error model is obtained by solving the standard nonlinear DR problem as formulated in Chapter 5. The OF for the ith gross error model is obtained by solving the nonlinear DR problem analogous to Problem P1 described in the preceding chapter, obtained by simply replacing the linear constraints (Equation 7-1) with the nonlinear constraints of the process (Equations 5-8 and 5-9). These nonlinear DR problems are solved using nonlinear programming techniques as described in Chapter 5. The test statistic is compared with a prespecified threshold (critical value) T_c, and a gross error is detected if T exceeds T_c. This means that the corresponding gross error model best fits the data, and so the variable corresponding to that gross error model is identified to be biased.

The magnitude of bias b is also obtained as part of the solution of Problem P1. Although no statistical reasoning can be given for the choice of the critical value, the same GLR criterion as in the case of linear processes can be used. This may be adjusted by trial and error using simulation if it is desired to obtain a specified Type I error probability. It should be noted that, in this approach, the nonlinear constraints are treated as such and not approximated by a linear form. Furthermore, bounds and other inequality constraints can be included as part of the constraints, and the gross error detection test can still be applied as described above, since it uses only the optimal objective function values. For ease of reference, we denote this test as a nonlinear GLR test (NGLR).
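A sketch of the NGLR computation using a general-purpose NLP solver is given below. It is not the authors' code: the toy process in the usage example (two nonlinear balances over four variables) and the critical value are ours; the formulation only requires that each OF be the optimum of a weighted least-squares DR problem under the nonlinear constraints.

```python
import numpy as np
from scipy.optimize import minimize

def nglr_test(y, sigma, g, T_c):
    """Nonlinear GLR (NGLR) test sketch for at most one gross error.

    y     : measurement vector
    sigma : standard deviations of the measurement errors
    g     : constraint function; reconciled x must satisfy g(x) = 0
    T_c   : critical value (tuned by trial and error via simulation)
    Returns (index of biased variable or None, test statistic T).
    """
    n = len(y)

    def solve(i):
        # i is None for the no-gross-error model; otherwise measurement i
        # carries an unknown bias b estimated simultaneously with x.
        def obj(z):
            x = z[:n]
            adj = y.copy()
            if i is not None:
                adj[i] -= z[n]                  # compensate hypothesized bias
            return np.sum(((adj - x) / sigma) ** 2)
        z0 = y if i is None else np.append(y, 0.0)
        res = minimize(obj, z0,
                       constraints={"type": "eq", "fun": lambda z: g(z[:n])})
        return res.fun

    base = solve(None)                          # OF for no gross error model
    T_i = [base - solve(i) for i in range(n)]   # OF differences (Eq. 8-22)
    T = max(T_i)                                # test statistic (Eq. 8-21)
    return (int(np.argmax(T_i)) if T > T_c else None), T
```

With constraints g(x) = [x1 + x2 − x3, x1·x2 − x4] and a bias added to the fourth measurement, the maximum OF difference singles out that measurement, since compensating it lets the DR problem fit the data almost exactly.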

The NGLR test described above can be applied to detect at most one gross error. However, this test can also be combined with any simultaneous or serial strategy for multiple gross error detection described in this text. As a particular case, we describe the use of the NGLR test along with the MSCS strategy for identification of multiple gross errors caused by measurement biases.

NGLR Test with Modified Serial Compensation Strategy (MSCS)

In MSCS, at each stage of application of the test, only the locations of previously detected gross errors are assumed to be correct, but the estimates of all the gross error magnitudes are assumed to be unknown and are therefore estimated simultaneously. This process is repeated until no further gross errors are detected. Applying this strategy along with the NGLR test, we obtain the test statistic at stage k+1 as in Equations 8-21 and 8-22, but the OF values at stage k+1 for the no gross error model and the gross error model for variable i are, respectively, obtained by minimizing the following objective functions, subject to the nonlinear constraints given by Equations 5-8 and 5-9:

min over x, b_k:  (y − x − E_k b_k)^T Σ^{-1} (y − x − E_k b_k)

min over x, b_k, b_i:  (y − x − E_k b_k − b_i e_i)^T Σ^{-1} (y − x − E_k b_k − b_i e_i)

where b_k is a k × 1 vector of unknown biases and E_k is a matrix whose columns are unit vectors. The jth column vector of E_k has a unit value in the position corresponding to the measurement in which a gross error was identified in stage j, for j = 1, ..., k. The minimization with respect to the unknown gross error magnitudes, b_k, for computing the objective function values implies that only the locations of gross errors identified in the previous stages are assumed to be correct. Their magnitudes, however, have to be estimated simultaneously along with the gross error hypothesized in the present stage, which are exactly the premises of the hypotheses in MSCS. For highly nonlinear processes such as reactors, Renganathan and Narasimhan [22] demonstrated that the NGLR method gives better performance than methods which rely on linearization of the constraints.
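The stage-wise objective can be sketched as follows: at stage k+1 all previously located biases are re-estimated together with the newly hypothesized one. The helper name and the toy constraint set in the test are ours; the weighted least-squares form follows the chapter's DR formulation, with the columns of E_k represented implicitly by an index list.

```python
import numpy as np
from scipy.optimize import minimize

def mscs_stage_of(y, sigma, g, located, i_new):
    """Optimal OF value at one stage of the combined NGLR-MSCS procedure.

    located : indices of gross errors identified in previous stages; only
              their locations are trusted, their magnitudes b are
              re-estimated simultaneously.
    i_new   : newly hypothesized biased measurement, or None for the
              no-gross-error model of this stage.
    """
    n = len(y)
    hyp = list(located) + ([] if i_new is None else [i_new])

    def obj(z):
        x, b = z[:n], z[n:]
        adj = y.copy()
        for j, idx in enumerate(hyp):
            adj[idx] -= b[j]                # subtract hypothesized biases
        return np.sum(((adj - x) / sigma) ** 2)

    z0 = np.concatenate([y, np.zeros(len(hyp))])
    res = minimize(obj, z0,
                   constraints={"type": "eq", "fun": lambda z: g(z[:n])})
    return res.fun
```

Repeating the NGLR comparison of OF values with this function at each stage, and stopping when the maximum OF difference falls below T_c, gives the combined serial procedure.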

BAYESIAN APPROACH TO MULTIPLE GROSS ERROR DETECTION

The major problem in gross error detection is how to enhance the power and selectivity of the test without increasing the frequency of false detections (Type I errors). One way to enhance the power and selectivity of gross error tests is to use information from past data and, particularly, the frequency of past failures of measuring instruments.

To incorporate historical information on measuring instrumentation, we can make use of the Bayes theorem in statistics [23]. A gross error detection and identification procedure for measurement biases based on this approach has been developed by Tamhane et al. [24, 25] for steady-state processes.

The usual detection techniques for steady-state processes are applied to snapshots of data or averages of data collected within a time period. But if a significant instrument failure occurs within the data collection period, the statistical tests using averages of data may not be able to capture that failure. Even if they eventually capture that failure, it might take a long time until an instrument problem is detected by a statistical test.

The new Bayesian approach is not a one-time application, as with the previous steady-state tests or combinations of tests, but rather a sequential application that incorporates historical data collected and updated over time. The sequential approach raises more questions than the statistical tests using data averages. For instance, how often are the instruments inspected to confirm the occurrence of gross errors? If a gross error is confirmed, how soon is the instrument repaired? Is it reasonable to assume that the repair is perfect, that is, that the instrument will be as good as new? Should the model include a factor for the aging of instruments? Incorporating all these issues in the sequential application framework makes the Bayesian algorithm a much more challenging task than the previous gross error detection and identification algorithms.

To simplify the description of the Bayes test for gross error detection, we will first assume a one-time application of this test. We consider a relatively short measurement period of N consecutive observations. The average vector y of the N observations satisfies the following measurement model:

y = x + ε + δ ⊙ e   (8-24)

where x is the n × 1 vector of true values, ε is the n × 1 vector of random errors, and δ ⊙ e represents a vector formed by the products δ_i e_i (i = 1, ..., n); δ_i is the magnitude of a gross error in measurement i, and e_i is unity if a gross error is present in measurement i or zero otherwise. We assume that a gross error can occur in any measurement i only at the beginning of the measurement period. The vector x is assumed to satisfy the following linear constraints:

Ax = 0   (8-25)

Note that if a nonlinear model is used, a linearization of the original model should first be performed. The following assumptions will also be made for the one-time application of the Bayes test:

a. Vector ε follows a multivariate normal distribution N(0, Q) with known covariance matrix Q = (1/N)Σ, where Σ is the covariance matrix of the individual data vectors.

b. A gross error in any measurement i has a known constant magnitude, say δ_i*.

c. The values of the prior probabilities of instrument failures are known. We assume that each element δ_i of δ is modeled as a Bernoulli random variable taking on values δ_i* and 0 with probabilities p_i and 1 − p_i, respectively (i = 1, ..., n). If δ_i = δ_i* for i ∈ I and δ_i = 0 for i ∉ I, then the corresponding gross error vector is denoted by δ_I. There are 2^n possible states of nature, where δ_I ranges from (0, 0, ..., 0) for the state with no gross error to (δ_1*, δ_2*, ..., δ_n*) for the state with gross errors occurring in every measurement.

d. Gross errors (or instrument failures) in different instruments occur independently of each other. Then the prior probability that δ = δ_I is given by

π_I = Π_{i∈I} p_i · Π_{i∉I} (1 − p_i)   (8-26)

We will refer to the π_I's as the group prior probabilities. The Bayes theorem is applied to compute the group posterior probability π̃_I of δ_I, given the group prior probability π_I of δ_I and the measured data. A general Bayes formula for posteriors is given by

π̃_I = π_I f(data | δ_I) / Σ_J π_J f(data | δ_J)   (8-27)

where f(data | δ_J) denotes the conditional probability density function (p.d.f.) of the given data given that the true state of nature is δ_J, and the summation in the denominator is over all 2^n subsets J. The Bayes decision rule for identification of the most likely state of nature is the following:

The most likely state of nature δ_I corresponds to the maximum posterior in Equation 8-27.

Therefore, if π̃_{I*} = max_J π̃_J, then the measurements i ∈ I* are declared in gross error. If I* = ∅, then all measurements are declared free of gross errors. From the identification rule above, we can see that the Bayesian approach falls into the category of simultaneous strategies for gross error detection and identification.

Since it also assumes knowledge about the magnitudes of gross errors, the Bayesian strategy is closely related to the serial compensation strategies based on the GLR test. In fact, for equal prior probabilities of gross error occurrence p_i for all measurements i = 1, ..., n, Formula 8-27 becomes similar to Equation 7-27 used to derive the GLR test. There are two differences, though: (i) instead of the directly measured data y, a linear transformation of vector y is used for the GLR test (namely the residual vector r = Ay), and (ii) the denominator in Equation 8-27 is a summation over all possible states of nature δ_J, rather than the state of nature δ_0 corresponding to the case of no gross error as in Equation 7-27.

Note that in Equation 8-27 we cannot use y directly for the data, because its p.d.f. involves the true vector x, which is unknown. What is required is a transformed vector Cy such that

(i) the p.d.f. of Cy is free of x and depends only on δ; (ii) the covariance matrix CQC^T is nonsingular.

Equation 8-25 indicates that matrix C must satisfy the condition C = MA for some m × m matrix M. But C is not unique. It can be shown [24, 26] that the choice which leads to a maximal-dimension transformation gives rise to the following Bayesian formula:

π̃_I = π_I exp{−0.5 (y − δ_I)^T W (y − δ_I)} / Σ_J π_J exp{−0.5 (y − δ_J)^T W (y − δ_J)}   (8-28)

where W is the covariance matrix of the vector d of modified measurement adjustments (see Equation 7-15 in Chapter 7).
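The enumeration behind Equation 8-28 can be sketched directly, since there are 2^n states of nature. The function and variable names are ours, a log-sum-exp normalization is added for numerical stability, and the weighting matrix W is taken as given.

```python
import itertools
import numpy as np

def bayes_posteriors(y, delta_star, p, W):
    """Group posterior probabilities over all 2^n states of nature (Eq. 8-28).

    y          : (transformed) average measurement vector
    delta_star : assumed known gross error magnitudes, one per measurement
    p          : prior probabilities of a gross error in each measurement
    W          : weighting matrix from the modified adjustments (Eq. 7-15)
    Returns {subset I (tuple of indices): posterior probability}.
    """
    n = len(y)
    log_post = {}
    for r in range(n + 1):
        for I in itertools.combinations(range(n), r):
            delta = np.zeros(n)
            for i in I:
                delta[i] = delta_star[i]       # state of nature delta_I
            # group prior: prod of p_i over I times prod of (1-p_i) outside I
            prior = np.prod([p[i] if i in I else 1.0 - p[i] for i in range(n)])
            resid = y - delta
            log_post[I] = np.log(prior) - 0.5 * resid @ W @ resid
    peak = max(log_post.values())              # log-sum-exp normalization
    unnorm = {I: np.exp(v - peak) for I, v in log_post.items()}
    total = sum(unnorm.values())
    return {I: v / total for I, v in unnorm.items()}
```

The subset with the largest posterior is declared in gross error, with the empty tuple playing the role of I* = ∅. The exhaustive enumeration makes the cost grow as 2^n, which is why the method is practical only for modest n.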

Next, we will present the sequential application of the Bayesian test, which is the desired implementation of the Bayesian strategy for gross error detection. The sequential application of the Bayesian test enables continuous updating of the prior probabilities of instrument failures for better gross error detection and identification. The measurement model is now time-dependent, even though the underlying process is steady state:

y(t) = x + ε(t) + δ ⊙ e(t−1)   (8-29)

where t is the index for time periods; y(t) is the average of N data vectors observed during time period t; ε(t) is the vector of random errors, assumed to follow a multivariate normal distribution N(0, Q); and δ ⊙ e(t−1) represents the vector of gross errors present at time (t−1), i.e., at the beginning of measurement time period t.

Initially we assume that the occurrences of gross errors are independent Bernoulli random variables with a constant (with respect to time) failure rate θ_i for the ith instrument. In other words, the probability that the ith instrument fails (in the sense of a gross error occurring in the measured value) at the start of any given period t is the same, namely θ_i, and the failures of a given instrument in different time periods are independent. For a given θ_i, the conditional probability that instrument i is in a failed state at time t−1 is given by

p_i(t−1 | θ_i) = 1 − (1 − θ_i)^{τ_i(t−1)}   (8-30)

where τ_i(t−1) is the time since the last check on instrument i. Note that θ_i is quite different from p_i(t−1), the probability that the ith instrument is in a failed state at time t−1. The instrument may be in a failed state at time t−1 if it failed in any of the previous time periods and has not been repaired. To compute p_i(t−1), required in Equation 8-26 which is used to calculate the Bayesian test (Equation 8-28), we assume a prior distribution on each θ_i and average p_i(t−1 | θ_i) with respect to this prior distribution. This approach has the capability of using past instrument failures to update the prior distributions on the θ_i's. Independent beta distributions [27],

f_i(θ_i) = [Γ(l_i + m_i) / (Γ(l_i) Γ(m_i))] θ_i^{l_i − 1} (1 − θ_i)^{m_i − 1}   (8-31)

can be used conveniently for this purpose, because they are conjugate priors with respect to the geometric distributions which are followed by the instrument lifetimes (see Equation 8-35). In Equation 8-31 above, Γ(·) denotes the gamma function, and l_i and m_i are two parameters with the following interpretation: l_i is the number of previous failures for instrument i, and m_i is the sum of previous lifetimes for instrument i; the ratio l_i / (l_i + m_i) is the mean of the beta prior distribution, denoted by θ̄_i. The parameters l_i and m_i are updated using data on past failures of instruments [25] as follows:

l_i = l_i^(0) + n_i,   m_i = m_i^(0) + Σ_{j=1}^{n_i} t_i^(j) − n_i,   i = 1, ..., n   (8-32)

where n_i is the number of past failures for instrument i and t_i^(j) is the lifetime of instrument i for the jth failure, that is, the number of time periods between its jth and (j−1)th failures (or t_i^(1) for j = 1). A method of choosing the initial values l_i^(0) and m_i^(0) was proposed by Colombo and Constantini [28] (Equation 8-33). A parameter 1 < s_i < 2 enables the user to select which factor is more important in the estimation of the prior θ_i^(0); a large value (close to 2) yields small values of l_i^(0) and m_i^(0), which means that more weight is given to the current data than to the prior information, and vice versa. By choosing θ_i^(0) and s_i, l_i^(0) and m_i^(0) can be estimated from Equation 8-33.

The Bayesian formula can be used to compute the posterior p.d.f. of θ_i, given its prior p.d.f. and the conditional p.d.f. of the failure data:

f_i(θ_i | failure data) = g_i(failure data | θ_i) f_i(θ_i) / ∫ g_i(failure data | θ_i) f_i(θ_i) dθ_i   (8-34)

where

g_i(failure data | θ_i) = θ_i (1 − θ_i)^{τ_i}   (8-35)

In Equation 8-35, g_i is the probability that instrument i lasts (is not in a failed state) for exactly τ_i time periods. The p_i(t−1) (required in Equation 8-26) is computed by

p_i(t−1) = ∫ p_i(t−1 | θ_i) f_i(θ_i | failure data) dθ_i   (8-36)
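Numerically, this machinery reduces to a parameter update and a one-dimensional integral, sketched below. The grid quadrature and the function names are ours; the conditional probability follows Equation 8-30 and the update follows Equation 8-32.

```python
import numpy as np
from scipy.stats import beta

def update_beta(l0, m0, n_fail, lifetimes):
    """Fold past failure data into the beta parameters (Eq. 8-32 style):
    l = l0 + n_fail, m = m0 + sum(lifetimes) - n_fail."""
    return l0 + n_fail, m0 + sum(lifetimes) - n_fail

def failure_probability(l_i, m_i, tau, n_grid=4000):
    """p_i(t-1): average the conditional probability of Eq. 8-30,
    1 - (1 - theta)**tau, over a Beta(l_i, m_i) distribution on the
    failure rate theta (the integral of Eq. 8-36), by midpoint quadrature.
    tau is the number of periods since the last check on instrument i."""
    mid = (np.arange(n_grid) + 0.5) / n_grid      # midpoint grid on (0, 1)
    cond = 1.0 - (1.0 - mid) ** tau               # p_i(t-1 | theta)
    return float(np.mean(cond * beta.pdf(mid, l_i, m_i)))
```

For tau = 1 the integral recovers the mean l_i / (l_i + m_i) of the beta distribution, and the probability of being in a failed state grows with the time since the last check, as expected.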

Although not explicitly stated, the following assumptions have also been made so far in the Bayesian model:

(i) The magnitudes of gross errors are known and constant values δ_i*.
(ii) The instrument failure probabilities are independent of instrument age.
(iii) Checking and corrective actions are immediate and perfect.

These assumptions, however, usually do not hold in real life. Next we will show how these assumptions can be relaxed in a practical implementation of the Bayesian strategy.

First, the magnitudes of the gross errors can be sequentially updated based on past data, rather than considering them known constants. There are various ways of estimating the magnitudes of gross errors. One method was presented in Chapters 7 and 8 in connection with the GLR test. Another procedure was suggested by Romagnoli [29]. Jordache [26] proposed a simplified method which makes use of the modified adjustment vector d defined by Equation 7-15. This method, though, is suitable only for linear constraints as given by Equation 8-25. The expected value of vector d can be obtained by combining Equation 7-15 with the measurement model:

E(d) = Wδ   (8-37)

If we approximate E(d) in Equation 8-37 by the observed vector d = Wy, then the vector δ of gross errors can be estimated from the equation

Wδ = d   (8-38)

Note that in Equations 8-37 and 8-38 the vector δ is actually the δ ⊙ e(t-1) described in Equation 8-29. Furthermore, matrix W is generally singular; therefore, a least-squares solution should be obtained. One way is to use a Moore-Penrose pseudoinverse of W. This solution, which also involves a singular value decomposition of matrix W, provides a minimum Euclidean norm of vector δ, that is, it minimizes δ'δ. The solution is unique [30, 31] and can be written as:

Note that, in order to obtain a meaningful solution, only the estimates for the δ_i's associated with the measurements declared in gross error by the Bayes test are updated. Therefore, all e_i's except those corresponding to measurements suspected in gross error in the previous step (t-1) are zero.
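The minimum-norm least-squares solution of Equation 8-38 can be sketched with NumPy, whose `pinv` computes the Moore-Penrose pseudoinverse through an SVD. The matrix W and vector d below are illustrative only, not taken from the text:

```python
import numpy as np

def estimate_gross_errors(W, d):
    """Minimum Euclidean norm least-squares solution of W @ delta = d
    (Eq. 8-38), via the Moore-Penrose pseudoinverse (SVD-based)."""
    return np.linalg.pinv(W) @ d

# Illustrative singular W (rank 1), so an ordinary solve() would fail.
W = np.array([[1.0, 1.0],
              [2.0, 2.0]])
d = np.array([1.0, 2.0])        # lies in the column space of W
delta = estimate_gross_errors(W, d)
assert np.allclose(W @ delta, d)  # residual is (near) zero
```

Among all solutions of `x1 + x2 = 1` here, the pseudoinverse returns the one of smallest norm, `[0.5, 0.5]`, which is the uniqueness property cited from [30, 31].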

Secondly, the failure rates θ_i will not be constant, but will increase with the ages of the instruments. Let θ_i(T_i) be the failure probability for instrument i when its actual age is T_i. The following model can be used for θ_i(T_i) [23]:

where 0 < θ_i(1) < 1 and β_i ≥ 0 are given constants. If β_i = 0, a constant failure rate model is obtained. For β_i > 0, the failure rate increases with age (as T_i → ∞, θ_i(T_i) → 1). Note that the model described by Equation 8-40 has not been implemented in the Bayes test yet, because it is rather complicated.

It was only used to simulate gross errors based on the aging function for θ_i [23, 24]. In the Bayes test, Equation 8-28, a constant θ_i is still assumed.
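Since Equation 8-40 itself is not reproduced here, the sketch below uses one hypothetical parameterization chosen only to satisfy the stated properties: it returns θ_i(1) at age 1, stays constant when β_i = 0, and tends to 1 as T_i grows when β_i > 0. It is an assumption, not the book's exact formula:

```python
def aging_failure_prob(theta1, beta, T):
    """One possible aging model with the properties stated in the text
    (illustrative only): theta(1) at age 1, constant for beta = 0,
    and approaching 1 as T -> infinity for beta > 0."""
    return 1.0 - (1.0 - theta1) ** (T ** beta)

# beta = 0 reproduces the constant-rate model ...
assert aging_failure_prob(0.1, 0.0, 50) == aging_failure_prob(0.1, 0.0, 1)
# ... while beta > 0 makes failure more likely with age.
assert aging_failure_prob(0.1, 0.5, 100) > aging_failure_prob(0.1, 0.5, 10)
```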

Thirdly, delays in checking and imperfect corrective actions can also be taken into account. Immediate instrument checking after gross error detection, followed by a corrective action, is not usually feasible in practice. First, because of inherent Type I errors, we may want to verify over a longer period of time the consistency of the gross error detection. A simple rule such as 2/3 (2 out of 3) can be adopted; that means that a gross error should be detected by the Bayes test in some particular measurement i at least twice out of three consecutive time periods.

Second, even with sustained evidence of gross errors, the operators may want to postpone the instrument checking and correction to a more convenient time (for instance, at the end of the shift or at the scheduled maintenance time). Until then, a gross error is assumed detected, but not corrected, in instrument i. Therefore, the parameters l_i, m_i, and θ_i are not updated until the instrument has been checked and found to cause a gross error. But the magnitude of the gross error δ_i will be updated continuously until the instrument has been repaired.
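The 2/3 persistence rule can be sketched as a sliding-window vote over the per-period test flags (the function and variable names are our own):

```python
from collections import deque

def persistent_detection(flags, k=2, m=3):
    """Declare a gross error only when the single-period Bayes test has
    flagged the measurement in at least k of the last m periods
    (the 2/3 rule for k = 2, m = 3)."""
    window = deque(maxlen=m)      # drops the oldest flag automatically
    declared = []
    for f in flags:
        window.append(f)
        declared.append(sum(window) >= k)
    return declared

# One isolated flag is ignored; two flags within three periods are not.
assert persistent_detection([0, 1, 0, 0, 1, 1, 0]) == \
       [False, False, False, False, False, True, True]
```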

Based on the above assumptions, a Bayesian gross error detection and identification algorithm can be implemented as follows:

Step 0. Initialization. At the beginning, input the following information:

(i) Constraint matrix A (or the model to be linearized), covariance matrix Σ, and the number of data vectors per sampling period, N. If the average of N data vectors is used, the covariance matrix Q = (1/N)Σ will be used instead of Σ.

(ii) For each measurement i = 1 . . . n, enter the following initial estimates: θ_i^(0), δ_i^(0), l_i^(0), and m_i^(0). Note that l_i^(0) and m_i^(0) can be initialized from θ_i^(0) and a parameter 1 < s_i < 2, Equation 8-33; δ_i^(0) can be initialized as a constant number of standard deviations, i.e., cσ_i.

(iii) Set the ages T_i(0) of all instruments i equal to 1 (fresh instruments). Set the time period t equal to 1 also.

Step 1. Read in the N data vectors for period t and compute their average vector ȳ(t).

Step 2. Calculate the prior probabilities p_i(t-1), Equation 8-30, and the group priors π_I, Equation 8-26.


Step 3. Calculate the posterior probabilities π_I, Equation 8-28. Note that, if all possible states of nature for the vector δ ⊙ e are considered, the computational time for Steps 2 and 3 is exceedingly large. There are two ways to reduce the computational time:

(i) Since the denominator for all posteriors is the same, only the numerators need to be computed; we can even calculate the natural logarithm of the numerators, thus avoiding the computation of the exponential functions.

(ii) The number of states of nature for the vector δ ⊙ e can be reduced to a much smaller subset by the following strategy: Calculate first the posteriors associated with the states of nature involving only one e_i equal to 1 at a time (single gross error case) and the δ = 0 case (no gross error). Select the measurements corresponding, say, to the top 25% of posterior values and calculate the posteriors of combinations of at most three of those measurements (we assume that at most three gross errors can simultaneously exist). Other strategies for reducing the number of hypotheses to be tested [18] can also be adopted.
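Both shortcuts can be sketched together: rank hypotheses by the logarithm of the posterior numerator, log(prior) + log(likelihood), then build multiple-error combinations only from the top single-error candidates. All numbers below are illustrative stand-ins, not values from the text:

```python
import math
from itertools import combinations

def log_numerator(log_prior, log_likelihood):
    """The posterior denominator is common to all hypotheses, so they
    can be ranked by log(prior) + log(likelihood) alone, avoiding both
    the normalization and the exponential functions."""
    return log_prior + log_likelihood

# Toy scores for single-gross-error hypotheses on 8 measurements.
residuals = [9.0, 1.0, 25.0, 2.0, 16.0, 0.5, 30.0, 4.0]
scores = {i: log_numerator(math.log(0.05), -0.5 * r)
          for i, r in enumerate(residuals)}

# Keep, say, the top 25% of single-error candidates ...
top = sorted(scores, key=scores.get, reverse=True)[:max(1, len(scores) // 4)]
# ... and form multiple-error hypotheses (at most three) only from them.
multi_hypotheses = [c for k in (2, 3) for c in combinations(top, k)]
assert top == [5, 1]
```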

Step 4. Decision. Find the maximum group posterior, π*_I = max π_I, among all selected combinations I. If the corresponding set I* is empty, no gross error is detected. Otherwise, the measurements in set I* are suspected in gross error. However, the final decision is delayed. For instance, if rule 2/3 is used, a measurement is declared in gross error when detected at least twice out of three consecutive time periods. The same is true for the no gross error case.

Step 5. Action. If the current time t corresponds to the scheduled inspection time for the instrumentation, the following actions are assumed:

(i) Check the instruments in set I* detected at Step 4. Note that instrument checking may be delayed. In that case, I* is the set of all instruments declared in gross error between two consecutive inspections.

(ii) Decide which ones are actually faulty, say subset I'.

(iii) Repair or replace the instruments in set I'.

(iv) Update the age of the instruments in I', i.e., T_i(t) = 1 for a true failure followed by a corrective action.

Step 6. Reestimate the magnitudes of the gross errors, Equation 8-39. Only the magnitudes of the gross errors for the measurements in the detected set I* are reestimated. Note that, due to inherent Type I errors, the magnitudes of some falsely detected errors are also reestimated. But, if averaged data is used, their estimates should be much smaller than those for the true gross errors.

Step 7. Reestimate the parameters l_i^(t) and m_i^(t) of the beta prior distribution of θ_i (i = 1 . . . n). The associated parameter θ_i^(t) is also reestimated, by the ratio l_i / (l_i + m_i).

Step 8. Set time period t = t + 1 and return to Step 1.
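Step 7 can be sketched as a standard beta-Bernoulli update consistent with θ_i = l_i / (l_i + m_i). The book's exact recursions for l_i and m_i are not reproduced in this excerpt, so the increment rule below is an assumption:

```python
def update_beta_prior(l, m, failed):
    """Assumed beta-Bernoulli update after an inspection: increment the
    'failure' count l on a confirmed failure, otherwise the 'no failure'
    count m, then reestimate theta = l / (l + m) as in Step 7."""
    l, m = (l + 1.0, m) if failed else (l, m + 1.0)
    return l, m, l / (l + m)

l, m = 2.0, 38.0                       # illustrative initial counts
l, m, theta = update_beta_prior(l, m, failed=True)
assert abs(theta - 3.0 / 41.0) < 1e-12
```

As more inspections accumulate, l + m grows and the estimate θ becomes progressively less sensitive to the initial guess, which matches the slow convergence discussed below.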

A sequence of failures (gross error occurrences) for a certain instrument i, followed by detection and correction actions according to the Bayesian algorithm described above, is shown in Figure 8-1.

More details about the sequential Bayesian algorithm can be found in Tamhane et al. [24, 25]. Iordache [26] performed a comparative evaluation of the Bayesian algorithm against a similar strategy based on the measurement test, using simulation runs with 10,000 time periods. The performance criteria used were the probability of Type I errors, the power of correct detection, and the average delay time before a correct detection for the same average number of Type I errors.

In general, the Bayesian method outperforms the measurement test in the following situations: high frequencies of gross error occurrence (multiple gross errors), large spread in the magnitudes and frequencies of gross errors, and long delays in confirmation and repairs. On the other hand, the Bayesian method converges very slowly. Starting with an initial guess of θ_i equal to 33% or 300% of the true value, a large number of observed failures (on the order of 100) is needed before θ_i converges. Therefore, accurate initial estimates of the θ_i's are needed before the Bayesian method may be put to practical use.

If there is uncertainty about the prior estimates of the θ_i's, one strategy is to place more weight on the current data until more historical data is obtained. The performance of the Bayesian method is much less dependent on the estimates of the δ_i's. More work is required to make the Bayesian approach really competitive with other gross error identification strategies. If all implementation details are clarified, it will become an appropriate strategy for online applications.


LEGEND: F - failure; CD - correct detection; FD - false detection; CR - checking and repair; C - checking (no repair); D_i - detection delay

Figure 8-1. A sequential Bayesian failure detection process

PROPOSED PROBLEMS

NOTE: The proposed problems that are included in this chapter require more extensive calculations. A computer program or a mathematical tool such as MATLAB is required in order to get solutions to these problems.

Problem 8-1. Use the steam metering system for a methanol synthesis unit represented in Figure 7-2 and the data indicated in Problem 7-2 to simulate multiple gross errors. For faster solving and a clearer analysis, at most three simultaneous gross errors are recommended. If possible, apply both serial elimination and serial compensation strategies (with appropriate statistical tests at the α = 0.05 level of significance) and perform a comparison of results. Gross errors of various magnitudes and in locations with different detectability factors are recommended.

Problem 8-2. A section of a heat exchanger train from reference [32] is shown in Figure 8-2. A heat transfer fluid (HTF) is used to heat two hydrocarbon streams (A and B). Five heat balance equations can be employed in order to describe the system. The first four are obtained by equating the shell and the tube side duties. The fifth is an HTF energy balance.

The process operating values and the measured variables are given in Table 8-8. Standard deviations of the measurement errors are also listed in Table 8-8. The units for flows are bbl/hr and for the temperatures degrees F. Densities, in lb/bbl, are: HTF: 290, A: 300, B: 320. The enthalpies, in Btu/lb, are related to temperature by the following equations:

HTF: H = 0.323T + 1.14E-4 T^2
A: H = 0.424T + 2.80E-4 T^2
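For orientation, a small helper that evaluates an enthalpy correlation of this form and one exchanger side's duty, flow x density x enthalpy change. The coefficients follow the correlations as read from the problem statement here and should be checked against the original; the stream values in the example call are illustrative:

```python
def enthalpy_htf(T):
    """HTF enthalpy in Btu/lb at temperature T in degrees F
    (coefficients as read from the problem statement)."""
    return 0.323 * T + 1.14e-4 * T**2

def enthalpy_a(T):
    """Stream A enthalpy in Btu/lb (same caveat on coefficients)."""
    return 0.424 * T + 2.80e-4 * T**2

def duty(flow_bbl_hr, density_lb_bbl, h_in, h_out):
    """Heat duty for one exchanger side, Btu/hr:
    flow (bbl/hr) x density (lb/bbl) x enthalpy change (Btu/lb)."""
    return flow_bbl_hr * density_lb_bbl * (h_out - h_in)

# Illustrative call: HTF side cooling from 662 F to 600 F.
q_htf = duty(312.9, 290.0, enthalpy_htf(662.0), enthalpy_htf(600.0))
assert q_htf < 0  # the hot side releases heat, so its duty is negative
```

Equating the shell- and tube-side duties computed this way yields the four exchanger heat balances mentioned in the problem statement.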

Solve first the nonlinear data reconciliation to obtain the reconciled values for the case with no gross errors (the data in Table 8-8 is free of gross errors). To speed up the calculations, successive linearization is recommended, but any NLP solver can also be used, if available.

Figure 8-2. Heat exchanger network. Reproduced from reference [32] with permission of Gulf Publishing Co.

Next, change the measured value for flow F5 from 312.9 to 362.5 and for the temperature T8 from 662.0 to 669.1. Apply appropriate gross error detection and identification strategies to find the location of the gross errors and their corrected values. To compare the results, see reference [32].

Table 8-8 Data for the Heat Exchanger Network in Figure 8-2

[Table 8-8, which lists the node numbers and the measured values and standard deviations for the flows F1-F5 and the stream temperatures, is not reproduced here.]

SUMMARY

* For detection and identification of multiple gross errors, simultaneous or serial strategies have been designed. Two major types of serial strategies exist: serial elimination and serial compensation.

* The simultaneous strategy for multiple gross errors based on measurement tests is simple, but it usually detects too many nonexistent errors.

* The simultaneous strategies based on a GLR test require testing of multiple hypotheses and take significant computational time. A method of selecting the most likely hypothesis is required for these strategies.

* The simultaneous strategy based on a GLR test is equivalent to a serial elimination strategy based on a global test (elimination of one, two, three, and so on measurements).

* The major advantage of serial elimination is that it does not require prior knowledge about the location and magnitude of gross errors. It might take significant computational time, however, and could suffer a decrease in the solution accuracy due to the reduction in redundancy if too many measurements are eliminated.

* A certain modified serial compensation strategy is equivalent to serial elimination.

* The maximum number of measurements that can be eliminated is equal to the number of constraints (or reduced constraints, when unmeasured variables exist).

* For all gross error detection strategies, the global test should be applied first (to avoid unnecessary calculation of statistical tests when no gross error exists).

* The SSCS algorithm may detect many nonexistent gross errors, because of possible wrong compensation.

* The MSCS algorithm is equivalent to the serial elimination strategy using the iterative measurement test (IMT) for detection and identification of measurement biases. The MSCS, however, can detect leaks as well.


Data Reconciliation and Gross Error Detection: Multiple Gross Error Identification Strategies for Steady-State Processes

SUMMARY (continued)

* Bounds on variables enhance the performance of gross error detection strategies only if the measured variables are close to either bound.

* Combinations of tests (e.g., nodal and measurement tests) can be successfully implemented but require a strategy for reducing the number of test hypotheses.

* Strategies based on combinations of tests cannot be easily extended to nonlinear models. The MIMT and MSCS strategies can be applied to nonlinear models (with some modifications).

* One way to enhance the power of detection and identification of a gross error is to include a prior probability of instrument failure and a prior estimate of the magnitude of the gross error and apply a Bayesian type test. The prior information can be continuously updated by a sequential application of the Bayesian algorithm.

REFERENCES

1. Serth, R. W., and W. A. Heenan. "Gross Error Detection and Data Reconciliation in Steam-Metering Systems." AIChE Journal 32 (1986): 733-742.

2. Iordache, C., R.S.H. Mah, and A. C. Tamhane. "Performance Studies of the Measurement Test for Detecting Gross Errors in Process Data." AIChE Journal 31 (no. 7, 1985): 1187-1201.

3. Rosenberg, J., R.S.H. Mah, and C. Iordache. "Evaluation of Schemes for Detecting and Identifying Gross Errors in Process Data." Ind. & Eng. Chem. Proc. Des. Dev. 26 (1987): 555-564.

4. Rollins, D. K., and J. F. Davis. "Unbiased Estimation of Gross Errors in Process Measurements." AIChE Journal 38 (1992): 563-572.

5. Jiang, Q., M. Sanchez, and M. J. Bagajewicz. "On the Performance of Principal Component Analysis in Multiple Gross Error Identification." Ind. & Eng. Chem. Research 38 (no. 5, 1999): 2005-2012.

6. Sanchez, M., J. Romagnoli, Q. Jiang, and M. Bagajewicz. "Simultaneous Estimation of Biases and Leaks in Process Plants." Computers & Chem. Engng. 23 (no. 7, 1999): 841-858.

7. Yang, Y., R. Ten, and L. Jao. "A Study of Gross Error Detection and Data Reconciliation in Process Industries." Computers & Chem. Engng. 19 (Suppl., 1995): S217-S222.

8. Ripps, D. L. "Adjustment of Experimental Data." Chem. Eng. Progr. Symp. Series 61 (no. 55, 1965): 8-13.

9. Narasimhan, S., and R.S.H. Mah. "Generalized Likelihood Ratio Method for Gross Error Identification." AIChE Journal 33 (1987): 1514-1521.

10. Keller, J. Y., M. Darouach, and G. Krzakala. "Fault Detection of Multiple Biases or Process Leaks in Linear Steady State Systems." Computers & Chem. Engng. 18 (1994): 1001-1004.

11. Tong, H., and C. M. Crowe. "Detection of Gross Errors in Data Reconciliation by Principal Component Analysis." AIChE Journal 41 (no. 7, 1995): 1712-1722.

12. Jordache, C., and B. Tilton. "Gross Error Detection by Serial Elimination: Principal Component Measurement Test versus Univariate Measurement Test," presented at the AIChE Spring National Meeting, Houston, Tex., March 1999.

13. Bagajewicz, M., Q. Jiang, and M. Sanchez. "Performance Evaluation of PCA Tests for Multiple Gross Error Identification." Computers & Chem. Engng. 23 (Suppl., 1999): S585-S594.

14. Romagnoli, J. A., and G. Stephanopoulos. "Rectification of Process Measurement Data in the Presence of Gross Errors." Chem. Eng. Science 36 (1981): 1849-1863.

15. Harikumar, P., and S. Narasimhan. "A Method to Incorporate Bounds in Data Reconciliation and Gross Error Detection - II. Gross Error Detection Strategies." Computers & Chem. Engng. 17 (no. 11, 1993): 1121-1128.

16. Mah, R.S.H., G. M. Stanley, and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chem. Proc. Des. Dev. 15 (1976): 175-183.

17. Rosenberg, J. "Evaluation of Schemes for Detecting and Identifying Gross Errors in Process Data." M.S. Thesis, Northwestern University, Evanston, Ill., 1985.

18. Rollins, D. K., Y. Cheng, and S. Devanathan. "Intelligent Selection of Hypothesis Tests to Enhance Gross Error Identification." Computers & Chem. Engng. 20 (1996): 517-530.

19. Jiang, Q., and M. Bagajewicz. "On a Strategy of Serial Identification with Collective Compensation for Multiple Gross Error Estimation in Linear Data Reconciliation." Ind. & Eng. Chem. Research 38 (no. 5, 1999): 2119-2128.

20. Kim, I. W., M. S. Kang, S. Park, and T. F. Edgar. "Robust Data Reconciliation and Gross Error Detection: The Modified MIMT Using NLP." Computers & Chem. Engng. 21 (no. 7, 1997): 775-782.

21. Serth, R. W., C. M. Valero, and W. A. Heenan. "Detection of Gross Errors in Nonlinearly Constrained Data: A Case Study." Chem. Eng. Comm. 51 (1987): 89-104.

22. Renganathan, T., and S. Narasimhan. "A Strategy for Detection of Gross Errors in Nonlinear Processes." Ind. & Eng. Chem. Res. 38 (1999): 2391-2399.

23. Box, G.E.P., and G. C. Tiao. Bayesian Inference in Statistical Analysis. Reading, Mass.: Addison-Wesley, 1973.

24. Tamhane, A. C., C. Iordache, and R.S.H. Mah. "A Bayesian Approach to Gross Error Detection in Chemical Process Data. Part I: Model Development." Chemometrics and Intel. Lab. Sys. 4 (1988): 33-45.

25. Tamhane, A. C., C. Iordache, and R.S.H. Mah. "A Bayesian Approach to Gross Error Detection in Chemical Process Data. Part II: Simulation Results." Chemometrics and Intel. Lab. Sys. 4 (1988): 131-146.

26. Iordache, C. A Bayesian Approach to Gross Error Detection in Process Data. Ph.D. Dissertation, Northwestern University, Evanston, Ill., 1987.

27. Mann, N. R., R. E. Schafer, and N. D. Singpurwalla. Methods for Statistical Analysis of Reliability and Life Data. New York: Wiley, 1974.

28. Colombo, A. G., and D. Constantini. "Ground-Hypotheses for Beta Distribution as Bayesian Prior." IEEE Trans. on Reliability R-29 (no. 1, 1980): 17-20.

29. Romagnoli, J. A. "On Data Reconciliation: Constraint Processing and Treatment of Bias." Chem. Eng. Science 38 (1983): 1107-1117.

30. Golub, G. H., and C. Reinsch. "Singular Value Decomposition and Least Squares Solutions." Numer. Math. 14 (1970): 403-420.

31. Seber, G.A.F. Linear Regression Analysis. New York: Wiley, 1977.

32. Albers, J. E. "Data Reconciliation with Unmeasured Variables." Hydroc. Proc. (March 1994): 65-66.

Gross Error Detection in Linear Dynamic Systems

As it currently stands, industrial applications of dynamic data reconciliation have not been attempted. It is therefore not surprising that only a few attempts have been made to address problems in the subject of gross error detection in dynamic systems, even in the research literature. Several developments, however, have occurred in the closely related topic of model-based fault diagnosis, which can be gainfully exploited for gross error identification in chemical processes also. The purpose of this chapter is to expose the reader to the issues involved in this problem area and also to provide an introduction to the problem of model-based fault diagnosis.

It should be pointed out that, typically, gross error detection is more concerned with the problem of detecting biases in measured data or process leaks, whereas fault diagnosis treats a wider class of problems associated with sensors, actuators, and the process model. The introduction to fault diagnosis that we provide is brief only in as much as it pertains to the problem of detecting biases in measurements. The interested reader can refer to the book by Patton et al. [1] and the more recent book by Gertler [2] for a more comprehensive treatment of this subject. For the purposes of this chapter, we use the terms fault and gross error interchangeably.

In Chapter 7, we listed the basic requirements that any gross error detection strategy for steady-state processes should fulfill. These are the abilities to (i) detect the presence of one or more gross errors, (ii) identify the type and location of the gross error, (iii) identify multiple gross errors, and (iv) provide estimates of the gross errors. These requirements carry over to gross error detection techniques for dynamic systems also. In addition, these techniques should consider the following issues:

(i) In steady-state processes, gross error detection strategies exploit only spatial redundancy in the data for the purpose of detecting and identifying gross errors. Similar to dynamic data reconciliation, however, which exploits temporal redundancy in the data for improving the accuracy of estimation, gross error detection and identification can also exploit temporal redundancy for improving diagnostic performance. This is typically achieved by applying gross error detection techniques to a window of measurements made within a chosen time period.

(ii) Since a gross error has an effect only on those measurements that are made after its occurrence, it is also important to estimate, as part of the overall gross error detection strategy, the time instant at which a gross error has occurred. In the following section, we describe a procedure to meet these requirements.

It can be noted from the preceding chapters that for steady-state processes the techniques of data reconciliation and gross error detection go hand in hand. This is also valid for dynamic systems, and the type of state estimator used also has an impact on the gross error detection strategy. For the sake of simplicity we consider only gross errors caused by biases in sensors, although, in principle, the method we describe can be applied for identifying other types of faults. Furthermore, we restrict our consideration to linear dynamic systems for which a Kalman filter estimator is used for state estimation.

PROBLEM FORMULATION FOR DETECTION OF MEASUREMENT BIASES

We consider a linear dynamic system for which Equation 6-1 describes the dynamic evolution of the state variables. If biases in measurements are not present, then Equation 6-2 can be used to describe the relation between measurements and state variables. We assume that the statistical properties of the state and measurement noises given by Equations 6-3 through 6-7 are obeyed for the process. We further assume that a Kalman filter given by Equations 6-13 through 6-17 is used to estimate the state variables at each sampling time.

Consider the case when a bias of magnitude b in measurement i occurs at a time t = t0. One theoretical model for the bias is to assume that it occurs instantaneously at some time instant and, once it occurs, its magnitude remains constant for all subsequent times (until the sensor is recalibrated). This is also referred to as a step jump in the measurements [3]. In this case, the measurement model may be written as

where e_i is the ith unit vector and the unit step function is 1 for all times k >= t0 and is 0 otherwise (as in Chapter 6, the subscripts t0 and k are used to represent the time instants t0T and kT).
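The step-bias measurement model of Equation 9-1 can be sketched as follows; the drift model of Equation 9-2 would instead add a term growing at rate s from t0 onward. Function and argument names here are our own:

```python
def biased_measurement(Cx_k, v_k, b, i, k, t0):
    """Sketch of the step-bias model (Eq. 9-1): y_k = C x_k + v_k plus
    a bias of magnitude b on element i for all instants k >= t0."""
    y = [cx + v for cx, v in zip(Cx_k, v_k)]
    if k >= t0:          # unit step function: 1 for k >= t0, 0 otherwise
        y[i] += b
    return y

# Before t0 the bias is absent; from t0 onward measurement i is shifted.
assert biased_measurement([1.0, 2.0], [0.0, 0.0], 0.5, 1, k=4, t0=5) == [1.0, 2.0]
assert biased_measurement([1.0, 2.0], [0.0, 0.0], 0.5, 1, k=5, t0=5) == [1.0, 2.5]
```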

Equations 6-1 and 9-1 together represent the gross error (or fault) model caused by a measurement bias in sensor i. Although a bias is modeled here as a step change, it can also be modeled as a drift with a constant rate of change, as follows:

where s is the rate at which the bias changes.

The objectives of a gross error detection strategy are to (i) detect whether a gross error has occurred, (ii) determine the time t0 at which a gross error has occurred and the measurement i that contains the bias, and (iii) estimate the magnitude b or the rate of change s of the bias, as the case may be. For the sake of definiteness, in the subsequent sections we model sensor biases as step changes.

STATISTICAL PROPERTY OF INNOVATIONS AND THE GLOBAL TEST

The principal quantities used in gross error detection strategies are the innovations defined in Chapter 6, which are computed as part of the Kalman filter estimator at each time.

The innovations are analogous to the measurement residuals which have been used by the measurement test for detecting gross errors in steady-state processes. Under the null hypothesis that no biases are present in the measurements made from the initial time to the current time k, it can be proved [4] that the innovations are normally distributed with expected values and covariance matrix given by

Moreover, the innovations at time k and time j are not correlated; that is,

Exercise 9-1. Prove that the innovations follow a Gaussian distribution with the statistical properties given by Equations 9-4 through 9-6 when no gross errors are present in the measurements.

Utilizing the properties of the innovations, it is possible to construct a statistical test analogous to the global test defined in Chapter 7 for detecting the presence of a gross error. The test statistic is given by

Under the null hypothesis that no gross errors are present in the measurements, it can be proved that γ follows a chi-square distribution with n degrees of freedom, where n is the number of measurements. For a chosen level of significance α, the test criterion can be taken from this distribution, and the null hypothesis is rejected if γ exceeds the criterion. This test can be applied at every sampling time instant to detect when a gross error has occurred. If the test rejects the null hypothesis for the first time at some time instant, say t0,

then it may be concluded that a gross error has occurred at time t0. Of course, this conclusion is subject to the Type I and Type II error probabilities of the test. In order to protect against these errors, one possibility is to conclude that a gross error has occurred at time t0 if the test rejects the null hypothesis not only at time t0, but also for M out of the next N time instants. Such a simple voting system has been proposed by Rollins and Devanathan [5]. A more elegant approach is to use the sequential probability ratio test (SPRT), first proposed by Wald [6] and used in fault diagnosis by Montgomery and Williams [7], among others. This approach, however, has so far not been used in gross error detection.

Instead of constructing the global test using only the innovations at every sampling instant, the innovations obtained over a time window of, say, N sampling instants may be jointly used, because they have a joint Gaussian distribution by virtue of the property in Equation 9-6. The global test statistic in such a case is given by

This test was proposed by Mehra and Peschon [8], and it can be proved that under the null hypothesis the statistic has a chi-square distribution with Nn degrees of freedom. For a given level of significance, the test criterion can be chosen from this distribution to detect whether a gross error is present among the N measurements within the time window. Using this test, however, it is difficult to estimate the time of occurrence of a gross error.

Example 9-1

We consider a level control process for which a linear discrete model was derived in Example 6-1. Based on the data given in that example, measurements corresponding to the closed-loop behavior of the process were simulated without adding any biases to the measurements, and the Kalman filter is used to obtain estimates of level and valve position. The steady-state value of the covariance matrix of estimation errors for this process can be computed as

and hence the steady-state covariance matrix of innovations can be computed as

The global test statistic γ is computed at each time, as well as the cumulative global test statistic. These are shown in Figures 9-1 and 9-2, respectively, along with the chi-square test criterion at the 5% level of significance. It is observed from these plots that the GT rejects the null hypothesis for 2 out of 100 sampling instants. The cumulative global test, however, does not reject the null hypothesis at any time within this set of 100 samples. It should be noted that the test criterion for the simple GT is constant at 5.99, while the test criterion for the cumulative GT increases with time, since the number of degrees of freedom increases.

Measurements were also simulated for a bias in valve position of 0.5 volts occurring at the initial time, and the corresponding GT and cumulative GT statistics for 100 samples are shown in Figures 9-3 and 9-4, respectively. While the GT rejects the null hypothesis for 26 out of 100 samples, the cumulative GT rejects the null hypothesis for all samples. The results indicate that the cumulative GT does not commit Type I errors and is able to detect the presence of the bias for all sampling times. This is expected because the cumulative GT also exploits temporal redundancy. It should be cautioned, however, that, in this simulation, the time at which the gross error occurs is known exactly and, hence, the cumulative GT commits no Type I or Type II errors.

In a sequential application of the cumulative GT, the time of occurrence of the gross error is not known precisely. Therefore, not all the measurements used in computing the cumulative GT statistic will contain the effect of the gross error, and the test may not have perfect detection capability. Nevertheless, the results indicate that sequential tests such as SPRT [6], which is also a cumulative test, should be used for the purposes of gross error detection in dynamic processes.



Figure 9-2. Cumulative global test statistic for measurements without gross errors.

Figure 9-1. Global test statistic for measurements without gross errors.

Figure 9-3. Global test statistic for measurements with bias in valve position.


Figure 9-4. Cumulative global test statistic for measurements with bias in valve position.

The global test can detect whether a gross error has occurred and can also be appropriately used to estimate the time of occurrence of a gross error. It requires an identification strategy, however, to determine the type and location of the gross error (as in the case of steady-state processes described in Chapter 7).

Serial elimination strategies, such as those developed for steady-state systems, have not been adapted for dynamic systems as yet, though, in principle, it is possible to devise such strategies in combination with the global test. The generalized likelihood ratio (GLR) test used in steady-state processes, however, was in fact based on the GLR test proposed by Willsky and Jones [3] for fault diagnosis in dynamic processes. Thus, this test can be used for identifying the type and location of the gross error. In fact, this technique was applied by Narasimhan and Mah [9] for identifying different types of faults, including sensor biases, in dynamic chemical processes. We discuss the features of the GLR technique for detection, identification, and estimation of sensor biases.


GENERALIZED LIKELIHOOD RATIO METHOD

The GLR test for steady-state processes, which is described in Chapter 7, was shown to be capable of identifying different types of gross errors, provided a model for the effect of the gross error on the process (also known as the gross error model) is given. In the case of a dynamic process, the effect of a sensor bias of magnitude b in measurement i that occurs at time t0 is given by Equation 9-1. The evolution of the state variables is still described by Equation 6-1. Without the knowledge that this gross error has occurred, the Kalman filter estimates will continue to be obtained using Equations 6-13 through 6-17. Therefore, until time t0 - 1, when there is no bias in the measurements, the expected values of the innovations at each time will still be zero. At subsequent times, however, the expected values of the innovations at any time k >= t0 are given by

The matrix G_{k,t0} is referred to as the signature matrix and depends on the time k at which the innovations are computed and the time t0 at which a gross error has occurred. It depends on the system matrices and the type of control law used. For a control law based on the estimates as given by Equation 6-3, we can recursively compute the signature matrix using the following equations.

T_{k,t0} = A_k T_{k-1,t0} + B_k C_{k-1} J_{k-1,t0}    (9-10)

with all the above matrices initialized to the zero matrices for k < t0. It can also be proved that even if a sensor bias is present in the measurements, the innovations follow a Gaussian distribution with covariance matrix given by Equation 9-5. Moreover, the innovations at different times are not correlated. This result is valid in general for other types of additive faults, that is, faults whose effect on the process can be modeled as an additive term to the normal process model (compare Equations 6-2 and 9-1 in the case of a sensor bias).
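The behavior described above, with zero-mean innovations before the fault and a shifted mean afterward, can be checked numerically by averaging Kalman filter innovations over repeated simulations. The scalar system and all numerical values below are illustrative assumptions, not the book's level-control example.

```python
import numpy as np

# Toy scalar system: x_{k+1} = a x_k + w_k, z_k = x_k + v_k (+ bias for k >= t0)
a, q, r = 0.9, 0.04, 0.25          # transition, process and measurement noise
bias, t0, T, runs = 1.0, 25, 50, 400

rng = np.random.default_rng(1)
innov_sum = np.zeros(T)
for _ in range(runs):
    x, xhat, P = 0.0, 0.0, 1.0
    for k in range(T):
        # True process and (possibly biased) measurement.
        x = a * x + rng.normal(0.0, np.sqrt(q))
        z = x + rng.normal(0.0, np.sqrt(r)) + (bias if k >= t0 else 0.0)
        # Kalman filter time and measurement update.
        xpred, Ppred = a * xhat, a * P * a + q
        innov = z - xpred              # innovation: zero-mean if no fault
        K = Ppred / (Ppred + r)
        xhat, P = xpred + K * innov, (1 - K) * Ppred
        innov_sum[k] += innov

innov_mean = innov_sum / runs
pre_mean = float(np.mean(np.abs(innov_mean[5:t0])))   # ~0 before the bias
post_jump = float(innov_mean[t0])                     # ~bias at the onset
```

The averaged innovation mean is near zero before time t0, jumps by roughly the bias magnitude at the onset, and then decays as the filter absorbs the biased measurements, which is exactly the pattern the signature matrix encodes.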



Exercise 9-2. Prove that when a gross error caused by a bias in measurement i occurs at time t0, the expected values of the innovations are given by Equation 9-9, where the signature matrix is computed recursively using Equations 9-10 through 9-12. Also show that the innovations follow a Gaussian distribution with statistical properties given by Equations 9-4 through 9-6.

Since the innovations at different times are uncorrelated, they jointly follow a Gaussian distribution, both under the null hypothesis and under the alternative hypothesis that a sensor bias is present. Based on the statistical properties of the innovations established in Exercises 9-1 and 9-2, the GLR test can be applied to a window of N innovations computed from time t0 to t0+N. The GLR test statistic (which is equal to twice the natural logarithm of the maximum likelihood ratio, as in Equation 7-28) can be obtained in this case using

T = max_i T_i    (9-13)

where T_i is the maximum likelihood test statistic for a bias in measurement i given by

A sensor bias is identified in the measurement i* which has the maximum test statistic among all measurements. The maximum likelihood estimate of the magnitude of the bias is given by

Although it is possible to apply the GLR test using only the innovations at time t0 to identify the gross error that has occurred at this time, we exploit the temporal redundancy in the data by using all the innovations from time t0 to t0+N in the GLR test.
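For a single measurement with scalar innovations, the windowed GLR computation reduces to simple sums, matching the T_i = d_i^2/C_ii form with bias estimate d_i/C_ii used later in the on-line algorithm. The sketch below is illustrative only: the signature pattern (expected innovation per unit bias at each time in the window) and the innovation variance are assumed numbers, not derived from any process model.

```python
import numpy as np

def glr_bias_test(innovations, signatures, innov_var):
    # Scalar analogue of the windowed GLR test: with d = sum(g*v)/var and
    # C = sum(g*g)/var, the statistic is T = d^2/C and the maximum
    # likelihood bias estimate is b_hat = d/C.
    v = np.asarray(innovations, dtype=float)
    g = np.asarray(signatures, dtype=float)
    d = float(np.sum(g * v) / innov_var)
    C = float(np.sum(g * g) / innov_var)
    return d * d / C, d / C

rng = np.random.default_rng(2)
g = 0.8 ** np.arange(15)             # assumed signature pattern over the window
true_bias, var = 0.5, 0.01
v = true_bias * g + rng.normal(0.0, np.sqrt(var), size=15)

T_stat, b_hat = glr_bias_test(v, g, var)
T_clean, _ = glr_bias_test(rng.normal(0.0, np.sqrt(var), size=15), g, var)
```

Against a chi-square criterion of 3.84 (1 degree of freedom, 5% level) the biased window should be flagged, and the estimate b_hat should be recovered close to the true bias of 0.5, while a clean window produces a small statistic.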

Exercise 9-3. Using the joint distribution of innovations obtained in Exercises 9-1 and 9-2, derive the GLR test statistic for identifying the location of a sensor bias. Also derive the maximum likelihood estimate of the sensor bias magnitude given by Equation 9-15. Follow a procedure similar to that outlined in Chapter 7.

Example 9-2

The GLR test is applied to the level control process studied in Example 9-1. Using the system model developed in Example 6-1, the expected values of the innovations for biases of specified magnitudes in either the level or valve position measurements can be computed using Equations 9-9 through 9-12. The expected values of the innovations (measurement residuals) in level at different sampling instants, after a sensor bias of 0.2 volts (0.317 cm) in the level measurement or a sensor bias of 0.5 volts (0.3185 cm) in the valve position measurement has occurred at the start time, are plotted in Figure 9-5.

Similarly, the expected values of the valve position innovations for the same sensor biases are plotted in Figure 9-6. These plots essentially describe the expected evolution of the innovations after the particular sensor bias of specified magnitude has occurred, assuming that the time at which the bias occurs is known precisely. For a bias of a different magnitude, these curves will be shifted up or down. From these figures, it is clear that the expected trends in the two innovations are different for the two sensor biases, and it should, in principle, be possible to distinguish between these biases.

For a given set of measurements, the GLR method essentially determines the best fit of the pattern of measurements to these expected trends (shifted appropriately to determine the best estimate of the magnitude) in order to identify the bias that has occurred. The method also accounts for correlation among these trends and the relative magnitude of errors in the innovations. As a particular case, measurements corresponding to a bias


Figure 9-5. Expected evolution of level innovation for sensor biases.

Figure 9-6. Expected evolution of valve innovation for sensor biases.

of magnitude 0.5 volts in valve position were simulated at the initial time, and the GLR test statistics were computed for window lengths of 10 and 20, respectively.

The GLR test statistics for bias in level and bias in valve position were found to be 29.37 and 4.89, respectively, for a window length of 10, and 40.58 and 7.21, respectively, for a window length of 20. Since the maximum test statistic occurs for the bias-in-level hypothesis, and the test statistic also exceeds the test criterion (3.84 at a 5% level of significance), a bias in level is identified. The bias magnitude was estimated as 0.589 for a window length of 10 and 0.516 when a window length of 20 is used. The ability of the GLR method to identify the bias, as well as to obtain a more accurate estimate of its magnitude, increases with the window length, as expected.

In the above derivation, it is implicitly assumed that the time t0 at which a sensor bias is presumed to have occurred is known precisely. In practice, only an estimate of this time can be obtained. The procedure described in the preceding section, which makes use of the global test, can be used for this purpose. Alternatively, Willsky and Jones [3] used the GLR test itself to estimate the time of occurrence of the gross error by treating it as a parameter (similar to the unknown bias magnitude) and obtaining the maximum likelihood ratio over all possible values of t0 within the time window being considered. This can result in a significant computational burden, especially for large systems, unless the system matrices are independent of time.

An on-iiiie aigorithm which uses the Kalrnan filtzr for estimating the stzce variables at each t i~ne , :he giobal test for detecting the time s f occurrence of a gross e!-i~~r and tlis G I 2 tnethoc! f ~ r idenrifying the loca- tior! and es~i~nating the nlagnitude oi' the gi-uss error is as folio\vs. l* assurne that we are cul~entlj! at time k=O and have initial estimates ~f the state variables and the co\!ariance matrix of the estimates.

Step 1. Increment the time counter k, and use the Kalman filter equations for the current time instant k to compute the state variables and the covariance matrix of the state estimates using Equations 6-13 through 6-17. Also compute the innovations vk.

Step 2. Apply the global test using the test statistic (Equation 9-7). If the GT rejects the null hypothesis, initialize all elements of the vector d, matrix C, and matrices T, G, and J (which are required for computing the GLR test statistic), and the quantity gamma (required for computing the global


test statistic of Equation 9-8) to zero. Set the time index t0 = k and go to Step 3; or else return to Step 1.

Step 3. Update the matrices T, G, and J using Equations 9-10 through 9-12. Update d, C, and gamma using the following equations:

Step 4. Increment the time counter k and compute the state estimates and the covariance matrix of the state estimates using the Kalman filter equations. If the time index k = t0+N go to Step 5; or else return to Step 3.

Step 5. Apply the global test using Equation 9-8. If the GT rejects the null hypothesis, then it confirms that a gross error did occur at time t0. Compute the GLR test statistics Ti = (di)^2/Cii, where di is the ith element of d and Cii is the ith diagonal element of C. Identify a bias of magnitude di*/Ci*i* in the measurement i* which gives the maximum value of Ti among all the measurements. Recalibrate the sensor if required. Return to Step 1.

Since the global test which is used to determine the time of occurrence of a gross error in Step 2 may commit a Type I error, it is applied again at Step 5 using all the innovations during the elapsed time window of N measurements to confirm whether a gross error did occur N time steps before. This causes a delay of N sampling instants before a gross error is detected. It should be noted that the GT (confirmatory test) applied in Step 5 can also commit a Type I error or a Type II error. If a gross error did occur at time t0, and the GT at Step 5 does not detect this, then we falsely conclude that no gross errors are present in the measurements made so far and resume the on-line monitoring procedure

from Step 2. This causes a further delay of at least N time steps before the gross error is detected. Other variations of this on-line scheme are also possible.
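The five steps can be sketched in code. Everything below is a simplified scalar illustration: the system parameters, the injected bias, the chi-square criteria (3.84 for 1 degree of freedom and 31.41 for 20, both at the 5% level), and in particular the scalar decay factor used in place of the signature recursion of Equations 9-10 through 9-12 are all assumptions, and the simulation is noise-free so the behavior is deterministic and easy to follow.

```python
# Noise-free sketch of the on-line algorithm (Steps 1-5) for a scalar system.
a, q, r = 0.9, 0.04, 0.25        # model parameters used by the filter
N, CRIT_1, CRIT_N = 20, 3.84, 31.41
bias, t_bias = 2.0, 30           # sensor bias injected at time 30

x, xhat, P = 0.0, 0.0, 1.0
collecting, d, C, gamma, g, n = False, 0.0, 0.0, 0.0, 1.0, 0
detected_at, b_hat = None, None

for k in range(80):
    # Step 1: Kalman filter recursion and innovation (no simulated noise).
    x = a * x
    z = x + (bias if k >= t_bias else 0.0)
    xpred, Ppred = a * xhat, a * P * a + q
    V = Ppred + r
    innov = z - xpred
    K = Ppred / V
    xhat, P = xpred + K * innov, (1 - K) * Ppred

    if not collecting and innov * innov / V > CRIT_1:
        # Step 2: instantaneous global test opens a window at t0 = k.
        collecting, d, C, gamma, g, n = True, 0.0, 0.0, 0.0, 1.0, 0
    if collecting:
        # Step 3: accumulate d, C, and the cumulative statistic gamma.
        # The signature g is propagated by an assumed scalar decay (1-K)a,
        # a rough stand-in for the full signature recursion.
        d += g * innov / V
        C += g * g / V
        gamma += innov * innov / V
        g *= (1 - K) * a
        n += 1
        if n == N:
            # Step 5: confirmatory global test, then GLR estimate b = d/C.
            if gamma > CRIT_N:
                detected_at, b_hat = k - N + 1, d / C
                break
            collecting = False    # not confirmed; resume monitoring
```

With these numbers the window opens exactly at the onset time 30, the confirmatory test passes, and the bias estimate lands in the right range, though the crude signature model makes the estimate only approximate.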

In the GLR method described above, we have only considered the detection and identification of gross errors due to measurement biases. This approach can also be used to detect gross errors or faults due to

biases in actuators, process leaks, or even complete failure of sensors and actuators [9]. Once we expand the definition of gross errors, however, to include these types of faults, it is only proper to also critically examine a whole host of methods developed in the general area of fault diagnosis to evaluate their suitability. In the following section, we give a brief introduction to some of the fault diagnosis techniques and recommend to the

I interested reader the books by Patton et al. [ I ] , Basseville and Nikiforov ! [ lo] and Gertler [2] for a more detailed treatment. , I

FAULT DIAGNOSIS TECHNIQUES

The term fault covers a wide range of malfunctions associated with sensors, actuators, and process equipment. Faults include soft faults, such as biases in sensors or actuators; degradation in equipment performance, such as fouling of heat exchangers, catalyst deactivation, or partial blockages of pipes; and also hard faults, such as failure of sensors and actuators or unacceptable leaks from pipes and other process units. Several different techniques have been developed to detect and identify these types of faults. These methods can be grouped into different classes, which are briefly described here.

Faults which can be associated with different parameters can be detected and identified by directly estimating these parameters as part of a general state and parameter estimation method. These parameters can be compared with their nominal design values to determine whether a fault exists. For example, fouling of heat exchangers can be detected and identified by estimating the overall heat transfer coefficients of the heat exchangers. Similarly, deactivation of a catalyst can be detected and identified by estimating the rate constants of reactions.

A survey of such methods has been presented by Isermann [11]. Himmelblau and coworkers [12, 13] have described the application of this technique to fouling of heat exchangers and fouling of catalyst using a simulated example of a reactor with heat exchange. This technique can also be applied for detecting and identifying sensor or actuator biases by assuming these biases to be present and estimating their magnitudes. Based on the estimated magnitude, a decision can be made whether the bias is large enough to warrant corrective measures.

Bellingham and Lees [14] used this approach for detecting and estimating sensor biases for a simulated example of a level control process. In general, these techniques are more useful when the model is derived


from first principles and it is easy to associate the parameters with different faults. The technique, however, can also be applied for detecting and identifying faults or changes in the parameters of empirical models from their nominal values [2], but it is difficult to associate these with actual equipment-related faults. The book by Himmelblau [15] describes several techniques for fault diagnosis and applications to chemical processes.

A second class of methods is based on the design of observers for fault diagnosis, where the observer can be regarded as a state estimator which has a similar form as a Kalman filter for linear systems, but with the Kalman gain matrix chosen based on requirements other than minimum variance estimation considerations. The innovations obtained from such an observer can be defined by an equation similar to Equation 9-3.

For fault diagnosis using these innovations, the observer or the elements of the gain matrix are designed so that the innovations become more sensitive to those faults which we wish to detect. A linear transformation of the innovations may also be used, with the transformation matrix being designed to meet the requirements. This technique is equivalent to fault diagnosis techniques based on what are known as parity equations, which make use of input-output models rather than a state space representation [2]. Gertler and Luo [16] have described the design of parity equations for a distillation column to make them sensitive to sensor faults and insensitive to unmeasured disturbances in the feed flow and feed compositions.
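The idea of designing residuals so that each fault excites a distinct subset of them can be illustrated with a toy isolation scheme; the residual patterns, fault names, and threshold below are entirely hypothetical, chosen only to show the matching logic.

```python
# Hypothetical fault signature table: which residuals each fault excites.
signatures = {
    "sensor_1_bias":  (1, 0, 1),
    "sensor_2_bias":  (0, 1, 1),
    "actuator_fault": (1, 1, 0),
}

def isolate(residuals, threshold=1.0):
    # Threshold each residual to a 0/1 pattern and match it against the
    # table of known signatures; return the matching fault, if any.
    pattern = tuple(int(abs(r) > threshold) for r in residuals)
    for fault, sig in signatures.items():
        if sig == pattern:
            return fault
    return None

found = isolate([2.3, 0.1, 1.8])    # residuals 1 and 3 respond -> (1, 0, 1)
```

A residual vector whose first and third elements exceed the threshold matches the first signature, so that fault is isolated; a pattern matching no row leaves the fault unidentified.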

Another class of fault diagnosis methods is based on designing or structuring residuals such as innovations or parity equations so that each element of the residuals responds (differs significantly from zero) to a particular fault or set of faults but not to the others. Thus, the effect of each fault on the residuals can be described by a binary vector known as the fault signature. When a fault occurs, a test is applied to each element of the residuals to decide whether it is significantly different from zero or not. Based on these decisions and comparison with the fault signatures, a fault may be detected and identified. Although these methods have not been tried out in chemical processes, it should be pointed out that the concept of a fault signature has been utilized even in the GLR method.

Finally, a relatively new method for sensor fault detection and identification by Dunia et al. [17] uses a principal component model. The use of multivariate principal component analysis (PCA) for sensor fault identification via reconstruction provides a reliable technique for fault diagnosis when there is sufficient correlation among the measurements of process variables. The principal component model captures measurement

correlations and reconstructs each variable by successive substitution and optimization. Sensor reconstruction is used to validate the sensor measurement via the PCA model. The procedure proposed by Dunia et al. [17] assumes that one sensor has failed and that the remaining ones are used for reconstruction. A sequential procedure is used to analyze and validate all sensors.
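A minimal sketch of PCA-based sensor reconstruction in the spirit of this approach is shown below. The simulated data, the single-component model, and the iteration count are assumptions of this illustration, not the published algorithm: each candidate sensor is reconstructed from the others by successive substitution, and the faulty sensor is the one whose reconstruction restores a small squared prediction error (SPE).

```python
import numpy as np

rng = np.random.default_rng(4)

# Training data: 4 correlated sensors driven by one latent variable.
t = rng.normal(size=(500, 1))
W = np.array([[1.0, 0.8, 0.6, 0.9]])
X = t @ W + 0.05 * rng.normal(size=(500, 4))

# PCA model with one principal component (loadings from the SVD).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
p = Vt[0]                          # unit-norm loading vector, shape (4,)

def reconstruct(x, i, iters=50):
    # Reconstruct sensor i from the others by successive substitution:
    # repeatedly project onto the PCA model and replace only entry i.
    z = x.copy()
    for _ in range(iters):
        z[i] = (p * (z @ p))[i]
    return z

def spe(x):
    # Squared prediction error (residual) with respect to the PCA model.
    resid = x - p * (x @ p)
    return float(resid @ resid)

# A new sample with a gross error of 2.0 added to sensor 2 (0-based index).
sample = np.array([0.5, 0.4, 0.3, 0.45])
sample[2] += 2.0

# Reconstruct each sensor in turn; the faulty one minimizes the SPE.
spe_after = [spe(reconstruct(sample - X.mean(axis=0), i)) for i in range(4)]
faulty = int(np.argmin(spe_after))
```

Reconstructing the biased sensor brings the sample back onto the correlation structure, so its SPE collapses, while reconstructing any healthy sensor leaves a large residual from the fault.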

THE STATE OF THE ART

Extensive research in the area of gross error detection and identification in dynamic chemical processes has not been carried out. Even the few techniques that have been attempted are applicable only to linear systems or linearized approximations of nonlinear processes. Bagajewicz and Jiang [18] have proposed an integral dynamic measurement test for gross error detection in linear dynamic processes. This test is essentially the measurement test developed for steady-state processes, applied to a dynamic flow process.

In their method, since the integral form of the dynamic flow balances is converted to algebraic equations by modeling the flows and levels as polynomials in time, linear data reconciliation and gross error detection tests developed for steady-state processes can be applied. Albuquerque and Biegler [19] have considered the problem of gross error detection in nonlinear dynamic processes. Their method is significantly different from the traditional methods of data reconciliation and gross error detection, and is based on robust estimation of the state variables in the presence of gross errors.

In order to obtain good state estimates even in the presence of gross errors, suitable forms of the objective function for reconciliation are chosen. Although the authors demonstrate that their methods work well for selected examples, extensive testing still needs to be done. Moreover, these methods can only be applied for treating measurement biases and not other types of gross errors or faults.

In summary, it will take more years of research and development effort before industrial applications of gross error detection for dynamic processes can be taken up. Since significant developments have occurred in the field of fault diagnosis, some of which have been applied in nuclear and aerospace engineering, it is also worthwhile examining how these techniques can be adapted and used for chemical processes as well.


SUMMARY

• Gross error detection in dynamic systems involves not only the detection and identification of gross errors but also the estimation of the time at which a gross error has occurred.
• The innovations used in Kalman filtering for discrete linear dynamic models are normally distributed with zero mean and a known covariance matrix, and they can be used to construct statistical tests.
• A global test statistic for gross error detection can be developed for linear dynamic processes. The test can be applied either at each time instant or over a time window of measurements.
• The time of occurrence of a gross error can be detected by a global test, a sequential probability ratio test, or the generalized likelihood ratio test.
• The GLR test can be used to detect, identify, and estimate gross errors.
• An on-line gross error detection algorithm is available for linear dynamic processes which uses Kalman filtering for estimating the state variables, the dynamic global test for detecting the time when a gross error occurs, and the dynamic GLR test for gross error identification and estimation of the magnitudes of the gross errors.
• Techniques developed for fault diagnosis in dynamic systems can be adapted for gross error detection and identification.

REFERENCES

1. Patton, R., et al. Fault Diagnosis in Dynamic Systems-Theory and Applications. Englewood Cliffs, N.J.: Prentice-Hall, 1989.

2. Gertler, J. J. Fault Detection and Diagnosis in Engineering Systems. New York: Marcel Dekker, 1998.

3. Willsky, A. S., and H. L. Jones. "A Generalized Likelihood Ratio Approach to the Detection and Estimation of Jumps in Linear Dynamic Systems." IEEE Trans. Automatic Control AC-21 (1976): 108-112.

4. Maybeck, P. S. Stochastic Models, Estimation and Control-Vol. 2. New York: Academic Press, 1982.

5. Rollins, D. K., and S. Devanathan. "Unbiased Estimation in Dynamic Data Reconciliation." AIChE Journal 39 (1993): 1330-1334.

6. Wald, A. Sequential Analysis. New York: Wiley, 1947. Reprint, New York: Dover, 1973.

7. Montgomery, R. C., and J. P. Williams. "Analytic Redundancy Management for Systems with Appreciable Structural Dynamics," in Fault Diagnosis in Dynamic Systems-Theory and Applications (edited by R. Patton, P. Frank, and R. Clark), pp. 361-386. Englewood Cliffs, N.J.: Prentice-Hall, 1989.

8. Mehra, R. K., and J. Peschon. "An Innovations Approach to Fault Detection and Diagnosis in Dynamic Systems." Automatica 7 (1971): 637-640.

9. Narasimhan, S., and R. S. H. Mah. "Generalized Likelihood Ratios for Gross Error Identification in Dynamic Processes." AIChE Journal 34 (1988): 1321-1331.

10. Basseville, M., and I. V. Nikiforov. Detection of Abrupt Changes-Theory and Applications. Englewood Cliffs, N.J.: Prentice-Hall, 1993.

11. Isermann, R. "Process Fault Diagnosis Based on Modeling and Estimation Methods: A Survey." Automatica 20 (1984): 387-404.

12. Watanabe, K., and D. M. Himmelblau. "Fault Diagnosis in Nonlinear Chemical Processes-Parts I and II." AIChE Journal 29 (1983): 243-260.

13. Park, S., and D. M. Himmelblau. "Fault Detection and Diagnosis via Parameter Estimation in Lumped Dynamic Systems." Ind. Eng. Chem. Des. Dev. 22 (no. 3, 1983): 482-487.

14. Bellingham, B., and F. P. Lees. "The Detection of Malfunction Using a Process Control Computer: A Kalman Filtering Technique for General Control Loops." Trans. IChemE 55 (1977): 253-265.

15. Himmelblau, D. M. Fault Detection and Diagnosis in Chemical and Petrochemical Processes. Amsterdam: Elsevier, 1978.

16. Gertler, J. J., and Q. Luo. "Robust Isolable Models for Failure Diagnosis." AIChE Journal 35 (1989): 1856-1868.

17. Dunia, R., S. J. Qin, T. F. Edgar, and T. J. McAvoy. "Identification of Faulty Sensors Using Principal Component Analysis." AIChE Journal 42 (no. 10, 1996): 2797-2812.

18. Bagajewicz, M. J., and Q. Jiang. "Gross Error Modeling and Detection in Plant Linear Dynamic Reconciliation." Computers & Chem. Engng. 22 (no. 12, 1998): 1789-1809.

19. Albuquerque, J. S., and L. T. Biegler. "Data Reconciliation and Gross-Error Detection for Dynamic Systems." AIChE Journal 42 (1996): 2841-2856.


Design of Sensor Networks

The principal objective of data reconciliation and gross error detection is to improve the accuracy and consistency of estimates of process variables. These techniques certainly reduce the error content in measurements if redundancy exists in the measurements. The extent of improvement that can be achieved depends crucially on (1) the accuracy of the sensors, which is specified by the variance of the measurement errors, and (2) the number of variables and their type which are measured. Different sensors may be available for measuring a variable, with widely varying capabilities such as the range over which they can measure, reliability, and accuracy. The cost of the sensor will be a function of its capabilities. This information must typically be obtained from instrumentation manufacturers or suppliers [1].

If we consider all the different variables such as flow rates, temperatures, pressures, and compositions of the streams in a process, these could be of the order of several thousands in number. Clearly, from the viewpoint of cost, complexity, or technical feasibility it is not possible to measure each and every variable. Only a subset of these variables is usually measured. It is estimated that the cost of instrumentation (including control elements) is about 2-8% of the total fixed capital cost of a plant [2]. During the design phase of a plant, the decisions regarding which variables should be measured are generally taken when the piping and instrumentation diagrams for the process are prepared [3].

Current practices, however, indicate that these decisions are made based on previous experience with similar processes and rules of thumb. We refer to the problem of selecting the variables to be measured as the

design of a sensor network. Although this problem is an important one in the design of new plants, it can also be used to retrofit the measurement structure of existing plants by identifying new variables that need to be measured for improved monitoring and control of the process.

The design of a sensor network is influenced by different considerations, such as controllability of the plant, safety, reliability, environmental regulations, and accurate estimation of all important variables. If the estimates of variables are used in control, then the accuracy of estimation also has an effect on control performance. Keeping the scope of this text in mind, in this chapter we only consider the design of sensor networks for maximizing the accuracy of estimation through data reconciliation, while giving due consideration to the cost of the design.

Moreover, the treatment in this chapter is limited to linear (flow) processes only. It should be noted that the objective of maximizing estimation accuracy is only one of the important considerations, and a comprehensive design should also take into account the other requirements mentioned above. This problem has been receiving increasing attention from different researchers in recent years, and new solution strategies are being developed. It may require several years of additional effort before these solutions are implemented in practice.

ESTIMATION ACCURACY OF DATA RECONCILIATION

Before we discuss the mathematical formulation of the sensor network design problem, we first examine the estimation accuracy obtained through data reconciliation and the effect that the choice of measured variables has on it. The flow reconciliation example discussed in Example 1-1 in Chapter 1 does highlight some of these issues. We reexamine this problem in greater depth.

Example 10-1

The reconciled estimates of the stream flows for the process shown in Figure 1-2 were presented in Tables 1-1 and 1-2 for different choices of measured variables. Let us consider the results of Table 1-1, for which all flows are measured, and Case 2 of Table 1-2, for which only the flows of streams 1 and 2 are measured. The differences between the estimated and true values of all streams (estimation errors) can be computed from these results and are shown in Table 10-1, along with the sum of squares of the estimation errors.


From these results, we can observe that the errors in the flow estimates of streams 1, 3, 5, and 6 are much less for the case when all flows are measured as compared to the case when only the flows of streams 1 and 2 are measured. The estimation errors for streams 2 and 4, however, are marginally more when all flows are measured. Although, from a purely intuitive viewpoint, we expect the estimation errors to be reduced if more measurements are available, it is clear from this example that the estimation errors for all variables are not reduced when more measurements are made. This is more forcefully brought out by the example presented by Mah [4], where it was shown through simulation that, as more measurements are made, a larger fraction of the reconciled estimates have smaller errors. It is thus clear that it is not appropriate to focus on any particular variable for the purpose of designing sensor networks to increase the accuracy of estimation.

Table 10-1
Estimation Errors in Reconciled Flows for Process in Figure 1-2

Stream    All Flows Measured    Flows 1, 2 Measured
1
2
3
4
5
6
Sum of Squares of Estimation Errors

As an overall measure of estimation accuracy, we can use the sum of squares of the estimation errors of all variables (which represents the overall inaccuracy). Table 10-1 shows that the sum of squares of estimation errors is less when more measurements are available. We can therefore use this measure in order to design sensor networks. This measure, however, also depends on the measured values, which can be different each time due to their random characteristics. The appropriate measure that we can use for design purposes is the expected value of the sum of squares of the estimation errors, which is independent of the actual outcome of the measurements and depends only on the sensor network design as well as the inherent process structure.

The overall measure for estimation accuracy was first proposed by Kretsovalis and Mah [5] and is defined by

It should be noted that as J decreases the estimation accuracy increases, and therefore we refer to it as the measure of the overall estimation accuracy. It is implicitly assumed in the above definition that all variables are observable. If there are unobservable variables, then the measure of overall estimation accuracy can be written as the expected sum of squares of estimation errors for observable variables only. We will ignore this modification and restrict our considerations throughout this chapter to the design of sensor networks which ensure the observability of all variables. It can be proved [5] that the overall estimation accuracy given by Equation 10-1 for a data reconciliation solution is given by

where S is the covariance matrix of the estimation errors and the operator Tr is the trace of the matrix. It should be noted that the diagonal elements of S are the variances of the estimation errors, and J is therefore the sum of the variances of the estimation errors, which is equivalent to the expected sum of squares of the estimation errors. Although it is possible to derive the estimation error covariance matrix from the data reconciliation solutions for the measured and unmeasured variables given in Chapter 3, we describe later an alternative sequential update procedure which is more useful in the context of the sensor network design problem. Different approaches have been developed to solve the sensor network design problem. In the following section, we discuss these methods, which consider objectives of estimation accuracy, observability, and cost.
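As a sketch, the estimation-error covariance and the accuracy measure J = Tr(S) can be computed directly for a small fully measured flow network; the incidence matrix and measurement variances below are hypothetical, not an example from the book.

```python
import numpy as np

# Hypothetical 3-node, 4-stream flow network: node balances A x = 0.
A = np.array([
    [1.0, -1.0,  0.0, -1.0],
    [0.0,  1.0, -1.0,  0.0],
    [0.0,  0.0,  1.0, -1.0],
])
Sigma = np.diag([1.0, 2.0, 1.5, 1.0])   # assumed measurement error variances

# For a fully measured linear process, xhat = (I - G) y with
# G = Sigma A' (A Sigma A')^{-1} A, and the estimation-error covariance
# simplifies to S = (I - G) Sigma because G Sigma is symmetric idempotent
# in the weighted sense.
G = Sigma @ A.T @ np.linalg.inv(A @ Sigma @ A.T) @ A
S = (np.eye(4) - G) @ Sigma

J_reconciled = float(np.trace(S))    # overall accuracy measure, Tr(S)
J_raw = float(np.trace(Sigma))       # without reconciliation: sum of variances
```

Reconciliation can only shrink the trace, so J_reconciled is strictly between zero and the raw sum of measurement variances, which is the sense in which redundancy improves the overall estimation accuracy.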

SENSOR NETWORK DESIGN

Methods Based on Matrix Algebra

Let us consider a linear flow process for which the material balances are given by Equation 3-2:


where the variables x represent the stream flows. In general, only some of these flow variables are measured, and the relationships between the measurements and stream flows can be represented using

where each row of matrix H is a unit vector with unity in the column corresponding to the flow variable which is measured, the number of rows being equal to the number of measurements. We have chosen Equations 10-3 and 10-4 to represent a partially measured linear process rather than the equivalent alternative model of Equations 3-1 and 3-11, because it is more convenient for the purposes of designing sensor networks.

Minimum Observable Sensor Networks

We are interested in sensor networks which ensure the observability of all variables. We therefore first address the question of the minimum number of measurements to be made in order to ensure that every flow variable is observable. We will for convenience refer to such a design as a minimum observable sensor network. If there are n stream flows to be estimated and we have m flow constraints, then it is evident that at least n-m flows must be specified. In other words, the minimum number of measurements is n-m.

For the flow process considered in Example 1-1, the minimum number of measurements to be made in order to ensure that all stream flows are observable is 2, since there are 6 streams and 4 flow balances. Case 3 of Example 1-2 is a specific instance of a minimum observable sensor network for this process. An additional point to be noted is that in a minimum observable sensor network none of the measured variables is redundant, and the reconciled values of these variables are exactly equal to their respective measured values.

Not every combination of n-m measurements will give rise to an observable system, however. For example, in Case 3 of Example 1-2, although two measurements are made, the flows of streams 2 to 5 are unobservable. The condition that a sensor network must satisfy in order to ensure observability of all variables in a linear process is discussed as follows.

Let us separate the flow variables into a set of n-m independent variables xI and a set of m dependent variables xD, and recast Equation 10-3 as

The dependent variables are chosen in such a way that the matrix AD is non-singular. If we measure only the independent variables, then we can use Equation 10-5 to compute unique estimates of the dependent variables as

where

It is thus clear that this sensor network is a minimum observable sensor network. Therefore, the condition to be satisfied by a minimum sensor network in order to give rise to an observable system is that the sub-matrix corresponding to the unmeasured variables should be nonsingular. Note that this implies that the columns of the constraint matrix corresponding to unmeasured variables are linearly independent (which is the observability condition in Exercise 3-3).

If more measurements are made than the minimum required to ensure observability of all variables, then we obtain a redundant sensor network design. Even if a redundant sensor network is designed, it does not automatically imply that all flows are observable. There could be subsets of variables which are unobservable while the rest are redundant. A redundant sensor network gives rise to an observable process if and only if we can choose n-m independent variables from among the set of measured variables such that the constraint submatrix corresponding to the remaining variables is nonsingular. In this case, the dependent set contains one or more measured variables. We refer to such a design as a redundant observable sensor network. We can always obtain a redundant observable sensor network starting from a minimum observable sensor network by choosing to additionally measure one or more of the unmeasured variables in the minimum sensor network.


Estimation Accuracy of Minimum Observable Sensor Networks

We now consider a minimum observable sensor network design and obtain the overall estimation accuracy for reconciled estimates. Based on the discussion above, we can choose the measured variables as the independent variables. For any observable sensor network (non-redundant or otherwise), the estimates obtained using data reconciliation must satisfy constraint Equation 10-3. Thus, Equation 10-6 can also be used to relate the reconciled estimates for an appropriate choice of the independent and dependent variables.

Since there is no redundancy in a minimum observable sensor network, the estimates of the independent (measured) variables are equal to their respective measurements. This implies that the covariance matrix of estimation errors in the independent variables is equal to QI, which is the covariance matrix of measurement errors of the independent variables. Let us denote the covariance matrices of estimation errors corresponding to the independent and dependent variables by SI and SD, respectively. Then we obtain from the preceding arguments that

Using Equations 10-8 and 10-6 we can show that

Combining Equations 10-2, 10-8, and 10-9, the measure of overall estimation accuracy for a minimum observable sensor network can be expressed as

A minimum observable sensor network design that minimizes J defined by Equation 10-10 is desired. In order to solve this problem, a mixed integer optimization problem can be used, which is described later in this chapter. Here we will use a naive approach and examine every feasible combination to determine the optimal solution. We can select every combination of n-m independent variables (such that the sub-matrix corresponding to the dependent variables is nonsingular). For each combination, the independent variables can be chosen as the measured variables and the measure J for each sensor network design can be computed using Equation 10-10. The combination that gives the least J is the optimal sensor network design that we seek.
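The naive enumeration just described can be sketched in a few lines of code. The sketch below is not the book's implementation: the network used at the bottom is a hypothetical two-unit series process (x1 -> x2 -> x3), chosen only so the result can be checked by hand, and the function name is ours.

```python
import itertools
import numpy as np

def best_min_observable_design(A, var):
    """Enumerate every choice of n-m measured (independent) variables for the
    flow constraints A x = 0, keep the observable ones, and return the design
    with the smallest overall expected estimation error J (Equation 10-10)."""
    m, n = A.shape
    best = None
    for measured in itertools.combinations(range(n), n - m):
        dep = [j for j in range(n) if j not in measured]
        A_D = A[:, dep]
        if abs(np.linalg.det(A_D)) < 1e-10:
            continue                                    # A_D singular: unobservable
        F = -np.linalg.solve(A_D, A[:, list(measured)])  # x_D = F x_I (cf. Eq. 10-6)
        S_I = np.diag([var[j] for j in measured])        # no redundancy: S_I = Q_I
        J = float(np.trace(S_I) + np.trace(F @ S_I @ F.T))  # Equation 10-10
        if best is None or J < best[1]:
            best = (measured, J)
    return best

# Hypothetical two-unit series process x1 -> x2 -> x3 (not the book's example):
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
measured, J = best_min_observable_design(A, var=[1.0, 4.0, 9.0])
```

With unequal sensor variances the enumeration correctly picks the most accurate sensor (stream 1), since every other flow is then estimated with that sensor's variance.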

Example 10-2

We will illustrate the minimum observable sensor network design that maximizes estimation accuracy for the ammonia process shown in Figure 10-1. We will limit our consideration only to the overall mass flows of this process. For simplicity, let us consider the case when the flow sensors used for measuring any stream have an error with variance equal to 1. Since there are 8 streams and 5 process units, we require a minimum of 3 sensors to observe all variables.

The different feasible combinations of sensor locations along with the corresponding measures of estimation accuracy are shown in Table 10-2. We can observe that there are 6 optimal sensor network designs corresponding to sensor locations (1, 2, 6), (2, 5, 7), (1, 3, 6), (3, 5, 7), (1, 4, 6), and (4, 5, 7) with a minimum expected sum of squares of estimation errors equal to 11 units. It can also be observed that, although there are 56 combinations of choosing 3 sensor locations out of 8, only 32 of these combinations give rise to observable sensor network designs.

Figure 10-1. Simplified ammonia process.

Table 10-2 Minimum Observable Sensor Network Designs for Ammonia Process

No. Measured Variables Overall Expected Estimation Error

Redundant observable sensor networks. The measure of estimation accuracy for redundant observable sensor networks can be computed using simple update formulae developed by Kretsovalis and Mah [5]. Let us begin with a minimum observable sensor network corresponding to a set of n-m measured independent variables xI and the remaining unmeasured variables xD. The measure of estimation accuracy for this sensor network is given by Equation 10-10. Let us consider the addition of a new sensor to measure one of the variables in the set xD. Let q be the variance in the error of this new measurement. As in Equation 10-4, the new measurement y can be related to the variables x by

where

and hT is a unit row vector with unity in the column position corresponding to the new variable being measured. The updated expected estimate error covariance matrices of the independent and dependent variables after the addition of this new measurement are given by


where

The change in the measure of estimation accuracy due to the addition of this new measurement can be shown to be


Equations 10-12, 10-15, and 10-16 can be directly used to compute the change in the measure of estimation accuracy due to the addition of a new measurement, using the covariance matrix of estimate errors from the preceding sensor network design solution. Thus, starting from a minimum sensor network design solution, the measure of estimation accuracy for a redundant sensor network design can be obtained by successively adding the required measurements and using the update equations after each addition. Similar equations can also be derived for deletion of a measurement from a redundant sensor network design solution. In this case, the change in the measure of estimation accuracy is given by

where

and the updated covariance matrix of estimate errors in the independent variables is given by

SI(new) = SI + kd SI f fT SI   (10-20)

It should be noted that the sets of independent variables and dependent variables do not change as new measurements are added or deleted, so that after a series of additions or deletions each of these sets can contain a mixture of measured and unmeasured variables. Care must be taken, however, when a measurement is deleted to ensure that an unobservable design is not obtained. In fact, if a measurement is deleted which can lead to an unobservable process, then the denominator in Equation 10-19 will become zero, and this can be used as an indicator to avoid such choices.
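The sequential update described above can be sketched as a pair of rank-one covariance updates. This is our sketch, not the book's code: the gain k = 1/(q + fT SI f) matches the example value k1 = 1/3 below, but the symbol names are ours, and the closing toy network (a three-stream series process with one sensor, so SI = [[1]] and F = [[1],[1]]) is hypothetical.

```python
import numpy as np

def add_measurement(S_I, F, f, q):
    """Update S_I and J when a sensor measuring f'x_I (error variance q) is
    added; a sketch consistent with Equations 10-12 through 10-16."""
    f = np.asarray(f, dtype=float).reshape(-1, 1)
    k = 1.0 / (q + float(f.T @ S_I @ f))          # gain (cf. Equation 10-15)
    S_new = S_I - k * (S_I @ f @ f.T @ S_I)       # updated covariance
    M = np.eye(S_I.shape[0]) + F.T @ F            # dependent vars enter via x_D = F x_I
    dJ = -k * float(f.T @ S_I @ M @ S_I @ f)      # change in J (cf. Equation 10-16)
    return S_new, dJ

def delete_measurement(S_I, F, f, q):
    """Counterpart for deletion; a vanishing denominator signals that the
    deletion would make the process unobservable (cf. Equation 10-19)."""
    f = np.asarray(f, dtype=float).reshape(-1, 1)
    den = q - float(f.T @ S_I @ f)
    if abs(den) < 1e-12:
        raise ValueError("deletion would make the process unobservable")
    k = 1.0 / den
    S_new = S_I + k * (S_I @ f @ f.T @ S_I)
    M = np.eye(S_I.shape[0]) + F.T @ F
    dJ = k * float(f.T @ S_I @ M @ S_I @ f)
    return S_new, dJ

# Hypothetical series process x1 = x2 = x3 with only x1 measured:
S_I = np.array([[1.0]])
F = np.array([[1.0], [1.0]])
S_new, dJ = add_measurement(S_I, F, f=[1.0], q=1.0)   # additionally measure x2
```

Deleting the measurement just added recovers the original covariance exactly, which is a convenient consistency check on the two formulas.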

Example 10-3

We will consider the ammonia process example and compute the decrease in the expected error in estimates for the addition of a single measurement to a minimum observable sensor network design. The variances of all sensor errors are taken as unity, as before. For this purpose, we will start with the optimal minimum observable sensor design obtained in Example 10-2, in which the variables 2, 5, and 7 are measured.

Choosing these measured variables as the independent variables, the covariance matrix of estimation errors in the independent variables is the identity matrix (of dimension 3). The matrix F is given by

If we choose, in addition, to measure the flow of stream 1, then the vector which relates this measurement to the independent variables is given by

The value of k1 from Equation 10-15 is equal to 1/(1+2) = 1/3, and hence the decrease in estimation error can be calculated from Equation 10-16 as -3.3333. The updated covariance matrix of estimate errors of the independent variables is given by

In order to design an optimal redundant observable sensor network for a specified number of sensors, say r (r > n-m), we can start with any minimum observable sensor network design and add r-n+m additional sensors, one at a time, and update the measure of estimation accuracy using Equations 10-12, 10-15, and 10-16. We can then modify the sensors by, in turn, adding a new measurement and deleting an existing measurement to get a new redundant observable sensor network design consisting of r sensors. Equations 10-17 through 10-19 can be used for updating the measure of estimation accuracy when a measurement is deleted.

In this manner, all feasible combinations of r sensor locations can be examined in order to find the design which gives the maximum expected estimation accuracy. This will result, however, in an exponential number of solutions to be examined for a general problem. Kretsovalis and Mah [5] outlined two sub-optimal design procedures for a redundant sensor network design for a specified number of sensors, as described below.



Algorithm 1

Step 1. Determine the optimal minimum observable sensor network.

Step 2. Add a new sensor in turn to each of the remaining unmeasured variables and compute the reduction in estimation error using Equation 10-16.

Step 3. Based on the results of Step 2, select the best r-n+m sensor locations (the locations that give the maximum reduction in estimation error) to obtain the redundant sensor network design.

Algorithm 2

Step 1. Same as in Algorithm 1.

Step 2. Same as in Algorithm 1.

Step 3. Determine the sensor placement that gives the maximum reduction in expected estimation error from the results of Step 2 and add it to the measured set of variables. Stop if the number of measurements made so far is r; or else return to Step 2.

Neither of the above algorithms necessarily gives the sensor network design that maximizes estimation accuracy, but both reduce the computational burden significantly.
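The greedy strategy of Algorithm 2 can be sketched as follows. This is our sketch, not the authors' code: J is evaluated directly from a null-space basis N of the flow constraints rather than through the update formulas, and the closing three-stream series process is a hypothetical stand-in for the ammonia example.

```python
import numpy as np

def overall_J(N, measured, var):
    """Overall expected estimation error J for an observable (possibly
    redundant) network; N is a null-space basis of the constraints (x = N z)."""
    M = sum(np.outer(N[j], N[j]) / var[j] for j in measured)
    S = N @ np.linalg.inv(np.atleast_2d(M)) @ N.T   # estimate-error covariance
    return float(np.trace(S))

def greedy_redundant_design(N, var, measured0, r):
    """Sketch of Algorithm 2: starting from a minimum observable design
    measured0, repeatedly add the one sensor that most reduces J until
    r sensors are placed."""
    measured = set(measured0)
    n = N.shape[0]
    while len(measured) < r:
        J_best, j_best = min((overall_J(N, measured | {j}, var), j)
                             for j in range(n) if j not in measured)
        measured.add(j_best)
    return sorted(measured), overall_J(N, measured, var)

# Hypothetical three-stream series process (null space spanned by [1, 1, 1]):
N = np.array([[1.0], [1.0], [1.0]])
design, J = greedy_redundant_design(N, var=[1.0, 1.0, 1.0], measured0=[0], r=2)
```

Because the greedy loop commits to one sensor per pass, it examines only O(n) candidates per iteration instead of all combinations, which is the computational saving the text refers to.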

Example 10-4

We apply the above two algorithms for designing redundant observable sensor networks using six measurements for the ammonia process. From Example 10-2, the optimal minimum observable sensor network design corresponding to measured variables 2, 5, and 7 is chosen. We have to select three additional variables to be measured with the objective of reducing the expected estimation error as much as possible. Table 10-3 shows the expected decrease in estimation error achieved by adding one, two, and three additional sensors for different combinations of the variables selected to be measured. The maximum estimate error reduction is achieved by choosing to additionally measure the variables 1, 6, and 8.

If we apply Algorithm 1 above, we would select the variables 1, 6, and 8 to be measured, since these give the maximum estimate error reduction for the addition of a single sensor, as observed from column 1 of Table 10-3. On the other hand, if we apply Algorithm 2, then we would first select variable 1 (or 6) to be measured since this gives the maximum estimate error reduction (column 1). In the next iteration, we select variable 6 (or 1) to be measured since this gives the maximum estimate error reduction (column 2) among all remaining variables, and, finally, variable 8 is chosen to be measured for the same reason. In this example, both algorithms give the optimum sensor network design, although in general this may not be the case.

Table 10-3 Expected Estimation Error of Redundant Sensor Network Designs for Ammonia Process

Change in Expected Estimation Error (Measurements Added)

Minimum Cost Sensor Network Designs

Instead of maximizing estimation accuracy, a minimum cost sensor network may be designed that ensures observability of all variables. This objective function was considered by Madron and Veverka [6] for sensor network design. Although several other issues were considered in their work, we limit our consideration to the design of minimum observable sensor networks at minimum total cost. The design algorithm proposed by Madron and Veverka [6] essentially attempts to obtain a set of dependent variables such that the measured independent variables will have the least total cost.

The columns of the constraint matrix are first arranged in decreasing order of the cost of the sensor for measuring the corresponding variables. A Gaussian elimination procedure is applied, with the pivot element being chosen from the next available column if possible, and reordering of the rows and columns is done if required. This procedure stops once the first m columns form an identity matrix. The least cost minimum observable sensor network design is obtained by measuring the variables corresponding to the remaining n-m columns of the constraint matrix. We will illustrate this procedure by means of the following example.
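The column-ordering idea just described can be sketched as below. This is a simplified sketch of the Madron-Veverka procedure, not their published algorithm: the pivot search is a plain row scan, and the closing network and costs are a hypothetical series process, not the book's Table 10-4 data.

```python
import numpy as np

def min_cost_observable(A, cost):
    """Order the columns of A by decreasing sensor cost, Gauss-eliminate to
    locate m pivot (dependent, unmeasured) columns, and measure the
    remaining n-m cheapest streams."""
    m, n = A.shape
    order = sorted(range(n), key=lambda j: -cost[j])   # costly streams first
    B = A[:, order].astype(float)
    pivots, row = [], 0
    for col in range(n):
        if row == m:
            break
        sel = next((r for r in range(row, m) if abs(B[r, col]) > 1e-10), None)
        if sel is None:
            continue              # no pivot available: this stream gets measured
        B[[row, sel]] = B[[sel, row]]        # bring the pivot row up
        B[row] /= B[row, col]
        for r in range(m):
            if r != row:
                B[r] -= B[r, col] * B[row]   # clear the rest of the column
        pivots.append(order[col])
        row += 1
    return sorted(j for j in range(n) if j not in pivots)

# Hypothetical two-unit series process with sensor costs (not the book's data):
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
measured = min_cost_observable(A, cost=[2.5, 4.0, 1.0])
```

Since the expensive columns are eliminated first, the pivot (unmeasured) set absorbs the costly streams and the cheapest observable set is left to be measured.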

Example 10-5

We consider the ammonia process with the sensor cost data for measuring the different variables given in Table 10-4.

Table 10-4 Flow Sensor Costs for Ammonia Process

Stream    Sensor Cost
1         2.5
2         4.0
3         3.5
4         3.0
5         1.0
6         2.0
7         2.0
8         1.5

The constraint matrix for this process is given by

where the columns are arranged in order of decreasing sensor costs for measuring the corresponding stream flows, and the rows are the flow balances for nodes 1 to 5.

After applying Gaussian elimination to obtain an identity matrix in the first m columns, we get the following modified matrix

The order in which the pivots were selected for Gaussian elimination is (1,1), (2,2), (3,3), (4,4), and (5,6), where the elements within brackets indicate the row and column index of the pivot element. It should be noted that for selecting the pivot elements the columns had to be rearranged, since a nonzero pivot element was not available in the next column. The least cost minimum observable sensor network design is obtained by measuring the variables 6, 8, and 5 corresponding to the last three columns of the modified matrix A.

Madron and Veverka [6] also considered constraints on the sensor location problem, such as specifications of which variables are unmeasurable and which variables are required to be estimated. They also considered the problem of locating additional sensors in a given partially measured process in order to obtain an observable sensor network design at minimum additional cost. In order to solve these problems, the columns of the constraint matrix A have to be ordered appropriately before applying Gaussian elimination. The details of the procedure may be obtained from the publication by Madron and Veverka [6].

Methods Based on Graph Theory

Sensor networks for linear flow processes can be designed elegantly using graph-theoretic techniques. Unlike other methods, powerful insights are obtained concerning the structure of the sensor network, which make it possible to develop efficient algorithms for solving the design problem. We will again consider the design of sensor networks for maximizing estimation accuracy or for minimizing the total cost. The additional graph-theoretic concepts required for understanding the methods discussed in this section can be found in Appendix B.

Maximum Estimation Accuracy Sensor Network Design

Minimum observable sensor networks. In the preceding section we stated that a minimum observable sensor network can be designed by choosing n-m independent variables to be measured such that the constraint submatrix corresponding to dependent variables is nonsingular. In other words, our choice of independent variables should make it possible to express each of the dependent variables as a linear combination of independent variables only.

In Chapter 3, we showed that all unmeasured variables are observable if no cycle containing only unmeasured variables exists in the process graph. We also showed that in order to ensure observability of all unmeasured variables using a minimum number of measurements, the unmeasured variables should form a spanning tree of the process graph. In other words, a minimum observable sensor network can be designed by simply constructing any spanning tree of the process graph and choosing the flows of the chords of the spanning tree as the measured variables. In this case, the chord stream flows are the independent variables and the branch stream flows are the dependent variables. Note that this is similar to the choice of independent and dependent variables made in Simpson's method for solving bilinear data reconciliation problems efficiently, which was discussed in Chapter 4.

The relationship between dependent and independent variables can also be obtained easily using the fundamental cutsets of the spanning tree. As described in Appendix B, a fundamental cutset, with respect to a branch of the spanning tree, contains one or more chords, and the stream flow corresponding to the branch can be written in terms of these chord stream flows as

where Ki is the fundamental cutset with respect to branch i. The elements pij are 0 if chord j is not an element of Ki; otherwise they are +1 or -1 depending on whether chord j has the same or opposite orientation as branch i. If the variance in the measurement error of chord flow j is σj^2, then from Equation 10-21 the expected variance of the estimate error of branch flow i can be obtained as

For a minimum observable sensor network, the estimate of a measured stream is given by the measured value itself. Thus, the expected variance in the estimate of a measured variable is equal to its measurement error variance. The overall expected estimation error (measure of estimation accuracy) is the sum of the expected variances in the estimates of all variables. Using Equation 10-22, we get the overall measure of estimation accuracy as

where kj is the number of fundamental cutsets of the spanning tree in which chord j occurs, so that J is the sum over the chords of (kj + 1)σj^2. Equation 10-23 is exactly equivalent to Equation 10-10, except that it uses spanning tree and fundamental cutset concepts instead of their matrix equivalents.

Example 10-6

The process graph of the ammonia process considered in Example 10-2 is depicted in Figure 10-2. A spanning tree of this process graph is shown in Figure 10-3, which consists of branches 2, 3, 4, 5, and 8 and chords 1, 6, and 7. Corresponding to this spanning tree, in the minimum observable sensor network design the flows of streams 1, 6, and 7 are measured. The fundamental cutsets of this spanning tree are [2, 1, 7], [3, 1, 7], [4, 1, 7], [5, 1, 6, 7], and [8, 1, 6], where the branch in each fundamental cutset is denoted by an underscore. Chord 1 occurs in five fundamental cutsets, chord 6 in two, and chord 7 in four. If we assume all measurement error variances as unity, then from Equation 10-23 we get the overall expected estimation error for this sensor design as 14. This value may be compared with the solutions given in Table 10-2.
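The cutset bookkeeping in this kind of example can be automated. The sketch below is ours, not the book's: it finds each fundamental cutset by deleting a branch and checking which edges cross the resulting split, then applies the (kj + 1)σj^2 sum; the closing triangle graph is a hypothetical stand-in, not the ammonia flowsheet.

```python
def components(nodes, edges):
    """Map each node to a component representative for the undirected graph
    whose edges are given as {name: (u, v)} (simple union-find)."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, v in edges.values():
        parent[find(u)] = find(v)
    return {v: find(v) for v in nodes}

def overall_J_cutsets(nodes, edges, tree, var):
    """Equation 10-23 via fundamental cutsets: J = sum over chords j of
    (k_j + 1) * var_j, where k_j counts the fundamental cutsets (one per
    tree branch) in which chord j appears."""
    chords = [e for e in edges if e not in tree]
    k = dict.fromkeys(chords, 0)
    for b in tree:
        # deleting branch b splits the tree in two; the fundamental cutset
        # of b consists of every edge that crosses the split
        comp = components(nodes, {e: edges[e] for e in tree if e != b})
        for c in chords:
            u, v = edges[c]
            if comp[u] != comp[v]:
                k[c] += 1
    return sum((k[c] + 1) * var[c] for c in chords)

# Hypothetical triangle process graph (not the ammonia flowsheet):
nodes = {1, 2, 3}
edges = {"a": (1, 2), "b": (2, 3), "c": (1, 3)}
J = overall_J_cutsets(nodes, edges, tree={"a", "b"}, var={"c": 1.0})
```

For the triangle, the single chord c lies in both fundamental cutsets (k = 2), so J = 3 with unit variances, in agreement with the matrix form of Equation 10-10.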

A process graph can contain several spanning trees. The number of spanning trees, which is equal to the number of feasible minimum observable sensor network designs, can be as large as n^(n-2), where n is the number of nodes in the process graph [7]. There are several algorithms for constructing a spanning tree of a process graph and for finding the fundamental cutsets of the spanning tree. Some of these algorithms, along with computer programs in the FORTRAN language, have been described in Deo [7]. It should be kept in mind that, for the purposes of constructing a spanning tree, the direction of the streams is ignored; that is, the process graph is treated as an undirected graph. The directions of the streams are used only to obtain the coefficients pij in Equation 10-21, if required.

Figure 10-2. Process graph of simplified ammonia plant.

Figure 10-3. Minimum observable sensor network.

The problem of designing a minimum observable sensor network that maximizes estimation accuracy (or equivalently minimizes J) can be restated as the problem of constructing a spanning tree of the process graph which gives the least value of J as defined by Equation 10-23. Starting with a spanning tree, we can generate a new spanning tree by means of a chord-branch interchange, described in Appendix B as an elementary tree transformation.

In this technique, if we add a chord to the spanning tree, then we should delete a branch from the fundamental cycle formed by the chord, in order to obtain a new spanning tree. This elementary tree transformation implies that the sensor measuring the chord stream flow is removed and instead a sensor is used to measure the branch flow deleted from the initial spanning tree solution.

We can start with an arbitrary spanning tree and use the elementary tree transformation technique to successively obtain new spanning trees which give improved overall estimation accuracy. The only issue to be resolved is to select the chord and branch to be interchanged to improve estimation accuracy. An algorithm for this purpose is outlined below.

Algorithm 3

Step 1. Construct an arbitrary spanning tree of the process graph.

Step 2. Determine all fundamental cutsets of the current spanning tree solution and compute its overall estimation error using Equation 10-23.

Step 3. Find the number of occurrences of each chord in the fundamental cutsets, compute the contribution (ki + 1)σi^2 of each chord i to the overall estimation inaccuracy, and rank the chords in decreasing order of their contribution.

Step 4. Select the next ranked chord, say ci, from the ordered set of chords and find the fundamental cycle formed by chord ci. Stop if there are no more chords to be examined; else rank the branches in the fundamental cycle in increasing order of their measurement error variances.

Step 5. Select the next ranked branch, say bj, from the fundamental cycle and interchange chord ci and branch bj to obtain a new spanning tree. If there are no more branches to be examined, return to Step 4.

Step 6. Obtain the fundamental cutsets with respect to the new spanning tree and compute the overall estimation error using Equation 10-23 corresponding to this new solution. If the overall estimation error of the new solution is less than that of the old spanning tree, replace the old solution with the new spanning tree and return to Step 2; or else restore the old spanning tree solution and return to Step 5.

In the above algorithm, at each stage an attempt is made to obtain a new spanning tree solution having better estimation accuracy, by a chord-branch interchange in the current spanning tree. If this attempt is successful, then the current solution is replaced by the new one and the procedure is repeated. If, after systematically examining all possible chord-branch interchanges, an improved solution is not obtained, then the algorithm stops.


The procedure adopted in the above algorithm is known as a local neighborhood search technique, since only the neighboring spanning tree solutions which differ from the current one in respect of only one branch are examined for obtaining a better solution at every iteration. The algorithm, therefore, gives only a local optimum solution and not the global optimum (the sensor network design with least estimation error).

Example 10-7

The above algorithm is applied to the ammonia process using the initial spanning tree with branch set [2, 3, 4, 5, 8] considered in Example 10-6. From the fundamental cutsets of this spanning tree obtained in Example 10-6, the overall estimation error is computed as 14. Moreover, the chord set ranked in order of their individual contributions is [1, 7, 6]. We select chord 1 and find the branch set which forms a fundamental cycle with this chord as [2, 3, 4, 5, 8]. Since all measurement error variances are equal, we can choose any branch for the interchange.

If we arbitrarily select branch 2 and interchange it with chord 1, we get a new spanning tree with branch set [1, 3, 4, 5, 8]. The overall estimation error of this solution is 12, which is less than the current solution. Therefore, we accept this new solution. If we repeat this procedure with the new solution, we find that none of the chord-branch interchanges results in a better solution. Thus, the sensor network design obtained by this algorithm corresponds to the measurements of streams 2, 6, and 7 with an estimation error of 12. This solution is worse than the global optimum design, which has an estimation error of 11, as observed from Table 10-2.

Algorithms for redundant sensor network designs for maximizing estimation accuracy using graph-theoretic techniques are yet to be developed.

Minimum Cost Sensor Network Design

The design of a minimum observable sensor network which has the least cost among all minimum observable sensor networks can be easily accomplished using graph-theoretic techniques. From the discussion in the preceding section, we note that every spanning tree corresponds to a minimum observable sensor network. If we assign to each stream a weight equal to the cost of the sensor required to measure the flow of that stream, then the problem considered here is to determine the maximum weight spanning tree, where the weight of a spanning tree is the sum of the weights of its branch streams (maximizing the weight of the unmeasured branches minimizes the total cost of the measured chord streams). The problem of determining the maximum (or minimum) weight spanning tree is one of the classical problems of graph theory and has been well studied. Several algorithms are available for determining the maximum weight spanning tree in a straightforward manner [7], and we choose to describe Kruskal's algorithm below [8]:

Step 1. Sort the streams (edges of the process graph) in decreasing order of their weights. Initialize the set of edges in the tree, T, to be a null set.

Step 2. Pick the next edge from the sorted list.

Step 3. Check if the edge picked forms a cycle with the edges of the partial tree constructed so far. If so, discard this edge; or else add the edge to set T. If the number of edges in T is less than n (the number of process units), return to Step 2, or else stop.

The method of checking if an edge forms a cycle with other edges in a set, and the other operations to be carried out when an edge is added to T, are explained in Deo [7]. We illustrate this algorithm in the following example.
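The three steps above can be sketched compactly, using a union-find structure for the cycle check (one of the standard implementations Deo [7] describes). The sketch is ours, and the closing triangle graph with made-up sensor costs is a hypothetical stand-in for the ammonia data.

```python
def kruskal_max_tree(nodes, edges, weight):
    """Kruskal's algorithm on the undirected process graph: scan edges in
    decreasing weight (sensor cost) order, keeping an edge as a tree branch
    unless it closes a cycle. The leftover chords are the streams to measure."""
    parent = {v: v for v in nodes}
    def find(v):                      # union-find root lookup (cycle check)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    tree = []
    for e in sorted(edges, key=lambda e: -weight[e]):
        ru, rv = find(edges[e][0]), find(edges[e][1])
        if ru != rv:                  # no cycle with the branches picked so far
            parent[ru] = rv
            tree.append(e)
    measured = sorted(e for e in edges if e not in tree)
    return tree, measured

# Hypothetical triangle process graph with sensor costs (not the book's data):
nodes = {1, 2, 3}
edges = {"a": (1, 2), "b": (2, 3), "c": (1, 3)}
tree, measured = kruskal_max_tree(nodes, edges,
                                  weight={"a": 3.0, "b": 2.0, "c": 1.0})
```

Because the costly edges are admitted to the tree first, the chords left to be measured are the cheapest streams consistent with observability.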

Example 10-8

We repeat Example 10-5 to find the least cost minimum observable sensor network of the ammonia process, but this time using Kruskal's algorithm. The order of the streams in decreasing order of weights (sensor costs) is [2, 3, 4, 1, 6, 7, 8, 5]. We pick edge 2 and add it to the tree being constructed. Next, we pick edge 3 from the sorted list and add it to the tree (since it does not form a cycle with edge 2). We continue to pick and add edges 4 and 1, because they do not form cycles with the other edges added so far to the tree.

When we next pick edge 6, we find that it cannot be added to the tree, since it forms a cycle with edges 2, 3, 4, and 1 added to the tree so far (as can be visually verified from the ammonia process graph of Figure 10-2). So we discard it, pick the next edge in the sorted list, and add it to the tree, since it does not form a cycle with the other edges. We stop because we have now added 5 edges to the tree, which is equal to the number of process units. The resulting spanning tree is shown in Figure 10-4. Corresponding to this spanning tree, the sensor network design measures chord streams 5, 6, and 8. Comparing with the solution obtained in Example 10-5, we can observe that this is the minimum cost sensor network design among all minimum observable networks. The order of choosing the edges of the spanning tree is identical to the order of picking the pivots from the columns of the constraint matrix in the linear algebraic method used in Example 10-5.

Figure 10-4. Minimum cost observable sensor network

Methods Based on Optimization Techniques

The problem of sensor network design can also be formulated as a mathematical optimization problem and solved using appropriate optimization techniques. In the preceding sections, the design of sensor networks with the objective of either minimizing cost or maximizing estimation accuracy was considered. Optimization techniques offer the possibility of simultaneously considering different objectives and also imposing other constraints.

Furthermore, this approach can be extended to consider more general processes involving flow, temperature, pressure, and concentration measurements. Bagajewicz [9] was the first to formulate the sensor network design optimization problem. We describe here only the formulation of the problem and refer the reader to standard texts on optimization for the details of the optimization techniques used to solve the problem.

In sensor network design, the important decision to be made with regard to each stream flow variable is whether to measure it or not. In order to mathematically formulate these decisions, we can conveniently make use of binary (0-1) integer decision variables qi, one for each stream i, which have the following interpretation:

qi = 1 if xi is measured
qi = 0 if xi is not measured

Let ci be the cost of the sensor for measuring the flow of stream i, and let σi* be the maximum allowable standard deviation of the error in the estimate of variable i. A minimum cost sensor network design which satisfies the constraints on estimation error can be formulated as

Min Σi ci qi   (over the binary variables qi)

subject to

Gi I o, i = I...n

where σi is the standard deviation of the error in the estimate of the flow of stream i and is the square root of the corresponding diagonal element of the covariance matrix of estimation errors, S, which can be computed for any choice of sensor locations (defined by the values chosen for the binary variables qi) using Equation 3-10. The above problem is a mixed integer optimization problem and can be solved using techniques such as branch and bound. Alternatively, commercial optimization packages such as GAMS, GINO, or MINOS, which have a suite of optimization techniques, can be effectively used for solving the above problem.
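For a small network, the formulation above can be illustrated by brute-force enumeration of the binary variables qi rather than branch and bound. The sketch below (using a hypothetical three-stream, two-node flow network) obtains the estimate standard deviations from the KKT system of the equality-constrained weighted least-squares problem, which stands in for the covariance computation of Equation 3-10; a singular KKT system signals an unobservable placement.

```python
import itertools
import numpy as np

def estimate_stddevs(A, measured, sigma):
    """Standard deviations of the reconciled flow estimates for a given
    sensor placement, from the KKT system of the equality-constrained
    weighted least-squares problem (a stand-in for Equation 3-10).
    Returns None if the placement leaves the network unobservable."""
    n = A.shape[1]
    S = np.eye(n)[measured]                    # selects measured streams
    W = np.diag(1.0 / sigma[measured] ** 2)
    K = np.block([[S.T @ W @ S, A.T],
                  [A, np.zeros((A.shape[0], A.shape[0]))]])
    if np.linalg.matrix_rank(K) < K.shape[0]:
        return None                            # unobservable design
    B = np.linalg.inv(K)[:n, :n]               # top-left block of K^-1
    G = B @ S.T @ W                            # x_hat = G y (linear estimator)
    cov = G @ np.diag(sigma[measured] ** 2) @ G.T
    return np.sqrt(np.diag(cov))

def min_cost_design(A, cost, sigma, sigma_max):
    """Enumerate the binary variables q_i and keep the cheapest placement
    that meets the accuracy specification sigma_i <= sigma_i*."""
    best_cost, best_q = np.inf, None
    for q in itertools.product([0, 1], repeat=A.shape[1]):
        meas = np.array(q, dtype=bool)
        if not meas.any():
            continue
        c = cost[meas].sum()
        if c >= best_cost:
            continue
        sd = estimate_stddevs(A, meas, sigma)
        if sd is not None and np.all(sd <= sigma_max + 1e-9):
            best_cost, best_q = c, meas
    return best_cost, best_q

# Hypothetical chain: stream 1 -> node 1 -> stream 2 -> node 2 -> stream 3
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
cost = np.array([10.0, 8.0, 12.0])
sigma = np.array([2.0, 2.0, 2.0])       # sensor error standard deviations
sigma_max = np.array([2.0, 2.0, 2.0])   # accuracy specification
c, q = min_cost_design(A, cost, sigma, sigma_max)
```

For this chain all three flows are equal, so a single sensor on the cheapest stream makes the whole network observable and meets the specification.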

In the above optimization problem formulation, minimization of the sensor network cost was used as the objective (Equation 10-25), subject to a minimum accuracy specification for the estimates. Alternatively, we can choose to maximize the overall estimation accuracy subject to a maximum limit on the cost of the sensor network. Other constraints, such as minimum desired reliability, can also be included in the problem formulation.

DEVELOPMENTS IN SENSOR NETWORK DESIGN

The earliest statement of the sensor network design problem for maximizing estimation accuracy through data reconciliation was given by Vaclavek [10] for linear (flow) processes. Later, Vaclavek and Loucka


[11] proposed algorithms for sensor network design for ensuring observability of important variables in linear as well as multicomponent (bilinear) processes. Almost two decades later, Kretsovalis and Mah [5] proposed a systematic strategy for solving this problem, where a measure of estimation accuracy was formally defined and algorithms proposed for design of redundant sensor networks for maximizing estimation accuracy.

Ali and Narasimhan [12, 13, 141 addressed the problem of sensor net- work design for maximizing reliability and developed graph-theoretic algorithms for this purpose. Matrix methods for sensor network design for maximizing reliability were independently developed by Turbatte et al. [15]. Observable sensor network designs for linear and bilinear processes using matrix methods were also addressed by their group [16]. More recently, the issue of sensor network design for improving diagnos- ability and isolability of faults have beck1 tackled by Maquin et al. [17] and Rao et al. [IS].

Independently, Madron and Veverka [6] tackled the problem of minimum cost observable sensor network design using matrix methods. Bagajewicz [9] formulated sensor network design as an optimization problem. The use of genetic optimization algorithms for sensor network design considering different objectives such as cost, estimation accuracy, and reliability was reported by Sen et al. [19]. Recently, Bagajewicz and Sanchez [20] presented a methodology for designing or upgrading a sensor network in a process plant with the goal of achieving a certain degree of observability and redundancy for a specific set of variables. Although significant progress has been made in the design of sensor networks, a comprehensive strategy simultaneously considering different objectives still has to be developed.


SUMMARY

The location and accuracy of sensors determine the estimation accuracy of data reconciliation and the performance of gross error detection methods. The unmeasured flows in a minimum observable sensor network design for a flow process form a spanning tree structure. A minimum cost minimum observable sensor network design is equivalent to a minimum weight spanning tree. The general sensor network design problem can be formulated and solved as a mixed integer nonlinear optimization problem.

REFERENCES

1. Liptak, B. G. Instrument Engineers' Handbook: Process Measurement and Analysis, 3rd ed. Oxford: Butterworth-Heinemann, 1995.

2. Peters, M. S., and K. D. Timmerhaus. Plant Design and Economics for Chemical Engineers. New York: McGraw-Hill, 1980.

3. Coulson, J. M., J. F. Richardson, and R. K. Sinnott. Chemical Engineering, Vol. 6, Design. Oxford: Pergamon, 1983.

4. Mah, R.S.H. Chemical Process Structures and Information Flows. Boston: Butterworths, 1990.

5. Kretsovalis, A., and R.S.H. Mah. "Effect of Redundancy on Estimation Accuracy in Process Data Reconciliation." Chem. Eng. Sci. 42 (1987): 2115-2121.

6. Madron, F., and V. Veverka. "Optimal Selection of Measuring Points in Complex Plants by Linear Models." AIChE Journal 38 (1992): 227-236.

7. Deo, N. Graph Theory with Applications to Engineering and Computer Science. Englewood Cliffs, N.J.: Prentice-Hall, 1974.

8. Kruskal, J. B., Jr. "On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem." Proc. Am. Math. Soc. 7 (1956): 48-50.


9. Bagajewicz, M. "Design and Retrofit of Sensor Networks for Linear Processes." AIChE Journal 43 (1997): 2300-2306.

10. Vaclavek, V. "Studies on System Engineering III. Optimal Choice of the Balance Measurements in Complicated Chemical Engineering Systems." Chem. Eng. Sci. 24 (1969): 947-955.

11. Vaclavek, V., and M. Loucka. "Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plants." Chem. Eng. Sci. 31 (1976): 1199-1205.

12. Ali, Y., and S. Narasimhan. "Sensor Network Design for Maximizing Reliability of Linear Processes." AIChE Journal 39 (1993): 820-828.

13. Ali, Y., and S. Narasimhan. "Redundant Sensor Network Design for Linear Processes." AIChE Journal 41 (1995): 2237-2306.

14. Ali, Y., and S. Narasimhan. "Sensor Network Design for Maximizing Reliability of Bilinear Processes." AIChE Journal 42 (1996): 2563-2575.

15. Turbatte, H. C., D. Maquin, B. Cordier, and C. T. Huynh. "Analytical Redundancy and Reliability of Measurement System," presented at IFAC/IMACS Symposium Safeprocess '91, Baden-Baden, Germany, 1991, 49-54.

16. Ragot, J., D. Maquin, and G. Bloch. "Sensor Positioning for Processes Described by Bilinear Equations." Diagnostic et surete de fonctionnement 2 (1992): 115-132.

17. Maquin, D., M. Luong, and J. Ragot. "Fault Detection and Isolation and Sensor Network Design." RAIRO-APII-JESA 31 (1997): 393-406.

18. Rao, R., M. Bhushan, and R. Rengaswamy. "Locating Sensors in Complex Chemical Plants Based on Fault Diagnostic Observability Criteria." AIChE Journal 45 (1999): 310-322.

19. Sen, S., S. Narasimhan, and K. Deb. "Sensor Network Design of Linear Processes Using Genetic Algorithms." Computers Chem. Engng. 22 (1998): 385-390.

20. Bagajewicz, M. J., and M. C. Sanchez. "Design and Upgrade of Nonredundant and Redundant Linear Sensor Networks." AIChE Journal 45 (1999): 1927-1938.

Industrial Applications of Data Reconciliation and Gross Error Detection Technologies

Data reconciliation technology is widely applied nowadays in various chemical, petrochemical, and other material processing industries. It is applied offline or in connection with on-line applications, such as process optimization and advanced process control.

This chapter presents a review of major industrial applications of data reconciliation and gross error detection reported in the literature. Based on this published information, we only describe the broad features of the applications without going into the details of the particular solution technique or the software used. However, we describe in greater detail two applications (with which the authors were personally associated) in order to highlight some practical problems and their resolution. Although there are many other industrial implementations and software for data reconciliation applications, they are either proprietary (and no detailed information is publicly disclosed) or the published source of information is not easily accessible.

The analysis in this chapter is organized according to the major industrial types of applications for data reconciliation technology. From the multitude of industrial data reconciliation applications, we can distinguish three major types of applications:

1. Process unit balance reconciliation and gross error detection
2. Parameter estimation and data reconciliation
3. Plant-wide material and utilities reconciliation


* Instrument standard deviations are very important in data reconciliation, and a systematic estimation procedure for the standard deviations should be employed.
* Redundancy is another important factor in the data reconciliation solution and in the capability of detecting gross errors. Additional instrumentation may often be necessary to achieve a satisfactory level of redundancy. An optimal sensor location design software is ideal for data reconciliation applications.
* Gross error detection should always be followed by instrument checking and correction. Uncorrected instrument problems deteriorate the quality of the data reconciliation solution.
* Data reconciliation/validation is a complex problem that might require more than one solution technique. Since important flows and temperatures in the plant may be nonredundant, data filtering and validation can be used to provide a quality solution overall.
* A satisfactory data reconciliation system should have enough flexibility to handle process configuration changes, variable standard deviations, and missing measurements, and to accept various kinds of equations, including inequality process constraints.

More guidelines and challenges still facing data reconciliation technology have been pointed out by Bagajewicz and Mullick [3]:

* For refinery applications with a large variety of stream compositions, proper assay characterization is the key to a successful data reconciliation. With inaccurate compositions, results may not satisfy material balances and good measurements may be identified as gross errors.
* For more accurate data reconciliation, material and energy balance reconciliation is necessary. Heat balances enhance the redundancy in the flow measurements, and an improved accuracy in the reconciled flow rates is obtained.

* The steady-state assumption and the data averaging might not be satisfactory for some processes, and material balances cannot be accurately closed. Dynamic data reconciliation software needs to be developed for such processes and especially for advanced process control applications.
* Rigorous models do not necessarily increase the accuracy of the data reconciliation solution, but they enable merging data reconciliation and the associated parameter estimation problem in one run.
* Increased accuracy in gross error detection is still a current need, since none of the existing methods and strategies provides effective gross error detection for all types of errors and error locations simultaneously.

Typical software packages for process unit material and energy data reconciliation and gross error detection are: DATACON (Simulation Sciences Inc.) [3, 5, 6], DATREC (Elf Central Research) [4], RECON (part of RECONSET of ChemPlant Technology s.r.o., Czech Republic) [21], VALI (Belsim s.a.) [14], and RAGE (Engineers India Limited) [1, 16].

Many NLP-based optimization packages designed for on-line applications have data reconciliation capabilities. They mostly use rigorous models, making the gross error detection more challenging. For that reason, only a few have some sort of gross error detection. ROMeo, a new product of Simulation Sciences Inc. designed for closed-loop on-line optimization, has data reconciliation and gross error detection capability [22].

PARAMETER ESTIMATION AND DATA RECONCILIATION

A problem associated with data reconciliation is the estimation of various model parameters. Data reconciliation and gross error detection algorithms make use of plant models, which might have totally unknown parameters, or parameters that change during plant operation. Most of these parameters, such as heat transfer coefficients, fouling factors, distillation column tray efficiencies, compressor efficiencies, etc., are fixed values for the process optimization; therefore, a high accuracy in their estimated values is required.

One approach to the parameter estimation problem is to solve it simultaneously with the reconciliation problem. The model parameters can be treated as regular unmeasured variables, or as tuning parameters that are adjusted in NLP-type algorithms to match the plant measurements. The major problem with this approach is that, in the presence of gross errors, the parameters may be adjusted to wrong values, or some measurements can wrongly be declared in gross error because of errors in model parameters. To obtain an accurate solution for both measured variables and model parameters, an iterative process is usually required, which is time consuming, especially with rigorous models.

An alternative approach is to separate and sequentially solve the two problems. First, data reconciliation and gross error detection are performed using only overall material and energy balances. The model parameters are then estimated using the reconciled values. This is similar to projecting out the unmeasured model parameters from the reconciliation problem along with their associated model equations. The parameter estimates obtained using the sequential approach are identical to those


obtained using the simultaneous approach if there are no a priori estimates of the parameters available. However, the parameter estimates obtained using the sequential approach may not always satisfy bounds on parameter values. An iterative procedure may be used to eliminate such problems. This approach was applied to parameter estimation problems in connection with advanced process control applications. The computational time is a serious constraint for such applications, and usually only one iteration is applied [2].
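The sequential approach can be sketched as follows: reconcile the measurements first using only the balances, then back out the model parameter from the reconciled values instead of solving for it simultaneously. The splitter example and all numbers below are hypothetical.

```python
import numpy as np

def reconcile(y, sigma, A):
    """Weighted least-squares reconciliation for a fully measured linear
    balance A x = 0:  x_hat = y - Sigma A' (A Sigma A')^-1 A y."""
    Sig = np.diag(np.asarray(sigma, dtype=float) ** 2)
    return y - Sig @ A.T @ np.linalg.solve(A @ Sig @ A.T, A @ y)

# Step 1: reconcile a splitter balance F1 = F2 + F3 using only the material
# balance; no model parameters enter this step.
A = np.array([[1.0, -1.0, -1.0]])
y = np.array([100.0, 61.0, 41.0])          # hypothetical measured flows
x = reconcile(y, [1.0, 1.0, 1.0], A)

# Step 2: estimate the (hypothetical) split-fraction parameter from the
# reconciled flows, rather than adjusting it inside the reconciliation.
split_fraction = x[1] / x[0]
```

The reconciled flows close the balance exactly, and the parameter is then a direct function of consistent estimates.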

PLANT-WIDE MATERIAL AND UTILITIES RECONCILIATION

Plant-wide reconciliation is a very important tool for material and utilities accounting, yield accounting, or monitoring of the energy consumed by the process. Many refineries are already saving a significant amount of money by using a production accounting and reconciliation system. The usual term for a plant-wide reconciliation system is production accounting; therefore, we will adopt this term for the description in this section. Yield accounting is another frequently used term [23, 24].

A production accounting system interacts with various groups in the plant. The operations department provides the input information and collects the reconciled results. The instrumentation group obtains instrument status and performs instrument recalibration and correction if necessary. Process engineers, accounting and financial personnel, and planning and scheduling management retrieve periodic reports for their various needs. Daily, weekly, or monthly reports are standard requirements for a production accounting system.

Various types of data are required for plant reconciliation, as indicated in Table 11-1. For better data quality and timely information, a production accounting system is usually integrated with the plant historian and the entire process information/management system. Some data is retrieved automatically from a historian, but other data is entered manually. Human error is a factor affecting the data accuracy (and the reconciled results). For that reason, some sort of data and model validation is very important.

Veverka and Madron [25] describe an empirical procedure for detecting topology errors, such as a missing stream or a wrong stream orientation. They used the balance residual (or deficit), defined as:

r = inputs - outputs - accumulation    (11-1)

Table 11-1
Data Types for a Production Accounting System

Data Type                          Description
Plant Topology                     Process units, tanks, and their connecting streams
Process Data                       Process data (e.g., compensated flows) from the process and utility units
Tank Inventories                   Inventory from each tank
Movements (from unit to            Movement data: start/stop time, source node,
unit/tank) and Transfers           destination node, quantity transferred
(from tank to tank)
Blends                             Blend data: start/stop time, source node, destination node, tank volumes and blending quantities
Receipts                           Receipt data: start/stop time, source node, destination node, quantity received
Shipments                          Shipment data: start/stop time, source node, destination node, quantities measured
Meters/Sensors Accuracy Factors    Instrument accuracy factors (tolerance, reliability, etc.) for each of the measurement devices
Laboratory                         Lab test results (density, %H2O, compositions, etc.)
Additional Data Entry              Unspecified or adjusted data to be used by the model prior to its balance calculation; includes missing values, specified values, adjustments, etc.

The balance deficit for each node is compared with a critical value, say rcrit. If for a particular node r > rcrit, then the balance around that node is declared inconsistent. The major problem is how to set the rcrit value for each node. A good portion of the node imbalance can be attributed to errors in data, and is therefore admissible. The remaining part of the difference is considered modeling error. The value rcrit can be obtained from the statistical analysis of the balance residual r, which is a random variable (similar to the nodal test for gross error detection described in Chapter 7).
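The residual test can be sketched in a few lines. This is an illustrative version, not the authors' exact procedure: rcrit is derived from the sum of the variances of the terms entering the balance, with a normal-distribution critical factor; the node data and variances are hypothetical.

```python
import math

def node_residual(inputs, outputs, accumulation=0.0):
    """Balance deficit r = inputs - outputs - accumulation (Equation 11-1)."""
    return sum(inputs) - sum(outputs) - accumulation

def check_node(r, variances, z=1.96):
    """Compare the residual with a statistically derived critical value:
    under the no-error hypothesis r is roughly normal with variance equal
    to the sum of the variances of the flows entering the balance."""
    r_crit = z * math.sqrt(sum(variances))
    return abs(r) > r_crit, r_crit

# Hypothetical node: two measured inputs, one measured output, no accumulation.
r = node_residual([100.4, 50.1], [148.9])
inconsistent, r_crit = check_node(r, variances=[1.0, 0.25, 1.0])
```

Here the deficit stays below the critical value, so the node balance is not flagged as inconsistent.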

Serious imbalance problems occur due to frequent (daily) changes in some movements [26]. The plant topology for many refinery processes is rather dynamic. There are many "temporary flows" that have a nonzero value one day and on another day become zero (closed valve), or the flow is redirected to a different tank or unit. Mistakes in the reconciliation topology input can very easily be produced. Kelly [26] proposed a


more complex strategy for finding the wrong material constraints, followed by detection of gross errors in measured data. His strategy is based on a previous algorithm, developed by Crowe [27], for deleting different combinations of measurements in order to assess the reduction in the objective function. Since the deletion process gives rise to a large combinatorial problem, Kelly designed an algorithm to narrow down the number of possible combinations to delete.

A good method for gross error detection is crucial for a production accounting system, which is exposed to many sources of errors. In the presence of significant topology and measurement errors, the reconciled result may become meaningless. The gross error detection task, however, is very challenging for this type of problem because it is very difficult to distinguish between true measurement errors and topology errors. Leaks and losses and the existence of unmeasured flows create an even higher level of complexity. Much research effort in data reconciliation is devoted to resolving these issues and providing more accurate production accounting algorithms.

Some unmeasured flows are observable and can be estimated based on their relationship with measured flows. But the precision of such estimation is often unsatisfactory, due to propagation of errors [25]. An alternative way to get an estimate for unmeasured flows and other variables involved in the plant reconciliation is by using appropriate chemical engineering calculations, which is best accomplished with a process simulator [4, 25]. Process unit data reconciliation performed before plant-wide reconciliation is another possible approach [28].

A more complicated problem for plant reconciliation is the estimation of material and energy losses [26, 29]. There are many sources of material and energy losses in a refinery or chemical plant, such as flares, fugitive emissions (from volatile organic compounds), leaking valves, fittings, pumps, or heat exchangers, and tank losses (by evaporation or liquid leaks). In addition to the real losses mentioned above, there are apparent losses caused by measurement errors, lab density errors, line fills, or timing errors due to unsynchronized readings in tank gauges and meters.

Many loss models and loss estimation formulas are available. U.S. organizations such as the American Petroleum Institute (API) and the U.S. Environmental Protection Agency (EPA) provide publications with procedures for predicting fugitive emissions, tank losses, and various leaks. For plant-wide data reconciliation, it is important to clarify how to use these estimates. There are three major ways of including the loss estimates in the data reconciliation model:

1. Treating the losses as unmeasured flows, and reestimating them based on the measured data. This approach does not require a "good" estimate of the loss flows, but requires observability of all loss flows, which is very unlikely to be obtained in a real plant.

2. Modeling leaks separately, as explained in Chapter 7. A GLR test procedure can be used to detect leaks and estimate the order of magnitude of the leaks (or losses). The method might have practical limitations if too many leaks are included in the model (it becomes a large combinatorial problem; also, there might not be enough redundancy to accurately estimate the magnitudes of all leaks and losses).

3. Treating the losses and leaks as "pseudo-measured" flows. The estimated loss or leak value is used as a "measured" value, and a relatively higher standard deviation than that for the real measured flows is given to the loss flow. The values of the flow rates for the loss flows are reconciled together with the other measured flows.
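The pseudo-measured approach of option 3 can be illustrated with a single balance node. This is a minimal sketch with hypothetical numbers: the loss "measurement" is given a deliberately inflated standard deviation, so the reconciliation assigns it most of the balance adjustment while barely moving the well-measured flows.

```python
import numpy as np

def reconcile(y, sigma, A):
    """Classical weighted least-squares reconciliation for a fully measured
    linear balance A x = 0:  x_hat = y - Sigma A' (A Sigma A')^-1 A y."""
    Sig = np.diag(np.asarray(sigma, dtype=float) ** 2)
    return y - Sig @ A.T @ np.linalg.solve(A @ Sig @ A.T, A @ y)

# feed = product + loss; the loss is a "pseudo-measured" flow.
A = np.array([[1.0, -1.0, -1.0]])
y = np.array([100.0, 97.0, 2.0])   # measured feed, product; estimated loss
sigma = [1.0, 1.0, 5.0]            # large std dev on the loss pseudo-measurement
x = reconcile(y, sigma, A)         # loss absorbs most of the imbalance
```

After reconciliation the balance closes exactly, and the adjustment to the loss flow is much larger than the adjustments to the real measurements, reflecting the weighting.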

Typical software packages for plant-wide reconciliation and yield (production) accounting are [21]: OpenYield (Simulation Sciences Inc.) and SIGMAfine (KBC Advanced Technologies).

CASE STUDIES

Reconciliation of Refinery Crude Preheat Train Data [16]

A crude preheat train is an important subsystem of a refinery, used to minimize the external energy consumption required for heating crude oil. It consists of a network of heat exchangers which are used to preheat crude oil before the crude is sent to a furnace for further heating prior to distillation. The hot streams used for preheating the crude are the distillate streams from the downstream columns. Figure 11-1 shows the crude preheat train of a refinery consisting of 21 exchangers in which the crude is heated by 11 distillate streams.

The flow of the crude before the splitter, as well as the two split flows of the crude, are measured. The inlet flows of all distillate streams are measured, as are the inlet and outlet temperatures of both hot and cold streams to every exchanger. Table 11-2 shows a typical set of measured flows and Table 11-3 shows the measured temperatures. The motivation for reconciling these measurements arises from the need to optimize the crude split flows every few hours. In Chapter 1, we have already described this application, and here we will focus only on the reconciliation problem.


Figure 11-1. Crude preheat train of a refinery.

Table 11-2
Measured and Reconciled Flows of Crude Preheat Train

           Reconciled Flows (tons/hr)
Stream     Before GED    After GED
CR1        399.2352      409.5705
CR2        153.1427      151.8511
CR3        246.0925      257.7194
HN           8.3865        8.3885
KE1         58.4888       58.6845
TP1        267.0283      267.7579
DS1         86.5460       81.5042
MP1        454.8024      455.0960
HV1         54.5657       54.5277
BP1        209.2158      209.4634
VR1        106.7616      109.9967
HV2        106.6069      106.7458
VR2        170.3060      171.2554


Table 11-3
Measured and Reconciled Temperatures of Crude Preheat Train

Stream    Measured Temperature (C)    Reconciled Temperature (C)
                                      Before GED    After GED

(Entries for streams CR1, CR1A-CR1G, CR2, CR2A-CR2H, CR3, CR3A-CR3H, CR4, HN, HNA, KE1, KE1A, KE1B, TP1, TP1A, DS1, DS1A-DS1C, MP1, MP1A, and MP1B; the table continues below with streams HV1 through VR2B.)


Table 11-3 (continued)
Measured and Reconciled Temperatures of Crude Preheat Train

         Measured            Reconciled Temperature (C)
Stream   Temperature (C)     Before GED    After GED
HV1      302.300             301.0654      301.4735
HV1A     299.800             301.0387      300.7566
HV1B     263.800             266.6501      265.2976
HV1C     229.492             227.4435      228.3426
BP1      309.458             304.7259      306.4544
BP1A     253.525             253.0694      253.1749
BP1B     230.667             233.4092      232.4505
VR1      283.885             277.3424      283.3504
VR1A     258.167             242.7938      260.7749
VR1B     152.550             158.8106      198.7786
LV1      212.142             206.1244      211.7749
LV1A     190.267             195.0234      190.5526
HV2      302.300             295.0904      300.2211
HV2A     273.617             268.4812      275.4542
HV2B     233.075             240.6166      251.9620
VR2      345.925             345.4640      347.0395
VR2A     345.925             355.4640      347.0395
VR2B     345.925             345.4640      317.0395

The problem in this case is to reconcile all the flows and temperatures so as to satisfy the material and energy balances of each process unit of this subsystem. In addition, it is required to estimate the overall heat transfer coefficient of each exchanger given the area and the number of tube and shell passes. It is assumed that all the streams are single-phase fluids and that the specific heat capacity Cpi of each stream is given by

Cpi = ai + biTi

where ai and bi are constants and Ti is the temperature of the stream in degrees C. The constants ai and bi for the different streams are given in Table 11-4. The area and the number of tube passes for each exchanger are given in Table 11-5, while all exchangers have a single shell pass.

Table 11-4
Constants for Specific Heat Capacity Correlation

Stream    a        b x 100
CRUDE     0.4442   0.1011
HN        0.4581   0.1036
KE1       0.4455   0.1011
TP1       0.4819   0.1081
DS1       0.4263   0.0975
MP1       0.4455   0.1011
HV1       0.4143   0.0959
BP1       0.4263   0.0975
VR1       0.4092   0.0962
LV1       0.4285   0.0986
HV2       0.4143   0.0959
VR2       0.4062   0.0957

Table 11-5
Heat Exchanger Areas and Number of Tube Passes for a Crude Preheat Train

Exchanger    Area (m2)    Tube Passes
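The enthalpy balance terms built from the linear heat capacity correlation can be sketched as follows. The function integrates Cp(T) = a + bT between the inlet and outlet temperatures; the flow rate and temperatures in the example are hypothetical, while the a and b constants are the CRUDE entries of Table 11-4 (note the table's b column is b x 100).

```python
def enthalpy_change(m, a, b, t_in, t_out):
    """Sensible heat duty for a linear heat capacity Cp(T) = a + b*T:
    Q = m * integral of Cp dT = m * [a*(T2 - T1) + (b/2)*(T2**2 - T1**2)]."""
    return m * (a * (t_out - t_in) + 0.5 * b * (t_out ** 2 - t_in ** 2))

# Crude-side duty using the Table 11-4 CRUDE constants; flow rate and the
# inlet/outlet temperatures are hypothetical illustration values.
q_cold = enthalpy_change(m=400.0, a=0.4442, b=0.1011 / 100.0,
                         t_in=210.0, t_out=218.0)
```

For a linear Cp this integrated form is algebraically identical to evaluating Cp at the mean temperature and multiplying by the temperature rise, which is a convenient check.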


Using a standard deviation of 1% of the measured values for all stream flows and temperatures, the reconciled estimates are obtained assuming that no gross errors are present in the data. Tables 11-2 and 11-3 show the reconciled flows and temperatures, respectively. (All the results of this case study were obtained using the software package RAGE.) The constraints that are used for each exchanger are the flow balances for the hot and cold streams, as well as the enthalpy balance.

For the mixer, flow and enthalpy balances are imposed, while for the splitter, a flow balance and equality of temperatures across the splitter are imposed. No other feasibility constraints or bounds on the variables are imposed. In order to remove gross errors from the data, the GLR test along with a serial compensation strategy is applied for multiple gross error detection after linearization of the constraints around the reconciled estimates. The final reconciled estimates, after all gross errors are identified and compensated, are also shown in the last column of Tables 11-2 and 11-3.

We focus on some interesting problems/features of the measured and reconciled data. If we consider the measured temperatures of streams incident on exchanger E38A (the streams CR2C, CR2D, DS1A, and DS1B), we note that the crude stream is getting cooled from 217.608 to 216.992 degrees C, while the intended "hot" distillate stream is getting heated from 224.333 to 254.908 degrees C. Although it may be possible for the roles of hot and cold streams to be reversed depending on the prevailing flows and temperatures, what is unacceptable here is that heat is being transferred from the lower temperature crude to the higher temperature distillate stream, which is thermodynamically infeasible. It can be verified that reconciliation before or after gross error detection (GED) does not correct this problem, and the estimates for exchanger E38A still violate thermodynamic feasibility. If we use these estimates to obtain an estimate of the overall heat transfer coefficient for this exchanger, then we obtain a negative value for it, which is absurd.

In order to obtain thermodynamically feasible estimates, several possibilities were examined. One general approach is to include feasibility constraints at the hot and cold ends of each exchanger of the form

Thot,i ≥ Tcold,o and Thot,o ≥ Tcold,i

where the subscript i denotes the inlet and the subscript o the outlet end of the exchanger. This would, however, increase the number of constraints significantly. Moreover, this presupposes knowledge of the cold and the hot streams for each exchanger and does not allow any role reversal. A simpler technique is to include the relation between the overall heat transfer coefficient (U) and the heat load for every exchanger and impose bounds on the overall heat transfer coefficient. If we impose a nonnegativity restriction on U, then we can ensure that thermodynamic feasibility is maintained regardless of which of the streams plays the role of the hot stream and which plays the role of the cold stream.

Using this approach, the reconciliation problem was solved again. (Note that, as explained in Chapter 5, in order to solve this problem a constrained nonlinear optimization program has to be used, and the unmeasured heat transfer coefficient parameters cannot be eliminated using a projection matrix.) The reconciled temperature estimates of the four streams incident on exchanger E38A, before GED and after GED, are shown in the second column of Tables 11-6 and 11-7, respectively. For comparison, we also reconcile the problem by deleting each of the four suspect temperature measurements in turn, and also after deleting all four temperature measurements which violate thermodynamic feasibility. The reconciled temperature estimates for these four streams before and after GED are also shown in Tables 11-6 and 11-7.

Table 11-6
Reconciled Temperatures Before GED Around Exchanger E38A for Different Cases

         Reconciled Temperatures (C)
         Bounds     T of DS1A    T of DS1B    T of CR2C    T of CR2D    All Four T's
Stream   on U       unmeasured   unmeasured   unmeasured   unmeasured   unmeasured
DS1A     235.6707   227.4071     222.3988     222.7002     223.2842     195.7178
DS1B     235.6655   248.6476     232.2426     250.8886     250.1673     231.7550
CR2C     217.2356   223.1970     219.9628     229.2839     221.1179     227.4480
CR2D     217.2356   212.1257     214.8731     214.7073     207.0257     210.7080


Table 11-7
Reconciled Temperatures After GED Around Exchanger E38A for Different Cases

         Reconciled Temperatures (C)
         Bounds     T of DS1A    T of DS1B    T of CR2C    T of CR2D    All Four T's
Stream   on U       unmeasured   unmeasured   unmeasured   unmeasured   unmeasured
DS1A     225.7176   221.1880     224.4315     224.4547     223.9519     215.7303
DS1B     225.7175   255.4171     217.9601     255.2667     255.5525     223.5384
CR2C     216.8586   235.1001     217.0624     236.3756     233.8225     231.2648
CR2D     216.8586   217.0614     220.4571     220.0339     217.1436     227.2575

From the reconciled estimates, it can be observed that by imposing nonnegativity bounds on U, it is possible to obtain feasible estimates before and after GED. In fact, the results show that the heat transfer coefficient for exchanger E38A is at its lower bound of zero, which implies that this exchanger is being bypassed completely by one of the streams, resulting in the temperatures of both hot and cold streams being unchanged across this exchanger.

The only other case in which feasible temperature estimates were obtained was after deleting the temperature of stream DS1B and applying GED (refer to column 4 of Table 11-7). Even in this case, the stream temperatures change only marginally across exchanger E38B, indicating that this exchanger is largely being bypassed. (This was also later confirmed after inspecting the manual valve positions on the crude bypass line for this exchanger.) The results clearly demonstrate that the imposition of bound constraints on the parameters can be used as a general method to obtain feasible estimates.
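As a hedged illustration of how bound constraints yield feasible estimates, the sketch below reconciles four temperatures around a single hypothetical exchanger while estimating its heat-transfer parameter UA, enforcing UA >= 0 with an NLP solver. The measurements, the unit heat-capacity flows, and the use of an arithmetic-mean temperature difference are all illustrative assumptions, not data from the E38A case.

```python
import numpy as np
from scipy.optimize import minimize

# Measured temperatures (hot in/out, cold in/out) and standard deviations.
# The noisy data imply a slightly negative duty; without the bound this
# would force UA < 0, while with UA >= 0 the solver drives UA to zero.
y = np.array([200.0, 200.5, 100.0, 99.6])   # Thi, Tho, Tci, Tco (assumed)
sigma = np.ones(4)
WCP_HOT = WCP_COLD = 1.0                    # assumed heat-capacity flows (kW/C)

def objective(z):
    # Weighted least-squares adjustment of the four temperatures.
    return np.sum(((z[:4] - y) / sigma) ** 2)

def energy_balance(z):
    thi, tho, tci, tco, ua = z
    return WCP_HOT * (thi - tho) - WCP_COLD * (tco - tci)

def rate_equation(z):
    thi, tho, tci, tco, ua = z
    amtd = 0.5 * ((thi - tco) + (tho - tci))  # arithmetic-mean driving force
    return WCP_HOT * (thi - tho) - ua * amtd

res = minimize(objective, x0=np.append(y, 0.01), method="SLSQP",
               bounds=[(None, None)] * 4 + [(0.0, None)],
               constraints=[{"type": "eq", "fun": energy_balance},
                            {"type": "eq", "fun": rate_equation}])
thi, tho, tci, tco, ua = res.x
```

For these measurements the estimated UA lands on its lower bound of zero and each stream's inlet and outlet temperatures are reconciled to a common value, mirroring the "bypassed exchanger" diagnosis in the case study.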

This case study also brought out other issues that needed to be addressed in practice:

• It was assumed that all the streams are in a single phase. A more rigorous method would require the phase of the stream to be determined and an appropriate correlation to be used for determining the stream enthalpy.
• Some fraction of the crude flows was being bypassed in a few other exchangers also, but sufficient measurements (redundancy) were not available for treating the bypass fractions as unknown parameters and estimating them as part of the reconciliation problem.
• Heat losses from exchangers were not accounted for in the enthalpy balances of exchangers. In order to use the reconciled data for better optimization, it may be necessary to include a heat loss term in the enthalpy balances. However, enough redundancy does not exist for treating the loss terms as unknowns. One possibility is to assume that a specified fraction of the heat load of each exchanger is lost, based on past experience or on recommended loss estimation methods.
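The specified-fraction loss idea in the last point can be written as a one-line modification of the exchanger enthalpy balance. The numbers below (loss fraction, heat-capacity flows, temperatures) are invented for illustration:

```python
# Enthalpy balance for one exchanger with an assumed fixed heat-loss
# fraction, used when redundancy is insufficient to estimate the loss.
LOSS_FRACTION = 0.02                 # assumed: 2% of the hot-side duty is lost
WCP_HOT = 5.0                        # hot-side heat-capacity flow, kW/C (assumed)
WCP_COLD = 10.0                      # cold-side heat-capacity flow, kW/C (assumed)
t_hot_in, t_hot_out = 260.0, 60.0    # hot-stream temperatures, C (assumed)

q_hot = WCP_HOT * (t_hot_in - t_hot_out)   # duty released by the hot side
q_loss = LOSS_FRACTION * q_hot             # specified-fraction loss term
q_cold = q_hot - q_loss                    # duty absorbed by the cold side
dt_cold = q_cold / WCP_COLD                # implied cold-side temperature rise
```

With the loss term included, the cold side absorbs 980 kW instead of the full 1000 kW, and the balance closes exactly by construction.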

As pointed out in Chapter 1, the reconciliation of the crude preheat train data was performed every four hours using averaged measured data for the preceding two hours. Since the heat transfer coefficients of exchangers cannot be expected to change dramatically from one time period to the next, it is possible to use their estimates derived in one time period as "measurements" for the next time period, with a larger standard deviation. Due to this extra redundancy, better estimates can be obtained. Moreover, the heat transfer coefficient estimates then change smoothly from one time period to the next and do not fluctuate wildly. A trend of the heat transfer coefficients can be used to decide when cleaning/maintenance procedures have to be initiated for the exchangers.
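One simple way to sketch this carry-over idea: treat the previous period's U estimate as an extra pseudo-measurement with an inflated standard deviation, and combine it with the current period's raw estimate by inverse-variance weighting. The fouling trend, noise level, and inflation below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_periods = 50
true_u = 250.0 * 0.995 ** np.arange(n_periods)  # slow fouling decline (assumed)
SIGMA_DATA = 20.0    # std. dev. of the per-period U estimate (assumed)
SIGMA_PRIOR = 30.0   # larger std. dev. assigned to the carried-over estimate

raw = true_u + rng.normal(0.0, SIGMA_DATA, n_periods)  # period-by-period estimates

smoothed = np.empty(n_periods)
smoothed[0] = raw[0]
w_data, w_prior = 1.0 / SIGMA_DATA**2, 1.0 / SIGMA_PRIOR**2
for t in range(1, n_periods):
    # Previous estimate enters as a pseudo-measurement with inflated sigma.
    smoothed[t] = (w_data * raw[t] + w_prior * smoothed[t - 1]) / (w_data + w_prior)

# Period-to-period fluctuation is damped by the extra redundancy.
raw_jitter = np.std(np.diff(raw))
smooth_jitter = np.std(np.diff(smoothed))
```

The combined estimates fluctuate less from period to period than the raw ones while still tracking the fouling trend, which is what makes them usable for scheduling cleaning.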

Reconciliation of Ammonia Plant Data [30]

Ammonia is a chemical product with many industrial applications, such as refrigerants and fertilizers. Figure 11-2 shows a simplified process flowsheet diagram for the synthesis section of an ammonia process [30]. Ammonia is produced by an exothermic reaction of nitrogen and hydrogen:

N2 + 3H2 -> 2NH3

The feed stream S1 to the synthesis section already contains ammonia from upstream processes. To separate it, stream S1 is cooled and sent to flash drum F1, where the ammonia-rich liquid S3 is separated from the remaining vapor S2. Before entering the reactor section, the vapor stream is preheated by a product stream. The reactor section consists of two reaction stages and two internal heat exchangers. Stream S4 is split into three streams (S5, S6, and S7), and the split fractions are used to control the reactor feed temperatures.

Stream S7 is used to quench the hot product stream from the first-stage reactor, and stream S5 is used to recover some of the heat from the product of the second-stage reactor (S13). The three streams are then recombined and fed to the first-stage reactor. Most of the cooled reactor product (S15) is recycled to another section of the plant (stream S16), while the remainder (S17) is further cooled with refrigerant (stream S22) to condense most of the ammonia (stream S20). The two condensed streams (S3 and S20) are combined and further purified downstream.

Figure 11-2. An ammonia synthesis industrial process.

The ammonia synthesis plant contains instrumentation for measuring flow rates, temperatures, and various stream compositions (mole fractions). The measured values, their associated standard deviations, and the reconciled values are reported in Tables 11-8 and 11-9. Tables 11-8, 11-9, and 11-10 show the reconciliation results for a case where no gross errors were found (Case A). Table 11-8 reports all stream calculation results, for both measured and unmeasured data. Other calculated values, such as reaction extents, heat exchanger duties, UA values, and flash data, are also reported at the bottom of Table 11-8.


Table 11-8
Stream and Unit Reconciliation Solution for Ammonia Example
Case A. No Gross Errors Present in Measurements

[Multi-page table. For each stream S1 through S23, rows are given for RATE, TEMP, PRES, and the mole fractions X1-X5, with columns STAT, UOM, TAG NAME, STANDARD DEVIATION, MT-STAT, MEASURED VALUE, and CALC VALUE. Sections at the bottom of the table report heat exchanger duties (M*KCAL/HR) and UA values (KCAL/HR-C); splitter split fractions (SP1: S16 = 0.98191, S17 = 0.01809; SP2: S5 = 0.55063, S6 = 0.14869, S7 = 0.30068); reactor extents of reaction (KG-MOL/HR) and duties; and flash unit temperatures, pressures, and duties. A representative stream entry:]

STRM  VBL   STAT  UOM    TAG NAME  STANDARD DEVIATION  MT-STAT  MEASURED VALUE  CALC VALUE
S4    RATE  M     M3/HR  F04       3075.000            .30      1.03E+05        1.02E+05
      TEMP  M     C      T04       2.00                .31      135.00          134.86
      PRES  F     ATM                                                           150.000
      X1    M     MOL%   H2-4      1.0000              1.31     58.0000         58.9875
      X2    M     MOL%   N2-4      1.0000              .36      21.0000         21.3170
      X3    M     MOL%   C1-4      1.0000              .09      13.0000         12.9218
      X4    M     MOL%   AR-4      1.0000              .28      4.0000          4.2499
      X5    M     MOL%   NH3-4     1.0000              .76      2.0000          2.5238

NOTATION
STRM: stream ID
VBL: variable name
STAT: variable status in the model (M = measured, U = unmeasured, F = fixed)
UOM: unit of measure
MT-STAT: measurement test statistic
NR: non-redundant measurement


Table 11-10
Summary of Calculation Results for the Ammonia Example
Case A. No Gross Errors Present in Measurements

NUMBER OF ITERATIONS = 5
MEASURED VARIABLES = 40 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 59 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 20

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 10.96   CRITICAL VALUE = 31.40
*** MEASUREMENTS PASSED THE GLOBAL TEST ***

MEASUREMENT TEST (.950 CONFIDENCE LEVEL)
CRITICAL VALUE = 3.16
*** ALL MEASUREMENTS PASSED THE MEASUREMENT TEST ***

PRINCIPAL COMPONENT MEASUREMENT (PCM) TEST
(.950 JOINT, .997 INDIVIDUAL CONFIDENCE LEVELS)
CRITICAL VALUE = 3.02
*** ALL PRINCIPAL COMPONENTS PASSED THE PCM TEST ***

Table 11-9 contains the reconciliation results for all measured variables. Table 11-10 indicates general run data: the number of iterations until convergence, the number of equations, the number of variables in each category (measured, unmeasured, fixed), the number of nonredundant and unobservable variables, the degree of redundancy, and a summary of results from the statistical tests. All results were generated with DATACON, a product of Simulation Sciences Inc. (an Invensys Company). The thermodynamic properties were calculated with the Soave-Redlich-Kwong (SRK) method for all components.

No gross errors were simulated in Case A (only random errors are present in the measured data). All three statistical tests used, namely the global test (GT), the measurement test (MT), and the principal component measurement test (PCMT), properly indicated that there are no gross errors in the measurements.

In Case B, three gross errors were simulated in the following measurements:

F07 (magnitude = 10000 m3/hr, ratio delta/sigma = 10);
C1-4 (magnitude = 5 mol%, ratio delta/sigma = 5);
T22 (magnitude = 3.6 deg C, ratio delta/sigma = 3).

These measurements have different detectability factors, as indicated in Table 11-9. To increase the chance of detection and correct identification, a higher ratio of the gross error magnitude delta to the corresponding standard deviation sigma was used for the measurements with a lower detectability factor: the lower the detectability, the higher the ratio delta/sigma.

Table 11-11 shows the reconciliation results for Case B for all measured variables. No error elimination was used in this run. Table 11-12 indicates various run data and the summary results from the GT and MT. The GT indicates the existence of gross errors, while the MT declares 7 measurements in gross error. In addition to the three true gross errors, four other gross errors were found by the MT. Serial elimination was used for better error identification.

Table 11-13 shows the summarized results from a first run with serial elimination calculations. In this run, the MT was used in addition to the GT. We notice that in the first elimination step there were two measured variables sharing the same MT statistic (the largest MT statistic value). This particular algorithm uses the detectability factor as a tie breaker in the elimination process. Since F07 has a higher detectability factor (0.5135) than F06 (0.2297), F07 was chosen to be eliminated first. This turned out to be the right choice: subsequently, F06 was not found in gross error anymore. Next, C1-4 and T22 were also eliminated, and no more gross errors were found by either the GT or the MT. The estimated values for the three eliminated measurements are very close to the values reported in Table 11-9 for the no gross error case.

Table 11-14 shows a similar run, this time using the PCMT for gross error identification. Initially, both the GT and the PCMT indicate the existence of gross errors. The elimination path and final results, however, are somewhat different from the run with the MT. In the first elimination pass, F05 was found to be the major contributor to the largest inflated principal component. F05, F06, and F07 are model-related to each other (they are all outlet streams of the splitter SP2), which makes it easy for the gross error in F07 to smear onto them.

In this case, the calculation of the contribution shares to the first (largest) principal component that failed the PCMT indicates that F05 should be eliminated first. Subsequently, an associated temperature, T13, also had to be eliminated and properly adjusted in order to satisfy the heat balance for exchanger E3. The last two eliminated measurements, T22 and C1-4, are true gross errors. The MT and the PCMT usually detect gross errors correctly in measured variables with relatively higher detectability factors. For gross errors in measured variables with relatively lower detectability factors, the outcome of the two types of tests could be the same (no gross error detection, or wrong gross error identification), or one test can perform better than the other. It is not clear which test performs consistently better. More analysis and comparison of the two types of tests for the ammonia example can be found in Jordache and Tilton [31].

Itzduir~-~uI Appliculion.~ of l>a/u Rc,conci!iutron awd Gross Error L)efrcri(~..: T~clzlzulogies 357

Page 188: 0884152553

Darn Reconciliutio~~ orld GI-oss Et-ror L ) C I E ( . I ~ U I I

303 '3 0 C 2 5 5 m - m N l I

Table 11-12
Summary of Calculation Results for the Ammonia Example
Case B. Gross Errors Present in Three Measurements
Measurement Test Used for GED

NUMBER OF ITERATIONS = 5
MEASURED VARIABLES = 40 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 59 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 20

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 80.86   CRITICAL VALUE = 31.40
*** DID NOT PASS THE GLOBAL TEST ***

MEASUREMENT TEST (.950 CONFIDENCE LEVEL)
CRITICAL VALUE = 3.16
*** 7 MEASUREMENTS FAILED THE MEASUREMENT TEST ***

STRM/UNIT  VBL   UOM    TAG-NAME  MEASUREMENT VALUE  CALCULATED VALUE  MT-STAT
S7         RATE  M3/HR  F07       20000.0000         23125.7732        6.1226
S5         RATE  M3/HR  F05       55000.0000         59426.0003        3.7092
S8         TEMP  C      T08       430.0000           421.9775          3.4451
S13        TEMP  C      T13       450.0000           461.2256          3.3358


Table 11-13
Summary of Calculation Results for the Ammonia Example
Case B. Gross Errors Present in Three Measurements
Serial Elimination of Gross Errors Applied
Measurement Test Used for GED

INITIAL DATA RECONCILIATION

*** 7 MEASUREMENTS FAILED THE MEASUREMENT TEST (SEE TABLE 11-12) ***

TAGNAME  DETECTABILITY FACTOR

*** MEASUREMENT F07 WILL BE DELETED IN THE NEXT PASS ***

PASS 1 OF SERIAL ERROR ELIMINATION

MEASURED VARIABLES = 39 (2 NON-REDUNDANT)
UNMEASURED VARIABLES = 60 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 19

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S7         RATE      M3/HR  F07      1000.0000           19999.9992         22202.9716

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 43.42   CRITICAL VALUE = 30.10
*** DID NOT PASS THE GLOBAL TEST ***

MEASUREMENT TEST (.950 CONFIDENCE LEVEL)
CRITICAL VALUE = 3.14
*** 2 MEASUREMENTS FAILED THE MEASUREMENT TEST ***

STRM/UNIT  VBL   UOM   TAG-NAME  MEASUREMENT VALUE  CALCULATED VALUE  MT-STAT
S4         X3    MOL%  C1-4      18.0000            14.0417           4.5065
S22        TEMP  C     T22       -33.6000           -29.1232          3.7306

*** MEASUREMENT C1-4 WILL BE DELETED IN THE NEXT PASS ***

PASS 2 OF SERIAL ERROR ELIMINATION

MEASURED VARIABLES = 38 (2 NON-REDUNDANT)
UNMEASURED VARIABLES = 61 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 18

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S7         RATE      M3/HR  F07      1000.0000           19999.9992         22202.9716
2     S4         X3        MOL%   C1-4     1.0000              18.0000            12.8745

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 23.10   CRITICAL VALUE = 28.90
*** MEASUREMENTS PASSED THE GLOBAL TEST ***

MEASUREMENT TEST (.950 CONFIDENCE LEVEL)
CRITICAL VALUE = 3.13
*** 1 MEASUREMENT FAILED THE MEASUREMENT TEST ***

STRM/UNIT  VBL   UOM  TAG-NAME  MEASUREMENT VALUE  CALCULATED VALUE  MT-STAT
S22        TEMP  C    T22       -33.6000           -29.1232          3.7306

*** MEASUREMENT T22 WILL BE DELETED IN THE NEXT PASS ***

PASS 3 OF SERIAL ERROR ELIMINATION

MEASURED VARIABLES = 37 (2 NON-REDUNDANT)
UNMEASURED VARIABLES = 62 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 17

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S7         RATE      M3/HR  F07      1000.0000           19999.9992         22203.3128
2     S4         X3        MOL%   C1-4     1.0000              18.0000            12.8741
3     S22        TEMP      C      T22      1.2000              -33.6000           -29.1232

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 9.16   CRITICAL VALUE = 27.60
*** MEASUREMENTS PASSED THE GLOBAL TEST ***

MEASUREMENT TEST (.950 CONFIDENCE LEVEL)
CRITICAL VALUE = 3.12
*** ALL MEASUREMENTS PASSED THE MEASUREMENT TEST ***

Table 11-14
Summary of Calculation Results for the Ammonia Example
Case B. Gross Errors Present in Three Measurements
Serial Elimination of Gross Errors Applied
Principal Component Measurement Test Used for GED

INITIAL DATA RECONCILIATION

PRINCIPAL COMPONENT MEASUREMENT TEST
(.950 JOINT, .997 INDIVIDUAL CONFIDENCE LEVELS)
CRITICAL VALUE = 3.02
*** 3 PRINCIPAL COMPONENTS FAILED THE PCM TEST ***

MAJOR CONTRIBUTING MEASUREMENTS TO THE FAILED PRINCIPAL COMPONENTS AND SHARES

PC# 10 (score 4.513): S5 RATE 28%, S8 TEMP -20%, S16 RATE 17%, S13 RATE 14%, S19 RATE 12%, S4 RATE 12%, S14 RATE 11%, S12 TEMP -8%, S7 RATE 7%, S11 TEMP 7%, S10 TEMP 7%, S11 RATE 6%, S14 TEMP 5%, S13 TEMP 4%
PC# 4 (score 3.731): S22 TEMP 100%
PC# 13 (score 3.049): S4 X3 74%, S14 X4 10%, S14 X5 8%, S14 X1 7%, S19 X2 5%, S19 X1 4%

*** MEASUREMENT F05 WILL BE DELETED IN THE NEXT PASS ***

PASS 1 OF SERIAL ERROR ELIMINATION

MEASURED VARIABLES = 39 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 60 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 19

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S5         RATE      M3/HR  F05      1650.0000           54999.9996         63681.9602

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 66.85   CRITICAL VALUE = 30.10
*** DID NOT PASS THE GLOBAL TEST ***

PRINCIPAL COMPONENT MEASUREMENT TEST
(.950 JOINT, .997 INDIVIDUAL CONFIDENCE LEVELS)
CRITICAL VALUE = 3.00
*** 3 PRINCIPAL COMPONENTS FAILED THE PCM TEST ***

MAJOR CONTRIBUTING MEASUREMENTS TO THE LARGEST PRINCIPAL COMPONENT AND SHARES

PC# 2 (score 4.348): S13 TEMP 38%, S8 TEMP 30%, S14 TEMP 16%, S11 TEMP 6%, S10 TEMP 6%, S12 TEMP 5%
PC# 4 (score 3.731): S22 TEMP 100%

*** MEASUREMENT T13 WILL BE DELETED IN THE NEXT PASS ***

PASS 2 OF AUTOMATIC ERROR ELIMINATION

MEASURED VARIABLES = 38 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 61 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 18

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S5         RATE      M3/HR  F05      1650.0000           54999.9996         65494.1018
2     S13        TEMP      C      T13      5.0000              450.0000           481.3397

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 49.74   CRITICAL VALUE = 28.90
*** DID NOT PASS THE GLOBAL TEST ***

PRINCIPAL COMPONENT MEASUREMENT TEST
(.950 JOINT, .997 INDIVIDUAL CONFIDENCE LEVELS)
CRITICAL VALUE = 2.98
*** 2 PRINCIPAL COMPONENTS FAILED THE PCM TEST ***

MAJOR CONTRIBUTING MEASUREMENTS TO THE LARGEST PRINCIPAL COMPONENT AND SHARES

PC (score 3.731): S22 TEMP 100%

*** MEASUREMENT T22 WILL BE DELETED IN THE NEXT PASS ***

PASS 3 OF SERIAL ERROR ELIMINATION

MEASURED VARIABLES = 37 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 62 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 17

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S5         RATE      M3/HR  F05      1650.0000           54999.9996         65494.1276
2     S13        TEMP      C      T13      5.0000              450.0000           481.3396
3     S22        TEMP      C      T22      1.2000              -33.6000           -29.1232

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 35.82   CRITICAL VALUE = 27.60
*** DID NOT PASS THE GLOBAL TEST ***

PRINCIPAL COMPONENT MEASUREMENT TEST
(.950 JOINT, .997 INDIVIDUAL CONFIDENCE LEVELS)
CRITICAL VALUE = 2.97
*** 1 PRINCIPAL COMPONENT FAILED THE PCM TEST ***

MAJOR CONTRIBUTING MEASUREMENTS TO THE LARGEST PRINCIPAL COMPONENT AND SHARES

PC# 9 (score 3.048): S4 X3 58%, S19 X3 14%, S19 X1 11%, S4 X1 8%, S4 X5 5%, S14 X5 4%, S14 X4 -3%

*** MEASUREMENT C1-4 WILL BE DELETED IN THE NEXT PASS ***

PASS 4 OF AUTOMATIC ERROR ELIMINATION

MEASURED VARIABLES = 36 (1 NON-REDUNDANT)
UNMEASURED VARIABLES = 63 (0 UNOBSERVABLE)
FIXED VARIABLES = 41 (29 FIXED BY USER)
NUMBER OF EQUATIONS = 82
DEGREE OF REDUNDANCY = 16

SUMMARY OF ELIMINATED MEASUREMENTS

PASS  STRM/UNIT  VARIABLE  UOM    TAGNAME  STANDARD DEVIATION  MEASUREMENT VALUE  CALCULATED VALUE
1     S5         RATE      M3/HR  F05      1650.0000           54999.9996         65795.7594
2     S13        TEMP      C      T13      5.0000              450.0000           481.4463
3     S22        TEMP      C      T22      1.2000              -33.6000           -29.1232
4     S4         X3        MOL%   C1-4     1.0000              18.0000            12.8642

GLOBAL TEST (.950 CONFIDENCE LEVEL)
GT STATISTIC = 15.42   CRITICAL VALUE = 26.30
*** MEASUREMENTS PASSED THE GLOBAL TEST ***

*** ALL PRINCIPAL COMPONENTS PASSED THE PCM TEST ***
*** ALL VARIABLES ARE WITHIN BOUNDS ***


SUMMARY

• Steady-state process unit data reconciliation and gross error detection technology is widely used in chemical, petrochemical, and other related industrial processes.
• On-line data reconciliation is important for enhancing the accuracy of process optimization and advanced process control.
• Steady-state detection is necessary in order to increase the accuracy of the reconciled values and to provide meaningful gross error detection. If the process is not operated at steady state for a long enough period of time, dynamic data reconciliation should be applied.
• Proper component and thermodynamic characterization and accurate compositions are very important for successful data reconciliation and gross error detection.
• A rigorous model enables merging data reconciliation and parameter estimation into one problem, which can be solved simultaneously.
• Plant-wide material and utilities reconciliation is an important tool for production (or yield) accounting. The most challenging problem in production accounting is the estimation of various leaks and losses. With enough redundancy, data reconciliation can provide reasonable estimates for the magnitudes of materials that are not accounted for.
• Imposing bounds on variables is often necessary to ensure a feasible solution of the data reconciliation problem. To solve a bounded problem, NLP-based software is needed.
• Existing gross error detection methods do not accurately detect gross errors all the time. Some methods are better than others, but their overall performance depends upon the model accuracy and the level of data redundancy.



REFERENCES

1. Ravikumar, V. R., S. Narasimhan, S. R. Singh, and M. O. Garg. "RAGE - The State of the Art Package for Plant Data Reconciliation and Gross Error Detection," presented at the International Symposium on Automation and Control Systems, New Delhi, India, 1992.

2. Chi, Y. S., T. A. Clinkscales, K. A. Fenech, A. V. Gokhale, C. Jordache, and V. L. Rice. "On-line, Closed Loop Control and Optimization of an FCCU Using a Self Adapting Dynamic Model," presented at the AIChE Spring National Meeting, Houston, Tex., 1993.

3. Bagajewicz, M., and S. L. Mullick. "Reconciliation of Plant Data. Applications and Future Trends," presented at the AIChE Spring National Meeting, Houston, Tex., 1995.

4. Charpentier, V., L. J. Chang, G. M. Schwenzer, and M. C. Bardin. "An On-Line Data Reconciliation System for Crude and Vacuum Units," presented at the NPRA Computer Conference, Houston, Tex., 1991.

5. Leung, G., and K. H. Pang. "A Data Reconciliation Strategy: From On-Line Implementation to Off-Line Applications," presented at the AIChE Spring National Meeting, Orlando, Fla., 1993.

6. Scott, M. D., J. M. Tiessen, and S. L. Mullick. "Reactor Integrated Rigorous On-line Model (ROM) for a Multi-Unit Hydrotreater-Catalytic Reformer Complex Optimization," presented at the NPRA Computer Conference, Anaheim, Calif., 1994.

7. Chiari, M., G. Bussari, M. G. Grottoli, and S. Pierucci. "On-line Data Reconciliation and Optimization: Refinery Applications." Computers Chem. Engng. 21 (Suppl., 1997): S1185-S1190.

8. Nair, P., and C. Jordache. "Rigorous Data Reconciliation is Key to Optimal Operations." Control (Oct. 1991): 118-123.

9. Tamura, K. I., T. Sumioshi, G. D. Fisher, and C. E. Fontenot. "Optimization of Ethylene Plant Operations Using Rigorous Models," presented at the AIChE Spring National Meeting, Houston, Tex., 1991.

10. Sanchez, M. A., A. Bandoni, and J. Romagnoli. "PLAPAT - A Package for Process Variable Classification and Plant Data Reconciliation." Computers Chem. Engng. (Suppl., 1992): S499-S506.

11. Natori, Y., M. Ogawa, and V. S. Verneuil. "Application of Data Reconciliation and Simulation to a Large Chemical Plant." Proceedings of Large Chemical Plants 8th International Symposium, Antwerp, Belgium, 1992, pp. 101-113.

12. Christiansen, L. J., N. Bruniche-Olsen, J. M. Carstensen, and M. Schroeder. "Performance Evaluation of Catalytic Processes." Computers Chem. Engng. 21 (Suppl., 1997): S1179-S1184.

13. Holly, W., R. Cook, and C. M. Crowe. "Reconciliation of Mass Flow Rate Measurements in a Chemical Extraction Plant." The Canadian Jl. of Chem. Engng. 67 (1989): 595-601.

14. Dempf, D., and T. List. "On-line Data Reconciliation in Chemical Plant." Computers Chem. Engng. 22 (Suppl., 1998): S1023-S1025.

15. Placido, J., and L. V. Loureiro. "Industrial Application of Data Reconciliation." Computers Chem. Engng. 22 (Suppl., 1998): S1035-S1038.

16. Ravikumar, V., S. Narasimhan, M. O. Garg, and S. R. Singh. "RAGE - A Software Tool for Data Reconciliation and Gross Error Detection," in Foundations of Computer-Aided Process Operations (edited by D.W.T. Rippin, J. C. Hale, and J. F. Davis). Amsterdam: CACHE/Elsevier, 1994, pp. 429-436.

17. Stephenson, G. R., and C. F. Shewchuk. "Reconciliation of Process Data with Process Simulation." AIChE Journal 32 (1986): 247-254.

18. Meyer, M., B. Koehret, and M. Enjalbert. "Data Reconciliation on Multicomponent Network Process." Computers Chem. Engng. 17 (no. 8, 1993): 807-817.

19. Narasimhan, S., R.S.H. Mah, and A. C. Tamhane. "A Composite Statistical Test for Detecting Changes in Steady State." AIChE Journal 32 (1986): 1409-1418.

20. Narasimhan, S., C. S. Kao, and R.S.H. Mah. "Detecting Changes in Steady State Using the Mathematical Theory of Evidence." AIChE Journal 33 (1987): 1930-1932.

21. CEP Software Directory, a Supplement to Chem. Engng. Progress, published by the American Institute of Chemical Engineers, 1998.

22. Tong, H., and D. Bluck. "An Industrial Application of Principal Component Test to Fault Detection and Identification," presented at the IFAC Conference, 1998.

23. Reagan, E., B. Tilton, and S. Sammons. "Yield Accounting and Data Integration," presented at the NPRA Computer Conference, Atlanta, Ga., 1996.

24. Grosdidier, P. "Understand Operation Information Systems." Hydrocarbon Processing (Sept. 1998): 67-78.

25. Veverka, V. V., and F. Madron. Material and Energy Balancing in Process Industries: From Microscopic Balances to Large Plants. Amsterdam: Elsevier, 1997.


By suitably adding vectors which have the same number of elements, other vectors can be formed. A linear combination of two or more vectors is a vector which is formed by multiplying each vector by a real number (scalar) and adding them. For example, consider a linear combination of the above vectors a and b represented by the vector d = α₁a + α₂b. If we choose the scalars α₁ = 1.5 and α₂ = 2.0, then the vector d is given by d = 1.5a + 2.0b.

Given a set of n vectors, S = {a₁, a₂, …, aₙ}, we can generate all possible linear combinations of vectors in this set, α₁a₁ + α₂a₂ + … + αₙaₙ, by choosing all possible values for the scalars αᵢ. We refer to the collection of vectors thus generated as a vector space spanned by the vectors in set S and denote it as V(S). It should be noted that the zero vector is a member of this space.

A set of vectors S is said to be linearly independent if a linear combination of the vectors in this set equals 0 only for the case when all the scalars αᵢ are equal to 0, and not for any other choice of the scalar values. If a set of vectors is linearly dependent, then there is a vector in this set (one whose scalar multiplying factor αᵢ is nonzero) which can be expressed as a linear combination of the other vectors in this set. We can delete this vector and again check if the remaining vectors are linearly independent. If not, we can repeat this procedure until we are left with a set of vectors that are linearly independent.

The set of vectors which remain forms a minimal set of linearly independent vectors which span the vector space V(S). This minimal set of vectors is said to form a basis set for V(S). For example, if we consider the set S consisting of the vectors a, b, and d defined above, then this forms a linearly dependent set, because any vector in this set can be expressed as a linear combination of the other two vectors in the set. We can choose to delete vector d from this set, in which case we are left with the two vectors a and b, which can be verified to be linearly independent. Thus, the vectors a and b form a basis set for the vector space spanned by the three vectors a, b, and d. Another basis set for the same vector space is a and d.
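A quick numerical check of dependence and span, using hypothetical 3-component vectors a and b (the book's actual a and b are defined on an earlier page not reproduced here): the dependent set {a, b, d} still spans only a 2-dimensional space.

```python
import numpy as np

# Hypothetical 3-component vectors (assumptions for illustration)
a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 1.0, 1.0])
d = 1.5 * a + 2.0 * b          # linear combination with alpha1 = 1.5, alpha2 = 2.0

S = np.column_stack([a, b, d])
# {a, b, d} is linearly dependent, so the dimension of its span is 2, not 3
print(np.linalg.matrix_rank(S))   # -> 2
```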

There can be many different choices of a basis set for a vector space, but the number of vectors in every basis set is the same, and is denoted as the dimension of the vector space. It must be borne in mind that the number of elements in any vector in a basis set (the components of a vector) need not be equal to the dimension of the vector space spanned by the basis. This is illustrated clearly by the vector space spanned by the basis set a and b, where each of these vectors has 3 components but the dimension of the vector space spanned by them is only 2. Note that we often speak of a vector having n elements as an n-dimensional vector. This only implies that the vector having n elements is a member of the n-dimensional space of vectors. We use the notation ℜⁿ for the n-dimensional real vector space.

MATRICES AND THEIR PROPERTIES

A real matrix of order m × n is an ordered set of elements consisting of m rows of n elements each. Each row of a matrix can be regarded as an n-dimensional row vector, and each column of the matrix can be regarded as an m-dimensional column vector. Thus, an m × n matrix can either be considered as an ordered set of m row vectors each of dimension n, or as a set of n column vectors each of dimension m.

Two special matrices are the zero matrix, denoted by 0, whose elements are all 0, and the identity matrix of order n × n, denoted by the symbol I, whose column i is the unit vector eᵢ. It should be noted that we do not explicitly denote the dimensions of the matrix in the notation because they are usually clear from the context.

There are four important vector spaces associated with every matrix, as defined below:

(1) Row Space: the space spanned by the rows.
(2) Column Space: the space spanned by the columns. This space is also known as the range space of a matrix.
(3) Null Space: the space spanned by all vectors x which satisfy Ax = 0, where x is a vector belonging to ℜⁿ.
(4) Left Null Space: the null space of the transpose of the matrix.

There are some important properties that link these vector spaces. The rank of a matrix is equal to the dimension of its row space, which is also equal to the dimension of its column space. This immediately implies that the rank of a matrix r ≤ min(m, n). A matrix of order n × n is known as a square matrix of order n. If the rank of such a matrix is n, then the matrix is known as a nonsingular matrix and its inverse exists.


Appendix A: Basic Concepts in Linear Algebra

The following equality can also be proved:

dim N(A) = n − rank(A)   (A-1)

where N(A) is the null space of matrix A. From the above equation, it follows that if the rank of a matrix is equal to n (which implies that the columns of the matrix are linearly independent), then the dimension of the null space of the matrix is 0. The only vector which satisfies Ax = 0 in this case is the 0 vector. In general, we are interested in obtaining a vector x which is the solution of the linear set of equations Ax = b. In other words, we wish to express the vector b as a linear combination of the columns of A. This is possible only if b is a member of the column space of A. Furthermore, the solution is unique if the null space of A has dimension 0. This property is used in obtaining the solution of the unmeasured variables in data reconciliation discussed in Chapter 3.
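The rank and nullity relationship, and the resulting uniqueness of the solution of Ax = b, can be illustrated with a small hypothetical matrix (the numbers below are assumptions chosen for the example):

```python
import numpy as np

# Hypothetical 3x2 matrix with linearly independent columns (rank 2)
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])
r = int(np.linalg.matrix_rank(A))
dim_null = A.shape[1] - r              # dim N(A) = n - rank(A)

# Construct b inside the column space, so Ax = b is solvable; since
# dim N(A) = 0, the solution is unique and is recovered exactly.
x_true = np.array([3.0, -1.0])
b = A @ x_true
x, *_ = np.linalg.lstsq(A, b, rcond=None)

print(r, dim_null)        # -> 2 0
```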

In general, we can express the solution vector x as

x = xᵣ + xₙ   (A-2)

where xᵣ belongs to the range space of Aᵀ (the row space of A) and xₙ belongs to the null space of A. This is known as the range and null space decomposition (RND), which is used in the RND-SQP nonlinear constrained optimization algorithm discussed in Chapter 5.
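A minimal sketch of such a decomposition, computed from a null-space basis obtained by SVD (the matrix and vector are hypothetical; this is not the RND-SQP algorithm itself):

```python
import numpy as np

# Hypothetical 2x3 matrix of rank 2 (an assumption for illustration)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# Null-space basis from the SVD: right singular vectors for zero singular values
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
Z = Vt[rank:].T                # columns of Z span N(A)

x = np.array([3.0, 2.0, 1.0])
x_n = Z @ (Z.T @ x)            # projection of x onto the null space
x_r = x - x_n                  # remaining component lies in the row space of A

# The null-space part carries no constraint information: A x = A x_r
assert np.allclose(A @ x, A @ x_r)
assert np.allclose(A @ x_n, 0.0)
```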

The eigenvalues of a square matrix A of order n are the n roots of its characteristic equation:

det(A − λI) = 0

The set of these roots is denoted by λ(A) = {λ₁, λ₂, …, λₙ}. If we define the trace of matrix A as the sum of its diagonal elements,

tr(A) = a₁₁ + a₂₂ + … + aₙₙ

then

tr(A) = λ₁ + λ₂ + … + λₙ

The nonzero vectors xᵢ of size n that satisfy the equation

Axᵢ = λᵢxᵢ

are referred to as eigenvectors. The eigenvalues and eigenvectors of certain matrices are used to build the principal component tests in Chapter 7.
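The trace and eigenvalue relationship is easy to verify numerically for a small hypothetical matrix:

```python
import numpy as np

# Any square matrix works; this 2x2 symmetric example is hypothetical
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam = np.linalg.eigvals(A)     # roots of det(A - lambda*I) = 0

# tr(A) = a11 + a22 = 5, and the eigenvalues (5 +/- sqrt(5))/2 also sum to 5
print(np.isclose(np.trace(A), lam.sum().real))   # -> True
```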

REFERENCES

1. Noble, B., and J. Daniel. Applied Linear Algebra. Englewood Cliffs, N.J.: Prentice-Hall, 1977.

2. Strang, G. Linear Algebra and Its Applications, 3rd ed. Orlando, Fla.: Harcourt Brace Jovanovich, 1988.

3. Golub, G. H., and C. F. Van Loan. Matrix Computations. Baltimore: Johns Hopkins University Press, 1996.



Appendix B

Graph Theory

Graph theory deals with problems related to the topological properties of figures. It is also useful for analyzing problems concerning discrete objects and their interrelationships. In this appendix, we define some of the important concepts of graph theory used in the book and illustrate them using examples. Some facts are simply stated without proofs, and we direct the interested reader to the book by Deo [1] for these proofs and additional concepts and theorems.

GRAPHS, PROCESS GRAPHS, AND SUBGRAPHS

A graph consists of a set of nodes, V, and a set of edges, E. Each edge is associated with a pair of nodes, which it joins. An example of a graph is shown in Figure B-1, which has six nodes drawn as circles and eight edges shown as lines. Each edge is said to be incident on the nodes with which it is associated. The degree of a node is the number of edges incident on it. A process graph is a graph which is simply obtained from a process flowsheet by adding an additional node, called the environment node, to which all process feeds and products are connected.

For example, the graph in Figure B-1 is the process graph of a simplified ammonia process whose flowsheet is shown in Figure 10-1. The nodes of a process graph correspond to process units and the edges of the process graph correspond to streams that interconnect the units. Thus, if the process contains n units and e streams, then the corresponding process graph contains n+1 nodes and e edges. For ease of reference, we number or label the edges and nodes of the graph using the same numbers or labels as used in the process flowsheet, except for the environment node, which is labelled as E. If the directions of the edges are ignored, as in Figure B-1, then an undirected graph is obtained; otherwise, the graph is directed. In this text, we are only concerned with undirected graphs.

Figure B-1. A graph.

A subgraph of a graph consists of a subset of nodes and edges of the graph. Each edge of the subgraph joins the same two nodes as it does in the graph. In other words, if an edge is part of a subgraph, then the end nodes with which it is associated in the graph should also be part of the subgraph. The graph in Figure B-2 is a subgraph of the graph shown in Figure B-1.

Figure B-2. A subgraph of the graph in Figure B-1.



PATHS, CYCLES, A N D CONNECTIVITY

A path between two nodes (denoted as the initial and terminal nodes of the path) is a finite alternating sequence of edges and nodes such that each edge in the sequence is incident on the two nodes preceding and succeeding it, and no node appears more than once in this sequence. A path is called a cycle if the initial and terminal nodes are the same. For example, in Figure B-1, the alternating sequence of nodes and edges E-1-M-2-H-3-R is a path between initial node E and terminal node R, while the sequence E-6-S-5-SP-8-E is a cycle. A graph is connected if there exists a path between every pair of nodes of the graph. The graph in Figure B-1 is a connected graph, as is generally the case for all process graphs.

Figure B-3. Subgraph formed by deleting node E from the graph in Figure B-1.

SPANNING TREES, BRANCHES, A N D CHORDS

A connected subgraph of the graph which does not contain any cycles and which includes all nodes of the graph is called a spanning tree of the graph. An edge of the graph that is part of the spanning tree is called a branch, while edges of the graph not part of the spanning tree are called chords. Figure B-2 is a spanning tree of the graph of Figure B-1. Corresponding to this spanning tree, edges 2, 4, 5, 7, and 8 are branches while the remaining edges 1, 3, and 6 are chords.

It should be noted that branches and chords are defined with respect to the specified spanning tree of a graph. If a different spanning tree of the graph is chosen, then, accordingly, different edges of the graph are classified as branches or chords. It can be proved that a spanning tree contains n branches and e−n chords, where n is the number of units in the process flowsheet (or one less than the number of nodes of the process graph), and e is the number of streams or edges of the graph.
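The branch and chord counts can be verified computationally. The sketch below grows a spanning tree by breadth-first search; the edge incidences are inferred from the paths and cycles quoted in this appendix for Figure B-1, so treat the incidence list as an assumption rather than the book's figure:

```python
from collections import deque

# Edge incidences inferred from the text's paths and cycles for Figure B-1
# (edge number: pair of end nodes); the graph is treated as undirected
edges = {1: ("E", "M"), 2: ("M", "H"), 3: ("H", "R"), 4: ("R", "S"),
         5: ("S", "SP"), 6: ("E", "S"), 7: ("M", "SP"), 8: ("SP", "E")}

def spanning_tree(edges, root="E"):
    """Grow a spanning tree by BFS; return the set of branch edges."""
    adj = {}
    for e, (u, v) in edges.items():
        adj.setdefault(u, []).append((e, v))
        adj.setdefault(v, []).append((e, u))
    seen, branches, q = {root}, set(), deque([root])
    while q:
        u = q.popleft()
        for e, v in adj[u]:
            if v not in seen:
                seen.add(v)
                branches.add(e)
                q.append(v)
    return branches

branches = spanning_tree(edges)
chords = set(edges) - branches
# 6 nodes -> n = 5 branches; e = 8 edges -> 8 - 5 = 3 chords
print(len(branches), len(chords))   # -> 5 3
```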

GRAPH OPERATIONS

A graph can be modified by operations such as deletion of edges or nodes and by merging of nodes. The deletion of an edge from a graph results in a subgraph which contains all nodes and all edges except the deleted edge. For example, the spanning tree shown in Figure B-2 can be obtained from the graph in Figure B-1 by deleting edges 1, 3, and 6. The deletion of a node from a graph results in a subgraph which contains all the nodes of the graph except the deleted node, and all edges of the graph except those incident on the deleted node.

The subgraph shown in Figure B-3 can be obtained from the graph of Figure B-1 by deleting the node E. The merging of two nodes of a graph results in a modified graph obtained by replacing the two merged nodes by a new node and deleting the edges incident on both these nodes. Edges which are incident on only one of the two merged nodes in the original graph are now incident on the new node of the modified graph. The graph in Figure B-4 is obtained from the graph in Figure B-1 by merging nodes E and M. The new merged node in Figure B-4 is denoted as EM.

Figure B-4. Graph formed by merging nodes E and M of the graph in Figure B-1.

CUTSETS, FUNDAMENTAL CUTSETS, AND FUNDAMENTAL CYCLES

A cutset of a graph is a set of edges of the graph whose deletion disconnects the graph, but the deletion of a proper subset of the edges of a cutset does not disconnect the graph. The set of edges [2, 5, 6] is a cutset of the graph in Figure B-1, since the deletion of this set of edges disconnects the graph into two node sets, one containing M, E, and SP, and the other



containing H, R, and S. On the other hand, the set of edges [1, 2, 5, 6] is not a cutset, although the removal of this set of edges disconnects the graph, since its proper subset [2, 5, 6] is a cutset.

There is a correspondence between cutsets and flow balances that can be written for a process. A flow balance can be written around every unit of a process, which will involve the flows of streams that enter or exit this unit. It can be verified that the edges corresponding to these streams form a cutset of the process graph. Thus, corresponding to every cutset consisting of all edges incident on a node, a flow balance can be written. Flow balances can also be written corresponding to other cutsets, which are essentially linear combinations of flow balances around individual process units.

Thus, corresponding to the cutset [2, 5, 6] of the graph in Figure B-1, the flow balance equation involving the flows of streams 2, 5, and 6 is a linear combination of the flow balances around process units H, R, and S of the process. It should be noted that the direction of the streams should be taken into account when writing the flow balances corresponding to cutsets of the process graph.

A cutset of the graph which contains only one branch of a spanning tree of the graph and zero or more chords is called a fundamental cutset corresponding to the spanning tree. For example, the edge set [1, 3, 7] is a fundamental cutset of the graph in Figure B-1, corresponding to the spanning tree shown in Figure B-2. However, although the set of edges [2, 5, 6] is also a cutset of the graph in Figure B-1, it is not a fundamental cutset with respect to the spanning tree of Figure B-2 because it contains two branches, 2 and 5, of the spanning tree. With respect to every branch of a spanning tree of a graph, a fundamental cutset can be identified. The fundamental cutsets corresponding to the spanning tree (Figure B-2) of the graph in Figure B-1 are [2, 3], [4, 3], [5, 3, 6], [7, 1, 3], and [8, 1, 6], where the first edge listed in each fundamental cutset is its branch.

A concept which is complementary to a fundamental cutset is that of a fundamental cycle with respect to a spanning tree of a graph. A fundamental cycle with respect to a spanning tree of a graph is a cycle of the graph formed by exactly one chord and one or more branches. The cycle E-1-M-7-SP-8-E, which consists of edges [1, 7, 8], is a fundamental cycle of the graph of Figure B-1 with respect to the spanning tree of Figure B-2; it consists of chord 1 and branches 7 and 8. For each chord of a spanning tree of a graph, a fundamental cycle can be identified. The fundamental cycles with respect to the spanning tree (Figure B-2) of the graph in Figure B-1 are [1, 7, 8], [6, 5, 8], and [3, 4, 5, 7, 2], where the first edge listed in each fundamental cycle is its chord.
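Under the same incidence list inferred earlier from the text for Figure B-1 (an assumption, since the figure is not reproduced), the fundamental cycle of a chord is the chord plus the unique tree path joining its end nodes; the sketch below reproduces the cycles quoted for chords 1 and 3:

```python
from collections import deque

# Edge incidences inferred from the text for Figure B-1 (an assumption)
edges = {1: ("E", "M"), 2: ("M", "H"), 3: ("H", "R"), 4: ("R", "S"),
         5: ("S", "SP"), 6: ("E", "S"), 7: ("M", "SP"), 8: ("SP", "E")}
branches = {2, 4, 5, 7, 8}      # spanning tree of Figure B-2; chords are 1, 3, 6

def fundamental_cycle(chord):
    """Return the chord plus the unique tree path between its end nodes."""
    u, v = edges[chord]
    adj = {}
    for e in branches:
        a, b = edges[e]
        adj.setdefault(a, []).append((e, b))
        adj.setdefault(b, []).append((e, a))
    prev = {u: None}                      # BFS over branch edges only
    q = deque([u])
    while q:
        n = q.popleft()
        for e, m in adj[n]:
            if m not in prev:
                prev[m] = (n, e)
                q.append(m)
    cycle, node = {chord}, v              # walk back from v to u
    while prev[node] is not None:
        node, e = prev[node]
        cycle.add(e)
    return sorted(cycle)

print(fundamental_cycle(1))   # -> [1, 7, 8]
print(fundamental_cycle(3))   # -> [2, 3, 4, 5, 7]
```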

Fundamental cutsets are complementary to fundamental cycles in the sense that if a chord cⱼ occurs in the fundamental cutset of a branch bᵢ, then branch bᵢ occurs in the fundamental cycle of chord cⱼ. This may be verified from the fundamental cycles and fundamental cutsets with respect to the spanning tree of Figure B-2 listed in the preceding paragraphs. This property can be used to identify the fundamental cycles with respect to a spanning tree, given the fundamental cutsets with respect to the same spanning tree.

Fundamental cycles (or fundamental cutsets) can be used to generate new spanning trees of a graph starting from a given spanning tree. The technique known as an elementary tree transformation (ETT) involves the interchange of a chord with a branch. In this technique, we add a chord to the spanning tree and delete a branch belonging to the fundamental cycle formed by the added chord with respect to the original spanning tree. For example, the spanning tree shown in Figure B-5 is a new spanning tree of the graph in Figure B-1 obtained from the spanning tree in Figure B-2 by adding chord 1 and deleting branch 7, which belongs to the fundamental cycle formed by chord 1. The new spanning tree differs from the initial spanning tree in respect of one chord and one branch, and is also referred to as a neighbor of the initial spanning tree.

Figure B-5. Spanning tree formed by ETT of the spanning tree in Figure B-2.

REFERENCE

1. Deo, N. Graph Theory with Applications to Engineering and Computer Science. Englewood Cliffs, N.J.: Prentice-Hall, 1974.


Appendix C

Fundamentals of Probability and Statistics

RANDOM VARIABLES AND PROBABILITY DENSITY FUNCTIONS

Probability is a mathematical theory dealing with the laws of random events. For example, the result of a physical or chemical experiment is a random event. The measured or inferred value obtained at the end of the experiment is a random variable which lies within a specified interval with a certain probability.

It is easier to understand the behavior of random variables if we analyze a discrete event. The rolling of a pair of dice provides a good example of a random variable. It is impossible to predict the outcome of an individual roll; however, it is more likely that the summation number for the pair of dice is a 7 rather than a 12. This is because, of the 36 possible rolls, there is only one way to roll a 12, namely (6,6), while there are six ways to roll a 7, namely (6,1), (5,2), (4,3), (3,4), (2,5), (1,6).

Let us assume that we roll the dice thousands of times and record how many times each roll occurred. Then the probability function for each roll R, i.e., P(R), or the so-called probability density function (PDF, or p.d.f.), is:

P(R) = (Number of times roll R occurred) / (Total number of rolls)

Figure C-1 shows the graph obtained by plotting P(R) for all possible rolls R. The probability of rolling a given value R in a single throw of the dice is the area under its rectangle. For example, you have a 4 in 36 chance of rolling a 9. The probability of rolling 3 or 12 in a single throw of the dice is the total area under their rectangles, 2/36 + 1/36 = 3/36.

The PDF graph in Figure C-1 is discontinuous, because rolling a pair of dice produces only discrete values, i.e., the resulting value must be an integer between 2 and 12 inclusively. Integrals of the PDF are quite useful because they determine the probability of occurrence of a group of events. For example, we can obtain P(5 ≤ R ≤ 9) as the area of the plot which lies between R = 5 and R = 9, inclusively, i.e., 2(4/36) + 2(5/36) + 6/36 = 24/36, or 0.6667. Therefore, 66.67% of the rolls will have values 5 ≤ R ≤ 9, or, in other words, the probability of getting a roll R such that 5 ≤ R ≤ 9 is 66.67%. Another useful quantity is the probability that the roll is greater (or smaller) than a particular value. For example, P(R > 9) = P(R = 10) + P(R = 11) + P(R = 12) = 3/36 + 2/36 + 1/36 = 6/36.
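The dice probabilities above can be reproduced by enumerating all 36 equally likely outcomes (a sketch using exact fractions):

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes for a pair of dice
rolls = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]

def prob(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

print(prob(lambda r: r == 7))          # -> 1/6  (six ways out of 36)
print(prob(lambda r: 5 <= r <= 9))     # -> 2/3  (24/36)
print(prob(lambda r: r > 9))           # -> 1/6  (6/36)
```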

Most errors in plant measurements are random variables. But unlike the rolls of a pair of dice in the previous example, they are continuous variables, not results of discrete events. This means that there is an infinity of possible "discrete" values for the events associated with continuous variables. For that reason, for continuous random variables, the integral

Figure C-1. Probability density function for rolling a pair of dice (x-axis: value rolled with a pair of dice, R = 2 to 12; y-axis: probability of the value rolled; e.g., the probability of rolling 9 is 4/36).




probabilities are of special practical interest. For example, we can make statements such as "there is a 95% chance that the true flow rate for stream S10 lies between 5,000 and 6,000 BPD." Or, "there is a 2.5% probability that the true flow rate in stream S10 exceeds 6,100 BPD." In order to calculate such probabilities, a continuous probability density function is required. For the reasons specified in Chapter 2, the most widely used density function for continuous random variables in the physical and chemical sciences is the normal distribution function.

The normal distribution is also known as the Gaussian distribution, and its PDF is described by the formula:

F(X) = (1 / (σ√(2π))) exp(−(X − μ)² / (2σ²))

where μ is the mean value of the random variable X, and σ is its standard deviation. Since in practice we expect the errors to be zero on the average, μ = 0 for the measurement error density function. Figure C-2 shows a Gaussian PDF with a mean of zero and a standard deviation of 1. This function is a continuous analog of the dice-rolling density function. The normal distribution with zero mean and standard deviation of 1 is also known as the standard normal distribution.

Figure C-2. Normal distribution density function.

The most important properties of the normal distribution PDF are as follows:

1. The maximum value of F(X) occurs at the mean, μ.
2. The standard deviation σ determines the width (or spread) of the curve. For a very accurate instrument (small σ), the density function will look like a sharp peak centered at zero. On the contrary, for an inaccurate instrument (large σ), the PDF will look rather flat.

3. The 1/(σ√(2π)) factor normalizes the density so that the total area under the curve is 1, i.e., ∫ F(X) dX = 1 over (−∞, ∞).

4. It is symmetric about the mean.

5. The probability of a measurement error lying between X₁ and X₂ is:

P(X₁ ≤ X ≤ X₂) = ∫ F(X) dX over (X₁, X₂)

This probability is equivalent to the area under the curve between X₁ and X₂.

6. Similarly, the probability that a measurement error (in absolute value) is greater than a particular value X* is

P(|X| > X*) = 1 − ∫ F(X) dX over (−X*, X*)

This probability is equivalent to the area under the curve outside the interval (−X*, X*).

As illustrated in Figure C-2, 95% of the random errors should lie within 1.96 standard deviations. Analytically, this means:

P(−1.96σ ≤ X ≤ 1.96σ) = 0.95

This is often called the 95% confidence interval. The 99% confidence interval occurs within 2.58 standard deviations of the mean. Note that these figures hold when there is only one measured variable. For multiple measured variables, the threshold is recalculated based on the rules given in Chapter 7 (see Sidak's rule).
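These interval probabilities follow from the error function: for a zero-mean normal variable, P(|X| ≤ kσ) = erf(k/√2). A quick check of the 1.96 and 2.58 thresholds:

```python
from math import erf, sqrt

# For a zero-mean normal variable, P(|X| <= k*sigma) = erf(k / sqrt(2))
def within(k):
    return erf(k / sqrt(2.0))

print(round(within(1.96), 3))   # -> 0.95
print(round(within(2.58), 3))   # -> 0.99
```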

Another distribution of interest for the statistical applications in this book is the chi-square (χ²) distribution. If X₁, X₂, …, X_ν are independent random variables described by a standard normal distribution, then a chi-square random variable is defined by:

χ²(ν) = X₁² + X₂² + … + X_ν²

The integer ν is usually known as the number of degrees of freedom. The probability density function of χ² for different degrees of freedom ν is illustrated in Figure C-3.

The probability density function for the chi-square distribution is described analytically by the following formula:

f(χ²) = (χ²)^(ν/2 − 1) exp(−χ²/2) / (2^(ν/2) Γ(ν/2))

where Γ is the gamma function. The most important properties of the chi-square distribution are:

1. The mean value of χ²(ν) is ν.

Figure C-3. Chi-square distribution density function.

2. As ν approaches infinity, the chi-square density approaches the normal distribution. The χ²(8) curve in Figure C-3 is starting to illustrate this behavior.

Figure C-4. Confidence intervals for the chi-square distribution (for ν = 4: P(χ² ≤ 9.49) = 0.95, i.e., 95% of the total area, and P(χ² > 9.49) = 0.05).

Confidence intervals can also be constructed for the χ² distribution. For example, Figure C-4 shows a 95% confidence region for a χ² distribution function with 4 degrees of freedom. In this particular case, 95% of the random χ² variables should lie between 0 and 9.49.
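The 9.49 threshold can be checked from the closed-form chi-square CDF, which exists for even degrees of freedom; for ν = 4 the CDF is 1 − e^(−x/2)(1 + x/2):

```python
from math import exp

# Closed-form chi-square CDF for nu = 4 degrees of freedom:
#   P(chi2 <= x) = 1 - exp(-x/2) * (1 + x/2)
def chi2_cdf_4(x):
    return 1.0 - exp(-x / 2.0) * (1.0 + x / 2.0)

print(round(chi2_cdf_4(9.49), 2))   # -> 0.95
```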

STATISTICAL PROPERTIES OF RANDOM VARIABLES

The statistical properties of random variables were indirectly mentioned as part of the analytical description of the probability density functions above. Now we are going to define them in a more general framework. There are two basic properties of random variables (also known in statistics as moments): the mean and the variance of the random variable.

The mean value of a random variable X, μ_X, is defined as the expected value of X. For a continuous variable, it can be expressed analytically as:

μ_X = E[X] = ∫ X F(X) dX over (−∞, ∞)


The expected value defined above can also be regarded as the first moment about zero. In general, the first moment about a constant value δ is given by E[X − δ]. If δ is equal to the expected value of X, the central first moment is obtained, and the corresponding distribution is called the central distribution. Otherwise, a noncentral distribution is obtained.

The mean (expected value) of a random variable Z whose distribution is a joint distribution of other random variables X₁, X₂, …, Xₙ (i.e., a multivariate distribution) is defined as

μ_Z = E[Z] = ∫…∫ f(X₁, X₂, …, Xₙ) Φ(X₁, X₂, …, Xₙ) dX₁ … dXₙ

where Z = f(X₁, X₂, …, Xₙ) and Φ(X₁, X₂, …, Xₙ) is the joint probability density function of the random variables X₁, X₂, …, Xₙ.

If f(X₁, X₂, …, Xₙ) is a linear function, i.e.,

Z = a₁X₁ + a₂X₂ + … + aₙXₙ

then the mean value of Z is also linear:

μ_Z = a₁μ_X₁ + a₂μ_X₂ + … + aₙμ_Xₙ

The variance of a random variable X, Var(X), is defined as the second moment about the mean, i.e.,

Var(X) = E[(X − μ_X)²]

The relationship between the variance and the standard deviation of a random variable is given by:

Var(X) = σ_X²   (C-13)

The variance of the multivariate random variable Z = f(X₁, X₂, …, Xₙ) is defined as

Var(Z) = E[(Z − μ_Z)²]

If f(X₁, X₂, …, Xₙ) is a linear function and the errors of the primary random variables X₁, …, Xₙ are mutually independent random variables, the above expression reduces to:

Var(Z) = a₁² Var(X₁) + a₂² Var(X₂) + … + aₙ² Var(Xₙ)

since the variance of a constant is zero. A practical application of the above definitions is the derivation of the

mean vector and the covariance matrix of a vector of random variables which is a linear function of another vector of random variables. For example, let us assume a linear equation in vector form such as:

y = Ax

Let E(x) = 0 and Cov(x) = Q, where Cov(x) denotes the covariance matrix of the vector of random variables x. Then the expected value of y is:

E(y) = A E(x) = 0

and the covariance matrix of y is:

Cov(y) = E[yyᵀ] = A Q Aᵀ

These results for linear transformations of vectors of random variables are used in many derivations throughout this book.

HYPOTHESIS TESTING

Hypothesis testing is a very important statistical tool for making decisions about random variables. The procedure uses information from a random sample of data to test the truth or falsity of a statement. The basic statement about a random variable is usually called the null hypothesis,


denoted by H₀. The opposite hypothesis about the same random variable is called the alternative hypothesis, here denoted by H₁.

The decision to accept or reject the null hypothesis is based on a statistical test. The test statistic (the value of the statistical test for given data) is first calculated with the data in the random sample. A decision criterion (a threshold of the statistical test) is used to make the decision about the hypothesis H₀. Two kinds of errors may be made at this point. If the null hypothesis is rejected when it is actually true, a Type I error is made. Alternatively, when the null hypothesis is accepted when it is actually false, then a Type II error is made. The probabilities of occurrence of Type I and Type II errors are as follows:

α = P(Type I error) = P(reject H₀ | H₀ = true)   (C-19)

β = P(Type II error) = P(accept H₀ | H₁ = true)   (C-20)

The power of the test is often used to evaluate a particular statistical test, and it is defined as:

Power = P(accept H₁ | H₁ = true) = 1 − β   (C-21)

In this book, hypothesis testing is used to test the null hypothesis:

H₀: there is no gross error in process data,

versus the alternative hypothesis:

H₁: there is at least one gross error in process data,

or, more specifically:

H₁ⱼ: there is a gross error in measurement j.
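A minimal sketch of this decision rule for a single measurement, with a hypothetical standardized test statistic z_j (the threshold is the standard normal quantile for a two-sided test at an assumed α = 0.05):

```python
from statistics import NormalDist

alpha = 0.05
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # two-sided threshold, ~1.96

z_j = 2.40     # hypothetical standardized statistic for measurement j
reject = abs(z_j) > z_crit                         # declare a gross error in j?
print(round(z_crit, 2), reject)                    # -> 1.96 True
```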

The choice of the test threshold depends on the statistical test that is used for hypothesis testing. If the statistical test follows a standard normal distribution, such as some of the statistical tests in Chapter 7, a threshold z₁₋α/₂ is used at a chosen level of significance α. This value is used to control the probability of Type I error at the value α.

For multiple tests, as in the case of multiple measurements in the plant, the overall probability of Type I error is higher than α. An upper bound β can be designed, as explained in Chapter 7. Let zⱼ be the test statistic for measurement j. If |zⱼ| > z₁₋β/₂, then the null hypothesis H₀ is rejected and hypothesis H₁ⱼ is accepted. This means that zⱼ is outside the ±z₁₋β/₂ confidence interval for a standard normal distribution. This is similar to a value X being outside the interval (−1.96, +1.96) for α = 0.05 in Figure C-2.

On the other hand, if a global test described in Chapter 7 is used to test the null hypothesis H₀ against the global alternative hypothesis H₁, the threshold for the test is χ²ᵥ,α at a chosen level of significance. As in Figure C-4, if the test statistic is greater than χ²ᵥ,α, the null hypothesis is rejected and a gross error is declared in the measurement set.

REFERENCES

1. Wadsworth, H. M. Handbook of Statistical Methods for Engineers and Scientists. New York: McGraw-Hill, 1990.

2. Hines, W. W., and D. C. Montgomery. Probability and Statistics in Engineering and Management Science. New York: John Wiley & Sons, 1980.

3. CATACON Workbook. Brea, Calif.: Simulation Sciences, Inc., 1996.


Accuracy
  of estimation, 301
  of measurement, 36, 37
Adjustability, 210-211
Adjustments, 12-14, 16, 19
Ammonia
  plant case study, 343-368
  synthesis process, 219, 307-308, 314
Antoine equation, 120
Average
  error of estimation (AEE), 258
  number of Type I errors (AVTI), 258
Balance
  component flow, 88, 91, 93-95, 98
  deficit, 332
  elemental, 95
  enthalpy, 115
  overall flow, 88, 93, 106, 107
  residuals, 178
Ball mills, 107-111
Bayes
  decision rule for identification, 266
  formula, 267, 269
Bayesian algorithm, 271-273, 278
Bayesian test, 23, 264-273, 278
  sequential application of, 267
Bernoulli random variables, 265, 267
Bias in measurement, 32, 37, 176, 186, 256, 282, 290, 291, 294
Bounded GLR method (BGLR), 249-253
Bounds on variables, 22, 25, 61, 138, 239, 241, 246-253, 262, 369
Branches of spanning trees, 110-112, 316-320, 380-383
Broyden's matrix update procedure, 127, 131
Cauchy-Schwartz inequality, 192
Certainty Equivalence Principle, 157
Chi-square
  distribution, 189, 225, 350-351, 387, 388
  random variable, 358, 389
Cholesky factorization, 159
Chords in a graph, 110-112, 316-320, 380-383
Circuits
  mineral beneficiation, 7
Coaptation subproblem, 77
Collective
  methods for bias and leak detection, 256
  principal component tests, 200, 209
Combinatorial strategies, 253-254
Confidence interval, 387
  Bonferroni, 181
Connectivity, 382
Constant direction approach, 125
Constraint test, 23, 180-182, 201, 203, 232, 253-255, 259, 278, 334 (see also Nodal test)
Constraints
  bilinear, 134
  equality, 122
  inequality, 128-129, 166
  nonlinear, 124, 131, 138
Continuous stirred tank reactor (CSTR), 162-164, 168-170, 261
Control law, 148
Coolers, 116
Correlation coefficients, 33
Covariance matrix, 63, 121
  of balance residuals, 178
  of measurement adjustments, 183
Critical value of a statistical test, 177, 178, 180
Crowe's projection matrix method, 97-104, 113-114, 116, 126, 132-133, 138, 219, 333
Crude
  preheat train, 5-6, 86, 329, 335-339, 340
  split optimization, 5, 10
CUSUM tests, 55
Cutsets in a graph, 110-112, 317, 319, 320, 381-383
Cycles in a graph, 380
Data
  coaptation, 8, 15, 22
  conditioning, 4, 56
  filtering, 27, 39, 51
  rectification, 3
  smoothing, 39, 51
  validation, 27, 56
Data reconciliation (DR)
  benefits from, 20-21
  bilinear, 25, 85-117, 119, 316
  dynamic, 10, 23, 27, 142-173, 330
  estimation accuracy of, 301-303
  flow reconciliation example, 11-13
  for nonlinear processes, 22, 26, 138
  history of, 21-24
  in dynamic systems, 142-173, 282
  industrial applications of, 327-372
  linear steady-state, 59-84, 155
  material balance, 72
  nonlinear problems, 262
  nonlinear dynamic (NDDR), 165, 166, 168, 169, 170
  nonlinear steady-state, 25, 119-141
  parameter estimation and, 331-332
  plant-wide material and utilities, 332-372
  problem formulation, 7, 9-10
  process unit, 328, 334
  simple problems, 11
  simulation techniques for evaluating, 81-82
  statistical basis of, 61-63
  steady-state, 4, 5, 6, 7, 10, 23, 25, 27, 80, 81, 85, 153, 154, 166, 329
  successive linear (SL), 124-128, 135, 137
DATACON software, 351, 354
DATREC software, 331
Degrees
  of freedom, 388
  of redundancy, 65
Delay
  in instrument checking, 27
  in data filtering, 39, 41
Detectability factor, 211
Dirac delta function, 161
Distributed control system (DCS), 40
Distributions
  beta, 268
  central, 192
  multivariate, 392
  noncentral, 392
  normal, 35, 386-387
  standard normal, 386
Dynamic measurement test (DMT), 248-249
Edges, 321, 378-382
Eigenvalues, 196, 200, 209, 376
Eigenvectors, 196, 377


Elementary tree transformation, 383
Energy
  balances, 9
  conservation constraints, 9
  conservation laws, 8, 27
  flows, 11
Enthalpy
  balance, 212, 340, 342
  flows, treatment of, 114, 116
Equivalency classes, 215, 216, 258
Error-in-variables (EVM) estimation, 168
Error reduction methods, 38
Errors
  gross, 1-4, 6, 7, 11, 17-20, 21, 22, 23, 24, 26, 27, 32, 34, 35, 37, 60, 80-81, 128, 174-225, 226-280, 327-372
  normalized, 176
  random, 1-4, 7, 12, 27, 32-37, 41, 56, 61, 81, 143-145, 151, 154, 163, 168, 175, 176, 358
  reduction methods, 38-56
  squared prediction, 199
  systematic, 32
  Type I, 177, 181, 184, 188, 190, 191, 198, 223, 229-231, 233, 234, 236, 240, 254, 255, 257, 258, 259, 262, 264, 271, 273, 284-286, 294, 332, 393
  Type II, 177, 223, 234, 236, 254, 255, 284, 286, 294, 392
Estimation accuracy
  of data reconciliation, 301
  of minimum observable sensor networks, 306
Expected value
  of random errors, 32-34, 389-391
  of a function of random variables, 35-37
Extended measurement test (EMT), 248, 249
Faults, 282, 295
  additive, 289
  diagnosis, 281, 284, 288, 295-297
  hard, 295
  isolability, 295-297
  signature, 296
  soft, 295
Filters
  analog, 38
  digital, 38, 39, 54, 56
  double exponential, 42, 45
  exponential, 40-47, 48, 54
  exponentially weighted moving average (EWMA), 50
  finite impulse response (FIR), 49, 51
  first-order, 40, 42, 45, 53, 54, 55
  geometric moving average, 50
  hybrid, 54-56
  infinite impulse response (IIR), 40
  Kalman (see Kalman filter)
  least-squares, 51
  moving average, 48, 49, 54
  nonlinear exponential, 42-43, 45, 46, 47, 50, 54
  polynomial, 51-54
  reverse nonlinear exponential, 44
  second-order, 45, 53
  square-root covariance, 158
Flow
  balances, 12
  energy, 11
  enthalpy, 114
  estimated, 12
  mass, 11
  measured, 12
  reconciliation, 13, 14, 16, 18
Fourth-order Runge-Kutta method, 163, 169
Fundamental
  cutsets, 382-383
  cycles, 382-383
GAMS, 323
Gauss-Jordan elimination process, 124
Gaussian
  distribution, 10-11, 161, 284, 289, 290, 386 (see also Normal distribution)
  elimination, 313, 315
Generalized likelihood ratio (GLR) test, 23, 185-194, 199, 201, 203-205, 214, 223, 227-230, 234-238, 240, 241-244, 252, 259, 261, 262, 266, 269, 277, 288-294, 296, 298, 340
Generalized reduced gradient (GRG), 132-133, 137, 138, 167
GINO, 323
Global test (GT), 23, 178-180, 193, 194, 198, 199, 201, 203-207, 222, 230, 231, 236, 238, 240, 248, 252, 259, 277, 283-288, 293, 294, 298, 355, 359-362, 364, 366, 367, 368
Graph, 378-380
  operations, 380
  process, 378-380
  subgraphs, 379-381, 383-384
  theoretic methods, 22, 25, 72, 82, 110, 135, 315-316, 320, 324
  theory fundamentals, 378-393
Grinding mills, 113
Gross error detection (GED), 1-4, 23, 24, 26, 174-225, 226-280, 330, 340-341, 359
  basic statistical tests for, 174-195
  benefits from, 20-21
  for steady-state processes, 226-280
  history of, 21-24
  in linear dynamic systems, 281-299
  in nonlinear processes, 260-264
  model, 185
  serial strategies for, 236-238
  signature models, 187
  simultaneous strategies for, 227-236, 248
  using principal component tests, 195-200
Gross errors, 1, 6, 7, 11, 17-20, 21, 22, 23, 24, 26, 27, 32, 34, 35, 37-38, 60, 80-81, 128, 174-225, 226-280, 327-372
  equivalency classes, 215, 216, 258
  equivalent sets of, 214, 216
  identifiability of, 214-217
  identification strategies, 256-260
  signature vectors, 201, 215, 216, 230
HARWELL mathematical library, 82, 131
Heat
  balance equations, 25
  exchangers, 5-6, 9, 10, 11, 21, 73, 74, 75, 115-116, 162, 179, 212, 233, 252, 274, 276, 298, 336, 339, 340, 342, 343
  transfer coefficients, 9-10, 21, 295, 331
  transfer fluid (HTF), 270-275
Heaters, 116
Hessian matrix, 130-131
Hotelling T² test, 329
Hypotheses
  alternative, 176, 187, 228, 241, 243, 392
  combinatorial, 278
  global alternative, 393
  null, 176, 203, 229, 231, 284-286, 293, 391-393
  testing, 391-393
Implementation of data reconciliation
  guidelines, 339
  on-line, steady-state, 329
IMSL mathematical library, 82
Independent
  equations, 88
  random errors, 33
Innovations, 150, 283-284
Integral of absolute errors (IAE), 39-44, 47, 49
Integral dynamic measurement test, 297
Iterative measurement test (IMT), 238-240, 243, 246, 248, 277
Jacobian matrix, 123-124, 125, 126, 127
Joint probability density function, 390


Kalman filter, 148-160, 163, 164, 165, 170, 171, 283, 285, 289, 293, 294, 296, 298
  extended, 23, 161, 163, 169
  filtering methods, 26, 148-160
  gain matrix, 150, 158, 160
  implementation, 158
  steady-state Kalman gain, 151, 152
Kruskal's algorithm, 321-322
Lagrange multipliers, 60, 61, 122-124, 131
Leak detection, 185-189, 254-256, 335
Leaks, 37, 174, 185, 189, 190
Least-squares
  formulation, 160
  minimization, 121
  optimization, 8, 13, 161
  weighted objective function, 8
Level of significance, 176, 392
  modified, 181
Likelihood function, 52
Line search, 127, 130
Linear
  combination technique (LCT), 254-256, 260
  data reconciliation problems, 9, 59-82, 155
  program (LP), 132
  systems, 63-77
Local neighborhood search technique, 323
Loss estimation, 334
LU decomposition of matrix, 134, 135
Magnitude
  of bias, 185, 262
  of gross error, 37, 265, 270, 273
MATLAB, 82, 217, 274
Matrices and their properties, 373, 375-377
Matrix
  column space, 375
  covariance, 63, 121
  decomposition methods, 70-72
  left null space, 375
  null space, 375, 376
  projection, 64, 66-69
  range space, 375
  rank, 375
  row space, 375
  signature, 289
  trace of, 303, 376
Maximum likelihood estimates (MLE), 122, 230
Maximum power (MP)
  constraint tests, 181-182, 199
  measurement test, 184-185, 190-193, 199, 203-206
Mean values, 392
Measurement
  accuracy of, 37
  direct method, 78
  elimination, 23, 24
  error covariance matrix, 77, 78, 79, 178
  errors, 27, 32-38
  indirect method, 78, 80
  practically nonredundant, 210
  practically unobservable variables, 210
  precision, 37
  test (MT), 20, 23, 183-185, 201, 222, 227, 255, 355-356, 359, 360, 361
  test statistics, 183-185
Mineral
  beneficiation circuits, 7, 104
  flotation process, 102
  process circuits, 23
MINOS, 137, 323
Mixers, 91, 94, 102, 113
  enthalpy balance, 114
  two-phase, 106-107, 108
Model
  identification, 143
  linear discrete dynamic system, 143-145
  tuning, 20-21
Modified iterative measurement test method (MIMT), 247-249, 251, 261, 278
Modified serial compensation strategy (MSCS), 244-246, 259, 260, 263-264, 277-278
Moving window approach, 166-167
Newton-Raphson iterative method, 123, 132
Nodal test, 23, 180-182, 208, 232, 253-255, 259, 278, 334 (see also Constraint test)
Nodes, 378-381
Nonlinear
  data reconciliation, 9, 26, 85, 164-170
  GLR test, 263-264
  optimization strategies for data reconciliation, 136, 167, 171
  programs (NLP), 23, 25, 26, 103, 104, 128-129, 134, 137, 261, 276, 331, 369
  state estimation, 160-164
Normal distribution, 10-11, 161, 284, 289, 290, 386 (see also Gaussian distribution)
Objective function (OF), 261-263
  for data reconciliation, 8, 60
  difference, 261-262
  reduction in, 204-205
Observability, 22, 69-70, 71, 72, 74, 82, 135, 136, 210
  definition of, 70
On-line
  data collection and conditioning, 5
  implementation of data reconciliation, 329-330
  optimization, 10
Open Yield software, 335
Optimal
  control and Kalman filtering, 155
  state estimation, 148
Orthogonal collocation, 166-167
Overall power (OP), 257
  function (OPF), 257
  function equivalency (OPFE), 258
Parity equations, 296
Paths, 380
Performance measures for GE identification strategies
  average error of estimation (AEE), 258
  average number of Type I errors (AVTI), 258
  overall power (OP), 257
  overall power function, 257
  overall power function for equivalent sets, 258
Plant-wide material and utilities reconciliation, 332-372
Posterior probability, 266-267
Power of statistical test, 177, 181, 190, 392
Preheat train, 5-6, 212
Principal component
  analysis (PCA), 296
  scores, 196
  measurement test (PCMT), 197, 355-356, 360
  model, 197, 297
  of constraint residuals, 196
  tests, 21, 176, 195-200, 207-209, 223, 232, 259, 364-366
Prior
  distribution, 268
  probability, 265
Probability density functions (PDF or p.d.f.), 35, 384-389
Process
  control applications, 10
  data conditioning methods, 1-4
  unit balance reconciliation, 328-331
Production accounting, 332, 335
Projection matrix, 22, 25, 64, 66, 67, 70, 74, 81, 82, 101-102, 132, 134, 146, 201, 219, 221
Q statistic, 199 (see also Rao-statistic error or squared prediction error)
Quadratic
  objective function, 129
  problem (QP), 131
QPSOL, 131
QR factorization, 22, 66-71, 81, 127, 128


RAGE software, 136, 331, 340
Random
  errors, 1-4, 7, 10, 12, 27, 32-37, 38, 41, 56, 61, 82, 143-145, 151, 154, 163, 168, 175, 176, 358
  events, 384
  variables, 384-393
Range and null space decomposition (RND), 131, 136, 376
Rao-statistic error, 199
Raoult's law, 120
Reactors, 94-96, 113
Real
  numbers, 376
  vectors, 376
RECON software, 331
Reconciliation of ammonia plant data, 343-372
RECONSET software, 331
Redundancy, 22, 27, 69-73, 82, 134, 135, 209, 210, 211, 228, 300, 330, 342, 343, 354, 369
  classification, 71, 135
  definition, 70
  degrees of, 65
  spatial, 4
  temporal, 4, 149, 171, 282, 291
Redundant subproblem, 77
Rigorous on-line modeling, 21
RND-SQP, 131, 133, 136, 376
ROMeo software, 331
Runge-Kutta method, fourth-order, 163, 169
Selectivity, 258
Sensor network
  design, 70, 300-326
  developments in design, 323
  maximum estimation accuracy design, 306, 315
  minimum cost designs, 313-315, 320, 322, 324
  minimum observable, 304-306, 307, 312, 315, 316, 318, 322, 324
  optimization techniques for, 322
  redundant observable, 305, 309-313, 320, 324
Separation Theorem, 157
Separators, 94, 113
  two-phase, 105, 108
Sequential
  probability ratio test (SPRT), 284
  quadratic programming (SQP), 129-132, 135, 136
Serial
  compensation, 237, 241, 243, 260, 277
  correlation, 34
  elimination procedure, 24, 204, 214
  strategies, 236, 246-247, 259
Shewhart test, 54-55
SIGMAfine software, 335
Signal
  aliasing, 38
  processing, 25
  reconstruction, 45, 55, 297
  types, 55
Signature matrix, 289-290
Simple serial compensation strategy (SSCS), 241-244, 245, 259, 260, 277
Simpson's technique, 104-105, 108, 109, 111, 113, 114, 117, 133, 316
Simultaneous strategies
  for multiple gross error identification, 227
  using a Bayesian approach, 264-273
  using combinatorial hypothesis testing, 228-231
  using simultaneous estimation of GE's magnitudes, 232-233
  using single gross error test statistics, 227
Smearing effects, 18, 228
Soave-Redlich-Kwong (SRK), 354
Spanning tree, 110-112, 316-322, 325, 380-383
Spatial
  correlation, 33
  redundancy, 4
Splitters, 91-93, 96, 113-114
  enthalpy balance, 114
Squared prediction error, 199
SQPHP, 131
Standard
  deviation, 33, 34-37, 45, 56, 218
  normal distribution, 386-387
Statistical
  moments, 389, 390
  process control, 38, 55
  properties of innovations, 283-284
  properties of random variables, 389-391
  quality control tests (SQC), 3, 38
  tests for general steady-state models, 200-202
Steady-state
  linear reconciliation, 25
  processes, 4, 25, 282, 369
Subgraph, 379-381
Subproblems
  coaptation, 77
  redundant, 77
Successive linear data reconciliation, 124-128
Successive quadratic programming (SQP), 22-23, 131, 135, 138, 167
Successively linearized horizon estimation (SLHE), 167
Systematic errors, 32
Systems
  bilinear, 25, 85-117
  containing gross errors, 17
  dynamic, 25, 27
  linear, 63
  linear dynamic, 281
  nonlinear, 160
  nonredundant, 16
  observable, 17, 25
  redundant, 25
  unobservable, 17
  with all measured variables, 11-14, 15, 16, 22, 26
  with unmeasured variables, 14-17, 22
Taylor's series expansion, 36, 123, 129
Test statistics, 175-203
Theory of evidence, 329
Truncated chi-square test, 199
Unbiased estimation techniques (UBET), 232-233, 236
Univariate tests, 180, 184, 199
VALI, 331
Variables
  basic, 132
  classification methods, 77
  dependent, 132
  independent, 132
  measured, 11-14, 15, 16, 22, 26, 32, 33, 62, 63-66, 69, 70, 72, 81, 101, 109, 121, 132, 135, 149, 177, 212, 238, 249, 257, 260, 305, 314, 316, 358
  nonbasic, 132
  nonredundant, 16, 210
  observable, 17, 25
  primary, 35
  random, 35-37, 119, 256, 257, 384-389
  redundant, 17, 136, 200, 210
  restricted, 250
  secondary random, 35
  split-fraction, 113-114
  superbasic, 132
  unmeasured, 14-17, 21, 27, 63-68, 81, 100, 103, 110, 121, 126, 132-133, 135, 136, 177, 200, 202, 204, 212, 241, 248, 249, 305, 312, 315, 316, 331, 358
  unobservable, 17, 135, 136, 210
Variance, 392-393
  of the estimated error, 302-303
  of random variables, 33, 389-391


Vectors
  and their properties, 373-375
  column, 373
  dimension of, 374
  gross error signature, 187, 193, 215
  of balance residuals, 178
  of measurement adjustments, 183
  real, 373
  row, 373
  space spanned by, 377
Weighted least-squares objective function, 8
Windows, 166-168, 285, 293
Yield accounting, 332, 335, 369

Author Index

Abadie, J., 132
Albuquerque, J. S., 134, 297
Ali, Y., 324
Almasy, G. A., 23, 79, 80
Bagajewicz, M., 77, 155, 214, 260, 297, 322, 324, 330
Bagchi, A., 158
Basseville, M., 295
Bellingham, B., 145, 295
Bequette, B. W., 128, 137
Biegler, L. T., 22, 134, 136, 297
Bodington, C. E., 24
Borrie, J. A., 158
Britt, H. I., 22, 125, 126
Carpani, R. E., 23
Charpentier, V., 210, 329
Chen, J., 80
Clinkscales, T. A., 51, 52, 54
Crowe, C. M., 22, 24, 66, 72, 97, 98, 101, 113, 135, 181, 190, 196, 198, 199, 204, 209, 210, 314
Daniel, J., 373
Darouach, M., 155
Davidson, H., 22
Davis, J. F., 24, 181, 232, 243, 254, 257, 259, 260
Dee, N., 317, 321, 378
Devanathan, S., 155, 284
Dunia, R., 37, 296, 297
Edgar, T. F., 129, 137, 162
Everell, M. D., 23
Fisher, G., 22, 127, 133
Fisher, D. G., 151
Gelb, A., 153
Gertler, J. J., 281, 295, 296
Gill, P. E., 129
Gorman, J. W., 22, 125
Harikumar, P., 256
Heenan, W. A., 24, 204, 228, 238, 239, 247, 255, 261
Heraud, N., 23
Himmelblau, D. M., 129, 294, 295
Hlavacek, V., 24
Hodouin, D., 23
Howat, C. S., 24, 120, 125, 126


Ichiyen, N., 102
Iordache, C., 210, 214, 269, 273
Isermann, R., 295
Jazwinski, A. H., 160
Jiang, Q., 155, 214, 233, 260, 297
Jones, H. L., 285, 293
Jordache, C., 24, 37, 53, 55, 136, 260, 360
Kalman, R. E., 149
Kao, C. S., 34
Keller, J. Y., 80, 238, 244, 245, 259
Kelly, J., 333
Kim, Y. M., 39, 168, 261
Knepper, J. C., 22, 125
Kretsovalis, A., 72, 135, 303, 309, 312, 324
Kuehn, D. R., 22
Lasdon, L. S., 132, 137
Lee, J. M., 39
Lees, F. P., 145, 295
Liebman, M. J., 23, 137, 165, 166, 169
Liptak, B. G., 57
Loucka, M., 72, 323
Luecke, R. H., 22, 125, 126
Luo, Q., 296
MacDonald, R. J., 24, 120, 125, 126
MacGregor, J. F., 50
Madron, F., 24, 32, 34, 35, 37, 123, 124, 209, 313, 315, 324, 332
Mah, R. S. H., 22, 23, 24, 34, 72, 74, 79, 80, 135, 151, 181, 185, 200, 206, 214, 237, 253, 255, 288, 303, 309, 311, 321
Makni, S., 151
Maquin, D., 321
Mehra, R. K., 285
Melsa, J. L., 148, 150, 151
Meyer, M., 72, 96
Montgomery, R. C., 284
Mullick, S. L., 330
Murtagh, B. A., 137
Murthy, A. K. S., 23
Muske, K. R., 162
Nair, P., 37, 136
Narasimhan, S., 23, 24, 185, 190, 200, 214, 237, 256, 261, 264, 288, 324
Nikiforov, I. V., 295
Noble, B., 373
Pai, C. C. D., 22, 127, 133
Parr, A., 45, 55
Patton, R., 281, 295
Peschon, J., 285
Press, W. H., 159
Ragot, J., 72
Ramamurthi, Y., 128, 137, 165, 167, 169
Rao, R., 324
Ravikumar, V., 23, 24, 136
Reid, K. J., 23
Reilly, P. M., 23
Reklaitis, G. V., 8, 95
Renganathan, T., 261, 264
Rhinehart, R. R., 45
Ripps, D. L., 23, 195, 204, 237
Rollins, D. K., 24, 155, 181, 232, 243, 257, 259, 260, 284
Romagnoli, J. A., 22, 24, 66, 77, 116, 258
Rosenberg, J., 24, 204, 231, 233, 236, 248, 249, 260
Sage, A. P., 148, 150, 151
Sanchez, M., 22, 66, 77, 116, 233, 258
Saunders, M. A., 137
Seborg, D. E., 38, 45
Sen, S., 324
Serth, R. W., 24, 123, 204, 228, 238, 239, 247, 255, 261
Sheel, J. P., 24
Shewchuck, C. F., 123
Shinskey, F. G., 37
Simpson, D. E., 23, 104, 108
Smith, H. W., 102
Sorenson, H. W., 150
Stanley, G. M., 23, 56, 72, 151
Stephanopoulos, G., 24, 77
Stephenson, G. K., 123
Strang, G., 373
Swartz, C. L. E., 22, 66, 128, 134, 135
Sztano, T., 23
Tamhane, A. C., 23, 24, 181, 183, 264, 273
Tham, M. T., 55
Tilton, B., 260, 356
Tjoa, I. B., 22, 136
Tong, H., 24, 196, 198, 199, 209
Turbatte, H. C., 324
Vaclavek, V., 72, 323, 324
Veverka, V. V., 24, 313, 315, 324, 332
Wald, A., 284
Wang, N. S., 24
Waren, A. D., 132, 137
Weber, R., 45
Wiegel, R. L., 23
Williams, J. P., 284
Willsky, A. S., 288, 293
Wishner, R. P., 162
Yang, Y., 255
Zalkind, C. S., 37
Zasadzinski, M., 155