


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 38, NO. 1, JANUARY 1992

Stability in Linear Estimation

Patrick A. Kelly, Member, IEEE, and William L. Root, Life Fellow, IEEE

Abstract: The stability with respect to model uncertainty of linear estimators of the coefficients of a linear combination of deterministic signals in noise is investigated. A class of estimators having nominal performances constrained to be close to that of the nominal linear, unbiased, minimum-variance (LUMV) estimator is specified. Two estimator stability indexes are defined, one based on a worst-case estimate mean-square error and the other on a type of signal-to-noise ratio. The estimator minimizing each index, subject to the optimality constraints, is found by reference to related LUMV estimation results. In most cases, the minimizing (or most stable) estimator is the same under the two indexes.

Index Terms: Linear unbiased minimum-variance estimation, model uncertainty, stability.

I. INTRODUCTION

The concern of this paper is the stability or robustness of linear estimators of the coefficients of a linear combination of known signals in noise. The observation space is taken to be a separable Hilbert space Z (which would usually, perhaps, be L₂[0, T]). The nominal model for the observation z ∈ Z is

z = Σ_{k=1}^{L} b_k s^(k) + w, (1.1)

where each signal s^(k) is a known element of Z; the noise w is a Z-valued random variable having zero mean and covariance operator A; and b₁, …, b_L are unknown real coefficients to be estimated. Linear regression problems of this type occur in applications such as system identification or detection of parameterized signals in noise; often, (1.1) is a convenient linearized approximation to observations involving general parameterized signals. It may be recalled that for A to be the covariance operator for w, A must be a bounded, linear, self-adjoint, nonnegative-definite operator on Z such that (Ax, y) = E[(x, w)(w, y)] for all x, y ∈ Z (see, e.g., Appendix B in [1]). We assume that N(A), the null space of A, is zero; that is, A is (strictly) positive-definite. Since A is a covariance operator, it must be compact and of trace class. If, for example, Z = L₂[0, T] and w is a stochastic process on

Manuscript received April 17, 1989; revised June 28, 1991. This work was supported in part by the National Science Foundation under Grant MIP-8710871. This work was presented in part at the 23rd IEEE Conference on Decision and Control, Las Vegas, NV, December 1984, and in part at the 24th Allerton Conference on Communication, Control, and Computing, University of Illinois, Urbana, IL, October 1986.

P. A. Kelly is with the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003.

W. L. Root is with the Departments of Aerospace Engineering and Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI 48109.

IEEE Log Number 9103260.

[0, T] with sample paths that are square-integrable a.s., then A can be represented by a linear integral operator with the autocorrelation function of w as kernel (see again, e.g., Appendix B in [1]). Let {ψ^(k), k = 1, …, L} be the usual orthonormal basis for R^L. Let b = Σ_{k=1}^{L} b_k ψ^(k), and let S: R^L → Z be the linear transformation given by Sb = Σ_{k=1}^{L} b_k s^(k). Then (1.1) may be written as

z = Sb + w. (1.2)

It is further assumed that each s^(k) ∈ R(A) (the range of A) and that {s^(k), k = 1, …, L} is a linearly independent set. Then S has rank L, and S*A^{-1}S: R^L → R^L is well defined and invertible (where the superscript asterisk denotes adjoint). For convenience, and without loss of generality, we assume that the observation is scaled so that ‖A‖ (equal to the largest eigenvalue of A) is equal to one.

We are interested in linear estimates of b; that is, estimates of the form b̂ = Hz for some continuous linear transformation H: Z → R^L. We shall require that the transformation (or estimator) H be unbiased under the conditions of the model (1.2), which is true if and only if HS = I_L, the identity operator on R^L. A standard measure of estimator performance is the estimate mean-squared error (mse). When H is unbiased, the mse of the estimate b̂ is equal to the variance of b̂, and is given by

E(‖b̂ − b‖²) = Σ_{k=1}^{L} (HAH*ψ^(k), ψ^(k)) = tr[HAH*], (1.3)

where tr[·] denotes the trace, and we are using (·, ·) to denote the standard inner product and ‖·‖ the norm induced by the inner product (it should be clear from the context whether the space being referred to is R^L or Z). Notice that, when H is linear and unbiased, the variance (or mse) of b̂ is independent of the value of b. Hence, one can seek a minimum-variance element of the class of linear unbiased estimators; that is, an estimator having minimum estimate variance for all b. As will be discussed further in Section II, it follows from results in [1] that, under the conditions previously stated, a unique linear, unbiased, minimum-variance (LUMV) estimator exists for the nominal estimation problem.
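In finite dimensions the LUMV estimator reduces to the familiar Gauss-Markov (generalized least-squares) form, which the paper develops abstractly in Section II. The following sketch is purely illustrative: Z is replaced by R^n, and the matrices S and A are invented stand-ins for the signal transformation and the (scaled) noise covariance.

```python
# Finite-dimensional sketch of the nominal LUMV (Gauss-Markov) estimator.
# Z is replaced by R^n, the signals s^(k) are the columns of S, and A is a
# positive-definite covariance scaled so that ||A|| = 1. All matrices are
# invented stand-ins for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, L = 8, 2                                  # observation and parameter dimensions
S = rng.standard_normal((n, L))              # signal matrix, rank L
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)                # positive-definite covariance
A /= np.linalg.eigvalsh(A).max()             # scale so largest eigenvalue is 1

Ainv = np.linalg.inv(A)
H0 = np.linalg.inv(S.T @ Ainv @ S) @ S.T @ Ainv   # LUMV estimator

assert np.allclose(H0 @ S, np.eye(L))             # unbiasedness: H0 S = I_L
V0 = np.trace(H0 @ A @ H0.T)                      # estimate variance tr[H0 A H0*]
assert np.isclose(V0, np.trace(np.linalg.inv(S.T @ Ainv @ S)))
```

Here H0 corresponds to the LUMV estimator of Section II and V0 to its minimum variance.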

The question of estimator stability or robustness becomes important when the actual observation does not exactly match the supposed model (1.2). For example, the true signal space may only be approximated by the span of {s^(1), …, s^(L)}, and the noise covariance may only be approximately equal to

0018-9448/92$03.00 © 1992 IEEE


A. Since observation models are almost always uncertain to some extent, it is desirable to find estimators that are stable with respect to model inaccuracies; that is, estimators whose performances deteriorate as little as possible when actual conditions deviate from the nominal. Similar problems have been treated previously in the context of matched filtering. In [6], [7], minimax-robust matched filters (i.e., filters maximizing worst-case signal-to-noise ratios) were discussed for problems of one signal in noise, with signal and noise covariance uncertainty. In [2], robust matched filters were derived for multiple-input systems. The problem of robustly estimating the parameters in regression models with white noise was considered in [5], where nonlinear estimates that are minimax-robust in terms of asymptotic variance were derived.

The problem treated here is different from those in [2], [6], [7] because the quantities of interest are vector parameter estimators rather than matched filters. The approach taken is also different, and is similar to that taken in [4] for the problem of stable linear detection of a known signal in nominally Gaussian noise. One problem with minimax-robust filters or estimators is that, while worst-case performance is optimized, there is generally no guarantee of very good performance under nominal conditions. The loss in nominal performance may, in some cases, outweigh the gain achieved in the worst case by using the robust method. It thus seems reasonable to search for the most stable or robust procedure under some constraints that do guarantee acceptable nominal performance. Another possible problem with standard minimax-robust design is that it requires the specification of a level of model uncertainty, and the solution depends on that level. It would also be desirable to characterize methods that are stable or robust whatever the degree of model uncertainty. In [4], an optimality index, corresponding to the signal-to-noise ratio, was first specified to characterize detector performance under nominal conditions. Uncertainty was introduced by allowing the true probability measure for the observation under each hypothesis to vary in a Prokhorov neighborhood of the nominal. A second scalar quantity, called the stability index, was shown to determine the largest possible difference between the actual and nominal test statistic distributions under either hypothesis and with any degree of model uncertainty. The stability index also had the form of a signal-to-noise ratio, where the "signal" term was essentially the largest possible shift in the mean of the test statistic from the nominal value. An optimization problem was then formulated and solved to give the most stable linear detector meeting the nominal performance constraints.

In this paper, we also set up and solve optimization problems, this time to find the most stable linear, nominally unbiased estimators meeting constraints on nominal performance. One major difference from the problem considered in [4] is that the measures of performance for the estimators are determined by means and covariances rather than specific probability measures. Hence, the nominal problem and the characterizations of estimator optimality and stability are formulated in terms of second-order statistics only. A second difference is that there is more than one reasonable choice for

a stability index for linear estimators. We will consider two indexes, based on two slightly different representations for model uncertainty and on different stability criteria, and show that they lead to the same stable linear estimator. For the special case of estimating a single parameter (i.e., L = 1 in (1.1)), the stable estimator has a form similar to the stable linear detector found in [4] and to the robust matched filter for signal uncertainty found in [6].

The organization of the rest of this paper is as follows. In Section II we review and apply results from [1] on LUMV estimation. These results give the LUMV estimator for the nominal problem. Furthermore, they are crucial to the solution of the optimization problem specifying the stable estimator. The optimality constraints characterizing nominal performance and the different stability criteria are discussed in Section III, and the optimization problems leading to stable estimators are formulated. The solutions to these problems are discussed in Section IV. In Section V, the results are generalized by allowing for some "directionality" in the model uncertainty. Section VI contains concluding remarks.

II. LINEAR UNBIASED ESTIMATION

First, consider unbiased linear estimators for b in the model (1.2). In order to characterize stable estimators, we will need to solve the problem of finding an unbiased estimator H that minimizes tr[HΘH*], where Θ is some positive-definite, self-adjoint, bounded linear operator on Z. If Θ = A, the trace term to be minimized is the nominal estimate variance, and the minimizing H is the nominal LUMV estimator. However, whether or not Θ is the covariance of w is irrelevant to finding a minimizing H, and [1, Theorem 2.2], which is ostensibly a theorem on LUMV estimation, can be applied. This theorem is actually more general than is needed, since it covers the case where both the parameter space and the observation space are infinite-dimensional separable Hilbert spaces. As stated, the theorem requires Θ to be trace class (which, of course, it is if Θ = A), but that condition is used in the proof of the theorem only to guarantee that tr[HΘH*] exists; it may be dropped here because the parameter space is finite-dimensional, so that HΘH* necessarily has finite trace.

The conditions to be met in [1, Theorem 2.2] for the existence of a unique continuous H that is unbiased on N(S)^⊥ (the orthogonal complement of the null space of S) and minimizes tr[HΘH*] are:

1) R(S) is closed;
2) Θ^{-1/2}S is densely defined;
3) (Λ*Λ)_r^{-1}Λ*Θ^{-1/2} is a bounded transformation, where Λ = Θ^{-1/2}S.

The notation C_r denotes the restriction of the operator C to domain N(C)^⊥ and counterdomain R(C), and C_r^{-1} = (C_r)^{-1}, which necessarily exists on R(C). If these conditions are satisfied, the theorem states that the minimizing H is the continuous extension to all of Z of

(Λ*Λ)_r^{-1}Λ*Θ^{-1/2},

which is proved in the theorem to be densely defined.


In the case of interest in this paper, with parameter space R^L and S of full rank, it is easily seen that almost all of the complexities in the original theorem disappear, yielding the following simple result.

Theorem 1: If Θ is a bounded, self-adjoint, strictly positive-definite operator on Z, and s^(k) ∈ R(Θ) for each k = 1, …, L, then there is a unique continuous, linear, unbiased estimator H for b in the model (1.2) that minimizes tr[HΘH*] over the class of such estimators. The minimizing H is given by

H = [(S*Θ^{-1}S)^{-1}S*Θ^{-1}]**. (2.2)

Furthermore,

tr[HΘH*] = tr[(S*Θ^{-1}S)^{-1}]. (2.3)

Proof: See Appendix A. □

Remark 1: Suppose that Θ = A. The conditions for the theorem are satisfied in this case by the assumptions made in the Introduction. Thus, there is a unique LUMV estimator H_o for the nominal problem, having the form

H_o = [(S*A^{-1}S)^{-1}S*A^{-1}]**, (2.4)

and the minimum estimate variance for the nominal problem is

V_o = tr[(S*A^{-1}S)^{-1}]. (2.5)

Remark 2: Suppose that Θ^{-1} is a bounded operator. In this case, the conditions of the theorem are automatically satisfied for any s^(k) ∈ Z, k = 1, …, L. The unique minimizing H is then given by the conventional formula

H = (S*Θ^{-1}S)^{-1}S*Θ^{-1},

just as in finitely many dimensions.

III. OPTIMALITY AND STABILITY CRITERIA

We now consider characterizations of the optimality and stability of an arbitrary continuous linear estimator H. The criterion for optimality is to be based on a comparison of how H performs under nominal conditions with how the LUMV estimator H_o performs. We restrict attention to nominally unbiased estimators; hence, we require HS = I_L as one optimality constraint. As noted in the Introduction, when H is unbiased the variance of the estimate b̂ = Hz is equal to tr[HAH*] under nominal conditions. We require as a second optimality constraint that this value be acceptably close to the minimum variance V_o achieved using H_o. That is, we require that

tr[HAH*] ≤ V_o + δ, (3.1)

for some choice of δ > 0.

The estimator H is also to be stable with respect to model uncertainty. Even with models specified only by their second-order properties, we could describe uncertainty in several different ways, and use several different criteria for stability. We choose to introduce uncertainty into the model by assuming that the true observation has the form

z = Sb + w + q, (3.2)

where the precise nature of q is unknown. (We can also consider uncertainty in S, but discussion of that case is deferred to Section IV.) Suppose that we first consider q to be a Z-valued random variable, independent of w, with mean and covariance that, while not exactly known, have the known bounds

‖B^{-1/2}E(q)‖ ≤ ε₁, (3.3)

cov[q′] ≤ ε₂²B, (3.4)

where B is some bounded, self-adjoint, positive-definite linear operator on Z, q′ = q − E(q), and the inequality in (3.4) is to be interpreted in the usual sense for nonnegative-definite operators (i.e., A₁ ≤ A₂ means that A₂ − A₁ is nonnegative definite). The operator B is used to indicate a level of uncertainty, represented by the mean and covariance of q, that is possibly larger along some components than it is along others. When there is no reason to incorporate directionality in the uncertainty, we can set B = I (the identity operator on Z).

The nominally unbiased estimator H is not necessarily truly unbiased when the extra noise q is present, since E(q) may not be zero. The actual mse achieved with the estimate b̂ = Hz is

mse(b̂) = E(‖b̂ − b‖²) = E(‖Hw + Hq‖²)
  = E(Hw, Hw) + E(Hq, Hq)
  = tr[HAH*] + (HE(q), HE(q)) + E(Hq′, Hq′)
  ≤ tr[HAH*] + Σ_{k=1}^{L} (E(q), H*ψ^(k))² + ε₂² tr[HBH*]
  ≤ tr[HAH*] + ‖B^{-1/2}E(q)‖² Σ_{k=1}^{L} ‖B^{1/2}H*ψ^(k)‖² + ε₂² tr[HBH*]
  ≤ tr[HAH*] + ε₁² tr[HBH*] + ε₂² tr[HBH*]
  = tr[HAH*] + γ tr[HBH*], (3.5)

where γ = ε₁² + ε₂². This bound on the mse for the estimate under actual conditions serves as one indicator of stability.

A second possible interpretation of the uncertainty term q in (3.2) is to view it simply as an unknown having a bounded norm, in the form ‖B^{-1/2}q‖ ≤ ε₃ for some ε₃ > 0. The effect of q is to shift the mean of b̂ so that it is no longer equal to b. The size of the shift in mean is given by

‖E(b̂ − b)‖² = (Hq, Hq) ≤ ε₃² tr[HBH*]. (3.6)

We could consider a bound on the mse for this case: since, with q viewed as an unknown, the variance of b̂ remains equal to tr[HAH*], we would get a bound on the mse having the same form as (3.5), with γ = ε₃². Another indicator of the effect of the mean shift on the estimator output is a


signal-to-noise ratio with the unwanted signal term equal to the size of the shift; that is,

‖E(b̂ − b)‖² / var(b̂) ≤ ε₃² (tr[HAH*])^{-1} tr[HBH*]. (3.7)

We can thus choose as an alternate measure of stability the scalar (tr[HAH*])^{-1} tr[HBH*]. Aside from its interpretation as a signal-to-noise ratio, this quantity has the appealing feature of being the same as the stability index used in [4] when L = 1. Note also that it does not depend on the specification of a value for ε₃.
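The mean-shift bound (3.6) behind this index can be checked numerically. In the finite-dimensional sketch below, all matrices are invented stand-ins (Z is replaced by R^n, and B is taken diagonal so that B^{-1/2} is immediate); every q with ‖B^{-1/2}q‖ = ε₃ produces a shift within the bound.

```python
# Numerical check of the mean-shift bound (3.6):
#   ||E(b_hat - b)||^2 = ||Hq||^2 <= eps3^2 * tr[H B H*]  when ||B^{-1/2} q|| <= eps3.
# All matrices are illustrative stand-ins: Z is replaced by R^n, the signals
# s^(k) are the columns of S, and B is diagonal so B^{-1/2} is immediate.
import numpy as np

rng = np.random.default_rng(1)
n, L = 8, 2
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)
A /= np.linalg.eigvalsh(A).max()              # scale so that ||A|| = 1
B = np.diag(rng.uniform(0.5, 2.0, n))         # directional uncertainty weighting
B_inv_half = np.diag(np.diag(B) ** -0.5)

Ainv = np.linalg.inv(A)
H = np.linalg.inv(S.T @ Ainv @ S) @ S.T @ Ainv    # the LUMV estimator, used as H

eps3 = 0.3
bound = eps3 ** 2 * np.trace(H @ B @ H.T)
for _ in range(200):
    q = rng.standard_normal(n)
    q *= eps3 / np.linalg.norm(B_inv_half @ q)    # enforce ||B^{-1/2} q|| = eps3
    assert np.linalg.norm(H @ q) ** 2 <= bound + 1e-12
```

Dividing the same squared-shift bound by var(b̂) = tr[HAH*] recovers the index in (3.7).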

Our goal is to find the most stable estimator meeting nominal performance constraints. With the indicators of nominal performance and stability previously described, we can define the most stable estimator to be the solution either to the problem,

1) given γ > 0 and δ > 0, find a bounded linear transformation H minimizing tr[HAH*] + γ tr[HBH*] subject to HS = I_L and tr[HAH*] ≤ V_o + δ,

or to the problem,

2) given δ > 0, find a bounded linear transformation H minimizing (tr[HAH*])^{-1} tr[HBH*] subject to HS = I_L and tr[HAH*] ≤ V_o + δ.

In Section IV, we shall discuss the solutions to Problems 1) and 2) for the special case B = I. In Section V, we will show that, in many cases, the solution for a more general B follows from the solution for B = I after a change of variables. Note that Problem 1) requires us to specify parameters indicating both the amount of uncertainty γ and the maximum allowed deviation from optimality δ, while Problem 2) requires only the value of δ. As we shall see, in most cases of interest the exact value of γ is unimportant, and the two problems have the same solution.

IV. STABLE LINEAR ESTIMATORS

In this section, we consider solutions to Problems 1) and 2) when B = I; for convenience, we label the problems in this special case 1') and 2'). Solutions for problems 1') and 2'), if they exist, give the most stable estimators according to two different criteria. We show in what follows that, under mild conditions, the solutions do exist and they are closely related. The procedure used for solving Problems 1') and 2') is based on the Gauss-Markov theorem, Theorem 1, previously stated. In order to use the results of Theorem 1, we first pose two auxiliary problems:

3) given δ > 0, find a bounded linear transformation H minimizing tr[HH*] subject to HS = I_L and tr[HAH*] = V_o + δ,

and

4) given η > −1, find a bounded linear transformation H minimizing tr[HH*] + η tr[HAH*] subject to HS = I_L.

Note that, if the inequality constraint in either 1′) or 2′) is replaced by an equality, then that problem is the same as Problem 3). We start by considering Problem 4), which is readily seen to have a unique solution in view of Theorem 1; relate Problem 4) to Problem 3); and eventually relate Problem 3) to Problems 1′) and 2′).

First, for η > −1 define

H(η) = [S*(I + ηA)^{-1}S]^{-1}S*(I + ηA)^{-1} (4.1)

and

V(η) = tr[H(η)AH*(η)]. (4.2)

Recall that ‖A‖ = 1 by assumption, so that (I + ηA) has a bounded inverse for all η > −1; hence, for each such η, the linear transformation H(η): Z → R^L is well defined. We call V(η) the variance of H(η).

Lemma 1: For any η > −1, H(η) is the unique solution to Problem 4).

Proof: Put Θ = I + ηA. Then, since tr[HΘH*] = tr[HH*] + η tr[HAH*] and Θ^{-1} is bounded, the assertion follows at once from Theorem 1 and Remark 2. □
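Lemma 1 can be illustrated numerically in finite dimensions. Below, H(η) of (4.1) is formed for invented stand-in matrices S and A, and its objective tr[HH*] + η tr[HAH*] is compared against perturbed unbiased estimators H(η) + D with DS = 0, which parameterize all other unbiased estimators.

```python
# Numerical illustration of Lemma 1: H(eta) of (4.1) minimizes
# tr[HH*] + eta * tr[HAH*] over all unbiased H (HS = I_L).
# S and A are invented stand-ins, with Z replaced by R^n.
import numpy as np

rng = np.random.default_rng(2)
n, L = 8, 2
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)
A /= np.linalg.eigvalsh(A).max()             # scale so that ||A|| = 1

def H_eta(eta):
    Ti = np.linalg.inv(np.eye(n) + eta * A)  # Theta^{-1} with Theta = I + eta A
    return np.linalg.inv(S.T @ Ti @ S) @ S.T @ Ti

def objective(H, eta):
    return np.trace(H @ H.T) + eta * np.trace(H @ A @ H.T)

eta = 1.5
H = H_eta(eta)
assert np.allclose(H @ S, np.eye(L))         # H(eta) is nominally unbiased

# Every other unbiased estimator is H + D with DS = 0; all give larger objectives.
P = np.eye(n) - S @ np.linalg.inv(S.T @ S) @ S.T   # projector onto R(S)-perp
for _ in range(20):
    D = rng.standard_normal((L, n)) @ P
    assert objective(H + D, eta) >= objective(H, eta) - 1e-10
```

The constraint set {H: HS = I_L} is affine, so its feasible directions are exactly the D with DS = 0; convexity of the objective then makes the local check above a global one.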

While Lemma 1 states that H(η) is the unique solution to Problem 4), nothing has been shown yet to prevent H(η) = H(η′) for some η′ ≠ η. The following lemma indicates that, in most cases, this cannot occur.

Lemma 2: V(η) is a continuous, and in fact differentiable, monotone decreasing function of η for −1 < η < ∞. If it is true that

A) R(S) ≠ R(A(I + ηA)^{-1}S) for all η > −1,

then V(η) is strictly monotone decreasing.

Proof: See Appendix B.

If Condition A) of Lemma 2 is satisfied, then η′ ≠ η implies that V(η′) ≠ V(η), which in turn implies that H(η′) ≠ H(η). Condition A) would normally hold in practice; it is satisfied unless R(S) is invariant under A(I + ηA)^{-1}, which would require a rather special choice of the signals {s^(k)}. This is perhaps made more obvious by noting a second, more transparent, condition that implies Condition A). Let {φ₁, φ₂, …} be a complete orthonormal set of eigenvectors of A in some order, and let K be the sum of the L largest multiplicities of eigenvalues of A. Define s_m ∈ R^L by s_m = [(s^(1), φ_m), …, (s^(L), φ_m)]*. Then we have the following lemma.

Lemma 3: Condition A) is implied by

B) there are more than K values of m for which s_m is nonzero.

Proof: See Appendix B. □

Since V(η) is monotone decreasing and positive, lim_{η→∞} V(η) exists (whether or not Condition A) holds). We will need to refer to this limit in what follows.
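The monotonicity asserted by Lemma 2, and the value of this limit (computed in Lemma 4 below), can both be observed numerically for invented stand-in matrices:

```python
# Numerical illustration of Lemma 2 and Lemma 4: V(eta) = tr[H(eta) A H(eta)*]
# decreases in eta and tends to V_o = tr[(S* A^{-1} S)^{-1}] as eta -> infinity.
# S and A are invented stand-ins, with Z replaced by R^n.
import numpy as np

rng = np.random.default_rng(3)
n, L = 8, 2
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)
A /= np.linalg.eigvalsh(A).max()             # scale so that ||A|| = 1

def V(eta):
    Ti = np.linalg.inv(np.eye(n) + eta * A)
    H = np.linalg.inv(S.T @ Ti @ S) @ S.T @ Ti
    return np.trace(H @ A @ H.T)

etas = [-0.5, 0.0, 1.0, 10.0, 100.0]
vals = [V(e) for e in etas]
assert all(v1 >= v2 - 1e-12 for v1, v2 in zip(vals, vals[1:]))   # monotone decreasing

V0 = np.trace(np.linalg.inv(S.T @ np.linalg.inv(A) @ S))
assert abs(V(1e8) - V0) < 1e-2 * V0                              # V(eta) -> V_o
```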


Lemma 4: lim_{η→∞} V(η) = V_o = tr[(S*A^{-1}S)^{-1}], the variance of the nominal LUMV estimator.

Proof (sketch of proof only): Expand (4.2) for V(η), with H(η) as given by (4.1), about (1/η) = 0. It can be shown that

[S*(I + ηA)^{-1}S] = η^{-1}[(S*A^{-1}S) + A(η)]

and

[S*(I + ηA)^{-1}A(I + ηA)^{-1}S] = η^{-2}[(S*A^{-1}S) + B(η)],

where ‖A(η)‖ → 0 and ‖B(η)‖ → 0 as η → ∞. From this it follows that V(η) → tr[(S*A^{-1}S)^{-1}] as η → ∞. □

A full proof of Lemma 4 is given in [3]; it is straightforward but somewhat tedious. It is convenient for us to denote lim_{η↓−1} V(η) by V₁. We do not, in general, have a closed-form expression for V₁, and do not even know whether it is always finite (although it can be shown to be finite in at least one special case [3]). This lack of knowledge of V₁ matters very little, as will be seen, since V₁ is used only to define a range of admissible values for δ, and the values of most interest are in the range 0 < δ < V(0) − V_o ≤ V₁ − V_o. Let V(η) be written as V_o + δ(η), 0 < δ(η) < V₁ − V_o. By Lemma 2, for any δ′ between 0 and V₁ − V_o there is an η′, −1 < η′ < ∞, such that δ(η′) = δ′. If Condition A) or Condition B) is satisfied, there is a 1:1 correspondence between η and δ, and one can write η(δ) for any δ in the range specified above (i.e., η(δ′) = η′ when δ(η′) = δ′). If η is restricted to be positive, then the corresponding range for δ is 0 < δ < V(0) − V_o. The connection between Problems 3) and 4) can now be established.

Lemma 5: For any δ, 0 < δ < V₁ − V_o, Problem 3) has a solution H(η̄) for some η̄ such that V(η̄) = V_o + δ. If Condition A) or Condition B) is satisfied, then H(η̄) is the unique solution to Problem 3), where η̄ = η(δ).

Proof: For the range of δ indicated, there is an η̄ such that V(η̄) = V_o + δ by Lemma 2. Since H(η)S = I_L for any η > −1, H(η̄) is admissible (i.e., satisfies the constraints) for Problem 3). Now suppose that H′ is admissible for Problem 3) and that tr[H′H′*] < tr[H(η̄)H*(η̄)]. Then, since tr[H′AH′*] = tr[H(η̄)AH*(η̄)] = V_o + δ, H′ provides a smaller value of tr[HH*] + η̄ tr[HAH*] than does H(η̄). This contradicts Lemma 1, so the first assertion is proved. If Condition A) or Condition B) is satisfied, there is only one such η̄, η̄ = η(δ). The solution H(η̄) for Problem 3) is then unique; if there were a different solution H′, it would also be admissible for Problem 4) with η = η̄, thus contradicting the uniqueness of the solution to Problem 4). □

In order to relate the solution for Problem 3) to solutions for Problems 1′) and 2′), we need the monotonicity of the traces of certain other matrices that depend on η. The desired results follow from the monotonicity of V(η), used in conjunction with Lemma 1.

Lemma 6:

a) If η″ > η′ > 0, then

tr[H(η″)H*(η″)] ≥ tr[H(η′)H*(η′)]. (4.3)

b) If η″ > η′ > −1, then

tr[H(η″)(I + η″A)H*(η″)] > tr[H(η′)(I + η′A)H*(η′)]. (4.4)

c) If η″ > η′ ≥ η̄ > −1, then

tr[H(η″)(I + η̄A)H*(η″)] ≥ tr[H(η′)(I + η̄A)H*(η′)]. (4.5)

If Condition A) or Condition B) is satisfied, the inequalities in (4.3) and (4.5) are strict.

Proof:

a) By Lemma 1,

η′ tr[H(η′)AH*(η′)] + tr[H(η′)H*(η′)] ≤ η′ tr[H(η″)AH*(η″)] + tr[H(η″)H*(η″)],

so that

η′[V(η′) − V(η″)] + tr[H(η′)H*(η′)] ≤ tr[H(η″)H*(η″)].

Since η′ > 0 and [V(η′) − V(η″)] ≥ 0 by Lemma 2, the conclusion follows.

b) In the first inequality in the proof of a), the right-hand side is made larger by replacing η′ with η″; this gives the result. Note that η′ does not need to be nonnegative for this part.

c) Note that

tr[H(η″)(I + η̄A)H*(η″)] − tr[H(η′)(I + η̄A)H*(η′)]
  = tr[H(η″)(I + η′A)H*(η″)] − tr[H(η′)(I + η′A)H*(η′)]
    + (η̄ − η′){tr[H(η″)AH*(η″)] − tr[H(η′)AH*(η′)]}.

The difference between the first two terms on the right-hand side is nonnegative by Lemma 1. Since (η̄ − η′) ≤ 0 and {V(η″) − V(η′)} ≤ 0, the result follows. Again, η′ does not need to be nonnegative.

The final assertion is obvious from the proofs. □

We can now give solutions to Problems 1’) and 2‘).

Theorem 2: Consider Problem 1′) with γ > 0 and 0 < δ < V(0) − V_o.

a) If V(γ^{-1}) ≤ V_o + δ, there is a unique solution H(η̄), where η̄ = γ^{-1}.

b) If V(γ^{-1}) > V_o + δ and Condition A) or Condition B) is satisfied, there is a unique solution H(η), where η = η(δ).

Proof: In Case a), H(η̄) is admissible, and since by Lemma 1 it provides the unique minimum of f(H) ≜ tr[H(I + η̄A)H*] among the class of all unbiased H, the assertion follows.

In Case b), H(η̄) (again, with η̄ = γ^{-1}) is not admissible, since it does not satisfy the inequality constraint. Clearly, H(η) with η = η(δ) > η̄ does satisfy the constraint (with equality) and is admissible. To show that it is the unique solution in this case, we suppose that there is a different H′ such that


H′S = I_L, V(H′) ≜ tr[H′AH′*] ≤ V_o + δ, and f(H′) ≤ f(H(η)), and obtain a contradiction. First, if V(H′) = V_o + δ, there is a contradiction of Lemma 5, which gives H(η) as the unique solution when the variance constraint is an equality. So we need consider only V(H′) < V_o + δ. Since H_o is the unique unbiased estimator having smallest possible variance V_o, V(H′) = V_o + δ′ for some 0 < δ′ < δ. Now, consider the class of all unbiased H with variance equal to V_o + δ′. The unique H that minimizes f(H) within that class is H(η′) with η′ = η(δ′) > η, again by Lemma 5. Thus, we may as well assume that H′ = H(η′). By Lemma 6c) with η′ > η > η̄, f(H(η′)) > f(H(η)), which is a contradiction. □

Theorem 3: For 0 < δ < V(0) - V_*, if Condition A) or Condition B) is satisfied, then Problem 2') has the unique solution H(η) with η = η(δ).

Proof: Let g(H) ≜ (tr[HAH*])^{-1} tr[HH*]. Then g(H(η)) = (V(η))^{-1} tr[H(η)H*(η)] is strictly monotone increasing as a function of η, 0 < η < ∞. In fact, V(η) is strictly decreasing by Lemma 2, and tr[H(η)H*(η)] is strictly increasing by Lemma 6a). Clearly, H(η) with η = η(δ) is admissible for this problem. The proof that it is the unique solution follows from the monotonicity of g(H(η)) in a manner similar to the proof of part b) of Theorem 2, with g(H(η)) replacing f(H(η)). □

Hence, at least for δ in the range 0 < δ < V(0) - V_*, and under the assumption that Condition A) or Condition B) holds, we can find unique solutions to Problems 1') and 2'). The restriction δ < V(0) - V_* corresponds to η > 0, which comes about in Theorem 2 because η ≥ η̄ = γ^{-1} > 0, and in Theorem 3 because Lemma 6a) requires it. In any event, the restriction is not very significant; it serves only to keep δ, the allowed increase in nominal variance over the minimum value, from being too large. Note that if the level of uncertainty γ in Problem 1') is not very small (so that V(γ^{-1}) > V_* + δ), then both Problems 1') and 2') have the same solution, which depends only on δ.

Remark 3: Suppose that, in addition to the additive uncertainty q in (3.2), there is uncertainty in S, so that the true observation has the form

z = (S + T)b + w + q,  (4.4)

where E(q) is assumed to be zero, T: R^L → Z is such that B^{-1/2}T is defined with ||B^{-1/2}T|| ≤ ε_2, and T is otherwise unknown. The effect of T is to add one more term, of the form ε_2^2 ||b||^2 tr[HBH*], to the bound on the estimate mean-square error obtained in Section III. That is, the bound still has the form (3.3), with γ now equal to ε_1^2 + ε_2^2 ||b||^2. This looks awkward, since γ now depends on the parameter to be estimated. However, as Theorem 2 shows (at least when B = I), the exact value of γ is irrelevant in most cases of interest. In fact, as long as ε_2 is large enough, the most stable estimator in the presence of the "multiplicative uncertainty" T depends only on δ, and is the same as that derived with just additive uncertainty.

Note that the most stable estimator H(η), having the form (4.1) for some η, differs from the nominally optimal estimator H_* in that the noise covariance A is replaced by I + ηA (or, equivalently, by A + η^{-1}I). This is similar to the way in which the most stable linear detector found in [4] and the minimax-robust matched filter for signal uncertainty described in [6] differ from the nominally optimal quantities. In fact, H(η) is in some sense a minimax-robust estimator. This is a consequence of the following lemma.
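In finite dimensions this relationship is easy to exhibit numerically. The sketch below (with hypothetical S and A, chosen only for illustration) builds H_* and H(η) from the same LUMV formula, checks the equivalence of the two forms I + ηA and A + η^{-1}I, and shows the tradeoff: H(η) concedes nominal variance tr[HAH*] in exchange for a smaller white-noise gain tr[HH*].

```python
import numpy as np

# Toy illustration (hypothetical S and A): the most stable estimator H(eta)
# is the LUMV formula with the covariance A replaced by I + eta*A.
rng = np.random.default_rng(1)
n, L = 10, 3
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T / n + 0.05 * np.eye(n)

def lumv(S, Theta):
    """LUMV estimator for noise covariance Theta: (S* Theta^-1 S)^-1 S* Theta^-1."""
    Ti = np.linalg.inv(Theta)
    return np.linalg.solve(S.T @ Ti @ S, S.T @ Ti)

H_star = lumv(S, A)                      # nominally optimal estimator H_*
eta = 2.0
H_eta = lumv(S, np.eye(n) + eta * A)     # most stable estimator H(eta)
H_eta2 = lumv(S, A + np.eye(n) / eta)    # equivalent form: A + eta^{-1} I

assert np.allclose(H_eta, H_eta2)            # the two forms coincide (scale cancels)
assert np.allclose(H_star @ S, np.eye(L))    # both estimators are unbiased
assert np.allclose(H_eta @ S, np.eye(L))

var = lambda H: np.trace(H @ A @ H.T)    # nominal variance tr[H A H*]
gain = lambda H: np.trace(H @ H.T)       # sensitivity term tr[H H*]
assert var(H_star) <= var(H_eta) + 1e-12     # H(eta) gives up nominal variance...
assert gain(H_eta) <= gain(H_star) + 1e-12   # ...to reduce tr[H H*]
```

The last two inequalities hold in general: H(η) minimizes tr[H(I + ηA)H*] = tr[HH*] + η tr[HAH*] over unbiased H, while H_* minimizes tr[HAH*] alone.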

Lemma 7: Let U be the class of unbiased continuous linear estimators. Let Θ̄ be a positive-definite, self-adjoint linear operator on Z, and let P be a class of positive-definite, self-adjoint operators such that

a) Θ̄ ∈ P;
b) Θ ≤ Θ̄, for all Θ ∈ P; and
c) s^(k) ∈ R(Θ), for all k = 1, …, L and all Θ ∈ P.

Define V(Θ, H) = tr[HΘH*], and let H_Θ be the operator that minimizes V(Θ, H), as given by Theorem 1. Then

max_P min_U V(Θ, H) = V(Θ̄, H_Θ̄) = min_U max_P V(Θ, H).  (4.7)

Proof: See Appendix B. □

If we let Θ̄ = A + η^{-1}I, it follows that the worst-case estimate variance for noise covariances dominated by A + η^{-1}I is minimized by using the estimator H(η).

Heuristically, the results of this section indicate that the stable estimator should guard against some amount of white noise in the observation. In some problems, it may not be reasonable to assume that the uncertain terms can play the role of white noise; for example, the uncertainty may be constrained to be smaller in some components than in others. This constraint on the uncertainty can be incorporated by using the operator B in Problems 1) and 2) in place of I. In the next section, we discuss solutions to Problems 1) and 2) in the more general case.

V. STABLE ESTIMATION WITH GENERALIZED STABILITY CRITERIA

We now consider Problems 1) and 2) with B some bounded, self-adjoint, positive-definite linear operator on Z. It turns out that, with modest hypotheses imposed on B, the general stable estimation Problems 1) and 2) can be reduced to problems of the form 1') and 2'), respectively, with a "change of variables." The solutions to the new problems follow essentially as in Section IV. The chief difference between what happens here and in the treatment of Problems 1') and 2') in Section IV is that the covariance operator A is replaced by a positive-definite operator that may not be a covariance. The change of variables is a simple idea, but in general there are additional technical conditions that must be imposed to ensure that the necessary transformations and operators are well defined.

Before considering the problem more generally, we first consider the special case in which B is such that B^{-1/2} is a bounded operator on Z. As will develop, the procedure previously described goes through trivially in this situation.


The change of variables is as follows: put H' = HB^{1/2}, S' = B^{-1/2}S, and Γ = B^{-1/2}AB^{-1/2}. Since B^{1/2} and B^{-1/2} are bounded linear operators, for each (H, S, A) the new triple (H', S', Γ) is well defined; reciprocally, given (H', S', Γ), H = H'B^{-1/2}, S = B^{1/2}S', and A = B^{1/2}ΓB^{1/2} are also well defined. We assume that B is scaled so that ||Γ|| = 1. It follows immediately that H'H'* = HBH*, H'ΓH'* = HAH*, and H'S' = HS. Thus, with B^{-1/2} bounded, Problem 1) is equivalent to the problem:

1'') Given γ > 0 and δ > 0, find H' minimizing tr[H'ΓH'*] + γ tr[H'H'*] subject to H'S' = I_L and tr[H'ΓH'*] ≤ V_* + δ.

If H is admissible for Problem 1), that is, if it is a bounded linear transformation meeting the constraints, then H' = HB^{1/2} is admissible for Problem 1''), and H and H' give the same values to their respective object functions. When B^{-1/2} is bounded, it is also true that whenever H' is admissible for Problem 1''), H = H'B^{-1/2} is admissible for Problem 1). However, we note now for what follows that if B^{-1/2} is unbounded then H = H'B^{-1/2} can be unbounded and hence not admissible for Problem 1). If H is unbounded, Problems 1) and 1'') are not equivalent.

With B^{-1/2} bounded, Theorem 2 applies to Problem 1'') since all the preliminary conditions are satisfied: in fact, S' = B^{-1/2}S is of full rank, and Γ is a covariance operator. The latter statement is true because B^{-1/2}A^{1/2} and A^{1/2}B^{-1/2} are compact operators, so that Γ = B^{-1/2}AB^{-1/2} is trace class. The solutions to Problem 1''), under the conditions of either Part a) or Part b) of Theorem 2, are in terms of what we may call H'(η), which replaces H(η) as given by (4.1), with S' and Γ replacing S and A, respectively. (See Appendix C, (C.3).) The variance function for Problem 1'') we write as V'(η) = tr[H'(η)ΓH'*(η)]; it has the properties of V(η) as given in (4.2). The variance of the nominally optimal estimate is

V'_* = tr[(S'*Γ^{-1}S')^{-1}] = tr[(S*A^{-1}S)^{-1}] = V_*.

The Condition A) in Lemma 2 now has the form

A') R(S') ≠ R(Γ(I + ηΓ)^{-1}S'), for all η > -1,

which for B^{-1/2} bounded reduces to

A'') R(S) ≠ R(A(B + ηA)^{-1}S), for all η > -1.

Finally, the solution to Problem 1) is obviously H = H'B^{-1/2}, where H' is the solution to 1'') found using Theorem 2.

The same change of variables allows one to use Theorem 3 to solve Problem 2). The solution is H(η) = H'(η)B^{-1/2}, where η = η(δ) with respect to V'(η) (i.e., η(δ) is the η such that V'(η) = V_* + δ). As can be easily verified, the solution to either Problem 1) or 2) when B^{-1/2} is bounded can be put in the form

H(η) = [S*(B + ηA)^{-1}S]^{-1}S*(B + ηA)^{-1}  (5.1)

for the appropriate η.

In general, while B^{-1/2} and B^{-1} exist as self-adjoint operators densely defined in Z (since B is bounded, positive-definite, and self-adjoint), they need not be bounded. This leads to some difficulties. First, one can say nothing useful about D(B^{-1/2}AB^{-1/2}) in general. Second, as previously mentioned, H = H'(η)B^{-1/2} must be bounded, or it is not even admissible as a solution to Problems 1) and 2). Some additional restriction on B is certainly necessary to take care of the first difficulty, and apparently the second also. We list five conditions to be required as hypotheses for the two theorems to follow. The first is a restatement of a condition imposed throughout the paper; the second restricts B so that B^{-1/2}S and B^{-1/2}AB^{-1/2} are properly defined, and together with the third condition, is sufficient for a solution to Problem 1'') (and also to Problem 2''), the corresponding restatement of Problem 2)) to exist. The fourth and fifth conditions guarantee the boundedness of H'(η)B^{-1/2} for every η > -1. With these, Theorems 2 and 3 can essentially be recovered for Problems 1) and 2).
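In finite dimensions B^{-1/2} is automatically bounded, and the closed form (5.1) can be checked directly against the change-of-variables route. The sketch below (hypothetical S, A, and diagonal B, chosen only for illustration) also verifies the B = A degeneracy noted in Remark 6.

```python
import numpy as np

# Sketch of the generalized solution (5.1): when B^{-1/2} is bounded (automatic
# in finite dimensions) the change of variables H' = H B^{1/2}, S' = B^{-1/2} S,
# Gamma = B^{-1/2} A B^{-1/2} reduces Problem 1) to Problem 1''), and the
# solution can be written directly as (5.1).
rng = np.random.default_rng(2)
n, L = 9, 2
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T / n + 0.1 * np.eye(n)
d = rng.uniform(0.5, 2.0, size=n)
B = np.diag(d)                                # bounded, positive definite

def solve_51(eta, S, A, B):
    """H(eta) = [S*(B + eta A)^-1 S]^-1 S*(B + eta A)^-1, equation (5.1)."""
    W = np.linalg.inv(B + eta * A)
    return np.linalg.solve(S.T @ W @ S, S.T @ W)

eta = 1.5
# Route 1: closed form (5.1).
H_direct = solve_51(eta, S, A, B)
# Route 2: change of variables, solve the primed problem, map back.
B_mhalf = np.diag(1.0 / np.sqrt(d))           # B^{-1/2}
S_p = B_mhalf @ S                             # S'
Gamma = B_mhalf @ A @ B_mhalf                 # Gamma
Wp = np.linalg.inv(np.eye(n) + eta * Gamma)
H_prime = np.linalg.solve(S_p.T @ Wp @ S_p, S_p.T @ Wp)
H_back = H_prime @ B_mhalf                    # H = H' B^{-1/2}

assert np.allclose(H_direct, H_back)          # the two routes agree
assert np.allclose(H_direct @ S, np.eye(L))   # unbiasedness is preserved

# Remark 6: with B = A, (5.1) collapses to the LUMV estimator for every eta.
Ai = np.linalg.inv(A)
H_star = np.linalg.solve(S.T @ Ai @ S, S.T @ Ai)
assert np.allclose(solve_51(3.7, S, A, A), H_star)
```

The agreement of the two routes reflects the identity B^{-1/2}(I + ηΓ)^{-1}B^{-1/2} = (B + ηA)^{-1}, which is exactly what makes the change of variables work.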

The conditions are:

1) R(S) ⊆ R(A),
2) R(A^{1/2}) ⊆ R(B^{1/2}),
3) N(Γ) = 0,
4) B and A commute, and
5) R(S) ⊆ R(B) (with Condition 1), it is sufficient that R(A) ⊆ R(B)).

Theorem 4 will be proved in Appendix C with a sequence of lemmas; Theorem 5 follows in almost exactly the same way, and no proof is given. Let H'(η) and V'(η) be as previously. Let H̃(η) be the continuous extension of H'(η)B^{-1/2} to all of Z. The quantity Ṽ_* = lim_{η→∞} V'(η) is used in place of V_* in the theorems; the proof that the limit equals V_* (sketched in Lemma 4) used an expansion that is valid only when Γ (or A in the lemma) has a discrete spectrum, which it may not when B^{-1/2} is not bounded. (See the remarks following Lemma C.1 in Appendix C.)

Theorem 4: Let B be a bounded, positive-definite, self-adjoint operator on Z satisfying Conditions 1)-5) just listed. Consider Problem 1) with Ṽ_* in place of V_* and with 0 < δ < V'(0) - Ṽ_*.

1) If V'(γ^{-1}) ≤ Ṽ_* + δ, then there is a unique solution H̃(γ^{-1}).

2) If V'(γ^{-1}) > Ṽ_* + δ and Condition A') is satisfied, then there is a unique solution H̃(η), where η = η(δ) (with respect to V'(η)).

Theorem 5: Assume the same hypotheses as for Theorem 4. For 0 < δ < V'(0) - Ṽ_*, if Condition A') is satisfied, then Problem 2) (with Ṽ_* in place of V_*) has the unique solution H̃(η), where η = η(δ) (with respect to V'(η)).

Remark 4: If H'(η)B^{-1/2} is known to be bounded, Conditions 4) and 5) can be dropped. This is the situation, e.g., when B^{-1/2} is bounded. If Γ has a discrete spectrum, as it well may, then Ṽ_* = V_*; if, further, the eigenvalues of Γ have multiplicities bounded by some number, Condition B') (Condition B) with the appropriate changes) can be used in place of Condition A').


Remark 5: Conditions 1)-5) may appear to be complicated, but there is a wide class of operators B that are readily seen to meet all of these conditions: let B be defined by Bz = Σ_{n=1}^∞ β_n(z, φ_n)φ_n, where {φ_n}_{n=1}^∞ is a complete set of orthonormal eigenvectors of A, and the β_n's are positive real numbers such that sup{β_n, 1 ≤ n < ∞} < ∞ and sup{β_n^{-1}(Aφ_n, φ_n), 1 ≤ n < ∞} = 1. It seems likely that this class is wide enough to cover most problems of interest.
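Remark 5's construction is straightforward to instantiate in finite dimensions. The sketch below (with a hypothetical A) builds such a B from the eigenvectors of A and checks the commutation and normalization properties it is supposed to have.

```python
import numpy as np

# Remark 5's recipe, sketched in finite dimensions: build B from the
# eigenvectors phi_n of A as B z = sum_n beta_n (z, phi_n) phi_n, with the
# beta_n bounded and scaled so that sup_n beta_n^{-1} lambda_n = 1.
rng = np.random.default_rng(3)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T / n + 0.2 * np.eye(n)
lam, Phi = np.linalg.eigh(A)            # eigenvalues lambda_n > 0, orthonormal phi_n

beta = rng.uniform(1.0, 3.0, size=n)    # any bounded positive choice
beta *= np.max(lam / beta)              # rescale so sup beta_n^{-1} lambda_n = 1
B = Phi @ np.diag(beta) @ Phi.T

assert np.allclose(B @ A, A @ B)                     # Condition 4): B and A commute
assert abs(np.max(lam / beta) - 1.0) < 1e-12         # normalization of Remark 5
assert np.all(np.linalg.eigvalsh(B) > 0)             # B positive definite
```

Because B shares the eigenbasis of A, the range conditions 2) and 5) are immediate here; in infinite dimensions they are exactly what the boundedness requirements on the β_n enforce.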

Remark 6: The case of B = A is interesting in its own right, and it perhaps illuminates the theory a little. Obviously the solution to Problem 1) in this case is H = H_* for any γ and any δ. Equally obviously, Problem 2) is not posed so as to contain this case, since the object function reduces to a constant; however, the problem retains the meaning that any admissible H provides a "solution," and since H_* is admissible for any δ it might as well be chosen. It is fair to say that with this weighting of errors there is no real question of stability in either problem. Nevertheless, it is still interesting to consider B = A for Problem 1) in terms of Theorem 4. First, it may be noted that B = A is included in the class of operators B defined in the previous remark; even without this observation it is clear that B = A satisfies Conditions 2), 4), and 5). However, B = A implies that Γ = I, from which it follows immediately that R(Γ(I + ηΓ)^{-1}S') = R(S') for all η, so that Condition A') is not satisfied. Also, it follows that H'(η) is the continuous extension of (S*A^{-1}S)^{-1}S*A^{-1/2} to all of Z, and hence that H̃(η) = H_*, as given by (2.4), for any η > 0. Thus, case 1) of Theorem 4 always obtains. This is the extreme case where the optimality constraint is "too loose." We refer to [4, Remarks 5 and 6], where the corresponding situation is discussed for the detection problem treated there.

VI. CONCLUSION

This paper has considered the stability of estimates of the parameter vector for a linear combination of nominally known signals observed in noise having some nominal covariance. Under mild assumptions, there is a unique linear, unbiased, minimum-variance (LUMV) estimator for the nominal problem. The goal here has been to find linear estimates that are stable with respect to inaccuracies in the model. Uncertainty is incorporated in the model by allowing the actual observation to include an extra term. The extra term is interpreted as being either random with a mean and covariance that are unknown but have known bounds, or deterministic and unknown except for an amplitude bound. For the case of random uncertainty, an estimator stability index was defined based on the worst-case mean-square error in the estimate; for the deterministic uncertainty, a stability index was defined as a sort of signal-to-noise ratio, where the "signal" term is the worst-case shift in the estimate mean from its nominal value.

Based on the stability indexes, two optimization problems were formulated, with the most stable estimator defined to be that linear, nominally unbiased estimator minimizing the appropriate stability index subject to a constraint on the nominal estimate variance. These problems were first considered in the special case that the uncertainty bounds are in terms of the identity operator, and were then generalized to allow for some directionality in the uncertainty. We have shown that in most situations of interest, the problems have unique solutions. Furthermore, if the nominal variance constraint is not too loose, the two problems have the same solution, which depends only on the variance constraint. This solution turns out also to be stable with respect to bounded uncertainty in the nominal signals, and to be robust in a minimax sense. The most stable estimator differs from the LUMV estimator in a way similar to that in which previously derived robust matched filters [6] and stable linear detectors [4] differ from the nominally optimal quantities.

APPENDIX A

Proof of Theorem 1: First it is to be verified that Conditions 1), 2), 3) stated preceding Theorem 1 are satisfied. The first is immediate since D(S) (the domain of S) = R^L. The second follows from the hypothesis, since R(Θ) ⊆ R(Θ^{1/2}); indeed, both Θ^{-1/2}S and Θ^{-1}S are defined on R^L and are necessarily bounded. Now, (A*A)_1 is A*A with domain restricted to N^⊥(A*A); however, N(A*A) = N(A) = 0, since Ab = Θ^{-1/2}Sb = 0 implies that Sb = 0, and hence, b = 0. Thus, (A*A)_1 = A*A = S*Θ^{-1}S, which has an inverse on R^L. Further, since Θ^{-1/2}A = Θ^{-1}S is bounded, (Θ^{-1/2}A)* is bounded. Then A*Θ^{-1/2} ⊆ (Θ^{-1/2}A)* is bounded, so (A*A)^{-1}A*Θ^{-1/2} is bounded and Condition 3) is satisfied.

It follows that (A*A)^{-1}A*Θ^{-1/2} = (S*Θ^{-1}S)^{-1}S*Θ^{-1}, which is densely defined. The minimizing H is the continuous extension of this to all of Z, which is given by H = [(S*Θ^{-1}S)^{-1}S*Θ^{-1}]**.

Now, since Θ^{-1}S(S*Θ^{-1}S)^{-1} is everywhere defined, it is actually equal to [(S*Θ^{-1}S)^{-1}S*Θ^{-1}]* = H*. Thus, (S*Θ^{-1}S)^{-1}S*Θ^{-1}ΘΘ^{-1}S(S*Θ^{-1}S)^{-1} ⊆ HΘH*. Since the left-hand side is everywhere defined and equal to (S*Θ^{-1}S)^{-1}, the final assertion follows. □

APPENDIX B

Proof of Lemma 2: The function V(η) as given by (4.1) and (4.2) is obviously differentiable, so we compute its derivative to show that V(η) is monotonically decreasing. Let A(η) = [S*(I + ηA)^{-1}S]^{-1} and B(η) = S*(I + ηA)^{-1}A(I + ηA)^{-1}S, so that V(η) = tr[A(η)B(η)A(η)]. Then

A'(η) = A(η)B(η)A(η),
B'(η) = -2S*(I + ηA)^{-1}A(I + ηA)^{-1}A(I + ηA)^{-1}S,

which we write as B'(η) = -2D(η). Thus,

V'(η) = 2 tr[A'(η)B(η)A(η)] + tr[A(η)B'(η)A(η)] = -2 tr[A(η){D(η) - B(η)A(η)B(η)}A(η)].


Since A(η) is positive definite, V'(η) ≤ 0 if and only if D(η) - B(η)A(η)B(η) is nonnegative definite. Now, define

E(η) = (I + ηA)^{-1/2}S[S*(I + ηA)^{-1}S]^{-1}S*(I + ηA)^{-1/2}.

Then,

D(η) - B(η)A(η)B(η) = S*(I + ηA)^{-1}A(I + ηA)^{-1/2}{I - E(η)}(I + ηA)^{-1/2}A(I + ηA)^{-1}S.  (B.2)

To show V'(η) ≤ 0 it is sufficient to show that {I - E(η)} is nonnegative definite. However, this is immediate since E(η) is an orthogonal projection; in fact, for η > -1, E(η) is bounded and self-adjoint, and it is easy to verify that it is also idempotent.
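The projection property of E(η) is easy to confirm numerically. In the sketch below (hypothetical S and A, for illustration only), E(η) is assembled exactly from its definition and checked to be self-adjoint and idempotent, so that I - E(η) ≥ 0.

```python
import numpy as np

# Numerical check that E(eta) = (I+eta*A)^{-1/2} S [S*(I+eta*A)^{-1} S]^{-1}
# S* (I+eta*A)^{-1/2} is an orthogonal projection, hence I - E(eta) >= 0.
rng = np.random.default_rng(4)
n, L = 7, 2
S = rng.standard_normal((n, L))
M = rng.standard_normal((n, n))
A = M @ M.T / n + 0.1 * np.eye(n)
eta = 0.8

lam, Phi = np.linalg.eigh(np.eye(n) + eta * A)
R_mhalf = Phi @ np.diag(lam ** -0.5) @ Phi.T          # (I + eta*A)^{-1/2}
T = R_mhalf @ S                                       # E projects onto R[T]
E = T @ np.linalg.solve(T.T @ T, T.T)                 # E(eta)

assert np.allclose(E, E.T)            # self-adjoint
assert np.allclose(E @ E, E)          # idempotent
assert np.min(np.linalg.eigvalsh(np.eye(n) - E)) > -1e-12   # I - E(eta) >= 0
```

Writing E(η) as T(T*T)^{-1}T* with T = (I + ηA)^{-1/2}S makes it visibly the orthogonal projection onto R[(I + ηA)^{-1/2}S].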

To prove the second assertion, we show that Condition A) is equivalent to the condition that

K(η) ≜ A(η){D(η) - B(η)A(η)B(η)}A(η)

has positive trace for η > -1. From (B.2) one has that

K(η) = J*(η){I - E(η)}J(η),

where J(η) = (I + ηA)^{-1/2}A(I + ηA)^{-1}SA(η) is a 1:1 map from R^L onto R[J(η)] = R[(I + ηA)^{-1/2}A(I + ηA)^{-1}S], since A(η) is nonsingular on R^L. Because K(η) is nonnegative definite, it will have positive trace if and only if it is not the zero operator. Thus we consider, for any e ∈ R^L, η > -1,

(K(η)e, e) = ({I - E(η)}J(η)e, J(η)e).  (B.3)

The value of the inner product in (B.3) is zero, for f = J(η)e ∈ R[J(η)], if and only if f ∈ R[E(η)]. Hence, K(η) is the zero operator if and only if

R[J(η)] = R[E(η)].  (B.4)

By the definition of E(η), R[E(η)] = R[(I + ηA)^{-1/2}S], so (B.4) becomes

R[(I + ηA)^{-1/2}A(I + ηA)^{-1}S] = R[(I + ηA)^{-1/2}S],

or finally, since (I + ηA)^{-1/2} is 1:1 onto,

R[A(I + ηA)^{-1}S] = R[S].

The second assertion of the lemma follows. □

Proof of Lemma 3: The negation of Condition A) can be written R[A(I + ηA)^{-1}S] ⊆ R(S), since both ranges have dimension L. Put Γ = A(I + ηA)^{-1}. Then we note first that Condition 1): R(ΓS) ⊆ R(S) is equivalent to Condition 2): there exists some D: R^L → R^L such that ΓS = SD. We interpret D as a matrix with respect to the usual basis {ψ^(1), …, ψ^(L)} in R^L, and S as the matrix [s_ij], s_ij = (s^(j), φ_i), i = 1, …, ∞; j = 1, …, L. The proof that Condition 2) implies Condition 1) is trivial; the reverse implication is almost trivial. In fact, given Condition 1), for each ψ^(j) there exists e^(j) ∈ R^L such that ΓSψ^(j) = Se^(j). Then for any b ∈ R^L, b = b_1ψ^(1) + ⋯ + b_Lψ^(L),

ΓSb = S(b_1e^(1) + ⋯ + b_Le^(L)) = SDb,

where D is the L × L matrix with components d_ij = (e^(j), ψ^(i)). With respect to the basis {φ_n}_{n=1}^∞, the infinite matrix representation of Γ is diagonal with mth diagonal entry

γ_m = λ_m / (1 + ηλ_m),

where λ_m is the eigenvalue of A corresponding to φ_m. Clearly, γ_m = γ_p if and only if λ_m = λ_p. Now, ΓS = SD for some L × L matrix D implies γ_m s_m = s_m D for m = 1, 2, …, where s_m denotes the mth row of S regarded as a vector in R^L, or

D*s_m = γ_m s_m,  m = 1, 2, ….  (B.5)

Thus, γ_m is an eigenvalue of D* unless s_m = 0. But D* has at most L nonzero eigenvalues. So if K is the sum of the L largest multiplicities of eigenvalues of A (which is the same as the sum of the L largest multiplicities of eigenvalues of Γ), then there are at most K indices m for which a nonzero vector s_m could satisfy (B.5). Thus, Condition B) implies that there is no L × L matrix D such that ΓS = SD, and hence, that Condition A) is satisfied. □

Proof of Lemma 7: We have

V(Θ̄, H) ≥ V(Θ, H), for all H ∈ U and Θ ∈ P,  (B.6)

since Θ̄ ≥ Θ implies HΘ̄H* ≥ HΘH*. Also, by Theorem 1,

min_U V(Θ̄, H) = V(Θ̄, H_Θ̄).  (B.7)

From (B.6), since Θ̄ ∈ P, we have

V(Θ̄, H) = max_P V(Θ, H), for all H ∈ U.  (B.8)

Thus, from (B.7) and (B.8),

min_U max_P V(Θ, H) = V(Θ̄, H_Θ̄).  (B.9)

Now, from (B.6) and (B.7),

min_U V(Θ, H) ≤ V(Θ, H_Θ̄) ≤ V(Θ̄, H_Θ̄), for all Θ ∈ P,  (B.10)

so that

max_P min_U V(Θ, H) ≤ V(Θ̄, H_Θ̄).  (B.11)

However, min_U V(Θ, H) = V(Θ, H_Θ), since for each Θ ∈ P, H_Θ is the unique minimizing H given by Theorem 1. Then, taking Θ = Θ̄,

max_P min_U V(Θ, H) = max_P V(Θ, H_Θ) ≥ V(Θ̄, H_Θ̄).  (B.12)

From (B.11) and (B.12),

max_P min_U V(Θ, H) = V(Θ̄, H_Θ̄).  (B.13)

The conclusion follows from (B.9) and (B.13). □

APPENDIX C

In this appendix, we prove Theorem 4. We start with the following lemma.

Lemma 8:

1) Let R(S) ⊆ R(A).
2) Let the bounded, positive-definite, self-adjoint operator B be such that R(A^{1/2}) ⊆ R(B^{1/2}).

Then, we have the following.

a) B^{-1/2}A^{1/2} is a bounded operator with domain equal to all of Z. B^{-1/2}AB^{-1/2} is densely defined on Z and is bounded; it therefore has a continuous extension Γ to all of Z, where Γ = (B^{-1/2}AB^{-1/2})**. Γ is self-adjoint.

b) S' = B^{-1/2}S is defined on R^L (so it is necessarily bounded), and R(S') ⊆ R(Γ).

c) With H' = HB^{1/2}, for any bounded linear transformation H: Z → R^L, it follows that tr[HBH*] = tr[H'H'*] and tr[HAH*] = tr[H'ΓH'*]. Further, if HS = I_L, then H'S' = I_L.

If it is further assumed that:

3) N(Γ) = 0,

then

d) Γ^{-1} exists as a self-adjoint operator on Z, either bounded or unbounded and densely defined, and S'*Γ^{-1}S' = S*A^{-1}S.

Proof: The proof of a) is contained in [4, proof of Lemma 7]. For Part b), note first that, by the hypotheses,

D(B^{-1/2}) = R(B^{1/2}) ⊇ R(A^{1/2}) ⊇ R(A) ⊇ R(S),

so S' = B^{-1/2}S is defined on R^L, and hence it is necessarily bounded. To show that R(S') ⊆ R(Γ), we argue as follows: for any b ∈ R^L, Sb = Ax for some x ∈ Z. Then S'b = B^{-1/2}Sb = B^{-1/2}Ax. Put y = B^{1/2}x, so x = B^{-1/2}y; then S'b = B^{-1/2}AB^{-1/2}y = Γy.

Part c) is immediate if one uses the facts that B^{-1/2}A^{1/2} and B^{-1/2}S are bounded and everywhere defined.

For Part d), since Γ is self-adjoint and N(Γ) = 0, Γ^{-1} exists with domain dense in Z and is self-adjoint. Then, S'*Γ^{-1}S' is defined on R^L, since R(S') ⊆ R(Γ) by Part b). To show that S'*Γ^{-1}S' = S*A^{-1}S, note first that x ∈ D(B^{1/2}A^{-1}B^{1/2}) implies that

B^{-1/2}AB^{-1/2}(B^{1/2}A^{-1}B^{1/2})x = x = Γ(B^{1/2}A^{-1}B^{1/2})x,  (C.1)

since B^{-1/2}AB^{-1/2} ⊆ Γ; thus

Γ^{-1} ⊇ B^{1/2}A^{-1}B^{1/2}.  (C.2)

Now, S*A^{-1}S is defined on R^L and, from (C.1) and (C.2) and the preceding remark,

S*A^{-1}S = S'*B^{1/2}A^{-1}B^{1/2}S' = S'*Γ^{-1}S',

since, as the development shows, R(S') ⊆ D(B^{1/2}A^{-1}B^{1/2}). □

In order to apply Theorem 2 to Problem 1''), note first that Lemmas 1, 2, 5, and 6 require only that the operator A in Problem 1') be bounded, positive-definite, and self-adjoint. Thus, there is no difficulty in replacing A with Γ as defined in Lemma 8. Lemmas 3 and 4 use the fact that A has a discrete spectrum, so if Γ does not have a discrete spectrum we cannot use Condition B) nor be sure that lim_{η→∞} V'(η) = V_*, where we are using the definitions

H'(η) = [S'*(I + ηΓ)^{-1}S']^{-1}S'*(I + ηΓ)^{-1},  (C.3)

V'(η) = tr[H'(η)ΓH'*(η)].  (C.4)

However, Condition B) is not essential, and since Ṽ_* = lim_{η→∞} V'(η) exists, V_* can be replaced by Ṽ_* in the problem statement. Theorem 2, as applied to Problem 1''), then becomes the following lemma.

Lemma 9: Assume the Hypotheses 1), 2), and 3) of Lemma 8. Consider Problem 1'') with γ > 0 and 0 < δ < V'(0) - Ṽ_*.

a) If V'(γ^{-1}) ≤ Ṽ_* + δ, there is a unique solution H'(η̄), where η̄ = γ^{-1}.

b) If V'(γ^{-1}) > Ṽ_* + δ and Condition A') (defined in Section V) is satisfied, there is a unique solution H'(η), where η = η(δ) (with respect to V'(η)).

Proof: Follows immediately from Theorem 2, Lemma 8, and the remarks following that lemma. □

Given some H'(η), H'(η)B^{-1/2} must be bounded in order to be admissible for Problem 1). When B^{-1/2} is unbounded, some further conditions on B seem to be required. The following lemma gives one set of sufficient conditions.

Lemma 10: If in addition to Conditions 1), 2), and 3) we have the following conditions:

4) B and A commute, and
5) R(S) ⊆ R(B) (with the existing conditions this is satisfied if R(A) ⊆ R(B)),

then H'(η)B^{-1/2} is bounded for all η > -1.

Proof: Since [S'*(I + ηΓ)^{-1}S']^{-1} is a well-defined linear operator on R^L, one needs only to show that S'*(I + ηΓ)^{-1}B^{-1/2} is bounded. First note that, if B commutes with A, then B and B^{1/2} commute with Γ. We have, by the commutativity, that B^{1/2}(I + ηΓ) is self-adjoint. Since it is bounded and has zero null-space for η > -1, [B^{1/2}(I + ηΓ)]^{-1} exists as a self-adjoint operator. On D(B^{-1/2}) it is given by

[B^{1/2}(I + ηΓ)]^{-1} = (I + ηΓ)^{-1}B^{-1/2},  (C.5)

where (I + ηΓ)^{-1} is bounded. Thus,

B^{-1/2}(I + ηΓ)^{-1} = (I + ηΓ)^{-1}B^{-1/2} on D(B^{-1/2}).  (C.6)

From (C.5) and (C.6),

B^{-1/2}(I + ηΓ)^{-1}S' = (I + ηΓ)^{-1}B^{-1/2}S' = (I + ηΓ)^{-1}B^{-1}S,  (C.7)

which, by Condition 5), is well defined (and hence, bounded) on R^L. The second expression in (C.7) is meaningful and the middle equality is valid because R(S') ⊆ D(B^{-1/2}). Thus, S'*(I + ηΓ)^{-1}B^{-1/2} is bounded. □

Lemma 11: Given A, S, B for a specific problem of the form 1), let H'(η) (for the appropriate η) be the unique solution to the associated problem of the form 1''); i.e., Γ and S' are as defined in Lemma 8, and γ and δ have the same values for both problems. If H'(η)B^{-1/2} is bounded, define H̃(η) to be the continuous extension to all of Z of H'(η)B^{-1/2}. Then H̃(η) is the unique solution to Problem 1).

Proof: Observe first that

H̃(η)B^{1/2} = H'(η)B^{-1/2}B^{1/2} = H'(η),

and, hence, that (H̃(η))' = H'(η). Now, for any H' = HB^{1/2}, it is a simple matter to verify that H'S' = HS, H'H'* = HBH*, and H'ΓH'* = HAH*. For example, the last equality may be verified as follows:

H'ΓH'* = HB^{1/2}ΓB^{1/2}H* = HB^{1/2}(B^{-1/2}A^{1/2})(B^{-1/2}A^{1/2})*B^{1/2}H* = HA^{1/2}A^{1/2}H*,

so that H'ΓH'* must equal HAH*. Thus H̃(η) is admissible for Problem 1), and the inequality constraint is satisfied with the same δ. Since the values of the object functions of the two problems are the same when H and H' are related by H' = HB^{1/2}, and since the set of admissible H for Problem 1) maps 1:1 into the set of admissible H' for Problem 1''), the assertion follows. □

The proof of Theorem 4 now follows immediately from Lemmas 9, 10, and 11.

REFERENCES

[1] F. J. Beutler and W. L. Root, "The operator pseudoinverse in control and systems identification," in Generalized Inverses and Applications, M. Z. Nashed, Ed. New York: Academic Press, 1976.

[2] C. T. Chen and S. A. Kassam, "Robust multiple-input matched filtering: Frequency and time-domain results," IEEE Trans. Inform. Theory, vol. IT-31, pp. 812-821, Nov. 1985.

[3] P. A. Kelly, "Robustness and stability in the detection and estimation of parameterized signals in noise," Ph.D. dissertation, Prog. in Computer, Inform. and Control Engin., The Univ. of Michigan, Ann Arbor, June 1985.

[4] P. A. Kelly and W. L. Root, "Stability in linear detection," IEEE Trans. Inform. Theory, vol. IT-33, pp. 36-46, Jan. 1987.

[5] B. T. Poljak and Ja. Z. Tsypkin, "Robust identification," Automatica, vol. 16, pp. 53-63, 1980.

[6] H. V. Poor, "Robust matched filters," IEEE Trans. Inform. Theory, vol. IT-29, pp. 677-687, Sept. 1983.

[7] S. Verdu and H. V. Poor, "On minimax robustness: A general approach and applications," IEEE Trans. Inform. Theory, vol. IT-30, pp. 328-340, Mar. 1984.