

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 44, NO. 2, APRIL 1995 475

The Extended-Least-Squares Treatment of Correlated Data

E. Richard Cohen and Valentin S. Tuninsky

Abstract: A generalization of the extended-least-squares algorithms for the case of correlated discrepant data is given. The expressions for linear, unbiased, minimum-variance estimators previously derived are reformulated. A posteriori estimates of the variance taking into account the inconsistency of all the experimental data have the same form as in the case of noncorrelated data. These estimates extend the previous improvement on the "traditional" Birge-ratio procedure to the case of correlated input data.

I. INTRODUCTION

The mathematical analysis of measurements of the fundamental constants is generally based on the concepts of the least-squares method of data adjustment [1]. An important aspect of the least-squares technique is the proper treatment of discrepant data. In addition to the simple deletion of the more inconsistent data, several algorithms have been introduced to account for inconsistencies by expanding the a priori variances of the input data. Among these have been the Birge ratio [2] for homogeneous data (data of a single 'type'), an analysis of variance with separate Birge ratios for each group of homogeneous data [3], the LCU (least change of the uncertainties) approach [4], and extended-least-squares estimates [5]. A numerical comparison of these algorithms was performed [6] on the data of several adjustments of fundamental constants. These approaches deal with information giving different levels of detail of the experimental data, and some of these approaches require more experimental detail than others. Following the recommendation of the Bureau International des Poids et Mesures [7], measurements should include, besides the measured value and an estimate $s_i^2$ of its variance $\sigma_i^2$, also the components $s_{i,k}^2$ of the estimate ($s_i^2 = \sum_k s_{i,k}^2$) and the degrees of freedom $\nu_{i,k}$ of those components.

The extended-least-squares (ELS) estimators [5] previously derived define a posteriori estimates of the unknown variables of the adjustment $\hat{x}_\alpha$, as well as the uncertainties $\hat\sigma_i^2$ to be assigned to the measured input quantities. Here we shall extend that discussion to include the case of correlated discrepant input data.

Manuscript received July 1, 1994; revised October 15, 1994. This work was supported in part by the National Institute of Standards and Technology under Service Order 43 NANB 41G964 and in part by a grant from the International Science Foundation.

E. R. Cohen is with the Rockwell International Science Center, Thousand Oaks, CA 91360 USA.

V. S. Tuninsky is with the D. I. Mendeleyev Institute of Metrology, St. Petersburg, 198005, Russia.

IEEE Log Number 9408711.

II. BASIC MODEL

The operational equations for the measurements are assumed to be reducible, by an appropriate choice of variables, to the form

$y_i - y_i^0 = \sum_{\alpha=1}^{M} A_{i\alpha}\,\xi_\alpha + \varepsilon_i, \qquad i = 1, \ldots, N$   (1)

where $Y_i$ are the 'true' values (the expectations, $\mathrm{E}\,y_i = Y_i$), $\xi_\alpha = x_\alpha - x_\alpha^0$ are the deviations of the unknowns from a set of initial values $x_\alpha^0$, $y_i^0 = Y_i(\ldots, x_\alpha^0, \ldots)$, and the matrix $A = (A_{i\alpha})$ is constant. The errors $\varepsilon_i$ are assumed to be correlated, with a variance-covariance matrix (or more succinctly, a "covariance matrix") $V = (v_{ik})$:

$\langle \varepsilon_i \rangle = 0, \qquad \langle \varepsilon_i \varepsilon_j \varepsilon_k \rangle = 0, \qquad \langle \varepsilon_i \varepsilon_k \rangle = v_{ik}$

$\langle \varepsilon_i \varepsilon_j \varepsilon_k \varepsilon_l \rangle = v_{ij} v_{kl} + v_{ik} v_{jl} + v_{il} v_{jk}.$   (2)
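The fourth-moment relation in (2) is the Gaussian moment identity, which can be checked by a quick simulation. The covariance values below are invented for illustration and are not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# Check <e_i e_j e_k e_l> = v_ij v_kl + v_ik v_jl + v_il v_jk
# for a correlated Gaussian vector with an invented covariance V.
V = np.array([[1.0, 0.4],
              [0.4, 2.0]])
e = rng.multivariate_normal([0.0, 0.0], V, size=1_000_000)

i, j, k, l = 0, 0, 1, 1
lhs = np.mean(e[:, i] * e[:, j] * e[:, k] * e[:, l])
rhs = V[i, j] * V[k, l] + V[i, k] * V[j, l] + V[i, l] * V[j, k]
print(lhs, rhs)  # both should be close to 1.0*2.0 + 0.4*0.4 + 0.4*0.4 = 2.32
```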

Correlations between input data of an adjustment arise, for example, when different input data are obtained using shared instrumentation, or using the same measurement standards or the same auxiliary constants having nonnegligible uncertainties [8]. A covariance matrix may be constructed in the form

$v_{ik} = \sum_{p=1}^{L} f^p_{ik}\,\sigma_p^2$   (3)

where the $f^p_{ik}$ are fixed coefficients and the $\sigma_p^2$ are independent contributions for which estimates $s_p^2$ and the confidence parameters $\nu_p$ are known:

$\langle s_p^2 \rangle = \sigma_p^2, \qquad \langle s_p^2\,\varepsilon_i \rangle = 0.$   (4)

When the a priori variance estimate $s_p^2$ is based on the observed variance of a sample drawn from a normal distribution, the variance of that estimate is inversely proportional to the degrees of freedom $\nu_p$; this may be generalized, and $\nu_p$ may be considered to be a parameter that defines the variance of the estimate. It therefore need not be restricted to integral values:

$D(s_p^2) = \langle (s_p^2 - \langle s_p^2 \rangle)^2 \rangle = 2\sigma_p^4/\nu_p.$   (5)
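As a concrete illustration of the construction (3), each datum can be assigned a set of independent error components, and a component shared between two data generates their off-diagonal covariance. The component layout and numbers below are invented for illustration:

```python
import numpy as np

# Three measured values built from four independent error components,
# following eq. (3): v_ik = sum_p f_ik^p sigma_p^2.
# Components 0-2 are private statistical errors; component 3 is a shared
# calibration error common to data 1 and 2, which induces their correlation.
# Here f_ik^p is encoded via sensitivity coefficients c[i, p], so that
# f_ik^p = c[i, p] * c[k, p].
c = np.array([
    [1.0, 0.0, 0.0, 0.0],   # datum 0: private component only
    [0.0, 1.0, 0.0, 1.0],   # datum 1: private + shared calibration
    [0.0, 0.0, 1.0, 1.0],   # datum 2: private + shared calibration
])
sigma2 = np.array([0.04, 0.01, 0.02, 0.03])  # component variances sigma_p^2

# Covariance matrix V = sum_p sigma_p^2 * outer(c[:, p], c[:, p])
V = sum(np.outer(c[:, p], c[:, p]) * sigma2[p] for p in range(len(sigma2)))

print(V)
# V[1, 2] equals the shared component variance 0.03, while
# V[0, 1] = V[0, 2] = 0 since datum 0 shares no components.
```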

0018-9456/95$04.00 © 1995 IEEE


In the case of uncorrelated data the covariance matrix is diagonal; the uncertainty of each measured value can be associated with a distinct component $\sigma_p^2$, and we have

$v_{ik} = \delta_{ik}\,\sigma_i^2, \qquad f^p_{ik} = \delta_{ip}\,\delta_{kp}$

and $L = N$. Equations (1)-(5) describe the model for which a generalization of the extended-least-squares estimators is to be obtained.

III. MAXIMUM LIKELIHOOD ESTIMATOR

The method of maximum likelihood requires a knowledge of the distribution laws of $\varepsilon_i$ and $s_p^2$. We assume a correlated normal distribution for $\varepsilon_i$ and an independent $\chi^2$ distribution for $s_p^2$ (which, when $\nu_p$ is an integer, can be considered as the "radial" part of a $\nu_p$-dimensional normal distribution for $s_p$):

$f(\varepsilon) = (2\pi)^{-N/2} (\det V)^{-1/2} \exp\!\left(-\tfrac{1}{2}\,\varepsilon^T w\,\varepsilon\right)$   (6)

where $^T$ denotes the transpose, $\det V$ is the determinant of the matrix $(v_{ik})$, and $w = V^{-1}$ is the weight matrix of the input data; and

$f(s_p^2;\nu_p) = \frac{(\nu_p/2\sigma_p^2)^{\nu_p/2}}{\Gamma(\nu_p/2)}\,(s_p^2)^{\nu_p/2-1}\exp\!\left(-\frac{\nu_p s_p^2}{2\sigma_p^2}\right).$   (7)

This distribution $f(s_p^2;\nu_p)$ should be considered as simply a convenient model for the distribution of $s_p^2$, with $\nu_p$ as a parameter.

From (3), (6), and (7), the probability density $g$ and the likelihood function $\mathcal{L}$ to which the maximum likelihood condition is to be applied are

$g = f(\varepsilon)\prod_p f(s_p^2;\nu_p), \qquad \mathcal{L} = -2\ln g.$   (8)

Because the likelihood function in (8) is not separable with respect to the unknowns $(\xi_\alpha, \sigma_p^2)$, it is impossible to obtain optimum sufficient estimators [9] and there is no uniquely "best" solution to the estimation problem.

Minimizing $\mathcal{L}$ with respect to $\xi_\alpha$, $\partial\mathcal{L}/\partial\xi_\alpha = 0$, gives the well-known estimate of general least squares:

$\hat{x} = x^0 + C(y - y^0), \qquad C = (A^T w A)^{-1} A^T w$   (9)

$\hat{y} = y^0 + AC(y - y^0)$   (10)

$ACA = A, \qquad CAC = C$   (11)
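The estimators (9)-(11) can be sketched numerically; the toy design matrix and data below are invented for illustration. The assertions verify the idempotency relations (11):

```python
import numpy as np

# Toy linear model y - y0 = A xi + eps with N = 4 data, M = 2 unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])
V = np.diag([0.1, 0.2, 0.1, 0.3])   # input covariance (uncorrelated here)
w = np.linalg.inv(V)                # weight matrix w = V^{-1}

y0 = np.zeros(4)
y = np.array([1.1, 1.9, 3.2, -0.8])  # synthetic measurements

# Eq. (9)-(10): C = (A^T w A)^{-1} A^T w, xhat = x0 + C (y - y0)
C = np.linalg.inv(A.T @ w @ A) @ A.T @ w
xhat = C @ (y - y0)
yhat = y0 + A @ C @ (y - y0)

# Eq. (11): ACA = A and CAC = C
assert np.allclose(A @ C @ A, A)
assert np.allclose(C @ A @ C, C)
print(xhat)
```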

while minimization with respect to $\sigma_p^2$, $\partial\mathcal{L}/\partial\sigma_p^2 = 0$, using the partial-derivative relations

$\frac{\partial \ln(\det V)}{\partial \sigma_p^2} = \mathrm{Tr}(w f^p), \qquad \frac{\partial w}{\partial \sigma_p^2} = -w f^p w$

gives

$\varepsilon^T w f^p w\,\varepsilon - \mathrm{Tr}(w f^p) + \nu_p\,[s_p^2 - \sigma_p^2]/\sigma_p^4 = 0.$   (12)

Multiplying this equation by $\sigma_p^2$, summing over all values of $p$, and using (3) and $V = w^{-1}$ gives

$\varepsilon^T w\,\varepsilon = N + \sum_p \nu_p (1 - s_p^2/\sigma_p^2).$

This result is a verification of the consistency of the maximum likelihood condition: taking expectations over $\varepsilon_i$, using $\langle \varepsilon_j \varepsilon_k \rangle = v_{jk} = (w^{-1})_{jk}$, gives

$\langle \varepsilon^T w\,\varepsilon \rangle = \mathrm{Tr}(I) = N \quad \text{and} \quad \sum_p \nu_p (1 - s_p^2/\sigma_p^2) = 0$

while taking expectations over $s_p^2$ gives

$\langle s_p^2 \rangle = \sigma_p^2 \quad \text{and} \quad \varepsilon^T w\,\varepsilon = N.$
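The expectation $\langle \varepsilon^T w\,\varepsilon \rangle = N$ can also be confirmed by simulation. The covariance matrix below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw correlated errors with a fixed covariance V and check that the
# weighted quadratic form eps^T w eps has expectation N.
V = np.array([[1.0, 0.3, 0.0],
              [0.3, 0.5, 0.1],
              [0.0, 0.1, 2.0]])
w = np.linalg.inv(V)
N = V.shape[0]

eps = rng.multivariate_normal(np.zeros(N), V, size=200_000)
quad = np.einsum('ij,jk,ik->i', eps, w, eps)  # eps^T w eps per sample

print(quad.mean())  # should be close to N = 3
```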

Using the estimates $\hat{x}$ and $\hat{y}$ from (9) and (10) gives

$\varepsilon \equiv y - Y = (y - \hat{y}) + (\hat{y} - Y) = r + \hat{y} - Y = (I - AC)(y - y^0) + A(\hat{x} - x)$

and one obtains

$\varepsilon^T w f^p w\,\varepsilon = r^T w f^p w\,r + \Delta.$

Although the expectation of $\hat{x}_\alpha - x_\alpha$ vanishes and the "best estimate" of $Y_i$ is $\hat{y}_i$, the expectation of $\Delta$ is not zero. If the term $\Delta$ is neglected a bias will be introduced into the evaluation. Since

$\langle \varepsilon_j \varepsilon_k \rangle = v_{jk} \quad \text{and} \quad \langle r_j r_k \rangle = [V - ACV]_{jk}$

we introduce the matrix $R_{jk}$ in order to compensate for this bias:

$R_{jk} = \frac{r_j\, v_{jk}\, r_k}{v_{jk} - \mathcal{V}_{jk}} = \frac{r_j\, r_k}{1 - \mathrm{Cov}(\hat{y}_j, \hat{y}_k)/v_{jk}}$

where $(A^T w A)^{-1}$ is the covariance matrix of the fitted quantities $\hat{x}_\alpha$, and $\mathcal{V}$ is the covariance matrix of the adjusted input data $\hat{y}_i$:

$\mathrm{Cov}(\hat{y}_j, \hat{y}_k) = \mathcal{V}_{jk}, \qquad \mathcal{V} = A (A^T w A)^{-1} A^T.$

The maximum-likelihood estimate for the uncertainty component $\sigma_p^2$, from (9), (10), and (12), becomes

$[\mathrm{Tr}(w f^p w R) - \mathrm{Tr}(w f^p)] + \nu_p\,[s_p^2 - \hat\sigma_p^2]/\hat\sigma_p^4 = 0$

or, in a possibly better form for iteration (since the weight matrix $w$ must be evaluated using the estimates $\hat\sigma_p^2$),

$\hat\sigma_p^2 = \frac{\nu_p s_p^2 + \hat\sigma_p^4\,\mathrm{Tr}(w f^p w R)}{\nu_p + \hat\sigma_p^2\,\mathrm{Tr}(w f^p)}$   (13)

$V = \sum_p f^p \hat\sigma_p^2, \qquad w = V^{-1}$

$C = (A^T w A)^{-1} A^T w, \qquad r = (I - AC)(y - y^0).$

For uncorrelated input data (i.e., for $f^p_{ik} = \delta_{ip}\delta_{kp}$) this expression reduces to the form

$\hat\sigma_p^2 = \frac{\nu_p s_p^2 + R_{pp}}{\nu_p + 1}.$   (14)

The results (13) and (14) are not the only possible ones. Instead of introducing the matrix $R_{jk}$ to compensate for the bias caused by neglecting $\Delta$ in (12), an alternative form may be found by introducing the expectation of $\Delta$ into that equation; this yields

$r^T w f^p w\,r + \mathrm{Tr}(w f^p w \mathcal{V}) - \mathrm{Tr}(w f^p) = \nu_p\,[\hat\sigma_p^2 - s_p^2]/\hat\sigma_p^4.$   (15)

For uncorrelated input data this reduces to

$\hat\sigma_p^2 = \frac{1}{\nu_p + 1}\,[\nu_p s_p^2 + r_p^2 + \mathcal{V}_{pp}].$   (16)

The expressions (13) and (15) are equally valid estimators in the sense that the two have the same expectation, and the same may be said also for expressions (14) and (16).

IV. MINIMUM VARIANCE ESTIMATOR

The minimum variance estimator requires only the lower moments of $\varepsilon_i$ and $s_p^2$, (2) and (5), rather than a detailed knowledge of the distributions [5].

The linear, unbiased, minimum-variance estimator is to be obtained in the form of a linear combination of the available information pertaining to the estimand; this leads again to (9) and (10) for the adjusted values $\hat{x}_\alpha$ and $\hat{y}_i$, with the estimate of the uncertainty $\sigma_p^2$ given by

$\hat\sigma_p^2 = a_p s_p^2 + (y - \hat{y})^T B^p (y - \hat{y}) = a_p s_p^2 + r^T B^p r$   (17)

where the coefficient $a_p$ and the $N \times N$ matrix $B^p$ are to be determined. In order that the expectation of the estimate $\hat\sigma_p^2$ equals the true estimand, $\langle \hat\sigma_p^2 \rangle = \sigma_p^2$, the coefficient $a_p$ must satisfy

$a_p = 1 - \mathrm{Tr}(B^p K)/\sigma_p^2$   (18)

where

$K = \langle r r^T \rangle = (I - AC)V = V - \mathcal{V}$

$Kw = I - AC, \qquad KwK = K$

$r = Kw(y - y^0), \qquad ACK = \mathcal{V}wK = 0.$

Introducing the notation

$T_2^p = \mathrm{Tr}(B^p K), \qquad T_4^p = \mathrm{Tr}(B^p K B^p K)$

permits the variance $D(\hat\sigma_p^2)$ to be written in the form

$D(\hat\sigma_p^2) = \frac{2\sigma_p^4}{\nu_p}\left(1 - \frac{T_2^p}{\sigma_p^2}\right)^2 + 2\,T_4^p.$   (19)

The condition that the variance should be minimal then leads to the equation

$\nu_p\, K B^p K = [\sigma_p^2 - T_2^p]\,K.$   (20)

Multiply (20), in turn, by $w$ and by $B^p$ and take the traces to obtain the two equations

$\nu_p T_2^p = [\sigma_p^2 - T_2^p]\,\nu, \qquad \nu = \mathrm{Tr}(I - AC) = N - M$

$\nu_p T_4^p = [\sigma_p^2 - T_2^p]\,T_2^p$

from which the values of $T_2^p$ and $T_4^p$ are obtained:

$T_2^p = \frac{\nu\,\sigma_p^2}{\nu_p + \nu}, \qquad T_4^p = \frac{\nu\,\sigma_p^4}{(\nu_p + \nu)^2}.$   (21)

The minimal variance is then given by putting the results (21) into (19):

$D(\hat\sigma_p^2) = \frac{2\sigma_p^4}{\nu_p + \nu}.$

The variance of the estimate $\sigma_p^2$ has been decreased by the factor $\nu_p/(\nu_p + \nu)$; the equivalent degrees of freedom for the estimate have been increased from $\nu_p$ to $\nu_p + \nu$, showing the full efficiency of the estimator. From (18) and (21) the coefficient becomes

$a_p = \frac{\nu_p}{\nu_p + \nu}.$   (22)

Because $K$ is singular it has no inverse, and one cannot solve (20) for $B^p$. We can, however, evaluate the quantity $r^T B^p r$ without the need to evaluate $B^p$ explicitly:

$r^T B^p r = [(I - AC)(y - y^0)]^T B^p (I - AC)(y - y^0) = (y - y^0)^T (I - w\mathcal{V}) B^p (I - \mathcal{V}w)(y - y^0).$

Using (18), (20), and (21) we have

$r^T B^p r = (y - y^0)^T w K B^p K w (y - y^0) = \frac{\sigma_p^2}{\nu_p + \nu}\,(y - y^0)^T w K w (y - y^0).$   (23)

Substituting (22) and (23) into (17) gives the final form of the estimate

$\hat\sigma_p^2 = \frac{\nu_p s_p^2 + \hat\sigma_p^2\,\chi^2}{\nu_p + \nu}, \qquad \chi^2 = r^T w\,r = (y - y^0)^T w K w (y - y^0).$   (24)
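As a sketch of how the implicit estimate (24) might be iterated in practice (the function name, the convergence criterion, and the two-measurement example are invented; the paper prescribes only the fixed-point equation): start from $\hat\sigma_p^{2(0)} = s_p^2$, rebuild $V$, $w$, and $r$ from (3) and (9) at each pass, and update each $\hat\sigma_p^2$ until convergence.

```python
import numpy as np

def els_iterate(y, y0, A, f, s2, nu_p, tol=1e-12, max_iter=200):
    """Fixed-point iteration for eq. (24):
       sigma_p^2 <- (nu_p s_p^2 + sigma_p^2 chi^2) / (nu_p + nu),
       with chi^2 = r^T w r and nu = N - M, recomputing w from (3) each pass."""
    N, M = A.shape
    nu = N - M
    sig2 = s2.astype(float)                 # sigma_p^{2(0)} = s_p^2
    for _ in range(max_iter):
        V = sum(f[p] * sig2[p] for p in range(len(sig2)))   # eq. (3)
        w = np.linalg.inv(V)
        C = np.linalg.inv(A.T @ w @ A) @ A.T @ w            # eq. (9)
        r = (np.eye(N) - A @ C) @ (y - y0)                  # residuals
        chi2 = r @ w @ r
        new = (nu_p * s2 + sig2 * chi2) / (nu_p + nu)       # eq. (24)
        if np.max(np.abs(new - sig2)) < tol:
            return new, chi2
        sig2 = new
    return sig2, chi2

# Invented example: two discrepant measurements of a single quantity,
# uncorrelated (f^p_ik = delta_ip delta_kp), each with nu_p = 10.
A = np.array([[1.0], [1.0]])
f = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
s2 = np.array([1.0, 1.0])
nu_p = np.array([10.0, 10.0])
y0 = np.zeros(2)
y = np.array([0.0, 4.0])    # a 4-sigma discrepancy

sig2_hat, chi2 = els_iterate(y, y0, A, f, s2, nu_p)
print(sig2_hat, chi2)       # both a priori variances are expanded
```

In this symmetric case the iteration converges to $\hat\sigma_p^2 = 18/11 \approx 1.64$: the discrepancy inflates both variances, but far less drastically than the raw Birge-ratio factor $\chi^2/\nu$ would.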


and the covariance matrix of the input data is

$\hat{V} = \sum_p f^p\,\hat\sigma_p^2.$

The estimate (24) for correlated data has the same form as for uncorrelated data, but the expression

$\chi^2 = r^T w\,r = \sum_{j,k} r_j\, w_{jk}\, r_k$

appropriately takes into account the correlations between data.

V. REALIZATION OF THE ESTIMATORS

The effect of the application of (24) becomes most apparent when the initial approximation of the iteration is considered ($\hat\sigma_p^{2(0)} = s_p^2$), so that (24) has the form

$\hat\sigma_p^{2(1)} = s_p^2\,\frac{\nu_p + \chi^2}{\nu_p + \nu}.$

From this it can be seen that the more reliable a priori estimates of the set $\{s_p^2\}$ (those with large $\nu_p$) will be changed by factors closer to 1 than will the less reliable a priori estimates (low $\nu_p$). The amount of the changes will depend on the total inconsistency of the data. The more the inconsistency ($\chi^2 \gg \nu$), the more all the uncertainties will be increased. For consistent data ($\chi^2 \approx \nu$) the factor is approximately unity, and there is little change in the uncertainty, while for $\chi^2 < \nu$ the a priori uncertainties will be decreased.

Equations (13), (15), and (24) are implicit equations for $\hat\sigma_p^2$ that are to be solved by iteration. Equations (14) and (24) have previously been used in the analysis of uncorrelated data [6], [10]. Since the estimator (24) has the same form as for uncorrelated data, the detailed discussion [5] of its application is valid as well for the case of correlated data.

VI. CONCLUSIONS

The importance of having available methods for the objective correction of the a priori estimates of variances of the input data in an analysis of their inconsistency increases when the data are correlated, because of the difficulty in tracing the relations among the original independently measured quantities that define the input values and their uncertainties and covariances.

The approach described has the same interpretation as given for the case of uncorrelated data [5], gives improved estimates for the contributions to the covariance matrix of (3), and the a posteriori estimate of correlations between input values $y_i$ can be calculated. The primary result is the expression (24), $\hat\sigma_p^2 = [\nu_p s_p^2 + \hat\sigma_p^2\,\chi^2]/(\nu_p + \nu)$, which replaces the simple Birge ratio ($\hat\sigma_p^2 = s_p^2\,\chi^2/\nu$) for both correlated and uncorrelated inhomogeneous data (data of different 'types').

REFERENCES

[1] E. R. Cohen, K. M. Crowe, and J. W. M. DuMond, The Fundamental Constants of Physics. New York: Interscience, 1957.
[2] R. T. Birge, "The calculation of errors by the method of least squares," Phys. Rev., vol. 40, pp. 207-227, 1932.
[3] E. R. Cohen and B. N. Taylor, "The 1973 least-squares adjustment of the fundamental constants," J. Phys. Chem. Ref. Data, vol. 2, pp. 663-734, 1973.
[4] V. S. Tuninsky and V. M. Kholin, "Concerning changes in the methods for adjusting the physical constants," Metrologiya, no. 8, pp. 3-11, Aug. 1975.
[5] E. R. Cohen, "An extended-least-squares treatment of discrepant data," in B. N. Taylor and W. D. Phillips, Eds., Precision Measurement and Fundamental Constants II, Washington, DC, 1984, pp. 391-395.
[6] B. N. Taylor, "Numerical comparisons of several algorithms for treating inconsistent data in a least-squares adjustment of the fundamental constants," Nat. Bureau Standards, Gaithersburg, MD, Rep. NBSIR 81-2426, Jan. 1982.
[7] P. Giacomo, "On the expression of uncertainties," in P. H. Cutler and A. A. Lucas, Eds., Quantum Metrology and Fundamental Physical Constants, Ser. B, vol. 98. New York: Plenum, 1983, pp. 623-629.
[8] V. S. Tuninsky, "The consideration of correlations among the data from the measurements of the constants," in Europ. Scientific Metrolog. Conf., 150th Ann. D. I. Mendeleyev Inst. Metrology, Abstracts, St. Petersburg, Russia, Sept. 1992, pp. 23-24.
[9] M. H. Quenouille, Fundamentals of Statistical Reasoning. London: Griffin, 1958, pp. 56-59.
[10] E. R. Cohen and B. N. Taylor, "The 1986 adjustment of the fundamental physical constants," Rev. Mod. Phys., vol. 59, pp. 1121-1148, Oct. 1987.