
PHYSICAL REVIEW D, VOLUME 49, NUMBER 11, 1 JUNE 1994

Undesirable effects of covariance matrix techniques for error analysis

David Seibert* Theory Division, CERN, CH-1211 Geneva 23, Switzerland

(Received 14 May 1993)

Regression with χ² constructed from covariance matrices should not be used for some combinations of covariance matrices and fitting functions. Using the technique for unsuitable combinations can amplify systematic errors. This amplification is uncontrolled, and can produce arbitrarily inaccurate results that might not be ruled out by a χ² test. In addition, this technique can give incorrect (artificially small) errors for fit parameters. I give a test for this instability and a more robust (but computationally more intensive) method for fitting correlated data.

PACS number(s): 06.50.-x, 02.50.Sk, 11.15.Ha

Recently there has been some interest in the analysis of correlated data, and people seeking more sophisticated analysis techniques have often performed regression, using the covariance matrix of the observations to construct χ² [1]. DeGrand [2], DeTar and Kogut [3], and Gottlieb et al. [4] use the technique to analyze lattice gauge theory results, while Abreu et al. [5] and Wosiek [6] use it to analyze scaled factorial moment data. I show here that this analysis technique can amplify systematic errors, unlike simpler, more robust techniques.

This technique is simple in principle: transform the data to an uncorrelated basis, use regression to fit the data in this basis, then transform back to the laboratory frame. However, some of the results obtained by this procedure are very odd. In particular, Gottlieb et al. [4], Toussaint [7], and Wosiek [6] find that this procedure can produce best-fit lines that fall below all data points, and even below all error bars.

In this paper, I first discuss the proposed treatment of correlated data, and show that in a gedanken experiment without systematic errors this treatment produces exactly the desired results. In a very similar gedanken experiment with arbitrarily small systematic errors, this procedure amplifies the errors in the data; therefore, this treatment of data is not robust. I use these simple gedanken experiments instead of the scaled factorial moment data or the lattice gauge theory results for purposes of presentation, as the effect observed in the different data sets is qualitatively the same. I then give a more robust alternative procedure for fitting correlated data, and in the course of this discussion a test for the stability of the regression is shown.

The experimental procedure is very simple. Consider N trial measurements of I data points Y_{in}. Calculate the covariance matrix from these data:

    C_{ij} = \frac{1}{N-1} \sum_{n=1}^{N} (Y_{in} - y_i)(Y_{jn} - y_j) ,    (1)

*Current address: Physics Department, Kent State University, Kent, OH 44242. Electronic address (internet): seibert@scorpio.kent.edu

where

    y_i = \frac{1}{N} \sum_{n=1}^{N} Y_{in} .    (2)

Fit the data with the curve f_i(\{a\}), where \{a\} is the set of free parameters, by minimizing

    \chi^2 = (N-1) \sum_{i,j=1}^{I} [y_i - f_i(\{a\})] (C^{-1})_{ij} [y_j - f_j(\{a\})] .    (3)

I illustrate this procedure with a gedanken experiment to measure the mean voltage of a generator that produces random voltages v with probability distribution p(v). The generator charges a capacitor, and I then measure each voltage v_n on the capacitor twice, calling the measurements Y_{1n} and Y_{2n}, respectively. I model measurement noise by assuming that

    Y_{1(2)n} = v_n + \delta v_{1(2)n} ,    (4)

where \delta v_{1(2)n} is random with probability distribution q_{1(2)}(\delta v) and zero mean.
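As a quick numerical sanity check of this noise model (a sketch; the parameter values and variable names below are mine, not the paper's), the sample covariance of the two readings should approach σ² in every entry, plus the noise variances e_1², e_2² on the diagonal:

```python
import numpy as np

# Gedanken experiment: one random voltage per trial, read twice with
# independent noise. Parameter values are illustrative only.
rng = np.random.default_rng(2)
N, sigma, e1, e2 = 200_000, 1.0, 0.3, 0.5
v = rng.normal(0.0, sigma, N)                       # random voltages v_n
Y = np.stack([v + rng.normal(0.0, e1, N),           # first readings  Y_1n
              v + rng.normal(0.0, e2, N)], axis=1)  # second readings Y_2n
C = np.cov(Y, rowvar=False)                         # sample covariance
expected = np.array([[sigma**2 + e1**2, sigma**2],
                     [sigma**2,         sigma**2 + e2**2]])
print(np.abs(C - expected).max())  # small: sampling error only
```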

After N trials, the expectation value of the experimentally determined covariance matrix is

    \langle C_{ij} \rangle = \Sigma_{ij} + \Sigma^{(1)}_{ij} + \Sigma^{(2)}_{ij} ,    (5)

where

    \Sigma_{ij} = \sigma^2 \quad (\text{the variance of } p(v))    (6)

is the contribution to the covariance matrix from the distribution of random voltages, and

    \Sigma^{(1(2))}_{ij} = e_{1(2)}^2\, \delta_{i,1(2)}\, \delta_{j,1(2)} ,    (7)

with e_{1(2)}^2 the variance of q_{1(2)}, is the contribution of noise from the set of first (second) measurements. Fitting to the function f_1 = f_2 = V, χ² is minimized when

    V = \frac{e_2^2\, y_1 + e_1^2\, y_2}{e_1^2 + e_2^2} .    (8)

© 1994 The American Physical Society

49 BRIEF REPORTS 6241

It is clear from Eq. (8) that V is the average of y_1 and y_2, properly weighted for measurement error, and so the analysis procedure is very successful at fitting the curve for this gedanken experiment.
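The constant fit is easy to carry out numerically. The sketch below (variable names and numbers are mine) minimizes χ² for f_1 = f_2 = V in closed form, using the expectation value of the covariance matrix, and checks that the result is the noise-weighted average of y_1 and y_2 described above:

```python
import numpy as np

# Minimizing chi^2 for a constant fit f_i = V gives the closed form
#   V = sum_ij (Cinv)_ij y_j / sum_ij (Cinv)_ij ,
# which for this covariance matrix reduces to a weighted average.
sigma2, e1, e2 = 4.0, 0.3, 0.5          # illustrative values
y = np.array([5.1, 4.9])                # per-point means y_1, y_2
C = np.array([[sigma2 + e1**2, sigma2],
              [sigma2,         sigma2 + e2**2]])
Cinv = np.linalg.inv(C)
V_fit = Cinv.sum(axis=0) @ y / Cinv.sum()
V_avg = (e2**2 * y[0] + e1**2 * y[1]) / (e1**2 + e2**2)
print(V_fit, V_avg)  # the two agree
```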

The experimental error is given by

    \sigma_V^2 = 2 \left( \frac{\partial^2 \chi^2}{\partial V^2} \right)^{-1} .    (9)

Taking χ² from Eq. (3) and C from Eq. (5) gives

    \sigma_V^2 = \frac{\sigma^2 (e_1^2 + e_2^2) + e_1^2 e_2^2}{(N-1)(e_1^2 + e_2^2)} .    (10)

Again, this technique works well, clearly giving the correct error in the cases σ = 0 and σ → ∞.

Now, I modify the gedanken experiment slightly, by assuming that the capacitor discharges somewhat between the two measurements. I therefore assume that the first measurement is unchanged, but that each v_n is reduced by a factor γ at the time of the second measurement. I could alternatively assume that the scales of the voltmeters are slightly different, but I wish to have all systematic effects occur before measurement rather than during it. In this case, the experimentally determined covariance matrix is

    \langle C \rangle = \begin{pmatrix} \sigma^2 + e_1^2 & \gamma \sigma^2 \\ \gamma \sigma^2 & \gamma^2 \sigma^2 + e_2^2 \end{pmatrix} .    (11)

After an infinite number of measurements y_1 = \bar{v} and y_2 = \gamma \bar{v}, where \bar{v} = \int dv\, p(v)\, v is the mean value of the random voltage. Fitting again to the function f_1 = f_2 = V, χ² is minimized when

    V = \frac{[\gamma(\gamma - 1)\sigma^2 + e_2^2]\, y_1 + [e_1^2 - (\gamma - 1)\sigma^2]\, y_2}{(\gamma - 1)^2 \sigma^2 + e_1^2 + e_2^2} .

The estimator is biased for this second experiment, as V is always less than \bar{v}. This is not totally unexpected, because the discharge of the capacitor between the measurements gives a systematically lower value of y_2. If γ = 1, as in the first experiment, the bias of the estimator tends to zero. For any other value of γ, however, V can have any value between zero and \bar{v}. By contrast, a naive least-squares fit always yields a value for V between \gamma\bar{v} and \bar{v}. Thus, the covariance matrix technique can produce large estimator bias from arbitrarily small intrinsic systematic errors.

One might think that χ² should be large whenever the fit is very bad (V ≪ \bar{v}). However, this is not the case if the sample size is too small. In the limit σ → ∞, where the fit is the worst,

    \chi^2 \simeq (N - 1) \left( \frac{\bar{v}}{\sigma} \right)^2 .

Thus, χ² will be acceptably small whenever N ≲ (σ/\bar{v})², so that an infinite number of events may be required to rule out the worst fits. One might then expect that, if χ² is acceptably small, the error in V will be large enough that V is within a few standard deviations of \bar{v}. However, in the limit (γ - 1)σ ≫ e_1, e_2,

    \sigma_V^2 \simeq \frac{\gamma^2 e_1^2 + e_2^2}{(N - 1)(\gamma - 1)^2} .

Thus, it is quite possible to have simultaneously V ≪ \bar{v}, χ² small, and (V - \bar{v})^2 ≫ \sigma_V^2.

Now I try a more robust technique, constructing the best estimator by minimizing the variance in

    V = a y_1 + (1 - a) y_2 .
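The discharge experiment is easy to simulate. In the sketch below (all parameter values are mine), γ = 0.9 and the noise errors are tiny, yet the χ²-minimizing constant falls far below both measured averages, illustrating the amplified bias:

```python
import numpy as np

# Second gedanken experiment: the capacitor discharges by a factor
# gamma between the two readings. Parameters are illustrative.
rng = np.random.default_rng(1)
N, vbar, sigma, gamma, e = 50, 10.0, 2.0, 0.9, 0.01
v = rng.normal(vbar, sigma, N)
Y = np.stack([v + rng.normal(0.0, e, N),            # Y_1n
              gamma * v + rng.normal(0.0, e, N)],   # Y_2n (discharged)
             axis=1)
y = Y.mean(axis=0)
C = np.cov(Y, rowvar=False)
Cinv = np.linalg.inv(C)
V = Cinv.sum(axis=0) @ y / Cinv.sum()
print(y, V)  # V lies far below both y_1 and y_2
```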

The variance is

    \sigma_V^2 = \frac{a^2 (\sigma^2 + e_1^2) + 2 a (1 - a) \gamma \sigma^2 + (1 - a)^2 (\gamma^2 \sigma^2 + e_2^2)}{N - 1} .

The condition d\sigma_V^2 / da = 0 then gives

    a = \frac{\gamma(\gamma - 1)\sigma^2 + e_2^2}{(\gamma - 1)^2 \sigma^2 + e_1^2 + e_2^2} ,

and the minimized variance is

    \sigma_V^2 = \frac{(\gamma e_1^2 + e_2^2)^2 \sigma^2 + [\gamma(\gamma - 1)\sigma^2 + e_2^2]^2 e_1^2 + [(\gamma - 1)\sigma^2 - e_1^2]^2 e_2^2}{(N - 1)[(\gamma - 1)^2 \sigma^2 + e_1^2 + e_2^2]^2} .


The value of V is the same for the two techniques, and \sigma_V^2 is the same when γ = 1. In the limit e_1, e_2 → 0 I find

    \sigma_V^2 \simeq \frac{\gamma^2 e_1^2 + e_2^2}{(N - 1)(\gamma - 1)^2} ,

which is identical to the result obtained from regression. Thus, the techniques are almost the same. However, the best estimator technique is more transparent, and the cause of the instability is more easily recognized and corrected with this technique.

In the previous analysis I left out a condition: a and (1 - a) must both be non-negative (the fit voltage must not be anticorrelated with either of the measured voltages). In this case, the solution (21) is only valid when

    \gamma(\gamma - 1)\sigma^2 + e_2^2 \ge 0 \quad \text{and} \quad e_1^2 - (\gamma - 1)\sigma^2 \ge 0 .

Applying this condition, \sigma_V^2 is minimized with a = 0 in the limit e_1, e_2 → 0. I then obtain

    V = y_2 , \qquad \sigma_V^2 = \frac{\gamma^2 \sigma^2}{N - 1} .

Thus, the systematic error is not amplified with this procedure, and the estimate of \sigma_V^2 is not artificially small.

The crucial point is the non-negativity of a and 1 - a. Mathematically, this can be written as

    \frac{\partial f_i}{\partial y_i} \ge 0 \quad \text{for all } i .

This general requirement for a stable fit is that, given a perturbation in the data, the function does not move locally against the direction of the perturbation. It is intuitively obvious, though I am not sure whether it has been rigorously demonstrated. For the gedanken experiment this is equivalent to the requirement that a and 1 - a are non-negative, as these are the partial derivatives for the two points.

The partial derivative is calculated as follows. The general fitting condition of minimizing χ² can be written as

    \sum_{i,j} \frac{\partial f_i}{\partial a_k} (C^{-1})_{ij} [y_j - f_j(\{a\})] = 0 ,

where \{a\} is the set of fitting parameters. If y_i → y_i + dy_i, we must have now

    \sum_{i,j} \frac{\partial f_i}{\partial a_k} (C^{-1})_{ij} \Big[ dy_j - \sum_l \frac{\partial f_j}{\partial a_l}\, da_l \Big] = 0 ,

dropping second derivatives of the fitting function. This can be written more compactly in matrix form:

    F^T C^{-1} (dy - F\, da) = 0 , \quad \text{where } F_{ik} = \partial f_i / \partial a_k .

Finally, I obtain

    da = (F^T C^{-1} F)^{-1} F^T C^{-1}\, dy ,

and the partial derivative is

    \frac{\partial f_i}{\partial y_j} = \big[ F (F^T C^{-1} F)^{-1} F^T C^{-1} \big]_{ij} .

For the fit to a constant V, \partial f_i / \partial V = 1, so

    \frac{\partial V}{\partial y_j} = \frac{\sum_i (C^{-1})_{ij}}{\sum_{i,k} (C^{-1})_{ik}} .

The denominator is never negative, as it is equal to a sum of eigenvalues of C^{-1} (with all weights non-negative), and all eigenvalues of C^{-1} are non-negative. Thus, the stability condition for this regression is

    \sum_i (C^{-1})_{ij} \ge 0 \quad \text{for all } j ,

which is trivially satisfied for uncorrelated data (C is then diagonal). If this condition is violated, then the best estimator should be used instead of the regression, to obtain the variance in the fit parameters.

The best estimator technique can also be used to fit lines and more complicated curves to data. For a line, first fit y = ax + b to all independent sets of points ij to obtain

    a_{ij} = \frac{y_i - y_j}{x_i - x_j} , \qquad b_{ij} = y_i - a_{ij} x_i .

Then construct linear estimators for the quantities a and b:

    a = \sum_{ij} \alpha_{ij}\, a_{ij} , \qquad b = \sum_{ij} \beta_{ij}\, b_{ij} ,


with the constraints

    \sum_{ij} \alpha_{ij} = \sum_{ij} \beta_{ij} = 1 .

Finally, minimize the variance in a and b, to obtain the values and variances of both, but with the conditions

    \alpha_{ij} \ge 0 , \qquad \beta_{ij} \ge 0 .

In general this procedure is not worth the effort required unless the fit fails the stability test, as the fit parameters obtained are identical to those obtained with regression.
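For a constant fit, the stability test described above amounts to checking that no data point enters the fitted value with negative weight, i.e. that every column sum of C⁻¹ is non-negative. A minimal sketch (the function name and example matrices are mine):

```python
import numpy as np

def regression_is_stable(C):
    """Stability test for a constant fit: every point's weight in the
    chi^2-minimizing constant, sum_i (Cinv)_ij, must be non-negative."""
    w = np.linalg.inv(C).sum(axis=0)
    return bool(np.all(w >= 0.0))

# Uncorrelated data (diagonal C) is always stable.
print(regression_is_stable(np.diag([1.0, 2.0])))   # True

# A strongly correlated pair can give one point negative weight,
# signalling the instability discussed in the text.
C_bad = np.array([[1.0, 1.9],
                  [1.9, 4.0]])
print(regression_is_stable(C_bad))                 # False
```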

I have shown that covariance matrix regression should be supplemented by a test for the stability of the regression. When the regression is unstable, the fit parameters can be altered in an uncontrolled fashion. These alterations can sometimes be ruled out by a χ² test; however, if the data set is small and fluctuations are large, χ² can remain arbitrarily small while the apparent errors in fit parameters are much smaller than the difference between their apparent values and the best estimators for these values. This instability can be present both for measured and constructed covariance matrices.

The alternative to using covariance matrix regression is to fit all possible sets of points (as many points per set as there are fit parameters) to obtain all possible linearly independent sets of the fit parameters, and use the linear combinations of the values obtained in this way (with no negative multipliers) that have the lowest variances as the best estimators of the fit parameters. This is computationally more cumbersome, but is the more rigorous procedure, so it should be used whenever covariance matrix regression is found to be unstable.
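For the two-point constant fit, this best-estimator procedure can be sketched as follows (function and variable names are mine): minimize the variance of V = a·y_1 + (1 − a)·y_2 over a, then clip a to [0, 1] so that no measurement enters with a negative multiplier:

```python
import numpy as np

def best_estimator_constant(Y):
    """Variance-minimizing combination V = a*y1 + (1-a)*y2, with the
    non-negativity constraint a in [0, 1] (no negative multipliers)."""
    y = Y.mean(axis=0)
    C = np.cov(Y, rowvar=False)
    a = (C[1, 1] - C[0, 1]) / (C[0, 0] + C[1, 1] - 2.0 * C[0, 1])
    a = float(np.clip(a, 0.0, 1.0))
    return a * y[0] + (1.0 - a) * y[1]

# With the constraint, V always stays between y_1 and y_2, so a small
# systematic difference between the readings cannot be amplified.
rng = np.random.default_rng(1)
v = rng.normal(10.0, 2.0, 50)
Y = np.stack([v + rng.normal(0.0, 0.01, 50),
              0.9 * v + rng.normal(0.0, 0.01, 50)], axis=1)
print(best_estimator_constant(Y))
```

Because a is clipped to [0, 1], the returned value is a convex combination of the two measured averages, in contrast to the unconstrained regression demonstrated earlier.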

I thank F. James for helpful suggestions and K. Zalewski for useful discussions. This material is based upon work supported by the North Atlantic Treaty Organization under a Grant awarded in 1991, and by the U.S. Department of Energy under Grant Nos. DOE/DE-FG02-87ER-40328 and DOE/DE-FG02-86ER-40251.

[1] W.T. Eadie et al., Statistical Methods in Experimental Physics (North-Holland, Amsterdam, 1971), pp. 62-66.
[2] T.A. DeGrand, Phys. Rev. D 36, 176 (1987).
[3] C. DeTar and J.B. Kogut, Phys. Rev. D 36, 2828 (1987).
[4] S. Gottlieb et al., Phys. Rev. D 38, 2245 (1988).
[5] P. Abreu et al., Phys. Lett. B 247, 137 (1990).
[6] B. Wosiek, Acta Phys. Pol. B21, 1021 (1990).
[7] D. Toussaint, in From Actions to Answers: Proceedings of the 1989 Theoretical Advanced Study Institute in Elementary Particle Physics, edited by T. DeGrand and D. Toussaint (World Scientific, Singapore, 1990).