
Journal of Hydrology 10 (1970) 259-275; © North-Holland Publishing Co., Amsterdam

Not to be reproduced by photoprint or microfilm without written permission from the publisher

A COMPARISON OF METHODS OF FITTING THE DOUBLE EXPONENTIAL DISTRIBUTION

M.D. LOWERY and J. E. NASH

Institute of Hydrology, Wallingford, U.K.

Abstract: A number of methods of fitting the double-exponential distribution to a sample of data are compared. The methods are expressed in a common notation and compared in bias and efficiency. For each fitting method, formulae for the variance of estimates are obtained and some in current use are corrected. It is shown that next to the method of maximum likelihood the method of moments is the most accurate. It is also virtually unbiased and simplest to apply.

Notation

The principal symbols used are listed. Others which are necessary from time to time are defined in the relevant part of the text.

$p$ = probability of non-exceedance.
$T$ = return period in years.
$\alpha$, $u$ = parameters of the double exponential distribution.
$X$ = magnitude of a flood event.
$\mu$ = mean of the X population.
$\sigma$ = standard deviation of the X population.
$\sigma_x$ = standard deviation of a sample of the X's.
$\bar{X}$ = mean of a sample of the X's.
$\hat{X}(T)$ = estimate of X of return period T.
$\mu_2$ = second central moment of the X population.
$\mu_3$ = third central moment of the X population.
$\mu_4$ = fourth central moment of the X population.
$Y$ = Gumbel's parameter-free variate.
$\mu_y$ = mean of the Y population.
$\sigma_y$ = standard deviation of the Y population.
$\bar{Y}_n$ = mean of the Y's from a sample of size n.
$\sigma_n$ = standard deviation of the Y's from a sample of size n.
$K$, $K(T)$ = Chow's frequency factor.
$\bar{K}$ = mean of the K's from a sample of size n.
$\sigma_K$ = standard deviation of the K's from a sample of size n.
$A$ = coefficient of $\sigma_x$ in Gumbel's prediction equation.
$n$ = sample size.
$N$ = number of samples of size n.
$Z_1$ = mean of the X population ($= \mu$).
$Z_2$ = standard deviation of the X population ($= \sigma$).
$Z_3$ = magnitude of an event with transformed return period $K = 5$.
$Z$ = refers to any one of $Z_1$, $Z_2$ or $Z_3$.
$\hat{Z}$ = estimate of Z.
$\hat{\sigma}_Z$ = standard deviation of Z estimated from N samples.
$\hat{\ }$ over a symbol indicates a sample estimate corrected for bias.
$\bar{\ }$ over a symbol indicates the mean.

Introduction

The use of the double exponential distribution in the frequency analysis of annual maximum events in hydrology, while not capable of rigorous justification, is widely accepted and is to some extent justified empirically by such acceptance. Despite its extensive use, however, there is no generally accepted fitting method. Gumbel 1) states that the use of the maximum likelihood method is "very complicated ... and requires numerical work to an extent which is prohibitive for routine work". In practice the parameters are estimated by the method of moments or by regressions of the observed magnitudes on a priori estimates of the probability associated with each magnitude. These methods, however, are not all equally easy to apply, nor are they equally efficient and free of bias.

To compare the fitting methods in consistency and bias we require estimators for each method, i.e. algebraic expressions in sample terms, which are used to estimate the population parameters. To compare the fitting methods in relative efficiency we require error formulae for each method, i.e. expressions, in population terms, of the sampling variance of estimates of the magnitude corresponding to given probabilities or return periods. In addition, if an estimate from a single sample is to be of practical use we must have some idea of the error to which it is subject, and this in turn requires an error formula in sample terms, because, of course, the population is unknown.

In addition to the sampling variance of the estimates there is, in hydrological practice, another source of error due to the possibility of the population failing to conform to the assumed form, the double exponential distribution. This source of error cannot be analysed statistically and is not treated in this paper.


The double exponential distribution

Annual maximum values of hydrological variables are often assumed to be distributed in accordance with the double exponential distribution

$$p(X) = \exp\{-\exp[-\alpha (X - u)]\} \qquad (1a)$$

where $p(X)$ is the probability of an event not exceeding $X$, and $\alpha$ and $u$ are parameters of the distribution.

Equation (1a) may be written conveniently in terms of a reduced variate $Y$:

$$p(X) = \exp[-\exp(-Y)] \qquad (1b)$$

where

$$Y = \alpha (X - u). \qquad (1c)$$

By inversion of Eq. (1b) the relationship may be written in terms of the return period $T$ (the reciprocal of the probability of exceedance):

$$Y = -\ln \ln \frac{T}{T-1} \qquad (1d)$$

or

$$X = u - \frac{1}{\alpha} \ln \ln \frac{T}{T-1}. \qquad (1e)$$

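Gumbel 1) showed that the mean $\mu_y$ and standard deviation $\sigma_y$ of the parameter-free distribution described by Eq. (1b) are, respectively,

$$\mu_y = \gamma \ \ (\text{Euler's constant} = 0.5772\ldots), \qquad \sigma_y = \frac{\pi}{\sqrt{6}}. \qquad (2)$$

If $\mu$ and $\sigma$ are the mean and standard deviation of $X$ then from Eqs. (1c) and (2)

$$u = \mu - \frac{\sqrt{6}}{\pi}\,\gamma\,\sigma \qquad (3a)$$

and

$$\alpha = \frac{\pi}{\sigma\sqrt{6}} \qquad (3b)$$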

which inserted in Eq. (1e) yields another form of the double exponential distribution

$$X = \mu - \frac{\sqrt{6}}{\pi}\left(\gamma + \ln \ln \frac{T}{T-1}\right)\sigma$$

which is in the form

$$X = \mu + K(T)\,\sigma \qquad (4a)$$

with

$$K(T) = -\frac{\sqrt{6}}{\pi}\left(\gamma + \ln \ln \frac{T}{T-1}\right), \qquad (4b)$$

a function of the return period only (Chow 2)).

Chow's factor K=K(T) is analogous to Gumbel's reduced variate Y as can be seen by comparing Eq. (ld) and Eq. (4b).

Clearly:

$$K = -\frac{\sqrt{6}}{\pi}\,(\gamma - Y). \qquad (5)$$

$K$ and $Y$ are transformations of the return period which are linearly related to $X$, the magnitude of the exceeded event.
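The relations (1d), (4a), (4b) and (5) can be checked numerically. The following is a minimal sketch only; the function names are illustrative and are not taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329           # Euler's constant, the mean of Y
SQRT6_OVER_PI = math.sqrt(6.0) / math.pi   # reciprocal of sigma_y = pi / sqrt(6)


def reduced_variate(T):
    """Gumbel's reduced variate Y for a return period T > 1, Eq. (1d)."""
    return -math.log(math.log(T / (T - 1.0)))


def frequency_factor(T):
    """Chow's frequency factor K(T), Eq. (4b)."""
    return -SQRT6_OVER_PI * (EULER_GAMMA + math.log(math.log(T / (T - 1.0))))


def magnitude(T, mu, sigma):
    """Magnitude X of return period T from the population mean and s.d., Eq. (4a)."""
    return mu + frequency_factor(T) * sigma


for T in (2, 10, 100):
    Y = reduced_variate(T)
    K = frequency_factor(T)
    # Eq. (5): the last column should repeat K
    print(T, round(Y, 4), round(K, 4), round(-SQRT6_OVER_PI * (EULER_GAMMA - Y), 4))
```

Because $K$ depends on $T$ only, a table of $K(T)$ serves every sample; only $\mu$ and $\sigma$ (or their estimates) change from one station to another.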

Fitting the double exponential distribution

Fitting the double exponential distribution to a sample of data involves estimating, by sample statistics, the parameters $\alpha$ and $u$ of Eq. (1) or $\mu$ and $\sigma$ of Eq. (4). It is only a matter of convenience which pair of parameters is chosen for estimation by any one fitting method.

Replacement of the population parameters in the distribution equation by their sample estimates constitutes the prediction equation which may be in the form of Eq. (1) or Eq. (4).

(a) THE METHOD OF MOMENTS

The most obvious and direct method of estimation, though not popular in hydrology, estimates the parameters by equating the first and second moments of the data and of the distribution.

Thus $\mu$ is estimated by

$$\bar{X} = \frac{1}{n} \sum X \qquad (6a)$$

and $\sigma$ by

$$\hat{\sigma}_x = \sqrt{\frac{1}{n-1} \sum (X - \bar{X})^2} \qquad (6b)$$

($\hat{\sigma}_x$ so defined is an almost unbiased estimate of $\sigma$).

The prediction equation becomes

$$\hat{X}(T) = \bar{X} + K(T)\,\hat{\sigma}_x. \qquad (7)$$
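As an illustration of Eqs. (6a), (6b) and (7), the following sketch fits a short series by moments. The ten values are hypothetical annual maxima invented for the example, not data from the paper, and the function names are likewise illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def frequency_factor(T):
    """Chow's frequency factor K(T), Eq. (4b)."""
    return -(math.sqrt(6.0) / math.pi) * (EULER_GAMMA + math.log(math.log(T / (T - 1.0))))


def fit_moments(xs):
    """Return (xbar, sigma_hat) as defined by Eqs. (6a) and (6b)."""
    n = len(xs)
    xbar = sum(xs) / n
    sigma_hat = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return xbar, sigma_hat


def predict_moments(xs, T):
    """Estimate of the T-year magnitude by Eq. (7)."""
    xbar, sigma_hat = fit_moments(xs)
    return xbar + frequency_factor(T) * sigma_hat


# Hypothetical annual maxima, for illustration only
annual_maxima = [112.0, 87.0, 143.0, 95.0, 201.0, 120.0, 78.0, 134.0, 166.0, 104.0]

xbar, sigma_hat = fit_moments(annual_maxima)
print("mean:", xbar, "s.d.:", round(sigma_hat, 3))
print("100-year estimate:", round(predict_moments(annual_maxima, 100), 1))
```

No ranking or plotting position is needed, which is the practical simplicity referred to in the text.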

(b) THE METHOD OF REGRESSION

Fitting by regression on a plotting position, as traditionally used in hydrology (Chow 2)), involves estimation of the parameters $\mu$ and $\sigma$ of Eq. (4a) by a linear regression of $X$ on $K$. Thus the interpretation of $\mu$ and $\sigma$ as a mean and standard deviation is not used; they are treated merely as the parameters of a linear relationship between $X$ and $K$. The method requires the assignment of "plotting positions" or a priori estimates of $T$ (and hence of $K$ by Eq. (4b)) to each event in the sample. This is usually done by arranging the sample in descending order of magnitude and assigning a return period $T = (n+1)/r$ to each event, where $r$ is the rank of the event and $n$ the sample size. When the T's have been transformed to K's by Eq. (4b) a linear regression of $X$ on $K$ estimates $\mu$ and $\sigma$ by

$$\hat{\mu} = \bar{X} - \bar{K}\,\frac{\sum xk}{\sum k^2} \qquad (8a)$$

$$\hat{\sigma} = \frac{\sum xk}{\sum k^2} \qquad (8b)$$

where $x = X - \bar{X}$ and $k = K - \bar{K}$. The prediction equation for this method becomes

$$\hat{X}(K) = \bar{X} + (K - \bar{K})\,\frac{\sum xk}{\sum k^2}. \qquad (9)$$

Note that as the plotting position K's are obtained from the rank and sample size only, $\bar{K}$ and $\hat{\sigma}_K = \sqrt{\sum (K - \bar{K})^2/(n-1)}$ also are functions of sample size only. They are analogous to $\bar{Y}_n$ and $\sigma_n$ as used and tabulated by Gumbel 1). The values of $\bar{K}$ and $\hat{\sigma}_K$ (which will be required later) can be obtained from Gumbel's table through Eq. (5). A selection of values is given in Table 1.
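The plotting-position quantities and the regression estimators can be sketched as follows. This is an illustrative implementation under the assumptions stated in the text ($T = (n+1)/r$ with rank 1 assigned to the largest event); the function names are not from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329
C = math.sqrt(6.0) / math.pi


def plotting_K(n):
    """K values for ranks r = 1..n (rank 1 = largest event), with T = (n + 1) / r."""
    ks = []
    for r in range(1, n + 1):
        T = (n + 1) / r
        ks.append(-C * (EULER_GAMMA + math.log(math.log(T / (T - 1.0)))))
    return ks


def k_statistics(n):
    """Mean of the plotting-position K's and their s.d. with divisor n - 1."""
    ks = plotting_K(n)
    kbar = sum(ks) / n
    sigma_hat_K = math.sqrt(sum((k - kbar) ** 2 for k in ks) / (n - 1))
    return kbar, sigma_hat_K


def fit_regression(xs):
    """Least-squares regression of X on K; returns the intercept and slope
    that estimate mu and sigma respectively."""
    n = len(xs)
    ordered = sorted(xs, reverse=True)      # descending order of magnitude
    ks = plotting_K(n)
    xbar = sum(ordered) / n
    kbar = sum(ks) / n
    sxk = sum((x - xbar) * (k - kbar) for x, k in zip(ordered, ks))
    skk = sum((k - kbar) ** 2 for k in ks)
    slope = sxk / skk
    return xbar - kbar * slope, slope


# For n = 10 the two values can be compared with the n = 10 entries of Table 1
print([round(v, 4) for v in k_statistics(10)])
```

Since the K's depend on rank and sample size only, `k_statistics` needs no data at all, which is why these quantities can be tabulated once for each $n$.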

(c) GUMBEL'S FITTING METHOD

Gumbel's fitting method is similar to the regression method in that for each event the same a priori estimates of $T$ are made and these are transformed into values of the reduced variate. Gumbel uses the reduced variate $Y$ of Eq. (1d) and, like Chow, evaluates the plotting position by $T = (n+1)/r$. This differs only in notation from Chow's use of plotting position K's.

TABLE 1

$\bar{K}$ and $\hat{\sigma}_K$ as functions of $n$

    n    $\bar{K}$   $\hat{\sigma}_K$        n    $\bar{K}$   $\hat{\sigma}_K$
    8    -.0724      .7538                  50    -.0224      .9142
    9    -.0678      .7681                  60    -.0196      .9236
   10    -.0640      .7805                  70    -.0175      .9308
   15    -.0502      .8237                  80    -.0158      .9367
   20    -.0418      .8502                  90    -.0145      .9415
   25    -.0361      .8685                 100    -.0134      .9454
   30    -.0320      .8821                 250    -.0066      .9710
   35    -.0287      .8927                 500    -.00238     .9825
   40    -.0262      .9012                1000    -.0021      .9895
   45    -.0241      .9082                  ∞      0         1.0000

However, instead of obtaining the parameters $u$ and $\alpha$ in

$$X = u + \frac{1}{\alpha}\,Y \qquad (10)$$

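by regression of $X$ on $Y$ (which would be identical with Chow's regression method), Gumbel adopts as the XY relationship the geometric mean of the regressions of $Y$ on $X$ and $X$ on $Y$. This is equivalent to estimating $\alpha$ and $u$ by

$$\frac{1}{\hat{\alpha}} = \frac{\sigma_x}{\sigma_n} \qquad (11)$$

$$\hat{u} = \bar{X} - \bar{Y}_n\,\frac{\sigma_x}{\sigma_n} \qquad (12)$$

where

$$\sigma_x = \sqrt{\frac{1}{n}\sum (X - \bar{X})^2}, \qquad \sigma_n = \sqrt{\frac{1}{n}\sum (Y - \bar{Y}_n)^2}.$$

Note: $\sigma_x$ and $\sigma_n$ as used by Gumbel are sample quantities which therefore do not require correction for bias, though correction of both would not change Eq. (12). The bias in Gumbel's method discussed below has a different origin.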

Insertion of the estimators for $u$ and $\alpha$ in Eq. (1a) or (1e) provides Gumbel's prediction equation

$$\hat{X}(T) = \bar{X} + \frac{Y(T) - \bar{Y}_n}{\sigma_n}\,\sigma_x. \qquad (13a)$$

In Chow's notation this becomes

$$\hat{X}(T) = \bar{X} + \frac{K(T) - \bar{K}}{\sigma_K}\,\sigma_x \qquad (13b)$$

which is of the form

$$\hat{X}(T) = \bar{X} + A(n, T)\,\sigma_x. \qquad (13c)$$

The equation may however be expressed in terms of the unbiased estimate $\hat{\sigma}_x$ by

$$\hat{X}(T) = \bar{X} + \frac{K(T) - \bar{K}}{\hat{\sigma}_K}\,\hat{\sigma}_x \qquad (13d)$$

which is of the form

$$\hat{X}(T) = \bar{X} + A'(n, T)\,\hat{\sigma}_x. \qquad (13e)$$

It is necessary to remember to use

$$\hat{\sigma}_K = \sqrt{\frac{\sum (K - \bar{K})^2}{n-1}} \qquad (14a)$$

with Eq. (13d) and

$$\sigma_K = \sqrt{\frac{\sum (K - \bar{K})^2}{n}} \qquad (14b)$$

in Eq. (13b).
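Gumbel's procedure can be sketched in the same way. The code below assumes the estimators take the form $1/\hat{\alpha} = \sigma_x/\sigma_n$ and $\hat{u} = \bar{X} - \bar{Y}_n/\hat{\alpha}$, with both standard deviations computed with divisor $n$; the function names are illustrative only.

```python
import math


def reduced_variates(n):
    """Plotting-position reduced variates Y for ranks r = 1..n, with T = (n + 1) / r."""
    return [-math.log(math.log((n + 1) / (n + 1 - r))) for r in range(1, n + 1)]


def fit_gumbel(xs):
    """Return (u_hat, 1 / alpha_hat) by Gumbel's fitting method."""
    n = len(xs)
    ys = reduced_variates(n)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sample standard deviations with divisor n (no bias correction)
    sigma_x = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)
    sigma_n = math.sqrt(sum((y - ybar) ** 2 for y in ys) / n)
    inv_alpha = sigma_x / sigma_n
    return xbar - ybar * inv_alpha, inv_alpha


def predict_gumbel(xs, T):
    """T-year estimate u_hat + Y(T) / alpha_hat, equivalent to Eq. (13a)."""
    u_hat, inv_alpha = fit_gumbel(xs)
    return u_hat + inv_alpha * (-math.log(math.log(T / (T - 1.0))))
```

Only the ratio $\sigma_x/\sigma_n$ enters, so applying the same $n/(n-1)$ correction to both standard deviations leaves the fitted line unchanged; the order of the data does not matter either, because the X's enter only through their mean and standard deviation.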

(d) THE MAXIMUM LIKELIHOOD METHOD

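The maximum likelihood method chooses those values of the parameters $\alpha$ and $u$ in Eq. (1a) which maximise the likelihood of obtaining the given sample from the population so defined. Kimball 3) shows that the maximum likelihood estimators are implicitly defined by

$$\frac{1}{\hat{\alpha}} = \bar{X} - \frac{\sum X \exp(-\hat{\alpha} X)}{\sum \exp(-\hat{\alpha} X)} \qquad (15)$$

and

$$\hat{u} = -\frac{1}{\hat{\alpha}} \ln\left[\frac{1}{n}\sum \exp(-\hat{\alpha} X)\right]. \qquad (16)$$

These cannot easily be made explicit. The prediction equation is obtained by numerical solution of Eqs. (15) and (16) and insertion for $\alpha$ and $u$ in Eq. (1e).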

Eqs. (15) and (16) may be written in Chow's notation as

$$\hat{\sigma} = \frac{\pi}{\sqrt{6}}\left[\bar{X} - \frac{\sum X \exp\left(-\dfrac{\pi X}{\hat{\sigma}\sqrt{6}}\right)}{\sum \exp\left(-\dfrac{\pi X}{\hat{\sigma}\sqrt{6}}\right)}\right] \qquad (17)$$

and

$$\hat{\mu} = \frac{\hat{\sigma}\sqrt{6}}{\pi}\left[\gamma - \ln\left(\frac{1}{n}\sum \exp\left(-\frac{\pi X}{\hat{\sigma}\sqrt{6}}\right)\right)\right]. \qquad (18)$$
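The numerical solution is no longer prohibitive. One possible approach, sketched below, solves the implicit equation for $\alpha$ by bisection and then evaluates $u$ directly. It assumes the likelihood equations in the standard form $1/\alpha = \bar{X} - \sum X e^{-\alpha X} / \sum e^{-\alpha X}$ and $u = -(1/\alpha)\ln[(1/n)\sum e^{-\alpha X}]$; the bisection scheme and the function name are choices made for this illustration, not the procedure used in the paper.

```python
import math


def fit_max_likelihood(xs, iterations=200):
    """Return (alpha_hat, u_hat) for a sample that is not all one value."""
    n = len(xs)
    xbar = sum(xs) / n
    xmin = min(xs)

    def g(alpha):
        # Residual 1/alpha - xbar + sum(X w) / sum(w), with w = exp(-alpha (X - xmin)).
        # The shift by xmin cancels in the ratio and avoids overflow.
        w = [math.exp(-alpha * (x - xmin)) for x in xs]
        return 1.0 / alpha - xbar + sum(x * wi for x, wi in zip(xs, w)) / sum(w)

    # g decreases steadily with alpha, so a sign change brackets the single root.
    # Start from the moments value alpha = pi / (sigma_hat * sqrt(6)).
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    lo = hi = math.pi / (s * math.sqrt(6.0))
    while g(lo) <= 0.0:
        lo /= 2.0
    while g(hi) >= 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)

    w = [math.exp(-alpha * (x - xmin)) for x in xs]
    u = xmin - math.log(sum(w) / n) / alpha     # same shift applied to the log-mean
    return alpha, u
```

The estimates of $\mu$ and $\sigma$ then follow from $\hat{\mu} = \hat{u} + \gamma/\hat{\alpha}$ and $\hat{\sigma} = \pi/(\hat{\alpha}\sqrt{6})$.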

Comparison of estimators

The several methods of fitting may be compared in terms of the estimators of $\mu$, $\sigma$ and $X(K)$ derived above (Table 2).

TABLE 2

Comparison of estimators

$$
\begin{array}{lllll}
\text{Population quantity} & \text{Moments} & \text{Regression} & \text{Gumbel's method} & \text{Maximum likelihood} \\
\mu
 & \bar{X} = \frac{1}{n}\sum X
 & \bar{X} - \bar{K}\,\frac{\sum xk}{\sum k^2}
 & \bar{X} - \frac{\bar{K}}{\hat{\sigma}_K}\,\hat{\sigma}_x
 & \hat{\mu} = \frac{\hat{\sigma}\sqrt{6}}{\pi}\left[\gamma - \ln\left(\frac{1}{n}\sum e^{-\pi X/(\hat{\sigma}\sqrt{6})}\right)\right] \\
\sigma
 & \hat{\sigma}_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}}
 & \frac{\sum xk}{\sum k^2}
 & \frac{\hat{\sigma}_x}{\hat{\sigma}_K}
 & \hat{\sigma} = \frac{\pi}{\sqrt{6}}\left[\bar{X} - \frac{\sum X e^{-\pi X/(\hat{\sigma}\sqrt{6})}}{\sum e^{-\pi X/(\hat{\sigma}\sqrt{6})}}\right] \\
X(T) = \mu + K_1\sigma
 & \hat{X}(T) = \bar{X} + K_1\hat{\sigma}_x
 & \bar{X} + (K_1 - \bar{K})\,\frac{\sum xk}{\sum k^2}
 & \bar{X} + (K_1 - \bar{K})\,\frac{\hat{\sigma}_x}{\hat{\sigma}_K}
 & \hat{\mu} + K_1\hat{\sigma}
\end{array}
$$

Bias in the estimators

The bias attributable to each method of fitting may be found by comparing the population quantity with the expected values of the estimators of Table 2. The results, in so far as the authors can derive the expected values, are shown in Table 3.

TABLE 3

Bias in estimators

$$
\begin{array}{lllll}
\text{Population quantity} & \text{Moments} & \text{Regression} & \text{Gumbel's method} & \text{Maximum likelihood} \\
\mu
 & 0
 & -\bar{K}\,\frac{E(\sum xk)}{\sum k^2}
 & -\frac{\bar{K}}{\hat{\sigma}_K}\,\sigma
 & ? \\
\sigma
 & 0
 & \frac{E(\sum xk)}{\sum k^2} - \sigma
 & \sigma\left[\frac{1}{\hat{\sigma}_K} - 1\right]
 & ? \\
\mu + K_1\sigma
 & 0
 & K_1\left[\frac{E(\sum xk)}{\sum k^2} - \sigma\right] - \bar{K}\,\frac{E(\sum xk)}{\sum k^2}
 & \frac{K_1 - \bar{K}}{\hat{\sigma}_K}\,\sigma - K_1\sigma
 & ?
\end{array}
$$

A very slight bias in the moments estimator of $\sigma$ is neglected in Table 3. This bias could be removed by using

$$E(\hat{\sigma}_x) = \sigma\left[1 - \frac{1}{4(n-1)} - \frac{3}{10n} - \cdots\right] \qquad \text{(Quenouille 12))}.$$

Numerical tests of bias

5000 random numbers rectangularly distributed between zero and unity were generated and treated as probabilities $p(X)$. Through inversion of Eq. (1a), with arbitrary values of $\alpha$ (= 1/30) and $u$ (= 100), these 5000 values were converted into 5000 values of $X$, a random variable distributed according to Eq. (1a). These were grouped in $N = 500$ samples each of size $n = 10$, and from each sample the population mean $\mu$, the standard deviation $\sigma$, and $X(T) = \mu + 5\sigma$ were estimated using each fitting method in turn. Thus 500 estimates $\hat{Z}$ of each of the three population quantities ($Z_1 = \mu$; $Z_2 = \sigma$; $Z_3 = \mu + 5\sigma$) were obtained by each method.

If the estimator is unbiased the mean of the $N$ values of $\hat{Z}$ should approach the population value $Z$ as $N$ increases indefinitely. If $\hat{\sigma}_Z$ is the observed standard deviation (about $\bar{Z}$) of the $N$ individual estimates then $\bar{Z}$, the mean of the $N$ estimates, should be asymptotically normally distributed about $Z$ with standard deviation $\hat{\sigma}_Z/\sqrt{N}$ and

$$t = (\bar{Z} - Z)\,\frac{\sqrt{N}}{\hat{\sigma}_Z} \qquad (19)$$

should be distributed as Student's t with $N - 2$ degrees of freedom. Table 4 shows the t values obtained.

TABLE 4

Values of t for 500 samples of size 10

Population quantity    Moments    Least squares    Gumbel    Maximum likelihood
$\mu$                    1.533        6.650          7.123         1.579
$\sigma$                -1.815       12.459         17.761        -4.191
$\mu + 5\sigma$         -1.314       12.218         17.113        -3.288

The hypothesis of zero bias must be rejected for all fitting methods except moments. In the maximum likelihood method the bias may be in the estimate of $\sigma$ only. In the plotting position methods the indications of bias are very strong and occur in both parameters. No doubt an adjustment of the plotting position could be made which would eliminate the bias, but as we shall see that these methods are also less efficient than fitting by moments, the latter method is preferable on both grounds.
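The sampling experiment is easily repeated on a modern machine. The sketch below covers the method of moments only and uses Python's own pseudo-random generator with an arbitrary seed, so its t values will differ from those of Table 4; the function names and the seed are assumptions of the example.

```python
import math
import random

ALPHA, U = 1.0 / 30.0, 100.0                    # population parameters used in the text
GAMMA = 0.5772156649015329
MU = U + GAMMA / ALPHA                          # population mean
SIGMA = math.pi / (ALPHA * math.sqrt(6.0))      # population s.d., from Eq. (3b)


def draw(rng):
    """One value of X by inversion of Eq. (1a) from a uniform probability p."""
    p = rng.random()
    while p <= 0.0:                             # p = 0 has no finite X
        p = rng.random()
    return U - math.log(-math.log(p)) / ALPHA


def t_statistic(estimates, target):
    """Eq. (19): (mean estimate - population value) * sqrt(N) / s.d. of the estimates."""
    N = len(estimates)
    zbar = sum(estimates) / N
    s = math.sqrt(sum((z - zbar) ** 2 for z in estimates) / (N - 1))
    return (zbar - target) * math.sqrt(N) / s


def experiment(N=500, n=10, seed=1970):
    """t values for mu, sigma and mu + 5 sigma estimated by moments from N samples."""
    rng = random.Random(seed)
    mu_hat, sigma_hat, z3_hat = [], [], []
    for _ in range(N):
        xs = [draw(rng) for _ in range(n)]
        xbar = sum(xs) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        mu_hat.append(xbar)
        sigma_hat.append(s)
        z3_hat.append(xbar + 5.0 * s)
    return (t_statistic(mu_hat, MU),
            t_statistic(sigma_hat, SIGMA),
            t_statistic(z3_hat, MU + 5.0 * SIGMA))


print([round(t, 3) for t in experiment()])
```

The other fitting methods can be compared by replacing the three lines that compute `xbar` and `s` with the corresponding estimators.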

Table 5 shows a repetition of this test using the same data grouped in 100 samples of size 50.

TABLE 5

Values of t for 100 samples of size 50

Population quantity    Moments    Least squares    Gumbel    Maximum likelihood
$\mu$                    1.542        3.228          3.272         1.645
$\sigma$                 0.798        6.393          8.010         0.854
$\mu + 5\sigma$          1.009        6.312          7.789         1.080

The indications of bias are weaker reflecting the smaller bias in the larger samples.


Accuracy

From the practical hydrological point of view we may judge the result of a method of fitting by the expected value of the error in the estimate. The best method is clearly that for which the expected error from all sources is minimal. We shall use the term accuracy for the property of having a small expected error. Statisticians use the concept of efficiency for a similar purpose. Fitting methods are compared in efficiency according to the variance of the estimates obtained,

$$\operatorname{var} \hat{X}(T) = \frac{1}{n-1}\sum \left[\hat{X}(T) - E\hat{X}(T)\right]^2 \qquad (20)$$

where $E\hat{X}(T)$ is the expected value or mean of all $\hat{X}(T)$.

Efficiency, in this sense, is a measure of precision rather than of accuracy as it does not reflect any bias to which a fitting method may be subject but merely the scatter of estimates about their own mean. The mean square error (mse), defined by

$$\text{mse} = \frac{1}{n}\sum \left[\hat{X}(T) - X(T)\right]^2 \qquad (21)$$

(where $X(T)$ is the population value) includes the effects of both bias and efficiency and provides, at least in the present context, the best criterion by which to judge the relative merits of a number of fitting methods. The following numerical test was designed for this purpose.
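The sense in which the mean square error combines bias and scatter can be made explicit: when the variance is taken with the same divisor as the mse, the mse equals the variance plus the square of the bias. The short sketch below illustrates this with five hypothetical estimates of a quantity whose population value is taken to be 300; both the numbers and the function name are invented for the example, and the common divisor is an assumption that differs slightly from the $n - 1$ of Eq. (20).

```python
def error_summary(estimates, true_value):
    """Return (bias, variance, mse), all with divisor N."""
    N = len(estimates)
    mean = sum(estimates) / N
    bias = mean - true_value
    variance = sum((z - mean) ** 2 for z in estimates) / N    # scatter about own mean
    mse = sum((z - true_value) ** 2 for z in estimates) / N   # Eq. (21)
    return bias, variance, mse


# Hypothetical estimates, for illustration only
estimates = [310.0, 295.0, 342.0, 288.0, 330.0]
bias, variance, mse = error_summary(estimates, 300.0)
print(bias, variance, mse, variance + bias ** 2)
```

A biased method can therefore have a small variance and still a large mse, which is why the two criteria are examined separately in what follows.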

Numerical test of mean square error of estimates

Each of the 500 samples of size 10 drawn from a double exponential population provided an estimate $\hat{Z}$ of each of the three population quantities $Z_1 = \mu$, $Z_2 = \sigma$ and $Z_3 = \mu + 5\sigma$. The root mean square error for each set of 500 estimates was computed and appears in Table 6.

TABLE 6

Comparisons of the root mean square errors

Z                      Moments    Regression    Gumbel    Maximum likelihood
$\mu$                   12.27       13.30        13.36         12.62
$\sigma$                11.47       15.95        17.49         10.82
$\mu + 5\sigma$         64.54       88.86        96.48         62.41


The several entries in Table 6 are standard deviations and are subject to sampling variance. The significance of the difference between each pair of entries may be studied using the F (variance ratio) test. The values of F for $Z = \mu + 5\sigma$ are given in Table 7.

TABLE 7

Values of F for $Z = \mu + 5\sigma$

                      Moments    Regression    Gumbel    Maximum likelihood
Moments
Regression             1.896
Gumbel                 2.235       1.179
Max. Likelihood        1.069       2.027        2.390

The value of F at the 5% level is 1.15 for $N = 500$. Thus, on the hypothesis that the four methods are equally efficient, the difference between any pair of entries, excepting moments and maximum likelihood, is significant. The larger rmse's of the plotting position methods cannot be attributed to chance alone but reflect a real difference in the accuracy of the methods.
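The variance ratios follow directly from the root mean square errors for $\mu + 5\sigma$ in Table 6. A minimal sketch of the arithmetic, taking F as the larger mean square divided by the smaller (the function name is illustrative):

```python
# Root mean square errors of the estimates of mu + 5 sigma, from Table 6
rmse = {
    "Moments": 64.54,
    "Regression": 88.86,
    "Gumbel": 96.48,
    "Maximum likelihood": 62.41,
}


def variance_ratio(a, b):
    """F as the ratio of the larger mean square to the smaller."""
    larger, smaller = max(a, b), min(a, b)
    return (larger / smaller) ** 2


names = list(rmse)
F = {
    (p, q): variance_ratio(rmse[p], rmse[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}

for pair, value in F.items():
    flag = "significant" if value > 1.15 else "not significant"
    print(pair, round(value, 3), flag)
```

Only the pair moments and maximum likelihood falls below the quoted 5% value of 1.15.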

It has thus been established that the simplest fitting method (viz., moments) is better both in bias and accuracy than the methods which depend on a priori plotting positions. It could be argued that the biased methods could be corrected and perhaps then excel the method of moments in accuracy. The accuracy could, in the absence of bias, be expressed in terms of relative efficiency. For this reason the relative efficiency of the several methods was also tested using the same numerical data as before. Table 8 shows the observed standard deviation of the 500 estimates of the population quantities $\mu$, $\sigma$ and $\mu + 5\sigma$. Differences between corresponding entries in Tables 6 and 8 are due to bias.

TABLE 8

Comparison of relative efficiency in the absence of bias - s.d. of Z

Z                      Moments    Regression    Gumbel    Maximum likelihood
$\mu$                   12.25       12.75        12.79         12.60
$\sigma$                11.41       13.93        14.61         10.65
$\mu + 5\sigma$         64.50       77.96        81.29         61.81


As with Table 6, an F-test can be applied to these entries and the values are given in Table 9 for $Z = \mu + 5\sigma$.

TABLE 9

Values of F for $Z = \mu + 5\sigma$

                      Moments    Regression    Gumbel    Maximum likelihood
Moments
Regression             1.461
Gumbel                 1.588       1.087
Max. Likelihood        1.089       1.591        1.730

Even apart from bias, the methods of regression and Gumbel are less efficient than moments or maximum likelihood, the difference between the latter two being small. It would seem, therefore, that the development of a method of correcting for bias in the method of regression, Gumbel's method, or maximum likelihood would not be worthwhile, as the method of moments is simplest, unbiased and subject to a sampling variance only very slightly greater than that of maximum likelihood and less than that of the others.

Formulae for variance of estimates

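Kaczmarek 4) studied the variance of estimates of $X(T)$ obtained by Gumbel's fitting method. He used the principle that if $H$ is a function of sample moments such as

$$H = H(\bar{X}, \sigma_x^2) \qquad (22)$$

it is asymptotically normally distributed with variance given by

$$\operatorname{var} H = \left(\frac{\partial H}{\partial \bar{X}}\right)^2 \operatorname{var}\bar{X} + \left(\frac{\partial H}{\partial \sigma_x^2}\right)^2 \operatorname{var}\sigma_x^2 + 2\,\frac{\partial H}{\partial \bar{X}}\,\frac{\partial H}{\partial \sigma_x^2}\,\operatorname{cov}(\bar{X}, \sigma_x^2) \qquad (23)$$

where the partial derivatives are evaluated at the expected values of $\bar{X}$ and $\sigma_x^2$. Applying the same principle to Eq. (7) written as

$$\hat{X}(T) = \bar{X} + K\sqrt{\hat{\sigma}_x^2} \qquad (24)$$

we may obtain the variance of $\hat{X}(T)$ for moments fitting. From Eq. (24)

$$\frac{\partial \hat{X}(T)}{\partial \bar{X}} = 1; \qquad \frac{\partial \hat{X}(T)}{\partial \hat{\sigma}_x^2} = \frac{K}{2\sigma}. \qquad (25)$$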


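The variance and covariance terms are

$$\operatorname{var}\bar{X} = \mu_2/n \qquad (26a)$$

$$\operatorname{var}\hat{\sigma}_x^2 = (\mu_4 - 3\mu_2^2)/n + 2\mu_2^2/(n-1) \quad \text{(Fisher 5))} \qquad (26b)$$

$$\operatorname{cov}(\bar{X}, \hat{\sigma}_x^2) = \mu_3/n \quad \text{(Kendall and Stuart 6))} \qquad (26c)$$

where $\mu_2$, $\mu_3$ and $\mu_4$ are, respectively, the 2nd, 3rd and 4th central moments of the population.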

The moment ratios $\mu_3/\mu_2^{3/2}$ and $\mu_4/\mu_2^2$ are the coefficients of skew and kurtosis which for the double exponential distribution are 1.14 and 5.40 respectively. Insertion of these values in Eq. (23) yields

$$\operatorname{var}\hat{X}(T) = \sigma^2\left[\frac{1}{n} + \frac{1.14}{n}K + K^2\left(\frac{0.60}{n} + \frac{0.50}{n-1}\right)\right] \qquad (27a)$$

for moments fitting. For reasonably large samples we may take $n = n - 1$ and write

$$\operatorname{var}\hat{X}(T) = \frac{\sigma^2}{n}\left[1 + 1.14K + 1.10K^2\right] \quad \text{(approx.)}. \qquad (27b)$$

To assess the accuracy of an estimate $\hat{X}(T)$ obtained from a single sample we can only substitute $\hat{\sigma}_x$ for $\sigma$ in Eqs. (27a) or (27b), obtaining

$$\operatorname{var}\hat{X}(T) = \frac{\hat{\sigma}_x^2}{n}\left[1 + 1.14K + 1.10K^2\right]. \qquad (27c)$$
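The substitution leading to Eq. (27a) may be written out term by term. With $\mu_2 = \sigma^2$, $\mu_3 = 1.14\sigma^3$ and $\mu_4 = 5.40\sigma^4$, the derivatives of Eq. (25) and the moments of Eqs. (26a) to (26c) give, as a sketch of the intermediate algebra:

```latex
\begin{aligned}
\operatorname{var}\hat{X}(T)
 &= \frac{\mu_2}{n}
  + \left(\frac{K}{2\sigma}\right)^{2}
    \left[\frac{\mu_4 - 3\mu_2^{2}}{n} + \frac{2\mu_2^{2}}{n-1}\right]
  + 2 \cdot 1 \cdot \frac{K}{2\sigma} \cdot \frac{\mu_3}{n} \\
 &= \frac{\sigma^{2}}{n}
  + \frac{K^{2}}{4\sigma^{2}}
    \left[\frac{(5.40 - 3)\,\sigma^{4}}{n} + \frac{2\sigma^{4}}{n-1}\right]
  + \frac{K}{\sigma} \cdot \frac{1.14\,\sigma^{3}}{n} \\
 &= \sigma^{2}\left[\frac{1}{n} + \frac{1.14}{n}K
  + K^{2}\left(\frac{0.60}{n} + \frac{0.50}{n-1}\right)\right].
\end{aligned}
```

The coefficient 0.60 is thus $(5.40 - 3)/4$ and the coefficient 0.50 is $2/4$; setting $n - 1$ equal to $n$ merges them into the 1.10 of Eq. (27b).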

Comparison of the prediction equations for the method of moments and Gumbel's fitting method (Eqs. (7) and (13d)) shows that $K$ in the former is replaced by $A' = (K_1 - \bar{K})/\hat{\sigma}_K$ in the latter. A corresponding change in Eq. (27a) yields

$$\operatorname{var}\hat{X}(T) = \frac{\sigma^2}{n}\left[1 + 1.14A' + (A')^2\left(0.60 + 0.50\,\frac{n}{n-1}\right)\right]$$

or, in terms of $A = (K_1 - \bar{K})/\sigma_K = A'\sqrt{n/(n-1)}$,

$$\operatorname{var}\hat{X}(T) = \frac{\sigma^2}{n}\left[1 + 1.14A\sqrt{\frac{n-1}{n}} + A^2\left(0.50 + 0.60\,\frac{n-1}{n}\right)\right] \qquad (28a)$$

for the variance of estimates obtained by Gumbel's fitting method.

For large $n$ this approaches

$$\operatorname{var}\hat{X}(T) = \frac{\sigma^2}{n}\left[1 + 1.14A + 1.10A^2\right] \qquad (28b)$$

which is equivalent to Eq. (27b) with $A$ replacing $K$. Kaczmarek derived Eq. (28b) but apparently made a numerical error in taking the coefficient of skew $\mu_3/\mu_2^{3/2} = 1.298$ instead of the square root of this quantity, viz. 1.14 (O'Brien 13)). This error altered the coefficient of the linear term in Eq. (28b) to 1.298. Eq. (28b) with this coefficient of 1.298 was subsequently recommended by W.M.O. 7).

Nash and Amorocho 8), independently of Kaczmarek, derived an expression for var $\hat{X}(T)$ for moments fitting. They used a Monte-Carlo process to evaluate the coefficient of $K$ and obtained 1.18. Later 9), on becoming aware of Kaczmarek's work, they accepted the superiority of the analytical development and altered the coefficient of $K$ to 1.298 (Kaczmarek's analytical but incorrect value), thus compounding the confusion. The correct expressions for moments and Gumbel's fitting methods, respectively, are Eqs. (27a) and (28a) and the approximate equivalents Eqs. (27b) and (28b).

In an elaborate algebraic analysis Kimball 3) obtained an expression for the variance of estimates obtained through maximum likelihood:

$$\operatorname{var}\hat{X}(T) = \frac{\hat{\sigma}_x^2}{n}\left[0.978 + 0.948K + 0.608K^2\right]. \qquad (29)$$

This expression is in terms of $\hat{\sigma}_x^2$, a sample estimate. In this form the equation is more useful in estimating the susceptibility to error of a single estimate $\hat{X}(T)$ but is less useful in studying relative efficiency.

Again following the lines of Kaczmarek's work it is possible to find an unanalysed expression for the variance of estimates obtained by regression fitting. A more general formulation of Eq. (23) given by Kendall and Stuart 6) can be applied to obtain the variance of $\hat{X}(T)$ for regression fitting:

$$\operatorname{var}\hat{X}(T) = \operatorname{var}\bar{X} + \frac{(K - \bar{K})^2}{(\sum k^2)^2}\operatorname{var}\sum xk + 2\,\frac{K - \bar{K}}{\sum k^2}\operatorname{cov}(\bar{X}, \sum xk). \qquad (30)$$

A complex expression for var($\sum xk$) is given by Kendall and Stuart but no explicit expression for the covariance term is known to the authors and therefore Eq. (30) cannot be developed.

The usual formula for the variance of individual estimates appropriate to a simple linear regression of $X$ on $K$, where the errors are uniformly distributed along the regression line, is given by Ezekiel and Fox 10) as

$$\operatorname{var}\hat{X}(T) = \left(\frac{1}{n} + \frac{(K - \bar{K})^2}{\sum (K - \bar{K})^2}\right) S^2 \qquad (31)$$

where $S^2$ is the residual variance of the X's. Eq. (31) expresses var $\hat{X}(T)$ as the sum of the variance of the mean $\bar{X}$ and $(K - \bar{K})^2$ multiplied by the variance of the slope of the regression line. However, the assignment of K values to the sample X's in accordance with their observed rank does not comply with the normal requirements of least squares fitting, where the X's and the K's are the values of simultaneously observed attributes of the individuals in the sample or the X's are values at preselected K values. In the present context, the mean of X is subject to the same variance as if a sample of the X's alone were drawn, i.e. var $\bar{X} = \sigma^2/n$. Nash 11) attempted to allow for this by modifying Eq. (31) to

$$\operatorname{var}\hat{X}(T) = \frac{\sigma^2}{n} + \frac{(K - \bar{K})^2}{\sum (K - \bar{K})^2}\,S^2. \qquad (32)$$

This manifestly gives the correct variance of the mean of X, but even if the initial variance is also used as the coefficient of $(K - \bar{K})^2 / \sum (K - \bar{K})^2$ it is obvious from Eq. (30) that the formula is still deficient in a covariance term cov($\bar{X}$, $\sum xk$). It seems, therefore, that there is no expression available for the sampling variance of least squares estimates.

Comparison of the error formulae

Bearing in mind that $A$ is greater than $K$ for all practical values of $K$ and $n$, and comparing Eqs. (27a), (28a) and (29), it is clear that var $\hat{X}(T)$ increases from Eq. (29) to Eq. (27a) to Eq. (28a), confirming the progressively lower efficiency of the methods of maximum likelihood, moments and Gumbel.
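The ordering can be illustrated numerically. The sketch below evaluates the three formulae in units of $\sigma^2$ for $K = 5$, taking Eq. (27a) for moments, the $A'$ form of the Gumbel variance, and Kimball's expression with the population $\sigma$ substituted for the sample value (an assumption made here only so that the three can be placed on a common footing). The function names are illustrative.

```python
import math

GAMMA = 0.5772156649015329
C = math.sqrt(6.0) / math.pi


def k_statistics(n):
    """Mean and s.d. (divisor n - 1) of the plotting-position K's, T = (n + 1) / r."""
    ks = [-C * (GAMMA + math.log(math.log((n + 1) / (n + 1 - r)))) for r in range(1, n + 1)]
    kbar = sum(ks) / n
    return kbar, math.sqrt(sum((k - kbar) ** 2 for k in ks) / (n - 1))


def var_moments(K, n):
    """Eq. (27a) divided by sigma squared."""
    return 1.0 / n + 1.14 * K / n + K ** 2 * (0.60 / n + 0.50 / (n - 1))


def var_gumbel(K, n):
    """Gumbel variance in terms of A' = (K - Kbar) / sigma_hat_K, divided by sigma squared."""
    kbar, sigma_hat_K = k_statistics(n)
    a1 = (K - kbar) / sigma_hat_K
    return (1.0 + 1.14 * a1 + a1 ** 2 * (0.60 + 0.50 * n / (n - 1))) / n


def var_max_likelihood(K, n):
    """Eq. (29) with sigma in place of the sample estimate (assumption), divided by sigma squared."""
    return (0.978 + 0.948 * K + 0.608 * K ** 2) / n


for n in (10, 50, 100):
    print(n, round(var_max_likelihood(5.0, n), 4),
          round(var_moments(5.0, n), 4), round(var_gumbel(5.0, n), 4))
```

These are large-sample formulae, so for $n = 10$ they need not reproduce the observed standard deviations of Table 8 closely; the point of the sketch is the ordering only.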

Conclusions

Of the several fitting methods the simplest to apply is that of moments. This method is virtually unbiased and is more efficient and more accurate (in the mse sense) than either of the methods which depend on an a priori plotting position. In addition these methods are biased. The method of maximum likelihood is slightly more efficient than moments but it may be slightly biased. It is also extremely difficult to apply.

Eq. (27a) is an expression in population terms and Eq. (27c) an approximate formula in sample terms for the variance of estimates, obtained by moments, of the magnitude corresponding to a given probability or return period, assuming that the sample is randomly drawn from a double exponential population.

Acknowledgement

This paper is published by permission of the Director, Institute of Hydrology, Wallingford, Berks, England.

References

1) E. J. Gumbel, Statistics of Extremes (Columbia Univ. Press, 1960)
2) V. T. Chow, Frequency analysis of hydrologic data (Univ. Illinois Eng. Exp. Sta. Bull. 414 (1953))
3) B. F. Kimball, Sufficient Estimation Functions. P. 229 of Statistics of Extremes by E. J. Gumbel (Columbia University Press, 1960)
4) Z. Kaczmarek, Efficiency of the estimation of floods with a given return period, IASH Toronto, vol. III (1957)
5) R. A. Fisher, Moments and product moments of sampling distributions. Proc. Lond. Math. Soc. 30 199-238
6) Kendall and Stuart, The advanced theory of statistics. Vol. 1 (Griffin and Co., 1958)
7) W.M.O. Guide to Hydrometeorological Practices. W.M.O. - No. 168 TP82 (1965)
8) J. E. Nash and J. Amorocho, The accuracy of the prediction of floods of high return period. Water Resources Research, 2 (1966) No. 2
9) J. E. Nash and J. Amorocho, Letter to Editor, Water Resources Research 3 (1967) No. 2
10) Ezekiel and Fox, Methods of correlation and regression analysis (John Wiley and Sons, 1959)
11) J. E. Nash, River Engineering and Water Conservation Works. Chap. 6. Edited by R. B. Thorn (Butterworths, London, 1966)
12) M. H. Quenouille, Fundamentals of Statistical Reasoning. No. 3 of Griffins Statistical Monographs and Courses (1958)
13) D. O'Brien, Private communication (1968)