Parameters of distribution: location parameter, scale parameter, shape parameter


Upload: denise-hammel

Post on 01-Apr-2015


Page 1: Parameters of distribution  Location Parameter  Scale Parameter  Shape Parameter

HYDROLOGY PROJECT Technical Assistance

Parameters of distribution

Location Parameter

Scale Parameter

Shape Parameter

Page 2

Figure: probability density f(x) (axis 0.0 to 1.0) versus x (axis 0.0 to 5.0) for curves 1 and 2, which differ only in the location parameter x0.

pdf   μy    σy    x0    mean   median   mode   skew
1     0     0.5   0     1.13   1.00     0.78   1.75
2     0     0.5   2     3.13   3.00     2.78   1.75

Page 3

Figure: probability density f(x) (axis 0 to 1) versus x (axis 0.0 to 5.0) for curves 1, 2 and 3, which differ only in μy.

pdf   μy    σy    x0    mean   median   mode   skew
1     0.0   0.5   0     1.13   1.00     0.78   1.75
2     0.5   0.5   0     1.87   1.65     1.28   1.75
3     1.0   0.5   0     3.08   2.72     2.11   1.75

Page 4

Figure: probability density f(x) (axis 0.0 to 1.8) versus x (axis 0.0 to 3.0) for curves 1, 2 and 3, which differ only in σy.

pdf   μy    σy     x0    mean   median   mode   skew
1     0     1.0    0     1.65   1.00     0.37   6.18
2     0     0.5    0     1.13   1.00     0.78   1.75
3     0     0.25   0     1.03   1.00     0.94   0.78
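The values tabulated on pages 2 to 4 are consistent with a three-parameter lognormal distribution, X = x0 + exp(Y) with Y normally distributed. The slides do not name the distribution, so this is an inference from the numbers, not a statement from the source. Under that assumption, the following sketch reproduces the mean, median, mode and skewness columns. The function name `lognormal3_summary` is illustrative only.

```python
import math

def lognormal3_summary(mu_y, sigma_y, x0=0.0):
    """Mean, median, mode and skewness of X = x0 + exp(Y), Y ~ N(mu_y, sigma_y**2).

    Assumption: the slides' curves are three-parameter lognormal densities.
    """
    w = math.exp(sigma_y ** 2)
    mean = x0 + math.exp(mu_y + 0.5 * sigma_y ** 2)
    median = x0 + math.exp(mu_y)
    mode = x0 + math.exp(mu_y - sigma_y ** 2)
    skew = (w + 2.0) * math.sqrt(w - 1.0)   # depends on sigma_y only
    return mean, median, mode, skew

# Parameter sets (mu_y, sigma_y, x0) taken from the tables on pages 2 to 4
for params in [(0.0, 0.5, 0.0), (0.0, 0.5, 2.0), (1.0, 0.5, 0.0),
               (0.0, 1.0, 0.0), (0.0, 0.25, 0.0)]:
    print(params, [round(v, 2) for v in lognormal3_summary(*params)])
```

Changing x0 shifts mean, median and mode by the same amount, changing μy rescales them, and only σy changes the skewness, which matches the pattern in the three tables.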

Page 5

Plotting position

The plotting position of xi is the probability assigned to each data point to be plotted on probability paper.

The plotting of ordered data on extreme probability paper is done according to a general plotting position function:

P = (m-a) / (N+1-2a)

where m is the rank of the data point in the ordered set and N is the number of data.

Constant 'a' is an input variable and is set to 0.3 by default.

Many different plotting functions are used; some of them can be reproduced by changing the constant 'a':

Gringorten   P = (m-0.44)/(N+0.12)    a = 0.44
Weibull      P = m/(N+1)              a = 0
Chegodayev   P = (m-0.3)/(N+0.4)      a = 0.3
Blom         P = (m-0.375)/(N+0.25)   a = 0.375
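A minimal sketch of the general plotting position function, showing how the named formulas follow from the constant a. The function and dictionary names are illustrative and not part of the source software.

```python
def plotting_position(m, n, a=0.3):
    """General plotting position P = (m - a) / (N + 1 - 2a) for rank m of n data."""
    return (m - a) / (n + 1 - 2 * a)

# Constants 'a' listed on the slide
NAMED_FORMULAS = {"Weibull": 0.0, "Chegodayev": 0.3, "Blom": 0.375, "Gringorten": 0.44}

n = 30
for name, a in NAMED_FORMULAS.items():
    lowest = plotting_position(1, n, a)
    highest = plotting_position(n, n, a)
    print(f"{name:<11} a={a:<6} P(1)={lowest:.4f} P(N)={highest:.4f}")
```

For every a the positions are symmetric, that is P(1) + P(N) = 1, which is a quick check that the denominator N + 1 - 2a has been entered correctly.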

Page 6

Curve Fitting Methods

The method is based on the assumption that the observed data follow the theoretical distribution to be fitted and will therefore plot as a straight line on probability paper.

• Graphical curve fitting method

• Mathematical curve fitting methods:
– Method of moments
– Method of least squares
– Method of maximum likelihood

Page 7

Estimation of statistical parameters (2)

• Estimation procedures differ
• Comparison of quality by:
– mean square error or its root
– error variance and standard error
– bias
– efficiency
– consistency

• Mean square error of an estimate $\hat{\theta}$ of parameter $\theta$:

$$\mathrm{mse} = E[(\hat{\theta} - \theta)^2]$$

$$\mathrm{mse} = E[(\hat{\theta} - E[\hat{\theta}])^2] + E[(E[\hat{\theta}] - \theta)^2] = \sigma_{\hat{\theta}}^2 + b_{\hat{\theta}}^2$$

Page 8

Estimation of statistical parameters (3)

• Consequently:

$$\mathrm{mse} = \sigma_{\hat{\theta}}^2 + b_{\hat{\theta}}^2$$

– First part is the variance of $\hat{\theta}$, the average of squared differences about the expected mean; it gives the random portion of the error
– Second part is the square of the bias, where bias = systematic difference between expected and true mean; it gives the systematic portion of the error

• Root mean square error:

$$\mathrm{rmse} = \sqrt{E[(\hat{\theta} - \theta)^2]} = \sqrt{\sigma_{\hat{\theta}}^2 + b_{\hat{\theta}}^2}$$

• Standard error:

$$\sigma_{\hat{\theta}} = \sqrt{E[(\hat{\theta} - E[\hat{\theta}])^2]}$$

• Consistency:

$$\lim_{n \to \infty} \mathrm{Prob}(|\hat{\theta} - \theta| > \varepsilon) = 0 \quad \text{for any } \varepsilon > 0$$

Mind the effective number of data.
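The split of the mean square error into variance plus squared bias can be checked numerically. The sketch below is an illustration only: the choice of estimator (the variance estimator that divides by N), the parent distribution, the seed and the sample sizes are all assumptions made for the demonstration and do not come from the slides.

```python
import random
import statistics

random.seed(1)
TRUE_VAR = 4.0      # variance of the assumed normal parent distribution
N = 10              # sample size of each simulated experiment
EXPERIMENTS = 5000  # number of repeated experiments

def biased_variance(sample):
    """Variance estimator dividing by N (known to be biased low)."""
    m = sum(sample) / len(sample)
    return sum((v - m) ** 2 for v in sample) / len(sample)

estimates = [
    biased_variance([random.gauss(0.0, TRUE_VAR ** 0.5) for _ in range(N)])
    for _ in range(EXPERIMENTS)
]

expected = statistics.fmean(estimates)
bias = expected - TRUE_VAR                                             # systematic part
variance = statistics.fmean([(e - expected) ** 2 for e in estimates])  # random part
mse = statistics.fmean([(e - TRUE_VAR) ** 2 for e in estimates])

print(f"bias = {bias:.3f}, variance = {variance:.3f}, mse = {mse:.3f}")
print(f"variance + bias^2 = {variance + bias ** 2:.3f}")
```

Within the simulated set of estimates the identity holds exactly (up to rounding), and the negative bias shows the systematic underestimation of this particular estimator.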

Page 9

Graphical estimation

• Variable is a function of the reduced variate, e.g. for Gumbel, with location parameter $x_0$ and scale parameter $\beta$:

$$x = x_0 + \beta z$$

• Reduced variate is a function of the non-exceedance probability:

$$z = -\ln(-\ln(F_X(x)))$$

• Determine the non-exceedance probability from the rank number of the data in the ordered set, e.g. for Gumbel:

$$F_i = \frac{i - 0.44}{N + 0.12}$$

• Unbiased plotting position depends on distribution

Page 10

Graphical estimation (2)

• Procedure:
– rank observations in ascending order
– compute non-exceedance frequency Fi
– transform Fi into reduced variate zi
– plot xi versus zi
– draw straight line through points by eye-fitting
– estimate slope of line and intercept at z = 0 to find the parameters

Page 11

Graphical estimation: example

• Annual maximum river flow at Chooz on Meuse (Qmax and xi in m3/s; xi = ranked flows, Freq = Fi)

Year   Qmax   Rank   xi     Freq    zi        Year   Qmax   Rank   xi     Freq    zi
1968   386    1      274    0.019   -1.383    1983   1199   16     685    0.517   0.415
1969   910    2      295    0.052   -1.085    1984   675    17     690    0.550   0.514
1970   550    3      386    0.085   -0.902    1985   760    18     735    0.583   0.617
1971   274    4      406    0.118   -0.759    1986   735    19     760    0.616   0.725
1972   468    5      406    0.151   -0.635    1987   780    20     780    0.649   0.840
1973   406    6      423    0.185   -0.524    1988   660    21     785    0.683   0.963
1974   615    7      468    0.218   -0.421    1989   690    22     795    0.716   1.096
1975   295    8      491    0.251   -0.324    1990   1080   23     840    0.749   1.241
1976   795    9      550    0.284   -0.230    1991   491    24     860    0.782   1.404
1977   685    10     615    0.317   -0.138    1992   1135   25     910    0.815   1.589
1978   680    11     635    0.351   -0.047    1993   1510   26     1080   0.849   1.807
1979   785    12     642    0.384   0.043     1994   1527   27     1135   0.882   2.073
1980   635    13     660    0.417   0.134     1995   406    28     1199   0.915   2.421
1981   860    14     675    0.450   0.226     1996   642    29     1510   0.948   2.934
1982   840    15     680    0.483   0.319     1997   423    30     1527   0.981   3.976
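The Freq and zi columns of the Chooz table can be recomputed from the annual maxima with the Gringorten plotting position. In the sketch below the eye-fitted line of the graphical procedure is replaced by an ordinary least-squares line, which is a different technique used here only to obtain a reproducible slope and intercept. The function names are illustrative.

```python
import math

# Annual maximum flows at Chooz (m3/s), 1968-1997, from the Qmax column
QMAX = [386, 910, 550, 274, 468, 406, 615, 295, 795, 685,
        680, 785, 635, 860, 840, 1199, 675, 760, 735, 780,
        660, 690, 1080, 491, 1135, 1510, 1527, 406, 642, 423]

def gumbel_plot_coordinates(values, a=0.44):
    """Ranked values, plotting positions F_i and Gumbel reduced variates z_i."""
    x = sorted(values)
    n = len(x)
    f = [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]
    z = [-math.log(-math.log(fi)) for fi in f]
    return x, f, z

def least_squares_line(z, x):
    """Intercept (x at z = 0) and slope of the regression of x on z."""
    zm = sum(z) / len(z)
    xm = sum(x) / len(x)
    slope = (sum((zi - zm) * (xi - xm) for zi, xi in zip(z, x))
             / sum((zi - zm) ** 2 for zi in z))
    return xm - slope * zm, slope

x, f, z = gumbel_plot_coordinates(QMAX)
x0, beta = least_squares_line(z, x)
print(f"first row: F = {f[0]:.3f}, z = {z[0]:.3f}")
print(f"x0 = {x0:.0f} m3/s, beta = {beta:.0f} m3/s")
```

The least-squares parameters will not coincide exactly with an eye-fitted line, since the two methods weight the extreme points differently.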

Page 12

Figure: Graphical estimation. x = Qmax (m3/s, axis 0 to 1600) versus reduced variate z (axis -2 to 4). The fitted line gives x0 = 590 and β = 1200/4.85 = 247.

Page 13

Graphical estimation: example (2)

• Gumbel parameters:
– graphical estimation: x0 = 590, β = 247
– MLM-method: x0 = 591, β = 238

• 100-year flood:
– T = 100, so FX(x) = 1 - 1/100 = 0.99
– z = -ln(-ln(0.99)) = 4.6
– graphical method: x = x0 + βz = 590 + 247 × 4.6 = 1726 m3/s
– MLM method: x = x0 + βz = 591 + 238 × 4.6 = 1686 m3/s

• Graphical method: pros and cons
– easily made
– visual inspection of series
– strong subjective element in method: not preferred for design; only useful for first rough estimate
– confidence limits will be lacking
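The 100-year flood computation on this slide can be reproduced with a few lines. The parameter name `beta` for the Gumbel scale parameter and the function name are labels chosen for this sketch.

```python
import math

def gumbel_quantile(T, x0, beta):
    """Value with return period T for a Gumbel distribution (location x0, scale beta)."""
    f = 1.0 - 1.0 / T              # non-exceedance probability F_X(x)
    z = -math.log(-math.log(f))    # Gumbel reduced variate
    return x0 + beta * z

graphical = gumbel_quantile(100, x0=590, beta=247)
mlm = gumbel_quantile(100, x0=591, beta=238)
print(round(graphical), round(mlm))  # 1726 1686 (m3/s)
```

The two estimates differ by about 40 m3/s, which comes entirely from the difference in the fitted scale parameter, because z is large for long return periods.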

Page 14

Plotting positions

• Plotting positions should be:
– unbiased
– minimum variance

• General:

$$F_i = \frac{i - b}{N - 2b + 1}$$

Name of formula   b      distribution                       remarks
Hazen             0.5    -                                  For i = N: T = 2N
Weibull           0      -                                  biased
Blom              3/8    N, LN-2, LN-3, G-2 for large γ     LP-3: for γ1>0 b>3/8 and γ1<0 b<3/8
Chegodayev        0.3    various                            Overall compromise
Gringorten        0.44   EV-1, E-1, E-2, G-2
NERC              2/5    G-2, P-3
Tukey             1/3    -                                  Compromise plotting position

Page 15

Censoring of data

• Right censoring: eliminating data from analysis at the high side of the data set

• Left censoring: eliminating data from analysis at the low side of the data set

• Relative frequencies of the remaining data are left unchanged.

• Right censoring may be required because:
– extremes in data set have higher T than follows from series
– extremes may not be very accurate

• Left censoring may be required because:
– physics of lower part is not representative for higher values
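The point that censoring leaves the relative frequencies of the remaining data unchanged can be made concrete: plotting positions are computed from the full ranked series first, and only then are points dropped. This is a sketch under that reading of the slide; the function name, the argument names and the default constant a = 0.44 are assumptions.

```python
def censor(values, n_left=0, n_right=0, a=0.44):
    """Drop the n_left lowest and n_right highest points after assigning frequencies.

    Frequencies use the rank and size of the FULL series, so the retained
    points keep the plotting positions they had before censoring.
    """
    x = sorted(values)
    n = len(x)
    freq = [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]
    pairs = list(zip(x, freq))
    return pairs[n_left:n - n_right]

# Small made-up series, right-censored by one value, Weibull positions (a = 0)
kept = censor([5, 1, 4, 2, 3], n_right=1, a=0.0)
print(kept)
```

Recomputing the frequencies from the shortened series instead would shift every remaining point, which is what the slide warns against.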

Page 16

Figure: Quantile (m3/s, axis 0 to 1800) versus reduced variate (axis -2.0 to 5.0), with regions labelled "Right censoring" and "Left censoring".

Page 17

Quantile uncertainty and conf. limits (2)

• Confidence limits become:

$$x_{p,LCL} = x_p - z_{1-\alpha/2}\, s_X \sqrt{\frac{1}{N}\left(1 + \frac{z_p^2}{2}\right)} \qquad x_{p,UCL} = x_p + z_{1-\alpha/2}\, s_X \sqrt{\frac{1}{N}\left(1 + \frac{z_p^2}{2}\right)}$$

– CL diverge away from the mean
– Number of data N also determines width of CL

• Uncertainty in non-exceedance probability for a fixed xp:
– standard error of reduced variate:

$$z_p = \frac{x_p - \mu_X}{\sigma_X} \qquad \sigma_{z_p} = \frac{\sigma_{x_p}}{\sigma_X} \quad \text{estimated by} \quad s_{z_p} = \frac{s_{x_p}}{s_X}$$

• It follows with zp approx. $N(z_p, \sigma_{z_p})$, hence:

$$s_p = f_N(z_p)\, s_{z_p} = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_p^2}{2}\right) s_{z_p} = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_p^2}{2}\right) \frac{s_{x_p}}{s_X}$$

Page 18

Figure: Confidence limits for frequency distribution. Variate X (axis -20 to 220) versus reduced variate Z = (X - 100)/25 (axis -3 to 3), showing the line of best fit with LCL and UCL for n = 10, n = 50 and n = 100.
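The dependence of the confidence band on the number of data can be sketched for the normal distribution of this figure (mean 100, standard deviation 25), using the standard error s_X·sqrt((1 + z²/2)/N) for a normal quantile. The 95% confidence level is an assumption, since the figure does not state the level, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def confidence_band(z_p, mean=100.0, sd=25.0, n=10, conf=0.95):
    """Lower and upper confidence limit of the normal quantile at reduced variate z_p."""
    z_crit = NormalDist().inv_cdf(0.5 + conf / 2.0)      # e.g. 1.96 for 95%
    x_p = mean + sd * z_p
    s_xp = sd * math.sqrt((1.0 + 0.5 * z_p ** 2) / n)    # standard error of x_p
    return x_p - z_crit * s_xp, x_p + z_crit * s_xp

for n in (10, 50, 100):
    for z_p in (-3, 0, 3):
        lcl, ucl = confidence_band(z_p, n=n)
        print(f"n={n:<4} z={z_p:>2}  LCL={lcl:7.1f}  UCL={ucl:7.1f}  width={ucl - lcl:6.1f}")
```

The band is narrowest at the mean (z = 0), widens towards both tails, and shrinks in proportion to 1/sqrt(N).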

Page 19

Example rainfall Vagharoli

Annual rainfall Vagharoli
Period 1978 - 1997
Fitting the normal distribution function

Number of data = 20
Mean = 877.283
Standard deviation = 357.474
Skewness = -0.088
Kurtosis = 2.617


Page 20

Example rainfall Vagharoli (2)

Nr./year   observation   obs.freq.   theor.freq. p   theor.ret.per.   st.dev. xp   st.dev. p
10         232.000       .0309       .0355           1.04             129.6295     .0283
5          267.000       .0802       .0439           1.05             125.3182     .0325
9          505.000       .1296       .1488           1.17             99.2686      .0644
18         525.000       .1790       .1622           1.19             97.4253      .0669
15         606.000       .2284       .2240           1.29             90.7089      .0759
14         628.000       .2778       .2428           1.32             89.1161      .0780
7          649.580       .3272       .2621           1.36             87.6599      .0799
4          722.000       .3765       .3320           1.50             83.6122      .0849
11         849.400       .4259       .4689           1.88             80.0545      .0891
3          892.000       .4753       .5164           2.07             79.9673      .0892
16         924.000       .5247       .5520           2.23             80.2727      .0888
20         950.000       .5741       .5806           2.38             80.7532      .0883
19         1050.000      .6235       .6855           3.18             84.4622      .0839
6          1110.000      .6728       .7425           3.88             87.9885      .0795
12         1167.684      .7222       .7917           4.80             92.1776      .0740
8          1173.000      .7716       .7959           4.90             92.5994      .0734
13         1174.000      .8210       .7967           4.92             92.6794      .0733
2          1197.000      .8704       .8144           5.39             94.5736      .0708
1          1347.000      .9198       .9056           10.59            109.1187     .0513
17         1577.000      .9691       .9748           39.76            136.5096     .0224

Column notes:
– observation: ranked observations
– obs.freq.: Fi = (i - 3/8)/(N + 1/4)
– theor.freq. p: normal distribution FX(z) for z = (x - 877)/357
– theor.ret.per.: T = 1/(1 - FX(z))
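The three frequency columns of the Vagharoli table follow directly from the formulas noted on this slide. The sketch below recomputes them from the ranked observations with the sample mean and standard deviation quoted earlier (877.283 and 357.474). The function name is illustrative, and small differences in the last printed digit compared with the table are possible.

```python
from statistics import NormalDist

# Ranked annual rainfall (mm), Vagharoli 1978-1997, from the table
OBS = [232.0, 267.0, 505.0, 525.0, 606.0, 628.0, 649.58, 722.0, 849.4, 892.0,
       924.0, 950.0, 1050.0, 1110.0, 1167.684, 1173.0, 1174.0, 1197.0, 1347.0, 1577.0]
MEAN, SD = 877.283, 357.474

def frequency_rows(obs, mean, sd):
    """Per observation: (x, observed frequency, theoretical frequency, return period)."""
    dist = NormalDist(mean, sd)
    n = len(obs)
    rows = []
    for i, x in enumerate(sorted(obs), start=1):
        f_obs = (i - 3 / 8) / (n + 1 / 4)   # Blom plotting position
        p = dist.cdf(x)                     # F_X(z) of the fitted normal distribution
        rows.append((x, f_obs, p, 1.0 / (1.0 - p)))
    return rows

rows = frequency_rows(OBS, MEAN, SD)
for x, f_obs, p, t in rows[:3] + rows[-1:]:
    print(f"{x:9.3f} {f_obs:.4f} {p:.4f} {t:6.2f}")
```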

Page 21

Example rainfall Vagharoli (3)

$$s_{x_p} = s_X \sqrt{\frac{1}{N}\left(1 + \frac{z_p^2}{2}\right)}$$

hence for $x_p$ with p = 0.0355:

$$s_{x_p} = 357 \sqrt{\frac{1}{20}\left(1 + \frac{(-1.805)^2}{2}\right)} = 357 \times 0.363 = 129.6$$

$$s_p = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_p^2}{2}\right) \frac{s_{x_p}}{s_X}$$

hence for the first row:

$$s_p = \frac{1}{\sqrt{2 \times 3.14}} \exp\left(-\frac{(-1.805)^2}{2}\right) \frac{129.6}{357.5} = 0.0283$$
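The two standard errors worked out on this slide for the first table row (x = 232 mm) can be recomputed as follows. This is a sketch of the formulas as reconstructed from the slides; the function name is illustrative, and the last digits may differ slightly from the software output in the table.

```python
import math

MEAN, SD, N = 877.283, 357.474, 20   # sample statistics of the Vagharoli series

def standard_errors(x_p, mean=MEAN, sd=SD, n=N):
    """Return (z_p, s_xp, s_p) for a normal distribution fitted to n data."""
    z_p = (x_p - mean) / sd
    s_xp = sd * math.sqrt((1.0 + 0.5 * z_p ** 2) / n)    # standard error of the quantile
    s_zp = s_xp / sd                                      # standard error of the reduced variate
    density = math.exp(-0.5 * z_p ** 2) / math.sqrt(2.0 * math.pi)
    return z_p, s_xp, density * s_zp                      # s_p = f_N(z_p) * s_zp

z_p, s_xp, s_p = standard_errors(232.0)
print(f"z_p = {z_p:.3f}, s_xp = {s_xp:.1f} mm, s_p = {s_p:.4f}")
```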


Page 22

Example Vagharoli (4)

Values for distinct return periods (T = return period, p = FX(x) = 1 - 1/T)

Return per.   prob(xi<x) p   value x     st. dev. x   confidence intervals
                                                      lower       upper
2             .50000         877.283     79.934       720.582     1033.985
5             .80000         1178.082    93.013       995.740     1360.424
10            .90000         1335.468    107.878      1123.984    1546.952
25            .96000         1503.247    127.221      1253.844    1752.650
50            .98000         1611.602    140.961      1335.263    1887.941
100           .99000         1709.048    153.900      1407.343    2010.753
250           .99600         1825.469    169.899      1492.399    2158.539
500           .99800         1906.275    181.273      1550.908    2261.643
1000          .99900         1982.065    192.101      1605.471    2358.660
1250          .99920         2005.533    195.482      1622.312    2388.754
2500          .99960         2075.895    205.685      1672.672    2479.118
5000          .99980         2142.841    215.477      1720.421    2565.260
10000         .99990         2206.758    224.893      1765.878    2647.638

For T = 100 (p = 0.99, $z_p$ = 2.326, rounded to 2.33 on the slide):

$$x_p = m_X + s_X z_p, \quad \text{hence} \quad x_p = 877.3 + 357.5 \times 2.326 \approx 1709 \text{ mm}$$

$$s_{x_p} = s_X \sqrt{\frac{1}{N}\left(1 + \frac{z_p^2}{2}\right)} = 357.5 \sqrt{\frac{1}{20}\left(1 + \frac{2.326^2}{2}\right)} = 153.9 \text{ mm}$$

$$x_{p,LCL} = x_p - 1.96\, s_{x_p} = 1709 - 1.96 \times 153.9 = 1407.3 \text{ mm}$$

$$x_{p,UCL} = x_p + 1.96\, s_{x_p} = 1709 + 1.96 \times 153.9 = 2010.7 \text{ mm}$$
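The table of quantiles and 95% confidence limits can be regenerated from the sample statistics. This sketch uses the exact inverse normal distribution from the Python standard library, so values may differ from the table by a few tenths of a millimetre; the function name is illustrative.

```python
import math
from statistics import NormalDist

MEAN, SD, N = 877.283, 357.474, 20   # sample statistics of the Vagharoli series

def quantile_with_limits(T, mean=MEAN, sd=SD, n=N, conf=0.95):
    """Return (p, x_p, s_xp, lower, upper) for return period T (normal distribution)."""
    std_normal = NormalDist()
    p = 1.0 - 1.0 / T                                    # F_X(x) = 1 - 1/T
    z_p = std_normal.inv_cdf(p)
    x_p = mean + sd * z_p
    s_xp = sd * math.sqrt((1.0 + 0.5 * z_p ** 2) / n)
    z_crit = std_normal.inv_cdf(0.5 + conf / 2.0)        # about 1.96 for 95%
    return p, x_p, s_xp, x_p - z_crit * s_xp, x_p + z_crit * s_xp

for T in (2, 5, 10, 25, 50, 100, 1000):
    p, x_p, s_xp, lower, upper = quantile_with_limits(T)
    print(f"{T:>5} {p:.5f} {x_p:9.3f} {s_xp:8.3f} {lower:9.3f} {upper:9.3f}")
```

With only 20 years of data the interval for the 100-year value spans roughly 600 mm, which illustrates how quickly the uncertainty grows when extrapolating beyond the record length.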

Page 23

Figure: Fit of normal distribution to annual rainfall of Vagharoli. Normal distribution, mx = 877.28, sx = 357.47, 95% confidence interval. Rainfall (mm, axis 400 to 2600) versus frequency (0.2 to 0.9999) and return period (1.25 to 10,000). Legend: regression line, reduced variate, observed frequencies, lower confidence limit data, upper confidence limit data.

Page 24

Investigating homogeneity

• Prior to fitting, tests are required on:
– 1. stationarity (properties do not vary with time)
– 2. homogeneity (all elements are from the same population)
– 3. randomness (all series elements are independent)
– The first two conditions are transparent and obvious. Violating the last condition means that the effective number of data is reduced when the data are correlated
– lack of randomness may have several causes; in case of a trend there will be serial correlation

• HYMOS includes numerous statistical tests:
– parametric (sample taken from an approximately normal distribution)
– non-parametric or distribution-free tests (no conditions on the distribution, which may negatively affect the power of the test)

Page 25

Summary of tests

• On randomness:
– median run test
– turning point test
– difference sign test

• On correlation:
– Spearman rank correlation test
– Spearman rank trend test
– Arithmetic serial correlation coefficient
– Linear trend test

• On homogeneity:
– Wilcoxon-Mann-Whitney U-test
– Student t-test
– Wilcoxon W-test
– Rescaled adjusted range test
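As an illustration of one of the randomness tests listed here, the sketch below implements a turning point test in its common textbook form: for a random series of N values the number of turning points is approximately normal with mean 2(N-2)/3 and variance (16N-29)/90. The slides do not describe how HYMOS implements the test, so this formulation and the function name are assumptions.

```python
import math

def turning_point_test(series):
    """Return (turning points, expected number, standardised statistic u)."""
    n = len(series)
    turns = sum(
        1
        for a, b, c in zip(series, series[1:], series[2:])
        if (b > a and b > c) or (b < a and b < c)    # local peak or local trough
    )
    expected = 2.0 * (n - 2) / 3.0
    variance = (16.0 * n - 29.0) / 90.0
    return turns, expected, (turns - expected) / math.sqrt(variance)

# A steadily rising (made-up) series has no turning points at all
turns, expected, u = turning_point_test(list(range(20)))
print(turns, round(expected, 1), round(u, 2))
```

At the 5% level the hypothesis of randomness would be rejected when |u| exceeds 1.96, as it does for the trending series in the example.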

Page 26

Chi-square goodness of fit test

• Hypothesis:
– F(x) is the distribution function of a population from which sample xi, i = 1,…,N is taken
– Actual and theoretical numbers of occurrences within given classes are compared

• Procedure:
– the data set is divided into k class intervals, each containing at least 5 values
– class limits are chosen so that all classes have equal probability: pj = 1/k = F(zj) - F(zj-1); e.g. for 5 classes the upper class limits are at F = 0.20, 0.40, 0.60, 0.80 and 1.00
– the interval j contains all xi with: UC(j-1) < xi ≤ UC(j)
– the number of samples falling in class j, bj, is computed
– the number of values expected in class j according to the theoretical distribution, ej, is computed
– the theoretical number of values in any class is N/k because of the equal probability in each class

Page 27

Figure: Chi-squared goodness of fit test. Density of the reduced variate z with class j between zj-1 and zj; the area of class j is pj = F(zj) - F(zj-1). If Z = (X - μ)/σ, the upper class limit of class j is Uc(j) = μ + σ·zj.

Page 28

Chi-square goodness of fit test (2)

– Consider the following test statistic:

$$\chi_c^2 = \sum_{j=1}^{k} \frac{(b_j - e_j)^2}{e_j}$$

– under H0 the test statistic has a $\chi^2$ distribution with $\nu = k - 1 - m$ degrees of freedom
– k = number of classes, m = number of parameters
– simplified test statistic:

$$\chi_c^2 = \sum_{j=1}^{k} \frac{(b_j - N/k)^2}{N/k} = \frac{k}{N} \sum_{j=1}^{k} b_j^2 - N$$

– H0 is not rejected at significance level $\alpha$ if:

$$\chi_c^2 < \chi^2_{\nu,\,1-\alpha} \quad \text{with} \quad \nu = k - 1 - m$$

Page 29

Number of classes in Chi-squared goodness of fit test

N         k      N           k      N             k
20-29     5      100-199     13     800-999       27
30-39     7      200-399     16     1000-1499     30
40-49     9      400-599     20     1500-1999     35
50-99     10     600-799     24     ≥ 2000        39

Page 30

Example

– Annual rainfall Vagharoli (see parameter estimation)
– test on applicability of normal distribution
– 4 class intervals were assumed (20 data)
– upper class levels are at p = 0.25, 0.50, 0.75 and 1.00
– the reduced variates are at -0.674, 0.00, 0.674 and ∞
– hence with mean = 877 and stdv = 357 the class limits become:
  877 - 0.674 × 357 = 636
  877 = 877
  877 + 0.674 × 357 = 1118

Results of Chi-Square test
variate = chi-square = 1.2000
prob. of exceedance of variate = .2733
number of classes = 4
number of observations = 20
degrees of freedom = 1

Page 31

Example continued (2)

Non-exc. probability     Reduced variate          Class intervals     Number of          bj²
of upper class limits    of upper class limits    expressed in mm     occurrences bj
0.25                     -0.67                    0-636               6                  36
0.50                     0.00                     637-877             3                  9
0.75                     0.67                     878-1118            5                  25
1.00                     ∞                        1119-               6                  36
sum                                                                                      106

From the table it follows for the test statistic:

$$\chi_c^2 = \frac{4}{20} \times 106 - 20 = 1.2$$

At significance level α = 5%, according to the Chi-squared distribution for ν = 4-1-2 = 1 degree of freedom, the critical value is 3.84; hence $\chi_c^2$ < critical value, so H0 is not rejected.
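The worked example can be retraced in code: equal-probability class limits from the fitted normal distribution, class counts, the simplified test statistic and, for one degree of freedom, the exceedance probability. The data are the ranked Vagharoli observations from the earlier table. The function names are illustrative, and the closed form erfc(sqrt(x/2)) for the exceedance probability is valid only for one degree of freedom.

```python
import math
from statistics import NormalDist

# Annual rainfall (mm), Vagharoli 1978-1997, ranked
OBS = [232.0, 267.0, 505.0, 525.0, 606.0, 628.0, 649.58, 722.0, 849.4, 892.0,
       924.0, 950.0, 1050.0, 1110.0, 1167.684, 1173.0, 1174.0, 1197.0, 1347.0, 1577.0]
MEAN, SD = 877.283, 357.474

def chi_square_equal_probability(obs, mean, sd, k):
    """Class limits, class counts b_j and simplified statistic (k/N)*sum(b_j^2) - N."""
    dist = NormalDist(mean, sd)
    n = len(obs)
    limits = [dist.inv_cdf(j / k) for j in range(1, k)]   # upper limits of classes 1..k-1
    counts = [0] * k
    for x in obs:
        j = sum(1 for u in limits if x > u)               # class with UC(j-1) < x <= UC(j)
        counts[j] += 1
    chi2 = k / n * sum(b * b for b in counts) - n
    return limits, counts, chi2

def chi2_exceedance_1df(x):
    """P(chi-square with 1 degree of freedom > x)."""
    return math.erfc(math.sqrt(x / 2.0))

limits, counts, chi2 = chi_square_equal_probability(OBS, MEAN, SD, k=4)
p_exceed = chi2_exceedance_1df(chi2)
print([round(u) for u in limits], counts, round(chi2, 4), round(p_exceed, 4))
```

The counts and the statistic agree with the table (6, 3, 5, 6 and 1.2), and the exceedance probability of about 0.27 is well above 0.05, consistent with not rejecting H0.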