extreme-value analysis of corrosion data

38
Motivation Objective of the Thesis Methods used Data declustering Examples of application Framework for modelling extremes of corrosion Conclusions and recommendations Extreme-Value Analysis of Corrosion Data Marcin Glegola July 16, 2007 Supervisors: Prof. Dr. Ir. Jan M. van Noortwijk MSc Ir. Sebastian Kuniewski Dr. Marco Giannitrapani Marcin Glegola Extreme-Value Analysis of Corrosion Data

Upload: others

Post on 01-May-2022

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Extreme-Value Analysis of Corrosion Data

Marcin Glegola

July 16, 2007

Supervisors:Prof. Dr. Ir. Jan M. van NoortwijkMSc Ir. Sebastian KuniewskiDr. Marco Giannitrapani

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 2: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Outline

Motivation

Objective of the Thesis

Methods used

Data declustering

Examples of application

Framework for modelling extremes of corrosion

Conclusions and recommendations

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 3: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Motivation

◮ in the oil industry, hundreds of kilometres ofpipes and other equipment can be affected by

corrosion

◮ it is extreme defect depth/wall loss thatinfluences the system reliability ⇒ extreme-value

methods are sensible for application

◮ only part of the system can be subjected to

inspection ⇒ results extrapolation is needed

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 4: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Motivation

◮ defects/wall losses caused by corrosion are likelyto be locally dependent ⇒ independence

assumption questionable

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 5: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Objective of the Thesis

◮ present statistical methods to model extremes ofcorrosion data, taking into account local defectdependence

◮ spatial extrapolation of the results

◮ examples of application

◮ framework/guideline for modellingextreme-values of corrosion

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 6: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods usedThe Generalised Extreme-Value (GEV) distribution (blockmaxima data)

G (z) =

exp

(

−h

1 + ξ“ z − µ

σ

”i

+

)

, ξ 6= 0

expn

− exph

−“ z − µ

σ

”io

, ξ = 0,

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 7: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods used

The Generalised-Pareto (GP) distribution (excess overthreshold data)

Yi = Xi − u, for X > u, i = 1, . . . , nu

0 2 4 6 8 10 12 14 16 18 200

0.5

1

1.5

2

2.5

u

3.5

4

x i

i

Threshold exceedances

H(y) =

1 −

»

1 +ξy

σ̄

−1/ξ

+

, ξ 6= 0

1 − exp“

−y

σ̄

, ξ = 0

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 8: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods used (extrapolation-GEV)◮ return-level method

G (zp) = 1 − p ⇔ Pr{M > zp} = p =1

N

◮ implied distribution of the maximum corresponding to the notinspected area

Pr{XN ≤ z} = GN(z) = G (z)N

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 9: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods used (extrapolation-GP)

based on Poisson frequency of threshold exceedances (Poisson-GPmodel)

◮ return-level method

H(yp) = 1 − p ⇔ Pr{Y > yp} = p =1

NE

NE = λGEV × (|S | − k × |A|) - expected number ofexceedances on the not inspected area

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 10: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods used (extrapolation-GP)

◮ implied distribution of the maximum corresponding to the notinspected area

Pr{XN ≤ x} = exp

{

−λGEV

(

1 + ξx − u

σ̄

)

−1/ξ

+

}N

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 11: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Methods used-summary

◮ two methods for statistical inference about extreme-valuesof corrosion

- the GEV distribution (block maxima)- the GP distribution (excess over threshold data)

◮ two methods for spatial results extrapolation- return-level- distribution of the maximum corresponding to the not

inspected area

◮ the GEV and GP distributions are closely related andtheoretically, should give the same results

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 12: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Modelling extremes of dependent data

◮ for the stationary data characterised by the limited extendof long-range dependence at extreme levels, theextreme-value methods can be still applied

◮ in corrosion application it is reasonable to assume that pitdepths are locally dependent ⇒ extreme-value methodsare applicable

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 13: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Modelling extremes of dependent data with the GEV

distribution

◮ assuming local data dependence, block maxima ofstationary data (for sufficiently large block sizes) can beconsidered as approximately independent

◮ the GEV distribution is used in its standard form

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 14: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Modelling extremes of dependent data with the GP

distribution

◮ neighbouring exceedances maybe dependent, therefore thechange of practise is needed

◮ one of the most widely adoptedmethod is data declustering -filtering out dependentobservations such that remainingexceedances can be consideredas approximately independent

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 15: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Modelling extremes of dependent data with the GP

distribution-approach

◮ define clusters of exceedances

◮ identify maximum excess withineach cluster

◮ assuming that cluster maximaare independent fit the GPdistribution

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 16: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering

Estimation of the number of data clusters

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 17: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering

Extremal index method:

◮ the extent of short-range dependence of extremeevents is captured by the parameter θ, calledextremal index

θ =1

limiting mean cluster size

◮ extremal index measures the degree of clustering ofthe process at extreme levels

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 18: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering

Extremal index method

θ =1

limiting mean cluster size⇒ Nc = θ × Ne

where Nc - number of clusters, Ne - number of exceedances abovethreshold uθ is estimated by the intervals estimator

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 19: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering

Finding number of clusters using clustering algorithm

◮ define a validity criteria for the number Nc of foundclusters

◮ run the clustering algorithm for a range of Nc

◮ as proper number of clusters choose the one for which thevalidity criteria are optimised

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 20: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Agglomerative hierarchical clustering algorithm

◮ starts with single points as clusters

◮ at each step the two closest clusters are merged

◮ stops when only one cluster remains

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 21: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Agglomerative hierarchical clustering algorithm

0 1 2 3 4 5 6 70

1

2

3

4

5

6

7

1

2

3

4

5

distance

0 1 2 3 4 5 6 70

1

2

3

4

5

6

7

1

2

3

4

5

distance

6

7

8

9

2 3 1 4 50

0.5

1

1.5

2

Objects

Dis

tanc

e be

twee

n ob

ject

s

Dendrogram

distance=heightof the link

link

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 22: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering

Validation criteria, that we used, aim at identifying clustersthat are compact and well isolated:

◮ silhouette plot - maximum value indicates optimum

◮ Davies-Bouldin index - minimum indicates optimum

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 23: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Data declustering-summary

◮ two methods to estimate the number of data clusters- the extremal index method- clustering algorithm + validation criteria (Davies-Bouldin

index, silhouette plot)

◮ prior to data declustering perform clustering tendency test

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 24: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Simulated corroded surface

◮ application of the gamma-process model◮ dependence in terms of the product moment correlation

ρ(Xk , Xl) = exp

−d

(

2∑

i=1

|disti |p

)q/p

Xk = (xk , yk ), Xl = (xl , yl ),dist1 = |xk − xl |, dist2 = |yk − yl |,d = 0.3, p = 2, q = 1

0 1 2 3 4 5 6 70

1

2

3

4

5

6

7

1

2

3

4

5

distance

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 25: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Simulated corroded surface

0 20 40 60 80 1000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

x

Cor

rela

tion

coef

ficie

nt

Matrix 1

Matrix 10Matrix 9Matrix 8Matrix 7

Matrix 5Matrix 4Matrix 3Matrix 2

Matrix 6

Matrix 20Matrix 19Matrix 18Matrix 17Matrix 16

Matrix 15Matrix 14Matrix 13Matrix 12Matrix 11

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 26: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Simulated corroded surface

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 27: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Simulated corroded surface - clustering algorithm

20 40 60 80 100 120 1400.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2Davies−Bouldin index vs number of clusters

Number of clusters

Dav

ies−

Bou

ldin

inde

x

20 40 60 80 100 120 1400.65

0.7

0.75

0.8

0.85

0.9

0.95

1 Silhouette plot

Number of clusters

Ave

rage

silh

ouet

te c

oeffi

cien

t

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 28: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Simulated corroded surface - results

θ̂ nc ncalg

0.544 107 121

Table: The estimate of extremal index and determined number of clusters

Number of clusters AD2up p − v . KS p − v .

107 0.376 0.204

121 0.549 0.493

Table: Goodness-of-fit test results for different number of clusters

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 29: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

ξ̂ ˆ̄σ AD2up p − v. KS p − v.

0.065 1.341 0.647 0.488

Table: GP fit to excess of dependent data

ξ̂ ˆ̄σ AD2up p − v. KS p − v.

-0.0137 1.6839 0.555 0.493

Table: GP fit to excess of declustered data

ξ̂ σ̂ µ̂ AD2up p − v. KS p − v.

-0.007 1.627 5.637 0.621 0.899

Table: GEV fit to block maxima data

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 30: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Real data example

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 31: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Real data example

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 32: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Real data example

20 40 60 80 100 120 140 1600.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8Davies−Bouldin index vs number of clusters

Number of clusters

Dav

ies−

Bou

ldin

inde

x

20 40 60 80 100 120 140 1600.7

0.75

0.8

0.85

0.9

0.95

1 Silhouette plot

Number of clusters

Ave

rage

silh

ouet

te c

oeffi

cien

t

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 33: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

ξ̂ σ̂ µ̂ AD2up p − v. KS p − v.

-0.082 0.182 0.757 0.713 0.986

Table: GEV fit to block maxima data

ξ̂ ˆ̄σ AD2up p − v. KS p − v.

-0.008 0.133 0.616 0.074

Table: GP fit to excess of dependent data

ξ̂ ˆ̄σ AD2up p − v. KS p − v.

-0.069 0.172 0.717 0.654

Table: GP fit to excess of declustered data

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 34: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

0.5 1 1.5 2 2.5 30

0.5

1

1.5

2

2.5

3

3.5

4

Extreme pit depth [mm]

Den

sity

Extrapolated probability density function

GP−basedGEV

0.5 1 1.5 2 2.5 30

0.5

1

1.5

2

2.5

3

3.5

4

Extreme pit depth [mm]

Den

sity

Extrapolated probability density function

GP−basedGEV

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 35: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Framework for modelling extremes of corrosion with the GEV distribution

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 36: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Framework for modelling extremes of corrosion with the GP distribution

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 37: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

Conclusions and recommendations

◮ the two applied distributions are closely related and lead tothe consistent inference about extreme-values of corrosion

◮ data declustering improves the results given by the GPdistribution

◮ the performance of other clustering algorithms could bechecked

◮ in order to take into account corrosion nonstationarity dueto space-varying environmental conditions,covariate-dependent extreme-value models with trendscould be considered

Marcin Glegola Extreme-Value Analysis of Corrosion Data

Page 38: Extreme-Value Analysis of Corrosion Data

MotivationObjective of the Thesis

Methods usedData declustering

Examples of applicationFramework for modelling extremes of corrosion

Conclusions and recommendations

THANK YOU FOR ATTENTION

Questions???

Marcin Glegola Extreme-Value Analysis of Corrosion Data