Degree Project Mattias Nilsson 2010-06-14 Subject: Mathematics Level: Master Course code: 5MA11E Tail Estimation for Large Insurance Claims, an Extreme Value Approach.


Degree Project 

Mattias Nilsson 2010-06-14 Subject: Mathematics Level: Master Course code: 5MA11E

Tail Estimation for Large Insurance Claims, an Extreme Value Approach.


Mattias Nilsson

Tail Estimation for Large Insurance Claims, an Extreme Value Approach

Master thesis

Mathematics

2010


Abstract

In this thesis, extreme value theory is used to estimate the probability that large insurance claims exceed a certain threshold. The expected claim size, given that the claim has exceeded a certain limit, is also estimated. Two different models are used for this purpose. The first model is based on maximum domain of attraction conditions. A Pareto distribution is used in the other model. Different graphical tools are used to check the validity of both models. Länsförsäkring Kronoberg has provided the insurance data used in the study.

The conclusions drawn are that both models seem to be valid and that the results from both models are essentially equal.

Key-words: Extremal theory, extreme value distribution, maximum domain of attraction, the Hill estimator, Pareto distribution, large insurance claims.

Summary (translated from the Swedish "Sammanfattning")

In this work, extreme value theory is used to estimate the probability that large insurance claims exceed a certain level. The expected size of the claim, given that the claim exceeds a certain amount, is also estimated. Two different models are used. The first model builds on the assumption that the underlying random variables belong to the maximum domain of attraction of an extreme value distribution. In the second model a Pareto distribution is used. Different graphical tools are used to judge the validity of the models. Länsförsäkring Kronoberg has supplied the insurance data needed to carry out the study.

The conclusions drawn are that both models appear to be valid and that the results are equivalent.

Keywords: Extreme value theory, extreme value distribution, maximum domain of attraction assumption, Hill, Pareto distribution, large insurance claims.

Acknowledgments

I would like to thank my supervisor Astrid Hilbert at Linnaeus University for answering my questions concerning the thesis and for introducing me to appropriate literature. I would also like to thank Carl-Axel Rudhe at Länsförsäkring Kronoberg for introducing the topic to me and for giving me the opportunity to work with real insurance data.


Contents

1 Introduction
1.1 Background
1.2 Purpose of this paper
1.3 The insurance data
1.4 Plan of this paper

2 Extremal events

3 Extreme value theory

4 Statistical methods
4.1 Autocorrelation and test for independence
4.2 Maximum likelihood estimation
4.3 The Pareto distribution
4.4 Quantile plot
4.5 Mean excess function
4.6 Estimation of the shape parameter
4.7 Tail and quantile estimation under MDA conditions
4.8 Expected claim size

5 Estimating the shape parameter from a Hill plot
5.1 Explanation of the method
5.2 Testing the model
5.3 Summary

6 Analysing the data
6.1 Maximum domain of attraction
6.2 Pareto

7 Result
7.1 Maximum domain of attraction
7.2 Pareto

8 Discussion

9 Conclusion

References

A Program code


1 Introduction

In this paper, applied mathematics is used to solve a problem taken from the insurance industry. Different statistical methods are used, but the focus lies on extreme value theory.

1.1 Background

Länsförsäkringar is a nationwide insurance group in Sweden consisting of 24 independent companies. The group has a reinsurance system through which the companies can reinsure against big losses: in the event of a big loss, the other 23 companies help to pay for the damage. One of the independent insurance companies is Länsförsäkring Kronoberg, which is the company this report focuses on. The company must choose a limit that separates the part of a claim it pays itself from the part covered by the reinsurance. This limit is called the priority and has to be chosen each year. The company pays a reinsurance premium according to its choice of priority; a high priority level is cheaper than a low one. All claims below the priority are paid in full by the insurance company. For claims above the priority level, the reinsurance company pays only the amount above that level. For example, if the priority level is 4 million SEK and the claim is 6 million SEK, then the insurance company pays 4 million SEK and the reinsurance company pays 2 million SEK.
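The split of a claim at the priority level can be written down in a few lines. The following is only a minimal sketch of the rule described above; the function name `split_claim` and the use of million SEK as the unit are illustrative assumptions, not part of the thesis.

```python
def split_claim(claim, priority):
    """Split a claim between the insurer and the reinsurer (hypothetical helper).

    The insurer pays the claim up to the priority level, and the
    reinsurer pays only the part of the claim above that level.
    """
    insurer_share = min(claim, priority)
    reinsurer_share = max(claim - priority, 0.0)
    return insurer_share, reinsurer_share


# The example from the text: priority 4 million SEK, claim 6 million SEK.
print(split_claim(6.0, 4.0))  # (4.0, 2.0)
```

A claim below the priority leaves the reinsurer's share at zero, which is why only the large claims matter for the choice of priority.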

1.2 Purpose of this paper

The purpose of this paper is to deliver material to Länsförsäkring Kronoberg so that they can more easily decide on an appropriate priority limit. The insurance company wants to know the probability that a claim exceeds a certain level. To estimate these probabilities the insurance company has provided us with insurance data. We use two different models to estimate the exceedance probabilities. The first model is based on fitting the data under maximum domain of attraction conditions. In the other model the Pareto distribution is fitted to the data. The results from both models are compared.

1.3 The insurance data

The insurance data come from Länsförsäkring Kronoberg and are based on fire and responsibility accidents in Kronoberg. The data consist of claims larger than 1 million SEK between the years 2000 and 2009. The largest claim is 87.7 million SEK, the second largest claim is 36.4 million SEK and the smallest claim is 1.002 million SEK. It is worth mentioning that the largest claim is 2.4 times larger than the second largest claim, while the second largest claim is only 1.1 times larger than the third largest claim. The claims are not spread evenly. The sample consists of 219 observations.

1.4 Plan of this paper

This paper starts with an explanation of what an extremal event is and why extreme value theory is studied in this paper. The next section contains the theory, where definitions and theorems are presented. Statistical methods are presented in the section after that. An algorithm that automatically chooses a value from the Hill plot is then constructed and empirically tested. The paper continues with an analysis of the insurance data, after which the results are presented and discussed. The thesis ends with a summary of conclusions, open questions and suggestions for how to improve the paper.


2 Extremal events

This section explains what an extremal event is and presents some examples. It also explains why extreme value theory is studied in this paper.

An extremal event is a rare event which is difficult to predict. Extremal events usually have a considerable financial impact on the insurance industry. One can usually divide extremal events into two categories: disasters caused by nature and disasters caused by humans. Hurricanes, earthquakes, storms and tornados are examples of catastrophes caused by nature, while stock market crashes are caused by humans.

The techniques for statistically analysing extremal events can be used in many fields. A government might, for example, be interested in building a dyke to prevent a flood disaster. To know how to build this dyke, it needs to estimate, based on previous observations, how often a flood of a particular height can occur. Another example is that insurance companies are interested in determining next year's premium based on the events of recent years. If the premium is too low and extremal events occur, there is a risk that the company goes bankrupt. A financial example is that holders of portfolios sometimes need to set a maximal limit on the potential loss. Another motivation for studying extremal events in finance is that there is usually time to react to a series of small losses, whereas an extremal loss can immediately lead to bankruptcy; see Embrechts et al (1997).

In this paper we study basic extreme value theory and apply it to the insurance business. It is of course possible to apply the theory to other areas, see for example Embrechts et al (1997) for inspiration. As already mentioned, the data are based on fire and responsibility accidents. Fire accidents are good examples of extremal events. Accidents caused by fire do not occur very often, and an insurance company usually has to pay a considerable amount of money to those affected after a fire.

Länsförsäkring Kronoberg receives a large number of claims every year. Some claims are small and some are large, and there is no upper limit on the claim size. Mathematically speaking, this suggests that the insurance data follow a heavy-tailed distribution. Among the extreme value distributions, the Fréchet distribution is heavy tailed. The choice of the priority limit depends only on the large claims, so the company is interested in the claims occurring in the tail of the distribution. It is possible that large insurance claims follow the tail of an extreme value distribution. To see whether this is the case we have to study extreme value theory. It turns out that, once the theory is known, there are statistical techniques for estimating the tail of an extreme value distribution. Such techniques and other statistical techniques are discussed in Section 4. If we can estimate the behaviour of the tail, then we can calculate exceedance probabilities. This can help Länsförsäkring Kronoberg to decide on an appropriate priority limit. But first, some extreme value theory has to be presented.


3 Extreme value theory

This section contains a summary of the theoretical background needed to study extremal events. The theory is the same regardless of whether it is applied to insurance or finance. Definitions, theorems and propositions are presented. All theory is taken from published books or papers. Some theorems are given without proof, because in some cases the proofs are very extensive. The interested reader can find the missing proofs through the reference list. In most cases the proofs given here are more detailed than in the original literature, in order to fit the level of a master thesis.

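In the analysis of extremal events there are precisely three standard extreme value distributions, namely Fréchet, Weibull and Gumbel. Their distribution functions are given below and can be found in Embrechts et al (1997), page 121:

\[
\begin{aligned}
\text{Fréchet:}\quad & \Phi_\alpha(x) = \begin{cases} 0, & x \le 0, \\ \exp\{-x^{-\alpha}\}, & x > 0, \end{cases} \qquad \alpha > 0, \\
\text{Weibull:}\quad & \Psi_\alpha(x) = \begin{cases} \exp\{-(-x)^{\alpha}\}, & x \le 0, \\ 1, & x > 0, \end{cases} \qquad \alpha > 0, \\
\text{Gumbel:}\quad & \Lambda(x) = \exp\{-e^{-x}\}, \qquad x \in \mathbb{R}.
\end{aligned}
\tag{3.1}
\]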

Instead of having three standard cases for the extreme value distributions, it can be useful to have a one-parameter representation that covers all three cases. This can be done by introducing a new parameter $\xi$ such that for $\xi = 0$ the corresponding distribution is Gumbel. If $\xi = 1/\alpha > 0$ the corresponding distribution is Fréchet, and if $\xi = -1/\alpha < 0$ the corresponding distribution is Weibull. With this in mind, we define the standard generalised extreme value distribution $H_\xi$ as

\[
H_\xi(x) = \begin{cases} \exp\{-(1+\xi x)^{-1/\xi}\} & \text{if } \xi \neq 0, \\ \exp\{-\exp\{-x\}\} & \text{if } \xi = 0, \end{cases}
\]

where $1 + \xi x > 0$. Extremal events are naturally related to the tail of a distribution, and they usually occur rarely.
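To make the one-parameter representation concrete, the following sketch evaluates the standard generalised extreme value distribution function and checks numerically that small values of the shape parameter approach the Gumbel case. The function name `gev_cdf` and the treatment of points outside the support are illustrative choices, not taken from the thesis.

```python
import math


def gev_cdf(x, xi):
    """Standard generalised extreme value distribution function H_xi(x)."""
    if xi == 0.0:
        # Gumbel case.
        return math.exp(-math.exp(-x))
    z = 1.0 + xi * x
    if z <= 0.0:
        # Outside the support: below the left endpoint when xi > 0,
        # above the right endpoint when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / xi))


# A small positive shape parameter gives values close to the Gumbel case.
print(gev_cdf(1.0, 1e-6), gev_cdf(1.0, 0.0))
```

With a shape parameter of 0.5 (that is, an index of 2) the value at x = 2 equals the Fréchet value at 1 + x/2 = 2, namely exp(-1/4), which illustrates how the Fréchet case is embedded in the family.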

Since the insurance data contain only claims exceeding 1 million SEK, techniques for estimating the tail of the distribution have to be presented. If a random variable $X$ belongs to a maximum domain of attraction, it means that, after rescaling, the distribution function of $X$ approximately coincides with one of those given in (3.1) in the tail region. The tail of the distribution function $F$ is defined as $\overline{F} = 1 - F$. We can make the following definition.

Definition 3.1. A random variable $X$, or a distribution function $F$ of $X$, belongs to the maximum domain of attraction of the extreme value distribution $H$ if there exist norming constants $c_n > 0$ and $d_n \in \mathbb{R}$ such that

\[
\frac{M_n - d_n}{c_n} \xrightarrow{d} H \tag{3.2}
\]

holds, where $M_n = \max(X_1, \ldots, X_n)$ and $X_1, \ldots, X_n$ are independent identically distributed random variables. If (3.2) holds we write $X \in \mathrm{MDA}(H)$. See Embrechts et al (1997), page 128.

The Pareto and the Loggamma distributions are examples of distributions that belong to $\mathrm{MDA}(\Phi_\alpha)$. The Uniform distribution on the interval $(0,1)$ and the Beta distribution belong to $\mathrm{MDA}(\Psi_\alpha)$. Two examples of distributions that belong to $\mathrm{MDA}(\Lambda)$ are the Gamma and the Lognormal distributions. These and more examples of distributions in $\mathrm{MDA}(H)$ can be found in Embrechts et al (1997), pages 153-157.

By using the assumption that the underlying random variables $X_1, \ldots, X_n$ are independent, and by noting that the extreme value distribution functions are continuous on $\mathbb{R}$, equation (3.2) can be reformulated as

\[
\begin{aligned}
\lim_{n\to\infty} P(M_n \le c_n x + d_n) &= \lim_{n\to\infty} P(X_1 \le c_n x + d_n, \ldots, X_n \le c_n x + d_n) \\
&= \lim_{n\to\infty} F^n(c_n x + d_n) \\
&= H(x), \qquad x \in \mathbb{R}.
\end{aligned}
\]

To understand what (3.2) means, compare it with the central limit theorem: the sum is replaced by a maximum, and instead of a normal distribution the limit is an extreme value distribution.
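As a numerical illustration of the limit of $F^n(c_n x + d_n)$, the sketch below uses a Pareto distribution with $F(x) = 1 - x^{-\alpha}$ for $x \ge 1$, together with the norming constants $c_n = n^{1/\alpha}$ and $d_n = 0$. This particular distribution, the constants and all function names are assumptions made for the example; the gap to the Fréchet distribution function should shrink as $n$ grows.

```python
import math


def pareto_cdf(x, alpha):
    """Pareto distribution function F(x) = 1 - x**(-alpha) for x >= 1 (assumed example)."""
    return 1.0 - x ** (-alpha) if x >= 1.0 else 0.0


def frechet_cdf(x, alpha):
    """Frechet distribution function from (3.1)."""
    return math.exp(-(x ** (-alpha))) if x > 0.0 else 0.0


def normalised_max_cdf(x, n, alpha):
    """P(M_n <= c_n * x) = F(c_n * x)**n with c_n = n**(1/alpha) and d_n = 0."""
    c_n = n ** (1.0 / alpha)
    return pareto_cdf(c_n * x, alpha) ** n


alpha, x = 2.0, 1.5
for n in (10, 1000, 100000):
    gap = abs(normalised_max_cdf(x, n, alpha) - frechet_cdf(x, alpha))
    print(n, gap)  # the gap shrinks as n grows
```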

We now present some definitions, propositions and theorems. This is done in order to give a mathematical motivation for why the Pareto distribution belongs to $\mathrm{MDA}(\Phi_\alpha)$.

Proposition 3.1. The distribution function $F \in \mathrm{MDA}(H)$ if and only if

\[
\lim_{n\to\infty} n\overline{F}(c_n x + d_n) = -\ln H(x), \qquad x \in \mathbb{R}.
\]

Embrechts et al (1997), page 128.

Proposition 3.1 is proven in de Haan (1970), pages 62-63. The aim is to prove that

\[
\lim_{n\to\infty} F^n(c_n x + d_n) = H(x) \tag{3.3}
\]

is equivalent to

\[
\lim_{n\to\infty} n\overline{F}(c_n x + d_n) = -\ln H(x). \tag{3.4}
\]

Since $F$ is a distribution function and $H(x)$ is bounded by the asymptote $f(x) = 1$, both equations imply that $F(c_n x + d_n) < 1$. Hence $\lim_{n\to\infty} F(c_n x + d_n) = 1$, that is, $\overline{F}(c_n x + d_n) \to 0$. Take the logarithm in (3.3) and multiply by minus one. Divide the result by the left hand side of (3.4) to obtain

\[
\begin{aligned}
\lim_{n\to\infty} \frac{-\ln\left(F^n(c_n x + d_n)\right)}{n\overline{F}(c_n x + d_n)}
&= \lim_{n\to\infty} \frac{-n \ln\left(F(c_n x + d_n)\right)}{n\overline{F}(c_n x + d_n)} \\
&= \lim_{n\to\infty} \frac{-n \ln\left(1 - \overline{F}(c_n x + d_n)\right)}{n\overline{F}(c_n x + d_n)} = 1.
\end{aligned}
\tag{3.5}
\]

By Taylor expanding the logarithm with respect to $\overline{F}$ we can see that the last equality in (3.5) holds, since the left hand side can be written as

\[
\lim_{n\to\infty} \left( 1 + \sum_{k=2}^{\infty} \frac{\left(\overline{F}(c_n x + d_n)\right)^{k-1}}{k} \right),
\]

and the sum vanishes as $\overline{F}(c_n x + d_n) \to 0$. This proves that (3.3) is equivalent to (3.4).

Since a given distribution function need not be strictly increasing, it might fail to be invertible. We therefore introduce the generalized inverse.

Definition 3.2. The quantile function $F^{\leftarrow}(t)$ is defined as the generalized inverse of the distribution function $F$. This is expressed by

\[
F^{\leftarrow}(t) = \inf\{x \in \mathbb{R} : F(x) \ge t\}, \qquad 0 < t < 1.
\]

The $t$-quantile of $F$ is defined as $x_t = F^{\leftarrow}(t)$. For a reference, see Embrechts et al (1997), page 130.
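For a sample, the generalized inverse can be applied to the empirical distribution function, which is a step function and therefore not invertible in the usual sense. The sketch below is one possible way to do this; the function name `empirical_quantile` and the sample values are hypothetical and are not the thesis data.

```python
import math


def empirical_quantile(sample, t):
    """Generalized inverse of the empirical distribution function at level t."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    # The empirical distribution function first reaches the level t at the
    # k-th smallest observation, where k is the smallest integer with k/n >= t.
    k = math.ceil(n * t)
    return ordered[k - 1]


claims = [3.0, 1.0, 4.0, 2.0]  # illustrative values only
print(empirical_quantile(claims, 0.5))  # 2.0
```

Because the infimum is taken over all points where the distribution function is at least t, the result is always one of the observed values.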

Definition 3.3. The right endpoint $x_F$ of a distribution $F$ is defined as

\[
x_F = \sup\{x \in \mathbb{R} : F(x) < 1\}.
\]

Embrechts et al (1997), pages 114-115.

Definition 3.4. A positive Lebesgue measurable function $L$, defined on the interval $(0,\infty)$, is said to be slowly varying at infinity if

\[
\lim_{x\to\infty} \frac{L(tx)}{L(x)} = 1, \qquad t > 0.
\]

Embrechts et al (1997), page 564.

Definition 3.5. A positive Lebesgue measurable function $G$, defined on the interval $(0,\infty)$, is said to be regularly varying at infinity with index $\alpha$ ($\alpha \in \mathbb{R}$) if

\[
\lim_{x\to\infty} \frac{G(tx)}{G(x)} = t^{\alpha}, \qquad t > 0. \tag{3.6}
\]

If (3.6) is satisfied, we write $G \in \mathcal{R}_\alpha$. Embrechts et al (1997), page 564.

The function f (x) = loga x, a ∈ R, is an example of a slowly varying function. Anexample of a regularly varying function is f (x) = x.
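The two definitions can be explored numerically by evaluating the ratio $f(tx)/f(x)$ for increasing $x$. The sketch below uses the natural logarithm as the slowly varying example and the identity as the regularly varying example with index 1; the helper names are illustrative assumptions. Note how slowly the ratio for the logarithm approaches 1.

```python
import math


def variation_ratio(f, t, x):
    """Ratio f(t*x) / f(x) appearing in Definitions 3.4 and 3.5."""
    return f(t * x) / f(x)


def identity(x):
    return x


for x in (1e2, 1e6, 1e12):
    # First ratio tends (slowly) to 1, second ratio equals t**1 = 2.
    print(x, variation_ratio(math.log, 2.0, x), variation_ratio(identity, 2.0, x))
```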

Definition 3.6. If a sequence $(a_n)$ of positive numbers satisfies

\[
\lim_{n\to\infty} \frac{a_{[tn]}}{a_n} = t^{\alpha}, \qquad t > 0,
\]

then $(a_n)$ is said to be regularly varying of index $\alpha \in \mathbb{R}$. Embrechts et al (1997), page 571.

Theorem 3.1. If $h \in \mathcal{R}_\alpha$ then

\[
h(x) = c(x) \exp\left\{ \int_z^x \frac{\delta(u)}{u}\, du \right\}, \tag{3.7}
\]

for a positive constant $z$, where $\delta$ and $c$ are measurable functions with $\delta(x) \to \alpha$ and $c(x) \to c_0 \in (0,\infty)$ as $x \to \infty$. The converse is also true.

Theorem 3.1 is known as the representation theorem for regularly varying functions and can be found in Embrechts et al (1997), page 566.

Recall from Definition 3.5 that

\[
h \in \mathcal{R}_\alpha \iff \lim_{x\to\infty} \frac{h(tx)}{h(x)} = t^{\alpha}. \tag{3.8}
\]

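We now want to derive the choice of $h(x)$ given by (3.7). Start with (3.8), let $x \gg 1$ and choose an $\varepsilon > 0$ such that

\[
(\alpha - \varepsilon) h(x) \le h(tx) \le (\alpha + \varepsilon) h(x)
\]

holds. Study the derivative with respect to $t$ of the right hand side in (3.8),

\[
\begin{aligned}
\frac{d}{dt}\left( \frac{h(tx)}{h(x)} \right) = \alpha t^{\alpha-1}
&\iff x\,\frac{h'(tx)}{h(x)} = \alpha t^{\alpha-1} \\
&\iff t\,\frac{h'(tx)}{h(x)} = \frac{\alpha t^{\alpha}}{x} \\
&\iff \frac{\frac{d}{dx}\left(h(tx)\right)}{h(x)} = \frac{\alpha t^{\alpha}}{x} \\
&\iff \frac{\frac{d}{dx}\left(h(tx)\right)}{h(tx)} = \frac{\alpha t^{\alpha}}{x}\,\frac{h(x)}{h(tx)} \\
&\iff \frac{d}{dx}\left( \ln h(tx) \right) = \frac{\alpha}{x}.
\end{aligned}
\tag{3.9}
\]

Choose $\delta$ measurable such that $\delta(x) \to \alpha$ as $x \to \infty$. Then for each $\varepsilon > 0$ there exists an $x_\varepsilon$ such that for all $x > x_\varepsilon$,

\[
\alpha - \varepsilon \le \delta(x) \le \alpha + \varepsilon
\]

holds. Then

\[
\frac{\alpha - \varepsilon}{x} \le \frac{\delta(x)}{x} \le \frac{\alpha + \varepsilon}{x},
\]

or equivalently

\[
\frac{\delta(x) - \varepsilon}{x} \le \frac{\alpha}{x} \quad\text{and}\quad \frac{\delta(x) + \varepsilon}{x} \ge \frac{\alpha}{x}. \tag{3.10}
\]

Inserting (3.10) into (3.9) reveals

\[
\begin{aligned}
& \frac{\delta(x) - \varepsilon}{x} \le \frac{d}{dx}\left( \ln h(tx) \right) \le \frac{\delta(x) + \varepsilon}{x} \\
\iff{} & \int_z^x \frac{\delta(u) - \varepsilon}{u}\, du \le \ln h(tx) - \ln h(z) \le \int_z^x \frac{\delta(u) + \varepsilon}{u}\, du \\
\iff{} & \exp\left\{ \int_z^x \frac{\delta(u) - \varepsilon}{u}\, du \right\} \le \frac{h(tx)}{h(z)} \le \exp\left\{ \int_z^x \frac{\delta(u) + \varepsilon}{u}\, du \right\} \\
\iff{} & h(z) \exp\left\{ \int_z^x \frac{\delta(u) - \varepsilon}{u}\, du \right\} \le h(tx) \le h(z) \exp\left\{ \int_z^x \frac{\delta(u) + \varepsilon}{u}\, du \right\}.
\end{aligned}
\]

Now let $\varepsilon \to 0$, which implies that $x \to \infty$. Finally divide by $h(x)$ to see that the proof is finished.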

Proposition 3.2. Let $h, h_1, h_2, \ldots$ be a sequence of non-decreasing functions such that $\lim_{n\to\infty} h_n(x) = h(x)$ for every continuity point $x$ of $h$. Then $\lim_{n\to\infty} h_n^{\leftarrow}(y) = h^{\leftarrow}(y)$ holds for every continuity point $y$ of $h^{\leftarrow}$.

Proposition 3.2 is stated and proven in Resnick (1987), page 5. Before proving Proposition 3.2, introduce the notation

\[
\mathcal{C}(J) = \{x \in \mathbb{R} : J \text{ is finite and continuous at } x\},
\]

where $J$ is a function. This is done to make the proof easier to follow. Fix a $t \in \mathcal{C}(h^{\leftarrow})$ and an $\varepsilon > 0$. The monotone function $h$ has at most a countable number of points of discontinuity. Therefore there exists an $x \in (h^{\leftarrow}(t) - \varepsilon, h^{\leftarrow}(t))$ with $x \in \mathcal{C}(h)$. We know that $x < h^{\leftarrow}(t)$ and hence $h(x) < t$; this follows from the definition of $h^{\leftarrow}$. For large $n$ we have $h_n(x) < t$, since $x \in \mathcal{C}(h)$ implies $h_n(x) \to h(x)$ as $n \to \infty$. From the definition of $h_n^{\leftarrow}$, we can go from $h_n(x) < t$ to $x \le h_n^{\leftarrow}(t)$. Hence

\[
h^{\leftarrow}(t) - \varepsilon < x \le h_n^{\leftarrow}(t). \tag{3.11}
\]

Since $\varepsilon$ was chosen arbitrarily, equation (3.11) implies

\[
\liminf_{n\to\infty} h_n^{\leftarrow}(t) \ge h^{\leftarrow}(t).
\]

To prove the reverse inequality, choose a $t'$ such that $t' > t$. For every $t'$ there is a $z \in \mathcal{C}(h)$ with

\[
h^{\leftarrow}(t') < z < h^{\leftarrow}(t') + \varepsilon. \tag{3.12}
\]

From (3.12) we get that $h^{\leftarrow}(t') < z$ and hence $t < t' \le h(z)$. For large $n$ we have $h_n(z) \ge t$ and hence $z \ge h_n^{\leftarrow}(t)$; this follows since $z \in \mathcal{C}(h)$ implies $h_n(z) \to h(z)$ as $n \to \infty$. For large $n$, equation (3.12) can therefore be written as

\[
h_n^{\leftarrow}(t) \le z < h^{\leftarrow}(t') + \varepsilon. \tag{3.13}
\]

Since $\varepsilon$ was chosen arbitrarily, equation (3.13) implies

\[
\limsup_{n\to\infty} h_n^{\leftarrow}(t) \le h^{\leftarrow}(t').
\]

Use the continuity of $h^{\leftarrow}$ at the point $t$ and let $t' \to t$ to get

\[
\limsup_{n\to\infty} h_n^{\leftarrow}(t) \le h^{\leftarrow}(t).
\]

The proof is complete.

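Proposition 3.3. Let $U(x)$, $V(x)$ and $F_n$, $n \ge 1$, be distribution functions. Suppose that neither $U(x)$ nor $V(x)$ is concentrated at a point. Let $a_n, \alpha_n > 0$ and $b_n, \beta_n \in \mathbb{R}$ be constants. Suppose

\[
F_n(\alpha_n x + \beta_n) \to V(x) \quad\text{and}\quad F_n(a_n x + b_n) \to U(x) \tag{3.14}
\]

weakly. Then

\[
\frac{\beta_n - b_n}{a_n} \to B \in \mathbb{R} \quad\text{and}\quad \frac{\alpha_n}{a_n} \to A > 0. \tag{3.15}
\]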

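A more extended version of Proposition 3.3 can be found in Resnick (1987), pages 7-8, together with a proof. We give a short proof of Proposition 3.3. Applying Proposition 3.2 to (3.14) shows that it is possible to invert (3.14) as follows:

\[
\frac{F_n^{\leftarrow}(y) - \beta_n}{\alpha_n} \to V^{\leftarrow}(y) \quad\text{and}\quad \frac{F_n^{\leftarrow}(y) - b_n}{a_n} \to U^{\leftarrow}(y). \tag{3.16}
\]

We can find points $y_1$ and $y_2$ satisfying

\[
y_1, y_2 \in \mathcal{C}(U^{\leftarrow}) \cap \mathcal{C}(V^{\leftarrow}), \quad y_1 < y_2,
\]

\[
U^{\leftarrow}(y_1) < U^{\leftarrow}(y_2) \quad\text{and}\quad V^{\leftarrow}(y_1) < V^{\leftarrow}(y_2),
\]

since neither $U(x)$ nor $V(x)$ is concentrated at one point. From (3.16), we get

\[
\frac{F_n^{\leftarrow}(y_i) - \beta_n}{\alpha_n} \to V^{\leftarrow}(y_i) \quad\text{and}\quad \frac{F_n^{\leftarrow}(y_i) - b_n}{a_n} \to U^{\leftarrow}(y_i), \tag{3.17}
\]

for $i = 1, 2$. Subtraction gives

\[
\frac{F_n^{\leftarrow}(y_2) - F_n^{\leftarrow}(y_1)}{\alpha_n} \to V^{\leftarrow}(y_2) - V^{\leftarrow}(y_1) > 0, \tag{3.18}
\]

\[
\frac{F_n^{\leftarrow}(y_2) - F_n^{\leftarrow}(y_1)}{a_n} \to U^{\leftarrow}(y_2) - U^{\leftarrow}(y_1) > 0. \tag{3.19}
\]

Divide (3.19) by (3.18) to get

\[
\frac{\alpha_n}{a_n} \to \frac{U^{\leftarrow}(y_2) - U^{\leftarrow}(y_1)}{V^{\leftarrow}(y_2) - V^{\leftarrow}(y_1)} =: A > 0. \tag{3.20}
\]

Equation (3.20) is the right hand statement in (3.15). From (3.17) and (3.20), writing $(F_n^{\leftarrow}(y_1) - \beta_n)/a_n$ as the product of $(F_n^{\leftarrow}(y_1) - \beta_n)/\alpha_n$ and $\alpha_n/a_n$, we get

\[
\frac{F_n^{\leftarrow}(y_1) - b_n}{a_n} \to U^{\leftarrow}(y_1) \quad\text{and}\quad \frac{F_n^{\leftarrow}(y_1) - \beta_n}{a_n} \to A\,V^{\leftarrow}(y_1).
\]

Subtraction gives

\[
\frac{\beta_n - b_n}{a_n} \to U^{\leftarrow}(y_1) - A\,V^{\leftarrow}(y_1) =: B \in \mathbb{R}. \tag{3.21}
\]

Equation (3.21) is the left hand statement in (3.15). The proof is complete.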

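Proposition 3.4. Suppose that $F$ is a distribution function and that $F(x) < 1$ for all $x \ge 0$. The tail $\overline{F}$ is regularly varying if there exist two sequences, $(x_n)$ and $(a_n)$, satisfying

\[
\lim_{n\to\infty} x_n = \infty \quad\text{and}\quad \lim_{n\to\infty} \frac{a_n}{a_{n+1}} = 1,
\]

and such that

\[
\lim_{n\to\infty} a_n \overline{F}(\lambda x_n) = g(\lambda) = \lambda^{-\alpha} \in (0,\infty)
\]

exists for some continuous positive function $g$ and all $\lambda \in (0,\infty)$.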

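Proposition 3.4 is stated in Embrechts et al (1997), page 568. A more extensive version of the proposition can be found in de Haan (1970), pages 8-12, where the proposition is also proven. Recall that $\overline{F}$ is non-increasing. For all $t > 0$ define the integer $n = n(t)$ by

\[
n = \min\{m \in \mathbb{Z} : x_{m+1} > t\}.
\]

Then $x_n \le t < x_{n+1}$. Since $\overline{F}$ is non-increasing we have that

\[
\overline{F}(x_{n+1}\lambda) \le \overline{F}(t\lambda) \le \overline{F}(x_n\lambda)
\]

and

\[
\frac{1}{\overline{F}(x_n\mu)} \le \frac{1}{\overline{F}(t\mu)} \le \frac{1}{\overline{F}(x_{n+1}\mu)}
\]

for all $\lambda, \mu \in (0,\infty)$. Multiplying the two chains of inequalities yields

\[
\frac{\overline{F}(x_{n+1}\lambda)}{\overline{F}(x_n\mu)} \le \frac{\overline{F}(t\lambda)}{\overline{F}(t\mu)} \le \frac{\overline{F}(x_n\lambda)}{\overline{F}(x_{n+1}\mu)}. \tag{3.22}
\]

Suppose that $\lim_{n\to\infty} a_n \overline{F}(\lambda x_n) = g(\lambda) = \lambda^{-\alpha}$ holds. Study the right hand side in (3.22). For simplicity fix $\mu = 1$. Recall that $n$ depends on $t$ and hence

\[
\lim_{t\to\infty} \frac{\overline{F}(x_{n(t)}\lambda)}{\overline{F}(x_{n(t)+1}\mu)} = \lim_{t\to\infty} a_{n(t)} \overline{F}(x_{n(t)}\lambda) = g(\lambda) = \lambda^{-\alpha}.
\]

Doing the same thing on the left hand side in (3.22), in combination with the squeeze theorem, gives that

\[
\lim_{t\to\infty} \frac{\overline{F}(t\lambda)}{\overline{F}(t)} = \lambda^{-\alpha},
\]

for all positive $\lambda$. This proves that $\overline{F}$ is regularly varying, see Definition 3.5.

For $\alpha > 0$ we can immediately derive a natural choice for the parameter $c_n$. Let $d_n = 0$ and $F \in \mathrm{MDA}(\Phi_\alpha)$; then we may choose $c_n$ such that

\[
\begin{aligned}
c_n = F^{\leftarrow}\left(1 - \frac{1}{n}\right)
&= \inf\left\{x \in \mathbb{R} : F(x) \ge 1 - \frac{1}{n}\right\} \\
&= \inf\left\{x \in \mathbb{R} : F(x) - 1 \ge -\frac{1}{n}\right\} \\
&= \inf\left\{x \in \mathbb{R} : \frac{1}{n} \ge 1 - F(x)\right\} \\
&= \inf\left\{x \in \mathbb{R} : \frac{1}{n} \ge \overline{F}(x)\right\} \\
&= \inf\left\{x \in \mathbb{R} : \frac{1}{\overline{F}(x)} \ge n\right\}
= \left(\frac{1}{\overline{F}}\right)^{\leftarrow}(n).
\end{aligned}
\tag{3.23}
\]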
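The choice of $c_n$ as the smallest point where the tail drops to $1/n$ can be computed numerically. The sketch below does this by bisection for a Pareto tail $\overline{F}(x) = x^{-\alpha}$, $x \ge 1$, for which the closed form $c_n = n^{1/\alpha}$ is available as a check. The Pareto tail, the function names, and the search bounds are assumptions made for this illustration.

```python
def pareto_tail(x, alpha):
    """Tail 1 - F(x) of a Pareto distribution on [1, infinity) (assumed example)."""
    return x ** (-alpha) if x >= 1.0 else 1.0


def norming_constant(n, alpha, upper=1e12, iterations=200):
    """Smallest x with tail(x) <= 1/n, found by bisection on [1, upper]."""
    lo, hi = 1.0, upper
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if pareto_tail(mid, alpha) <= 1.0 / n:
            hi = mid  # mid already satisfies the condition, search to the left
        else:
            lo = mid  # tail still too large, search to the right
    return hi


# Closed form for comparison: c_n = n**(1/alpha), so c_100 = 10 when alpha = 2.
print(norming_constant(100, 2.0))
```

The bisection relies only on the tail being non-increasing, so the same routine could in principle be applied to other tails for which no closed form exists.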

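Theorem 3.2. The distribution function $F \in \mathrm{MDA}(\Phi_\alpha)$, $\alpha > 0$, if and only if $\overline{F}(x) = x^{-\alpha} L(x)$ for some slowly varying function $L$. If $F \in \mathrm{MDA}(\Phi_\alpha)$ and $d_n = 0$, then

\[
\frac{M_n}{c_n} \xrightarrow{d} \Phi_\alpha,
\]

where $c_n$ can be chosen as in (3.23). In other words,

\[
F \in \mathrm{MDA}(\Phi_\alpha) \iff \overline{F} \in \mathcal{R}_{-\alpha},
\]

where $\mathcal{R}_{-\alpha}$ is the class of regularly varying functions with index $-\alpha$.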

Theorem 3.2 can be found in Embrechts et al (1997), page 131, where it is also proven. Assume $\overline{F} \in \mathcal{R}_{-\alpha}$ for $\alpha > 0$. Choose $c_n$ by (3.23) as

\[
c_n = F^{\leftarrow}\left(1 - \frac{1}{n}\right) = \inf\left\{x \in \mathbb{R} : \frac{1}{n} \ge \overline{F}(x)\right\},
\]

then $\overline{F}(c_n) \sim n^{-1}$ as $n \to \infty$. This gives that $\overline{F}(c_n) \to 0$ as $c_n \to \infty$. If $x > 0$, then

\[
n\overline{F}(c_n x) \sim \frac{\overline{F}(c_n x)}{\overline{F}(c_n)} \to x^{-\alpha}, \quad\text{as } n \to \infty.
\]

If $x < 0$, we can estimate $F^n(c_n x) \le F^n(0) \to 0$, because regular variation requires that $F(0) < 1$, see Definition 3.5. Apply Proposition 3.1 with $d_n = 0$ to get

\[
\lim_{n\to\infty} n\overline{F}(c_n x) = x^{-\alpha},
\]

and hence $F \in \mathrm{MDA}(\Phi_\alpha)$.

To prove the other direction, choose appropriate $d_n \in \mathbb{R}$ and $c_n > 0$ such that

\[
\lim_{n\to\infty} F^n(c_n x + d_n) = \Phi_\alpha(x) \quad\text{for all } x > 0.
\]

Hence

\[
\begin{aligned}
\lim_{n\to\infty} F^n(c_{[ns]} x + d_{[ns]}) &= \Phi_\alpha^{1/s}(x) = \left(\exp\{-x^{-\alpha}\}\right)^{1/s} \\
&= \exp\left\{-\frac{1}{s x^{\alpha}}\right\} = \exp\left\{-(s^{1/\alpha} x)^{-\alpha}\right\} = \Phi_\alpha\left(s^{1/\alpha} x\right),
\end{aligned}
\]

for $x > 0$ and $s > 0$. Apply Proposition 3.3 to get

\[
\frac{c_{[ns]}}{c_n} \to s^{1/\alpha} \quad\text{and}\quad \frac{d_{[ns]} - d_n}{c_n} \to 0.
\]

This means that $(c_n)$ is a regularly varying sequence, see Definition 3.6. Let $d_n = 0$; then $n\overline{F}(c_n x) \to x^{-\alpha}$ by Proposition 3.1. Since $c_n \to \infty$ and $\frac{n}{n+1} \to 1$ when $n \to \infty$, the conditions of Proposition 3.4 are satisfied and hence $\overline{F} \in \mathcal{R}_{-\alpha}$. There is much more involved in the case when $d_n \neq 0$, see de Haan (1970), pages 69-72, for details.

All distributions that belong to $\mathrm{MDA}(\Phi_\alpha)$ have an infinite right endpoint $x_F$, while all distributions that belong to $\mathrm{MDA}(\Psi_\alpha)$ have a finite right endpoint. In the case of $\mathrm{MDA}(\Lambda)$ we have $x_F \le \infty$, since $\mathrm{MDA}(\Lambda)$ covers a wide range of distributions. When modelling extremal events in insurance it is preferable to have a heavy-tailed distribution where $x_F = \infty$.

The insurance data cannot belong to $\mathrm{MDA}(\Psi_\alpha)$ since the claims are positive. The distributions that belong to $\mathrm{MDA}(\Lambda)$ are not heavy tailed enough to cover the insurance data. This means that the insurance data probably belong to $\mathrm{MDA}(\Phi_\alpha)$.

We do not go deeper into the case when $F \in \mathrm{MDA}(\Psi_\alpha)$, since the insurance data clearly do not belong to this case. But we can see that $\mathrm{MDA}(\Phi_\alpha)$ and $\mathrm{MDA}(\Psi_\alpha)$ should be closely related, since

\[
\Psi_\alpha(-x^{-1}) = \Phi_\alpha(x), \qquad x > 0.
\]
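The relation between the Weibull and the Fréchet distribution functions is easy to confirm numerically from the formulas in (3.1). The following sketch does so for a few points; the function names are illustrative.

```python
import math


def frechet_cdf(x, alpha):
    """Frechet distribution function from (3.1)."""
    return math.exp(-(x ** (-alpha))) if x > 0.0 else 0.0


def weibull_cdf(x, alpha):
    """Weibull extreme value distribution function from (3.1)."""
    return math.exp(-((-x) ** alpha)) if x <= 0.0 else 1.0


alpha = 1.5
for x in (0.5, 1.0, 4.0):
    # Both columns agree up to floating point rounding.
    print(x, weibull_cdf(-1.0 / x, alpha), frechet_cdf(x, alpha))
```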

For details about the consequences of this relation see Embrechts et al (1997), pages 134-137.

For the Gumbel distribution we can formulate the following theorem.

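Theorem 3.3. The distribution function $F \in \mathrm{MDA}(\Lambda)$ if and only if there exists a positive function $a$ such that

\[
\lim_{x \uparrow x_F} \frac{\overline{F}(x + t a(x))}{\overline{F}(x)} = e^{-t}
\]

holds. A possible choice of $a$ is

\[
a(x) = \int_x^{x_F} \frac{\overline{F}(t)}{\overline{F}(x)}\, dt.
\]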
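As a small illustration of the condition for the Gumbel domain, consider the standard exponential distribution with tail $\overline{F}(x) = e^{-x}$, which is chosen here only as an assumed example. The sketch below approximates the integral defining $a(x)$ with a midpoint rule on a truncated interval and then evaluates the ratio $\overline{F}(x + t a(x))/\overline{F}(x)$; for this distribution $a(x) = 1$ and the ratio equals $e^{-t}$ for every $x$, not only in the limit. The function names, truncation point and step count are arbitrary choices.

```python
import math


def exp_tail(x):
    """Tail of the standard exponential distribution (assumed example)."""
    return math.exp(-x)


def auxiliary_function(tail, x, upper=40.0, steps=20000):
    """Midpoint-rule approximation of a(x), with the integral truncated at x + upper."""
    h = upper / steps
    total = sum(tail(x + (i + 0.5) * h) for i in range(steps))
    return h * total / tail(x)


a = auxiliary_function(exp_tail, 2.0)
t = 0.7
ratio = exp_tail(2.0 + t * a) / exp_tail(2.0)
print(a, ratio, math.exp(-t))  # a is close to 1, ratio is close to exp(-t)
```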


For a proof see de Haan (1970), page 87, Theorem 2.5.1. De Haan proves the theorem with the help of no less than seven other theorems. To do something similar in this report would be too extensive. We settle for a reference, in particular since our data do not belong to $\mathrm{MDA}(\Lambda)$.

Theorem 3.4 below is fundamental in extreme value theory and lays the ground for many statistical techniques. Before stating the theorem, define $U(t) = F^{\leftarrow}(1 - t^{-1})$ for $t > 0$.

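Theorem 3.4. The following statements are equivalent for $\xi \in \mathbb{R}$:

I. $F \in \mathrm{MDA}(H_\xi)$.

II. There exists a measurable and positive function $a(\cdot)$ such that for $1 + \xi x > 0$,

\[
\lim_{u \uparrow x_F} \frac{\overline{F}(u + x a(u))}{\overline{F}(u)} = \begin{cases} (1 + \xi x)^{-1/\xi} & \text{if } \xi \neq 0, \\ e^{-x} & \text{if } \xi = 0. \end{cases} \tag{3.24}
\]

III. For $x, y > 0$ and $y \neq 1$,

\[
\lim_{s\to\infty} \frac{U(sx) - U(s)}{U(sy) - U(s)} = \begin{cases} \dfrac{x^{\xi} - 1}{y^{\xi} - 1} & \text{if } \xi \neq 0, \\[2ex] \dfrac{\ln x}{\ln y} & \text{if } \xi = 0. \end{cases} \tag{3.25}
\]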
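Statement III can be checked numerically for a distribution whose quantile function is known in closed form. The sketch below takes the Fréchet distribution $F(x) = \exp\{-x^{-\alpha}\}$ as an assumed example, for which $U(t) = (-\ln(1 - 1/t))^{-1/\alpha}$ and the shape parameter is $\xi = 1/\alpha$. The function names and the chosen values of $s$, $x$ and $y$ are illustrative.

```python
import math


def frechet_U(t, alpha):
    """U(t) = F^{<-}(1 - 1/t) for the Frechet distribution, valid for t > 1."""
    # log1p keeps the logarithm accurate when 1/t is very small.
    return (-math.log1p(-1.0 / t)) ** (-1.0 / alpha)


def quantile_ratio(s, x, y, alpha):
    """Left hand side of (3.25) before taking the limit in s."""
    base = frechet_U(s, alpha)
    return (frechet_U(s * x, alpha) - base) / (frechet_U(s * y, alpha) - base)


def limit_ratio(x, y, xi):
    """Right hand side of (3.25)."""
    if xi == 0.0:
        return math.log(x) / math.log(y)
    return (x ** xi - 1.0) / (y ** xi - 1.0)


alpha = 2.0
for s in (1e2, 1e4, 1e6):
    print(s, quantile_ratio(s, 2.0, 3.0, alpha), limit_ratio(2.0, 3.0, 1.0 / alpha))
```

Ratios of quantile differences of this kind are what make statement III useful in practice, since they can be estimated from order statistics without knowing the norming function $a(\cdot)$.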

A sketch of a proof of Theorem 3.4 is given in Embrechts et al (1997), page 159. For convenience a more detailed proof is presented below.

The first step is to prove that I $\Leftrightarrow$ II when $\xi = 0$. To do this note that $F \in \mathrm{MDA}(H_\xi) \Leftrightarrow F \in \mathrm{MDA}(H_0) \Leftrightarrow F \in \mathrm{MDA}(\Lambda)$, and hence by Theorem 3.3,

\[
F \in \mathrm{MDA}(\Lambda) \iff \lim_{u \uparrow x_F} \frac{\overline{F}(u + x a(u))}{\overline{F}(u)} = e^{-x}.
\]

To prove that I $\Leftrightarrow$ II for $\xi > 0$, first prove I $\Rightarrow$ II and then prove II $\Rightarrow$ I. To make the computations easier, note that $H_\xi(x) = \Phi_\alpha(1 + x\alpha^{-1})$ for $\alpha = \xi^{-1}$. This is true since

\[
\begin{aligned}
H_\xi(x) &= \exp\left(-(1 + \xi x)^{-1/\xi}\right) = \exp\left(-\xi^{-1/\xi}\left(\xi^{-1} + x\right)^{-1/\xi}\right) \\
&= \exp\left(-\frac{(\xi^{-1} + x)^{-1/\xi}}{(\xi^{-1})^{-1/\xi}}\right) = \exp\left(-\frac{(\alpha + x)^{-\alpha}}{\alpha^{-\alpha}}\right) = \exp\left(-\left(\frac{\alpha + x}{\alpha}\right)^{-\alpha}\right) \\
&= \exp\left(-\left(\alpha^{-1}(\alpha + x)\right)^{-\alpha}\right) = \Phi_\alpha\left(\alpha^{-1}(\alpha + x)\right) = \Phi_\alpha\left(1 + \frac{x}{\alpha}\right).
\end{aligned}
\]

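Apply the result from Theorem 3.2 to see that $F \in \mathrm{MDA}(H_\xi) \Leftrightarrow \overline{F} \in \mathcal{R}_{-\alpha}$. Hence, by Theorem 3.1,

\[
\overline{F}(x) = c(x) \exp\left\{-\int_z^x \frac{1}{a(t)}\, dt\right\}, \qquad z < x < \infty,
\]

where $a(x)/x \to \alpha^{-1}$ and $c(x) \to c$ as $x \to \infty$. Therefore

\[
\begin{aligned}
\lim_{u\to\infty} \frac{\overline{F}(u + x a(u))}{\overline{F}(u)}
&= \lim_{u\to\infty} \frac{c(u + x a(u))}{c(u)} \cdot \frac{\exp\left\{-\int_z^{u + x a(u)} \frac{1}{a(t)}\, dt\right\}}{\exp\left\{-\int_z^{u} \frac{1}{a(t)}\, dt\right\}} \\
&= \lim_{u\to\infty} \frac{c(u + x a(u))}{c(u)} \cdot \lim_{u\to\infty} \exp\left\{-\int_u^{u + x a(u)} \frac{1}{a(t)}\, dt\right\}.
\end{aligned}
\]

To make it easier to follow, study each limit separately. The first limit is

\[
\lim_{u\to\infty} \frac{c(u + x a(u))}{c(u)} = \frac{c}{c} = 1.
\]

To be able to calculate the second limit, note that $a(x)/x \to \alpha^{-1}$ means that $a(x)$ behaves like $x\alpha^{-1}$ when $x \to \infty$. Therefore set $a(u) = u\alpha^{-1}$ and obtain

\[
\begin{aligned}
\lim_{u\to\infty} \exp\left\{-\int_u^{u + x a(u)} \frac{1}{a(t)}\, dt\right\}
&= \lim_{u\to\infty} \exp\left\{-\int_u^{u + x u \alpha^{-1}} \frac{1}{t\alpha^{-1}}\, dt\right\} \\
&= \lim_{u\to\infty} \exp\left\{-\alpha\left[\ln\left(u(1 + x\alpha^{-1})\right) - \ln(u)\right]\right\} \\
&= \lim_{u\to\infty} \exp\left\{\ln\left(\frac{u}{u(1 + x\alpha^{-1})}\right)^{\alpha}\right\} \\
&= \exp\left\{\ln\left(\frac{1}{(1 + x\alpha^{-1})^{\alpha}}\right)\right\} = \left(1 + \frac{x}{\alpha}\right)^{-\alpha}.
\end{aligned}
\]

This gives that

\[
\lim_{u\to\infty} \frac{\overline{F}(u + x a(u))}{\overline{F}(u)} = \left(1 + \frac{x}{\alpha}\right)^{-\alpha},
\]

which is (3.24).

To get from II to I when $\xi > 0$, set

\[
c_n = \left(1/\overline{F}\right)^{\leftarrow}(n) = F^{\leftarrow}\left(1 - \frac{1}{n}\right) = U(n),
\]

then

\[
\frac{1}{\overline{F}(c_n)} \sim n,
\]

because of Theorem 3.2. Choose $u = c_n$ in (3.24). The right hand side of (3.24) becomes

\[
\left(1 + \frac{x}{\alpha}\right)^{-\alpha} = (1 + \xi x)^{-1/\xi} = -\ln\left(\exp\left\{-(1 + \xi x)^{-1/\xi}\right\}\right) = -\ln\left(H_\xi(x)\right).
\]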

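The left hand side of (3.24) can be written as

$$
\lim_{n\to\infty}\frac{\bar F(c_n + x a(c_n))}{\bar F(c_n)}
= \lim_{n\to\infty}\frac{\bar F(c_n + x a(c_n))}{1/n}
= \lim_{n\to\infty} n\bar F(c_n + x a(c_n)).
$$

Hence

$$
\lim_{n\to\infty} n\bar F(c_n + x a(c_n)) = -\ln\left(H_\xi(x)\right). \tag{3.26}
$$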

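Now apply Proposition 3.1 to (3.26) to obtain that F ∈ MDA(Hξ).

To prove that II ⇔ III when ξ ≠ 0, assume that F is strictly increasing and continuous on the interval $(-\infty, x_F)$. Use the change of variables $s = 1/\bar F(u)$ in (3.24) and also use that

$$
U(s) = F^{\leftarrow}\left(1-\frac{1}{s}\right) = \left(\frac{1}{\bar F}\right)^{\leftarrow}(s) = u,
$$

then

$$
(1+\xi x)^{-1/\xi} = \lim_{u\to\infty}\frac{\bar F(u+xa(u))}{\bar F(u)} = \lim_{s\to\infty} s\bar F\left(U(s)+xa(U(s))\right),
$$

which is equivalent to

$$
(1+\xi x)^{1/\xi} = \lim_{s\to\infty}\left(s\bar F\left(U(s)+xa(U(s))\right)\right)^{-1}. \tag{3.27}
$$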

For every positive s, the right hand side of (3.27) is decreasing and converges to the left hand side, which is a continuous function, as s → ∞. Now apply Proposition 3.2, which means that the quantile function converges pointwise to the inverse of (3.27). This can be written as

$$
\lim_{s\to\infty}\frac{U(st)-U(s)}{a(U(s))} = \frac{t^{\xi}-1}{\xi}, \tag{3.28}
$$

since the inverse can be obtained by solving $t = (1+\xi x)^{1/\xi}$ for x, which gives $x = (t^{\xi}-1)/\xi$. To get from (3.28) to (3.25), set t = x and t = y and take the quotient,

$$
\lim_{s\to\infty}\frac{U(sx)-U(s)}{a(U(s))} \Big/ \frac{U(sy)-U(s)}{a(U(s))} = \lim_{s\to\infty}\frac{U(sx)-U(s)}{U(sy)-U(s)},
$$

which is the left hand side of (3.25). The right hand side is obtained by

$$
\lim_{s\to\infty}\frac{U(sx)-U(s)}{a(U(s))} \Big/ \frac{U(sy)-U(s)}{a(U(s))} = \frac{x^{\xi}-1}{\xi} \Big/ \frac{y^{\xi}-1}{\xi} = \frac{x^{\xi}-1}{y^{\xi}-1}.
$$

To prove II ⇔ III for ξ = 0, use the same technique as above to obtain

$$
\lim_{s\to\infty}\frac{U(st)-U(s)}{a(U(s))} = \ln t. \tag{3.29}
$$

Then set t = x and t = y and take the quotient, as above, to get from (3.29) to (3.25). The proof is finished.


In the case when x > 0 the second condition in Theorem 3.4 can be reformulated as

$$
\lim_{u \uparrow x_F} P\left[\frac{X-u}{a(u)} > x \,\Big|\, X > u\right] =
\begin{cases}
(1+\xi x)^{-1/\xi} & \text{if } \xi \neq 0, \\
e^{-x} & \text{if } \xi = 0.
\end{cases} \tag{3.30}
$$

Equation (3.30) gives the limit of the conditional probability that a claim exceeds xa(u) + u, given that the claim has exceeded u, for some appropriate a(u). One way to derive (3.30) is to study the left hand side of (3.24) and to notice that

{X > u+ xa(u)} ⊂ {X > u}. (3.31)

Since both x and a(u) are assumed to be positive, the left hand side of (3.31) is clearly a subset of {X > u}. This means that the intersection of the two sets is {X > u + xa(u)}. With this in mind, consider the left hand side of (3.24) to see that

$$
\frac{\bar F(u+xa(u))}{\bar F(u)}
= \frac{P[X > u+xa(u)]}{P[X > u]}
= \frac{P[\{X > u+xa(u)\} \cap \{X > u\}]}{P[X > u]}
= P\left[\frac{X-u}{a(u)} > x \,\Big|\, X > u\right].
$$

In this report we will only study the case ξ ≠ 0 in (3.30) and how the right hand side behaves at infinity. This is motivated by the nature of the insurance data. We already mentioned that the data probably belong to MDA(Φα), and hence ξ is positive.

In the examples on page 4 it was mentioned that the Pareto distribution belongs to MDA(Φα). In fact, a Taylor expansion applied to (3.30) reveals that the peaks over a threshold u for a distribution belonging to MDA(Φα) asymptotically follow a Pareto distribution. We now perform the Taylor expansion.

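By using the change of variables x = 1/y, a Taylor expansion of $(1+\xi x)^{-1/\xi}$ around infinity is equivalent to a Taylor expansion of $(1+\xi/y)^{-1/\xi}$ around zero. The latter case is easier to handle. For simplicity set $\alpha = \xi^{-1}$ and study the expansion for different values of α. Start with α = 1 to get

$$
\left(1+\frac{1}{y}\right)^{-1} = \sum_{n=1}^{\infty} (-1)^{n+1} y^{n}.
$$

For α = 2, 3, 4, 5 the expansions are

$$
\begin{aligned}
\left(1+\frac{1}{2y}\right)^{-2} &= \sum_{n=2}^{\infty} (-1)^{n+2} y^{n} 2^{n} (n-1), \\
\left(1+\frac{1}{3y}\right)^{-3} &= \sum_{n=3}^{\infty} \frac{(-1)^{n+3} y^{n} 3^{n} (n-1)(n-2)}{(3-1)}, \\
\left(1+\frac{1}{4y}\right)^{-4} &= \sum_{n=4}^{\infty} \frac{(-1)^{n+4} y^{n} 4^{n} (n-1)(n-2)(n-3)}{(4-1)(4-2)}, \\
\left(1+\frac{1}{5y}\right)^{-5} &= \sum_{n=5}^{\infty} \frac{(-1)^{n+5} y^{n} 5^{n} (n-1)(n-2)(n-3)(n-4)}{(5-1)(5-2)(5-3)}.
\end{aligned}
$$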


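We can now make the expansion of $\left(1+\frac{1}{\alpha y}\right)^{-\alpha}$ in general by noting that there is a pattern in the above series. Hence

$$
\begin{aligned}
\left(1+\frac{1}{\alpha y}\right)^{-\alpha}
&= \sum_{n=\alpha}^{\infty} \frac{(-1)^{n+\alpha} y^{n} \alpha^{n} (n-1)(n-2)\cdots(n-\alpha+1)}{(\alpha-1)!} \\
&= \sum_{n=\alpha}^{\infty} \frac{(-1)^{n+\alpha} y^{n} \alpha^{n} (n-1)!}{(\alpha-1)!\,(n-\alpha)!} \\
&= \frac{(-1)^{\alpha+\alpha} y^{\alpha} \alpha^{\alpha} (\alpha-1)!}{(\alpha-1)!\,(\alpha-\alpha)!} + \sum_{n=\alpha+1}^{\infty} \frac{(-1)^{n+\alpha} y^{n} \alpha^{n} (n-1)!}{(\alpha-1)!\,(n-\alpha)!} \\
&= y^{\alpha}\alpha^{\alpha} + \sum_{m=1}^{\infty} \frac{(-1)^{m} y^{m+\alpha} \alpha^{m+\alpha} (m+\alpha-1)!}{(\alpha-1)!\,m!}.
\end{aligned}
$$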

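Use the change of variables $y = x^{-1}$ to obtain the Taylor expansion of $\left(1+\frac{x}{\alpha}\right)^{-\alpha}$ around infinity, i.e.

$$
\left(1+\frac{x}{\alpha}\right)^{-\alpha} = \left(\frac{\alpha}{x}\right)^{\alpha} + \sum_{m=1}^{\infty} \frac{(-1)^{m} \alpha^{m+\alpha} (m+\alpha-1)!}{x^{m+\alpha} (\alpha-1)!\,m!}. \tag{3.32}
$$

Equation (3.32) means that $\left(1+\frac{x}{\alpha}\right)^{-\alpha}$ behaves approximately like the tail of a Pareto distribution for large x. The Pareto distribution is defined in Section 4.3. Set b = a = α in (4.4) to see that the leading term of (3.32) has the form of a Pareto tail. This is a motivation to model large claims by a Pareto distribution. This will be studied in Section 6.2.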
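As a numerical illustration of (3.32), the following minimal sketch compares $(1+x/\alpha)^{-\alpha}$ with the leading Pareto term $(\alpha/x)^{\alpha}$. The function names and the value α = 2 are our own illustrative assumptions and are not taken from the insurance data.

```python
def gpd_type_tail(x, alpha):
    """Limit in (3.30) with xi = 1/alpha: (1 + x/alpha)^(-alpha)."""
    return (1.0 + x / alpha) ** (-alpha)


def pareto_tail(x, alpha):
    """Leading term of (3.32): (alpha/x)^alpha, a Pareto tail with a = b = alpha."""
    return (alpha / x) ** alpha


alpha = 2.0  # illustrative value (xi = 0.5), an assumption for this sketch
ratios = [gpd_type_tail(x, alpha) / pareto_tail(x, alpha) for x in (10.0, 100.0, 1000.0)]
print(ratios)  # the ratios increase towards 1 as x grows
```

The ratio of the two tails equals $(1+\alpha/x)^{-\alpha}$, so it tends to one as x grows, which is the sense in which the higher-order terms of (3.32) become negligible.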

Combining Theorem 3.2 and Theorem 3.4 II for F ∈ MDA(Φα) reveals that above a threshold u, where $u \approx x_F$, the following statements are equivalent:

I. $\bar F(x) = x^{-\alpha}L(x)$, where L is a slowly varying function.

II.

$$
\lim_{u \uparrow x_F}\frac{\bar F(u+xa(u))}{\bar F(u)} = (1+\xi x)^{-1/\xi}, \quad \xi > 0.
$$

From the lowest-order term of the Taylor expansion of the right hand side of II around the point ∞ we retrieve the tail distribution function of a Pareto distribution, which is of type I.

To summarise this section, there are some main conclusions that are worth mentioning again. The first conclusion is that the underlying distribution of the insurance data probably belongs to MDA(Φα). The second conclusion is that a distribution in MDA(Φα), over the threshold u, behaves like a Pareto distribution. Another conclusion is that if F ∈ MDA(Φα) then the tail is $\bar F(x) = L(x)x^{-\alpha}$. The last conclusion is that the Pareto distribution is one of the distributions that belong to MDA(Φα). In Section 6 we will use some of the information from this section to analyse the given insurance data.


4 Statistical methods

This section contains the statistical methods used in this paper. Methods for testing independence are discussed. Maximum likelihood estimation is explained. The Pareto distribution is introduced. Quantile and mean excess plots are defined. Two methods to estimate the shape parameter ξ are presented. Tail and quantile estimation are mentioned.

4.1 Autocorrelation and test for independence

Many statistical tools assume, or require, that the data are independent and identically distributed (iid) from an underlying distribution. One way to test independence is to study the so-called autocorrelation function for different time lags. Before explaining how this works, we make the following definition.

Definition 4.1. If $x_1, \ldots, x_n$ are observations from a sample, then the sample mean is defined as

$$
\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i.
$$

The autocorrelation function of the sample for different time lags h is defined as

$$
\rho(h) = \frac{\sum_{i=1}^{n-h} (x_i - \bar x)(x_{i+h} - \bar x)}{\sum_{i=1}^{n} (x_i - \bar x)^2}. \tag{4.1}
$$

A reference for this subsection is Brockwell et al (2002), chapter 1, where other tests for independence can also be found. One way to test graphically whether the data in the sample are independent of each other is to plot the empirical autocorrelation function for different time lags. If the data are independent, the autocorrelation function should be near zero for all h > 0. If one or several points differ significantly from zero, this could be evidence that the data are dependent. This phenomenon is, for example, common in the analysis of stock prices. Today's stock price is usually dependent on yesterday's stock price.

To decide whether a value of ρ(h) is near zero or significantly different from zero, plot the two bounds $\pm 1.96/\sqrt{n}$ and see if the values are within these bounds. If the autocorrelation function stays within these bounds, the data are consistent with the iid assumption. The reason for choosing the bounds $\pm 1.96/\sqrt{n}$ is that for large iid sequences the values ρ(h) are approximately iid normally distributed, with mean zero and variance 1/n. It is a well-known fact that the central 95% of a normal distribution with mean zero and standard deviation σ lies within ±1.96σ. It is of course possible to choose a level other than 95%, but it is quite standard in statistics to choose 95%.
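The autocorrelation check can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `sample_acf`, the synthetic exponential sample and the choice of 20 lags are hypothetical, and the denominator is the centred sum of squares as in Brockwell et al (2002).

```python
import math
import random


def sample_acf(x, h):
    """Sample autocorrelation at lag h, normalised by the centred sum of squares."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + h] - mean) for i in range(n - h))
    den = sum((xi - mean) ** 2 for xi in x)
    return num / den


random.seed(1)
data = [random.expovariate(1.0) for _ in range(500)]  # synthetic iid sample (assumption)
bound = 1.96 / math.sqrt(len(data))
outside = [h for h in range(1, 21) if abs(sample_acf(data, h)) > bound]
print("bound:", bound, "lags outside the bounds:", outside)
```

For iid data roughly one lag in twenty is expected to fall outside the bounds by chance, so a single exceedance is not in itself strong evidence of dependence.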

In the case when the autocorrelation plot contains points that are outside the 95% confidence interval, we have to decide whether we can reject the assumption of independence or not. The Ljung-Box test is a statistical test that can be used to answer this question. One advantage of the Ljung-Box test is that it tests independence for the whole sample, not only for the points that differ from zero. The test is as follows. Set up the following two hypotheses

H0 : the data in the sample are iid,
H1 : some data in the sample are not iid.

Reject H0 at level α if $Q > \chi^2_{1-\alpha}(d)$, where $\chi^2_{1-\alpha}(d)$ is the 1 − α quantile of the chi-squared distribution with d degrees of freedom. The test statistic Q is defined as

$$
Q = n(n+2)\sum_{j=1}^{d} \frac{\rho^2(j)}{n-j}. \tag{4.2}
$$


A random variable is said to have a chi-squared distribution if it can be written as a sum of squared iid normal variables with mean zero and variance equal to one, i.e. W ∼ χ² if

$$
W = \sum_{i=1}^{n} X_i^2, \quad X_i \sim N(0,1).
$$

In this paper the number of degrees of freedom d is taken to be the length of the sample minus one. For example, if the sample size is 342, then the number of degrees of freedom is 341. So if the data are iid, Q should behave like a chi-squared distributed random variable.
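The test statistic (4.2) can be computed with a few lines of code. The sketch below is only an illustration: the names `sample_acf` and `ljung_box_q`, the synthetic sample and the choice d = 10 are our assumptions (the text above uses d = n − 1). The critical value has to be taken from a chi-squared table or a statistics library; 18.307 is the 0.95 quantile for 10 degrees of freedom.

```python
import random


def sample_acf(x, h):
    """Sample autocorrelation at lag h, normalised by the centred sum of squares."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + h] - mean) for i in range(n - h))
    den = sum((xi - mean) ** 2 for xi in x)
    return num / den


def ljung_box_q(x, d):
    """Ljung-Box statistic (4.2) based on the first d lags."""
    n = len(x)
    return n * (n + 2) * sum(sample_acf(x, j) ** 2 / (n - j) for j in range(1, d + 1))


random.seed(5)
data = [random.expovariate(1.0) for _ in range(500)]  # synthetic iid sample (assumption)
q = ljung_box_q(data, 10)
CHI2_095_DF10 = 18.307  # 0.95 quantile of the chi-squared distribution, 10 degrees of freedom
print("Q =", q, "reject H0 at the 5% level:", q > CHI2_095_DF10)
```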

4.2 Maximum likelihood estimation

When working with statistics, a common problem is to estimate one or several unknown parameters of a model which is assumed to be adequate. When an estimator of the unknown parameters is found, there is usually a function, depending on the estimated parameters, to apply the result to. In this paper the density function is given, but the density function's parameters are unknown. A common and widely accepted way to estimate the density function's parameters is maximum likelihood estimation.

Maximum likelihood estimation means that we want to maximize the likelihood function

$$
L(\theta) = \prod_{i=1}^{n} f(x_i), \tag{4.3}
$$

with respect to θ, where f(·) is a density function, depending on θ, which is assumed to be known. If f is an exponential or rational function, the parameter θ that maximizes (4.3) can be estimated either directly or by maximizing the logarithm of the likelihood function, ln L(θ). The value of θ that maximizes ln L(θ) also maximizes L(θ). The standard procedure is hence to differentiate ln L(θ) with respect to θ, set the derivative equal to zero and solve with respect to θ. If θ is a vector of several variables, calculate every partial derivative separately, set each equal to zero and solve. See Milton et al (2003), pages 229-233.

4.3 The Pareto distribution

The Pareto distribution is a heavy-tailed and skewed distribution. Since the Pareto distribution is heavy-tailed it is often used to model rare events. Embrechts et al (1997) claim, based on their own experience, that data from insurance usually have a tail that follows a Pareto distribution. We support this claim with the help of (3.32) and by the statistical analysis of our data.

The distribution function is defined as

$$
F(x) =
\begin{cases}
1-\left(\dfrac{b}{x}\right)^{a} & \text{for } x \geq b, \\
0 & \text{for } x < b.
\end{cases} \tag{4.4}
$$

The density function is obtained by differentiating (4.4), and is hence

$$
f(x) =
\begin{cases}
\dfrac{ab^{a}}{x^{a+1}} & \text{for } x \geq b, \\
0 & \text{for } x < b.
\end{cases} \tag{4.5}
$$

Equations (4.4) and (4.5) can be found in Jäckel (2002), page 16. To be able to apply the Pareto distribution to real data, the parameters a and b have to be estimated. A natural


estimate of b is the smallest value in the sample. The parameter a can be estimated from the likelihood function

$$
L(a,b) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{ab^{a}}{x_i^{a+1}} = a^{n}b^{an}\prod_{i=1}^{n}\frac{1}{x_i^{a+1}}.
$$

Take the logarithm of L(a,b) to obtain

$$
\log L(a,b) = n\log a + an\log b - (a+1)\sum_{i=1}^{n}\log x_i.
$$

An estimate of a is hence given by differentiating log L(a,b) with respect to a, setting this derivative equal to zero and solving with respect to a. This gives

$$
\frac{\partial \log L(a,b)}{\partial a} = \frac{n}{a} + n\log b - \sum_{i=1}^{n}\log x_i = 0.
$$

The maximum likelihood estimate of a is therefore

$$
\hat a = \frac{n}{\sum_{i=1}^{n}\left(\log x_i - \log b\right)}.
$$
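The two estimates above can be sketched in code as follows. The function name `pareto_mle`, the simulated sample and the parameter values a = 2 and b = 1 are illustrative assumptions, used only so that the estimate can be compared with a known value.

```python
import math
import random


def pareto_mle(sample):
    """Return (a_hat, b_hat): b_hat is the sample minimum, a_hat the maximum likelihood estimate."""
    b_hat = min(sample)
    n = len(sample)
    a_hat = n / sum(math.log(x) - math.log(b_hat) for x in sample)
    return a_hat, b_hat


random.seed(2)
a_true, b_true = 2.0, 1.0  # assumed parameters for the synthetic sample
# Inverse transform of (4.4): x = b / (1 - U)^(1/a) with U uniform on [0, 1).
sample = [b_true / (1.0 - random.random()) ** (1.0 / a_true) for _ in range(10000)]
a_hat, b_hat = pareto_mle(sample)
print(a_hat, b_hat)
```

With 10000 observations the estimate of a should lie close to the assumed value, since its standard error is roughly a/√n.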

4.4 Quantile plot

A quantile plot (QQ-plot) is a graphical tool which can help us to find the distribution underlying the data. The idea of this method is to plot

$$
\left\{\left(x_{k,n},\, F^{\leftarrow}\left(\frac{n-k+1}{n+1}\right)\right) : k = 1, \ldots, n\right\}, \tag{4.6}
$$

where $x_{k,n}$ is the ordered sample. The data in the sample must be ordered to use this method. By plotting the graph (4.6), we can see whether the data come from F. If the QQ-plot is roughly linear, this indicates that the data are generated by the reference distribution. In the case when the graph is non-linear, try another distribution. The QQ-plot is useful since it not only helps to determine the distribution function, it also reveals outliers, shape, location and scale of the data. If the QQ-plot for example curves up or down, this can indicate that the data are heavy tailed. It is also possible to see graphically whether the data have been changed by a linear transformation. See Embrechts et al (1997), pages 290-294.

In this report the QQ-plot is used to see how good the assumptions about the distribution are, i.e. how good the estimation is. A less time-efficient way to apply this method is to plot the QQ-plot for all known distributions, and stop this procedure when a suitable distribution is found.
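The points in (4.6) can be computed as in the following sketch, here with the Pareto distribution (4.4) as reference distribution. The function names are our own, and we assume that the sample is ordered with the largest value first, so that $x_{1,n}$ is paired with the highest quantile. The resulting pairs can be passed to any plotting tool.

```python
def pareto_quantile(p, a, b):
    """Quantile function of the Pareto distribution (4.4): F^{<-}(p) = b / (1 - p)^(1/a)."""
    return b / (1.0 - p) ** (1.0 / a)


def qq_points(sample, a, b):
    """Pairs (x_{k,n}, F^{<-}((n - k + 1)/(n + 1))) from (4.6), largest observation first."""
    ordered = sorted(sample, reverse=True)  # x_{1,n} >= ... >= x_{n,n}
    n = len(ordered)
    return [(ordered[k - 1], pareto_quantile((n - k + 1) / (n + 1), a, b))
            for k in range(1, n + 1)]


points = qq_points([3.0, 1.0, 2.0], a=1.0, b=1.0)  # tiny hand-made sample
print(points)
```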

4.5 Mean excess function

A useful graphical tool to determine whether the data are heavy tailed or not is to plot the empirical mean excess function against the ordered data. The empirical mean excess function is defined as

$$
e_n(u) = \frac{1}{\operatorname{card}\Delta_n(u)}\sum_{i \in \Delta_n(u)} (X_i - u), \quad u \geq 0,
$$

where u is a threshold value and $\Delta_n(u) = \{i : i = 1, \ldots, n, X_i > u\}$. The cardinal number of $\Delta_n(u)$ is the number of elements in $\Delta_n(u)$. An empirical mean excess function that increases with u indicates that the data are heavy tailed, see Embrechts et al (1997), pages 294-303.


The mean excess function is defined as

e(u) = E[X−u|X > u], 0≤ u≤ xF .

When working with insurance, e(u) can be interpreted as the expected amount by which a claim exceeds u, given that the claim exceeds u, see the remarks related to (3.30). Equation (3.30) suggests that the excesses of insurance data over a high threshold approximately follow a Pareto distribution. In this paper the empirical mean excess function is used to choose an appropriate threshold such that the Pareto distribution is valid. The threshold u is estimated graphically. The value of u should be chosen from a region where the graph $\{x_{k,n}, e_n(x_{k,n})\}$ is roughly linear for $x_{k,n} > u$. For references see Embrechts et al (1997), pages 294-303. Below we calculate the mean excess function for the Pareto distribution. The result motivates why the valid region is linear. But we first show how to calculate the mean excess function in the general case.

If F is a given distribution function then an explicit expression for e(u) can be found by calculating

$$
e(u) = \frac{\int_u^{\infty} \bar F(x)\,dx}{\bar F(u)}, \tag{4.7}
$$

see Embrechts et al (1997), pages 161-162. We now want to derive (4.7). The definition of e(u) gives that

$$
e(u) = E[X-u \mid X > u] = \frac{\int_u^{x_F} (x-u)\,dF(x)}{\bar F(u)}.
$$

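Study the numerator. Straightforward calculations in combination with partial integration give that

$$
\begin{aligned}
\lim_{x_F\to\infty}\int_u^{x_F} (x-u)\,dF(x)
&= \lim_{x_F\to\infty}\int_u^{x_F} (x-u) f(x)\,dx \\
&= \lim_{x_F\to\infty}\left[\int_u^{x_F} x f(x)\,dx - \int_u^{x_F} u f(x)\,dx\right] \\
&= \lim_{x_F\to\infty}\left[x_F F(x_F) - uF(u) - \int_u^{x_F} F(x)\,dx - \left(uF(x_F) - uF(u)\right)\right] \\
&= \lim_{x_F\to\infty}\left[F(x_F)(x_F-u) - \int_u^{x_F} F(x)\,dx\right] \\
&= \lim_{x_F\to\infty}\left[F(x_F)\int_u^{x_F} dx - \int_u^{x_F} F(x)\,dx\right]
= \int_u^{\infty} \bar F(x)\,dx,
\end{aligned}
$$

since $\lim_{x_F\to\infty} F(x_F) = 1$. Divide the result by $\bar F(u)$ to obtain (4.7).

In the case of a Pareto distribution having density $\frac{ab^{a}}{x^{a+1}}$, the mean excess function is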


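found by first calculating the denominator

$$
\begin{aligned}
\bar F(u) = P[X > u] &= \int_u^{\infty} dF = \int_u^{\infty} \frac{ab^{a}}{x^{a+1}}\,dx
= \lim_{R\to\infty} ab^{a}\int_u^{R} \frac{1}{x^{a+1}}\,dx \\
&= \lim_{R\to\infty} ab^{a}\left[\frac{x^{-(a+1-1)}}{-(a+1-1)}\right]_u^{R}
= \lim_{R\to\infty} ab^{a}\left(\frac{R^{-a}}{-a} - \frac{u^{-a}}{-a}\right)
= \frac{ab^{a}u^{-a}}{a} = \frac{b^{a}}{u^{a}}.
\end{aligned} \tag{4.8}
$$

Equation (4.8) is then used to calculate the numerator, which is finite for a > 1,

$$
\begin{aligned}
\int_u^{\infty} \bar F(x)\,dx &= \int_u^{\infty} \frac{b^{a}}{x^{a}}\,dx
= \lim_{R\to\infty} b^{a}\int_u^{R} x^{-a}\,dx
= \lim_{R\to\infty} b^{a}\left[\frac{x^{-(a-1)}}{-(a-1)}\right]_u^{R} \\
&= \lim_{R\to\infty} b^{a}\left(\frac{R^{-(a-1)}}{-(a-1)} - \frac{u^{-(a-1)}}{-(a-1)}\right)
= \frac{b^{a}}{(a-1)u^{a-1}}.
\end{aligned}
$$

The mean excess function for the Pareto distribution is hence equal to

$$
e(u) = E[X-u \mid X > u] = \frac{\int_u^{\infty} \bar F(x)\,dx}{\bar F(u)} = \frac{b^{a}}{(a-1)u^{a-1}} \cdot \frac{u^{a}}{b^{a}} = \frac{u}{a-1}.
$$

From the last equation we can see that the mean excess function for the Pareto distribution is a linear function of u. This means that if the empirical mean excess function is roughly linear for $x_{k,n} > u$, then the Pareto distribution is a reasonable model for $x_{k,n} > u$.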
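The linearity of the mean excess function can be illustrated with simulated data. In the sketch below the function name `mean_excess`, the parameters a = 4 and b = 1, and the thresholds are illustrative assumptions; the empirical values are compared with the theoretical value u/(a − 1).

```python
import random


def mean_excess(sample, u):
    """Empirical mean excess e_n(u): average of X_i - u over the observations with X_i > u."""
    excesses = [x - u for x in sample if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold u")
    return sum(excesses) / len(excesses)


rng = random.Random(4)
a, b = 4.0, 1.0  # assumed Pareto parameters with a > 1, so that e(u) is finite
claims = [b / (1.0 - rng.random()) ** (1.0 / a) for _ in range(20000)]
for u in (1.0, 1.5, 2.0):
    print(u, mean_excess(claims, u), u / (a - 1.0))
```

For high thresholds only few observations remain above u, so the empirical mean excess function becomes unstable at the right end of the plot.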

4.6 Estimation of the shape parameter

In this section two methods are introduced that can be used to estimate the shape parameter ξ when F ∈ MDA(Hξ). The focus lies on the Hill estimator, but an estimator by Dekkers,


Einmahl and de Haan is also presented. Some of the statistical properties of the Hill estimator are presented.

Assume that the data come from a distribution F ∈ MDA(Φα), $\alpha = \xi^{-1} > 0$. We then know, from Theorem 3.2, that the tail behaves like $L(x)x^{-\alpha}$, where L(x) is a slowly varying function. One way of estimating the parameter α in the tail $\bar F(x) = L(x)x^{-\alpha}$ is with the help of the Hill estimator

$$
\frac{1}{\hat\xi} = \hat\alpha_n(k) = \left(\frac{1}{k}\sum_{j=1}^{k} \ln x_{j,n} - \ln x_{k,n}\right)^{-1}, \tag{4.9}
$$

where k is the number of exceedances over the value $x_{k,n}$. The reason for estimating ξ with the Hill estimator is that (4.9) converges almost surely to a constant, see Theorem 4.1 below.

To be able to use the Hill estimator in the form (4.9), the data in the sample must be sorted so that the largest value comes first and the smallest value comes last, i.e. $x_{1,n} \geq x_{2,n} \geq \ldots \geq x_{n,n}$, since the sum in (4.9) runs over the k largest observations. This paper follows Embrechts et al (1997), pages 330-339. Note that some authors choose to sort the data with the smallest value first. One way to derive (4.9) is by maximum likelihood estimation, which is what we do next.

Study the general case

$$
\bar F(x) = Cx^{-\alpha}, \quad x \geq u > 0, \tag{4.10}
$$

where u is deterministic and known. Assume that (4.10) is fully specified, hence $C = u^{\alpha}$. To find the maximum likelihood estimate of α in (4.10), take the logarithm of the likelihood function of the corresponding density. After applying the logarithm, differentiate the function with respect to α and set the result equal to zero. Finally solve the equation with respect to α to obtain the estimator

$$
\hat\alpha_n(u) = \left(\frac{1}{n}\sum_{j=1}^{n} \ln\frac{x_{j,n}}{u}\right)^{-1} = \left(\frac{1}{n}\sum_{j=1}^{n} \ln x_{j,n} - \ln u\right)^{-1}, \tag{4.11}
$$

where the threshold u is chosen such that $\frac{\ln u}{n} \to 0$ when n → ∞. When estimating under maximum domain of attraction conditions, the tail above u usually behaves approximately like a Pareto distribution. Let K be the number of observations in the sample which are larger than u. Calculating the maximum likelihood estimates of α and C in (4.10), conditionally on the event {K = k}, reduces to maximizing the joint density of $(x_{k,n}, \ldots, x_{1,n})$, i.e. to maximizing

$$
L(\alpha, C) = \frac{n!}{(n-k)!}\left(F(x_{k,n})\right)^{n-k}\prod_{i=1}^{k} f(x_{i,n}). \tag{4.12}
$$

The term $\frac{n!}{(n-k)!}\left(F(x_{k,n})\right)^{n-k}$ in (4.12) has been introduced since the k largest values from the sample $(x_1, \ldots, x_n)$ can be arranged in $\frac{n!}{(n-k)!}$ ways. Equation (4.12) can be interpreted as the likelihood function for the k largest observations. Recall that $\bar F = 1 - F$ and hence

$$
L(\alpha, C) = \frac{n!}{(n-k)!}\left(1 - Cx_{k,n}^{-\alpha}\right)^{n-k} C^{k}\alpha^{k}\prod_{i=1}^{k} x_{i,n}^{-(\alpha+1)}.
$$

Use the same technique as above: take the logarithm, differentiate with respect to α and C, set the derivatives equal to zero and obtain the Hill estimator

$$
\hat\alpha_n(k) = \left(\frac{1}{k}\sum_{j=1}^{k} \ln x_{j,n} - \ln x_{k,n}\right)^{-1},
$$

$$
\hat C_n(k) = \frac{k}{n}\, x_{k,n}^{\hat\alpha_n(k)}.
$$


Note that the Hill estimator has the same form as (4.11). Replace u with the random threshold $x_{k,n}$ and change the sum from all n observations to the k largest observations to see that the equations are equal. Another way to derive (4.9) is to study the mean excess function in combination with a regular variation approach. See Embrechts et al (1997), pages 330-336.

Some of the properties of the Hill estimator are stated in the following theorem and can be found in Embrechts et al (1997), pages 336-342.

Theorem 4.1. Assume that $(X_n)$ is a process with marginal distribution F, satisfying

$$
\bar F(x) = P[X > x] = x^{-\alpha}L(x), \quad x > 0,
$$

where α > 0 and L is a slowly varying function. The Hill estimator $\hat\alpha_n(k)$ is given by (4.9). Then the Hill estimator has the following properties.

I. If $(X_n)$ is an iid sequence and $k/\ln\ln n \to \infty$ and $k/n \to 0$ for n → ∞, then

$$
\hat\alpha_n(k) \xrightarrow{a.s.} \alpha.
$$

II. If $(X_n)$ is an iid sequence and if the following limit exists,

$$
\lim_{x\to\infty}\frac{\bar F(tx)/\bar F(x) - t^{-\alpha}}{a(x)} = t^{-\alpha}\,\frac{t^{\rho}-1}{\rho}, \quad t > 0, \tag{4.13}
$$

where a(x) is a measurable function, then

$$
\sqrt{k}\left(\hat\alpha_n(k) - \alpha\right) \xrightarrow{d} N(0, \alpha^{2}).
$$

The first property in Theorem 4.1 is proven in Deheuvels et al (1988). The proof is quite extensive; six facts and three lemmas are used. The interested reader is therefore recommended to read the article by Deheuvels et al (1988) for the details. The proof of the second property is also quite complicated, see Hall (1982).

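Unfortunately the condition given by equation (4.13) is difficult to verify in practice. Nevertheless, it is possible to create confidence intervals for the Hill estimator since it converges to a normal distribution. A 100(1 − p)% confidence interval for the Hill estimator is given by

$$
\hat\alpha_n(k) \pm \lambda_{p/2}\,\frac{\hat\alpha_n(k)}{\sqrt{k}}, \tag{4.14}
$$

where $\lambda_{p/2}$ is the (1 − p/2) standard normal quantile. One way to derive (4.14) is to study the definition of a confidence interval

$$
P\left[-\lambda_{p/2} \leq Z \leq \lambda_{p/2}\right] = 1 - p, \tag{4.15}
$$

see Milton et al (2003). Because of Theorem 4.1, set

$$
Z = \frac{\hat\alpha_n(k) - \alpha}{\hat\alpha_n(k)/\sqrt{k}}
$$

in (4.15). This implies

$$
\begin{aligned}
P\left[-\lambda_{p/2} \leq \frac{\hat\alpha_n(k) - \alpha}{\hat\alpha_n(k)/\sqrt{k}} \leq \lambda_{p/2}\right] &= 1 - p, \\
P\left[-\lambda_{p/2}\,\frac{\hat\alpha_n(k)}{\sqrt{k}} \leq \hat\alpha_n(k) - \alpha \leq \lambda_{p/2}\,\frac{\hat\alpha_n(k)}{\sqrt{k}}\right] &= 1 - p, \\
P\left[\hat\alpha_n(k) - \lambda_{p/2}\,\frac{\hat\alpha_n(k)}{\sqrt{k}} \leq \alpha \leq \hat\alpha_n(k) + \lambda_{p/2}\,\frac{\hat\alpha_n(k)}{\sqrt{k}}\right] &= 1 - p.
\end{aligned}
$$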


Hence α is bounded by (4.14). In the case of a 95% confidence interval, the required quantile is $\lambda_{0.025} = 1.96$.
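The Hill estimator (4.9) and the interval (4.14) can be sketched as follows. The function names and the small hand-made sample are illustrative assumptions; the data are sorted with the largest value first, so that the sum in (4.9) runs over the k largest observations.

```python
import math


def hill_alpha(sample, k):
    """Hill estimate of alpha from the k largest observations, see (4.9); requires 2 <= k <= n."""
    ordered = sorted(sample, reverse=True)  # x_{1,n} >= ... >= x_{n,n}
    logs = [math.log(x) for x in ordered[:k]]
    return 1.0 / (sum(logs) / k - logs[-1])


def hill_confidence_interval(sample, k, quantile=1.96):
    """Interval (4.14); the default quantile 1.96 corresponds to the 95% level."""
    alpha_hat = hill_alpha(sample, k)
    half_width = quantile * alpha_hat / math.sqrt(k)
    return alpha_hat - half_width, alpha_hat + half_width


# Hand-made sample whose three largest log-values are 3, 2 and 1 (assumption).
sample = [math.exp(3.0), math.exp(2.0), math.exp(1.0), 1.0]
print(hill_alpha(sample, 3), hill_confidence_interval(sample, 3))
```

For this sample the mean of the three largest log-values is 2 and the log of the third largest value is 1, so the estimate of α equals one.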

Dekkers, Einmahl and de Haan suggest another estimator of ξ, namely

$$
\hat\xi = 1 + H_n^{(1)} + \frac{1}{2}\left(\frac{\left(H_n^{(1)}\right)^{2}}{H_n^{(2)}} - 1\right)^{-1},
$$

where

$$
H_n^{(1)} = \frac{1}{k}\sum_{j=1}^{k}\left(\ln x_{j,n} - \ln x_{k+1,n}\right)
$$

and

$$
H_n^{(2)} = \frac{1}{k}\sum_{j=1}^{k}\left(\ln x_{j,n} - \ln x_{k+1,n}\right)^{2}.
$$

One advantage of the Dekkers estimator is that it covers all ξ ∈ R, while the Hill estimator works best if ξ is positive, see Embrechts et al (1997), pages 339-340. Applying either of these methods to the insurance data gives ξ > 0. Because of this both methods can be used and they are both convenient.

There is a minor problem with these methods and that is how to choose k. We willpresent one way to solve this problem in Section 5. The idea is to plot k against ξ andchoose a k where the graph is roughly horizontal. See Figure 6.3 for an example of a Hillplot.
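The estimator by Dekkers, Einmahl and de Haan can be sketched in the same way. The function name `dekkers_xi`, the simulated Pareto sample with a = 2 (so that ξ = 0.5) and the choice k = 1000 are illustrative assumptions.

```python
import math
import random


def dekkers_xi(sample, k):
    """Dekkers-Einmahl-de Haan estimate of xi from the k largest values; requires k < n."""
    ordered = sorted(sample, reverse=True)  # x_{1,n} >= ... >= x_{n,n}
    base = math.log(ordered[k])  # ln x_{k+1,n} in the 1-based notation of the text
    diffs = [math.log(x) - base for x in ordered[:k]]
    h1 = sum(diffs) / k
    h2 = sum(d * d for d in diffs) / k
    return 1.0 + h1 + 0.5 / (h1 * h1 / h2 - 1.0)


rng = random.Random(6)
claims = [1.0 / (1.0 - rng.random()) ** 0.5 for _ in range(10000)]  # Pareto, a = 2, b = 1
xi_hat = dekkers_xi(claims, 1000)
print(xi_hat)  # should be in the neighbourhood of the assumed value 0.5
```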

4.7 Tail and quantile estimation under MDA conditions

When F ∈ MDA(Φα), α > 0, the Hill estimator can be used to estimate the tail and the tail's quantiles. From Theorem 3.2 it follows that when F ∈ MDA(Φα) the tail behaves like $Cx^{-\alpha}$. Recall that estimators of C and α were found when we derived the Hill estimator. Hence an estimator of $\bar F(x)$ is given by

$$
\hat{\bar F}(x) = \hat C_n(k)\,x^{-1/\hat\xi} = \frac{k}{n}\left(\frac{x}{x_{k,n}}\right)^{-1/\hat\xi}, \tag{4.16}
$$

if x is chosen to be large enough. An estimator of the quantile $x_p$ can be derived from equation (4.16). Replace $\hat{\bar F}$ with 1 − p and solve for x. The estimator of $x_p$ is therefore

$$
\hat x_p = \left(\frac{n}{k}(1-p)\right)^{-\hat\xi} x_{k,n}, \tag{4.17}
$$

where p ∈ (0,1). When applying (4.17) to real data, the parameter p is obtained from equation (4.16), i.e. $p = 1 - \hat{\bar F}(x)$. See Embrechts et al (1997), page 334.
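The estimators (4.16) and (4.17) can be sketched as follows. The function names, the simulated Pareto claims (a = 2, b = 1) and the choice k = 250 are illustrative assumptions. For this distribution the true 0.99 quantile is 10, which gives a rough reference value for the output.

```python
import math
import random


def hill_fit(sample, k):
    """Return (xi_hat, x_kn): the Hill estimate of xi and the k-th largest observation."""
    ordered = sorted(sample, reverse=True)
    logs = [math.log(x) for x in ordered[:k]]
    return sum(logs) / k - logs[-1], ordered[k - 1]


def tail_estimate(x, n, k, xi_hat, x_kn):
    """Estimated tail probability P[X > x] from (4.16), intended for large x."""
    return (k / n) * (x / x_kn) ** (-1.0 / xi_hat)


def quantile_estimate(p, n, k, xi_hat, x_kn):
    """Estimated p-quantile from (4.17)."""
    return ((n / k) * (1.0 - p)) ** (-xi_hat) * x_kn


rng = random.Random(3)
claims = [1.0 / (1.0 - rng.random()) ** 0.5 for _ in range(5000)]  # Pareto, a = 2, b = 1
n, k = len(claims), 250
xi_hat, x_kn = hill_fit(claims, k)
x99 = quantile_estimate(0.99, n, k, xi_hat, x_kn)
print(xi_hat, x99, tail_estimate(x99, n, k, xi_hat, x_kn))
```

Inserting the estimated quantile into the tail estimator returns 1 − p, which is a simple consistency check of the two formulas.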

4.8 Expected claim size

If a claim is larger than a certain level, it can be interesting to know how large the claim is expected to be. This can be interpreted as the expected amount of money the insurance company has to be prepared to pay if a big claim occurs. The expected claim, given that the claim exceeds the quantile $x_p$, is defined as $E[X \mid X > x_p]$. In the general case we have to calculate

$$
E[X \mid X > x_p] = \frac{\int_{x_p}^{\infty} x\,dF}{\int_{x_p}^{\infty} dF}. \tag{4.18}
$$


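In the case of a Pareto distribution, the denominator in (4.18) was calculated in Section 4.5, and is hence

$$
\int_{x_p}^{\infty} dF = \frac{b^{a}}{x_p^{a}}.
$$

The numerator in (4.18) is calculated as follows

$$
\begin{aligned}
\int_{x_p}^{\infty} x\,dF &= \int_{x_p}^{\infty} x f(x)\,dx
= \int_{x_p}^{\infty} x\,\frac{ab^{a}}{x^{a+1}}\,dx
= \lim_{R\to\infty} ab^{a}\int_{x_p}^{R} \frac{1}{x^{a}}\,dx \\
&= \lim_{R\to\infty} ab^{a}\left[\frac{x^{-(a-1)}}{-(a-1)}\right]_{x_p}^{R}
= \lim_{R\to\infty} ab^{a}\left(\frac{R^{-(a-1)}}{-(a-1)} - \frac{x_p^{-(a-1)}}{-(a-1)}\right) \\
&= \frac{ab^{a}x_p^{-(a-1)}}{a-1}
= \frac{ab^{a}x_p^{1-a}}{a-1}.
\end{aligned}
$$

Hence

$$
E[X \mid X > x_p] = \frac{\int_{x_p}^{\infty} x\,dF}{\int_{x_p}^{\infty} dF} = \frac{ab^{a}x_p^{1-a}}{a-1} \cdot \frac{x_p^{a}}{b^{a}} = \frac{ax_p}{a-1}, \tag{4.19}
$$

for the Pareto distribution.

Now turn to the case when F ∈ MDA(Hξ), ξ > 0. Then the expected claim, over the level $x_p$, is given by

$$
E[X \mid X > x_p] = \frac{-x_p}{\hat\xi - 1}, \quad \hat\xi < 1. \tag{4.20}
$$

Equation (4.20) is calculated in the same way as for the Pareto distribution. We use the estimate of $\bar F(x)$, see equation (4.16), to get the denominator in (4.18). To calculate the numerator we need to find the corresponding density function. Since $\hat{\bar F}(x) = 1 - \hat F(x)$ the density function is given by

$$
\hat f(x) = \hat F'(x) = \frac{k}{n\hat\xi x_{k,n}} \cdot \left(\frac{x}{x_{k,n}}\right)^{-1-1/\hat\xi}.
$$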


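The numerator is now given by

$$
\int_{x_p}^{\infty} x\,dF = \int_{x_p}^{\infty} x\,\frac{k}{n\hat\xi x_{k,n}} \cdot \left(\frac{x}{x_{k,n}}\right)^{-1-1/\hat\xi} dx = \frac{-kx_p}{n(\hat\xi - 1)} \cdot \left(\frac{x_p}{x_{k,n}}\right)^{-1/\hat\xi}.
$$

Equation (4.20) is obtained by the quotient

$$
\frac{\int_{x_p}^{\infty} x\,dF}{\int_{x_p}^{\infty} dF} = \frac{\dfrac{-kx_p}{n(\hat\xi-1)} \cdot \left(\dfrac{x_p}{x_{k,n}}\right)^{-1/\hat\xi}}{\dfrac{k}{n}\left(\dfrac{x_p}{x_{k,n}}\right)^{-1/\hat\xi}} = \frac{-x_p}{\hat\xi - 1}, \quad \hat\xi < 1.
$$

If ξ ≥ 1, the expectation is infinite and (4.20) makes no sense. It will turn out that in our case ξ < 1. If the shape parameter were larger than one, an analogous examination could be done for the median instead.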
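Equations (4.19) and (4.20) can be sketched in code as follows. The function names and the numerical values are illustrative assumptions. With a = 1/ξ the two formulas agree, which serves as a simple check.

```python
def expected_claim_pareto(x_p, a):
    """Expected claim above x_p for the Pareto distribution, see (4.19); finite only for a > 1."""
    if a <= 1.0:
        raise ValueError("the expectation is infinite for a <= 1")
    return a * x_p / (a - 1.0)


def expected_claim_mda(x_p, xi):
    """Expected claim above x_p under MDA conditions, see (4.20); requires 0 < xi < 1."""
    if not 0.0 < xi < 1.0:
        raise ValueError("(4.20) requires 0 < xi < 1")
    return -x_p / (xi - 1.0)


print(expected_claim_pareto(10.0, 2.0), expected_claim_mda(10.0, 0.5))
```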


5 Estimating the shape parameter from a Hill plot

This section contains a method that automatically chooses a value from a Hill plot. The section starts with an explanation of the method, followed by an empirical test of the method. A summary ends the section.

If data come from a distribution F ∈ MDA(Φα), α > 0, then a Hill plot can be used to estimate the parameter α. Recall that the data have to be sorted either in increasing or decreasing order to be able to use the Hill plot.

A Hill plot based on the insurance data is given in Figure 6.3. Equation (4.9) is used to construct the figure; details of how to construct such a plot can be found in Section 6.1. With increasing k the Hill estimator stabilizes. The question is which k gives a reasonable estimate of ξ. By looking at Figure 6.3 one person might say that ξ = 0.65 is a good value. Another person might think that ξ = 0.7 is a good approximation. To avoid the problem with the human factor, let a computer automatically decide a value of ξ. The computer should choose a ξ from a region where the graph is stable.

5.1 Explanation of the method

Since the Hill estimator will almost surely converge to a constant, see Theorem 4.1, it could be a good idea to analyse the graph from behind. First of all the Hill estimator has to be calculated for all k; this is done by equation (4.9). The Hill estimator for every k is saved into a vector ξk, where k = 1, . . . , n−1. As already argued, the value of ξ should be chosen somewhere to the right in Figure 6.6, where the graph is stable. The problem is to decide where the graph is stable and where it is not.

To find a stable part of the graph, create a "window" that covers 20% of the Hill plot to the right. Inside this window find the maximum and minimum values of ξk. The vertical range of the window is now given by the maximum value minus the minimum value. Use this information as a starting point. Then expand the window, in the horizontal direction, so that it covers one more value of ξk. If the new window covers a new maximum or minimum, recalculate the vertical range. Repeat this procedure until the window has increased, in vertical range, with 1.5 times the original vertical range. Then choose the first value of ξk that is inside this window. This iteration procedure creates an area where the Hill plot is stable, and as soon as the Hill plot is outside this area the iteration stops. This phenomenon can be seen in Figure 6.4.
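The window procedure can be sketched as follows. This is only one possible reading of the description above, under two assumptions of ours: the expansion stops as soon as the vertical range would exceed 1.5 times the original range, and the chosen value is the leftmost ξk that is still inside the stable window. The function name and the small test vector are hypothetical.

```python
def choose_from_hill_plot(xi, start_fraction=0.2, growth=1.5):
    """Return (k, xi_k) for the leftmost value inside the stable window.

    xi[0] holds the Hill estimate for k = 1, xi[1] for k = 2, and so on.
    """
    n = len(xi)
    left = n - max(1, int(start_fraction * n))  # index of the leftmost value in the initial window
    lo, hi = min(xi[left:]), max(xi[left:])
    limit = growth * (hi - lo)  # assumed stopping rule: 1.5 times the original range
    while left > 0:
        candidate = xi[left - 1]
        new_lo, new_hi = min(lo, candidate), max(hi, candidate)
        if new_hi - new_lo > limit:
            break  # the Hill plot leaves the stable area
        lo, hi = new_lo, new_hi
        left -= 1
    return left + 1, xi[left]


hill_values = [5.0, 3.0, 1.2, 1.1, 1.0, 1.05, 1.0, 1.1, 1.0, 1.1]  # hand-made example
print(choose_from_hill_plot(hill_values))
```

In the hand-made example the initial window contains the last two values, with range 0.1. The window can be expanded to the left until the value 1.2 is reached, which would increase the range to 0.2, so the procedure stops at k = 4.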

5.2 Testing the model

In this subsection an empirical test will be made to see if the model produces a realisticvalue of the shape parameter. The test will be based on artificial data. The idea with thetest is to create artificial data, from a known distribution with a known value of ξ . Oncethe artificial data is obtained it should be possible to apply our model to the data and getapproximately the same value of ξ as for the theoretical distribution. If this works, thenthere is indication that our model produces a realistic value of the shape parameter. Ifour method works for simulated data, we can then assume that the method works for theinsurance data as well.

Artificial data can be created by mapping a uniformly distributed random variable U, from the interval (0,1), through the inverse of a specific distribution function. This works if $F^{\leftarrow}(x)$ exists and is known. See Jäckel (2002), page 99. This means that if U is uniformly distributed, then $F^{\leftarrow}(U)$ follows the distribution function F(x), since

$$
P[F^{\leftarrow}(U) \leq x] = P[U \leq F(x)] = F(x).
$$


                       a = 0.7   a = 1.25   a = 2    a = 5
Estimated value:       1.3909    0.7894     0.5061   0.2061
                       1.4210    0.7774     0.5083   0.1952
                       1.4178    0.7969     0.5025   0.1987
                       1.4166    0.7944     0.4934   0.2051
                       1.4584    0.8141     0.4872   0.1989
                       1.4250    0.8150     0.4848   0.2062
                       1.4531    0.8085     0.4878   0.1975
                       1.4474    0.8184     0.4998   0.1995
                       1.4197    0.8101     0.4976   0.2054
                       1.4139    0.8161     0.5086   0.1983
Mean value:            1.4264    0.8040     0.4976   0.2011
Theoretical value:     1.4286    0.8000     0.5000   0.2000
Absolute difference:   0.0022    0.0040     0.0024   0.0011

Table 5.1: The Hill estimator based on 5000 artificial claims from four different theoretical distributions. The estimates from ten iterations and their mean value are presented, together with the theoretical value of ξ and the absolute difference between the mean value and the theoretical value.

In our case we are interested in creating random outcomes from a Pareto distribution.Inverting equation (4.4) yields

F←(x) =b

(1− x)1/a.

Hence is the artificial claim

z =b

(1−U)1/a(5.1)

Pareto distributed. With help of (5.1) and by simulating U(0,1) random variables is itpossible to create Pareto distributed random variables. It is now possible to test if themethod works for artificial claims.

Before explaining the test, we want to give a short motivation to why it is valid tostudy random outcomes from a Pareto distribution. When we derived the Hill estimatorin Section 4.6, we assumed that the tail behaves like a Pareto distribution over a certainthreshold. Embrechts et al (1997) claim, on their own experience, that this is a verycommon situation when working with insurance data. We confirm this claim with equa-tion (3.32). By constructing a QQ-plot, see Figure 6.9, can we see that this assumptionis realistic in our case as well. Study equation (4.10) and the distribution function (4.4).Assuming that (4.10) is fully specified, i.e C = uα and comparing these two equationsgives that they are equivalent if b = u and a = α . The equality b = u is realistic anda = α = 1/ξ is valid over the threshold u. This way of reasoning strengthens that it isvalid to generate and study random outcomes from the Pareto distribution to perform atest.

The test is performed in the following way. Generate artificial claims from equation (5.1) for different values of a. Every test contains 5000 artificial claims and is repeated 10 times, and every repetition uses a new set of random claims. If the model works, then the mean value over the iterations should be close to the theoretical value of $\xi = a^{-1}$. Tests are performed for the values a = 0.7, a = 1.25, a = 2 and a = 5, so that the theoretical value of ξ is 1.4286, 0.8, 0.5 and 0.2, respectively. The results from the tests are presented in Table 5.1.
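The simulation test can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the thesis: the function names, the seed and the fixed choice k = 500 are hypothetical (the thesis selects k with the window method of Section 5.1), and the Hill estimator is taken in its standard form because equation (4.9) is not repeated in this section.

```python
import math
import random

def pareto_sample(a, b, n, rng):
    """Draw n Pareto(a, b) claims by inverse transform, z = b / (1 - U)^(1/a)."""
    return [b / (1.0 - rng.random()) ** (1.0 / a) for _ in range(n)]

def hill(claims, k):
    """Standard Hill estimate of xi from the k largest claims (assumed form of (4.9))."""
    top = sorted(claims, reverse=True)[:k]
    return sum(math.log(x) for x in top) / k - math.log(top[-1])

def mean_hill(a, b=1.0, n=5000, k=500, reps=10, seed=1):
    """Mean Hill estimate over `reps` independent samples of n artificial claims."""
    rng = random.Random(seed)  # fixed seed (hypothetical) for reproducibility
    estimates = [hill(pareto_sample(a, b, n, rng), k) for _ in range(reps)]
    return sum(estimates) / reps

if __name__ == "__main__":
    for a in (0.7, 1.25, 2.0, 5.0):
        print(f"a = {a}: mean estimate {mean_hill(a):.4f}, theoretical xi {1.0 / a:.4f}")
```

With a fixed k the mean estimate should land near 1/a; how close it gets depends on k and on the random seed.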


5.3 Summary

A model that automatically gives a realistic value of the shape parameter, assuming that the distribution belongs to MDA($\Phi_\alpha$), α > 0, was constructed in Section 5.1. When a distribution belongs to MDA($\Phi_\alpha$), the tail behaves like $L(x)x^{-\alpha}$, as can be seen from Theorem 3.2. To test if the model works we used artificial claims from a Pareto distribution. The Pareto distribution was used since it is of the type $L(x)x^{-\alpha}$ and therefore belongs to MDA($\Phi_\alpha$).

The reason for creating a model that automatically gives an estimate of ξ is to avoid the human factor. The model is based on the Hill estimator, and it chooses a value from a region where the Hill plot is stable. This value is found by letting a window analyze the data from behind, i.e. starting from the largest values of k. The analysis ends when certain criteria are satisfied. The algorithm can be summarized as follows:

I. Calculate the Hill estimator for k = 1, . . . , n−1. Save the result in a vector $\xi_k$.

II. Create a window that covers 20% of $\xi_k$, namely the entries that belong to the largest 20% of the k values.

III. Find the maximum and minimum value inside this window.

IV. Obtain the vertical range of the window by subtracting the minimum value from the maximum value. Remember this range.

V. Increase the horizontal range of the window and, if necessary, recalculate the vertical range. Repeat this step until the vertical range has increased by a factor of 1.5 relative to its original size.

VI. An estimate of the shape parameter is now given by the first value that is inside the window.
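Steps II to VI can be sketched as follows. The summary leaves some details open, so this sketch makes assumptions that are not confirmed by the text: the window grows one index at a time towards smaller k, it stops just before the point that would push the vertical range above 1.5 times the original range, and the "first value" is the entry with the smallest k inside the final window. The function name and parameters are hypothetical.

```python
def stable_value(xi, window_frac=0.2, growth=1.5):
    """Return (k, xi_k) for the leftmost point of the grown window.

    xi[i] holds the Hill estimate for k = i + 1 (step I is assumed done).
    """
    n = len(xi)
    start = n - max(2, round(window_frac * n))   # step II: window over the largest k
    lo, hi = min(xi[start:]), max(xi[start:])    # step III: extremes inside the window
    base_range = hi - lo                         # step IV: original vertical range
    while start > 0:                             # step V: grow towards smaller k
        new_lo = min(lo, xi[start - 1])
        new_hi = max(hi, xi[start - 1])
        if new_hi - new_lo > growth * base_range:
            break                                # range would exceed the allowed growth
        start, lo, hi = start - 1, new_lo, new_hi
    return start + 1, xi[start]                  # step VI: first value in the window

if __name__ == "__main__":
    demo = [3.0, 2.0, 1.0, 0.8, 0.61, 0.60, 0.62, 0.61, 0.60, 0.62]
    print(stable_value(demo))
```

In the small demonstration the unstable left part of the curve is excluded and the window stops at the first entry of the flat region.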

A test to verify that the algorithm really works was done in Section 5.2. The idea of the test was to create artificial claims from a known distribution with a specific value of the shape parameter. Once the data were obtained, our model was applied to the data to see if our method gives approximately the same value of ξ as in the theoretical case. It turns out that the estimated value of ξ and the theoretical value of ξ coincide quite well, see Table 5.1. Note that this is not a mathematical proof that the model works, but it is empirical evidence that the model gives a realistic value of the shape parameter.


6 Analysing the data

This section contains the analysis of the data. First we test if the data are independent and identically distributed. Then we analyze under maximum domain of attraction conditions. The last part covers the Pareto distribution.

The received data were from the beginning divided into different sections, where every section represented one year. The data were presented in chronological order with respect to when the claim occurred, but there was no information about specific dates. It was therefore not possible to determine the time between the first claim and the second claim, and so on. For that reason I chose to put all data into one vector, in the same order as it originally was.

The first step in the analysis will be to test the data for independence. This can be done by plotting the autocorrelation function for different time lags. Equation (4.1), from Section 4.1, is plotted for all integer time lags between zero and 218. Confidence bounds are constructed such that 95% of the sample autocorrelations should be inside these bounds if the data are iid. Since the sample size is 219 the bounds are given by $\pm 1.96/\sqrt{219} \approx \pm 0.132$. See Figure 6.1 for the result. Two of the points in Figure 6.1 are outside the bounds. Because of this, we cannot say from the plot alone whether the data are iid or not. We can use the Ljung-Box test to decide if it is reasonable to reject the hypothesis of independence. Calculating (4.2) gives Q = 64.3. Many statistical textbooks have tables of the quantiles of the chi-squared distribution, but unfortunately only for 100 degrees of freedom or fewer. To find the value of $\chi^2_{1-\alpha}(218)$ for α = 0.95, plot its distribution function

\[ F(x) = 1 - \frac{\Gamma\left(\frac{218}{2}, \frac{x}{2}\right)}{\Gamma\left(\frac{218}{2}\right)}, \tag{6.1} \]

where Γ(·) is the gamma function and Γ(·,·) is the upper incomplete gamma function, and find the value from the graph. Equation (6.1) can be found in Jäckel (2002), page 13. A graphical estimate of the quantile gives $\chi^2_{1-\alpha}(218) = 184.8$, see Figure 6.2. The hypothesis of independence cannot be rejected, with 95% certainty, since $Q < \chi^2_{1-\alpha}(218)$.

If it had turned out that the data were dependent we would have had to search for methods to take away the trend. Another option would have been to search for methods which do not require the data to be iid. In this case there is no need for that.
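The independence check can be sketched with the Python standard library. Several parts are assumptions rather than the thesis' own procedure: equations (4.1) and (4.2) are not repeated here, so the usual sample autocorrelation and the usual Ljung-Box statistic are used; and instead of reading the quantile from a graph of (6.1), the sketch swaps in the Wilson-Hilferty approximation of the chi-squared quantile. All function names are hypothetical. The Ljung-Box test is conventionally carried out against the upper quantile, which is larger than the 0.05-quantile used above, so a value of Q below the 0.05-quantile is also below the upper one.

```python
import math
from statistics import NormalDist

def sample_acf(x, max_lag):
    """Sample autocorrelations rho(1), ..., rho(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(1, max_lag + 1)]

def ljung_box(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_h rho(h)^2 / (n - h)."""
    n = len(x)
    rho = sample_acf(x, max_lag)
    return n * (n + 2) * sum(r * r / (n - h) for h, r in enumerate(rho, start=1))

def iid_bound(n, level=0.95):
    """Half-width of the iid band for the sample autocorrelation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z / math.sqrt(n)

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation of the chi-squared p-quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3

if __name__ == "__main__":
    print(f"iid bound for n = 219: {iid_bound(219):.3f}")
    print(f"approx. 0.05-quantile, 218 df: {chi2_quantile(0.05, 218):.1f}")
    print(f"approx. 0.95-quantile, 218 df: {chi2_quantile(0.95, 218):.1f}")
```

For the real analysis one would call `ljung_box(claims, 218)` on the claim vector and compare the result with the chosen quantile.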

Instead we continue to estimate the tail under the maximum domain of attraction condition and then under the Pareto assumption. But before doing so, it is a good idea to clarify the difference between these two approaches. When we assume that the data follow a distribution that belongs to MDA($\Phi_\alpha$), we immediately assume that the tail behaves like $L(x)x^{-\alpha}$, where L(x) is a slowly varying function. This follows from Theorem 3.2. The function L(x) can for example be a bounded, even function, ln(x) or another function satisfying Definition 3.4. It should be clear that this approach covers a wide range of distribution functions. In Section 6.2, on the other hand, it is assumed that the data are generated from a Pareto distribution. The distribution functions in these two approaches behave asymptotically the same, but they are really two different approaches.


[Plot: sample autocorrelation (vertical axis) against time lag (horizontal axis).]

Figure 6.1: The sample autocorrelation function for different time lags. The two blue lines indicate a 95% confidence interval; two values exceed these lines.

[Plot: F(x) (vertical axis) against x (horizontal axis).]

Figure 6.2: The distribution function of a chi-squared distribution with 218 degrees of freedom. The 0.05-quantile is marked with a red line in the figure, and is equal to 184.8.


6.1 Maximum domain of attraction

The analysis in this section starts with estimating the shape parameter ξ in MDA($H_\xi$), with the help of the techniques explained in Sections 4.6 and 5. After that a confidence interval for ξ is set up. This section ends with calculations of exceedance probabilities, expectations and quantile estimates. A number of figures are presented to make it easier to follow the working process.

When analyzing under the maximum domain of attraction of an extreme value distribution, the first step will be to estimate the shape parameter ξ. It is assumed that the data come from a distribution that belongs to MDA($\Phi_\alpha$), see the short motivation on page 10. But before estimating ξ, the data have to be sorted. The data are sorted in increasing order, with the smallest value first and the largest value last. This is done because Hill's method, explained in Section 4.6, requires the data to be sorted. After sorting the data, apply Hill's method to get an estimate of ξ.

The parameter k in (4.9) is not known, therefore the calculations are done for all possible values of k. This is programmed with the help of a nested loop. The outer loop iterates over the k-values, which in this case are the integers between 1 and 218, while the inner loop calculates the sum in (4.9) for every k. Every k gives one value of the Hill estimator. The result is saved in a vector, where every position holds the Hill estimate for one k. The result is presented in Figure 6.3.
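The calculation of the Hill estimator for all k can be sketched as below. This is an illustration and not the thesis' program: equation (4.9) is not repeated in this section, so the standard form of the Hill estimator is assumed; the data are sorted in decreasing order for convenience; and a running sum is swapped in for the nested loop described above, which gives the same values with less work. The function name is hypothetical.

```python
import math

def hill_curve(claims):
    """Hill estimates for k = 1, ..., n - 1; entry i belongs to k = i + 1.

    Assumed form: (1/k) * sum_{j<=k} ln X_(j) - ln X_(k), where X_(1) >= X_(2) >= ...
    """
    logs = sorted((math.log(x) for x in claims), reverse=True)
    estimates = []
    running = 0.0
    for k in range(1, len(logs)):
        running += logs[k - 1]                     # sum of the k largest log-claims
        estimates.append(running / k - logs[k - 1])
    return estimates

if __name__ == "__main__":
    demo = [math.exp(3), math.exp(2), math.exp(1), math.exp(0)]
    print(hill_curve(demo))
```

Plotting the returned vector against k gives a Hill plot of the kind shown in Figure 6.3.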

We apply the automated method, explained in Section 5.1, to the graph in Figure 6.3 to get an estimate of the shape parameter. See Figure 6.4 for the result. The shape parameter is estimated to be approximately equal to 0.6474. This automatically gives k = 79 and $x_{k,n} = 2060589$. Hill's method works best if ξ > 0, so it is a good idea to try another method to see if the result is approximately the same. Repeating the same procedure with the method of Dekkers, Einmahl and de Haan, explained in Section 4.6, gives ξ ≈ 0.6898, see Figure 6.5.

We know from Theorem 4.1 that the Hill estimator will converge to a constant. It is clear, by looking at Figure 6.3, that ξ is stabilizing. Since the estimator of the shape parameter converges to a constant, we can create a confidence interval for ξ by applying Theorem 4.1. The confidence interval gives information about how good the estimate of ξ is for different values of k. A 95% confidence interval for ξ is created with the help of (4.14), where $\lambda_{p/2} = 1.96$. See Figure 6.6 for the result. It seems that the shape parameter lies between the values 0.6 and 0.8.
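A pointwise confidence band of this kind can be sketched as follows. Equation (4.14) is not repeated in this section, so the sketch assumes the usual interval that follows from the asymptotic normality of the Hill estimator, i.e. the estimate plus or minus the normal quantile times the estimate divided by the square root of k. The function name is hypothetical and the result need not coincide exactly with Figure 6.6.

```python
import math
from statistics import NormalDist

def hill_confidence_band(xi, level=0.95):
    """Pointwise band xi_k +/- z * xi_k / sqrt(k); xi[i] belongs to k = i + 1."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level = 0.95
    band = []
    for i, estimate in enumerate(xi):
        half_width = z * estimate / math.sqrt(i + 1)
        band.append((estimate - half_width, estimate + half_width))
    return band

if __name__ == "__main__":
    for k, (low, high) in enumerate(hill_confidence_band([1.0, 1.0, 1.0, 1.0]), start=1):
        print(k, round(low, 3), round(high, 3))
```

The band narrows as k grows, which is why the interval is only informative in the stable part of the Hill plot.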

The next step is to estimate the probability that a claim is larger than a certain level, given that the claim has exceeded a given threshold u. This can be written as $\bar F_u(x) = P[X > x \mid X > u]$. We estimate $\bar F_u(x)$ with the help of (4.16). For x ≥ u, the probability that a claim is larger than x is given by

\[ P[X > x] = \frac{P[X > x, X > u] \cdot P[X > u]}{P[X > u]} = P[X > x \mid X > u] \cdot P[X > u]. \]

Recall that $P[X > x] = \bar F(x)$ was calculated in (4.16). Hence

\[ \bar F_u(x) = P[X > x \mid X > u] = \frac{P[X > x]}{P[X > u]} = \frac{\bar F(x)}{\bar F(u)} = \frac{\frac{k}{n}\left(\frac{x}{x_{k,n}}\right)^{-1/\xi}}{\frac{k}{n}\left(\frac{u}{x_{k,n}}\right)^{-1/\xi}} = \left(\frac{x}{u}\right)^{-1/\xi}. \]

An estimate of $\bar F_u(x)$, for u = 1064907, is presented both in Figure 6.7 and in Table 6.1.


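The choice of u = 1064907 comes from solving $\bar F(u) = 1$ with respect to u, where $\bar F$ is the estimated tail from (4.16), i.e.

\[ u = \left(\frac{n}{k}\right)^{-\xi} \cdot x_{k,n} = \left(\frac{219}{79}\right)^{-0.6474} \cdot 2060589 \approx 1064907. \]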

It can be seen from Figure 6.7 that the probability that a claim is going to be larger than 2 million SEK is approximately 0.38, given that the claim has exceeded 1064907 SEK. Another way to get this result is to calculate

\[ \bar F_{1064907}(2 \cdot 10^6) = P[X > 2 \cdot 10^6 \mid X > 1064907] = \left(\frac{2 \cdot 10^6}{1064907}\right)^{-1/\xi} = \left(\frac{2 \cdot 10^6}{1064907}\right)^{-1/0.6474} \approx 0.38. \]

Estimation of the p-quantile is done from (4.17), with the same parameters as above. The parameter p is estimated from (4.16). See Figure 6.8 for the result. The probability that a claim is smaller than 1.7 million SEK is approximately 0.5. An estimate of the 0.5-quantile can be calculated directly from (4.17):

\[ x_{0.5} = \left(\frac{n}{k}(1-0.5)\right)^{-\xi} x_{k,n} = \left(\frac{219}{79}(0.5)\right)^{-0.6474} \cdot 2060589 \approx 1.7 \cdot 10^6. \]
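The numbers in this section can be checked with a short script. The sketch below only restates the formulas and parameter values given above; the function names are hypothetical and the tail estimator is taken in the form shown in the derivation of the conditional tail.

```python
N, K = 219, 79          # sample size and chosen number of exceedances
XI = 0.6474             # Hill estimate of the shape parameter
X_KN = 2_060_589        # order statistic x_{k,n} in SEK

def tail(x):
    """Estimated P[X > x] = (k/n) * (x / x_{k,n})^(-1/xi)."""
    return K / N * (x / X_KN) ** (-1.0 / XI)

def threshold():
    """Level u at which the estimated tail equals one."""
    return (N / K) ** (-XI) * X_KN

def conditional_tail(x, u):
    """Estimated P[X > x | X > u] = (x / u)^(-1/xi), valid for x >= u."""
    return (x / u) ** (-1.0 / XI)

def quantile(p):
    """Estimated p-quantile x_p = ((n/k) * (1 - p))^(-xi) * x_{k,n}."""
    return (N / K * (1.0 - p)) ** (-XI) * X_KN

if __name__ == "__main__":
    u = threshold()
    print(f"u = {u:.0f}")
    print(f"P[X > 2e6 | X > u] = {conditional_tail(2e6, u):.4f}")
    print(f"x_0.5 = {quantile(0.5):.0f}")
```

Because the estimated tail equals one at u, the unconditional quantile formula and the conditional probabilities refer to the same set of claims, namely those above u.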

A closed expression for the expected claim size was derived in Section 4.8. Applying the estimate of ξ to equation (4.20) gives

\[ E[X \mid X > u] = \frac{-u}{\xi - 1} = \frac{-u}{0.6474 - 1} = \frac{u}{0.3526}. \tag{6.2} \]

Equation (6.2), evaluated for different values of u, is presented in the fourth column of Table 7.1.
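As a brief recap of why the expectation has this form, the following worked step starts from the conditional tail $(x/u)^{-1/\xi}$ derived earlier in this section and assumes 0 < ξ < 1, so that the integral converges. It is a sketch for orientation and is not meant to replace the derivation in Section 4.8.

```latex
\begin{aligned}
E[X \mid X > u]
  &= u + \int_u^{\infty} P[X > x \mid X > u]\, dx
   = u + \int_u^{\infty} \left(\frac{x}{u}\right)^{-1/\xi} dx \\
  &= u + \frac{u\,\xi}{1 - \xi}
   = \frac{u}{1 - \xi}
   = \frac{-u}{\xi - 1}.
\end{aligned}
```

For ξ = 0.6474 the denominator is 1 − 0.6474 = 0.3526, so the expected claim size is roughly 2.8 times the level u.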


[Plot: estimate of ξ (vertical axis) against k (horizontal axis).]

Figure 6.3: A Hill plot based on insurance data. The number of exceedances k is plotted on the horizontal axis while the estimate of ξ is plotted on the vertical axis.

[Plot: estimate of ξ (vertical axis) against k (horizontal axis).]

Figure 6.4: A Hill plot where the stable part of the graph is marked with two horizontal green lines. The stable part is found with the help of the technique explained in Section 5.


[Plot: estimate of ξ (vertical axis) against k (horizontal axis).]

Figure 6.5: A Dekkers, Einmahl and de Haan plot based on insurance data. The estimate of ξ is plotted on the vertical axis and the number of exceedances k is plotted on the horizontal axis.

[Plot: estimate of ξ with confidence bounds (vertical axis) against k (horizontal axis).]

Figure 6.6: A Hill plot together with 95% confidence intervals. The Hill plot is blue and the confidence intervals are red. The number of exceedances k is plotted on the horizontal axis while the estimate of ξ is plotted on the vertical axis.


[Plot: $1 - F_u(x)$ (vertical axis) against claim size x, from 1 to 10 million SEK (horizontal axis).]

Figure 6.7: A tail estimation of the data, given that the claim is exceeding 1064907 SEK. The claim sizes are represented on the horizontal axis and the estimate of $\bar F_u(x)$ is given on the vertical axis.

[Plot: $x_p$ (vertical axis) against p (horizontal axis).]

Figure 6.8: Estimate of the p-quantile. The probabilities are plotted on the horizontal axis and the p-quantile is plotted on the vertical axis.


6.2 Pareto

In Section 4.3, it was claimed that it is common for the tail to follow a Pareto distribution. To be able to see if this is a good assumption we first need to estimate the parameters a and b in (4.4). A QQ-plot is then used to graphically test how well the data follow a Pareto distribution. To graphically test in which region the Pareto distribution is valid, a mean excess plot is used. The last step is to calculate the probabilities and the quantiles for the Pareto distribution.

The first step in this analysis will be to estimate the parameters of the Pareto distribution; how to do this is explained in Section 4.3. The parameter a in the Pareto distribution is estimated with the maximum likelihood method and is estimated to be equal to 1.4606. The parameter b is estimated with the smallest value in the sample, i.e. b = 1001984.
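The parameter estimation can be sketched as below. Section 4.3 is not repeated here, so the sketch assumes the standard maximum likelihood estimator of a for a Pareto distribution whose scale b is set to the sample minimum; the function name is hypothetical.

```python
import math

def fit_pareto(claims):
    """Return (a_hat, b_hat): b_hat is the sample minimum, a_hat the MLE given b_hat."""
    b_hat = min(claims)
    log_sum = sum(math.log(x / b_hat) for x in claims)  # zero only if all claims are equal
    a_hat = len(claims) / log_sum
    return a_hat, b_hat

if __name__ == "__main__":
    demo = [1.0, math.e, math.e ** 2]
    print(fit_pareto(demo))
```

Applied to the claim vector, a function of this kind would be expected to return values close to a = 1.4606 and b = 1001984 if the thesis uses the same estimator.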

By constructing a QQ-plot it is possible to graphically assess how good the Pareto assumption is. The QQ-plot should be roughly linear if the data follow a Pareto distribution, see Section 4.4. Recall that the data have to be sorted in increasing order. A QQ-plot is constructed by plotting (4.6), where, with the data in increasing order, the Pareto quantile at level k/(n+1) is

\[ F^{\leftarrow}\left(\frac{k}{n+1}\right) = \frac{b}{\left(\frac{n-k+1}{n+1}\right)^{1/a}}, \quad k = 1, \ldots, n. \]

See Figure 6.9 for the result. Only a few of the largest values differ from the expected Pareto value. It seems that a Pareto distribution is a good choice.
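The points of such a QQ-plot can be generated as in the following sketch. Equation (4.6) is not repeated in this section, so the plotting position k/(n+1) for the k-th smallest claim is an assumption, as is the function name; the quantile function is the inverse of (4.4) used in (5.1).

```python
def pareto_qq_points(claims, a, b):
    """Pair each observed claim with its expected Pareto quantile."""
    n = len(claims)
    ordered = sorted(claims)                       # increasing order
    points = []
    for k, observed in enumerate(ordered, start=1):
        level = k / (n + 1)                        # assumed plotting position
        expected = b / (1.0 - level) ** (1.0 / a)  # Pareto quantile function
        points.append((observed, expected))
    return points

if __name__ == "__main__":
    for observed, expected in pareto_qq_points([3.0, 1.0, 2.0], a=1.0, b=1.0):
        print(observed, expected)
```

If the data follow the fitted Pareto distribution, the returned points lie close to the line where the observed and expected values are equal.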

The assumption about the Pareto distribution seems to be valid. But is the assumption valid for the whole tail? To decide this, construct a mean excess plot. The empirical mean excess plot should be roughly linear if the Pareto distribution is valid, see Section 4.5. See Figure 6.10 for the empirical mean excess plot. The graph is roughly linear for all x, hence the Pareto distribution is probably valid for all x > 1001984.
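The empirical mean excess function can be sketched as below. Section 4.5 is not repeated here, so the usual definition is assumed: the average of x − u over all claims x that exceed u. The function names are hypothetical.

```python
def mean_excess(claims, u):
    """Empirical mean excess e_n(u): average of x - u over the claims with x > u."""
    excesses = [x - u for x in claims if x > u]
    if not excesses:
        raise ValueError("no claims exceed u")
    return sum(excesses) / len(excesses)

def mean_excess_points(claims):
    """Points (u, e_n(u)) with u running over the distinct observations except the largest."""
    thresholds = sorted(set(claims))[:-1]
    return [(u, mean_excess(claims, u)) for u in thresholds]

if __name__ == "__main__":
    print(mean_excess_points([1.0, 2.0, 3.0, 4.0]))
```

For a Pareto distribution with a > 1 the theoretical mean excess is u/(a − 1), which is linear in u; this is the behaviour the plot is checked against.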

To find a closed expression for $\bar F_u(x) = P[X > x \mid X > u]$, turn to equation (4.4). Recall that $\bar F(x) = 1 - F(x)$ and hence

\[ \bar F(x) = 1 - F(x) = \begin{cases} \left(\frac{b}{x}\right)^{a} & \text{for } x \geq b, \\ 1 & \text{for } x < b. \end{cases} \tag{6.3} \]

Use the same technique as in the previous section to get, for x ≥ u ≥ b,

\[ \bar F_u(x) = P[X > x \mid X > u] = \frac{P[X > x]}{P[X > u]} = \frac{\left(\frac{b}{x}\right)^{a}}{\left(\frac{b}{u}\right)^{a}} = \left(\frac{u}{x}\right)^{a}. \]

To find the p-quantiles for the Pareto distribution, replace $\bar F(x)$ and x with 1 − p and $x_p$ in (6.3), respectively. Solving with respect to $x_p$ gives an estimate of the p-quantile,

\[ x_p = \frac{b}{(1-p)^{1/a}}. \]

The values of $P[X > x \mid X > u]$ for different values of x are presented in Table 6.1. The left column in Table 6.1 represents different claim sizes. The second and third columns give $\bar F_u(x)$ for estimation under maximum domain of attraction and under Pareto estimation. For example, $P[X > 8 \cdot 10^6 \mid X > 1001984] = 0.0481$ for the Pareto model. In Section 7.2 a more extended table is presented.

For comparison with the previous section, the 0.5-quantile is given by

\[ x_{0.5} = \frac{1001984}{(0.5)^{1/1.4606}} \approx 1.6 \cdot 10^6. \]


The probability that a claim is larger than 2 million SEK, given that the claim is exceeding 1001984, is given by

\[ \bar F_{1001984}(2 \cdot 10^6) = P[X > 2 \cdot 10^6 \mid X > 1001984] = \left(\frac{1001984}{2 \cdot 10^6}\right)^{1.4606} \approx 0.36. \]

The expected claim size, given that the claim is exceeding a certain level, was discussed in Section 4.8. Inserting the estimate of a into (4.19) yields

\[ E[X \mid X > u] = \frac{a \cdot u}{a - 1} = \frac{1.4606 \cdot u}{1.4606 - 1} = \frac{1.4606 \cdot u}{0.4606}. \tag{6.4} \]

Equation (6.4) is evaluated for different values of u in Table 7.2.
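The Pareto results of this section can be reproduced with a few lines of Python. The sketch only restates the formulas and the estimates a = 1.4606 and b = 1001984 given above; the function names are hypothetical.

```python
A = 1.4606          # maximum likelihood estimate of a
B = 1_001_984       # smallest claim in the sample, in SEK

def conditional_tail(x, u):
    """P[X > x | X > u] = (u / x)^a, valid for x >= u >= b."""
    return (u / x) ** A

def quantile(p):
    """p-quantile x_p = b / (1 - p)^(1/a)."""
    return B / (1.0 - p) ** (1.0 / A)

def expected_claim(u):
    """E[X | X > u] = a * u / (a - 1), finite because a > 1."""
    return A * u / (A - 1.0)

if __name__ == "__main__":
    print(f"P[X > 2e6 | X > b] = {conditional_tail(2e6, B):.4f}")
    print(f"x_0.5 = {quantile(0.5):.0f}")
    print(f"E[X | X > x_0.2] = {expected_claim(quantile(0.2)):.0f}")
```

The values agree with the corresponding entries of Table 6.1 and Table 7.2 up to rounding.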

[Plot: expected Pareto value (vertical axis) against observed value (horizontal axis).]

Figure 6.9: A QQ-plot for a Pareto distribution with values a = 1.4606 and b = 1001984. Only three values differ from the expected Pareto value, which is represented by the blue line.


[Plot: $e_n(x)$ (vertical axis) against x (horizontal axis).]

Figure 6.10: An empirical mean excess plot for the observed values between 1001984 and ten million SEK.

Claim size x     MDA: P[X > x | X > u]     Pareto: P[X > x | X > u]

 2·10^6          0.3777                    0.3644
 3·10^6          0.2019                    0.2015
 4·10^6          0.1295                    0.1324
 5·10^6          0.0917                    0.0956
 6·10^6          0.0692                    0.0732
 7·10^6          0.0546                    0.0585
 8·10^6          0.0444                    0.0481
 9·10^6          0.0370                    0.0405
10·10^6          0.0314                    0.0347

Table 6.1: Tail estimation under maximum domain of attraction (MDA) and under Pareto assumptions. The probability that a claim is larger than 5 million SEK is approximately 9.17% and 9.56% with the respective models, given that the claim has exceeded u (u = 1064907 for the MDA model and u = 1001984 for the Pareto model).


7 Result

This section contains a summary of the results from Section 6. Focus lies on exceedance probabilities, quantiles and expectations. Tables are created to highlight the result. The section is divided into two parts. The first part covers the maximum domain of attraction model while the Pareto model is treated in the second part.

If a claim occurs and exceeds the threshold value u = 1064907, then Table 7.1 is valid. Table 7.1 covers the results for estimation under maximum domain of attraction conditions. The results related to the Pareto estimation are presented in Table 7.2, which is valid for claims that exceed the threshold value u = 1001984. The quantile $x_p$ in the second column represents a bound. The bound $x_p$ is constructed in such a way that, with probability p, the claim that has occurred lies between u and $x_p$. In the third column $P[X > x_p \mid X > u]$ is listed for different values of $x_p$. The expected claim size, given that the claim exceeds $x_p$, is given in the fourth column.

Note that the values in the left column of Tables 7.1 and 7.2 are not spread evenly. The expectation shows a roughly linear behavior for p < 0.6, but for p > 0.8 the expectation grows more rapidly. Because of this, finer steps are presented for values of p between 0.8 and 0.98.

7.1 Maximum domain of attraction

The results from Section 6.1 are summarized below and in Table 7.1.

For example, the quantile $x_{0.4} = 1.4823 \cdot 10^6$ in Table 7.1 represents a bound. The probability that a claim is between u and $x_{0.4}$ is 0.4. The probability that a claim is larger than $x_{0.4}$ is 0.6, given that the claim has exceeded the threshold u = 1064907. If the claim exceeds $x_{0.4}$ then the claim is expected to be $4.2039 \cdot 10^6$ SEK.

To manually calculate values that are not represented in Table 7.1, use

\[ x_p = \left(\frac{219}{79}(1-p)\right)^{-0.6474} \cdot 2060589 \]

to calculate the quantiles. The probability that a claim is larger than a certain level is calculated from

\[ \bar F_u(x_p) = P[X > x_p \mid X > u] = \left(\frac{x_p}{u}\right)^{-1/0.6474}, \]

where u = 1064907. The expected claim size, given that the claim is larger than $x_p$, is calculated from

\[ E[X \mid X > x_p] = \frac{x_p}{0.3526}. \]

7.2 Pareto

The results for the Pareto model are presented in Table 7.2 and the main formulas are summarized below.

Table 7.2 is composed in the same way as Table 7.1. A concrete example of how to use Table 7.2 is the following. If a claim occurs and exceeds the threshold 1001984, we can say that the claim will be smaller than or equal to $x_{0.2} = 1.1674 \cdot 10^6$ with a probability of 0.2. If the claim exceeds $x_{0.2}$ then it is expected to be $3.7018 \cdot 10^6$.

To calculate a value that is not represented in Table 7.2, the following formula does the job for the second column:

\[ x_p = \frac{1001984}{(1-p)^{1/1.4606}}. \]


The third column, which represents the probability that a given claim is larger than a certain level, is calculated with the help of

\[ \bar F_u(x_p) = P[X > x_p \mid X > u] = \left(\frac{u}{x_p}\right)^{1.4606}, \]

where u = 1001984. The expectation, which is represented in the fourth column, is given by

\[ E[X \mid X > x_p] = \frac{1.4606 \cdot x_p}{0.4606}. \]
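Rows that are missing from Table 7.1 and Table 7.2 can be generated with the formulas of Sections 7.1 and 7.2. The sketch below is a convenience wrapper around those formulas; the function names and the default arguments are hypothetical, and the parameter values are the estimates reported in Section 6. The third entry of each row is 1 − p by construction, because $x_p$ is defined as the level exceeded with conditional probability 1 − p.

```python
def mda_row(p, n=219, k=79, xi=0.6474, x_kn=2_060_589):
    """Row (p, x_p, P[X > x_p | X > u], E[X | X > x_p]) for the MDA model."""
    x_p = (n / k * (1.0 - p)) ** (-xi) * x_kn
    return p, x_p, 1.0 - p, x_p / (1.0 - xi)

def pareto_row(p, a=1.4606, b=1_001_984):
    """Row (p, x_p, P[X > x_p | X > u], E[X | X > x_p]) for the Pareto model."""
    x_p = b / (1.0 - p) ** (1.0 / a)
    return p, x_p, 1.0 - p, a * x_p / (a - 1.0)

if __name__ == "__main__":
    for p in (0.40, 0.85, 0.98):
        print("MDA   ", ["%.4g" % v for v in mda_row(p)])
        print("Pareto", ["%.4g" % v for v in pareto_row(p)])
```

A value such as p = 0.85, which is not listed in the tables, can be obtained in the same way.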

Risk p    p-quantile x_p    Probability P[X > x_p | X > u]    Expectation E[X | X > x_p]

0.10      1.1401·10^6       0.90                              3.2334·10^6
0.20      1.2304·10^6       0.80                              3.4895·10^6
0.30      1.3415·10^6       0.70                              3.8046·10^6
0.40      1.4823·10^6       0.60                              4.2039·10^6
0.50      1.6680·10^6       0.50                              4.7306·10^6
0.60      1.9272·10^6       0.40                              5.4658·10^6
0.70      2.3218·10^6       0.30                              6.5848·10^6
0.80      3.0187·10^6       0.20                              8.5614·10^6
0.82      3.232·10^6        0.18                              9.166·10^6
0.84      3.488·10^6        0.16                              9.892·10^6
0.86      3.803·10^6        0.14                              10.785·10^6
0.88      4.202·10^6        0.12                              11.917·10^6
0.90      4.728·10^6        0.10                              13.410·10^6
0.92      5.463·10^6        0.08                              15.494·10^6
0.94      6.582·10^6        0.06                              18.666·10^6
0.96      8.557·10^6        0.04                              24.269·10^6
0.98      13.404·10^6       0.02                              38.014·10^6

Table 7.1: A selection of quantiles, probabilities and expectations based on the maximum domain of attraction estimation. How to interpret the table and how to calculate values that are not given in it is explained in Section 7.1.


Risk p    p-quantile x_p    Probability P[X > x_p | X > u]    Expectation E[X | X > x_p]

0.10      1.0769·10^6       0.90                              3.4150·10^6
0.20      1.1674·10^6       0.80                              3.7018·10^6
0.30      1.2791·10^6       0.70                              4.0562·10^6
0.40      1.4215·10^6       0.60                              4.5077·10^6
0.50      1.6105·10^6       0.50                              5.1070·10^6
0.60      1.8763·10^6       0.40                              5.9500·10^6
0.70      2.2848·10^6       0.30                              7.2453·10^6
0.80      3.0159·10^6       0.20                              9.5635·10^6
0.82      3.241·10^6        0.18                              10.279·10^6
0.84      3.514·10^6        0.16                              11.142·10^6
0.86      3.850·10^6        0.14                              12.209·10^6
0.88      4.279·10^6        0.12                              13.568·10^6
0.90      4.847·10^6        0.10                              15.372·10^6
0.92      5.648·10^6        0.08                              17.909·10^6
0.94      6.877·10^6        0.06                              21.808·10^6
0.96      9.077·10^6        0.04                              28.785·10^6
0.98      14.590·10^6       0.02                              46.267·10^6

Table 7.2: A selection of quantiles, probabilities and expectations based on the Pareto distribution. How to interpret the table is explained in Section 7.


8 Discussion

In this section the analysis and the result are discussed. Difficulties that were encountered are also mentioned.

The first step in the analysis was to see if there were any trends in the data. The autocorrelation function in combination with the Ljung-Box test gives no reason to reject the hypothesis that the data are iid, at the 95% level. It seems realistic that the data are independent, since the data only consist of claims that exceed 1 million SEK and the claims are spread over a long time interval. A claim that occurred in the year 2007 should probably be independent of a claim that occurred in 2002.

Choosing the optimal value for k from the Hill plot in Figure 6.3 is not an easy task. Many different values of k give almost the same value for the Hill estimator. In Figure 6.6 it is clear that the Hill estimator is stabilizing for large k. Remember that k is supposed to be chosen from a region where the Hill plot is roughly horizontal. In Section 5 we presented one way to automatically choose a value of ξ, or equivalently a value of k. A moving window was used to decide where the graph was stable. This method works on simulated data, therefore it is assumed that the method also works on the insurance data. There have been studies of how to choose the optimal k from a sample. One such study has been done by Gomes et al (2001). The reason for not applying the algorithm of Gomes et al is that it is essentially designed for large samples.

Estimating the parameters in the Pareto model is straightforward. The difficult part is to decide if the assumption about the Pareto model is valid, and in which region the assumption is valid. The quantile plot in Figure 6.9 shows that the Pareto distribution seems to fit the data quite well. If the data were following an exact Pareto distribution then all points in Figure 6.9 should lie on the blue line. Only three points seem to differ from the blue line. For empirical data this fit is quite good. As an extra validation of the Pareto model, an empirical mean excess function has been constructed, see Figure 6.10. The Pareto model is valid in the region where the empirical mean excess plot is roughly linear. One problem is of course the meaning of "roughly linear". I think that the points in Figure 6.10 have a linear behavior. This means that the Pareto model should be valid as a model of the tail. Because of this it is natural to choose u as the smallest value in the sample. The Pareto model is probably valid in a region below 1 million SEK as well, since the plot is roughly linear for all values above 1 million. To find out if this is the case, more insurance data are needed. For example, if an analysis similar to the one in Section 6.2 were done on all claims exceeding 500 000 SEK, then it would probably be possible to find a lower value of u such that the Pareto model holds true. Finding the optimal u is perhaps of interest for the insurance company.

The two different models give two different values of u. In the Pareto case u = 1001984 and in the other model u = 1064907. These thresholds are almost the same. The insurance data contain 14 claims that are smaller than the threshold 1064907, so by choosing the higher of the two thresholds not much data is lost. I don't think that the choice of u is of great importance as long as u is between 1 and 1.1 million SEK. As already mentioned, more insurance data are needed to be able to find the lowest u such that both models hold.

The exceedance probabilities for both models are essentially equal, see Tables 7.1 and 7.2. These probabilities seem to be reliable since two different models give almost the same result. This means that both models seem to be valid for all claims over u. But be careful when using the models for very large claims. This warning is given because there were only a few values exceeding the limit 30 million SEK. The uncertainty is larger for large values because there were so few large values in the sample.


It is not possible to find an upper bound for the claim sizes. This is obvious since the size of the worst possible claim goes to infinity. Therefore it is natural to discuss the expected claim size instead. The insurance company could use the expectations in Tables 7.1 and 7.2 to decide on an appropriate priority limit. Recall that different priority levels have different premiums. If two different priority levels have almost the same expectations, then the relation between these two levels and their premiums should be roughly proportional. For example, if the priority level 15 million has the expected claim size 17 million and the priority level 20 million has the expected claim size 22 million, then their premiums should be roughly equal, since the cost for the reinsurer is almost the same. The lower priority level should of course be a little more expensive, since more small claims than big claims occur. The different priority levels and their premiums are unfortunately not known to me.


9 Conclusion

This section contains a summary of the conclusions drawn. Suggestions of how to improve the paper and open questions are presented. The paper ends with some suggestions of how to continue the work.

The purpose of this paper was to deliver material consisting of exceedance probabilities to Länsförsäkring Kronoberg. Two different models were used to estimate these probabilities: maximum domain of attraction and Pareto.

It seems that both the maximum domain of attraction model and the Pareto model could be used to fit the insurance data. The maximum domain of attraction model seems to be valid since the Hill plot is stabilizing, see Figure 6.6. The stability of the graph and the construction of the confidence intervals give an indication that this model fits the data. The quantile plot and the mean excess plot, Figures 6.9 and 6.10, give an indication that the Pareto model is valid.

When analyzing under maximum domain of attraction conditions we faced the problem of choosing a good value of the shape parameter from the Hill plot. This problem was solved by analyzing the Hill plot from the right hand side until the graph's amplitude was no longer stable. This technique works on artificial data, therefore it is assumed that it works on the insurance data as well. No mathematical proof has been presented for this method, only empirical evidence.

Calculating exceedance probabilities for both models separately gives essentially equal results, see Tables 7.1 and 7.2. It should not play a big role which model the insurance company chooses to use. The company should use the model with the least statistical error, i.e. the model with the narrowest confidence interval. Unfortunately I could not find a confidence interval for the Pareto estimation. The error in the Pareto model is left as an open question. But if a confidence interval can be found, it could help the insurance company to decide between the two models.

Both models anticipate that huge claims can occur, with very little probability. But the models are just mathematical models. They don't take everything into consideration. For example, there is always an extra risk factor that can't be estimated. This extra risk can be seen as a gambling phenomenon. The financial cost after a fire depends on many different things. Is the fire restricted to one building or several buildings? Is the fire in an ordinary house or in a factory? A fire in a power plant is probably more expensive than a fire in a single-storey house. The circumstances around the accident can also play a certain role: is it a rainy, sunny or stormy day? The list of things that can have an impact on the claim size, but have not been considered in this paper, can be made very long. The message is that there is always an external risk that is very difficult to estimate and to take into consideration in a mathematical model.

However, some things could have been improved in this work. Taking inflation into consideration would probably give a more reliable result. If the data were generated from a perfect Pareto distribution, then all the points in Figure 6.9 would lie on the blue line. Three of the largest claims deviate from the line. Maybe there exists another distribution that fits the data better than the Pareto distribution. The method for choosing a value from a Hill plot can also be improved. Instead of choosing the first value that is inside the window, one could take the mean of the maximum and minimum values. But then we face a new problem: it is very unlikely that the Hill estimate takes exactly this mean value anywhere inside the window, because the Hill estimate is only calculated for discrete values of k. If there is no exact match of the mean value, then there is no corresponding k. One suggestion for solving this is to choose the value that is closest to the mean value. But I do not think that this would have a big influence on the results in this paper.
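The suggested refinement can be sketched as follows. The names Hill_est and start_value follow the program code in the appendix, but the numbers are hypothetical, and the rule shown (pick the estimate closest to the midpoint of the window's maximum and minimum) is only one possible reading of the suggestion above.

```matlab
% Hypothetical Hill estimates and start of the stable window (illustration only)
Hill_est = [2.9; 2.1; 1.8; 1.2; 1.6; 1.3; 1.45];
start_value = 4;                              % first index inside the stable window

window = Hill_est(start_value:end);           % estimates inside the window
target = (max(window) + min(window))/2;       % mean of the maximum and minimum value
[~, offset] = min(abs(window - target));      % position of the closest estimate
k_chosen = start_value + offset - 1;          % corresponding k in the full Hill plot
estimated_value = Hill_est(k_chosen, 1);
fprintf('k = %d, estimate = %.2f\n', k_chosen, estimated_value);
```

Because min returns the first minimiser, ties are resolved in favour of the smallest k; a different tie-breaking rule would be equally defensible.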


There was not enough time to consider everything that could help the insurance company to decide on an appropriate priority level, since this paper had to be written in a very limited time. One suggestion for how to continue the work is to estimate the time between specific events. Another suggestion is to study which type of event occurs most frequently, claims based on fire or claims based on responsibility accidents. A third suggestion is to study the frequency with which claims exceed the threshold value of 1 million SEK; to do this the full data sample is needed.


References

[1] Brockwell, Peter & Davis, Richard (2002), Introduction to Time Series and Forecasting, Second edition, New York: Springer-Verlag.

[2] Deheuvels, Paul, Haeusler, Erich & Mason, David (1988), Almost sure convergence of the Hill estimator, Math. Proc. Cambridge Philos. Soc. 104, 371-381.

[3] Embrechts, Paul, Klüppelberg, Claudia & Mikosch, Thomas (1997), Modelling Extremal Events for Insurance and Finance, Berlin: Springer-Verlag.

[4] Gomes, Ivette & Oliveira, Orlando (2001), The Bootstrap Methodology in Statistics of Extremes - Choice of the Optimal Sample Fraction, Extremes 4:4, 331-358.

[5] Haan, L. de (1970), On Regular Variation and Its Application to Weak Convergence of Sample Extremes, Amsterdam: Mathematical Centre Tracts 32.

[6] Hall, Peter (1982), On some Simple Estimates of an Exponent of Regular Variation, Journal of the Royal Statistical Society, Series B 44, 37-42.

[7] Jäckel, Peter (2002), Monte Carlo Methods in Finance, Wiltshire: Wiley.

[8] Milton, J. Susan & Arnold, Jesse C. (2003), Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences, Fourth edition, New York: McGraw-Hill.

[9] Resnick, S.I. (1987), Extreme Values, Regular Variation, and Point Processes, New York: Springer-Verlag.


A Program code

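...
sorted_data = sort(data);
n = length(sorted_data);
Hill_est = zeros(n-1,1);
for k = 1:n-1
    % Sum of the log-spacings between the k largest observations
    % and the (k+1)-th largest observation
    log_sum = 0;
    for i = 1:k
        log_sum = log_sum + log(sorted_data(n-i+1,1)) - log(sorted_data(n-k,1));
    end
    Hill_est(k,1) = (1/k)*log_sum;
end
% Invert to obtain the estimate of the tail index
Hill_est = 1./Hill_est;
...

Program code 1: The following program code shows how to calculate the Hill estimator, equation (4.9), in MATLAB.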
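As a check on artificial data, the Hill estimator can also be computed without explicit loops. The sketch below is not part of the original program code: it assumes a strict Pareto tail P(X > x) = x^(-α) for x ≥ 1 with a hypothetical tail index of 2, and it replaces random simulation with evenly spaced quantiles so that the outcome is reproducible.

```matlab
% Artificial Pareto sample via the inverse transform X = U^(-1/alpha)
alpha_true = 2;                       % hypothetical tail index
n = 5000;                             % hypothetical sample size
U = ((1:n)' - 0.5)/n;                 % evenly spaced quantile levels in (0,1)
data = U.^(-1/alpha_true);

logs = log(sort(data, 'descend'));    % logs(i) = log of the i-th largest value
k = (1:n-1)';
% Mean of the k largest log-values minus the log of the (k+1)-th largest
H = cumsum(logs(1:n-1))./k - logs(2:n);
Hill_est = 1./H;                      % Hill estimates of the tail index
fprintf('Hill estimate at k = 500: %.4f\n', Hill_est(500));
```

For a sample of this kind the estimates stay close to the assumed tail index over a wide range of k, which is the stabilizing behaviour one looks for in a Hill plot.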

...
data = Hill_est;
n = length(data);
initfraction = 0.2;       % fraction of the plot, counted from the right, used as initial window
allowedincrease = 1.5;    % stop when the span exceeds this multiple of the initial span
initlength = ceil(n*initfraction);
minimum = data(n,1);
maximum = data(n,1);
% Span of the Hill estimates inside the initial window
for i = n:-1:(n-initlength)
    minimum = min(minimum,data(i,1));
    maximum = max(maximum,data(i,1));
end
initspan = maximum-minimum;
currspan = initspan;
index = n-initlength;
% Extend the window to the left while the span remains stable
while index >= 1 && (currspan < allowedincrease*initspan)
    minimum = min(minimum,data(index,1));
    maximum = max(maximum,data(index,1));
    currspan = maximum-minimum;
    index = index-1;
end
start_value = index+1;
estimated_value = Hill_est(start_value,1);
...

Program code 2: The program code shows how to choose a value of the shape parameter from a Hill plot in MATLAB.


...
sorted_data = sort(data);
n = length(sorted_data);
ME = zeros(n,1);
card_delta = zeros(n,1);
for i = 1:n
    % Number of observations strictly above the threshold sorted_data(i)
    card_delta(i,1) = length(find(sorted_data(:,1) > sorted_data(i,1)));
    if card_delta(i,1) == 0
        ME(i,1) = 0;
    else
        % Sum of the excesses over the threshold
        tmp_sum = 0;
        for j = i:n
            tmp_sum = sorted_data(j,1)-sorted_data(i,1)+tmp_sum;
        end
        ME(i,1) = tmp_sum/card_delta(i,1);
    end
end
plot(sorted_data,ME,'.'),xlabel('x'),ylabel('e_n(x)')
...

Program code 3: This program code is written in MATLAB and shows how to produce an empirical mean excess plot.
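A mean excess plot supports a Pareto model because, for a strict Pareto tail with index α > 1, the theoretical mean excess function is linear in the threshold, e(u) = u/(α - 1). The sketch below illustrates this on artificial data. The tail index, sample size and threshold are hypothetical choices, the Pareto form P(X > x) = x^(-α) is an assumption that may differ from the parametrization used in the thesis, and evenly spaced quantiles stand in for simulated claims so that the result is reproducible.

```matlab
% Artificial Pareto sample (assumed form: P(X > x) = x^(-alpha) for x >= 1)
alpha_true = 3;                        % hypothetical tail index, must exceed 1
n = 10000;                             % hypothetical sample size
U = ((1:n)' - 0.5)/n;                  % evenly spaced quantile levels in (0,1)
sorted_data = sort(U.^(-1/alpha_true));

k = 1000;                              % number of observations above the threshold
u = sorted_data(n-k);                  % threshold: (k+1)-th largest observation
excess = sorted_data(sorted_data > u) - u;
me_empirical = mean(excess);           % empirical mean excess e_n(u)
me_theory = u/(alpha_true - 1);        % Pareto mean excess, linear in u
fprintf('e_n(u) = %.3f, u/(alpha-1) = %.3f\n', me_empirical, me_theory);
```

The two values agree closely at this threshold; at very high thresholds only a few observations remain, so the empirical mean excess becomes unstable, which is why the right-hand end of a mean excess plot is usually read with caution.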

...
n = length(data);
autocorr = zeros(n-1,1);
mean_data = mean(data);
% Denominator: centred sum of squares, consistent with the centred numerator
sum_denominator = 0;
for i = 1:n
    sum_denominator = sum_denominator+(data(i,1)-mean_data)^2;
end
for lag = 1:(n-1)
    sum_numerator = 0;
    for i = 1:(n-lag)
        sum_numerator = sum_numerator+(data(i,1)-mean_data)*(data(i+lag,1)-mean_data);
    end
    autocorr(lag,1) = sum_numerator/sum_denominator;
end
% Approximate 95% bounds
upper_bound = 1.96/sqrt(n);
lower_bound = -1.96/sqrt(n);
stem(1:1:lag,autocorr,'or','MarkerFaceColor','r','MarkerSize',4)
hold on
% Draw the bounds as horizontal lines across the plot
plot([0 lag],[upper_bound upper_bound],'blue',[0 lag],[lower_bound lower_bound],'blue'), xlabel('Time lag'), ylabel('Sample Autocorrelation')
axis([0 lag -1 1])
...

Program code 4: The following MATLAB code shows how to calculate and plot the autocorrelation function in (4.1).


SE-351 95 Växjö / SE-391 82 Kalmar Tel +46-772-28 80 00 Lnu.se/dfm