Toward a unified approach to fitting loss models
Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004
Overview
What problem is being addressed?
The general idea
The specific ideas:
- Models to consider
- Recording the data
- Representing the data
- Testing a model
- Selecting a model
The problem
Too many models
- Two books – 26 distributions!
- Can mix or splice to get even more
Data can be confusing
- Deductibles, limits
Too many tests and plots
- Chi-square, K-S, A-D, p-p, q-q, D
The general idea
Limited number of distributions
Standard way to present data
Retain flexibility on testing and selection
Distributions
Should be:
- Familiar
- Few
- Flexible
A few familiar distributions
Exponential – only one parameter
Gamma – two parameters, a mode if the shape parameter exceeds 1
Lognormal – two parameters, always a mode
Pareto – two parameters, a heavy right tail
Flexible
Add flexibility by allowing mixtures. That is,

f(x) = a_1 f_1(x) + ... + a_k f_k(x)

where a_1 + ... + a_k = 1 and all a_j > 0.
Some restrictions:
- Only the exponential can be used more than once.
- Cannot use both the gamma and lognormal.
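The mixture density defined above can be sketched in a few lines of code. This is an illustration only; the component choices and parameter values below are placeholders, not values from the paper:

```python
import math

def exponential_pdf(x, theta):
    # one-parameter exponential with mean theta
    return math.exp(-x / theta) / theta

def lognormal_pdf(x, mu, sigma):
    # two-parameter lognormal density
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    # f(x) = a_1 f_1(x) + ... + a_k f_k(x), weights positive and summing to 1
    assert all(a > 0 for a in weights) and abs(sum(weights) - 1.0) < 1e-9
    return sum(a * f(x) for a, f in zip(weights, components))

# e.g. a lognormal/exponential mixture evaluated at x = 500
f500 = mixture_pdf(500.0, [0.3, 0.7],
                   [lambda x: lognormal_pdf(x, 7.0, 0.25),
                    lambda x: exponential_pdf(x, 1800.0)])
```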
Why mixtures?
Allows different shapes at the beginning and end (e.g. mode from the lognormal, tail from the Pareto).
By using several exponentials, one can obtain almost any tail weight (see Keatinge).
Estimating parameters
Use only maximum likelihood:
- Asymptotically optimal
- Can be applied in all settings, regardless of the nature of the data
- The likelihood value can be used to compare different models
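One reason maximum likelihood applies in all settings is that deductibles (left truncation) and maximum payments (right censoring) just change each observation's likelihood contribution. A minimal sketch for a single exponential, with hypothetical data records and using SciPy's bounded scalar minimizer:

```python
import math
from scipy.optimize import minimize_scalar

# Hypothetical records: (payment, deductible, maximum payment, censored?)
# A payment equal to the maximum is censored at the limit.
data = [(310.0, 100.0, 1000.0, False),
        (1000.0, 100.0, 1000.0, True),
        (2200.0, 250.0, 3000.0, False),
        (450.0, 500.0, 5000.0, False)]

def negloglik(theta):
    # exponential with mean theta: log f(y) = -log(theta) - y/theta,
    # log S(y) = -y/theta; truncation at d subtracts log S(d)
    ll = 0.0
    for x, d, u, censored in data:
        y = x + d  # ground-up loss
        if censored:
            ll += -y / theta + d / theta                      # log S(y) - log S(d)
        else:
            ll += -math.log(theta) - y / theta + d / theta    # log f(y) - log S(d)
    return -ll

res = minimize_scalar(negloglik, bounds=(1.0, 1e5), method="bounded")
theta_hat = res.x
```

For this model the estimate has a closed form, total payment divided by the number of uncensored observations, which makes the numeric answer easy to check.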
Representing the data
Why do we care?
- Graphical tests require a graph of the empirical density or distribution function.
- Hypothesis tests require the functions themselves.
What is the issue?
None if all observations are discrete or grouped, with no truncation or censoring.
Otherwise, for discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric mle as well).
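A minimal sketch of the product-limit estimator, assuming each record carries an entry time (the truncation point, e.g. the deductible) and an exit value, with a flag for whether the exit was observed rather than censored:

```python
def kaplan_meier(records):
    # records: list of (entry d, exit y, observed?); survival drops only
    # at observed exits, by the fraction of the current risk set exiting
    exit_times = sorted({y for d, y, obs in records if obs})
    surv, s = [], 1.0
    for t in exit_times:
        at_risk = sum(1 for d, y, obs in records if d < t <= y)
        exits = sum(1 for d, y, obs in records if obs and y == t)
        s *= 1.0 - exits / at_risk
        surv.append((t, s))   # S(t) just after each observed exit
    return surv

# toy data: three observed values and one censored at 4
km = kaplan_meier([(0, 2, True), (0, 3, True), (0, 4, False), (0, 5, True)])
```

The empirical distribution function is then 1 − S(t) evaluated from this step function.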
Issue – grouped data
For grouped data:
- If completely grouped, the histogram represents the pdf and the ogive the cdf.
- If some observations are grouped and some are not, or there are multiple deductibles and limits, our suggestion is to replace the observations in each interval with that many equally spaced points.
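One plausible reading of "that many equally spaced points" is to place the n observations of an interval (a, b] at the midpoints of n equal subintervals; the exact spacing convention is my assumption, not stated on the slide:

```python
def spread_interval(a, b, n):
    # replace n grouped observations in (a, b] with n equally spaced points,
    # one at the midpoint of each of n equal-width subintervals
    width = (b - a) / n
    return [a + (j + 0.5) * width for j in range(n)]

pts = spread_interval(0.0, 100.0, 4)  # [12.5, 37.5, 62.5, 87.5]
```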
Review
Given a data set, we have the following:
- A way to represent the data.
- A limited set of models to consider.
- Parameter estimates for each model.
The remaining tasks are:
- Decide which models are acceptable.
- Decide which model to use.
Example
The paper has two examples; we will look only at the second one.
Data are individual payments, but the policies that produced them had different deductibles (100, 250, 500) and different maximum payments (1,000, 3,000, 5,000).
There are 100 observations.
Empirical cdf – Kaplan-Meier estimate
[Figure: empirical cdf F-emp(x) plotted against loss from 0 to 6,000.]
Distribution function plot
Plot the empirical and model cdfs together. Note, because in this example the smallest deductible is 100, the empirical cdf begins there.
To be comparable, the model cdf is calculated as

F_d(x) = [F(x) − F(d)] / [1 − F(d)]

where d is the deductible.
Example model
All plots and tests that follow are for a mixture of a lognormal and exponential distribution. The parameters are

a_1 = 0.238301
lognormal: μ = 7.109459, σ = 0.254236
exponential: θ = 1839.174
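The example model's cdf, shifted for the smallest deductible as in the formula above, can be evaluated directly. Assigning the weight a_1 to the lognormal component is my reading of the slide, and the standard-library `erf` is used for the normal cdf:

```python
from math import erf, exp, log, sqrt

MU, SIGMA, THETA, A1 = 7.109459, 0.254236, 1839.174, 0.238301

def lognormal_cdf(x, mu, sigma):
    # lognormal cdf via the error function
    return 0.5 * (1.0 + erf((log(x) - mu) / (sigma * sqrt(2.0))))

def mixture_cdf(x):
    # lognormal/exponential mixture; a_1 assumed to weight the lognormal
    return A1 * lognormal_cdf(x, MU, SIGMA) + (1.0 - A1) * (1.0 - exp(-x / THETA))

def shifted_cdf(x, d=100.0):
    # model cdf made comparable to data truncated at deductible d
    return (mixture_cdf(x) - mixture_cdf(d)) / (1.0 - mixture_cdf(d))
```

By construction the shifted cdf is 0 at the deductible, matching where the empirical cdf begins.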
Distribution function plot
[Figure: empirical cdf (F-emp) and model cdf (F-model) plotted together against loss from 0 to 6,000.]
Confidence bands
It is possible to create 95% confidence bands. That is, we are 95% confident that the true distribution is completely within these bands.
Formulas adapted from Klein and Moeschberger with a modification for multiple truncation points (their formula allows only multiple censoring points).
CDF plot with 95% bounds
[Figure: F-emp and F-model with lower and upper 95% confidence bands, loss from 0 to 6,000.]
Other CDF pictures
Any function of the cdf, such as the limited expected value, could be plotted.
The only one shown here is the difference plot, which magnifies the previous plot by plotting the difference between the two distribution functions.
CDF difference plot
[Figure: difference between the empirical and model cdfs, with lower and upper confidence bands, loss from 0 to 6,000.]
Histogram plot
Plot a histogram of the data against the density function of the model.
For data that were not grouped, one can use the empirical cdf to get cell probabilities.
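Getting cell probabilities from the empirical cdf amounts to differencing it across the cell edges and dividing by the cell width to get a density height. A small sketch, with hypothetical function and argument names:

```python
def cell_heights(emp_cdf, edges):
    # histogram density heights: probability mass of each cell
    # (difference of the empirical cdf) divided by the cell width
    heights = []
    for a, b in zip(edges[:-1], edges[1:]):
        heights.append((emp_cdf(b) - emp_cdf(a)) / (b - a))
    return heights

# toy check against the uniform cdf on (0, 1)
h = cell_heights(lambda x: x, [0.0, 0.5, 1.0])
```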
Histogram plot
[Figure: histogram of the data (hist) overlaid with the model density (model), loss from 0 to 6,000.]
Hypothesis tests
Null: the model fits. Alternative: it doesn't.
Three tests:
- Kolmogorov-Smirnov
- Anderson-Darling
- Chi-square
Kolmogorov-Smirnov
The test statistic is the maximum difference between the empirical and model cdfs. Each difference is multiplied by a scaling factor related to the sample size at that point.
Critical values are way off when parameters are estimated from the data.
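A minimal sketch of the basic K-S statistic for complete individual data, checking both sides of each jump of the empirical cdf; the per-point scaling factor described above and the truncation adjustments are omitted:

```python
def ks_statistic(sample, model_cdf):
    # maximum gap between the empirical step function and the model cdf,
    # evaluated just before and just after each jump
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fm = model_cdf(x)
        d = max(d, abs((i + 1) / n - fm), abs(i / n - fm))
    return d

# toy check against the uniform cdf on (0, 1)
d_stat = ks_statistic([0.2, 0.4, 0.6, 0.8], lambda x: x)
```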
Anderson-Darling
The test statistic looks complex:

A² = ∫ from d to u of [F_e(x) − F_m(x)]² / {F_m(x)[1 − F_m(x)]} f_m(x) dx

where e denotes the empirical and m the model functions. The paper shows how to turn this into a sum. There is more emphasis on fit in the tails than for the K-S test.
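A brute-force way to see what the statistic measures is to approximate the integral numerically (the midpoint rule below avoids the endpoints, where the denominator vanishes). This is a sketch, not the closed-form sum derived in the paper:

```python
def anderson_darling(emp_cdf, model_cdf, model_pdf, lo, hi, steps=100000):
    # midpoint-rule approximation of the A-D integral from lo to hi:
    # the squared cdf gap, weighted to emphasize the tails
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        fm = model_cdf(x)
        total += ((emp_cdf(x) - fm) ** 2 / (fm * (1.0 - fm))) * model_pdf(x) * h
    return total
```

As a sanity check, a perfect fit gives 0, and for emp_cdf = x² against the uniform model on (0, 1) the integrand reduces to x(1 − x), whose integral is 1/6.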
Chi-square test
You have seen this one before.
It is the only one with an adjustment for estimating parameters.
Results
K-S: 0.5829
A-D: 0.2570
Chi-square p-value: 0.5608
The model is clearly acceptable.
A simulation study is needed to get p-values for the K-S and A-D tests. Simulation indicates that those p-values are over 0.9.
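Such a simulation study is a parametric bootstrap: draw many samples from the fitted model, recompute the test statistic on each (refitting parameters for each sample, folded into `compute_stat` below), and report the fraction of simulated statistics at least as extreme as the observed one. A generic sketch with hypothetical callback names:

```python
import random

def bootstrap_p_value(observed_stat, simulate_sample, compute_stat, n_sims=1000):
    # p-value estimate: proportion of simulated statistics >= the observed one
    exceed = sum(compute_stat(simulate_sample()) >= observed_stat
                 for _ in range(n_sims))
    return exceed / n_sims

# toy check: if the statistic is uniform on (0, 1), an observed value
# of 0.5 should give a p-value near 0.5
random.seed(0)
p = bootstrap_p_value(0.5, random.random, lambda s: s, n_sims=2000)
```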
Comparing models
Good picture
Better test numbers
Likelihood criterion such as the Schwarz Bayesian criterion. The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.
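The SBC as defined here can be checked against the table that follows; with n = 100 observations, r = 1 for the exponential and r = 4 for the lognormal/exponential mixture (two lognormal parameters, one exponential parameter, one weight) reproduce the tabled values:

```python
import math

def sbc(loglik, r, n):
    # Schwarz Bayesian criterion: loglikelihood minus (r/2) ln(n)
    return loglik - (r / 2.0) * math.log(n)

sbc_exp = sbc(-628.23, 1, 100)   # about -630.53, matching the table
sbc_le = sbc(-623.77, 4, 100)    # about -632.98, matching the table
```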
Several models
Model Loglik A-D K-S Chi-sq(p) SBC
Exp -628.23 1.2245 0.9739 0.1054 -630.53
Ln -626.26 0.6682 0.9375 0.2126 -630.87
Gam -627.35 0.8369 1.0355 0.2319 -631.96
L/E -623.77 0.2579 0.5829 0.5608 -632.98
G/E -623.64 0.2804 0.5773 0.5260 -632.85
L/E/E -623.39 0.1484 0.4494 0.3472 -637.21
G/E/E -623.26 0.1353 0.4652 0.3348 -637.08
Which is the winner?
Referee A – loglikelihood rules – pick the gamma/exp/exp mixture. This is a world of one big model where the best is the best; simplicity is never an issue.
Referee B – SBC rules – pick the exponential. Parsimony is most important; pay a penalty for extra parameters.
Me – the lognormal/exp mixture. Great pictures, better numbers than the exponential, but simpler than a three-component mixture.
Can this be automated?
We are working on software. A test version can be downloaded at www.cbpa.drake.edu/mixfit.
The MLEs are good. The pictures and test statistics are not quite right. It may crash.
Here is a quick demo.