Toward a unified approach to fitting loss models
Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004
Overview
What problem is being addressed?
The general idea
The specific ideas:
- Models to consider
- Recording the data
- Representing the data
- Testing a model
- Selecting a model
The problem
Too many models
- Two books – 26 distributions!
- Can mix or splice to get even more
Data can be confusing
- Deductibles, limits
Too many tests and plots
- Chi-square, K-S, A-D, p-p, q-q, D
The general idea
Limited number of distributions
Standard way to present data
Retain flexibility on testing and selection
Distributions
Should be:
- Familiar
- Few
- Flexible
A few familiar distributions
Exponential – only one parameter
Gamma – two parameters, a mode if the shape parameter exceeds 1
Lognormal – two parameters, always a mode
Pareto – two parameters, a heavy right tail
Flexible
Add flexibility by allowing mixtures. That is,

f(x) = a_1 f_1(x) + ... + a_k f_k(x)

where a_1 + ... + a_k = 1 and all a_j > 0.
Some restrictions:
- Only the exponential can be used more than once.
- Cannot use both the gamma and lognormal.
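The mixture density defined above can be sketched in a few lines of code. This is an illustration only; the component choices and parameter values below are placeholders, not values from the paper:

```python
import math

def exponential_pdf(x, theta):
    # one-parameter exponential with mean theta
    return math.exp(-x / theta) / theta

def lognormal_pdf(x, mu, sigma):
    # two-parameter lognormal density
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    # f(x) = a_1 f_1(x) + ... + a_k f_k(x), weights positive and summing to 1
    assert all(a > 0 for a in weights) and abs(sum(weights) - 1.0) < 1e-9
    return sum(a * f(x) for a, f in zip(weights, components))

# e.g. a lognormal/exponential mixture evaluated at x = 500
f500 = mixture_pdf(500.0, [0.3, 0.7],
                   [lambda x: lognormal_pdf(x, 7.0, 0.25),
                    lambda x: exponential_pdf(x, 1800.0)])
```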
Why mixtures?
Allows different shapes at the beginning and end (e.g. mode from the lognormal, tail from the Pareto).
By using several exponentials, one can obtain almost any tail weight (see Keatinge).
Estimating parameters
Use only maximum likelihood:
- Asymptotically optimal
- Can be applied in all settings, regardless of the nature of the data
- The likelihood value can be used to compare different models
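One reason maximum likelihood applies in all settings is that deductibles (left truncation) and maximum payments (right censoring) just change each observation's likelihood contribution. A minimal sketch for a single exponential, with hypothetical data records and using SciPy's bounded scalar minimizer:

```python
import math
from scipy.optimize import minimize_scalar

# Hypothetical records: (payment, deductible, maximum payment, censored?)
# A payment equal to the maximum is censored at the limit.
data = [(310.0, 100.0, 1000.0, False),
        (1000.0, 100.0, 1000.0, True),
        (2200.0, 250.0, 3000.0, False),
        (450.0, 500.0, 5000.0, False)]

def negloglik(theta):
    # exponential with mean theta: log f(y) = -log(theta) - y/theta,
    # log S(y) = -y/theta; truncation at d subtracts log S(d)
    ll = 0.0
    for x, d, u, censored in data:
        y = x + d  # ground-up loss
        if censored:
            ll += -y / theta + d / theta                      # log S(y) - log S(d)
        else:
            ll += -math.log(theta) - y / theta + d / theta    # log f(y) - log S(d)
    return -ll

res = minimize_scalar(negloglik, bounds=(1.0, 1e5), method="bounded")
theta_hat = res.x
```

For this model the estimate has a closed form, total payment divided by the number of uncensored observations, which makes the numeric answer easy to check.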
Representing the data
Why do we care?
- Graphical tests require a graph of the empirical density or distribution function.
- Hypothesis tests require the functions themselves.
What is the issue?
None if all observations are discrete or grouped, with no truncation or censoring.
Otherwise, for discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric mle as well).
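A minimal sketch of the product-limit estimator, assuming each record carries an entry time (the truncation point, e.g. the deductible) and an exit value, with a flag for whether the exit was observed rather than censored:

```python
def kaplan_meier(records):
    # records: list of (entry d, exit y, observed?); survival drops only
    # at observed exits, by the fraction of the current risk set exiting
    exit_times = sorted({y for d, y, obs in records if obs})
    surv, s = [], 1.0
    for t in exit_times:
        at_risk = sum(1 for d, y, obs in records if d < t <= y)
        exits = sum(1 for d, y, obs in records if obs and y == t)
        s *= 1.0 - exits / at_risk
        surv.append((t, s))   # S(t) just after each observed exit
    return surv

# toy data: three observed values and one censored at 4
km = kaplan_meier([(0, 2, True), (0, 3, True), (0, 4, False), (0, 5, True)])
```

The empirical distribution function is then 1 − S(t) evaluated from this step function.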
Issue – grouped data
For grouped data:
- If completely grouped, the histogram represents the pdf and the ogive the cdf.
- If some observations are grouped and some are not, or there are multiple deductibles and limits, our suggestion is to replace the observations in each interval with that many equally spaced points.
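One plausible reading of "that many equally spaced points" is to place the n observations of an interval (a, b] at the midpoints of n equal subintervals; the exact spacing convention is my assumption, not stated on the slide:

```python
def spread_interval(a, b, n):
    # replace n grouped observations in (a, b] with n equally spaced points,
    # one at the midpoint of each of n equal-width subintervals
    width = (b - a) / n
    return [a + (j + 0.5) * width for j in range(n)]

pts = spread_interval(0.0, 100.0, 4)  # [12.5, 37.5, 62.5, 87.5]
```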
Review
Given a data set, we have the following:
- A way to represent the data.
- A limited set of models to consider.
- Parameter estimates for each model.
The remaining tasks are:
- Decide which models are acceptable.
- Decide which model to use.
Example
The paper has two examples; we will look only at the second one.
Data are individual payments, but the policies that produced them had different deductibles (100, 250, 500) and different maximum payments (1,000, 3,000, 5,000).
There are 100 observations.
Empirical cdf – Kaplan-Meier estimate
[Figure: empirical cdf F-emp(x) plotted against loss from 0 to 6,000.]
Distribution function plot
Plot the empirical and model cdfs together. Note, because in this example the smallest deductible is 100, the empirical cdf begins there.
To be comparable, the model cdf is calculated as

F_d(x) = [F(x) − F(d)] / [1 − F(d)]

where d is the deductible.
Example model
All plots and tests that follow are for a mixture of a lognormal and exponential distribution. The parameters are

a_1 = 0.238301
lognormal: μ = 7.109459, σ = 0.254236
exponential: θ = 1839.174
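The example model's cdf, shifted for the smallest deductible as in the formula above, can be evaluated directly. Assigning the weight a_1 to the lognormal component is my reading of the slide, and the standard-library `erf` is used for the normal cdf:

```python
from math import erf, exp, log, sqrt

MU, SIGMA, THETA, A1 = 7.109459, 0.254236, 1839.174, 0.238301

def lognormal_cdf(x, mu, sigma):
    # lognormal cdf via the error function
    return 0.5 * (1.0 + erf((log(x) - mu) / (sigma * sqrt(2.0))))

def mixture_cdf(x):
    # lognormal/exponential mixture; a_1 assumed to weight the lognormal
    return A1 * lognormal_cdf(x, MU, SIGMA) + (1.0 - A1) * (1.0 - exp(-x / THETA))

def shifted_cdf(x, d=100.0):
    # model cdf made comparable to data truncated at deductible d
    return (mixture_cdf(x) - mixture_cdf(d)) / (1.0 - mixture_cdf(d))
```

By construction the shifted cdf is 0 at the deductible, matching where the empirical cdf begins.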
Distribution function plot
[Figure: empirical cdf (F-emp) and model cdf (F-model) plotted together against loss from 0 to 6,000.]
Confidence bands
It is possible to create 95% confidence bands. That is, we are 95% confident that the true distribution is completely within these bands.
Formulas adapted from Klein and Moeschberger with a modification for multiple truncation points (their formula allows only multiple censoring points).
CDF plot with 95% bounds
[Figure: F-emp and F-model with lower and upper 95% confidence bands, loss from 0 to 6,000.]
Other CDF pictures
Any function of the cdf, such as the limited expected value, could be plotted.
The only one shown here is the difference plot, which magnifies the previous plot by plotting the difference between the two distribution functions.
CDF difference plot
[Figure: difference between the empirical and model cdfs, with lower and upper confidence bands, loss from 0 to 6,000.]
Histogram plot
Plot a histogram of the data against the density function of the model.
For data that were not grouped, one can use the empirical cdf to get cell probabilities.
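Getting cell probabilities from the empirical cdf amounts to differencing it across the cell edges and dividing by the cell width to get a density height. A small sketch, with hypothetical function and argument names:

```python
def cell_heights(emp_cdf, edges):
    # histogram density heights: probability mass of each cell
    # (difference of the empirical cdf) divided by the cell width
    heights = []
    for a, b in zip(edges[:-1], edges[1:]):
        heights.append((emp_cdf(b) - emp_cdf(a)) / (b - a))
    return heights

# toy check against the uniform cdf on (0, 1)
h = cell_heights(lambda x: x, [0.0, 0.5, 1.0])
```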
Histogram plot
[Figure: histogram of the data (hist) overlaid with the model density (model), loss from 0 to 6,000.]
Hypothesis tests
Null: the model fits. Alternative: it doesn't.
Three tests:
- Kolmogorov-Smirnov
- Anderson-Darling
- Chi-square
Kolmogorov-Smirnov
The test statistic is the maximum difference between the empirical and model cdfs. Each difference is multiplied by a scaling factor related to the sample size at that point.
Critical values are way off when parameters are estimated from the data.
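A minimal sketch of the basic K-S statistic for complete individual data, checking both sides of each jump of the empirical cdf; the per-point scaling factor described above and the truncation adjustments are omitted:

```python
def ks_statistic(sample, model_cdf):
    # maximum gap between the empirical step function and the model cdf,
    # evaluated just before and just after each jump
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fm = model_cdf(x)
        d = max(d, abs((i + 1) / n - fm), abs(i / n - fm))
    return d

# toy check against the uniform cdf on (0, 1)
d_stat = ks_statistic([0.2, 0.4, 0.6, 0.8], lambda x: x)
```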
Anderson-Darling
The test statistic looks complex:

A² = ∫ from d to u of [F_e(x) − F_m(x)]² / {F_m(x)[1 − F_m(x)]} f_m(x) dx

where e denotes the empirical and m the model functions. The paper shows how to turn this into a sum. There is more emphasis on fit in the tails than for the K-S test.
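A brute-force way to see what the statistic measures is to approximate the integral numerically (the midpoint rule below avoids the endpoints, where the denominator vanishes). This is a sketch, not the closed-form sum derived in the paper:

```python
def anderson_darling(emp_cdf, model_cdf, model_pdf, lo, hi, steps=100000):
    # midpoint-rule approximation of the A-D integral from lo to hi:
    # the squared cdf gap, weighted to emphasize the tails
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        fm = model_cdf(x)
        total += ((emp_cdf(x) - fm) ** 2 / (fm * (1.0 - fm))) * model_pdf(x) * h
    return total
```

As a sanity check, a perfect fit gives 0, and for emp_cdf = x² against the uniform model on (0, 1) the integrand reduces to x(1 − x), whose integral is 1/6.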
Chi-square test
You have seen this one before.
It is the only one with an adjustment for estimating parameters.
Results
K-S: 0.5829
A-D: 0.2570
Chi-square p-value: 0.5608
The model is clearly acceptable.
A simulation study is needed to get p-values for the K-S and A-D tests. Simulation indicates that those p-values are over 0.9.
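Such a simulation study is a parametric bootstrap: draw many samples from the fitted model, recompute the test statistic on each (refitting parameters for each sample, folded into `compute_stat` below), and report the fraction of simulated statistics at least as extreme as the observed one. A generic sketch with hypothetical callback names:

```python
import random

def bootstrap_p_value(observed_stat, simulate_sample, compute_stat, n_sims=1000):
    # p-value estimate: proportion of simulated statistics >= the observed one
    exceed = sum(compute_stat(simulate_sample()) >= observed_stat
                 for _ in range(n_sims))
    return exceed / n_sims

# toy check: if the statistic is uniform on (0, 1), an observed value
# of 0.5 should give a p-value near 0.5
random.seed(0)
p = bootstrap_p_value(0.5, random.random, lambda s: s, n_sims=2000)
```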
Comparing models
Good picture
Better test numbers
Likelihood criterion such as the Schwarz Bayesian criterion. The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.
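The SBC as defined here can be checked against the table that follows; with n = 100 observations, r = 1 for the exponential and r = 4 for the lognormal/exponential mixture (two lognormal parameters, one exponential parameter, one weight) reproduce the tabled values:

```python
import math

def sbc(loglik, r, n):
    # Schwarz Bayesian criterion: loglikelihood minus (r/2) ln(n)
    return loglik - (r / 2.0) * math.log(n)

sbc_exp = sbc(-628.23, 1, 100)   # about -630.53, matching the table
sbc_le = sbc(-623.77, 4, 100)    # about -632.98, matching the table
```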
Several models
Model Loglik A-D K-S Chi-sq(p) SBC
Exp -628.23 1.2245 0.9739 0.1054 -630.53
Ln -626.26 0.6682 0.9375 0.2126 -630.87
Gam -627.35 0.8369 1.0355 0.2319 -631.96
L/E -623.77 0.2579 0.5829 0.5608 -632.98
G/E -623.64 0.2804 0.5773 0.5260 -632.85
L/E/E -623.39 0.1484 0.4494 0.3472 -637.21
G/E/E -623.26 0.1353 0.4652 0.3348 -637.08
Which is the winner?
Referee A – loglikelihood rules – pick the gamma/exp/exp mixture. This is a world of one big model where the best is the best; simplicity is never an issue.
Referee B – SBC rules – pick the exponential. Parsimony is most important; pay a penalty for extra parameters.
Me – the lognormal/exp mixture. Great pictures, better numbers than the exponential, but simpler than a three-component mixture.
Can this be automated?
We are working on software. A test version can be downloaded at www.cbpa.drake.edu/mixfit.
The MLEs are good. The pictures and test statistics are not quite right. It may crash.
Here is a quick demo.