TRANSCRIPT
© 2015 Towers Watson. All rights reserved.
GLM II: Basic Modeling Strategy
2015 CAS Ratemaking and Product Management Seminar, by Paul Bailey, March 10, 2015
1
Building predictive models is a multi-step process:
Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints
Ernesto walked us through the first three components. We will now go through an example of the remaining steps:
Building component predictive models
— We will illustrate how to build a frequency model
Validating component models
— We will illustrate how to validate your component model
We will also briefly discuss combining models and incorporating implementation constraints
— The goal should be to build the best predictive models now and incorporate constraints later
2
Building component predictive models can be separated into two steps:
Initial modeling: selecting the error structure and link function, building a simple initial model, and testing basic modeling assumptions and methodology
Iterative modeling: refining your initial models through a series of iterative steps, complicating the model, then simplifying the model, then repeating
3
Initial modeling
Initial modeling is done to test basic modeling methodology
Is my link function appropriate?
Is my error structure appropriate?
Is my overall modeling methodology appropriate? (e.g., do I need to cap losses? Exclude expense-only claims? Model by peril?)
4
Examples of error structures
Error structures reflect the variability of the underlying process and can be any distribution within the exponential family, for example:
— Gamma: consistent with severity modeling; may also want to try Inverse Gaussian
— Poisson: consistent with frequency modeling
— Tweedie: consistent with pure premium modeling
— Normal: useful for a variety of applications
5
Generally accepted error structure and link functions
Observed Response | Most Appropriate Link Function | Most Appropriate Error Structure | Variance Function
-- | -- | Normal | µ^0
Claim frequency | Log | Poisson | µ^1
Claim severity | Log | Gamma | µ^2
Claim severity | Log | Inverse Gaussian | µ^3
Pure premium | Log | Gamma or Tweedie | µ^p (Tweedie: 1 < p < 2)
Retention rate | Logit | Binomial | µ(1-µ)
Conversion rate | Logit | Binomial | µ(1-µ)
Use generally accepted standards as starting point for link functions and error structures
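To make the link/error pairing concrete, here is a minimal sketch (assuming numpy; the simulated data and coefficient values are invented for illustration, not from the presentation) of how a log-link Poisson frequency GLM is actually fit, via iteratively reweighted least squares. Note how the variance function V(µ) = µ from the table appears as the IRLS weight.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=25):
    """Fit a log-link Poisson GLM by IRLS (Fisher scoring).

    X: (n, p) design matrix (include an intercept column yourself).
    y: (n,) non-negative claim counts.
    Returns the coefficient vector beta.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # log link: mu = exp(eta)
        W = mu                          # Poisson variance function: V(mu) = mu
        z = eta + (y - mu) / mu         # working response
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (W * z))  # weighted least squares
    return beta

# Simulated frequency data with true coefficients (0.2, 0.5)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
y = rng.poisson(np.exp(0.2 + 0.5 * x))
X = np.column_stack([np.ones_like(x), x])
beta = fit_poisson_glm(X, y)
```

Production work would use a GLM package rather than hand-rolled IRLS, but the mechanics are the same: the link transforms the mean, and the error structure supplies the weights.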
6
Build an initial model
Reasonable starting points for model structure:
— Prior model
— Stepwise regression
— General insurance knowledge
— CART (Classification and Regression Trees) or similar algorithms
7
Test model assumptions
A plot of all residuals tests the selected error structure/link function.
[Figure: Normal error structure/log link, studentized standardized deviance residuals vs. transformed fitted value. An asymmetrical appearance suggests the power of the variance function is too low; an elliptical pattern is ideal.]
[Figure: Crunched residuals (group size: 72) vs. fitted value. Two concentrations suggest two perils: split the data or use joint modeling. Use crunched residuals for frequency.]
8
Example: initial frequency model
Link function: Log
Error structure: Poisson
Initial variables selected based on industry knowledge: Gender, Driver age, Vehicle value, Area (territory)
Variables NOT in initial model: Vehicle body, Vehicle age
Gender Relativity
9
Example: initial frequency model
Driver Age Relativity
10
Example: initial frequency model
Vehicle Value Relativity
11
Example: initial frequency model
Area Relativity
12
Example: initial frequency model - residuals
Frequency residuals are hard to interpret without "crunching"
Two clusters: data points with claims and data points without claims
13
Example: initial frequency model - residuals
Order observations from smallest to largest predicted value
Group residuals into 500 buckets
The graph plots the average residual in the bucket
Crunched residuals look good!
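The crunching procedure described above (sort by predicted value, bucket, average) is simple to implement. A sketch, assuming numpy and using simulated stand-in residuals; names and data are illustrative.

```python
import numpy as np

def crunch_residuals(fitted, residuals, n_buckets=500):
    """Average residuals within buckets of ordered fitted values.

    Sort observations by fitted value, split into n_buckets
    roughly equal groups, and return (mean fitted, mean residual)
    per bucket: the points plotted on a crunched-residual chart.
    """
    order = np.argsort(fitted)
    f_sorted = fitted[order]
    r_sorted = residuals[order]
    buckets = np.array_split(np.arange(len(fitted)), n_buckets)
    mean_fit = np.array([f_sorted[b].mean() for b in buckets])
    mean_res = np.array([r_sorted[b].mean() for b in buckets])
    return mean_fit, mean_res

# Simulated stand-ins: predicted frequencies and raw residuals
rng = np.random.default_rng(1)
fitted = rng.uniform(0.01, 0.2, 50000)
residuals = rng.normal(0, 1, 50000)
mf, mr = crunch_residuals(fitted, residuals)
```

Averaging shrinks the residual noise by roughly the square root of the bucket size, which is why the two-cluster pattern of raw frequency residuals collapses into an interpretable band.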
14
Building component predictive models can be separated into two steps:
Initial modeling: selecting the error structure and link function, building a simple initial model, and testing basic modeling assumptions and methodology
Iterative modeling: refining your initial models through a series of iterative steps, complicating the model, then simplifying the model, then repeating
15
Iterative Modeling
Initial models are refined using an iterative modeling approach
Iterative modeling involves many decisions to complicate and simplify the models
Your modeling toolbox can help you make these decisions
We will discuss your tools shortly
Cycle: Review model → Complicate (include variables, add interactions) → Simplify (exclude variables, group levels, fit curves) → repeat
16
Ideal Model Structure
To produce a sensible model that explains recent historical experience and is likely to be predictive of future experience
[Diagram: model quality vs. model complexity (number of parameters).
Underfit (toward the overall mean): predictive, but poor explanatory power.
Overfit (toward one parameter per observation): explains history, but poor predictive power.
The best models lie between these extremes.]
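The underfit/overfit trade-off can be demonstrated with a toy hold-out experiment (a hypothetical sketch, assuming numpy; not from the presentation): once complexity grows past what the signal supports, hold-out error stops improving.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(-1, 1, 200)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 200)   # true signal is linear

x_tr, y_tr = x[:100], y[:100]                 # training half
x_te, y_te = x[100:], y[100:]                 # hold-out half

def holdout_mse(degree):
    """Fit a polynomial of the given degree on train, score on hold-out."""
    coefs = np.polyfit(x_tr, y_tr, degree)
    pred = np.polyval(coefs, x_te)
    return float(np.mean((pred - y_te) ** 2))

# Degree 0 is the "overall mean" (underfit); degree 1 matches the signal;
# degree 9 is an overfit candidate that usually scores worse on hold-out.
errors = {d: holdout_mse(d) for d in (0, 1, 9)}
```

The underfit model (degree 0) carries a large hold-out error because it explains nothing; the right-sized model's hold-out error approaches the irreducible noise variance.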
17
Your modeling tool box
Model decisions include:
— Simplification: excluding variables, grouping levels, fitting curves
— Complication: including variables, adding interactions
Your modeling toolbox will help you make these decisions. Your tools include:
— Judgment (e.g., do the trends make sense?)
— Balance tests (i.e., actual vs. expected tests)
— Parameters/standard errors
— Consistency of patterns over time or across random data sets
— Type III statistical tests (e.g., chi-square tests, F-tests)
18
Modeling toolbox: judgment
The modeler should also ask, ‘does this pattern make sense?’
Patterns may often be counterintuitive, but become reasonable after investigation
Uses: inclusion/exclusion, grouping, fitting curves, assessing interactions
Modeled Frequency Relativity – Vehicle Value
19
Modeling toolbox: balance test
The balance test is essentially an actual vs. expected comparison
It can identify variables not in the model where the model is not in "balance", indicating the variable may be explaining something not captured by the model
Uses: inclusion
Actual vs. Expected Frequency - Vehicle Age
20
Modeling toolbox: parameters/standard errors
Parameters and standard errors provide confidence in the pattern exhibited by the data
Uses: horizontal-line test for exclusion; plateaus for grouping; a measure of credibility
Modeled Frequency Relativities With Standard Errors - Vehicle Body
21
Modeling toolbox: consistency of patterns
Checking for consistency of patterns over time or across random parts of a data set is a good practical test
Uses: validating modeling decisions
— Including/excluding factors
— Grouping levels
— Fitting curves
— Adding interactions
Modeled Frequency Relativity – Age Category
22
Modeling toolbox: type III tests
The chi-square test and/or F-test is a good statistical test for comparing nested models.
H0: the two models are essentially the same
H1: the two models are not the same
Principle of parsimony: if two models are essentially the same, choose the simpler model
Uses: inclusion/exclusion
Chi-Square Percentage | Meaning | Action*
30% | Accept H0 | Use simpler model
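In code, the nested-model comparison is a likelihood-ratio test: the drop in deviance from adding parameters is asymptotically chi-square. A hedged sketch assuming scipy is available; the deviance figures and 30% threshold convention are illustrative.

```python
from scipy.stats import chi2

def type3_test(deviance_simple, deviance_complex, extra_params, threshold=0.30):
    """Likelihood-ratio (chi-square) test between nested GLMs.

    The deviance drop from adding extra_params parameters is compared
    to a chi-square distribution with extra_params degrees of freedom.
    A large p-value means the simpler model explains the data about
    as well, so parsimony says keep it.
    """
    stat = deviance_simple - deviance_complex
    p_value = chi2.sf(stat, df=extra_params)
    decision = "use simpler model" if p_value > threshold else "use complex model"
    return p_value, decision

# Hypothetical deviances: the extra variable barely improves the fit
p, decision = type3_test(deviance_simple=1042.3,
                         deviance_complex=1039.8,
                         extra_params=3)
```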
23
Example: frequency model iteration 1 – simplification
Modeling decision: grouping age category and area
Tools used: judgment, parameter estimates/standard errors, type III test
Age Category Relativity
Area Relativity
Chi Sq P Val= 97.4%
Chi Sq P Val= 99.9%
24
Example: frequency model iteration 1 – simplification
Modeling decision: fitting a curve to vehicle value
Tools used: judgment, type III test, consistency test
Vehicle Value Relativity – Initial Model
Vehicle Value Relativity – Curve Fit
Chi Sq P Val= 100.0%
25
Example: frequency model iteration 2 – complication
Modeling decision: adding vehicle body type
Tools used: balance test, parameter estimates/std deviations, type III test
Balance Test: Actual vs. Expected Across Vehicle Body Type (vehicle body type not in model)
Vehicle Body Type Relativities (vehicle body type included in model)
Chi Sq P Val= 1.3%
26
Example: iterative modeling continued….
Iteration 3 – simplification: group vehicle body type
Iteration 4 – complication: add vehicle age
Iteration 5 – simplification: group vehicle age levels
27
Example: frequency model iteration 6 – complication
Action: adding age x gender interaction
Tools used: balance test, type III test, consistency test, judgment
Balance Test: Two-Way Actual vs. Expected Across Age x Gender (age x gender interaction NOT in model)
Age x Gender Relativities (age x gender interaction included in model)
Chi Sq P Val = 47.5%
28
Predictive models must be validated to have confidence in the predictive power of the models
Model validation techniques include:
— Examining residuals
— Examining gains curves
— Examining hold-out samples: changes in parameter estimates; actual vs. expected on the hold-out sample
Both the component models and the combined risk premium model should be validated
29
Model validation: residual analysis
Recheck residuals to ensure appropriate shape
[Figure: studentized standardized deviance residuals by policyholder age, for levels from under 17 through 81+.]
Crunched residuals are symmetric
For severity: does the box-whisker plot show symmetry across levels?
30
Model validation: residual analysis (cont’d)
Common issues with residual plots:
[Figure: Normal error structure/log link, studentized standardized deviance residuals vs. transformed fitted value. An asymmetrical appearance suggests the power of the variance function is too low; an elliptical pattern is ideal.]
[Figure: Crunched residuals (group size: 72) vs. fitted value. Two concentrations suggest two perils: split the data or use joint modeling. Use crunched residuals for frequency.]
31
Model validation: gains curves
Gains curves are good for comparing the predictiveness of models:
— Order observations from largest to smallest predicted value on the X axis
— Plot cumulative actual claim counts (or losses) on the Y axis
As you move from left to right, the better model should accumulate actual losses faster
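The curve itself is a few lines of code. A sketch assuming numpy; the simulated portfolio is illustrative, with the "good" model given the true frequencies and a second model given uninformative predictions.

```python
import numpy as np

def gains_curve(predicted, actual):
    """Cumulative share of actual claims, ordered by predicted value.

    Sort from largest to smallest prediction; a better model
    accumulates actual claims faster (the curve rises more steeply).
    """
    order = np.argsort(-predicted)
    return np.cumsum(actual[order]) / actual.sum()

rng = np.random.default_rng(2)
true_freq = rng.uniform(0.02, 0.30, 20000)
claims = rng.poisson(true_freq)

good_model = gains_curve(true_freq, claims)                   # knows the true frequency
random_model = gains_curve(rng.uniform(size=20000), claims)   # uninformative predictions
```

At any point on the X axis, the vertical gap between the two curves is the lift the better model provides; both curves necessarily end at 100%.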
32
Model validation: hold out samples
Holdout samples are effective at validating models: determine estimates based on part of the data set, then use those estimates to predict the other part
Predictions should be close to actuals for heavily populated cells
Full test/training for large data sets: Data → split data → train data → build models → score test data → compare predictions to actual
Partial test/training for smaller data sets: All data → build models → split data → refit parameters on train data → score test data → compare predictions to actual
33
Model validation: lift charts on hold out data
Actual vs. expected on holdout data is an intuitive validation technique
Good for communicating model performance to non-technical audiences
Can also create actual vs. expected across predictor dimensions
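One way to build such a lift chart on hold-out data (a sketch assuming numpy; the simulated predictions are taken to be well calibrated, which is the point being verified):

```python
import numpy as np

def lift_chart(predicted, actual, n_bins=10):
    """Actual vs. expected by prediction decile on hold-out data.

    Sort by predicted value, cut into n_bins equal groups, and
    compare each group's mean actual to its mean predicted value.
    """
    order = np.argsort(predicted)
    bins = np.array_split(order, n_bins)
    expected = np.array([predicted[b].mean() for b in bins])
    observed = np.array([actual[b].mean() for b in bins])
    return expected, observed

rng = np.random.default_rng(4)
pred = rng.uniform(0.02, 0.25, 30000)   # hold-out predictions
act = rng.poisson(pred)                 # outcomes from a well-calibrated model
expected, observed = lift_chart(pred, act)
```

A well-performing model shows observed values tracking expected values and rising across the deciles, which is easy to read even for non-technical audiences.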
34
Component frequency and severity models can be combined to create pure premium models
Component models can be constructed in many different ways. The standard model:
Frequency (Poisson / Negative Binomial) x Severity (Gamma) = Pure Premium
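Because both component models use log links, combining them is multiplicative: pure premium relativities are products of frequency and severity relativities. A sketch with invented base rates and relativities (the factor levels and numbers are illustrative, not from the presentation):

```python
# Hypothetical fitted values from the two component models
freq_base, sev_base = 0.08, 1200.0          # base frequency and base severity
freq_rel = {"M": 1.00, "F": 0.92}           # e.g., gender relativities (invented)
sev_rel = {"M": 1.00, "F": 1.05}

def pure_premium(level):
    """Standard combination: pure premium = frequency x severity."""
    return (freq_base * freq_rel[level]) * (sev_base * sev_rel[level])

pp_m = pure_premium("M")    # 0.08 * 1200
pp_f = pure_premium("F")    # (0.08 * 0.92) * (1200 * 1.05)
```

Note that a factor can point in opposite directions in the two components (here, lower frequency but higher severity) and partially cancel in the combined pure premium.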
35
Building a model on modeled pure premium
When using modeled pure premiums, select the gamma/log link (not the Tweedie)
[Figure: density of severity. Modeled pure premiums will not have a point mass at zero.]
[Figure: density of pure premium. Raw pure premiums are bimodal (i.e., have a point mass at zero) and require a distribution such as the Tweedie.]
36
Various constraints often need to be applied to the modeled pure premiums
Goal: convert modeled pure premiums into indications after consideration of internal and external constraints
It is not always possible or desirable to charge the fully indicated rates in the short run:
— Marketing decisions
— Regulatory constraints
— Systems constraints
Need to adjust the indications for known constraints
37
Constraints to give desired subsidies
Offsetting one predictor changes the parameters of other correlated predictors to make up for the restriction. The stronger the exposure correlation, the more that can be made up through the other variable. Consequently, the modeler should not refit models when a desired subsidy is incorporated into the rating plan.
Example (insurer-desired subsidy): senior management wants a subsidy to attract drivers 65+
Example (regulatory subsidy): a regulatory constraint requires a subsidy for drivers 65+
Result of refitting with the constraint (either case): correlated factors will adjust to partially make up for the difference; for example, territories with retirement communities will increase
Potential action (insurer-desired): do not refit models with the constraint
Potential action (regulatory): consider the implications of refitting and make a business decision
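One practical way to honor a mandated subsidy without refitting (a sketch with invented relativities and exposure weights): fix the constrained relativity, keep every other factor exactly as modeled, and rebalance the base rate so the overall premium level is unchanged.

```python
# Hypothetical modeled relativities and exposure weights by age band
modeled_rel = {"<25": 1.60, "25-64": 1.00, "65+": 1.25}
exposures   = {"<25": 0.15, "25-64": 0.70, "65+": 0.15}

# Mandated subsidy: cap the 65+ relativity; do NOT refit the other factors
constrained_rel = dict(modeled_rel)
constrained_rel["65+"] = 1.00

def avg_rel(rels):
    """Exposure-weighted average relativity across levels."""
    return sum(exposures[k] * rels[k] for k in rels)

# Off-balance factor: scales the base rate to keep total premium level
base_adjustment = avg_rel(modeled_rel) / avg_rel(constrained_rel)
```

Refitting instead would let correlated factors (e.g., retirement-community territories) absorb part of the subsidy, defeating its purpose; the off-balance approach spreads the cost evenly through the base rate.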