
  • GLM II: Basic Modeling Strategy

    2015 CAS Ratemaking and Product Management Seminar
    by Paul Bailey
    March 10, 2015

    © 2015 Towers Watson. All rights reserved.

  • 1

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Building predictive models is a multi-step process

    Ernesto walked us through the first three components. We will now work through an example of the remaining steps:

    Building component predictive models
    — We will illustrate how to build a frequency model

    Validating component models
    — We will illustrate how to validate your component model

    We will also briefly discuss combining models and incorporating implementation constraints
    — The goal should be to build the best predictive models now and incorporate constraints later

    © 2015 Towers Watson. All rights reserved.

  • 2

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Building component predictive models can be separated into two steps

    Initial modeling
    — Selecting the error structure and link function
    — Building a simple initial model
    — Testing basic modeling assumptions and methodology

    Iterative modeling
    — Refining your initial models through a series of iterative steps: complicating the model, then simplifying the model, then repeating

    © 2015 Towers Watson. All rights reserved.

  • 3

    Initial modeling

    Initial modeling is done to test basic modeling methodology

    Is my link function appropriate?

    Is my error structure appropriate?

    Is my overall modeling methodology appropriate? (e.g., do I need to cap losses? Exclude expense-only claims? Model by peril?)

    © 2015 Towers Watson. All rights reserved.

  • 4

    Examples of error structures

    Error structures reflect the variability of the underlying process and can be any distribution within the exponential family, for example:

    Gamma: consistent with severity modeling; may also want to try Inverse Gaussian

    Poisson: consistent with frequency modeling

    Tweedie: consistent with pure premium modeling

    Normal: useful for a variety of applications

    © 2015 Towers Watson. All rights reserved.

  • 5

    Generally accepted error structure and link functions

    Observed Response   Most Appropriate Link Function   Most Appropriate Error Structure   Variance Function
    --                  --                               Normal                             µ^0
    Claim Frequency     Log                              Poisson                            µ^1
    Claim Severity      Log                              Gamma                              µ^2
    Claim Severity      Log                              Inverse Gaussian                   µ^3
    Pure Premium        Log                              Gamma or Tweedie                   µ^p (Tweedie)
    Retention Rate      Logit                            Binomial                           µ(1-µ)
    Conversion Rate     Logit                            Binomial                           µ(1-µ)

    Use generally accepted standards as a starting point for link functions and error structures
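
    As a rough, illustrative sketch (software choice is an assumption, not part of the slides), these standard combinations map directly onto GLM family and link specifications in Python's statsmodels:

        # Sketch only: the link/error combinations from the table above,
        # expressed as statsmodels GLM families.
        import statsmodels.api as sm

        standard_specs = {
            "claim frequency":            sm.families.Poisson(sm.families.links.Log()),
            "claim severity":             sm.families.Gamma(sm.families.links.Log()),
            "claim severity (alt.)":      sm.families.InverseGaussian(sm.families.links.Log()),
            # Tweedie variance power between 1 and 2 is a modeling choice
            "pure premium":               sm.families.Tweedie(link=sm.families.links.Log(), var_power=1.5),
            "retention rate":             sm.families.Binomial(),   # logit link is the default
            "conversion rate":            sm.families.Binomial(),
        }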

    © 2015 Towers Watson. All rights reserved.

  • 6

    Build an initial model

    Reasonable starting points for model structure:
    — Prior model
    — Stepwise regression
    — General insurance knowledge
    — CART (Classification and Regression Trees) or similar algorithms

    © 2015 Towers Watson. All rights reserved.

  • 7

    Test model assumptions

    A plot of all residuals tests the selected error structure and link function

    [Figure: studentized standardized deviance residuals plotted against fitted value, illustrating common patterns]
    — Normal error structure/log link: an asymmetrical appearance suggests the power of the variance function is too low; an elliptical pattern is ideal
    — Two concentrations of residuals suggest two perils: split the data or use joint modeling
    — Use crunched residuals for frequency (crunched residuals shown with group size 72)
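
    A minimal sketch of such a residual check, assuming a GLM fitted with Python's statsmodels; plain deviance residuals are used here as a simple stand-in for the studentized standardized residuals shown on the slide:

        import matplotlib.pyplot as plt

        def plot_deviance_residuals(glm_results):
            """Scatter deviance residuals against fitted values for a fitted
            statsmodels GLMResults object (hypothetical usage)."""
            plt.scatter(glm_results.fittedvalues, glm_results.resid_deviance,
                        s=2, alpha=0.3)
            plt.axhline(0.0, color="grey", linewidth=1)
            plt.xlabel("Fitted value")
            plt.ylabel("Deviance residual")
            plt.show()

        # usage (hypothetical): plot_deviance_residuals(frequency_fit)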

    © 2015 Towers Watson. All rights reserved.

  • 8

    Example: initial frequency model

    Link function: Log

    Error structure: Poisson

    Initial variables selected based on industry knowledge: gender, driver age, vehicle value, area (territory)

    Variables NOT in initial model: vehicle body, vehicle age

    Gender Relativity
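
    A self-contained sketch of fitting such an initial frequency model, assuming Python with statsmodels; the data frame, column names, and numbers are entirely made up for illustration:

        # Hypothetical example: Poisson/log-link frequency model with the
        # four initial variables (gender, driver age, vehicle value, area).
        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 20000
        df = pd.DataFrame({
            "gender":    rng.choice(["F", "M"], n),
            "agecat":    rng.integers(1, 7, n),            # driver age band
            "veh_value": rng.gamma(2.0, 1.0, n).round(2),  # vehicle value (made-up units)
            "area":      rng.choice(list("ABCDEF"), n),    # territory
            "exposure":  rng.uniform(0.2, 1.0, n),
        })
        df["claims"] = rng.poisson(0.08 * df["exposure"])  # synthetic claim counts

        freq = smf.glm(
            "claims ~ C(gender) + C(agecat) + veh_value + C(area)",
            data=df,
            family=sm.families.Poisson(),       # log link is the Poisson default
            offset=np.log(df["exposure"]),       # exposure enters as an offset
        ).fit()

        relativities = np.exp(freq.params)       # multiplicative relativities by level
        print(relativities)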

    © 2015 Towers Watson. All rights reserved.

  • 9

    Example: initial frequency model

    Link function: Log

    Error structure: Poisson

    Initial variables selected based on industry knowledge: gender, driver age, vehicle value, area (territory)

    Variables NOT in initial model: vehicle body, vehicle age

    Driver Age Relativity

    © 2015 Towers Watson. All rights reserved.

  • 10

    Example: initial frequency model

    Link function: Log

    Error structure: Poisson

    Initial variables selected based on industry knowledge: gender, driver age, vehicle value, area (territory)

    Variables NOT in initial model: vehicle body, vehicle age

    Vehicle Value Relativity

    © 2015 Towers Watson. All rights reserved.

  • 11

    Example: initial frequency model

    Link function: Log

    Error structure: Poisson

    Initial variables selected based on industry knowledge: gender, driver age, vehicle value, area (territory)

    Variables NOT in initial model: vehicle body, vehicle age

    Area Relativity

    © 2015 Towers Watson. All rights reserved.

  • 12

    Example: initial frequency model - residuals

    Frequency residuals are hard to interpret without ‘Crunching’

    Two clusters:
    — Data points with claims
    — Data points without claims

    © 2015 Towers Watson. All rights reserved.

  • 13

    Example: initial frequency model - residuals

    Order observations from smallest to largest predicted value

    Group residuals into 500 buckets

    The graph plots the average residual in each bucket (a sketch of this procedure follows below)

    Crunched residuals look good!
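
    A short sketch of the crunching procedure described above, assuming fitted values and residuals from a GLM (for example the fittedvalues and resid_deviance attributes of a statsmodels fit); names are illustrative:

        import pandas as pd

        def crunch_residuals(fitted, residuals, n_buckets=500):
            """Order observations by fitted value, then average residuals
            within each of n_buckets equal-sized buckets."""
            d = pd.DataFrame({"fitted": fitted, "resid": residuals})
            d["bucket"] = pd.qcut(d["fitted"].rank(method="first"),
                                  n_buckets, labels=False)
            return d.groupby("bucket").agg(avg_fitted=("fitted", "mean"),
                                           avg_resid=("resid", "mean"))

        # usage (hypothetical):
        #   crunched = crunch_residuals(freq.fittedvalues, freq.resid_deviance)
        #   crunched.plot.scatter(x="avg_fitted", y="avg_resid")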

    © 2015 Towers Watson. All rights reserved.

  • 14

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Building component predictive models can be separated into two steps

    Initial modeling
    — Selecting the error structure and link function
    — Building a simple initial model
    — Testing basic modeling assumptions and methodology

    Iterative modeling
    — Refining your initial models through a series of iterative steps: complicating the model, then simplifying the model, then repeating

    © 2015 Towers Watson. All rights reserved.

  • 15

    Iterative Modeling

    Initial models are refined using an iterative modeling approach

    Iterative modeling involves many decisions to complicate and simplify the models

    Your modeling toolbox can help you make these decisions

    We will discuss your tools shortly

    [Diagram: iterative cycle: Review model → Complicate (include, interactions) → Simplify (exclude, group, curves) → repeat]

    © 2015 Towers Watson. All rights reserved.

  • 16

    Ideal Model Structure

    To produce a sensible model that explains recent historical experience and is likely to be predictive of future experience

    [Figure: predictive power vs. model complexity (number of parameters), ranging from the overall mean to one parameter per observation]
    — Underfit: predictive, but poor explanatory power
    — Overfit: explains history, but poor predictive power
    — The best models lie between the two extremes

    © 2015 Towers Watson. All rights reserved.

  • 17

    Your modeling tool box

    Model decisions include:
    — Simplification: excluding variables, grouping levels, fitting curves
    — Complication: including variables, adding interactions

    Your modeling toolbox will help you make these decisions. Your tools include:

    — Judgment (e.g., do the trends make sense?)

    — Balance tests (i.e. actual vs. expected test)

    — Parameters/standard errors

    — Consistency of patterns over time or random data sets

    — Type III statistical tests (e.g., chi-square tests, F-tests)

    © 2015 Towers Watson. All rights reserved.

  • 18

    Modeling toolbox: judgment

    The modeler should also ask, ‘does this pattern make sense?’

    Patterns may often be counterintuitive, but become reasonable after investigation

    Uses:
    — Inclusion/exclusion
    — Grouping
    — Fitting curves
    — Assessing interactions

    Modeled Frequency Relativity – Vehicle Value

    © 2015 Towers Watson. All rights reserved.

  • 19

    Modeling toolbox: balance test

    Balance test is essentially an actual vs. expected

    Can identify variables that are not in the model where the model is not in ‘balance’
    — Indicates the variable may be explaining something not in the model

    Uses: Inclusion

    Actual vs. Expected Frequency - Vehicle Age
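
    A hedged sketch of a balance test: aggregate actual and model-expected claim counts by a candidate variable (here a hypothetical vehicle-age column) and compare the resulting frequencies; the column names and fitted model are assumptions:

        def balance_test(data, by, actual="claims", expected="expected_claims",
                         exposure="exposure"):
            """Actual vs. expected frequency by level of a candidate variable."""
            g = data.groupby(by)[[actual, expected, exposure]].sum()
            g["actual_freq"] = g[actual] / g[exposure]
            g["expected_freq"] = g[expected] / g[exposure]
            return g[["actual_freq", "expected_freq"]]

        # usage (hypothetical):
        #   df["expected_claims"] = freq.fittedvalues   # expected counts incl. exposure offset
        #   print(balance_test(df, by="veh_age"))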

    © 2015 Towers Watson. All rights reserved.

  • 20

    Modeling toolbox: parameters/standard errors

    Parameters and standard errors provide confidence in the pattern exhibited by the data

    Uses:
    — Horizontal line test for exclusion
    — Plateaus for grouping
    — A measure of credibility

    Modeled Frequency Relativities With Standard Errors - Vehicle Body

    © 2015 Towers Watson. All rights reserved.

  • 21

    Modeling toolbox: consistency of patterns

    Checking for consistency of patterns over time or across random parts of a data set is a good practical test

    Uses: Validating modeling decisions
    — Including/excluding factors
    — Grouping levels
    — Fitting curves
    — Adding interactions

    Modeled Frequency Relativity – Age Category

    © 2015 Towers Watson. All rights reserved.

  • 22

    Modeling toolbox: type III tests

    The chi-square test and/or F-test is a good statistical test for comparing nested models
    — Ho: the two models are essentially the same
    — H1: the two models are not the same

    Principle of parsimony: If two models are the same, choose the simpler model

    Uses: Inclusion/exclusion

    Chi-Square Percentage   Meaning     Action*
    30%                     Accept Ho   Use Simpler Model
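
    As an illustration (the software and function names are assumptions, not the slide's), a nested-model chi-square test can be computed as a likelihood ratio test between the full and the simplified fit:

        from scipy import stats

        def lr_test(full_fit, reduced_fit):
            """Likelihood ratio test for two nested statsmodels GLM fits,
            where reduced_fit omits or groups some terms of full_fit."""
            lr_stat = 2.0 * (full_fit.llf - reduced_fit.llf)
            df_diff = full_fit.df_model - reduced_fit.df_model
            p_value = stats.chi2.sf(lr_stat, df_diff)
            return lr_stat, p_value

        # A large p-value supports Ho (keep the simpler model, per parsimony);
        # a small p-value supports the more complex model.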

    © 2015 Towers Watson. All rights reserved.

  • 23

    Example: frequency model iteration 1 – simplification

    Modeling decision: Grouping Age Category and Area

    Tools used: judgment, parameter estimates/standard errors, type III test

    Area Relativity; Age Category Relativity

    Chi Sq P Val= 97.4%

    Chi Sq P Val= 99.9%

    © 2015 Towers Watson. All rights reserved.

  • 24

    Example: frequency model iteration 1 – simplification

    Modeling decision: fitting a curve to vehicle value

    Tools used: judgment, type III test, consistency test

    Vehicle Value Relativity – Curve Fit; Vehicle Value Relativity – Initial Model

    Chi Sq P Val= 100.0%
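
    In the formula-based sketches used earlier, this simplification could amount to replacing a banded (categorical) vehicle value with a smooth term; the band and column names here are hypothetical:

        # Banded version: one parameter per vehicle-value band
        banded = "claims ~ C(gender) + C(agecat) + C(area) + C(veh_value_band)"

        # Curve version: a low-order polynomial in log(vehicle value)
        # (np must be imported where the formula is evaluated)
        curved = ("claims ~ C(gender) + C(agecat) + C(area) "
                  "+ np.log(veh_value) + I(np.log(veh_value) ** 2)")

        # Fit both with smf.glm(..., family=sm.families.Poisson(),
        # offset=np.log(df['exposure'])) and compare them with the likelihood
        # ratio sketch above; a similar fit with far fewer parameters supports
        # the curve (principle of parsimony).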

    © 2015 Towers Watson. All rights reserved.

  • 25

    Example: frequency model iteration 2 – complication

    Modeling decision: adding vehicle body type

    Tools used: balance test, parameter estimates/standard errors, type III test

    Balance Test: Actual vs. Expected Across Vehicle Body Type (Vehicle Body Type Not In Model)

    Vehicle Body Type Relativities (Vehicle Body Type Included in Model)

    Chi Sq P Val= 1.3%

    © 2015 Towers Watson. All rights reserved.

  • 26

    Example: iterative modeling continued….

    Iteration 3 – simplification: group vehicle body type

    Iteration 4 – complication: add vehicle age

    Iteration 5 – simplification: group vehicle age levels

    © 2015 Towers Watson. All rights reserved.

  • 27

    Example: frequency model iteration 6 – complication

    Action: adding age x gender interaction

    Tools used: balance test, type III test, consistency test, judgment

    Balance Test: Two-Way Actual vs. Expected Across Age x Gender (Age x Gender Interaction NOT in Model)

    Vehicle Body Type Relativities (Vehicle Body Type Included in Model)

    Chi Sq P Val= 47.5%


    © 2015 Towers Watson. All rights reserved.

  • 28

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Predictive models must be validated to have confidence in the predictive power of the models

    Model validation techniques include:
    — Examining residuals
    — Examining gains curves
    — Examining hold-out samples

    — Changes in parameter estimates

    — Actual vs. expected on hold out sample

    Component models and the combined risk premium model should both be validated

    © 2015 Towers Watson. All rights reserved.

  • 29

    Model validation: residual analysis

    Recheck residuals to ensure appropriate shape

    [Figure: box-whisker plot of studentized standardized deviance residuals by policyholder age]

    Crunched residuals are symmetric

    For severity: does the box-whisker plot show symmetry across levels?

    © 2015 Towers Watson. All rights reserved.

  • 30

    Model validation: residual analysis (cont’d)

    Common issues with residual plots

    [Figure: studentized standardized deviance residuals plotted against fitted value, as shown in the initial-modeling discussion]
    — Normal error structure/log link: an asymmetrical appearance suggests the power of the variance function is too low; an elliptical pattern is ideal
    — Two concentrations of residuals suggest two perils: split the data or use joint modeling
    — Use crunched residuals for frequency (crunched residuals shown with group size 72)

    © 2015 Towers Watson. All rights reserved.

  • 31

    Model validation: gains curves

    Gains curves are good for comparing the predictiveness of models
    — Order observations from largest to smallest predicted value on the X axis
    — Plot cumulative actual claim counts (or losses) on the Y axis

    As you move from left to right, the better model should accumulate actual losses faster
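
    A small gains-curve sketch, assuming numpy/matplotlib and arrays of actual claim counts and model predictions (names are illustrative):

        import numpy as np
        import matplotlib.pyplot as plt

        def gains_curve(actual, predicted, label=None):
            """Cumulative share of actual claims, ordered by prediction (descending)."""
            order = np.argsort(-np.asarray(predicted))
            cum_share = np.cumsum(np.asarray(actual)[order]) / np.sum(actual)
            x = np.arange(1, len(order) + 1) / len(order)
            plt.plot(x, cum_share, label=label)

        # usage (hypothetical):
        #   gains_curve(df["claims"], model_a.fittedvalues, "model A")
        #   gains_curve(df["claims"], model_b.fittedvalues, "model B")
        #   plt.plot([0, 1], [0, 1], "k--"); plt.legend(); plt.show()
        # The better model's curve rises faster from left to right.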

    © 2015 Towers Watson. All rights reserved.

  • 32

    Partial Test/Training for Smaller Data Sets; Full Test/Training for Large Data Sets

    Model validation: hold out samples

    Holdout samples are effective at validating models
    — Determine estimates based on part of the data set
    — Use the estimates to predict the other part of the data set

    Predictions should be close to actuals for heavily populated cells

    [Diagram, full test/training (large data sets): Data → Split data → Train data → Build models → Test data → Compare predictions to actual]

    [Diagram, partial test/training (smaller data sets): All data → Build models → Split data → Train data → Refit parameters → Test data → Compare predictions to actual]
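
    A sketch of the full test/training approach, assuming scikit-learn for the split and the same statsmodels-style frequency model as in the earlier illustrations (column names are hypothetical):

        import numpy as np
        import statsmodels.api as sm
        import statsmodels.formula.api as smf
        from sklearn.model_selection import train_test_split

        def holdout_validate(data, formula, by="agecat"):
            """Fit on a training split, predict the hold-out split, and compare
            actual vs. expected claim counts by cell."""
            train, test = train_test_split(data, test_size=0.3, random_state=0)
            fit = smf.glm(formula, data=train,
                          family=sm.families.Poisson(),
                          offset=np.log(np.asarray(train["exposure"]))).fit()
            test = test.copy()
            test["expected"] = fit.predict(
                test, offset=np.log(np.asarray(test["exposure"])))
            return test.groupby(by)[["claims", "expected"]].sum()

        # usage (hypothetical):
        #   print(holdout_validate(df, "claims ~ C(gender) + C(agecat) + C(area) + veh_value"))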

    © 2015 Towers Watson. All rights reserved.

  • 33

    Model validation: lift charts on hold out data

    Actual vs. expected on holdout data is an intuitive validation technique

    Good for communicating model performance to non-technical audiences

    Can also create actual vs. expected across predictor dimensions
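
    One way to build such a chart, assuming a hold-out data set with actual outcomes and model-expected values (as in the hold-out sketch above); everything here is illustrative:

        import pandas as pd

        def lift_chart(actual, expected, n_bins=10):
            """Average actual vs. expected within bins of the model prediction."""
            d = pd.DataFrame({"actual": actual, "expected": expected})
            d["bin"] = pd.qcut(d["expected"].rank(method="first"),
                               n_bins, labels=False) + 1
            return d.groupby("bin")[["actual", "expected"]].mean()

        # usage (hypothetical): lift_chart(test["claims"], test["expected"]).plot.bar()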

    © 2015 Towers Watson. All rights reserved.

  • 34

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Component frequency and severity models can be combined to create pure premium models

    Component models can be constructed in many different ways. The standard model:

    [Diagram: component models: Frequency (Poisson/negative binomial) and Severity (Gamma); combine as Frequency x Severity]
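
    A minimal sketch of the combination step, assuming hypothetical fitted frequency and severity GLMs (freq, sev) and a data frame of rating variables; this is not the presenter's code:

        import numpy as np

        def pure_premium(freq_fit, sev_fit, data):
            """Expected pure premium = expected frequency x expected severity."""
            expected_freq = freq_fit.predict(data, offset=np.zeros(len(data)))  # per unit exposure
            expected_sev = sev_fit.predict(data)
            return expected_freq * expected_sev

        # usage (hypothetical): data["pure_premium"] = pure_premium(freq, sev, data)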

    © 2015 Towers Watson. All rights reserved.

  • 35

    Building a model on modeled pure premium

    When using modeled pure premiums, select the gamma/log link (not the Tweedie)

    [Figure: density plot of severity]

    Modeled pure premiums will not have a point mass at zero

    [Figure: density plot of pure premium]

    Raw pure premiums are bimodal (i.e., have a point mass at zero) and require a distribution such as the Tweedie

    © 2015 Towers Watson. All rights reserved.

  • 36

    [Process flow: Set project goals and review background → Gather and prepare data → Explore data → Build component predictive models → Validate component models → Combine component models → Incorporate constraints]

    Various constraints often need to be applied to the modeled pure premiums


    Goal: Convert modeled pure premiums into indications after consideration of internal and external constraints

    Not always possible or desirable to charge the fully indicated rates in the short run:
    — Marketing decisions
    — Regulatory constraints
    — Systems constraints

    Need to adjust the indications for known constraints

    © 2015 Towers Watson. All rights reserved.

  • 37

    Constraints to give desired subsidies

    Offsetting one predictor changes the parameters of other correlated predictors to make up for the restrictions
    — The stronger the exposure correlation, the more that can be made up through the other variable
    — Consequently, the modeler should not refit the models when a desired subsidy is incorporated into the rating plan

    Insurer-Desired Subsidy
    — Example: senior management wants a subsidy to attract drivers 65+
    — Result of refitting with constraint: correlated factors will adjust to partially make up for the difference (for example, territories with retirement communities will increase)
    — Potential action: do not refit models with the constraint

    Regulatory Subsidy
    — Example: a regulatory constraint requires a subsidy for drivers 65+
    — Result of refitting with constraint: correlated factors will adjust to partially make up for the difference (for example, territories with retirement communities will increase)
    — Potential action: consider the implications of refitting and make a business decision

    © 2015 Towers Watson. All rights reserved.