predictive analytics - amazon web services

Actuaries and Consultants 2019 Spring Workshops, Chicago Actuarial Association Predictive Analytics: A Practical Approach Andrea Huckaba Rome, FSA, CERA, MAAA


Page 1: Predictive Analytics - Amazon Web Services

Actuaries and Consultants

2019 Spring Workshops, Chicago Actuarial Association

Predictive Analytics: A Practical Approach

Andrea Huckaba Rome, FSA, CERA, MAAA

Page 2: Predictive Analytics - Amazon Web Services

Why Predictive Analytics Matters

Emerging area of interest

Automation will affect the work you do

Actuaries (and other experts) needed to fully benefit from Predictive Analytics


Page 3: Predictive Analytics - Amazon Web Services

Agenda

1. Definition of Predictive Analytics

2. Applicable Business Problems

3. Available Tools

4. Step-by-step example

Page 4: Predictive Analytics - Amazon Web Services

Definition

“Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.”

How is this different from what we do today?
• New techniques
• Fewer Assumptions*
• Less Time

Page 5: Predictive Analytics - Amazon Web Services

Definition

Examples:

Health Risk Scoring
• New Techniques
• Fewer Assumptions
• Less Time

Auto Fraud Detection
• New Techniques
• Fewer Assumptions
• Less Time

Page 6: Predictive Analytics - Amazon Web Services

Business Problems Criteria

When is PA appropriate for a business problem?

1. Clearly defined problem

2. Lots of useful data

3. Prediction will drive action

4. PA is the best option


Page 7: Predictive Analytics - Amazon Web Services

Business Problems Criteria

1. Clearly defined problem

2. Lots of useful data

3. Prediction will drive action

4. PA is the best option


Case 1: Some customers have complained about the system they use to contact the company, file claims, and receive information. The customer experience VP wants to create a customized set of protocols for each person that cater to their preferences. We have call center data, demographic data, customer e-mails, and social media data. Is this a viable PA problem?

Case 2: Your company’s head of sales wants to develop and sell a radically new product, and wants to know what type of people might be interested in it. You have current enrollment and claims data for your members, as well as some scattered industry reports on popular products. Is this a viable PA problem?

Page 8: Predictive Analytics - Amazon Web Services

Business Problems Criteria

1. Clearly defined problem

2. Lots of useful data

3. Prediction will drive action

4. PA is the best option


Case 3: Your state department of insurance wants a better way to identify potentially fraudulent claims for further investigation. They have comprehensive claims, enrollment and provider data from each carrier. They also have a database of prior cases that they have investigated for fraud, with the outcomes of each case. Is this a viable PA problem?

Page 9: Predictive Analytics - Amazon Web Services

Available PA Tools

Software

R, Python, SAS, Statistica, …

Types of Models

Categorical vs. Numerical

Exploratory vs. Result-Based

Transparent vs. Black Box

Fast vs. Slow (Execution Speed)

Fast vs. Slow (Analytical Time/Effort)

Stability

Some Model Types:
• Linear Models
• Generalized Linear Models
• Decision Trees
• Random Forests
• Gradient Boosting Machines
• Support Vector Machines
• Neural Networks
• Genetic Algorithms

Page 10: Predictive Analytics - Amazon Web Services

Available PA Tools

Generalized Linear Models

Match to common distributions with link functions, using most predictive data variables to shape the curve.

Like a linear regression model, maximized!!

Advantages:
• Efficient
• Interpretable
• Smooth prediction surface

Disadvantages:
• Likely to under-fit
• Linear parameters

Variation: Stacking/Blending with other model types
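To make the link-function idea concrete, here is a minimal sketch of one GLM: a Bernoulli response with a logit link (logistic regression), fit by plain gradient ascent. The data, the variable names (rate_increase, lapsed), and the learning-rate settings are illustrative assumptions, not figures from the workshop. In practice you would use the GLM routines in R, Python, or SAS.

```python
import math

def sigmoid(z):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_glm(xs, ys, lr=0.01, n_iter=20000):
    """Fit logit(p) = b0 + b1 * x by gradient ascent on the Bernoulli log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Gradient of the average log-likelihood with respect to b0 and b1.
        g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)) / n
        g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Hypothetical data: prior-year rate increase (%) and whether the member lapsed.
rate_increase = [0, 2, 4, 5, 6, 8, 10, 12, 15, 20]
lapsed = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic_glm(rate_increase, lapsed)
p_low = sigmoid(b0 + b1 * 0)    # fitted lapse probability with no rate increase
p_high = sigmoid(b0 + b1 * 20)  # fitted lapse probability with a 20% increase
```

The fitted probability moves smoothly with the rate increase, which is the "smooth prediction surface" advantage; the single slope b1 is the "linear parameters" limitation.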

Page 11: Predictive Analytics - Amazon Web Services

Available PA Tools

Decision Trees

Splits the feature space into exhaustive and mutually exclusive datasets that best predict the target variable.

Advantages:
• Simple
• Non-linear
• Interpretable
• Handles Missing Values in Data

Disadvantages:
• Likely to overfit
• Unstable
• Prediction surface not smooth

Variation: Random Forest or Stacking/Blending with other model types
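A rough sketch of how a tree chooses one split: try each candidate threshold on a feature and keep the one that leaves the two resulting groups purest. Gini impurity is used here as the purity measure, which is a common choice but an assumption on our part; the ages and lapse flags are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best rule 'value <= threshold'."""
    n = len(values)
    best_threshold, best_score = None, gini(labels)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

# Hypothetical data: member age and whether the member lapsed.
ages = [23, 25, 31, 38, 44, 52, 58, 63, 64, 66]
lapsed = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(ages, lapsed)  # (60.5, 0.0) for this toy data
```

A full tree repeats this search inside each resulting group, which is why an unpruned tree can keep splitting until it memorizes the training data.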

Page 12: Predictive Analytics - Amazon Web Services

Available PA Tools

Support Vector Machines

Groups similar data points using decision boundaries. Future data points can be grouped quickly using these boundaries.

Advantages:
• Good for online learning
• Flexible with non-linear data

Disadvantages:
• Likely to overfit
• Do not provide probability estimates, just classification.
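As an illustration only, the sketch below trains a linear decision boundary one data point at a time using sub-gradient steps on the hinge loss. This is one simple way to fit a linear SVM, chosen here because it shows the online-learning behavior; it is not the kernel-based solver most packages use. The points, labels, and tuning values (lam, lr, epochs) are hypothetical.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM via per-point sub-gradient steps on the L2-regularized hinge loss.

    labels must be -1 or +1. Returns the weight vector w and intercept b.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Shrink the weights slightly (the regularization part of the step).
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Move the boundary only when the point is inside the margin.
            if margin < 1.0:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

def classify(w, b, point):
    """Return the class (+1 or -1) only; there is no probability estimate."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Hypothetical, already-centered two-feature data with two well-separated groups.
points = [(-2, -2), (-2, -1), (-1, -2), (-1, -1), (1, 1), (1, 2), (2, 1), (2, 2)]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
new_point_class = classify(w, b, (3, 3))
```

Once w and b are known, grouping a new point is a single dot product, which is why future points can be classified quickly.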

Page 13: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

The Business Problem

Self-insured group wants to know why members are lapsing

1. Clearly Define: Determine drivers of lapse for this healthcare MEWA, so we can predict potential future lapse

2. Lots of Useful Data: Claims, demographic data, membership information, and possible unstructured data from sales team

3. Prediction Drives Action: Characteristics of lapsers may help MEWA prioritize changes and/or use predictions to lessen chance of lapse

4. Most Appropriate Option: Combining multiple characteristics of lapsers and looking for patterns is easier with PA than with traditional analysis.

Page 14: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Scrubbing and Common sense checks

ASOP 23
• Do data imperfections have a material impact on my results?
• Do I understand the definitions of each data field?
• Is the data reasonable to use for this analysis?

Page 15: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Scrubbing and Common sense checks

Scrubbing
• Reformat fields
• Create new fields
• Lower dimensionality on some fields
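The three scrubbing steps can be sketched as follows. Every field name (dob, term_date, zip), the age bands, and the as-of date are hypothetical; the point is only to show one example of each step: text dates reformatted into dates, a new lapsed flag created, and age and ZIP code reduced to a few categories.

```python
from datetime import date

# Hypothetical raw membership records, as they might arrive from a source system.
raw = [
    {"member_id": "A1", "dob": "1958-03-14", "term_date": "2017-06-30", "zip": "60614"},
    {"member_id": "B2", "dob": "1991-11-02", "term_date": "", "zip": "60007"},
]

def age_band(age):
    """Lower dimensionality: collapse exact age into four illustrative bands."""
    if age < 26:
        return "<26"
    if age < 45:
        return "26-44"
    if age < 65:
        return "45-64"
    return "65+"

def scrub(rec, as_of=date(2017, 12, 31)):
    # Reformat fields: text dates become date objects (blank means still enrolled).
    dob = date.fromisoformat(rec["dob"])
    term = date.fromisoformat(rec["term_date"]) if rec["term_date"] else None
    # Age in completed years as of the valuation date.
    age = as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))
    return {
        "member_id": rec["member_id"],
        "lapsed": term is not None,   # create a new field
        "age_band": age_band(age),    # lower dimensionality on age
        "zip3": rec["zip"][:3],       # lower dimensionality on geography
    }

clean = [scrub(r) for r in raw]
```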

Page 16: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Scrubbing and Common sense checks

Checks
• Lapse by month

Page 17: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Scrubbing and Common sense checks

Checks
• Lapse by month
• Lapse by age

Page 18: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Scrubbing and Common sense checks

Checks
• Lapse by month
• Lapse by age
• Other checks
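Common sense checks like these are usually simple tabulations. A minimal sketch, assuming each member record carries a hypothetical age_band field and a lapse_month field (None if the member did not lapse):

```python
from collections import Counter

# Hypothetical member records; lapse_month is None for members who stayed.
members = [
    {"age_band": "<26", "lapse_month": 12},
    {"age_band": "<26", "lapse_month": None},
    {"age_band": "26-44", "lapse_month": None},
    {"age_band": "26-44", "lapse_month": 6},
    {"age_band": "65+", "lapse_month": 12},
    {"age_band": "65+", "lapse_month": 12},
]

# Lapse by month: count of lapses in each calendar month.
lapses_by_month = Counter(
    m["lapse_month"] for m in members if m["lapse_month"] is not None
)

def lapse_rate_by(members, key):
    """Share of members who lapsed, within each value of the given field."""
    totals, lapses = Counter(), Counter()
    for m in members:
        totals[m[key]] += 1
        if m["lapse_month"] is not None:
            lapses[m[key]] += 1
    return {k: lapses[k] / totals[k] for k in totals}

# Lapse by age: lapse rate within each age band.
lapse_rate_by_age = lapse_rate_by(members, "age_band")
```

If the tabulation shows something unexpected, such as no spike at renewal or a lapse rate of zero for a whole age band, that is a prompt to revisit the data definitions before modeling.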

Page 19: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Splitting the data

Testing / Training / (Validation)

Avoid overfitting, accurate measure of predictive power

Random vs Stratified Sampling
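A minimal sketch of a stratified split, assuming a hypothetical lapsed flag on each record and a 30% testing share. A purely random split would shuffle all records together; the stratified version shuffles within each outcome group so that the training and testing sets keep the same lapse rate.

```python
import random
from collections import defaultdict

def stratified_split(records, label_key, test_frac=0.3, seed=0):
    """Split records into (train, test), preserving the mix of label_key values."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for r in records:
        by_label[r[label_key]].append(r)
    train, test = [], []
    for group in by_label.values():
        group = group[:]       # copy so the caller's list is not reordered
        rng.shuffle(group)
        n_test = round(len(group) * test_frac)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Hypothetical data: 100 members, 20% of whom lapsed.
records = [{"id": i, "lapsed": i % 5 == 0} for i in range(100)]
train, test = stratified_split(records, "lapsed")

train_lapse_rate = sum(r["lapsed"] for r in train) / len(train)
test_lapse_rate = sum(r["lapsed"] for r in test) / len(test)
```

Stratifying matters most when the outcome is rare, because a random split can leave the testing set with too few lapses to measure predictive power reliably.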


Page 20: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Data Gathering/Exploration

Relationships between variables

Univariate, Multivariate plotting

Other measures of relationships
• Correlation
• Principal Component Analysis
• Clustering Analysis (KNN)
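Correlation is the simplest of these measures. The sketch below computes the Pearson correlation coefficient by hand; the age and claims figures are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical data: member age and annual claims (in $000s).
age = [25, 32, 41, 48, 55, 63]
claims = [1.2, 1.0, 2.5, 3.1, 4.8, 6.0]

r_age_claims = pearson(age, claims)
```

Two candidate features that are highly correlated with each other carry much of the same information, which is worth knowing before feature selection.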

Page 21: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Feature Selection

Not all of the raw data will be used in our analysis

Choose and/or modify the raw data

Two methods:
• Every feature then pare down
• A few features then build up
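The "build up" method can be sketched as a greedy loop: add whichever feature improves the score most, and stop when no feature adds a meaningful gain. The scoring function below is a toy stand-in with invented numbers; in real work it would fit a model on the training data and measure accuracy on validation data. The feature names and the min_gain cutoff are also hypothetical.

```python
def forward_select(candidates, score, min_gain=0.01):
    """Greedy build-up: repeatedly add the feature that raises the score most."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score each remaining feature when added to the current set.
        top_score, top_feature = max((score(selected + [f]), f) for f in remaining)
        if top_score - best < min_gain:
            break  # no remaining feature adds enough to be worth keeping
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Toy stand-in for "fit on training data, score on validation data".
TOY_GAIN = {"renewal_month": 0.10, "age_band": 0.06, "tier": 0.03, "zip3": 0.001}

def toy_score(features):
    return 0.60 + sum(TOY_GAIN[f] for f in features)

selected, final_score = forward_select(list(TOY_GAIN), toy_score)
# selected == ["renewal_month", "age_band", "tier"]
```

The "pare down" method is the mirror image: start with every feature and repeatedly drop the one whose removal costs the least.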

Page 22: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Model Selection

What is Needed: High transparency, Categorical

Decision Tree (to start)

Page 23: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Model Selection - Decision Tree (to start)

Initial decision tree:
• Unreadable
• Too many factors = false precision

Solution: We need to prune the tree

Page 24: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Model Training/Testing

Pruned decision tree:
• Better
• Makes sense
• Some further pruning might be needed (area-based predictions)
• Might consider other features

Finally, use this model to determine the accuracy against our training and testing datasets.
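Accuracy here is simply the share of records the model classifies correctly, computed separately on each dataset. In the sketch below the "pruned tree" is a hand-written two-rule stand-in with invented thresholds and records, not the tree from the workshop.

```python
def accuracy(model, rows, label_key="lapsed"):
    """Share of rows where the model's prediction matches the actual outcome."""
    correct = sum(model(row) == row[label_key] for row in rows)
    return correct / len(rows)

def pruned_tree(row):
    """Hypothetical two-rule tree: predicts lapse (True) or no lapse (False)."""
    if row["age"] >= 65:
        return True
    return row["rate_increase"] > 0.15

# Hypothetical training and testing records.
train = [
    {"age": 67, "rate_increase": 0.05, "lapsed": True},
    {"age": 40, "rate_increase": 0.20, "lapsed": True},
    {"age": 30, "rate_increase": 0.05, "lapsed": False},
    {"age": 50, "rate_increase": 0.10, "lapsed": True},
]
test = [
    {"age": 70, "rate_increase": 0.00, "lapsed": True},
    {"age": 35, "rate_increase": 0.02, "lapsed": False},
]

train_accuracy = accuracy(pruned_tree, train)  # 0.75
test_accuracy = accuracy(pruned_tree, test)    # 1.0
```

A training accuracy far above the testing accuracy is a sign of overfitting; similar values suggest the pruned tree generalizes.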

Page 25: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Other Follow-ups

Remove some features to see if others play a more important role:

• Remove renewal as a factor. Aging and discontinued plans were most prominent.

• Remove Aging as a factor. Tier and prior year rate increases were most prominent.

Page 26: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Interpreting Results

How do we convey this to the stakeholders?

How can this model be implemented?

When should this model be updated?

Accuracy

Training Model Value: 77.2%

Testing Model Value: 77.8%

Page 27: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

Interpreting Results

Exhibit 8.4: Type of Lapse, by Year
% of Total Lapsed members for the year

Type of Lapse                        2016   2017   Description
Age-Based Lapse                      19%    17%    Medicare eligible or Dependents to 26
Mid-Year Spouse or Dependent Lapse   5%     3%     Child covered by the other parent, spouse gains other coverage, change in living situation, etc.
Renewal Spouse or Dependent Lapse    1%     3%     Employee opts to change tier during renewal
Mid-Year Employee Lapse              43%    35%    Employee lapse during any month but December. Late renewal decisions, change in employment, etc.
Renewal Employee Lapse               12%    11%    Employee lapse at the end of December, a decision to not renew their plan
Mid-Year Group Lapse                 15%    18%    Group lapse during any month but December
Renewal Group Lapse                  5%     13%    Group lapse at the end of December, a decision to not renew group coverage

Page 28: Predictive Analytics - Amazon Web Services

Example PA Problem: Health Plan Lapse

What else?

Given time, I would also do the following:
• Incorporate more medical information, and additional demographics
• Incorporate unstructured data from sales
• Random Forest or Stacking/Blending to increase credibility of the model
• Increase the visual appeal

Page 29: Predictive Analytics - Amazon Web Services

Final Notes for Actuaries

Actuarial students are learning this; use their expertise.

These are just tools. More sophisticated tools.

Remember the criteria for business problems.

Thank You!