TRANSCRIPT
Stay or Go? The science of departures
from superannuation funds
Nathan Bonarius & Richard Dunn
© Rice Warner Pty Ltd
This presentation has been prepared for the Actuaries Institute 2017 Actuaries Summit.
The Institute Council wishes it to be understood that opinions put forward herein are not necessarily those of the
Institute and the Council is not responsible for those opinions.
Think differently
• Statistical modelling – out with the old, in with the new
• Actuarial service in non-traditional and non-finance industries
Agenda
• The Data
• Data analysis
• Modelling and Results
• Discussion
Data – Super Insights 2016
5.6m Records | 30-June-2015 | Longitudinal 2013-2015
Age
1st Qu.   Median   Mean      3rd Qu.
26.12     34.83    37.52     47.6
Balance as at 30-06-2015
1st Qu.   Median   Mean      3rd Qu.
$1,112    $6,702   $30,170   $29,330
Gender
F           M           U
3,216,341   2,333,525   15,695
Some of the funds included
Data analysis – exit reasons (2015)
[Chart: axis 0.0% to 45.0%; categories: Rollover, Unknown, Retirement, Ceased Employment, Death Benefit, Compassionate Hardship, Other]
Data analysis – exit destinations
Membership Basis
Assets Basis
Data analysis – Demographics
• Age
• Gender
• Balance
• Heterogeneity
Age
Gender
Gender
Tenure
Data analysis – Account Indicators
• Balance and its change
• Contributions
• Insurance Cover
• Default Investments
Balance
Change in Balance
Contributions
Contributions
Insurance Cover and Investment Choice
Models considered
• Naïve Estimator
• GLM
• Support Vector Machines
• Random Partitions
• Random Forests
• Extreme Gradient Boosting
• Ensemble Models
– Positive predictive value
– Correctly predicted exits
Evaluation Metrics
• Log Loss
• Improvement on the Naïve Estimator
• Confusion Matrices
– Total accuracy
– Positive predictive value
– True positive rate
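The first two metrics above can be illustrated with a minimal sketch. It assumes the standard binary log loss (mean negative log-likelihood), a naïve estimator that predicts the overall exit rate for every member, and "improvement" measured as the relative reduction in log loss. The slides do not define these quantities, so the function names, the definition of improvement and the toy data below are all hypothetical.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood for binary outcomes (1 = exit, 0 = remain)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log() never sees 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def naive_probabilities(y_true):
    """Assumed naive estimator: predict the observed exit rate for everyone."""
    rate = sum(y_true) / len(y_true)
    return [rate] * len(y_true)

def improvement_on_naive(y_true, p_pred):
    """Assumed definition: relative reduction in log loss versus the naive estimator."""
    base = log_loss(y_true, naive_probabilities(y_true))
    return (base - log_loss(y_true, p_pred)) / base

# Hypothetical toy data, not taken from the presentation
y = [0, 0, 0, 1, 0, 1, 0, 0]
p = [0.1, 0.2, 0.1, 0.7, 0.3, 0.6, 0.1, 0.2]

print(f"log loss: {log_loss(y, p):.4f}")
print(f"improvement on naive: {improvement_on_naive(y, p):.1%}")
```

Under this definition a model that only reproduces the base rate scores 0%, and a negative value means the model fits worse than the naïve estimator.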
Fit – Log Loss
[Chart: log loss by model (axis 0.0 to 5.5); models: Basic, GLM, Random Partition, XG Boost, SVM, Random Forest, Ensembling - PPV, Ensembling - Accuracy]
Fit – Improvement on a naïve estimator
[Chart: percentage improvement by model (axis -2% to 8%); models: Basic, GLM, Random Partition, XG Boost, SVM, Random Forest, Ensembling - PPV, Ensembling - Accuracy]
Importance Plots – XG Boost model
Fit – Confusion Matrices
Model                   Total Accuracy   Positive Predictive Value   True Positive Rate
Basic                   86.1%            7.5%                        7.5%
GLM                     86.6%            10.8%                       10.8%
Random Partition        87.7%            16.1%                       15.1%
XG Boost                89.5%            30.2%                       30.2%
SVM                     90.4%            7.6%                        2.5%
Random Forest           92.4%            47.7%                       22.5%
Ensembling - PPV        92.6%            52.0%                       20.8%
Ensembling - Accuracy   86.6%            24.2%                       37.1%
A Final Model
Ensemble model fitted to optimise positive predictive value (PPV):
• Markedly higher PPV, which is more practical for business use
• Immaterially less accurate than the alternatives (0.001%)
                   Actual
                   Remain    Exit
Predicted Remain   505,240   32,894
Predicted Exit     7,997     8,648
Confusion Matrix Diagnostics
Total Accuracy 92.6%
Positive Predictive Value 52.0%
True Positive Rate 20.8%
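As a check on how these diagnostics relate to the confusion matrix, the sketch below recomputes them from the four counts on the slide. It assumes "Exit" is the positive class and that rows are predictions and columns are actual outcomes. The function name and the labelling of the cells as tn/fn/fp/tp are illustrative, not taken from the presentation.

```python
def confusion_diagnostics(tn, fn, fp, tp):
    """Summary rates from a 2x2 confusion matrix with 'Exit' as the positive class."""
    total = tn + fn + fp + tp
    return {
        "total_accuracy": (tn + tp) / total,          # all correct predictions
        "positive_predictive_value": tp / (tp + fp),  # predicted exits that did exit
        "true_positive_rate": tp / (tp + fn),         # actual exits that were predicted
    }

# Counts from the slide, under the assumed orientation:
#   predicted Remain / actual Remain = 505240 (tn)
#   predicted Remain / actual Exit   = 32894  (fn)
#   predicted Exit   / actual Remain = 7997   (fp)
#   predicted Exit   / actual Exit   = 8648   (tp)
diag = confusion_diagnostics(tn=505240, fn=32894, fp=7997, tp=8648)

for name, value in diag.items():
    print(f"{name}: {value:.1%}")
# prints:
# total_accuracy: 92.6%
# positive_predictive_value: 52.0%
# true_positive_rate: 20.8%
```

Under this orientation the counts reproduce the three percentages quoted on the slide.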
Future and Practical Value
Practical uses:
• Developing data-driven marketing strategies
• Understanding fund demographics
Extensions
• Considering other predictors, e.g. call centre contact.
• Estimating where members go and their reasons for leaving.
DISCLAIMER
The information contained in this presentation is provided by Rice Warner Pty Limited, ABN 35 003 186 883, AFSL 239191
(Rice Warner). It is intended as general information and is not personal advice as it does not take into account the
particular circumstances of any reader. While due care and attention has been exercised in the preparation of this
presentation, no representation or warranty, either express or implied, is given as to the accuracy, completeness or
reliability of this information. The information presented is not intended to be a complete statement or summary of the
matters referred to in the presentation.