IoF Analysts Group:
Advanced Segmentation
Andrew Lockett, Tim Dyke & Scott Logie
13 May 2008
Summary
• What do we mean by ‘advanced segmentation’?
• An outline of some of the techniques
– Predictive Modelling
– Clustering
– Optimisation
– Combinations
– Event/Trigger detection
• Case Studies from WWF throughout
• Questions
Advanced Segmentation
• Segmentation – any way to split up our supporter base
• Advanced – progressive
• Therefore Advanced Segmentation =
“progressive ways to split up our supporter base”
Advanced Segmentation
• Why be progressive?
• Only if it improves what is currently being done
• So either save money or make money
• Time plays a key element in any segmentation
A typical supporter value journey
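[Chart: supporter value (£, from negative to positive) plotted against time, starting at time 0. Lifecycle stages along the time axis: Acquisition, Welcome, Development, Retention, Win-back. Points marked on the curve: Welcome (driving initial behaviours), Conversion, Peaking / Max Value, Beginning descent, Continued descent, Inactivity, Dormancy, Closure. Callout: maximise the height and longevity of this period.]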
Predictive Modelling
Why use predictive response models?
• Should give better results… but see later!
• Objective way of setting cut-offs
• Estimate impact of varying volume on income
• Makes selections easy!
What do you need to do it?
• Historic mailing and transaction data
• Analysis tools for regression – SPSS, SAS etc
• A good marketing database helps…
Steps involved in building a model (1)
• Choose a historic appeal to model
• Identify target variable:
– Usually responder vs non-responder
• Develop predictor characteristics (retrospective)
• Carry out Univariate / profiling analysis:
Univariate analysis – an example
[WWF].[M-DBM001].[DONEVER_B] (Gini = 10.3331)
Values        Base Count  Target Count  Base %  Target %  Index  Significance  Response Rate
a.Never            7,405            76   12.5%      1.4%     11        -27.68           1.0%
b.<£10             6,619           305   11.2%      5.5%     49        -14.03           4.4%
c.10-£50          22,677         1,439   38.3%     25.9%     68        -36.69           6.0%
d.50-£100          9,271           994   15.7%     17.9%    115          5.80           9.7%
e.100-£500        11,815         2,195   19.9%     39.6%    198         51.87          15.7%
f.500-£1000        1,179           422    2.0%      7.6%    382         13.27          26.4%
g.£1000+             284           116    0.5%      2.1%    436          3.77          29.0%
Total             59,250         5,547
Donations Ever is a good predictor of future response
• This is an obvious example
– Equivalent to V in RFV matrix
• Can get more clever
– Categorical variables – such as sex, recruitment channel
– Cross characteristics to recognise interactions, e.g. age with marital status
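The Index column in the Donations Ever table can be reproduced from the counts alone: it is the band's share of responders divided by its share of the mailed base, times 100. A minimal sketch in Python, using the base and target counts from that table. The function name `band_index` is illustrative only and not part of any tool mentioned in the talk.

```python
# Counts from the Donations Ever table: band -> (base count, target count)
BANDS = {
    "a.Never": (7405, 76),
    "b.<£10": (6619, 305),
    "c.10-£50": (22677, 1439),
    "d.50-£100": (9271, 994),
    "e.100-£500": (11815, 2195),
    "f.500-£1000": (1179, 422),
    "g.£1000+": (284, 116),
}


def band_index(bands):
    """Return {band: index}; 100 means the band responds at the average rate."""
    total_base = sum(base for base, _ in bands.values())
    total_target = sum(target for _, target in bands.values())
    result = {}
    for name, (base, target) in bands.items():
        base_share = base / total_base        # share of everyone mailed
        target_share = target / total_target  # share of all responders
        result[name] = round(100 * target_share / base_share)
    return result


for name, idx in band_index(BANDS).items():
    print(f"{name:12s} index {idx}")
```

An index well above 100 (the £100+ bands here) marks a band that is over-represented among responders, which is what makes the characteristic a candidate predictor.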
Steps involved in building a model (2)
• Choose strong predictors
• Feed into regression tool to build model
[Screenshot: building the regression model. Choose target and select predictors.]
• Validate and test model:
[Chart: actual response rate (0% to 40%) by semi-decile of predicted score (1 to 20), with a breakeven line. Callout: below semi-decile 12 unprofitable.]
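The validation chart can be approximated in a few lines: rank a hold-out sample by predicted score, cut it into 20 equal bands, and compare each band's actual response rate with breakeven. The sketch below is an illustration under stated assumptions, not the analysis behind the slide: the data is synthetic, and the £0.50 pack cost and £10 average gift are invented to give a 5% breakeven.

```python
import random


def semi_decile_rates(scores, responded, bands=20):
    """Rank by predicted score (best first), cut into equal bands,
    and return the actual response rate observed in each band."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rates = []
    for b in range(bands):
        chunk = ranked[b * n // bands:(b + 1) * n // bands]
        rates.append(sum(flag for _, flag in chunk) / len(chunk))
    return rates


def profitable_bands(rates, breakeven):
    """Band numbers (1 = best scores) whose actual rate beats breakeven."""
    return [i + 1 for i, rate in enumerate(rates) if rate > breakeven]


# Synthetic hold-out sample, for illustration only (not WWF data).
rng = random.Random(0)
scores = [rng.random() * 0.4 for _ in range(20000)]
responded = [1 if rng.random() < s else 0 for s in scores]

# Assumed economics: £0.50 per pack and £10 average gift give a 5% breakeven.
breakeven = 0.50 / 10.0
rates = semi_decile_rates(scores, responded)
print("Bands worth mailing:", profitable_bands(rates, breakeven))
```

The point of checking actual rather than predicted rates is that the cut-off is then set on observed behaviour in each band, not on the model's own estimate.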
Applying the response model
• Sounds simple… but easy to get it wrong
• Score up “universe” and select those above predetermined cut-off
• But what “universe” to use:
– Just those mailed previously – in which case volume is reduced
– Widen out to untested audience – in which case model could be misleading
Applying the response model
[Diagram: predicted response rate, from Good to Poor, with a cut-off line. Existing audience: drop those below the cut-off. Untested audience: pull in those above the cut-off. Callout: danger that model may over-predict.]
Predictive Modelling summary
• Powerful tool in right situations … but
• Need to be careful to apply correctly… and
• Can be hard to explain to appeal managers
• So what are the alternatives?
Alternative ‘hybrid’ approach
• Simple RFV type rules to determine the ‘no brainer’ segments…
• …Then multivariate analysis used to ‘cherry-pick’ the best prospects from the non-responsive groups
• Good balance between simple (sensible) rules and more complex (and risky!) modelling
‘Hybrid’ approach in action
• Cash appeal, Jan-08:
– RFV rule: Recent cash donors above £5
– Additional prospects: Modelling to predict responsiveness of DD supporters
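A hybrid selection of this kind can be expressed as two passes over the file: a plain rule first, then a model score for those the rule leaves out. The sketch below is an assumption-laden illustration, not WWF's selection code. The `Supporter` fields, the 12-month definition of "recent" and the function name are hypothetical; the £5 threshold and the 4% cut-off are the figures quoted for this appeal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Supporter:
    supporter_id: int
    months_since_cash_gift: Optional[int]  # None = has never given cash
    last_cash_gift: float                  # £
    is_dd: bool                            # active Direct Debit supporter
    model_score: float                     # predicted responsiveness, 0..1


def select_for_cash_appeal(supporters, recency_months=12, min_gift=5.0, cutoff=0.04):
    """Return (supporter_id, reason) pairs: RFV rule first, then model prospects."""
    selected = []
    for s in supporters:
        recent_cash = (
            s.months_since_cash_gift is not None
            and s.months_since_cash_gift <= recency_months  # assumed meaning of "recent"
            and s.last_cash_gift > min_gift
        )
        if recent_cash:
            selected.append((s.supporter_id, "RFV rule"))
        elif s.is_dd and s.model_score >= cutoff:
            selected.append((s.supporter_id, "model prospect"))
    return selected


example = [
    Supporter(1, 3, 20.0, False, 0.00),
    Supporter(2, None, 0.0, True, 0.06),
    Supporter(3, None, 0.0, True, 0.02),
]
print(select_for_cash_appeal(example))
```

Keeping the reason alongside each selection makes it easy to report results separately for the rule-based segment and the modelled prospects.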
Cash prospect modelling – Overview
UNIVARIATE ANALYSIS
– Gender
– Region / GeoDems
– Value of DD support
– Tenure of DD support
– Acquisition Media / Product
– Other WWF support / engagement
SIGNIFICANT MODEL VARIABLES
– ConCensus Group (1)
– DD Tenure (2 bands)
– Acquisition Media (1) / Product (1)
– DD Value (1 band)
– Other support / engagement (5 types)
Responsiveness score
Cash prospect modelling – Implementation
• Predicted response ‘cut-off’ set at 4%
– Ensured a positive return at the margin
• Prospect model indicative of responsiveness
– Targeted supporters 3x more responsive vs. base
– Modelling to be repeated and extended for future appeals
Cash prospect modelling – Interpretation
• Caution needed when applying historic findings
– ‘Predicted response rate’ better interpreted as ‘Responsiveness index’
– Set ‘cut-off’ and sample sizes to minimise risk
• Expectations should be set appropriately
– Modelling employed here to identify ‘best of the rest’
Longer term practicalities
• Consider the flow in & out of segments over time
– Need to understand the segment hierarchy
• How best to implement split tests
– Distinguish between ‘one-off’ and ‘cohort’ tests
– Be aware of pitfalls when using random samples
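One way to read the one-off versus cohort distinction: a cohort test needs each supporter to stay in the same cell across appeals, whereas a one-off test can be re-drawn every time. Re-drawing a fresh random sample for each appeal would quietly break a cohort test. The sketch below shows one possible approach, hashing the supporter ID with a test name so that assignment is repeatable. The function name, test names and 10% control share are assumptions for illustration.

```python
import hashlib


def assign_cell(supporter_id, test_name, control_share=0.1):
    """Deterministically assign a supporter to 'control' or 'test'.

    The same (supporter_id, test_name) pair always lands in the same cell,
    so no lookup table of past assignments is needed.
    """
    digest = hashlib.sha256(f"{test_name}:{supporter_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    return "control" if bucket < control_share else "test"


# Cohort test: reuse one test name across appeals, so cells never change.
print(assign_cell(12345, "welcome-journey-cohort"))

# One-off test: put the appeal code in the name, so each appeal is re-drawn.
print(assign_cell(12345, "cash-appeal-jan08"))
```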
Clustering
• Theory – create groups of like-minded supporters to treat the same
• Can be done using stats techniques (cluster analysis, factor analysis, CHAID) or rules based
• RFV is a form of clustering
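To see why RFV counts as rules-based clustering, note that it simply bands three measures and treats everyone in the same cell alike. A minimal sketch follows. The recency and frequency band edges are invented for illustration, and the value edges borrow the Donations Ever bands from the univariate example. None of this is a recommended banding.

```python
from bisect import bisect_right

# Band edges are assumptions for illustration, not recommended values.
RECENCY_EDGES = [6, 12, 24]             # months since last gift
FREQUENCY_EDGES = [2, 5, 10]            # number of gifts
VALUE_EDGES = [10, 50, 100, 500, 1000]  # £ given ever


def band(value, edges):
    """0-based band number: how many edges the value has reached."""
    return bisect_right(edges, value)


def rfv_cell(months_since_last_gift, gift_count, total_given):
    """Label a supporter with an RFV cell such as 'R0F1V2'."""
    r = band(months_since_last_gift, RECENCY_EDGES)
    f = band(gift_count, FREQUENCY_EDGES)
    v = band(total_given, VALUE_EDGES)
    return f"R{r}F{f}V{v}"


print(rfv_cell(3, 1, 75.0))     # recent, single gift, mid value
print(rfv_cell(30, 12, 1500.0)) # lapsed, frequent, high value
```

Statistical clustering differs only in that the groups are derived from the data rather than from edges chosen in advance.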
Clustering Example from World Vision:
• Various segmentations built:
– Market Research survey output
– Full supporter base
– UK population (based on census and supporters)
• Hybrid created to join these together
• Used to drive warm campaigns and cold activity
• Including how to deal with TV and Radio responders and siting posters
Optimisation
• Theory – maximise income across segments
• For example:
– £100k for the campaign
– Need to maximise return over 3 years
– Who are the next supporters to contact?
• Combination of predictive models and operational research (OR) tools
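The budget question can be sketched as a selection problem: given model-based estimates of cost and expected income for each possible contact, pick the contacts that give the most return within the budget. The example below uses a simple greedy heuristic as a stand-in. A real OR tool would more likely use linear or integer programming, and every figure and name here is hypothetical.

```python
def allocate_budget(candidates, budget):
    """Greedy heuristic: best expected return per pound first.

    candidates: list of (supporter_id, appeal, cost, expected_income), where
    expected_income would come from the appeal-level predictive models.
    At most one appeal is chosen per supporter.
    """
    ranked = sorted(candidates, key=lambda c: c[3] / c[2], reverse=True)
    chosen, used, spent = [], set(), 0.0
    for supporter_id, appeal, cost, income in ranked:
        if supporter_id in used or income <= cost or spent + cost > budget:
            continue  # already contacted, loss-making, or over budget
        chosen.append((supporter_id, appeal))
        used.add(supporter_id)
        spent += cost
    return chosen, spent


# Hypothetical model outputs: (supporter, appeal, cost £, expected 3-year income £)
candidates = [
    (1, "cash", 1.0, 3.0),
    (1, "upgrade", 2.0, 8.0),
    (2, "cash", 1.0, 0.5),
    (3, "raffle", 1.0, 2.0),
]
print(allocate_budget(candidates, budget=3.0))
```

Greedy selection is easy to explain but can miss better combinations when costs vary widely, which is one reason a proper optimisation tool is needed in practice.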
Optimisation – WWF Experience
• Investigated two years ago
• Requires models for all key appeal types
• …plus an optimisation tool to maximise return given budget constraints and a set of supporter-level model scores
• Parked for now – although recognised as important
Combinations
• Theory – combine more than one technique to better effect
• Most likely predictive modelling with clustering:
– Build clusters
– Run models across clusters
– Use output for selection and creative testing
Combinations Example - BHF
• Models built for different types of comms (cash appeal, upgrade, raffle)
• Segments of database created based on transactions
(i.e. one-off givers, gamers, engaged)
• Cross-tabs of segments and models created
• Selection criteria agreed
• Models and segments implemented
• Monthly monitoring, modification and measurement
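The cross-tab step can be sketched as a count of mailed supporters and responders in each segment-by-score-band cell. The code below is an illustration only: the segment names echo those listed above, while the score bands and records are invented, not BHF data.

```python
from collections import defaultdict


def crosstab_response(records):
    """records: iterable of (segment, score_band, responded) with responded 0 or 1.

    Returns {(segment, score_band): (mailed, response_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [mailed, responders]
    for segment, score_band, responded in records:
        cell = counts[(segment, score_band)]
        cell[0] += 1
        cell[1] += responded
    return {key: (n, r / n) for key, (n, r) in counts.items()}


# Invented records for illustration only.
records = [
    ("one-off givers", "high", 1),
    ("one-off givers", "high", 0),
    ("one-off givers", "low", 0),
    ("gamers", "high", 1),
    ("gamers", "low", 0),
    ("gamers", "low", 0),
]
for cell, (mailed, rate) in sorted(crosstab_response(records).items()):
    print(cell, mailed, f"{rate:.0%}")
```

Cells with a high response rate in both dimensions are natural candidates for selection, and differences between segments within a score band can guide creative testing.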
Event/Triggers
• Theory – Use transactional data (on and off-line) to detect triggers that identify changes in behaviour. Use these triggers to drive contact
• Response rates are generally higher
• Requires daily (at least) data for detection
• Lots of test and learn
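A daily detection run can be as simple as comparing each of today's events with the supporter's own history. The sketch below is a toy version under stated assumptions: the three trigger types, the event names and the "three times the usual gift" threshold are all invented, and a production system would test and tune many more rules.

```python
from statistics import mean


def detect_triggers(past_gifts, todays_events, jump_ratio=3.0):
    """Flag behaviour changes in one day's events.

    past_gifts: {supporter_id: [previous gift amounts]}
    todays_events: list of (supporter_id, event_type, amount)
    Returns a list of (supporter_id, trigger_name).
    """
    triggers = []
    for supporter_id, event_type, amount in todays_events:
        history = past_gifts.get(supporter_id, [])
        if event_type == "dd_cancelled":
            triggers.append((supporter_id, "dd_cancelled"))
        elif event_type == "gift" and not history:
            triggers.append((supporter_id, "first_gift"))
        elif event_type == "gift" and amount >= jump_ratio * mean(history):
            triggers.append((supporter_id, "unusually_large_gift"))
    return triggers


past = {1: [10.0, 10.0, 10.0], 2: []}
today = [
    (1, "gift", 50.0),
    (2, "gift", 5.0),
    (3, "dd_cancelled", 0.0),
    (1, "gift", 12.0),
]
print(detect_triggers(past, today))
```

Each trigger would then be mapped to a contact (for example a thank-you call or a retention mailing), with the rules refined through test and learn.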
Event/Triggers Example - RBS
• Daily flushing of data through detection
• Identifies all significant changes in activity that day
• Create campaigns (e-mail, mail, phone) based on trigger
• Very fast turnaround on campaigns
• Approx 25% increase on response rate to campaigns
Summary
• In theory, a lot can be done in segmentation
• However – take one step at a time
• Easy to get carried away
• Sometimes the simplest solution or combination of solutions is the best
• Occam’s razor!
Questions?