TRANSCRIPT
Analysing Customer Journeys to Predict
Behaviour
Adrian Carr
A Customer Journey Example
[Figure: timeline of one customer’s events from January to March, with rows for Outbound, Branch, Inbound, Account log in and Competitor Browsing, ending in an Outcome]
All companies try to predict outcomes, e.g. sales or churn, or even sub-outcome events leading up to the
target outcome. These events can occur across multiple channels, inbound and outbound, and they can also be trigger
events captured directly from the data.
It’s complicated
Terminology
Path = Journey = Sequence
All of the above mean ‘what happened between two points in time’
Event = Step = Point
All of the above are the ‘what’ in ‘what happened between two points in time’
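The terminology above maps naturally onto a small data structure. The following is a minimal sketch, not anything from the slides: the `Event` class, the `path_between` helper, the event names, channels and dates are all hypothetical, chosen only to illustrate ‘what happened between two points in time’.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Event:
    """One event (step, point): the 'what', plus when and where it happened."""
    name: str            # hypothetical label, e.g. "account_log_in"
    channel: str         # hypothetical label, e.g. "web" or "branch"
    timestamp: datetime


def path_between(events: List[Event], start: datetime, end: datetime) -> List[Event]:
    """Return the path (journey, sequence): events inside the window, in time order."""
    in_window = [e for e in events if start <= e.timestamp <= end]
    return sorted(in_window, key=lambda e: e.timestamp)


# Hypothetical events for one customer (dates invented for illustration).
events = [
    Event("competitor_browsing", "web", datetime(2024, 3, 2)),
    Event("account_log_in", "web", datetime(2024, 1, 15)),
    Event("branch_visit", "branch", datetime(2024, 2, 10)),
]

journey = path_between(events, datetime(2024, 1, 1), datetime(2024, 2, 28))
print([e.name for e in journey])
```

Keeping the timestamp on each event means the same list can be cut into different paths simply by changing the two points in time.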
Even though this is just an example, it is already very complicated.
• Just one customer
• 17 events
• Which events are relevant?
• Is the order of events important?
• Did some people have the same sequence of events, but not the same outcome?
• What about demographics and account data – is that relevant?
Sankey Diagrams
Who is / was Sankey?
A. A Clever Person in SAS R&D?
B. A Russian Mathematician?
C. A Railway Engineer?
D. A Mild Mannered Janitor?
Sankey diagrams are lovely… but they aren’t easy to draw in .ppt, they
aren’t simple, and they only ever cover a fraction of the universe.
So let’s start simply: an arrow represents time, and the events on
this arrow form the customer journey
This path contains multiple events, across multiple
channels (e.g. web, phone, social, etc)
Each circle represents an event, which could have come from something similar to the diagram below
Some of which are significant objectives or goals of an organisation,
which one would want to predict and dis/encourage (e.g. churn, or
product sale or conversion, or sub conversion)
Examples –
• Putting something in basket
• Downloading a white paper
• Completing purchase
• Posting an application form
• Calling a call centre
• Responding to an offer
• Accepting an offer
• Visiting a store
And some of which are irrelevant when predicting the
objective
Dropping the irrelevant events makes a
problem simpler
Back to our journey, which now contains only relevant events… cutting the time
frame (or length of sequence) of the analysis down
also makes life more manageable
The ‘word on the street’ / ‘grapevine’ is
that the length of a journey is best
measured in number of events, and that a
journey is 3 to 5 events long.
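The two simplifications described above (dropping irrelevant events, then shortening the sequence) can be sketched in a few lines. This is an illustration under assumptions: the event names and the `distil_path` helper are hypothetical, and keeping the most recent events, rather than the earliest, is a choice made here, not something the slides specify.

```python
from typing import Iterable, List, Set


def distil_path(path: Iterable[str], relevant: Set[str], max_events: int = 5) -> List[str]:
    """Drop irrelevant events, then keep only the last `max_events` that remain."""
    if max_events <= 0:
        raise ValueError("max_events must be positive")
    kept = [event for event in path if event in relevant]
    return kept[-max_events:]


# Hypothetical raw path for one customer, oldest event first.
raw_path = ["log_in", "view_page", "basket_add", "log_in", "view_page",
            "call_centre", "offer_response", "store_visit", "offer_accept", "purchase"]

# Hypothetical set of events judged relevant to the objective.
relevant_events = {"basket_add", "call_centre", "offer_response",
                   "store_visit", "offer_accept", "purchase"}

short_path = distil_path(raw_path, relevant_events, max_events=5)
print(short_path)
```

Filtering first and truncating second means the 3 to 5 events that survive are all relevant ones.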
Focussing on the decision that an organisation can make
to influence the objective then becomes an easier task
These can be
considered as
‘intervention points’
These can be
considered to be batch
or real time too.
An example
Customer starts to download a new film that is 3 GB in size
Customer has less than 2 GB remaining of their inclusive download allowance
Offer of a data snack of 2 GB
Customer responds to offer.
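The download example amounts to a simple real-time rule at an intervention point. A minimal sketch, assuming a hypothetical `data_snack_offer` function and message text (neither comes from the slides):

```python
from typing import Optional


def data_snack_offer(download_gb: float, remaining_gb: float,
                     snack_gb: float = 2.0) -> Optional[str]:
    """Return an offer if the download would exceed the remaining allowance, else None."""
    if download_gb > remaining_gb:
        return f"Offer: add a {snack_gb:g} GB data snack"
    return None


# A 3 GB film with less than 2 GB of allowance left triggers the offer.
print(data_snack_offer(download_gb=3.0, remaining_gb=1.5))

# A small download within the allowance triggers nothing.
print(data_snack_offer(download_gb=1.0, remaining_gb=1.5))
```

Whether the customer responds to the offer is itself an event, which can feed the next decision on the path.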
…and this then easily extends to multiple intervention
points across multiple paths
And sometimes the goal is not achieved, but again, this
can form an input to the next decisioning path
…similarly, ‘sub conversions’ can be the objective of an
activity, or form the entry to the next path (though of
course the customer is just on one journey)
In summary
Our inputs are a distilled set of
paths that are relevant to
driving a decision that can
drive a positive outcome
Our decisioning is now
referenced at the point of
potential intervention, i.e. the
different times where we can
take action, with our desire
being to influence towards a
positive outcome goal
And we are driving the goals
(or sub goals) that we want a
customer journey to lead to
And these customer paths sit as a foundation source of
insight into the SAS Customer Decision Hub….which can
be optimised
Digging a bit deeper……
So we said before ‘Dropping the irrelevant events makes a
problem simpler’, let’s dig deeper into that ‘irrelevant’
definition…
An example of relevant vs irrelevant
This is some data that records events happening
(e.g. bill shock, or dropped call), and a positive
outcome (e.g. churn, or ‘called call centre’)
On first inspection, both events seem predictive –
there are ten events of each type occurring and
there are ten outcomes that are also
occurring…..but when you look deeper…..
Customer  Event 1  Event 2  Positive Outcome
1         1        1        1
2         1        0        1
3         1        0        1
4         0        0        0
5         0        1        0
6         0        1        0
7         1        0        1
8         1        1        1
9         0        0        0
10        1        0        1
11        1        0        0
12        0        1        0
13        1        0        1
14        0        1        0
15        0        1        0
16        0        0        0
17        0        1        1
18        1        1        1
19        1        1        1
20        0        0        0
Total     10       10       10
Question: which events are relevant?
A. Neither
B. Both
C. Event 1 Only
D. Event 2 Only
….looking deeper….
When Event 1 occurs (e.g. ‘Bill
Shock’), the positive outcome
occurs 90% of the time
But when Event 2 occurs (e.g. dropped call),
the positive outcome only occurs 50% of the
time…..the same frequency as when Event 2
doesn’t happen.
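Event 1 by Positive Outcome (counts of customers)
Event 1   Outcome = 0   Outcome = 1   Hit Rate
0         9             1             10%
1         1             9             90%
Total     10            10            50%

Event 2 by Positive Outcome (counts of customers)
Event 2   Outcome = 0   Outcome = 1   Hit Rate
0         5             5             50%
1         5             5             50%
Total     10            10            50%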
We can ‘attribute’ the positive outcome occurring to Event 1
occurring. We can also say that Event 2 is ‘irrelevant’, and therefore
we can drop it from any path analysis.
N.B. In reality, the combination may also need analysing; this is for example purposes only.
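The hit rates quoted above can be reproduced directly from the 20-customer table. The sketch below copies the rows from that table; the `hit_rate` helper is a hypothetical name introduced for illustration.

```python
# Rows copied from the 20-customer table: (customer, event_1, event_2, positive_outcome)
rows = [
    (1, 1, 1, 1), (2, 1, 0, 1), (3, 1, 0, 1), (4, 0, 0, 0), (5, 0, 1, 0),
    (6, 0, 1, 0), (7, 1, 0, 1), (8, 1, 1, 1), (9, 0, 0, 0), (10, 1, 0, 1),
    (11, 1, 0, 0), (12, 0, 1, 0), (13, 1, 0, 1), (14, 0, 1, 0), (15, 0, 1, 0),
    (16, 0, 0, 0), (17, 0, 1, 1), (18, 1, 1, 1), (19, 1, 1, 1), (20, 0, 0, 0),
]


def hit_rate(rows, event_col, event_value):
    """Share of customers with a positive outcome among those whose event flag equals event_value."""
    outcomes = [r[3] for r in rows if r[event_col] == event_value]
    return sum(outcomes) / len(outcomes)


for col, name in ((1, "Event 1"), (2, "Event 2")):
    occurred = hit_rate(rows, col, 1)
    not_occurred = hit_rate(rows, col, 0)
    print(f"{name}: {occurred:.0%} when it occurs, {not_occurred:.0%} when it does not")
```

An event whose hit rate is the same whether or not it occurs, as with Event 2 here, carries no information about the outcome on its own.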
Many of you will be aware of the attribution techniques /
options that exist when considering digital spend…..
The successful
goals (e.g.
completed
purchase) are found
The lead up events are
known (e.g. customer
searched for ‘lovely
wine’ in Google)
One of the traditional
methods is used to
‘attribute’ the success to
the action
None of these are analytical – these are ‘rules based counting, whilst ignoring most
of the things that need to be counted’
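To make ‘rules based counting’ concrete, here is a sketch of two commonly used rules, first-touch and last-touch. The slides do not name which traditional methods were meant, so the choice of these two rules, the `rule_based_attribution` helper and the event names are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


def rule_based_attribution(converting_paths: List[List[str]], rule: str = "last") -> Dict[str, int]:
    """Give all the credit for each converted path to one event picked by a fixed rule."""
    if rule not in ("first", "last"):
        raise ValueError("rule must be 'first' or 'last'")
    credit = Counter()
    for path in converting_paths:
        if not path:
            continue  # nothing to credit
        credited_event = path[-1] if rule == "last" else path[0]
        credit[credited_event] += 1
    return dict(credit)


# Hypothetical lead-up events for three completed purchases.
paths = [
    ["search_lovely_wine", "email_click", "site_visit"],
    ["email_click", "site_visit"],
    ["search_lovely_wine", "site_visit"],
]

print(rule_based_attribution(paths, "last"))
print(rule_based_attribution(paths, "first"))
```

Note what the rule never looks at: paths that did not convert, and every event other than the one it picks. That is the sense in which such rules ignore most of the things that need to be counted.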
It is a potentially simple extension to traditional modelling
methods
[Figure: event paths for Cust A, Cust B and Cust C along a time arrow]
cust  1  2  3  4  Goal
A     1  1  1  1  1
B     1  1  1  0  1
C     1  1  1  1  0
The paths can easily be
represented as data, and
easily considered in a
predictive model
The goal is used as the
variable to be predicted,
and the events are the
predictive input variables.
This is then a potentially
smart way to identify if 4 is
truly predictive or not
Caveat – traditional logistic regression usually only picks out 10-15 variables per goal,
so additional intelligence or other methods should be considered
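Representing paths as data, as described above, can be sketched as follows: one row per customer, one 0/1 column per event, and the goal as the variable to predict. The event lists are read off the three-customer table; the variable and function names are hypothetical.

```python
from typing import Dict, List

# Events each customer experienced, read off the table (order is ignored here).
customer_events: Dict[str, List[int]] = {
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3],
    "C": [1, 2, 3, 4],
}
goal = {"A": 1, "B": 1, "C": 0}
ALL_EVENTS = [1, 2, 3, 4]


def to_model_row(events: List[int]) -> List[int]:
    """One 0/1 indicator per event type: the predictive input variables."""
    present = set(events)
    return [1 if e in present else 0 for e in ALL_EVENTS]


customers = sorted(customer_events)
X = [to_model_row(customer_events[c]) for c in customers]  # inputs
y = [goal[c] for c in customers]                           # variable to be predicted

print(X)
print(y)
```

`X` and `y` are in the shape most modelling tools expect, so they could be passed to a logistic regression or similar. No model is fitted here because three rows are far too few to say anything about event 4.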
Why is this different to normal analytics?
[Figure: event paths for Cust A, Cust B and Cust C along a time arrow]
cust  1  2  3  4  Goal
A     1  1  1  1  1
B     1  1  1  0  1
C     1  1  1  1  0
The only thing we are
missing here compared to
path analytics is….
The order of events
Why is this different to normal analytics?
[Figure: event paths for Cust A, Cust B, Cust C and Cust D along a time arrow; Cust D’s path is 1, 3, 2, 4]
cust  1  2  3  4  2,3  3,2  Goal
A     1  1  1  1  1    0    1
B     1  1  1  0  1    0    1
C     1  1  1  1  1    0    0
D     1  1  1  1  0    1    1
Sequence style variables
can easily be created to be
represented in a normal
model.
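A sequence-style variable such as ‘2,3’ can be derived from an ordered path with a small helper. In this sketch, ‘2,3’ is read as ‘event 2 occurs before event 3’, which is an interpretation, not something the slides define. Cust D’s order (1, 3, 2, 4) is given on the slide; the orders used for Cust A to C are assumed, chosen to be consistent with the table.

```python
from typing import List

# Ordered paths. D's order comes from the slide; A to C are assumed orders
# that agree with the '2,3' and '3,2' columns in the table.
paths = {
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3],
    "C": [1, 2, 3, 4],
    "D": [1, 3, 2, 4],
}


def occurs_before(path: List[int], first: int, second: int) -> int:
    """1 if both events are on the path and `first` appears before `second`, else 0.

    Uses the first occurrence of each event if an event repeats.
    """
    if first in path and second in path:
        return int(path.index(first) < path.index(second))
    return 0


for cust, path in paths.items():
    print(cust, occurs_before(path, 2, 3), occurs_before(path, 3, 2))
```

The resulting 0/1 columns sit alongside the plain event indicators, so a normal model can use them without any special path-analysis tooling.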
One could argue that there is no
point in doing path analytics, unless
these ‘ordered combination
variables’ add more discriminative
power over and above existing data
More pictorially….
Only if you build two models – and
compare them, will you identify how
much the order of the events is
actually incrementally predictive.
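The two-model comparison can be sketched mechanically on the four-customer table. This is only an illustration of the idea: the ‘model’ here is a stand-in that predicts the observed goal rate for each feature pattern, scored in-sample with a mean squared error (Brier score). A real comparison would use proper models, many more customers and a hold-out sample. The function name is hypothetical.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

# Rows from the four-customer table: (1, 2, 3, 4, "2,3", "3,2", Goal)
table = {
    "A": (1, 1, 1, 1, 1, 0, 1),
    "B": (1, 1, 1, 0, 1, 0, 1),
    "C": (1, 1, 1, 1, 1, 0, 0),
    "D": (1, 1, 1, 1, 0, 1, 1),
}


def brier_score(features: List[Tuple[int, ...]], goals: Sequence[int]) -> float:
    """Stand-in model: predict the goal rate seen for each feature pattern,
    then return the mean squared error of those predictions (lower is better)."""
    by_pattern = defaultdict(list)
    for x, g in zip(features, goals):
        by_pattern[x].append(g)
    rate = {x: sum(gs) / len(gs) for x, gs in by_pattern.items()}
    return sum((g - rate[x]) ** 2 for x, g in zip(features, goals)) / len(goals)


goals = [row[6] for row in table.values()]
events_only = [row[:4] for row in table.values()]        # model 1: events, no order
events_plus_order = [row[:6] for row in table.values()]  # model 2: events plus order variables

without_order = brier_score(events_only, goals)
with_order = brier_score(events_plus_order, goals)
print(f"without order: {without_order:.3f}, with order: {with_order:.3f}, "
      f"improvement: {without_order - with_order:.3f}")
```

On these four rows the score improves from about 0.167 to 0.125 once the order variables are added, because they separate Cust D from Cust A and Cust C. With so little data that difference means nothing statistically; the point is only that the increment is measured by comparing the two models.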
Credit Card Sales Journeys…
[Figure: two credit card sales journeys]
Cust A, ‘New School’ / ‘Digital Marketing’: Online browsing for credit card → Email Sent → Customer Applies → Successful Application
Cust B, ‘Old School’ Marketing: Credit Card Response Score >200 → Email Sent → Customer Applies → Successful Application
Q. Why not simply consider
model scores as events
within the path, i.e.
dummy events?
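Treating a model score as a dummy event could look like the sketch below. The threshold of 200 comes from the slide; the `add_score_event` helper, the event labels and the decision to place the score event at the start of the path are assumptions made for illustration.

```python
from typing import List


def add_score_event(path: List[str], score: float, threshold: float = 200,
                    label: str = "response_score_gt_200") -> List[str]:
    """Return a copy of the path with a dummy event prepended when the score exceeds the threshold."""
    if score > threshold:
        return [label] + path
    return list(path)


# Hypothetical 'old school' journey: the score qualifies the customer for the email.
campaign_path = ["email_sent", "customer_applies", "successful_application"]

print(add_score_event(campaign_path, score=240))
print(add_score_event(campaign_path, score=150))
```

Once the score is just another event on the path, score-driven campaigns and event-driven journeys can be analysed with the same path representation.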
So now our decision hub is driven by both relevant paths
and relevant scores
Some paths will be purely event driven
Others may benefit from the inclusion of scores (perhaps even make a trigger campaign work better)
And other paths are just what we used to call campaigns, based on score-based selection criteria
Questions?