Leveraging Regularity in Predicting Customer Lifetime Value
Michael Platzer & Thomas Reutterer
Seminar Riezlern 2016
Warm Up
[Figure: purchase timelines of Customer A and Customer B, from 1-Jan-16, 09:00 to 21-Jun-16, 10:28]
1) Which customer would you prefer? The regular one, or the clumpy one?
2) Which type of customer is more prevalent? The regular ones, or the clumpy ones?
Two customers – Same Recency, Same Frequency
1. Intro to BTYD models
2. On the Subject of Regularity
3. Our Pareto/GGG model
4. Our (M)BG/CNBD-k model
5. Our BTYDplus R package
dead?
non-contractual setting
A customer keeps purchasing until she stops. However, the dropout event is not observed.
Buy-Till-You-Die
alive!
Key Issues in the Management of Customer Relationships
Given: Purchase history of customer cohort in non-contractual setting.
Example:
CD Sales
Broadening the context:
• purchase ≈ transaction ≈ event …
• customer relationship ≈ channel activity ≈ service activity …
Questions:
How valuable is that cohort?
How many purchases to expect?
Who will still be active?
Who will be most active?
When will next purchase take place?
BTYD “Gold Standard”: Pareto/NBD (Schmittlein, Morrison and Colombo, 1987)
Assumptions
1. Purchase process (while ‘alive’): NBD (Ehrenberg 1959)
• Purchases follow a Poisson process, i.e. exponentially distributed inter-transaction times, itt_{i,j} ~ Exponential(λ_i)
• λ_i are Gamma(r, α) distributed across customers
2. Dropout (‘death’) process: Pareto
• (Unobserved) customer’s lifetime is exponentially distributed, lifetime τ_i ~ Exponential(μ_i)
• μ_i are Gamma(s, β) distributed across customers
3. λ and μ vary independently
→ parameter estimation of (r, α, s, β) via Maximum Likelihood
→ closed-form solutions for key expressions: P(alive), # of future purchases
→ require only recency/frequency summary statistics (x, tx, T) per customer
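The three assumptions can be turned into a small data generator. The following is a minimal R sketch, not taken from the slides: `simulate_pnbd()` is a hypothetical helper, α and β are assumed to be Gamma rate parameters, and the parameter values are borrowed for illustration only. The initial purchase at time 0 is not counted, so `x` holds repeat purchases.

```r
# Sketch only: simulate_pnbd() is a hypothetical helper, not a BTYDplus function.
set.seed(1)

simulate_pnbd <- function(n, r, alpha, s, beta, T.cal) {
  lambda <- rgamma(n, shape = r, rate = alpha)  # purchase rates ~ Gamma(r, alpha), alpha assumed to be a rate
  mu     <- rgamma(n, shape = s, rate = beta)   # dropout rates ~ Gamma(s, beta), beta assumed to be a rate
  tau    <- rexp(n, rate = mu)                  # unobserved lifetimes ~ Exponential(mu)
  stats <- sapply(seq_len(n), function(i) {
    horizon <- min(tau[i], T.cal)               # purchases stop at death or at end of observation
    times <- numeric(0)
    t_now <- rexp(1, rate = lambda[i])          # exponential inter-transaction times
    while (t_now <= horizon) {
      times <- c(times, t_now)
      t_now <- t_now + rexp(1, rate = lambda[i])
    }
    c(x = length(times),                                   # frequency (repeat purchases)
      t.x = if (length(times) > 0) max(times) else 0,      # recency
      T.cal = T.cal)                                       # observation length
  })
  as.data.frame(t(stats))
}

# Illustrative parameter values (an assumption, not estimates for any real cohort)
cbs_sim <- simulate_pnbd(n = 1000, r = 0.55, alpha = 10.74, s = 0.66, beta = 12.51, T.cal = 52)
head(cbs_sim)
```

The output has one row per customer with exactly the summary statistics (x, t.x, T.cal) that the closed-form expressions require.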
BTYD Models
• BG/NBD (Fader, Hardie, and Lee 2005): Discrete time defection process (after any transaction) instead of continuous
• MBG/NBD (Batislam et al. 2007), CBG/NBD (Hoppe and Wagner 2007): Customers can drop out at time zero (immediately after first purchase)
• PDO/NBD (Jerath et al. 2011): Defection opportunities tied to calendar time (indep. of transaction timing)
• GG/NBD (Bemmaor and Glady 2012): Flexible lifetime model, departing from exponential (Gamma-Gompertz)
• Pareto/NBD variant (Abe 2009): Hierarchical Bayes extension of Pareto/NBD (dependencies of λ_i and μ_i)
→ All modify dropout process, but not purchase process
2. On the Subject of Regularity
Regularity improves Predictability
[Figure: event timeline split into past and future. Next event?]
Well, so what?
still alive?
A
B
Buy-Till-You-Die Setting
Customers A and B exhibit the same Recency and Frequency, yet we come to different assessments regarding P(alive).
• Erlang-k: Herniter (1971)
• Gamma: Wheat & Morrison (1990)
• CNBD: Chatfield and Goodhardt (1973); Schmittlein and Morrison (1983); Morrison and Schmittlein (1988)
• CNBD Models: Gupta (1991); Wu and Chen (2000); Schweidel and Fader (2009)
Regularity in Purchase Timings
• RFMC: Zhang, Bradlow and Small (2015)
Irregularity in Purchase Timings
Empirical Findings
Data Sets
• Grocery: k_wheat = 2.5
• Donations: k_wheat = 2.2
• Health Supplements: k_wheat = 2.1
• Office Supply: k_wheat = 1.8
• CD Sales: k_wheat = 1.0
• Fashion & Accessories: k_wheat = 0.6
Grocery Categories
• Coffee pads: k_wheat = 3.1
• Detergents: k_wheat = 2.8
• Toilet Paper: k_wheat = 2.8
• Cat food: k_wheat = 2.8
• …
• Light bulbs: k_wheat = 1.9
• Cosmetics & perfumes: k_wheat = 1.6
• Sparkling Wine: k_wheat = 1.6
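The k_wheat values presumably refer to the regularity estimator of Wheat & Morrison (1990) cited earlier; that link is an assumption, as the slides do not spell out the computation. A minimal R sketch of the idea: for two consecutive Gamma(k) inter-transaction times of the same customer, the ratio M = itt1 / (itt1 + itt2) is Beta(k, k) distributed with Var(M) = 1 / (4(2k + 1)), independent of the customer's purchase rate, so k can be recovered as (1 - 4 Var(M)) / (8 Var(M)). The function name `wheat_k` is hypothetical.

```r
set.seed(42)

# Hypothetical helper: method-of-moments estimate of the Gamma shape k
# from pairs of consecutive inter-transaction times (one pair per customer).
wheat_k <- function(itt1, itt2) {
  m <- itt1 / (itt1 + itt2)   # scale-free ratio, Beta(k, k) under Gamma(k) timings
  v <- var(m)
  (1 - 4 * v) / (8 * v)       # invert Var(M) = 1 / (4 * (2k + 1))
}

# Simulated check: heterogeneous purchase rates, common regularity k = 2.5
n      <- 20000
k_true <- 2.5
rate   <- rgamma(n, shape = 0.5, rate = 5)            # illustrative heterogeneity in rates
itt1   <- rgamma(n, shape = k_true, rate = k_true * rate)
itt2   <- rgamma(n, shape = k_true, rate = k_true * rate)
k_hat  <- wheat_k(itt1, itt2)
k_hat
```

Because the ratio cancels the customer-specific rate, one pair of inter-transaction times per customer is enough to pool information across a cohort.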
3. Our Pareto/GGG model
Pareto/GGG Platzer and Reutterer, forthcoming
Customer Level
• Purchase Process: While alive, customer purchases with Gamma distributed waiting times; i.e. itt_{i,j} ~ Gamma(k_i, k_i λ_i)
• Dropout Process: Each customer remains alive for an exponentially distributed lifetime with death rate μ_i; i.e. lifetime τ_i ~ Exponential(μ_i)
Heterogeneity across Customers
• λ_i ~ Gamma(r, α)
• μ_i ~ Gamma(s, β)
• k_i ~ Gamma(t, γ)
• λ_i, μ_i, k_i vary independently
Pareto/GGG = Pareto/NBD + Varying Regularity
Gamma Distributed Interpurchase Times
• k=0.3: clumpy
• k=1: Exponential, random
• k=8: Erlang-8, regular
Coefficient of Variation = 1 / sqrt(k)
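The coefficient of variation follows directly from the Gamma(k, kλ) parameterization: the mean is k / (kλ) = 1/λ and the variance is k / (kλ)² = 1 / (kλ²), hence CV = 1 / sqrt(k) regardless of λ. A short simulation sketch in R illustrates this for the three shapes shown; the choice λ = 1/6 (a mean inter-transaction time of 6) is only an illustrative assumption.

```r
set.seed(7)

lambda <- 1 / 6                       # illustrative purchase rate (assumption)
ks     <- c(clumpy = 0.3, random = 1, regular = 8)

res <- sapply(ks, function(k) {
  itt <- rgamma(1e5, shape = k, rate = k * lambda)   # itt ~ Gamma(k, k * lambda)
  c(mean      = mean(itt),                           # should stay near 1 / lambda for every k
    cv        = sd(itt) / mean(itt),                 # empirical coefficient of variation
    cv_theory = 1 / sqrt(k))
})
round(res, 2)
```

The mean waiting time stays the same across the three columns while the dispersion shrinks as k grows, which is what separates regular from clumpy timing at equal purchase intensity.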
Pareto/GGG Estimation via MCMC
Component-wise Slice Sampling within Gibbs with Data Augmentation
☹ Significantly increased computational costs (2 mins for drawing 1’000 customers)
☺ but…
• Posterior Distributions instead of Point Estimates
• Also for Individual Level Parameters
• Direct Simulation of Key Metrics of Managerial Interest
• And only one additional summary statistic required
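To make the sampling step more concrete, here is a generic univariate slice sampler with stepping-out and shrinkage, written as a self-contained R sketch. It is not the package's implementation: `slice_sample` and its arguments are hypothetical names, and the Gamma(2, 1) target is only a stand-in for one component's conditional density. In a component-wise scheme such an update would be applied to one parameter at a time inside a Gibbs loop.

```r
set.seed(123)

# Hypothetical helper: univariate slice sampler (stepping-out + shrinkage).
# log_f: log of an unnormalized density; x0: start value with log_f(x0) > -Inf.
slice_sample <- function(log_f, x0, n, w = 1, lower = -Inf, upper = Inf) {
  draws <- numeric(n)
  x <- x0
  for (i in seq_len(n)) {
    log_y <- log_f(x) - rexp(1)            # log height of the slice: log(f(x) * U), U ~ Uniform(0, 1)
    L <- x - runif(1) * w                  # random initial interval of width w around x
    R <- L + w
    while (L > lower && log_f(L) > log_y) L <- L - w   # step out to the left
    while (R < upper && log_f(R) > log_y) R <- R + w   # step out to the right
    L <- max(L, lower)
    R <- min(R, upper)
    repeat {                               # shrink the interval until a point lies in the slice
      x_new <- runif(1, L, R)
      if (log_f(x_new) > log_y) break
      if (x_new < x) L <- x_new else R <- x_new
    }
    x <- x_new
    draws[i] <- x
  }
  draws
}

# Stand-in target: Gamma(shape = 2, rate = 1), which has mean 2
log_target <- function(x) dgamma(x, shape = 2, rate = 1, log = TRUE)
draws <- slice_sample(log_target, x0 = 1, n = 20000, lower = 0)
mean(draws)
```

Slice sampling needs only the unnormalized log density and no tuning beyond the initial width `w`, which makes it convenient for conditionals without a standard form.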
Simulation Study: Design
160 scenarios covering a wide range of parameter settings (similar to simulation design from BG/BB paper)
• N = {1000, 4000}
• r = {0.25, 0.75}, α = {5, 15}
• s = {0.25, 0.75}, β = {5, 15}
• (t, γ) = {(1.6, 0.4), (5, 2.5), (6, 4), (8, 8), (17, 20)}
=> Total of 400’000 simulated customers
=> Total of 64 billion individual-level parameter draws (via slice sampling)
Compare individual-level forecast accuracy of Pareto/GGG vs. Pareto/NBD in terms of mean absolute error (MAE). Study relative improvement in terms of MAE.
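The scenario count and the evaluation metric can be reproduced in a few lines of R. This is a sketch under stated assumptions: the helper names `mae` and `rel_lift` are hypothetical, the definition of relative lift as 1 - MAE_new / MAE_base is one plausible reading of "relative improvement", and the column `k_mean` assumes γ is a rate parameter so that E[k] = t / γ.

```r
# Full factorial design over the parameter settings listed on the slide
tg <- data.frame(t = c(1.6, 5, 6, 8, 17), gamma = c(0.4, 2.5, 4, 8, 20))
grid <- expand.grid(N = c(1000, 4000),
                    r = c(0.25, 0.75), alpha = c(5, 15),
                    s = c(0.25, 0.75), beta = c(5, 15),
                    tg_idx = seq_len(nrow(tg)))
grid <- cbind(grid[, 1:5], tg[grid$tg_idx, ])
grid$k_mean <- grid$t / grid$gamma     # mean regularity per scenario, assuming gamma is a rate

nrow(grid)     # number of scenarios
sum(grid$N)    # number of simulated customers

# Hypothetical helpers for the accuracy comparison
mae      <- function(actual, predicted) mean(abs(actual - predicted))
rel_lift <- function(mae_base, mae_new) 1 - mae_new / mae_base
```

A positive `rel_lift(mae_pnbd, mae_pggg)` would then indicate that the model with regularity forecasts more accurately than the baseline; both argument names are placeholders for per-scenario MAE values.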
Simulation Study: Regularity improves Predictability
• bigger lift for bigger regularity
• even for mildly regular patterns we see lift
• no lift for random and clumpy customers
Simulation Study: Lift in Predictive Accuracy by Segment
Simulation Study: Interplay of Recency, Frequency and Regularity
Assumptions: mean(itt) = 6 weeks, mean(lifetime) = 52 weeks
Same RF, but different P(alive) for different k! Particularly when customer is already “overdue”.
Regular customers are less likely and clumpy customers are more likely to be still alive, when compared to the randomly purchasing customer.
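The direction of this effect can be checked with a simplified individual-level calculation, sketched below in R. It is not the model's full P(alive) expression: it assumes the customer's parameters are known (mean itt = 6 weeks, mean lifetime = 52 weeks, as stated on the slide), ignores heterogeneity and frequency, and conditions only on the time `gap` since the last purchase. With S(u) the survival function of the Gamma(k, k/6) inter-transaction time and μ = 1/52, the probability of being alive after a purchase-free gap g is e^{-μg} S(g) / (e^{-μg} S(g) + ∫_0^g μ e^{-μu} S(u) du). The function name `p_alive` is hypothetical.

```r
# Hypothetical helper: P(alive | no purchase for `gap` weeks since the last purchase),
# for a single customer with known regularity k, mean itt and mean lifetime.
p_alive <- function(gap, k, mean_itt = 6, mean_life = 52) {
  mu <- 1 / mean_life
  surv_itt <- function(u) pgamma(u, shape = k, rate = k / mean_itt, lower.tail = FALSE)
  # still alive at the end of the gap and simply has not purchased yet
  alive <- exp(-mu * gap) * surv_itt(gap)
  # died at some time u within the gap, before the next purchase was due
  died <- integrate(function(u) mu * exp(-mu * u) * surv_itt(u), lower = 0, upper = gap)$value
  alive / (alive + died)
}

# Customer is "overdue": 12 weeks without purchase at a mean itt of 6 weeks
p <- sapply(c(clumpy = 0.3, random = 1, regular = 8), function(k) p_alive(gap = 12, k = k))
round(p, 2)
```

A long silence is unsurprising for a clumpy customer but highly unusual for a regular one, so the same gap is much stronger evidence of dropout when k is large.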
Empirical Findings
[Figure labels: regular, Poisson, clumpy]
→ regularity varies across but also within datasets
→ improved predictive accuracy for datasets with regular patterns
[Table columns: median(k), rel. lift in MAE]
→ estimates for next transaction timings differ when regularity is taken into consideration
4. Our (M)BG/CNBD-k model
(M)BG/CNBD-k (Platzer and Reutterer, forthcoming)
Customer Level
• Purchase Process: While alive, customer purchases with Erlang-k distributed waiting times; i.e. itt_{i,j} ~ Erlang-k(λ_i)
• Dropout Process: A customer drops out at a (re-)purchase event with probability p_i
Heterogeneity across Customers
• λ_i ~ Gamma(r, α)
• p_i ~ Beta(a, b)
• λ_i, p_i vary independently
BG/CNBD-k = BG/NBD + Fixed Regularity
MBG/CNBD-k = MBG/NBD + Fixed Regularity
Closed-Form Expressions
• Likelihood → 100-1000x faster parameter estimation via MLE than MCMC
• P(X(t)=x | r, α, a, b) → approximate Unconditional Expectation
• P(alive | r, α, a, b, x, tx, T) → key component for Conditional Expectation
• Conditional Expected Transactions → “pretty good” approximation possible
Erlang-k = Poisson with every kth event counted
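This equivalence is easy to verify by simulation: thin a Poisson process by keeping every kth event and compare the resulting waiting times with the Erlang-k (Gamma with integer shape k) moments. The R sketch below uses illustrative values k = 3 and rate = 0.5, which are assumptions and not taken from the slides.

```r
set.seed(11)

k    <- 3        # count only every 3rd event (illustrative)
rate <- 0.5      # rate of the underlying Poisson process (illustrative)
n    <- 50000    # number of counted events

poisson_events <- cumsum(rexp(n * k, rate = rate))        # event times of a Poisson process
counted        <- poisson_events[seq(k, n * k, by = k)]   # keep every kth event
itt            <- diff(c(0, counted))                     # waiting times between counted events

c(mean        = mean(itt),
  mean_theory = k / rate,             # Erlang-k mean
  cv          = sd(itt) / mean(itt),
  cv_theory   = 1 / sqrt(k))          # Erlang-k coefficient of variation
```

Each counted waiting time is the sum of k independent exponential gaps, which is exactly the Erlang-k distribution, and this is what makes closed-form expressions tractable for fixed integer k.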
Simulation Study: Design
324 scenarios covering a wide range of parameter settings – 5 repeats each (similar to simulation design from BG/NBD paper)
• N = 4000, T.cal = 52, T.star = {4, 16, 52}
• r = {0.25, 0.50, 0.75}, α = {5, 10, 15}
• s = {0.50, 0.75, 1.00}, β = {2.5, 5, 10}
• k = {1, 2, 3, 4}
=> total of 1’300’000 simulated customers
Compare individual-level forecast accuracy of Pareto/GGG vs. Pareto/NBD in terms of mean absolute error (MAE). Study relative improvement in terms of MAE.
Simulation Study: Example
Simulation Study: Results
Regularity improves Predictability
• bigger lift for bigger regularity
• even for mildly regular patterns we see lift
Empirical Findings: Results
Findings
1. MBG/NBD either on par or better than BG/NBD
2. MBG/CNBD-k sees lift in forecast accuracy, if regularity present
3. MBG/CNBD-k comes close to P/GGG
Yet to come: Study Lift by Retail Category
5. Our BTYDplus R package
BTYDplus
• https://github.com/mplatzer/BTYDplus
• GPL-3 license
• Implementations of
  • MBG/NBD – Batislam et al. (2007)
  • GammaGompertz/NBD – Bemmaor & Glady (2012)
  • (M)BG/CNBD-k – Platzer and Reutterer (forthcoming)
  • Pareto/NBD (MCMC) – Ma and Liu (2007)
  • Pareto/NBD variant (MCMC) – Abe (2009)
  • Pareto/GGG (MCMC) – Platzer and Reutterer (forthcoming)
• Fully tested and documented, incl. demos
• Vignette will be coming
…
Users
BTYDplus demo
Transform event-log to summary stats (optionally one can split data into calibration and holdout)
> elog
       cust       date
   1:     4 1997-01-18
   2:     4 1997-08-02
   3:     4 1997-12-12
   4:    18 1997-01-04
   5:    21 1997-01-01
  ---
6914: 23556 1997-07-26
6915: 23556 1997-09-27
6916: 23556 1998-01-03
6917: 23556 1998-06-07
6918: 23569 1997-03-25
> (cbs <- elog2cbs(elog, per="week", T.cal=as.Date("1997-09-30"), T.tot=as.Date("1998-06-30")))
       cust x       t.x      litt    T.cal T.star x.star
   1:     4 1 28.000000 3.3322045 36.42857     39      1
   2:    18 0  0.000000 0.0000000 38.42857     39      0
   3:    21 1  1.714286 0.5389965 38.85714     39      0
   4:    50 0  0.000000 0.0000000 38.85714     39      0
   5:    60 0  0.000000 0.0000000 34.42857     39      0
  ---
2353: 23537 0  0.000000 0.0000000 27.00000     39      2
2354: 23551 5 24.285714 5.5243721 27.00000     39      0
2355: 23554 0  0.000000 0.0000000 27.00000     39      1
2356: 23556 4 26.571429 6.3127713 27.00000     39      2
2357: 23569 0  0.000000 0.0000000 27.00000     39      0
cust = customer ID
calibration summary stats: x = Frequency, t.x = Recency, litt = Sum Over Logarithmic Intertransaction Times
holdout summary stats: T.star, x.star
BTYDplus demo: MBG/CNBD-k
> params <- mbgcnbd.EstimateParameters(cbs)
> round(params, 2)
    k     r alpha     a     b
 1.00  0.52  6.17  0.89  1.62
> cbs$xstar_est <- mbgnbd.ConditionalExpectedTransactions(params, cbs$T.star, cbs$x, cbs$t.x, cbs$T.cal)
> cbs$palive_est <- mbgnbd.PAlive(params, cbs$x, cbs$t.x, cbs$T.cal)
> cbs
       cust x       t.x      litt    T.cal T.star x.star palive_est xstar_est
   1:     4 1 28.000000 3.3322045 36.42857     39      1  0.6771113 0.7838636
   2:    18 0  0.000000 0.0000000 38.42857     39      0  0.3919457 0.1558104
   3:    21 1  1.714286 0.5389965 38.85714     39      0  0.1711458 0.1890291
   4:    50 0  0.000000 0.0000000 38.85714     39      0  0.3907532 0.1540336
   5:    60 0  0.000000 0.0000000 34.42857     39      0  0.4037292 0.1742668
  ---
2353: 23537 0  0.000000 0.0000000 27.00000     39      2  0.4294331 0.2206554
2354: 23551 5 24.285714 5.5243721 27.00000     39      0  0.8222069 3.9501015
2355: 23554 0  0.000000 0.0000000 27.00000     39      1  0.4294331 0.2206554
2356: 23556 4 26.571429 6.3127713 27.00000     39      2  0.8557381 3.4019351
2357: 23569 0  0.000000 0.0000000 27.00000     39      0  0.4294331 0.2206554
palive_est = P(alive), xstar_est = E(X(T+T.star))
BTYDplus demo: Pareto/GGG
> params_draws <- pggg.mcmc.DrawParameters(cbs)
> round(summary(params_draws$level_2)$quantiles[, "50%"], 2)
    t gamma     r alpha     s  beta
45.31 43.36  0.55 10.74  0.66 12.51
> est_draws <- mcmc.DrawFutureTransactions(cbs, params_draws, cbs$T.star)
> cbs$palive_est <- sapply(params_draws$level_1, function(draws) mean(as.matrix(draws)[, 'z']))
> cbs$xstar_est <- apply(est_draws, 2, mean)
> cbs
       cust x       t.x      litt    T.cal T.star x.star palive_est xstar_est
   1:     4 1 28.000000 3.3322045 36.42857     39      1       0.92      0.77
   2:    18 0  0.000000 0.0000000 38.42857     39      0       0.26      0.08
   3:    21 1  1.714286 0.5389965 38.85714     39      0       0.17      0.11
   4:    50 0  0.000000 0.0000000 38.85714     39      0       0.33      0.05
   5:    60 0  0.000000 0.0000000 34.42857     39      0       0.34      0.27
  ---
2353: 23537 0  0.000000 0.0000000 27.00000     39      2       0.38      0.15
2354: 23551 5 24.285714 5.5243721 27.00000     39      0       0.95      4.55
2355: 23554 0  0.000000 0.0000000 27.00000     39      1       0.36      0.17
2356: 23556 4 26.571429 6.3127713 27.00000     39      2       1.00      3.41
2357: 23569 0  0.000000 0.0000000 27.00000     39      0       0.51      0.31
palive_est = P(alive), xstar_est = E(X(T+T.star))
Appendix
• C Measure by Zhang, Bradlow, Small 2015
• MCMC Sampling Scheme
ZBS: Clumpiness Measure C, a metric-based approach
Predicting Customer Value Using Clumpiness: From RFM to RFMC (Zhang, Bradlow, Small)
• Introduce metric C which captures the “non-randomness” in timing patterns
• Straightforward calculation at individual-level;
• Useful for descriptive analysis and segmentation;
Main Empirical Findings
• Capturing timing patterns adds predictive power
• When controlling for R and F, clumpy customers tend to be more active in the future
both findings are supported and can be explained by our model-based approach
Shortcomings
• Requires many transactions at individual-level
• Metric C will be skewed when dealing with different acquisition dates and churn settings
both are appropriately handled by our model-based approach
→ sparse individual-level data mandates a model-based approach
Parameter Estimation via MCMC
Component-wise Slice Sampling within Gibbs with Data Augmentation