Three case studies deploying cluster analysis

SFbayACM.org Data Science Camp Saturday, October 25, 2014 Greg Makowski Twitter Tag #DMCAMP

Upload: greg-makowski

Post on 13-Jul-2015


TRANSCRIPT

Page 1: Three case studies deploying cluster analysis

SFbayACM.org Data Science Camp, Saturday, October 25, 2014

Greg Makowski

Twitter Tag #DMCAMP

Page 2: Three case studies deploying cluster analysis

Customer Description – CC Company – “Who” vs. “How” to Talk to Customers

Hotel Price Optimization – Using Clusters as Non-Linear Constraints

Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves

Page 3: Three case studies deploying cluster analysis

Context:
◦ Major credit card company
◦ South American market
◦ Repeat for Argentina, Brazil… and “dollar countries”

Objectives or Problem:
◦ How to best manage the customer population
◦ Develop a software system, to repeat over geography and time
◦ How to AUTOMATE understanding? How to automate naming the clusters?

Page 4: Three case studies deploying cluster analysis

Solution, 3 projects for each customer base
◦ “WHO” to talk to…
  Customer Attrition Model – Neural Network (5 algorithms tested)
    Decrease in spending over time
    Basic vs. Supplemental Cards
    By 7 categories
    Challenge: Double digit inflation in some countries (90’s)
      Standardize by monthly spending
    Mining Factoid: Credit Card Digit 11 was predictive
      Billing cycles? Monthly salaries + high inflation
  Customer Profitability – Net Present Value
◦ “HOW” to talk to them…
  Cluster Analysis

Page 5: Three case studies deploying cluster analysis

Consider Scalability
◦ 100k – 500k customers
◦ Some cluster methods are O(n) or O(n²)
◦ Use K-means to create 100 clusters, O(n)
◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters

Page 6: Three case studies deploying cluster analysis

[Figure: Cumulative Profit (y-axis) by 5% Customer Groups (x-axis, groups 1 to 19), ranked by Attrition x Profitability. Series labels as extracted: Random, Tree-Net. Annotation: Total Profit / cell.]

83% of Attrition Profit was lost in the top 15%

Select the 5-15% of customers “highest” in the spike
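The selection on this slide can be sketched as a ranking by expected lost profit. This is a minimal sketch, assuming each customer already has an attrition probability and a profitability (NPV) score from the two models; the field names `p_attrition` and `npv` and the toy numbers are hypothetical, not taken from the project.

```python
def rank_by_expected_loss(customers):
    """Sort customers by attrition probability x profit, highest first."""
    return sorted(customers, key=lambda c: c["p_attrition"] * c["npv"], reverse=True)


def share_of_loss_in_top(customers, fraction):
    """Share of total expected lost profit held by the top `fraction` of customers."""
    ranked = rank_by_expected_loss(customers)
    losses = [c["p_attrition"] * c["npv"] for c in ranked]
    n_top = max(1, round(len(ranked) * fraction))
    total = sum(losses)
    return sum(losses[:n_top]) / total if total else 0.0


# Hypothetical toy data: 17 low-risk customers and 3 high-risk, high-value ones.
customers = [{"id": i, "p_attrition": 0.05, "npv": 100.0} for i in range(17)]
customers += [
    {"id": 17, "p_attrition": 0.60, "npv": 900.0},
    {"id": 18, "p_attrition": 0.50, "npv": 800.0},
    {"id": 19, "p_attrition": 0.40, "npv": 700.0},
]

top_share = share_of_loss_in_top(customers, 0.15)
print(f"Top 15% of customers hold {top_share:.0%} of expected lost profit")
```

Multiplying the two scores is one simple way to combine "who is likely to leave" with "who is worth keeping"; the deck does not say how the two models were combined beyond (Attrition) x (Profitability).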

Page 7: Three case studies deploying cluster analysis

How to design the cluster analysis?
◦ Select top fields from neural network
  Sensitivity Analysis on the NN
  % spending by category
    Restaurant, Retail, Grocery, Hotel, Air, Auto, …
  Trend over time (slope, expected future value)
Decide to create 8 – 12 clusters or customer segments to communicate to marketers
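Two of the inputs named here, % spending by category and trend over time, can be sketched as follows. This is an illustrative sketch only: the category names and amounts are made up, and using a least-squares slope for the "trend" is an assumption. Working with shares of each month's total rather than raw amounts is one way to read "standardize by monthly spending" under high inflation.

```python
def category_shares(spend_by_category):
    """Convert one month's spend per category into shares of that month's total."""
    total = sum(spend_by_category.values())
    if total == 0:
        return {k: 0.0 for k in spend_by_category}
    return {k: v / total for k, v in spend_by_category.items()}


def slope(values):
    """Least-squares slope of values against month index 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den if den else 0.0


# Hypothetical customer: nominal spend doubles every month (inflation),
# but the restaurant share of the wallet rises only gradually.
months = [
    {"restaurant": 100, "grocery": 300},
    {"restaurant": 240, "grocery": 560},
    {"restaurant": 560, "grocery": 1040},
]

restaurant_share = [category_shares(m)["restaurant"] for m in months]
restaurant_trend = slope(restaurant_share)
print(restaurant_share, restaurant_trend)
```

The raw restaurant amounts grow more than fivefold here, while the share moves by only a few points per month, which is the signal a clustering step would want to see.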

Page 8: Three case studies deploying cluster analysis

Consider Scalability
◦ 100k – 500k customers
◦ Some cluster methods are O(n) or O(n²)
◦ Use K-means to create 100 clusters, O(n)
◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters

Page 9: Three case studies deploying cluster analysis

Consider Scalability
◦ 100k – 500k customers
◦ Some cluster methods are O(n) or O(n²)
◦ Use K-means to create 100 clusters, O(n)
◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters
◦ This uses all the data scalably, and a more sophisticated hierarchical cluster search
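The two-stage idea can be sketched in plain Python. This is a minimal sketch under stated assumptions: the deck does not say which hierarchical method was used, so the second stage here merges the closest pair of centroids, weighted by cluster size; the data, the seeds, and the small values of `k` and `target` are illustrative.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Stage 1: plain k-means; each pass is linear in the number of points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a group ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, [len(g) for g in groups]


def merge_centroids(centroids, sizes, target):
    """Stage 2: repeatedly merge the closest pair of centroids.

    The pairwise search is quadratic, but only in the number of stage-1
    centroids, not in the number of customers.
    """
    clusters = [(c, s) for c, s in zip(centroids, sizes) if s > 0]
    while len(clusters) > target:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist2(clusters[ij[0]][0], clusters[ij[1]][0]))
        (ca, sa), (cb, sb) = clusters[i], clusters[j]
        merged = tuple((sa * x + sb * y) / (sa + sb) for x, y in zip(ca, cb))
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append((merged, sa + sb))
    return clusters


# Hypothetical data: 600 points around three centers.
rng = random.Random(1)
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)) for cx, cy in blobs for _ in range(200)]

centroids, sizes = kmeans(points, k=12)
segments = merge_centroids(centroids, sizes, target=3)
for center, size in segments:
    print(size, tuple(round(v, 1) for v in center))
```

In the setting described on the slide, `k` would be about 100 and `target` 8 to 12, so the quadratic stage only ever sees about 100 items.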

Page 10: Three case studies deploying cluster analysis

[Table: cluster summary. Columns are clusters, ordered from most customers to least: ALL (100%), 1 (36%), 2 (22%), … N (5%). Rows are fields, ordered by importance.]

Page 11: Three case studies deploying cluster analysis

[Table: cluster summary. Columns are clusters, ordered from most customers to least: ALL (100%), 1 (36%), 2 (22%), … N (5%). Rows are fields, ordered by importance. Legend: min to MAX.]

Page 12: Three case studies deploying cluster analysis

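[Figure: per-cluster naming. Each cluster lists its “Most” variables (Var X, Y, Z) and its “Least” variables (Var A, B, C). Legend: min to MAX.]

May have 12 clusters, 36 variables. Then each cluster may have 6 attributes to use in naming.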
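One way to automate the naming is to pick, for each cluster, the variables that sit furthest above and furthest below the overall population. The sketch below assumes a z-score of the cluster mean against the whole population as the ranking measure; the deck does not state the measure actually used, and the field names and values are hypothetical. With `top=3`, each cluster gets 3 "most" and 3 "least" variables, or 6 attributes.

```python
from statistics import mean, pstdev


def name_clusters(rows, labels, fields, top=3):
    """For each cluster, list the fields most above and most below the overall mean.

    rows:   list of dicts mapping field name -> numeric value
    labels: cluster id for each row
    """
    overall = {f: (mean(r[f] for r in rows), pstdev(r[f] for r in rows)) for f in fields}
    names = {}
    for cluster in sorted(set(labels)):
        members = [r for r, lab in zip(rows, labels) if lab == cluster]
        # z-score of the cluster mean against the whole population, per field
        z = {}
        for f in fields:
            mu, sigma = overall[f]
            z[f] = (mean(r[f] for r in members) - mu) / sigma if sigma else 0.0
        ranked = sorted(fields, key=lambda f: z[f], reverse=True)
        names[cluster] = {"most": ranked[:top], "least": ranked[-top:][::-1]}
    return names


# Hypothetical spending shares for four customers in two clusters.
fields = ["restaurant", "grocery", "air", "hotel"]
rows = [
    {"restaurant": 0.5, "grocery": 0.1, "air": 0.3, "hotel": 0.1},
    {"restaurant": 0.6, "grocery": 0.1, "air": 0.2, "hotel": 0.1},
    {"restaurant": 0.1, "grocery": 0.6, "air": 0.1, "hotel": 0.2},
    {"restaurant": 0.1, "grocery": 0.7, "air": 0.0, "hotel": 0.2},
]
labels = [0, 0, 1, 1]

names = name_clusters(rows, labels, fields, top=2)
for cluster, parts in names.items():
    print(cluster, "most:", parts["most"], "least:", parts["least"])
```

The resulting lists are raw material for a name such as "restaurant and air, little grocery"; a person would still word the final label.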

Page 13: Three case studies deploying cluster analysis

Select “WHO” with (Attrition) x (Profitability)
Select “HOW” with Cluster Segments
◦ Given the variable selection, only a few clusters matched most of the 15% subset of the customers to manage

Marketers could understand well the different audiences and reasons for attrition – and could better write copy for communication

About 50 executives walked around with the one-page cluster summary in their pocket, frequently used to plan customer strategies

Page 14: Three case studies deploying cluster analysis

[Diagram: analysis types. Labels as extracted: Analysis Type; CRM, Behavior, Media, Message; $$$; Loyal; Loyalty; Prospect; Segment; Upgrade, Downgrade; Cross-Sell; Best Customers; Reactivation; Attrition, Retention; Fraud.]

Page 15: Three case studies deploying cluster analysis

Customer Description – CC Company – “Who” vs. “How” to Talk to Customers

Hotel Price Optimization – Using Clusters as Non-Linear Constraints

Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves

Page 16: Three case studies deploying cluster analysis

Objective:
◦ Optimize pricing for hotel rooms
◦ Take into account geography & use
  weekend, vacation, business, conference, …
  Seasons of the year as it relates to demand

The hotel owns many brands (chains) focused on different audiences
◦ Different price tiers, target audiences, …
◦ Hotel, motel, extended stay, …
◦ What “lessons learned” cross brands?

Page 17: Three case studies deploying cluster analysis

Revenue Management is a general process used to
◦ optimize profit
◦ given the remaining (plane seat or hotel room) inventory
◦ the remaining time until the inventory is gone

Operations Research
◦ Linear or Non-Linear Programming
  Linear or Non-Linear in either constraints or objective function
◦ Need an objective function to optimize
  Train predictive models to forecast price, given conditions

Page 18: Three case studies deploying cluster analysis

Data Mining and Operations Research Design
◦ When training predictive models, it helps to learn behavior “in the same ball park” with the same model.
◦ If the underlying thought process is fairly different, subdivide the data into different subsets and train different models. For example:
  Attrition: checking, credit card, line of credit, mortgage
  In Mortgage Bond Pricing: monthly prepayment of none vs. 100’s vs. 1,000’s vs. a full refinance

Page 19: Three case studies deploying cluster analysis

How do we group or divide individual hotels, given all the attributes?
◦ Brand, location, % utilization weekday or weekend, …

Find bottom-up clusters, rather than top-down assertions on the data

For cluster variables – use best variables in pricing predictive models (sound familiar?)

Page 20: Three case studies deploying cluster analysis

Solution:
◦ 1) Build an initial predictive model predicting pricing. Find the most important variables.
◦ 2) Create 8-16 clusters, using those variables
◦ 3) Within each cluster
  A) Train a predictive model for use as the OR objective function
  B) Run a LINEAR OR price optimization, on the data subset
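Step 3A can be sketched as fitting one simple linear model per cluster. Taken together, the per-cluster lines form a piecewise-linear relationship, which is one way to read how clusters stand in for non-linearity while each subset stays linear. This is an illustration under assumptions: the single driver `occupancy`, the cluster names, and the numbers are hypothetical, and the linear programming run of step 3B is not shown.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def fit_per_cluster(records):
    """Fit one linear price model per cluster label.

    records: iterable of (cluster, occupancy, price) tuples
    """
    by_cluster = defaultdict(list)
    for cluster, occupancy, price in records:
        by_cluster[cluster].append((occupancy, price))
    models = {}
    for cluster, pairs in by_cluster.items():
        xs, ys = zip(*pairs)
        models[cluster] = fit_line(xs, ys)
    return models


def predict_price(models, cluster, occupancy):
    """Evaluate the linear model that belongs to the hotel's cluster."""
    a, b = models[cluster]
    return a + b * occupancy


# Hypothetical history: two hotel clusters with very different price behavior.
records = [
    ("budget", 0.5, 60.0), ("budget", 0.7, 64.0), ("budget", 0.9, 68.0),
    ("conference", 0.5, 175.0), ("conference", 0.7, 205.0), ("conference", 0.9, 235.0),
]

models = fit_per_cluster(records)
print(predict_price(models, "budget", 0.8))
print(predict_price(models, "conference", 0.8))
```

A single line fitted across both groups would describe neither well; splitting first keeps each model, and any linear optimization built on it, within one kind of behavior.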

Page 21: Three case studies deploying cluster analysis

Customer Description – CC Company – “Who” vs. “How” to Talk to Customers

Hotel Price Optimization – Using Clusters as Non-Linear Constraints

Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves

Page 22: Three case studies deploying cluster analysis

The “Retail Supply Chain” is from
◦ the manufacturer to
◦ distribution center to
◦ warehouse to
◦ store to consumer

Replenishment is to re-supply products on the shelves
◦ Minimize overstock and understock
◦ Heavy understock causes LOSS OF SALES
◦ Heavy overstock causes 30% end of season liquidation

Page 23: Three case studies deploying cluster analysis

4,000 stores
100,000 products/SKUs (stock keeping units)
◦ 400 million store-product combinations
52 weeks per year
◦ 20.8 billion store-product-week combinations

Not the smallest problem in the mid-90’s

Holidays shift in week number, from year to year – need to adjust
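The sizing arithmetic and the holiday shift can both be checked with a few lines. The counts come from the slide; using US Thanksgiving and ISO week numbers as the example is an assumption, since the deck does not say which holidays or which week-numbering scheme were involved.

```python
from datetime import date, timedelta

stores = 4_000
skus = 100_000
weeks = 52

store_sku = stores * skus            # store-product combinations
store_sku_week = store_sku * weeks   # store-product-week combinations
print(f"{store_sku:,} store-SKUs, {store_sku_week:,} store-SKU-weeks")


def thanksgiving(year):
    """US Thanksgiving: the fourth Thursday of November."""
    first = date(year, 11, 1)
    # weekday(): Monday is 0, so Thursday is 3
    offset = (3 - first.weekday()) % 7
    return first + timedelta(days=offset + 21)


def holiday_week(year):
    """ISO week number that contains Thanksgiving in the given year."""
    return thanksgiving(year).isocalendar()[1]


weeks_by_year = {year: holiday_week(year) for year in range(1994, 2000)}
print(weeks_by_year)
```

Because the holiday lands in different week numbers across years, a 52-week demand curve has to be aligned on the holiday before curves from different years are compared or clustered.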

Page 24: Three case studies deploying cluster analysis
Page 25: Three case studies deploying cluster analysis

End up creating 2,000+ “profiles” or centroids

Assign new store-SKUs to an existing profile

If it doesn’t match (within a radius)…
◦ Re-run cluster analysis
◦ Lock existing centroids
◦ Create new centroids for data points outside
◦ Add to the “profile library”
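The profile library update can be sketched as follows. This is a simplified stand-in for "re-run cluster analysis": each curve that falls outside the radius of every profile seeds a new centroid, and only new centroids are re-centered, so the existing ones stay locked. The Euclidean distance, the radius value, and the 4-week toy curves (instead of 52 weeks) are assumptions for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length demand curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_or_extend(library, curves, radius):
    """Assign each curve to the nearest profile; open new profiles for misfits.

    library: list of centroid curves; entries present on entry are never modified.
    Returns the profile index chosen for each curve.
    """
    locked = len(library)
    assignments = []
    new_members = {}  # index of a new profile -> curves assigned to it
    for curve in curves:
        best = min(range(len(library)), key=lambda i: distance(curve, library[i]), default=None)
        if best is None or distance(curve, library[best]) > radius:
            library.append(list(curve))  # seed a new profile with this curve
            best = len(library) - 1
            new_members[best] = []
        if best >= locked:
            new_members[best].append(curve)
        assignments.append(best)
    # Re-center only the newly created profiles; locked ones stay as they were.
    for idx, members in new_members.items():
        library[idx] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical 4-week demand curves.
library = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 5.0, 5.0]]
new_curves = [
    [1.0, 1.0, 1.0, 1.2],   # close to profile 0
    [5.0, 5.0, 0.0, 0.0],   # matches nothing, so it seeds profile 2
    [5.0, 4.8, 0.0, 0.0],   # close to the new profile 2
]

assignments = assign_or_extend(library, new_curves, radius=1.0)
print(assignments)
print(library)
```

Locking the existing centroids means store-SKUs already assigned to a profile keep their assignment when the library grows.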

Page 26: Three case studies deploying cluster analysis

Bottom-up findings (after the fact)

◦ Buying hunting-related items as the ducks migrate north

Page 27: Three case studies deploying cluster analysis
Page 28: Three case studies deploying cluster analysis
Page 29: Three case studies deploying cluster analysis
Page 30: Three case studies deploying cluster analysis
Page 31: Three case studies deploying cluster analysis