Data Analytics for Customer Facing Applications


Page 1: Data Analytics for Customer Facing Applications

2/25/2008 1

Data Analytics for Customer Facing Applications

Jaideep Srivastava
Computer Science & Engineering

[email protected]

Page 2: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 2

Presentation Outline

• Technology trends
• Customer facing applications
• Status of CRM efforts
• Analytical CRM
  • Customer segmentation
  • Customer loyalty
  • Customer retention
• Analytical CRM architecture
  • Data warehouse
  • Dimensional data modeling
  • On-line analytical processing (OLAP)
• Data mining
• Amazon.com: case study in building customer loyalty
• Analytics behind e-marketing
• Yodlee.com: case study in web business intelligence
• Privacy issues
• Conclusion

Page 3: Data Analytics for Customer Facing Applications

Technology Trends

• Internet growth
  • Faster than any other infrastructure
• Data collection
  • Rapid drop in storage costs
  • Dramatic improvement in resolution and rate of data collection ‘probes’
• Data analytics
  • Increasing deployment of warehouses
  • Major leap forward in data mining technologies and tools

It is becoming possible to really understand what your customers want, even at the individual level!

Page 4: Data Analytics for Customer Facing Applications

Infrastructure Adoption in the US

[Figure: millions of users (0 to 120) over time (1922 to 2000) for Radio, TV, Cable, and Internet.]

Page 5: Data Analytics for Customer Facing Applications

Product Marketing – 75 years ago

• Production – a la Adam Smith
• "You can have any color as long as it's black" (Ford Motor Co.)

Page 6: Data Analytics for Customer Facing Applications

Product Marketing - today

Add the spice of flexibility, courtesy of robotics, computers …

Page 7: Data Analytics for Customer Facing Applications

New approach to marketing

FROM: Finding customers that are right for each product
TO: Finding products that are right for each customer

Turn the process through 90 degrees.

[Figure: products 1, 2, 3, 4, 5, ….]

To achieve this we need to align around

• Organization and culture
• Business processes and skill
• Measurement and incentives
• Information management
• Technology

Page 8: Data Analytics for Customer Facing Applications

“Mass Customization” – B. Joseph Pine

Mass production
• Cheap to produce
• Efficient to produce
• Uniform features/quality
• ‘one size fits all’ approach
• Optimize production cost

Customization
• Expensive to produce
• Inefficient to produce
• Customized features
• ‘tailor made’ approach
• Optimize customer satisfaction

Mass customization
• Cheap & efficient to produce
• Customized features
• ‘tailor made’ approach
• Optimize production cost & customer satisfaction

Page 9: Data Analytics for Customer Facing Applications


Customer Facing Applications

Page 10: Data Analytics for Customer Facing Applications

Customer Facing Applications

Consumer marketing
• Campaign management
• Opportunity management
• Web-based encyclopedia, configurator
• Market segmentation
• Lead generation/enhancement/tracking

Page 11: Data Analytics for Customer Facing Applications

Customer Facing Applications

Customer care & support
• Incident assignment/escalation/tracking/reporting
• Problem management/resolution
• Order management/promise fulfillment
• Warranty/contract management

Field service support
• Work orders, dispatching
• Real time information transfer to field personnel via mobile technologies

Page 12: Data Analytics for Customer Facing Applications

Customer Facing Applications

Corporate sales
• Contact management profiles and history
• Account management including activities
• Order entry
• Proposal generation

Sales management
• Pipeline analysis, e.g. forecasting
• Sales cycle analysis
• Territory alignment
• Roll-up and drill-down reporting

Page 13: Data Analytics for Customer Facing Applications


Status of Customer Relationship Management (CRM) Efforts

Page 14: Data Analytics for Customer Facing Applications

Companies are spending mega-budgets on CRM

• CRM = software + support services
• European CRM expenditure = $1.2B + $3.0B = $4.2B*
• UK marketing service industry growing at 17.4% to $7.7B

[Figure: CRM, with the labels Relationship marketing, Customer service, Value added programs, Loyalty programs, and Culture change.]

*Hewson Consulting October 2000

Page 15: Data Analytics for Customer Facing Applications


But - satisfaction is declining

Page 16: Data Analytics for Customer Facing Applications


And - more customers are complaining

Page 17: Data Analytics for Customer Facing Applications

Increasing customer resistance

• 98% of customer solicitations are irrelevant
• 82% of individuals would like to block all marketing access to their own data
• Campaign hit rates and customer loyalty indicators are declining

Page 18: Data Analytics for Customer Facing Applications

Consequently

• The ‘best’ customers are being over-communicated to
• Today’s less valuable customers are not being developed into tomorrow’s ‘best’ customers
• The business potential of the customer base is not being maximized

Page 19: Data Analytics for Customer Facing Applications


Solution: Analytical CRM

CRM = Customer Understanding + Relationship Management

Analytics helps in Customer Understanding

Analytics = OLAP, Statistical analysis, data mining, etc.

Page 20: Data Analytics for Customer Facing Applications


Example Customer Facing Applications Helped by Analytical CRM

Customer segmentation

Customer loyalty building

Customer retention/recovery

Page 21: Data Analytics for Customer Facing Applications

Customer segmentation

• Purpose of segmentation is to identify groups of customers with similar needs and behavior patterns, so that they can be offered more tightly focused
  • Products
  • Services
  • Communications
• Segments should be
  • Identifiable
  • Quantifiable
  • Addressable
  • Of sufficient size to be worth addressing
• Two approaches to segmentation
  • Cluster common characteristics, and then map out behavior patterns
  • Separate out behavior patterns, then identify segment characteristics

Page 22: Data Analytics for Customer Facing Applications

Customer base segmentation

[Figure: 2×2 matrix of potential business (high/low) against actual business (high/low), with quadrants labeled Care & Maintenance, Retain, Develop, and Observe & Incentivize.]

Targeted communication to each segment

Page 23: Data Analytics for Customer Facing Applications


Segmentation by value

Page 24: Data Analytics for Customer Facing Applications

Express profits as deciles, and ask questions

[Figure: bar chart of profit per customer decile; the profit axis runs from -1200 to 1200.]

Questions annotated on the chart:
• Who are these customers; what do they look like?
• Middle 60%, either side of break even. What can we do about these?
• Are these worth keeping? Can we service them with a lower cost channel? What can we do to make this segment profitable?
• Should the focus be on retaining wallet share from segments 8 – 10? Or, on gaining from segments 1 – 4?
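As a rough illustration of expressing profits as deciles, the sketch below ranks customers by profit and totals each tenth of the customer base. The function name, the input shape, and the sample figures are all made up for illustration, and numbering decile 1 as the most profitable group is an assumption; the slide does not say which end of its axis is which.

```python
def profit_deciles(profits):
    """Split customers into 10 near-equal groups by profit, highest first.

    profits: dict mapping customer id -> profit (hypothetical input shape).
    Returns a list of 10 (decile_number, total_profit, customer_ids) tuples.
    """
    ranked = sorted(profits, key=profits.get, reverse=True)
    n = len(ranked)
    deciles = []
    for d in range(10):
        # Integer arithmetic keeps group sizes within one of each other.
        members = ranked[d * n // 10:(d + 1) * n // 10]
        total = sum(profits[c] for c in members)
        deciles.append((d + 1, total, members))
    return deciles


# Made-up profits for 20 customers: a few very profitable, a few loss-making.
sample = {f"c{i:02d}": p for i, p in enumerate(
    [1200, 950, 700, 400, 250, 120, 60, 40, 20, 10,
     5, 0, -5, -10, -30, -80, -200, -450, -800, -1150])}

deciles = profit_deciles(sample)
for number, total, members in deciles:
    print(number, total, members)
```

With data shaped like this, the questions on the slide become lookups: the members of the top and bottom deciles can be profiled, and the middle groups sit on either side of break even.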

Page 25: Data Analytics for Customer Facing Applications


Customer loyalty: close relationships are more profitable

Page 26: Data Analytics for Customer Facing Applications

Relationship intensity and defection odds

Evidence suggests that customer ‘lock in’ occurs once 4 or more products are purchased

Number of products purchased    Odds of not defecting
1                               1.1%
2                               10.2%
3                               18.1%
4                               98.3%

Page 27: Data Analytics for Customer Facing Applications

A difference of opinion …

[Figure: bar chart comparing the company view with the customer view. Values shown: 70%, 90%, 32%, 2%.]

Statements compared:
• Customers are happy with our customer service
• We research customer service needs and wants as part of our customer service improvement
• Customer service needs no improvement
• Customer service today is better than ever

Page 28: Data Analytics for Customer Facing Applications

… and action

[Figure: bar chart comparing the company view with the customer view. Values shown: 98%, 43%, 7%.]

Statements compared:
• We want to develop a relationship with our customers
• We want to form and develop a relationship with our suppliers
• The relationship now is stronger than 12 months ago

Page 29: Data Analytics for Customer Facing Applications

Increasing propensity to buy over a customer life cycle

[Figure: customer relationship profitability over the customer life cycle.]

Actions which build relationship warmth
• No-fault service
• “Have a nice day”
• Targeted sales

Page 30: Data Analytics for Customer Facing Applications

Loyalty is built through a virtuous circle of new customer experience

[Figure: virtuous circle of customer experience, with the following elements.]
• Superlative customer service
• Provides legitimacy to offer advice
• Individualized and helpful dialog
• Innovative new products
• Excites the customer and builds loyalty

Page 31: Data Analytics for Customer Facing Applications

Lifetime Impact of Customer Loyalty

[Figure: VALUE plotted against TIME, with curves labeled “Realized” customer value, “Maximized” customer value, and Customer potential.]

Page 32: Data Analytics for Customer Facing Applications

Managing Credit-Card Retention in the Pacific Rim

• Behavioral Propensity Model based Campaigns generate New Customers
• Selective score-based phone follow-up more than doubles response
• “Event-driven” (Trans. Vol. & Value) Campaigns to stimulate initial usage of credit-card.
• Propensity model + “Event-driven” Customer Retention program identifies likely non-renewers 3 months prior to renewal, and kicks in usage stimulation program
• Different offers (“Frequent User Club” versus Premium) being tested

Impact: Over 100% improvement in both Acquisition and Retention. New market opened up.

Page 33: Data Analytics for Customer Facing Applications


Using Negative Events to drive Positive Sales

Event = “ATM request for cash” is rejected due to lack of funds.

For credit-worthy customers, unsecured personal loan is offered by mail or phone the next day!

30% acceptance rate of product offered.

Impact: Significant cross-sales of additional product

Significant reduction in negative reactions

Page 34: Data Analytics for Customer Facing Applications


Analytical CRM Architecture

Page 35: Data Analytics for Customer Facing Applications

Analytical CRM Loop

[Figure: loop with the labels Hypothesis generation, Results, Analysis, and Action.]

Page 36: Data Analytics for Customer Facing Applications

Traditional Growth of Functions in an Organization

THE PRESENT: MULTIPLE CHANNELS & DATA STORES / IMPERSONAL SERVICE

[Figure: customer channels (Inbound Call Centre, Outbound Call Centre, Branch, ATM, Fax, Kiosk, WAP, Email, Web, 3rd Party Resellers) backed by several separate data stores.]

Impact!
• IMPERSONAL
• LOW QUALITY
• UNINFORMED
• INCONSISTENT

Page 37: Data Analytics for Customer Facing Applications

Vision for Customer Driven CRM

THE NEAR FUTURE: MULTIPLE CHANNELS & DATA STORES / PERSONALISED SERVICE

[Figure: channels around a central block labeled DATA.]

Impact!
• PERSONALISED
• HIGH QUALITY
• INFORMED
• CONSISTENT

Page 38: Data Analytics for Customer Facing Applications

Canonical Analytics Architecture

[Figure: architecture diagram with the following labeled components.]
• Data sources: operational DBs, other sources
• Monitor & Integrator: extract, transform, load, refresh
• Data warehouse, with metadata
• Data marts
• OLAP server (serve)
• Tools: analysis, query, reports, data mining

Page 39: Data Analytics for Customer Facing Applications


Data Warehouse

Page 40: Data Analytics for Customer Facing Applications

Data Warehouse

• A decision support database that is maintained separately from the organization’s operational database
• A data warehouse is a
  • subject-oriented,
  • integrated,
  • time-varying,
  • non-volatile
  collection of data that is used primarily in organizational decision making.

Page 41: Data Analytics for Customer Facing Applications

Data Warehouse - Subject Oriented

• Subject oriented: oriented to the major subject areas of the corporation that have been defined in the data model.
  • E.g. for an insurance company: customer, product, transaction or activity, policy, claim, account, etc.
• Operational DB and applications may be organized differently
  • E.g. based on type of insurance: auto, life, medical, fire, ...

Page 42: Data Analytics for Customer Facing Applications

Data Warehouse - Integrated

• There is no consistency in encoding, naming conventions, … among different data sources
• When data is moved to the warehouse, it is converted.
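To make the conversion step concrete, here is a minimal sketch in which two source systems name and encode the same customer attribute differently and both are mapped to one warehouse convention. The source systems, field names, and code table are hypothetical and are not taken from the slides.

```python
# Hypothetical code table: every source encoding maps to one warehouse code.
GENDER_CODES = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "0": "F",
}


def to_warehouse_row(source_row, field_map):
    """Rename source fields to warehouse names and normalize encodings."""
    row = {target: source_row[source] for source, target in field_map.items()}
    raw = str(row["gender"]).strip().lower()
    row["gender"] = GENDER_CODES.get(raw, "UNKNOWN")
    return row


# The same customer as seen by two made-up operational systems.
policy_system = {"cust_no": 17, "sex": "male"}
claims_system = {"customer_id": 17, "gender_cd": 1}

a = to_warehouse_row(policy_system, {"cust_no": "customer_id", "sex": "gender"})
b = to_warehouse_row(claims_system,
                     {"customer_id": "customer_id", "gender_cd": "gender"})
print(a, b)
```

After conversion both rows use the same field names and the same code, so they can be compared or merged in the warehouse.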

Page 43: Data Analytics for Customer Facing Applications

Data Warehouse - Non-Volatile

• Operational data is regularly accessed and manipulated a record at a time, and updates are applied to data in the operational environment.
• Warehouse data is loaded and accessed. Update of data does not occur in the data warehouse environment.

Page 44: Data Analytics for Customer Facing Applications

Data Warehouse - Time Variance

• The time horizon for the data warehouse is significantly longer than that of operational systems.
• Operational databases contain current value data. Data warehouse data is nothing more than a sophisticated series of snapshots, taken as of some moment in time.
• The key structure of operational data may or may not contain some element of time. The key structure of the data warehouse always contains some element of time.

Page 45: Data Analytics for Customer Facing Applications

Data Sources

• Data sources are often the operational systems, providing the lowest level of data.
• Data sources are designed for operational use, not for decision support, and the data reflect this fact.
• Multiple data sources are often from different systems run on a wide range of hardware, and much of the software is built in-house or highly customized.
• Multiple data sources introduce a large number of issues, such as semantic conflicts.

Page 46: Data Analytics for Customer Facing Applications

Data Cleaning

• Important to warehouse clean data (operational data from multiple sources are often dirty).
• Three classes of tools
  • Data migration: allows simple data transformation
  • Data scrubbing: uses domain-specific knowledge to scrub data
  • Data auditing: discovers rules and relationships by scanning data (detect outliers).
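One simple way a data auditing pass might detect outliers is a modified z-score based on the median and the median absolute deviation (MAD). This is a swapped-in illustrative technique, not a method named in the slides; the threshold of 3.5 and the sample ages are assumptions.

```python
import statistics


def audit_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less distorted by the outliers themselves than mean and standard
    deviation would be.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


# Made-up customer ages with one implausible entry.
ages = [34, 29, 41, 38, 35, 33, 40, 37, 36, 199]
suspects = audit_outliers(ages)
print(suspects)
```

A flagged value is only a candidate for review; whether it is dirty data or a genuine extreme is a question for the scrubbing step and its domain knowledge.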

Page 47: Data Analytics for Customer Facing Applications

Load and Refresh

• Loading the warehouse includes some other processing tasks: checking integrity constraints, sorting, summarizing, building indexes, etc.
• Refreshing a warehouse means propagating updates on source data to the data stored in the warehouse
  • when to refresh
    • determined by usage, types of data source, etc.
  • how to refresh
    • data shipping: using triggers to update snapshot log table and propagate the updated data to the warehouse
    • transaction shipping: shipping the updates in the transaction log
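The data shipping approach can be sketched with SQLite from the Python standard library: triggers on a source table write every change to a snapshot log, and a refresh step propagates the logged rows. The table names, the `refresh` helper, and the dictionary standing in for the warehouse are hypothetical simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL);
    -- Snapshot log: one row per change the warehouse has not yet seen.
    CREATE TABLE account_log (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        id INTEGER,
        balance REAL
    );
    CREATE TRIGGER log_insert AFTER INSERT ON account
    BEGIN
        INSERT INTO account_log (id, balance) VALUES (NEW.id, NEW.balance);
    END;
    CREATE TRIGGER log_update AFTER UPDATE ON account
    BEGIN
        INSERT INTO account_log (id, balance) VALUES (NEW.id, NEW.balance);
    END;
""")

warehouse = {}  # stand-in for the warehouse copy: id -> latest balance


def refresh(conn, warehouse):
    """Apply logged changes to the warehouse copy in order, then clear the log."""
    rows = conn.execute(
        "SELECT id, balance FROM account_log ORDER BY seq").fetchall()
    for row_id, balance in rows:
        warehouse[row_id] = balance
    conn.execute("DELETE FROM account_log")
    return len(rows)


conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("INSERT INTO account VALUES (2, 50.0)")
conn.execute("UPDATE account SET balance = 75.0 WHERE id = 2")
shipped = refresh(conn, warehouse)
print(shipped, warehouse)
```

A real warehouse would typically keep each shipped row as a time-stamped snapshot rather than overwrite a value; the dictionary here only shows the propagation path.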

Page 48: Data Analytics for Customer Facing Applications

Monitor

• detect changes to an information source that are of interest to the warehouse
  • define triggers in a full-functionality DBMS
  • examine the updates in the log file
  • write programs for legacy systems
• propagate the change in a generic form to the integrator

Page 49: Data Analytics for Customer Facing Applications

Integrator

• receive changes from the monitors
  • make the data conform to the conceptual schema used by the warehouse
• integrate the changes into the warehouse
  • merge the data with existing data already present
  • resolve possible update anomalies

Page 50: Data Analytics for Customer Facing Applications

Metadata Repository

Administrative metadata
• source databases and their contents
• gateway descriptions
• warehouse schema, view and derived data definitions
• dimensions and hierarchies
• pre-defined queries and reports
• data mart locations and contents
• data partitions
• data extraction, cleansing, transformation rules, defaults
• data refresh and purge rules
• user profiles, user groups
• security: user authorization, access control

Page 51: Data Analytics for Customer Facing Applications

Metadata Repository

Business data
• business terms and definitions
• ownership of data
• charging policies

Operational metadata
• data lineage: history of migrated data and sequence of transformations applied
• currency of data: active, archived, purged
• monitoring information: warehouse usage statistics, error reports, audit trails

Page 52: Data Analytics for Customer Facing Applications

Data Marts

• A data mart (departmental data warehouse) is a specialized system that brings together the data needed for a department or related applications.
• Data marts can be implemented within the data warehouse by creating special, application-specific views.
• Data marts can also be implemented as materialized views: departmental subsets that focus on selected subjects.
• Data marts may use different data representations and include their own OLAP engines

Page 53: Data Analytics for Customer Facing Applications

Other Tools

• User interface that allows users to interact with the warehouse
  • query and reporting tools
  • analysis tools
  • data mining tools

Page 54: Data Analytics for Customer Facing Applications


Dimensional Data Modeling

Page 55: Data Analytics for Customer Facing Applications

Conceptual Modeling of Data Warehouses

• Modeling data warehouses: dimensions & measurements
• Star schema: A single object (fact table) in the middle connected to a number of objects (dimension tables)
• Snowflake schema: A refinement of star schema where the dimensional hierarchy is represented explicitly by normalizing the dimension tables.
• Fact constellations: Multiple fact tables share dimension tables.

Page 56: Data Analytics for Customer Facing Applications

Example of Star Schema

Sales Fact Table
  Dimension keys: Date, Product, Store, Customer
  Measurements: unit_sales, dollar_sales, Yen_sales

Dimension tables
  Date: Date, Month, Year
  Cust: CustId, CustName, CustCity, CustCountry
  Product: ProductNo, ProdName, ProdDesc, Category, QOH
  Store: StoreID, City, State, Country, Region
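A star schema like this one might be declared and queried as follows, using SQLite from the Python standard library. This is a cut-down sketch: the customer dimension and several attributes are omitted, the column names are adapted rather than copied from the slide, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
    CREATE TABLE product_dim (product_no INTEGER PRIMARY KEY, prod_name TEXT, category TEXT);
    CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE sales_fact (
        date_id INTEGER REFERENCES date_dim,
        product_no INTEGER REFERENCES product_dim,
        store_id INTEGER REFERENCES store_dim,
        unit_sales INTEGER,
        dollar_sales REAL
    );
    INSERT INTO date_dim VALUES (1, 1, 1998), (2, 2, 1998);
    INSERT INTO product_dim VALUES (10, 'TV-A', 'TV'), (11, 'VCR-B', 'VCR');
    INSERT INTO store_dim VALUES (100, 'Beijing', 'China'), (101, 'Tokyo', 'Japan');
    INSERT INTO sales_fact VALUES
        (1, 10, 100, 3, 900.0),
        (2, 10, 100, 2, 600.0),
        (1, 11, 101, 5, 750.0),
        (2, 10, 101, 1, 300.0);
""")

# A typical star join: the fact table joined to two dimensions, then aggregated.
rows = conn.execute("""
    SELECT p.category, s.country, SUM(f.dollar_sales)
    FROM sales_fact f
    JOIN product_dim p ON p.product_no = f.product_no
    JOIN store_dim s ON s.store_id = f.store_id
    GROUP BY p.category, s.country
    ORDER BY p.category, s.country
""").fetchall()
print(rows)
```

Every query follows the same shape: the fact table in the middle, one join per dimension that the question mentions, and an aggregate over a measurement.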

Page 57: Data Analytics for Customer Facing Applications

Example of Snowflake Schema

Sales Fact Table
  Dimension keys: Date, Product, Store, Customer
  Measurements: unit_sales, dollar_sales, Yen_sales

Dimension tables
  Date: Date, Month
  Month: Month, Year
  Year: Year
  Cust: CustId, CustName, CustCity, CustCountry
  Product: ProductNo, ProdName, ProdDesc, Category, QOH
  Store: StoreID, City
  City: City, State
  State: State, Country
  Country: Country, Region

Page 58: Data Analytics for Customer Facing Applications

A Query Model

[Figure: multidimensional query model centered on customer orders.]

Dimensions: Shipping Method, Customer Orders, Customer, Product, Organization, Promotion, Geography, Time

Level labels shown in the figure:
• AIR-EXPRESS, TRUCK
• ORDER, CONTRACTS
• PRODUCT GROUP, PRODUCT LINE, PRODUCT ITEM
• SALES PERSON, DISTRICT, DIVISION
• DISTRICT, REGION, COUNTRY
• DAILY, QTRLY, ANNUALLY

Page 59: Data Analytics for Customer Facing Applications

Summary Tables

• Data warehouse may store some selected summary data, the pre-aggregated data.
• Summary data can be stored as separate fact tables sharing the same dimension tables with the base fact table.
• Summary data can be encoded in the original fact table and dimension tables.

Fact table:

DateID  ProdID  Sales
0       1       1000
1       1       20000
1       2       40000
3       1       300000

Date dimension table:

id  level  date  month  year
0   1      1     1      1998
1   2      NULL  1      1998
2   2      NULL  2      1998
3   3      NULL  NULL   1998
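The encoding above can be read as follows: each date key carries a level, and fact rows that point at a higher-level key hold pre-aggregated totals. The sketch below uses the rows from the slide; interpreting levels 1, 2, and 3 as day, month, and year is an assumption based on which columns are NULL, and the helper name is hypothetical.

```python
# Rows transcribed from the slide's two tables.
date_dim = {
    0: {"level": 1, "date": 1, "month": 1, "year": 1998},
    1: {"level": 2, "date": None, "month": 1, "year": 1998},
    2: {"level": 2, "date": None, "month": 2, "year": 1998},
    3: {"level": 3, "date": None, "month": None, "year": 1998},
}
fact = [(0, 1, 1000), (1, 1, 20000), (1, 2, 40000), (3, 1, 300000)]

# Assumed meaning of 'level', inferred from the NULL pattern.
LEVEL_NAMES = {1: "day", 2: "month", 3: "year"}


def sales_at_level(level):
    """Return fact rows whose date key points at the given aggregation level."""
    return [(date_id, prod, sales) for date_id, prod, sales in fact
            if date_dim[date_id]["level"] == level]


monthly = sales_at_level(2)
yearly = sales_at_level(3)
print(LEVEL_NAMES[2], monthly)
print(LEVEL_NAMES[3], yearly)
```

A query must always filter on one level; summing across levels would count the same sales more than once.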

Page 60: Data Analytics for Customer Facing Applications

Multidimensional Data

Sales volume as a function of product, time, and geography

[Figure: cube with axes Product, Region, and Month.]

Dimensions: Product, Region, Week

Hierarchical summarization paths:
  Product:   Industry → Category → Product
  Geography: Country → Region → City → Office
  Time:      Year → Quarter → Month or Week → Day

Page 61: Data Analytics for Customer Facing Applications

A Sample Data Cube

[Figure: data cube with dimensions Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), and Country (China, India, Japan, sum). The highlighted cell is the total annual sales of TV in China.]

Page 62: Data Analytics for Customer Facing Applications


On-Line Analytical Processing (OLAP)

Page 63: Data Analytics for Customer Facing Applications

Sample Operations

• Roll up: summarize data
  • total sales volume last year by product category by region
• Roll down, drill down, drill through: go from higher level summary to lower level summary or detailed data
  • For a particular product category, find the detailed sales data for each salesperson by date
• Slice and dice: select and project
  • Sales of beverages in the West over the last 6 months
• Pivot: reorient cube
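The operations above can be sketched over a small in-memory record set. The function names and the sample data are hypothetical, and the helpers are deliberately naive; an OLAP server would answer the same questions from precomputed aggregates.

```python
from collections import defaultdict

# Made-up detail records: (category, region, month, salesperson, amount)
sales = [
    ("beverages", "West", 1, "ann", 120),
    ("beverages", "West", 2, "bob", 80),
    ("beverages", "East", 1, "cy", 50),
    ("snacks", "West", 1, "ann", 30),
    ("snacks", "East", 2, "cy", 70),
]
FIELDS = ("category", "region", "month", "salesperson", "amount")
records = [dict(zip(FIELDS, row)) for row in sales]


def roll_up(records, dims):
    """Summarize: total amount grouped by the given dimensions."""
    totals = defaultdict(int)
    for r in records:
        totals[tuple(r[d] for d in dims)] += r["amount"]
    return dict(totals)


def slice_dice(records, **conditions):
    """Select: keep records matching every dimension=value condition."""
    return [r for r in records
            if all(r[k] == v for k, v in conditions.items())]


def pivot(table):
    """Reorient a two-dimensional summary by swapping its axes."""
    return {(b, a): v for (a, b), v in table.items()}


by_cat_region = roll_up(records, ["category", "region"])
west_beverages = slice_dice(records, category="beverages", region="West")
# Drill down: the same slice, summarized at a finer grain.
drill_down = roll_up(west_beverages, ["salesperson", "month"])
by_region_cat = pivot(by_cat_region)
print(by_cat_region, drill_down, by_region_cat, sep="\n")
```

Drill down here is just a roll up over more dimensions, which is why warehouses keep the detailed data available behind the summaries.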

Page 64: Data Analytics for Customer Facing Applications


Cube Operation

SELECT date, product, customer, SUM (amount)

FROM SALES

CUBE BY date, product, customer

Need to compute the following group-bys (2^3 = 8 in total):

(date, product, customer), (date, product), (date, customer), (product, customer), (date), (product), (customer), and the grand total ()
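A naive way to see what the cube operator produces is to compute one group-by per subset of the grouping attributes. The sketch below does exactly that with made-up rows; the `cube` function is hypothetical, and rescanning the input once per group-by is an assumption made for clarity, since efficient algorithms derive smaller cuboids from larger ones. Standard SQL spells the operator `GROUP BY CUBE (...)` rather than the `CUBE BY` form shown on the slide.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Compute SUM(measure) for every subset of dims (2**len(dims) group-bys)."""
    result = {}
    for k in range(len(dims), -1, -1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in group)] += row[measure]
            result[group] = dict(totals)
    return result


# Made-up SALES rows.
SALES = [
    {"date": "d1", "product": "tv", "customer": "c1", "amount": 10},
    {"date": "d1", "product": "pc", "customer": "c2", "amount": 20},
    {"date": "d2", "product": "tv", "customer": "c1", "amount": 5},
]

result = cube(SALES, ("date", "product", "customer"), "amount")
for group, totals in result.items():
    print(group, totals)
```

The empty group `()` holds the grand total, which is the eighth group-by alongside the seven non-empty attribute subsets.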

Page 65: Data Analytics for Customer Facing Applications

Cuboid Lattice

( all )
(A) (B) (C) (D)
(A,B) (A,C) (A,D) (B,C) (B,D) (C,D)
(A,B,C) (A,B,D) (A,C,D) (B,C,D)
(A,B,C,D)

• Data cube can be viewed as a lattice of cuboids
• The bottom-most cuboid is the base cube.
• The top-most cuboid contains only one cell.

Page 66: Data Analytics for Customer Facing Applications

Cube Computation -- Array Based Algorithm

• A MOLAP approach: the base cuboid is stored as a multidimensional array
• Read in a number of cells to compute partial cuboids

[Figure: three-dimensional array with axes A, B, and C, and the cuboids {ABC}, {AB}, {AC}, {BC}, {A}, {B}, {C}, { }.]

Page 67: Data Analytics for Customer Facing Applications

ROLAP versus MOLAP

ROLAP
• exploits services of relational engine effectively
• provides additional OLAP services
  • design tools for DSS schema
  • performance analysis tool to pick aggregates to materialize
• SQL gets in the way of sequential processing and columnar aggregation
• Some queries are hard to formulate and can often be time consuming to execute

Page 68: Data Analytics for Customer Facing Applications

ROLAP versus MOLAP

MOLAP
• the storage model is an n-dimensional array
• Front-end multidimensional queries map to server capabilities in a straightforward way
• Direct addressing abilities
• Handling sparse data in array representation is expensive
• Poor storage utilization when the data is sparse
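The trade-off between direct addressing and sparse data can be shown with a little arithmetic. In this sketch the cube dimensions and cell values are hypothetical; `offset` gives the row-major position of a cell in a dense array, and the dictionary is one possible sparse alternative, not a representation described in the slides.

```python
def dense_cells(shape):
    """Number of cells a dense n-dimensional array allocates."""
    total = 1
    for size in shape:
        total *= size
    return total


def offset(index, shape):
    """Row-major position of a cell: direct addressing with no search."""
    pos = 0
    for i, size in zip(index, shape):
        pos = pos * size + i
    return pos


# Hypothetical cube: 1,000 products x 365 days x 500 stores.
shape = (1000, 365, 500)

# Sparse alternative: store only the non-empty cells, keyed by coordinates.
sparse = {(0, 0, 0): 12.0, (999, 364, 499): 7.5, (17, 200, 3): 1.0}

utilization = len(sparse) / dense_cells(shape)
last = offset((999, 364, 499), shape)
print(dense_cells(shape), last, utilization)
```

The dense array finds any cell with a handful of multiplications, but it reserves space for every combination of coordinates, most of which never occur when the data is sparse.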

Page 69: Data Analytics for Customer Facing Applications


Data Mining

Page 70: Data Analytics for Customer Facing Applications

What Is Data Mining?

• Data mining (knowledge discovery in databases):
  • Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information from data in large databases
• Alternative names and their “inside stories”:
  • Data mining: a misnomer?
  • Knowledge discovery in databases (KDD: SIGKDD), knowledge extraction, data archeology, data dredging, information harvesting, business intelligence, etc.
• What is not data mining?
  • (Deductive) query processing.
  • Expert systems or small ML/statistical programs

Page 71: Data Analytics for Customer Facing Applications

Examples of Interesting Knowledge

• Association rules
  • 98% of people who purchase diapers also buy beer
• Classification
  • People with age less than 25 and salary > 40k drive sports cars
• Similar time sequences
  • Stocks of companies A and B perform similarly
• Outlier detection
  • Residential customers for telecom company with businesses at home

Page 72: Data Analytics for Customer Facing Applications

Motivation: “Necessity is the Mother of Invention”

• Data explosion problem:
  • Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.
• We are drowning in data, but starving for knowledge!
• Data warehousing and data mining:
  • On-line analytical processing
  • Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases.

Page 73: Data Analytics for Customer Facing Applications

Data Mining and Business Intelligence

[Figure: pyramid with increasing potential to support business decisions toward the top. Layers from top to bottom:]
• Making Decisions
• Data Presentation: Visualization Techniques
• Data Mining: Information Discovery
• Data Exploration: Statistical Analysis, Querying and Reporting
• Data Warehouses / Data Marts: OLAP, MDA
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP

Roles shown alongside the pyramid, top to bottom: End User, Business Analyst, Data Analyst, DBA

Page 74: Data Analytics for Customer Facing Applications


Data Mining: Confluence of Multiple Disciplines

Database systems, data warehouse and OLAP

Statistics

Machine learning

Visualization

Information science

High performance computing

Other disciplines:

Neural networks, mathematical modeling, information retrieval, pattern recognition, etc.

Page 75: Data Analytics for Customer Facing Applications


The Data Mining Process

Page 76: Data Analytics for Customer Facing Applications

Data Mining: A KDD Process

Data mining: the core of knowledge discovery process.

[Figure: KDD pipeline. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation.]

Page 77: Data Analytics for Customer Facing Applications

Steps of a KDD Process

• Learning the application domain:
  • relevant prior knowledge and goals of application
• Creating a target data set: data selection
• Data cleaning and preprocessing (may take 60% of effort!)
• Data reduction and projection:
  • Find useful features, dimensionality/variable reduction, invariant representation.
• Choosing functions of data mining:
  • summarization, classification, regression, association, clustering.
• Choosing the mining algorithm(s)
• Data mining: search for patterns of interest
• Interpretation: analysis of results.
  • visualization, transformation, removing redundant patterns, etc.
• Use of discovered knowledge.

Page 78: Data Analytics for Customer Facing Applications


Data Mining – Some Issues to Consider

Page 79: Data Analytics for Customer Facing Applications

Three Schemes in Classification

• Knowledge to be mined:
  • Summarization (characterization), comparison, association, classification, clustering, trend, deviation and pattern analysis, etc.
  • Mining knowledge at different abstraction levels: primitive level, high level, multiple-level, etc.
• Databases to be mined:
  • Relational, transactional, object-oriented, object-relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, etc.
• Techniques adopted:
  • Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, neural network, etc.

Page 80: Data Analytics for Customer Facing Applications


Data Mining: Classification Schemes

General functionality:Descriptive data mining

Predictive data mining

Different views, different classifications:Kinds of knowledge to be discovered,

Kinds of databases to be mined, and

Kinds of techniques adopted.

Page 81: Data Analytics for Customer Facing Applications

Data Mining Functionality

• Concept description: Characterization and Comparison:
  • Generalize, summarize, and possibly contrast data characteristics, e.g., dry vs. wet regions.
• Association:
  • From association, correlation, to causality.
  • Finding rules like “inside(x, city) → near(x, highway)”.
• Classification and Prediction:
  • Classify data based on the values in a classifying attribute, e.g., classify countries based on climate, or classify cars based on gas mileage.
  • Predict some unknown or missing attribute values based on other information.

Page 82: Data Analytics for Customer Facing Applications

Data Mining Functionality (Cont.)

• Clustering:
  • Group data to form new classes, e.g., cluster houses to find distribution patterns.
• Time-series analysis:
  • Trend and deviation analysis: Find and characterize evolution trend, sequential patterns, similar sequences, and deviation data, e.g., stock analysis.
  • Similarity-based pattern-directed analysis: Find and characterize user-specified patterns in large databases.
  • Cyclicity/periodicity analysis: Find segment-wise or total cycles or periodic behaviours in time-related data.
• Other pattern-directed or statistical analysis

Page 83: Data Analytics for Customer Facing Applications


Data Mining: On What Kind of Data?

Relational databases

Data warehouses

Transactional databases

Advanced DB systems and information repositoriesObject-oriented and object-relational databases

Spatial databases

Time-series data and temporal data

Text databases and multimedia databases

Heterogeneous and legacy databases

WWW

Page 84: Data Analytics for Customer Facing Applications

Are All the “Discovered” Patterns Interesting?

• A data mining system/query may generate thousands of patterns; not all of them are interesting.
  • Suggested approach: Query-based, focused mining
• Interestingness measures: A pattern is interesting if it is
  • easily understood by humans
  • valid on new or test data with some degree of certainty
  • potentially useful
  • novel, or validates some hypothesis that a user seeks to confirm
• Objective vs. subjective interestingness measures:
  • Objective: based on statistics and structures of patterns, e.g., support, confidence, etc.
  • Subjective: based on user’s beliefs in the data, e.g., unexpectedness, novelty, etc.
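Support and confidence, the two objective measures named above, can be computed directly from a set of transactions. The sketch below uses the standard definitions (support as the fraction of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent); the baskets are made up and do not reproduce any figure from the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Made-up market baskets.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
    {"beer", "chips"},
]

s = support(baskets, {"diapers", "beer"})        # 2 of the 5 baskets
c = confidence(baskets, {"diapers"}, {"beer"})   # 2 of the 3 diaper baskets
print(s, c)
```

A mining system typically reports only rules whose support and confidence clear user-chosen thresholds, which is one way of filtering out uninteresting patterns.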

Page 85: Data Analytics for Customer Facing Applications

Can It Find All and Only Interesting Patterns?

• Find all the interesting patterns: Completeness.
  • Can a data mining system find all the interesting patterns?
• Search for only interesting patterns: Optimization.
  • Can a data mining system find only the interesting patterns?
  • Approaches
    • First generate all the patterns and then filter out the uninteresting ones.
    • Generate only the interesting patterns (mining query optimization)

Page 86: Data Analytics for Customer Facing Applications

Requirements and Challenges in Data Mining

• Mining methodology issues
  • Mining different kinds of knowledge in databases.
  • Interactive mining of knowledge at multiple levels of abstraction.
  • Incorporation of background knowledge
  • Data mining query languages and ad-hoc data mining.
  • Expression and visualization of data mining results.
  • Handling noise and incomplete data
  • Pattern evaluation: the interestingness problem.
• Performance issues:
  • Efficiency and scalability of data mining algorithms.
  • Parallel, distributed and incremental mining methods.

Page 87: Data Analytics for Customer Facing Applications

Requirements/Challenges in Data Mining (Cont.)

• Issues relating to the variety of data types:
  • Handling relational and complex types of data
  • Mining information from heterogeneous databases and global information systems.
• Issues related to applications and social impacts:
  • Application of discovered knowledge.
    • Domain-specific data mining tools
    • Intelligent query answering
    • Process control and decision making.
  • Integration of the discovered knowledge with existing knowledge: A knowledge fusion problem.
  • Protection of data security and integrity.

Page 88: Data Analytics for Customer Facing Applications

2/25/2008 88

Amazon.com: Case study in building customer loyalty

Page 89: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 89

The continuing relationship … Amazon.com “Loyalty” model

• Need creation: anticipate / stimulate
• Information search: provide / assist
• Evaluate alternatives: assist / negate
• Purchase transaction: optimise / reward
• Post-purchase experience: add value

Page 90: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 90

Need Creation (attract to website)

Need creation: anticipate / stimulate

Page 91: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 91

Further Need Creation (upon reaching website)

Page 92: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 92

Information Search

Information search: provide / assist

Page 93: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 93

Evaluation of Alternatives

Evaluate alternatives: assist / negate

Page 94: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 94

Purchase Optimisation/Reward

Purchase transaction: optimise / reward

• 1-click purchase
• ‘slippery check-out counter’ vs. ‘sticky aisles’

Page 95: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 95

Post-purchase experience

Post-purchase experience: add value

Page 96: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 96

Account Management

Page 97: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 97

Why is loyalty important?

Amazon’s ‘customer lifetime value’ model (for book buyers)
• Average $50 for first time purchase
• Average $40 per visit thereafter
• Average of one visit per 2 months
• Assume customer will be active for 10 years – not validated yet ☺

‘4 buys and you are hooked’ empirical law
Use Alexa data to bring back ‘prodigal sons’ (and daughters)
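The lifetime value figures above can be worked through with a short calculation. This is only a sketch under stated assumptions: the first purchase is counted as one of the visits, every later visit produces a purchase at the average amount, and no discounting is applied. The function and parameter names are hypothetical, not part of Amazon's model.

```python
def lifetime_value(first_purchase=50.0, repeat_purchase=40.0,
                   months_between_visits=2, active_years=10):
    """Undiscounted revenue from one customer over the assumed active period."""
    total_visits = active_years * 12 // months_between_visits
    # Assumption: the first visit is the $50 purchase; the rest are repeat visits.
    repeat_visits = max(total_visits - 1, 0)
    return first_purchase + repeat_purchase * repeat_visits


ltv = lifetime_value()  # 60 visits in total: $50 + 59 * $40
```

Under these assumptions a single book buyer is worth roughly $2,400 over ten years, which is why losing a customer after the first purchase is costly.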

Page 98: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 98

Build more loyalty faster

[Figure: plot with axes labeled “Loyalty” / LTV and Time]

Page 99: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 99

The ‘Virtuous Cycle’

Purchase response

Customer knowledge

Buying decision/process

Page 100: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 100

Internet Marketing Insight – Jeff Bezos

Role of
• Advertisement – get customer to the store
• Customer experience – get customer to buy

Brick & mortar stores
• Getting customer to store is the hard part
• Shopping cart abandonment is not common, since the overhead of going to another store is very high – especially in Minnesota winters!

Marketing expenses
• 80% for advertisement; 20% for customer experience

The 80-20 rule is reversed for on-line stores – Jeff Bezos

Page 101: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 101

Remarks on Amazon.com
• A very innovative company – the poster child for e-commerce
• Is pushing the envelope in personalization
• Customers love it
• Will it make money – we’re all waiting to see

A company of the future, with a product of the past, in a market of the present

Page 102: Data Analytics for Customer Facing Applications

2/25/2008 102

The Analytics Behind e-Marketing

Page 103: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 103

Web Logs – Record of consumer behavior

looney.cs.umn.edu han - [09/Aug/1996:09:53:52 -0500] "GET mobasher/courses/cs5106/cs5106l1.html HTTP/1.0" 200
mega.cs.umn.edu njain - [09/Aug/1996:09:53:52 -0500] "GET / HTTP/1.0" 200 3291
mega.cs.umn.edu njain - [09/Aug/1996:09:53:53 -0500] "GET /images/backgnds/paper.gif HTTP/1.0" 200 3014
mega.cs.umn.edu njain - [09/Aug/1996:09:54:12 -0500] "GET /cgi-bin/Count.cgi?df=CS home.dat\&dd=C\&ft=1 HTTP
mega.cs.umn.edu njain - [09/Aug/1996:09:54:18 -0500] "GET advisor HTTP/1.0" 302
mega.cs.umn.edu njain - [09/Aug/1996:09:54:19 -0500] "GET advisor/ HTTP/1.0" 200 487
looney.cs.umn.edu han - [09/Aug/1996:09:54:28 -0500] "GET mobasher/courses/cs5106/cs5106l2.html HTTP/1.0" 200

. . . . . . . . .

Access Log Format
IP address | userid | time | method | url | protocol | status | size

mega.cs.umn.edu njain 09/Aug/1996:09:54:31 advisor/csci-faq.html

Other server logs: referrer logs, agent logs
Application server logs: business event logging
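As a rough illustration of how the access log fields listed above can be pulled out of a raw line, here is a minimal parsing sketch. The regular expression is an assumption fitted to the sample lines on this slide (host, userid, a dash, bracketed time, quoted request, status, optional size); real server logs vary, and truncated lines are simply reported as unparseable.

```python
import re
from typing import Optional

# Pattern fitted to the sample lines on the slide; not a general log parser.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<userid>\S+) \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3})(?: (?P<size>\d+))?\s*$'
)


def parse_line(line: str) -> Optional[dict]:
    """Return the access log fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # The size field is missing on some lines (for example redirects).
    record["size"] = int(record["size"]) if record["size"] is not None else None
    return record


with_size = parse_line(
    'mega.cs.umn.edu njain - [09/Aug/1996:09:53:52 -0500] "GET / HTTP/1.0" 200 3291'
)
redirect = parse_line(
    'mega.cs.umn.edu njain - [09/Aug/1996:09:54:18 -0500] "GET advisor HTTP/1.0" 302'
)
```

Grouping the parsed records by host and userid and ordering them by time is the usual first step toward reconstructing visitor sessions.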

Page 104: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 104

Shopping Pipeline Analysis

Overall goal:
• Maximize probability of reaching final state
• Maximize expected sales from each visit

[Figure: state transition diagram with states Enter store, Browse catalog, Select items, Complete purchase; annotated with cross-sell promotions, up-sell promotions, ‘sticky’ states, and a ‘slippery’ state, i.e. 1-click buy]

• Shopping pipeline modeled as state transition diagram
• Sensitivity analysis of state transition probabilities
• Promotion opportunities identified
• E-metrics and ROI used to measure effectiveness
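The pipeline analysis above can be sketched as a small absorbing Markov chain. All transition probabilities below are invented for illustration, and the extra "exit" state, the state names, and the function names are assumptions; the slides do not give the actual model.

```python
import copy


def purchase_probability(transitions, goal="purchase", tol=1e-12, max_iter=10_000):
    """Probability of eventually reaching `goal` from each non-absorbing state.

    `transitions[s]` maps state s to {next_state: probability}. States that
    never appear as keys (here "purchase" and "exit") are treated as absorbing.
    """
    p = {s: 0.0 for s in transitions}
    p[goal] = 1.0
    for _ in range(max_iter):
        delta = 0.0
        for s, row in transitions.items():
            new = sum(prob * p.get(t, 0.0) for t, prob in row.items())
            delta = max(delta, abs(new - p[s]))
            p[s] = new
        if delta < tol:
            break
    return p


def with_shift(transitions, state, gain, loss, amount):
    """Copy of the model with `amount` of probability moved from `loss` to `gain`."""
    shifted = copy.deepcopy(transitions)
    shifted[state][gain] += amount
    shifted[state][loss] -= amount
    return shifted


# Hypothetical transition probabilities; each row sums to 1.
BASE = {
    "enter": {"browse": 0.6, "exit": 0.4},
    "browse": {"browse": 0.4, "select": 0.3, "exit": 0.3},
    "select": {"purchase": 0.5, "browse": 0.2, "exit": 0.3},
}

base = purchase_probability(BASE)
# Sensitivity check: a smoother checkout converts 10 points of exits into purchases.
improved = purchase_probability(with_shift(BASE, "select", "purchase", "exit", 0.1))
```

Comparing `base["enter"]` with `improved["enter"]` shows how much a change at one stage moves the end-to-end conversion rate, which is the kind of question the sensitivity analysis bullet refers to.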

Page 105: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 105

Original Amazon Model for Customer Segmentation

[Figure: customers plotted by number of purchases in past quarter (x-axis, 1 to 7) against dollars spent in past quarter (y-axis, 500 to 1500), divided into regions for light, medium, heavy, and super heavy buyers. Two example points are marked: Customer M (medium) and Customer H (heavy).]

Page 106: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 106

Data Driven Customer Segmentation Model

[Figure: customer space with axes recency, frequency, monetary, tenure]

• modeled customers in a 4-dim space
• used PCA to determine relative weights of each dimension
• Composite Score = w1*recency + w2*frequency + w3*monetary + w4*tenure
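One way the PCA-derived weights and composite score above might be computed is sketched below. This is an assumption-heavy illustration, not the method used in the case study: the weights are taken as the normalized absolute loadings of the first principal component, recency is negated so that larger always means better, each dimension is standardized first, and all but the first two customer rows are invented.

```python
import math

# (days since last purchase, purchases, dollars spent, tenure in months).
# Rows 0 and 1 are Cust M and Cust H from the slides; the rest are made up.
CUSTOMERS = [
    (10, 4, 480, 3),
    (30, 2, 900, 10),
    (5, 6, 1200, 24),
    (60, 1, 100, 2),
    (20, 3, 300, 12),
    (45, 1, 150, 6),
]


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def leading_component(z, iters=500):
    """First principal component of standardized data, by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def composite_scores(rows):
    """Return (weights, scores) with weights w1..w4 summing to 1."""
    # Assumption: negate recency so that a recent purchase scores higher.
    oriented = [(-r, f, m, t) for r, f, m, t in rows]
    z = standardize(oriented)
    loadings = leading_component(z)
    total = sum(abs(x) for x in loadings)
    weights = [abs(x) / total for x in loadings]
    scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
    return weights, scores


weights, scores = composite_scores(CUSTOMERS)
```

The scores here are standardized values centered on zero rather than the percentages shown on the following slide; mapping them to a percentage scale would need a further rescaling step that the slides do not describe.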

Page 107: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 107

Customer Score Interpretation

          Recency   Frequency   Monetary   Tenure      Composite Score
…         …         …           …          …           …
Cust M    10 days   4 times     $480       3 months    80%
…         …         …           …          …           …
Cust H    30 days   2 times     $900       10 months   72%
…         …         …           …          …           …

• Cust M => frequent visitor but low spender
  => potential for acquiring higher wallet share
  => focus on improving relationship

• Cust H => infrequent visitor but heavy spender
  => focus on sustaining relationship

Page 108: Data Analytics for Customer Facing Applications

2/25/2008 108

Yodlee.com: Case study in web business intelligence

Page 109: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 109

Current Situation: Consumer Confusion

“It takes me two hours to get to all my accounts”

“I can’t look at my assets across accounts”

“I can’t remember all my user IDs and passwords”

“I want the web to work for me, not the other way around”

“This is overwhelming……I need some help”

“Make it easier for me!”

Page 110: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 110

Solution – Personal Information Aggregation

Page 111: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 111

Aggregation Service Model

[Figure: layered diagram. Sites: Communication Site (content partner), Finance Site, Travel Site; examples shown are AOL, Citibank, AOLfinance, MyCiti. Aggregation Service Provider capabilities: Content Acquisition; Aggregation, Analysis, Personalization; Applications; Presentation & Interaction. End users: Connected User, Mobile User.]

Page 112: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 112

Business Intelligence Benefits to Corporation

• ‘Tip-of-the-iceberg’ analysis for a brokerage house
• Lifestyle preference analysis of banking customers for a survey
• ‘True-wallet-share’ analysis for a credit card organization
• Dynamic targeting for banner advertisements, e-mail campaigns, etc.

Page 113: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 113

‘Tip-of-the-Iceberg’ Analysis for a Brokerage House

Asset-Based Tiers    Number of Users
< $20K               7579
$20K - $100K         2539
$100K - $500K        1994
$500K - $1M          525
$1M - $5M            547
$5M - $25M           106
> $25M               9

• This brokerage house treated customers with net worth > $1M as ‘high net worth’ (HNW) customers with specialized services
• Almost none of the customers in the green region had > $1M with this brokerage

Page 114: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 114

Household Lifestyle Preference Analysis for a Survey

Financial Preferences

- 53% have at least one online banking account
- 51% have an online credit card account -- higher than Yodlee users as a whole
- 31% also have an E*Trade account, and 11% also have a Schwab account
- Have a preference for FirstUSA over Citibank, the opposite of the preference for users as a whole
- The most popular credit card is American Express

Lifestyle Preferences

- 25% make travel reservations online -- fewer than users as a whole
- Expedia is more popular as an on-line travel site than Travelocity
- 49% have a frequent flier account -- higher than users as a whole
- The favorite frequent flier programs are United, Delta, American, in that order
- Half as many co-brand users shop on eBay as users as a whole

Page 115: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 115

‘True-Wallet-Share’ Analysis for a Credit Card Organization

Analysis of credit card balance habits of user base
• There are 1386 people, each of whom carries a total balance between $1000 and $2000 across all credit cards that (s)he owns
• 292 of these 1386 people own Discover cards, and carry an average balance of $174.55
• 540 of these 1386 people own AmEx cards, with an average balance of $988.97
• 323 of these 1386 people carry one or more Visa cards, with an average Visa network balance of $1018.50

Each card column shows the average balance in dollars, with the number of users holding that card in parentheses.

Range            Total Users  Discover        American Express  Mastercard    Visa            Other             Average Total
< $100           462          4.13 (73)       -467.40 (152)     0             -29.76 (87)     -60.29 (272)      -190.74
$100 - $200      232          -12.61 (39)     120.17 (66)       0             89.95 (40)      167.10 (156)      149.44
$200 - $500      643          36.97 (107)     253.77 (207)      0             218.93 (135)    272.42 (421)      342.99
$500 - $1000     968          75.57 (182)     571.09 (378)      0             597.83 (217)    623.36 (593)      893.47
$1000 - $2000    1386         174.55 (292)    988.97 (540)      837.25 (1)    1018.50 (323)   1078.01 (866)     1471.38
$2000 - $5000    2422         263.27 (432)    2156.30 (1099)    957.69 (1)    2087.75 (601)   2358.22 (1579)    3297.58
$5000 - $10000   1732         620.80 (354)    4091.64 (814)     3648.40 (3)   3976.93 (483)   4966.61 (1200)    7100.20
$10000+          1696         1332.48 (452)   10111.75 (1010)   1921.16 (9)   8934.39 (642)   14649.52 (1341)   22329.56
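At the level of a single user, the idea behind wallet share can be sketched as one issuer's balance divided by the user's total balance across all cards. The user record and function name below are hypothetical, and the table above reports per-tier averages rather than the per-user data this calculation would need.

```python
def wallet_share(balances, issuer):
    """Share of a user's total card balance that sits with `issuer`.

    Returns None when the total is zero or negative (for example a net credit),
    since a ratio is not meaningful in that case.
    """
    total = sum(balances.values())
    if total <= 0:
        return None
    return balances.get(issuer, 0.0) / total


# Hypothetical user in the $2000 - $5000 total balance range.
user = {"Discover": 200.0, "Visa": 800.0, "American Express": 1000.0}
discover_share = wallet_share(user, "Discover")
```

An issuer that only sees its own accounts would observe the $200 balance but not the $2000 total; the aggregated view is what makes the ratio computable.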

Page 116: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 116

Business Implications of True Wallet Share Analysis

• A credit card offeror knows exactly how much money customers holding its cards spend (every month) on its card vs. that on the competition’s cards
• Offeror can target users falling within various segments for specific customer acquisition, retention, etc. purposes
• Detailed profile and history information of these users can be used for precision targeting and customer messaging through various channels including ad serving, e-mail campaigns, promotions, etc.
• If transaction level detail information of these users is analyzed, it can be determined exactly which credit cards are being used by aggregation users as a whole for what kind of lifestyle activity, e.g. travel, entertainment, shopping, groceries, etc.; this can help partner decide which market segments to focus on

Page 117: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 117

Business Implications (contd.)

• The analysis above, if carried out at an individual user level detail, can be used to target individual customers with specific promotions, etc.
• Transaction level detail can be classified into charges to specific organizations, department stores, airlines, etc. This will identify the top organizations that aggregation users spend money at, either on the partner’s card or on a competing network. This would be useful in determining which organizations to partner with for customer retention and acquisition, respectively
• All of these analyses, if performed periodically and tracked over time, can provide valuable insight into the evolving credit balance distribution and usage behavior at the user population or individual user level

Page 118: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 118

Targeted Ad Serving

Page 119: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 119

Targeted Ad Serving (contd.)

Page 120: Data Analytics for Customer Facing Applications

2/25/2008 120

Privacy Issues

Page 121: Data Analytics for Customer Facing Applications

2/25/2008 121

let’s begin with some real examples …

Page 122: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 122

Problem: Shopping for spouse’s anniversary – too much clutter

Page 123: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 123

Solution: Focused and relevant advertisement

Page 124: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 124

Problem: Tired of mistreatment by financial institutions …

• You have tons of money in your investment portfolio
• But you are over-worked and missed a couple of credit card payment deadlines – after all you are busy managing your investment portfolio ☺
• Credit card institution treats you like a deadbeat

Page 125: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 125

Solution
• Why not let the credit card institution know what your investment portfolio balance is? Impress them ☺
• Perhaps even authorize credit card company to transfer funds from your investment account to cover the payment? Or maybe not ☺

Page 126: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 126

So, what’s the catch…

Shopping example
• Allow the vendor to collect detailed information about you and build an accurate profile
• Junk mail is only a nuisance for the receiver, but an expense for the sender! – the sender wants to avoid it more than the receiver!!

Credit card example
• Allow the credit card company and investment company to share your information

Multiple online accounts example
• Hand over your account names and passwords to aggregation service
• Sounds scary – but over 1.5 million people have done this in about 18 months’ time!!

Page 127: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 127

let’s now talk about privacy …

Merriam-Webster definition
• a: the quality or state of being apart from company or observation
• b: freedom from unauthorized intrusion

Justice Louis Brandeis
• “the right to be let alone”

Operational definition
• Collection and analysis of personal data beyond some limit

Page 128: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 128

Public Attitude Towards Privacy

A (self-professed) non-scientific study carried out by a USA Today reporter
• Asked 10 people the following two questions
  - Are you concerned about privacy? 8 said YES
  - If I buy you a Big Mac, can I keep the wrapper (to get fingerprints)? 8 said YES

ACM E-Commerce 2001 paper [Spiekermann et al]
• Most people willing to answer fairly personal questions to anthropomorphic web-bot, even though not relevant to the task at hand
• Different privacy policies had no impact on behavior
• Study carried out in Europe, where privacy consciousness is (presumably) higher

Page 129: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 129

Public Attitude (contd.)

• Amazon.com (and practically every commercial site) uses cookies to identify and track visitors
  - 97.6% of Amazon.com customers accepted cookies
• Airline frequent flier programs with cross promotions
  - We willingly agree to be tracked
  - Get upset if the tracking fails!
• Over 1.5 million people have trusted the aggregation service (called Yodlee) with the names and passwords of their financial accounts in less than 18 months
  - Adoption rate has been over 3 times the most optimistic projections
• Medical data is (perhaps) an exception to this

Page 130: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 130

What people really want

• Some people will not share any kind of private data at any cost – the ‘paranoids’
• Some people will share any data for returns – the ‘Jerry Springerites’
• The vast majority in the middle wants
  - A reasonable level of comfort that private data about them will NOT be misused
  - Tangible and compelling benefits in return for sharing their private data – Big Mac example, frequent flier programs

Page 131: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 131

Remarks on Privacy

Is it ‘much ado about nothing’?
• If data collection were indeed outlawed, and personalization thus made impossible, wouldn’t the public lose – faced with generic, undifferentiated products/services?
• Given the public’s attitude about privacy (as shown in their actions), are privacy advocates barking up the wrong tree?
• Is it just a matter of time, or a generational issue, e.g. adoption of credit cards?

Where do we stand?
• Current position - loss of your privacy may be beneficial for you
• Emerging position (post September 11th) - loss of your privacy will be beneficial for everyone
• Critical emerging debate - is privacy a right or a privilege?

Page 132: Data Analytics for Customer Facing Applications

2/25/2008 © Jaideep Srivastava 132

Concluding Remarks

• Internet is a high bandwidth, low latency, negligible cost, interactive channel to the customer
• Very high adoption rates for this channel
• Processing speeds and storage capacities continuing to increase while costs continue to fall
• Data analytics technology has grown rapidly
• Customer facing applications are ready for a paradigm shift
• Innovative companies have moved ahead
• Privacy is an issue, but not much of a concern