Data Mining and Knowledge Discovery for Strategic Business Optimization Peter van der Putten ALP Group, LIACS & KiQ Ltd November 2004


Page 1:

Data Mining and Knowledge Discovery

for Strategic Business Optimization

Peter van der Putten

ALP Group, LIACS & KiQ Ltd

November 2004

Page 2:

Why is a business in business?

• Successful businesses create a lot of added value for their customers and capture it
– Maximize long-term profit

• Optimize: Maximize sales, minimize costs, minimize risk

Page 3:

Challenges

• Businesses are bigger
• Fragmentation of products, customer interaction channels, market segments
• Fierce competition, chaotic economic climate and dynamic customer behavior
• Data glut & information overflow

• Solution: data mining & knowledge discovery for strategic business optimization

Page 4:

Credit scoring case: minimizing loan risk while maximizing loan acceptance

[Figure: accepted volume out of all applications, under two approaches.]

Expert knowledge: 29.8% accepted, 12.7% infection

Prediction model plus rules: 34.5% accepted, 9.1% infection

Page 5:

Marketing case: maximizing direct mail response while minimizing cost

[Figure: cumulative gains chart for the Logistic-Regression model. Horizontal axis: Cases (%), 0 to 100. Vertical axis: Cum. positive (%), 0.00 to 100.00.]

A model was created that predicts the probability of responding to a mailing. By using the model to select customers to mail, we could reach 50% of the responders by mailing only 20% of all customers.
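
A figure such as "50% of the responders from 20% of the mailings" is one point on a cumulative gains curve. The following is a minimal Python sketch of how such a point could be computed from model scores. The function name `cumulative_gain` and the toy scores and responses are illustrative assumptions, not data from the case.

```python
def cumulative_gain(scores, responded, fraction):
    """Share of all responders reached when mailing the top `fraction` of
    customers, ranked by predicted response score (highest first)."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n_mailed = round(fraction * len(ranked))
    reached = sum(r for _, r in ranked[:n_mailed])
    total = sum(responded)
    return reached / total if total else 0.0


# Toy data (assumed): 10 customers, 1 = responded to the mailing.
scores    = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
responded = [1,    1,    0,    1,    0,    0,    1,    0,    0,    0]

gain_at_20 = cumulative_gain(scores, responded, 0.20)  # top 2 of 10 customers
```

With this toy data, the two highest-scoring customers are both responders, so mailing the top 20% reaches 2 of the 4 responders.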

Page 6:

Siebel

OMEGA predicts a slight preference for general insurance and offers a one-click cross-sell button.

Although the next customer might have preferences as well, the exit risk is overriding. Using a combination of predictive models and business rules, OMEGA suggests to Siebel an immediate attempt to retain the customer.

OMEGA offers Siebel the appropriate text for its script engine.

Within general insurance, OMEGA predicts a preference for car insurance and offers one-click access to the appropriate script.

OMEGA again offers Siebel the appropriate text to execute a retention script.
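
How model scores and a business rule might be combined into one recommendation can be sketched as follows. This is a hypothetical illustration, not the actual logic of OMEGA or Siebel: the function name, the score inputs and the 0.5 exit-risk threshold are all assumptions.

```python
def recommend_action(exit_risk, product_preferences, exit_risk_threshold=0.5):
    """Return an (action, detail) pair for the call-centre agent.

    exit_risk: predicted probability that the customer leaves (assumed model output).
    product_preferences: dict mapping product name to predicted preference score.
    exit_risk_threshold: assumed business-rule cut-off above which retention
        overrides any cross-sell suggestion.
    """
    # Business rule: a high exit risk overrides product preferences.
    if exit_risk >= exit_risk_threshold:
        return ("retain", "retention script")
    # Otherwise offer the product with the highest predicted preference.
    if product_preferences:
        best = max(product_preferences, key=product_preferences.get)
        return ("cross-sell", best)
    return ("none", "")


first = recommend_action(0.1, {"general insurance": 0.55, "life insurance": 0.45})
second = recommend_action(0.8, {"general insurance": 0.70})
```

The first call yields a cross-sell suggestion because the exit risk is low; in the second, the retention rule takes precedence even though a product preference exists.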

Page 7:

Overview

• Why Data Mining?
• The Data Mining Process
• Data Mining Tasks
• Data Mining Techniques
• Future Outlook
• Data Mining Opportunities by Sector and Function
• Q&A

Page 8:

Some working definitions…

• ‘Data Mining’ and ‘Knowledge Discovery in Databases’ (KDD) are used interchangeably
• Data mining =
– the discovery of interesting, meaningful and actionable patterns hidden in large amounts of data
• Multidisciplinary field originating from artificial intelligence, pattern recognition, statistics, machine learning, econometrics, …

Page 9:

Data mining is a process…

• Model Development
– Objective
– Data collection & preparation
– Model construction
– Model evaluation
– Combining models with business knowledge into decision logic
• Model / decision logic deployment
• Model / decision logic monitoring

Page 10:

Data mining tasks

• Undirected, explorative, descriptive, ‘unsupervised’ data mining
– Matching & search
– Profile & rule extraction
– Clustering & segmentation
• Directed, predictive, ‘supervised’ data mining
– Predictive modeling

Page 11:

Data mining task example: Clustering & segmentation

Page 12:

Data mining task example: Clustering & segmentation

Page 13:

Start Looking Glass

Page 14:

Intermediate result, Looking Glass

Page 15:

Result, Looking Glass

Page 16:

Result, Looking Glass

Page 17:

Data mining task example: predictive modeling

[Figure: past experience (data plus known good/bad behaviour) is used to build a model. The model places each new case on a score scale running from 10 to 1, with ends labelled "Worse business" and "Better business". Example: Case A scores 7, Case B scores 4.]

Page 18:

Data mining task example: predictive modeling

Collected data

Income  Age  Children
60K     38   2
30K     23   1
30K     29   0
...     ...  ...
120K    55   2

Page 19:

Data mining task example: predictive modeling

Known customer behaviour

Income  Age  Children  Status
60K     38   2         Good
30K     23   1         Good
30K     29   0         Bad
...     ...  ...       ...
120K    55   2         Bad

Page 20:

Data mining task example: predictive modeling

score = (0 x Income) + (-1 x Age) + (25 x Children)

Income  Age  Children  Status  Value  Score
60K     38   2         Good    100    12
30K     23   1         Good    45     2
30K     29   0         Bad     -80    -29
...     ...  ...       ...     ...    ...
120K    55   2         Bad     -40    -5
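
The scoring formula on this slide can be applied mechanically to each customer row. A minimal Python sketch follows; reading "60K" as 60,000 is an assumption (it does not affect the result, since the income weight is 0), and the function name `score` is chosen for illustration.

```python
def score(income, age, children):
    """Linear score from the slide: (0 x Income) + (-1 x Age) + (25 x Children)."""
    return 0 * income + (-1) * age + 25 * children


# Rows shown on the slide as (income, age, children); "K" read as thousands (assumed).
customers = [
    (60_000, 38, 2),
    (30_000, 23, 1),
    (30_000, 29, 0),
    (120_000, 55, 2),
]
scores = [score(*row) for row in customers]
```

Because the income weight is 0, only age and number of children move the score. For the first row this gives -38 + 50 = 12.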

Page 21:

Data mining task example: predictive modeling

• Recruitment
– Who will respond to a mailing campaign?
– To whom can we cross-sell which products?
– What will be the customer value one year from now?
• Retention
– Who is going to cancel his/her mobile phone subscription? Should I attempt to keep this customer?
– Which customers have accounts that will go dormant?
• Risk
– Should I sell a loan to this person?
– How much money will someone claim on a policy?
– Is this caller going to pay his bills?

Page 22:

Data mining techniques for predictive modeling

• Linear and logistic regression
• Decision trees
• Neural Networks
• Genetic Algorithms
• …

Page 23:

Linear Regression Models

score = (0 x Income) + (-1 x Age) + (25 x Children)

Page 24:

Regression in pattern space

[Figure: pattern space with axes age and income, showing two classes of points, ‘circle’ and ‘square’.]

Only a single line available in pattern space to separate classes

Page 25:

Decision Trees

20000 customers, response 1%
Income > 150000?
    18800 customers: Purchases > 10?
        etc.
    1200 customers: balance > 50000?
        800 customers, response 1.8%
        400 customers, response 0.1%

(Branches are labelled yes / no.)
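
A decision tree of this kind amounts to a set of nested questions. The Python sketch below routes a customer through the first two levels of the example tree. The direction of each yes/no branch cannot be recovered from this transcript, so the directions used here (income above 150000 leading to the balance question, otherwise to the purchases question) are assumptions, as are the function name and the segment labels.

```python
def segment(income, purchases, balance):
    """Assign a customer to a leaf of a two-level tree.

    The questions come from the slide; which answer leads to which
    sub-group is assumed for illustration.
    """
    if income > 150_000:
        # Assumed 'yes' branch: the smaller group, split on balance.
        if balance > 50_000:
            return "high income, high balance"
        return "high income, lower balance"
    # Assumed 'no' branch: the larger group, split on number of purchases.
    if purchases > 10:
        return "lower income, frequent buyer"
    return "lower income, infrequent buyer"


example = segment(income=200_000, purchases=3, balance=20_000)
```

Each leaf would then carry its own observed response rate, which is what makes the tree usable for selecting customers.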

Page 26:

Decision Trees in Pattern Space

[Figure: pattern space with axes age and income.]

Line pieces perpendicular to axes

Each line is a split in the tree, two answers to a question

Page 27:

Infotrees (Genetic Programming)

• Nested regression formulas
– sum(average(region, spend), max(age, children))

[Figure: the formula drawn as a tree. Root: sum. Its branches: average (with leaves region and spend) and max (with leaves age and children).]
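
A nested formula such as sum(average(region, spend), max(age, children)) can be represented as a tree and evaluated recursively. The sketch below is one possible encoding in Python, not the representation used by the original software: the tuple layout, the reading of `average` as the arithmetic mean, and the numeric coding of `region` in the sample record are all assumptions.

```python
# Operators used in the example formula. Treating `average` as the
# arithmetic mean is an assumption.
OPERATORS = {
    "sum": lambda *args: sum(args),
    "average": lambda *args: sum(args) / len(args),
    "max": lambda *args: max(args),
}


def evaluate(node, record):
    """Evaluate a nested formula against one customer record.

    A node is either a predictor name (str) or a tuple
    (operator, child, child, ...).
    """
    if isinstance(node, str):
        return record[node]
    operator, *children = node
    return OPERATORS[operator](*(evaluate(child, record) for child in children))


# sum(average(region, spend), max(age, children))
infotree = ("sum", ("average", "region", "spend"), ("max", "age", "children"))

# Hypothetical record; the numeric coding of `region` is assumed.
record = {"region": 2, "spend": 10, "age": 38, "children": 2}
value = evaluate(infotree, record)
```

For this record, average(2, 10) is 6 and max(38, 2) is 38, so the formula evaluates to 44.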

Page 28:

Infotrees in Pattern Space

[Figure: pattern space with axes age and income.]

Infotrees can separate any class in pattern space, even if the class boundary is non-linear

Can model complex customer behavior

Page 29:

Genetic Algorithms / Programming

• How to find the best Infotree? Genetic algorithms
– Based on the idea of evolution
– Start with (random) Infotrees
– Build a new generation
    • Fittest models can reproduce to create offspring, worst models die
    • Small amount of mutation occurs to keep exploring
– Repeat process
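
The evolutionary loop described on this slide can be sketched in a few lines of Python. To keep the example short, it evolves plain vectors of linear weights rather than Infotrees, so it illustrates the loop only. The function name `evolve`, all parameter values and the toy fitness function are assumptions.

```python
import random


def evolve(fitness, n_weights, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Minimal genetic algorithm over weight vectors (higher fitness is better)."""
    rng = random.Random(seed)
    # Start with random models.
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank models; the fittest half survives, the rest die.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = []
        while len(survivors) + len(offspring) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Cross-over: each weight is inherited from one of the two parents.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: occasionally nudge a weight to keep exploring.
            child = [w + rng.gauss(0, 0.1) if rng.random() < mutation_rate else w
                     for w in child]
            offspring.append(child)
        population = survivors + offspring
    return max(population, key=fitness)


# Toy fitness (assumed): closeness to an arbitrary target weight vector.
TARGET = [0.0, -0.5, 0.25]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


best = evolve(toy_fitness, n_weights=3)
```

Because the surviving half is carried over unchanged, the best model in the population can never get worse from one generation to the next.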

Page 30:

Notes about Infotree models: Cross-over

• New models can be created by cross-over:
– part of one model is swapped with part of another
– parts may be chosen randomly or intelligently

[Figure: two old models and the new model built from them, with the cross-over point marked in each old model. Node labels include convex, concave, invert, amean, quadv, s1, children, age, salary, region and spend.]
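
Cross-over on tree-shaped models can be sketched as picking a point in each parent and grafting the chosen part of one parent into the other. The Python sketch below uses nested tuples of the form (operator, child, child, ...) with predictor names as leaves. This encoding, the function names and the two example parents are assumptions for illustration, and the points are chosen purely at random.

```python
import random


def subtree_paths(tree, path=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        # tree[0] is the operator name; children start at index 1.
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))


def get_subtree(tree, path):
    """Return the node found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def crossover(parent_a, parent_b, rng):
    """Build a new model: parent_a with one part replaced by a part of parent_b."""
    point_a = rng.choice(list(subtree_paths(parent_a)))
    point_b = rng.choice(list(subtree_paths(parent_b)))
    return replace_subtree(parent_a, point_a, get_subtree(parent_b, point_b))


# Hypothetical parents.
old_a = ("sum", ("average", "region", "spend"), ("max", "age", "children"))
old_b = ("max", "salary", ("sum", "age", "spend"))

new_model = crossover(old_a, old_b, random.Random(0))
# A fixed cross-over, for comparison: swap in `salary` for old_a's second branch.
fixed = replace_subtree(old_a, (2,), get_subtree(old_b, (1,)))
```

Choosing the points "intelligently", as the slide suggests, would mean replacing the two `rng.choice` calls with a selection rule of some kind.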

Page 31:

Notes about Infotree models: Mutation

• New models can be created by mutation:
– part of a model (a sub-tree, operator or predictor) is changed
– part and type of change may be chosen randomly or intelligently

[Figure: three pairs of models illustrating mutation of a sub-tree, an operator and a predictor. Node labels include convex, concave, amean, s2, children, age, region, spend, house and TV Region.]
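
Operator and predictor mutation can be sketched on a tree encoded as nested tuples of the form (operator, child, child, ...), with predictor names as leaves. In the Python sketch below, the encoding, the operator and predictor sets and the mutation rate are assumptions; sub-tree mutation, the third kind shown on the slide, is left out to keep the example short.

```python
import random

OPERATORS = ["sum", "average", "max"]                          # assumed operator set
PREDICTORS = ["age", "children", "region", "spend", "house"]   # assumed predictor set


def mutate(tree, rng, rate=0.2):
    """Return a copy of `tree` in which each node is changed with probability `rate`.

    Operators are swapped for a different operator and predictors for a
    different predictor, so the tree keeps its shape.
    """
    if isinstance(tree, str):
        # Leaf: a predictor.
        if rng.random() < rate:
            return rng.choice([p for p in PREDICTORS if p != tree])
        return tree
    operator, *children = tree
    if rng.random() < rate:
        operator = rng.choice([o for o in OPERATORS if o != operator])
    return (operator, *(mutate(child, rng, rate) for child in children))


original = ("sum", ("average", "region", "spend"), ("max", "age", "children"))
mutant = mutate(original, random.Random(0))
```

A low rate keeps most of the model intact, which matches the slide's point that only a small amount of change is needed to keep exploring.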

Page 32:

Short Demo (if time allows…)

Model to predict caravan policy ownership

Combining this model with other models and business rules

Page 33:

Data Mining: the Future

• Business (marketing)
– More fine-grained segmentation down to the cluster or individual level
– More personalised actions, inbound and outbound, in all customer contact channels
– Optimization of both value for the business and the customer
– Privacy
• Technical
– From Data Mining to Decisioning, combining multiple models with business rules
– Monitoring business and model performance
– Data Mining Process Automation

Page 34:

Let’s discuss: Data Mining Opportunities by Function

• Marketing, Sales, CRM
• Product Development, R&D
• Manufacturing, Production, Logistics
• Customer service
• Finance
• Procurement
• Human Resources
• IT
• …

Page 35:

Let’s discuss: Data Mining Opportunities by Sector

• Retail
• Telco
• Pharma
• Government
• Automotive
• Oil
• Charity
• Consumers / Citizens
• …

Page 36:

The Paper: Requirements

• 2500 words ±10%, APA style references
• No plagiarism / copying! Rephrase in your own words, reference, cite & quote
• Two parts of 1250 words each
– Your grasp of the research topic: what is data mining? Own interpretation, clear, put into context
– Memo to CEO/CIO of a specific company / industry: what are the benefits/changes/opportunities and next steps (best practice, proof of concept)? Impact, convincing, plan to action.

Page 37:

The Paper: Suggestions

• Suggestions for ‘companies’
– KPN Mobile, Marketing: how to reduce loss of customers to competitors
– Dutch Police, Strategic Innovation: opportunities for law enforcement, privacy implications
– Pfizer, Drug Discovery: using data mining to find new drugs
– Google, Product Management / R&D: opportunities for new data mining features to enhance customer experience
– Your Idea!

Page 38:

The Paper: Resources

• Webpage for this talk:
– http://www.liacs.nl/~putten/ictvision.html
• General Writing Resources:
– http://www.liacs.nl/~putten/writingpapers.html
• Homepage:
– www.liacs.nl/~putten, mail [email protected]

Page 39:

Dilbert’s Perspective on Data Mining