xlminer – a data mining toolkit quantlink solutions pvt. ltd

XLMiner a Data Mining Toolkit QuantLink Solutions Pvt. Ltd. www.quantlink.com www.xlminer.com

Upload: magdalen-ellis

Post on 20-Jan-2016


Page 1: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner – a Data Mining Toolkit

QuantLink Solutions Pvt. Ltd.

www.quantlink.com

www.xlminer.com

Page 2: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner - the Data Mining Toolkit 2

XLMiner – a quick tour

Here is a short demo of XLMiner.

Let us use a simple example: a bank sends mailers to its customers, offering a special deal on Personal Loans. In its previous campaign, it got a positive response from only about 9% of recipients.

Objective: target customers so as to increase the conversion rate.

In other words, the question to address is: what profile indicates a high-potential customer?

Page 3: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner – a quick tour

Past campaign data will be used to train the data mining model.

This is called supervised learning in Data Mining terms.

Let’s see how to build a model and use it for improving the response rate.

Page 4: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner Quick Tour

Data description

Our past campaign data has the following customer attributes:

Customer ID

Customer’s Age

Professional Experience

Family Income

Credit Card average annual spending

Education Level

#appliances owned

Did this customer accept past campaign offer?

The last variable is the known outcome of the past campaign. Our Data Mining model will use this for Supervised Learning.

Page 5: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

A view of the data

This is what the data looks like:

The variable labeled as “PersLoan?” is binary:

0 means the customer was not interested in the Personal Loan.

1 means the customer was interested.

Page 6: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

the Data Mining Process

Partition the data into Training & Validation Partitions

Fit the Model on Training Partition only

Obtain results, see if they look good enough

Check if they are good for Validation data too!

Study the outputs for validation data

Try out several alternative models

Choose and deploy the best model

Page 7: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Start the analysis

Let’s get going with XLMiner.

Notice that XLMiner is as easy to use as Excel!

All we need to do is use the friendly menus. We follow just three simple steps to fit a model and see the outputs!

Page 8: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Step 1: Partition the data

We’ll create two partitions by choosing the records randomly.

The Training Partition will be used for fitting the model.

The Validation Partition will be used for checking if the model gives a good fit for another piece of known data.
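The random split described here can be sketched in a few lines of Python. This is only an illustration, not XLMiner's own procedure: the 60/40 split, the seed and the function name `partition` are assumptions, since the demo says only that records are chosen randomly.

```python
import random

def partition(records, train_fraction=0.6, seed=12345):
    """Randomly split records into (training, validation) lists.

    train_fraction and seed are illustrative assumptions.
    """
    shuffled = list(records)               # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

customer_ids = list(range(1, 11))          # ten made-up customer IDs
training, validation = partition(customer_ids)
print(len(training), len(validation))
```

Every record lands in exactly one partition, so the model is later checked on data it did not see while being fitted.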

Page 9: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Partitioned Data

XLMiner creates a Partition Sheet that shows the data split into two partitions.

Easy Hyperlinks on the Navigator facilitate viewing of either partition.

Page 10: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Step 2: Fitting the Model

This is a “Classification” Problem where we want to predict customers as likely / not likely to take a Personal Loan.

Let’s use one of the available techniques – Classification Tree.

Later we can use other Classification techniques.

We select input (predictor) variables here…

…and the outcome variable here

Page 11: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Step 2: Fitting the Model

The model fit guides us through easy wizard-like steps.

In these steps we choose technique-specific parameters and the output options.

In the end, we click Finish to produce the results.

Page 12: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Step 3: Understanding the Outputs

The friendly Output Navigator lets us go over all the outputs.

The Summaries show us the classification error percentages – i.e. how well the model is predicting.

Other outputs (like the Tree here) will tell us the decision rules that the model is suggesting.

Many other diagnostic outputs are available depending on options we choose.

Page 13: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Output 1: Validation Summary

First, we look at how well the model predicted for the Validation data set.

In the Validation data, where we already knew the outcome, 156 “will buy” were predicted correctly, and 38 wrongly.

1801 “won’t buy” were predicted correctly and merely 5 wrongly.

Here are the corresponding error percentages.

The errors are not very small but could still indicate a workable model.
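The error percentages follow directly from the four counts quoted on this slide. The sketch below recomputes them, assuming that "38 wrongly" means 38 actual buyers classified as non-buyers and "5 wrongly" means 5 actual non-buyers classified as buyers. The function and key names are illustrative.

```python
def error_rates(true_pos, false_neg, true_neg, false_pos):
    """Per-class and overall classification error, as percentages."""
    pos_total = true_pos + false_neg   # all actual "will buy" records
    neg_total = true_neg + false_pos   # all actual "won't buy" records
    return {
        "class_1_error_pct": 100 * false_neg / pos_total,
        "class_0_error_pct": 100 * false_pos / neg_total,
        "overall_error_pct": 100 * (false_neg + false_pos) / (pos_total + neg_total),
    }

# Counts taken from the slide: 156 / 38 for "will buy", 1801 / 5 for "won't buy".
rates = error_rates(true_pos=156, false_neg=38, true_neg=1801, false_pos=5)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Under that reading, about 19.6% of the actual buyers are missed, about 0.3% of the non-buyers are misclassified, and the overall error is 2.15% of the 2000 records.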

Page 14: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Output 2: the decision rule

Here is the Classification Tree that gives Decision Rules that are easy to understand and implement.

Cut-off points for different variables decide whether to go Left or Right.

0: not likely to buy

1: likely to buy
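To give a feel for where a cut-off point comes from, here is a minimal sketch that searches one variable for the split giving the purest left and right groups. The demo does not say which splitting criterion XLMiner uses; Gini impurity is an assumption here, and the income figures are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_cutoff(values, labels):
    """Return (cutoff, weighted_gini) for the best 'value <= cutoff' split."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))            # start from the unsplit impurity
    # Candidate cut-offs are midpoints between neighbouring distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        cutoff = (lo + hi) / 2
        left = [y for x, y in pairs if x <= cutoff]    # goes Left
        right = [y for x, y in pairs if x > cutoff]    # goes Right
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (cutoff, score)
    return best

income = [30, 45, 60, 80, 120, 150]        # made-up family incomes
accepted = [0, 0, 0, 0, 1, 1]              # made-up PersLoan? outcomes
print(best_cutoff(income, accepted))
```

A full tree repeats this search over every variable and then again inside each resulting group.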

Page 15: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

the decision rule in table form

The same decision rule, shown visually as a tree, can be converted into the table below. This is useful for implementing it in your information systems.
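One way such a rule table might be carried into an information system is as a lookup structure walked by a small function. Everything specific below is hypothetical: the variable names, the cut-offs, the node numbering and the "go left when value <= cut-off" convention are placeholders, not the rules XLMiner produced in this demo.

```python
# Hypothetical rule table. Split rows: node -> (variable, cutoff, left, right).
# Terminal rows: node -> ("LEAF", predicted_class).
RULES = {
    0: ("income", 100.0, 1, 2),
    1: ("LEAF", 0),
    2: ("education", 1.5, 3, 4),
    3: ("LEAF", 0),
    4: ("LEAF", 1),
}

def classify(record, rules=RULES, node=0):
    """Walk the rule table from the root until a terminal node is reached."""
    while rules[node][0] != "LEAF":
        variable, cutoff, left, right = rules[node]
        # Assumed convention: values at or below the cut-off go Left.
        node = left if record[variable] <= cutoff else right
    return rules[node][1]

print(classify({"income": 150, "education": 2}))
```

Keeping the rules as data rather than hard-coded conditions means a refitted tree can be deployed by replacing the table alone.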

Page 16: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Output 3: more details

Each technique (Classification Tree in this case) has additional helpful outputs.

The example here shows the “Prune Log” – how the percentage error was reduced by “pruning” the tree.

Page 17: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Output 4: the Lift Chart

“Lift” tells us how much better the model did compared to a random targeting of customers. This is one of the most important outputs.

If customers were targeted randomly, we would expect this outcome. For instance, 1000 mailers would probably yield fewer than 100 customers.

With our Tree model, we get a much superior result. With fewer than 500 mailers sent to high-probability customers, we would get nearly 170 successes!
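The two curves of a lift chart can be sketched as follows: one counts successes when customers are contacted in descending order of predicted probability, the other is the straight line expected from random targeting. The probabilities and outcomes below are made up, and the function names are illustrative. On the slide's own figures, the model's hit rate is roughly one in three against fewer than one in ten for random mailing.

```python
def cumulative_gains(probabilities, actuals):
    """Cumulative successes when contacting records from the highest
    predicted probability downwards."""
    ranked = sorted(zip(probabilities, actuals),
                    key=lambda pair: pair[0], reverse=True)
    gains, running = [], 0
    for _, actual in ranked:
        running += actual
        gains.append(running)
    return gains

def random_baseline(actuals):
    """Expected cumulative successes if records are contacted at random."""
    rate = sum(actuals) / len(actuals)
    return [rate * (i + 1) for i in range(len(actuals))]

# Made-up scored records: predicted probability and actual outcome.
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
actuals = [1, 1, 0, 1, 0, 0, 0, 0]

model_curve = cumulative_gains(probs, actuals)
baseline = random_baseline(actuals)
lift_top_2 = model_curve[1] / baseline[1]   # lift after the first two mailers
print(model_curve, lift_top_2)
```

A lift above 1 at a given mailing depth means the model finds more responders than random selection would at that depth.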

Page 18: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Output 5: the Detailed report

The Validation data is “scored” in detail as shown below. Scoring means using the fitted model to classify each record of the data.

Predicted values can be seen against the actuals here.

Probability of success is computed for each record. This is what helps XLMiner suggest selective records (customers) to target.
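Scoring can be sketched as applying a probability function to every record, classifying by a cut-off, and ranking by probability to pick whom to target. The `toy_model` function is a hypothetical stand-in for a fitted model, and the 0.5 cut-off and the field names are assumptions.

```python
def score(records, predict_probability, cutoff=0.5):
    """Attach a success probability and a 0/1 prediction to each record."""
    scored = []
    for record in records:
        p = predict_probability(record)
        scored.append({**record, "prob_success": p, "predicted": int(p >= cutoff)})
    return scored

def select_targets(scored, top_n):
    """Customer IDs of the top_n records by predicted probability."""
    ranked = sorted(scored, key=lambda r: r["prob_success"], reverse=True)
    return [r["customer_id"] for r in ranked[:top_n]]

def toy_model(record):
    # Hypothetical stand-in for a fitted model: probability grows with income.
    return min(1.0, record["income"] / 200.0)

validation = [
    {"customer_id": 1, "income": 40, "actual": 0},
    {"customer_id": 2, "income": 180, "actual": 1},
    {"customer_id": 3, "income": 120, "actual": 1},
]
scored = score(validation, toy_model)
print(select_targets(scored, top_n=2))
```

Because each scored record keeps its actual outcome alongside the prediction, the two can be compared row by row.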

Page 19: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Try several techniques!

That was just one of the many techniques in XLMiner – Classification Tree.

A typical Data Mining exercise involves several alternative approaches on the same data. This can be either with different techniques, or with different parameters, or both.

Comparing multiple approaches lets us “assess” which model to finally choose for implementation.
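A simple way to compare alternatives is to measure each candidate's error on the same validation records and pick the lowest. The sketch below assumes overall error rate as the yardstick, which is only one possible choice, and all predictions in it are made up.

```python
def error_rate(predicted, actual):
    """Fraction of records whose predicted class differs from the actual one."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)

def choose_best(candidates, actual):
    """candidates maps a model name to its predictions on the validation data."""
    errors = {name: error_rate(pred, actual) for name, pred in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors

# Made-up validation outcomes and made-up predictions from three techniques.
actual = [1, 0, 0, 1, 0, 0, 1, 0]
candidates = {
    "classification_tree": [1, 0, 0, 1, 0, 0, 0, 0],
    "logistic_regression": [1, 0, 1, 0, 0, 0, 0, 0],
    "naive_bayes": [0, 0, 0, 1, 0, 1, 1, 0],
}
best, errors = choose_best(candidates, actual)
print(best, errors)
```

In a mailing campaign the lift at the intended mailing depth may matter more than overall error, so the yardstick should match the business objective.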

Page 20: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Rich repertoire of techniques!

XLMiner supports a comprehensive array of supervised learning procedures:

Multiple Linear Regression

Logistic Regression

Classification & Regression Trees

Neural Networks

k Nearest Neighbors

Naïve Bayes Classifier

Discriminant Analysis

Page 21: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Rich repertoire of techniques!

... and several other features in Unsupervised Learning, Data Reduction and Exploration:

Principal Components Analysis

k-means Clustering

Hierarchical Clustering

Self-organizing Maps (coming soon)

Affinity – Market Basket Analysis

Here are some sample outputs from these methods …

Page 22: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

sample output - Dendrogram

Hierarchical Clustering produces a dendrogram – an excellent visual representation of Cluster formation.

Height of the bars is a measure of dissimilarity in the clusters that are merging into one.

Smaller clusters “agglomerate” into bigger ones, with least possible loss of cohesiveness at each stage.
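The merging process behind a dendrogram can be sketched on one-dimensional data. This example uses single linkage (distance between the closest members of two clusters), which is an assumption: the demo does not say which linkage XLMiner applies, and the spending figures are made up. Each recorded height corresponds to the height of a bar in the dendrogram.

```python
def agglomerate(points):
    """Single-linkage agglomerative clustering of 1-D points.

    Returns the merges in the order made, as (cluster_a, cluster_b, height).
    """
    clusters = [[p] for p in points]       # every point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

spend = [10, 15, 50, 52, 90]               # made-up credit card spend values
merges = agglomerate(spend)
for a, b, height in merges:
    print(a, "+", b, "at height", height)
```

The two most similar clusters merge first, so merge heights never decrease as the clusters grow.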

Page 23: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

sample output – cluster predictions

Cluster Analysis has many powerful uses like Market Segmentation.

We can view each individual record’s predicted cluster membership.

Page 24: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

sample output – BoxPlots

XLMiner supports powerful visualization. The example here shows BoxPlots of two variables.

Cluster 2 clearly shows higher Income & Credit Card spend than Cluster 1. This is an excellent aid to characterizing the clusters.

Page 25: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

sample output – Scatter Plots

Matrix Scatterplots in XLMiner give a visual insight into relationships among variables.

Page 26: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

sample output – Association Rules

For Market Basket Analysis, XLMiner produces easy-to-read Association Rules.

Rules are explained in simple English!

Each rule tells us which offerings will go well together.
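The strength of an association rule is commonly summarised by its support, confidence and lift. The sketch below computes these standard measures for one rule over a handful of made-up baskets; the product names and the function name are illustrative, and the demo does not state which measures XLMiner reports.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= set(t))
    count_c = sum(1 for t in transactions if c <= set(t))
    count_both = sum(1 for t in transactions if (a | c) <= set(t))
    support = count_both / n               # share of baskets containing both sides
    confidence = count_both / count_a      # share of antecedent baskets with consequent
    lift = confidence / (count_c / n)      # confidence relative to consequent's base rate
    return support, confidence, lift

# Made-up baskets of bank offerings held by five customers.
baskets = [
    {"savings", "credit_card"},
    {"savings", "credit_card", "loan"},
    {"savings"},
    {"credit_card", "loan"},
    {"savings", "credit_card"},
]
support, confidence, lift = rule_metrics(baskets, {"savings"}, {"credit_card"})
print(support, confidence, lift)
```

A lift above 1 suggests the two offerings go together more often than chance would predict. The function assumes the antecedent and consequent each appear in at least one basket.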

Page 27: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

… and that’s not all!

XLMiner has handy utilities for Data handling:

Missing data treatment

Transforming categorical data

Binning continuous data

Sampling from Databases

Scoring to Databases
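Three of these utilities can be illustrated with generic techniques: mean imputation for missing values, 0/1 dummy columns for categorical data, and equal-width binning for continuous data. These are common choices and not necessarily the options XLMiner offers; the function names and sample values are assumptions.

```python
def fill_missing(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def dummy_encode(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [int(v == c) for v in values] for c in categories}

def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins             # assumes the values are not all equal
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

print(fill_missing([10, None, 30]))
print(dummy_encode(["grad", "undergrad", "grad"]))
print(equal_width_bins([0, 25, 50, 75, 100], n_bins=4))
```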

Page 28: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner => Versatility!

This was a quick demonstration of just a few things XLMiner can do.

It can do lots more. It is comprehensive in coverage, like the best DM products around.

Get your free download for evaluation at www.xlminer.com

Page 29: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner => Simplicity!

Daryl Pregibon said that Data Mining is “Statistics at Scale and Speed”.

You’ll find that XLMiner is Statistics at Scale, Speed and Simplicity!

If you know how to use Excel, you already know XLMiner. You can get started in minutes.

Page 30: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

XLMiner => Great Value!

Several comprehensive DM products are many times more expensive.

For exploring how Data Mining will work for you, XLMiner provides a great start!

Page 31: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

What others say …

The American Statistician reviewed XLMiner along with other reputed products in its November 2003 issue.

This is what it had to say:

“An easy to use… an excellent, inexpensive add-on that greatly expands the capabilities of Excel.”

“XLMiner’s documentation is remarkably good…”

Page 32: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

More Resources

For your initiation into Data Mining:

Free evaluation download

Online Courses at www.statistics.com

Case Book in the making

Technical references on product website

Page 33: XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd

Thank you for viewing this Demo!

www.xlminer.com

XLMiner - the Data Mining Toolset for the Managers of Tomorrow