XLMiner – a Data Mining Toolkit
QuantLink Solutions Pvt. Ltd.
www.quantlink.com
www.xlminer.com
XLMiner - the Data Mining Toolkit 2
XLMiner – a quick tour
Here is a short demo of XLMiner.
Let us use a simple example: a bank sends mailers to its customers, offering a special deal on Personal Loans. In its previous campaign, it got only about a 9% positive response.
Objective: How to target customers for increased conversion rate.
In other words, the question to address is:
what profile indicates a high-potential customer?
XLMiner – a quick tour
Past campaign data will be used to train the data mining model.
This is called supervised learning in Data Mining terms.
Let’s see how to build a model and use it for improving the response rate.
XLMiner Quick Tour
Data description
Our past campaign data has the following customer attributes:
Customer ID
Customer’s Age
Professional Experience
Family Income
Credit Card average annual spending
Education Level
#appliances owned
Did this customer accept past campaign offer?
The last variable is the known outcome of the past campaign. Our Data Mining model will use this for Supervised Learning.
A view of the data
This is what the data looks like:
The variable labeled as “PersLoan?” is binary:
0 means the customer was not interested in the Personal Loan.
1 means the customer was interested.
the Data Mining Process
Partition the data into Training & Validation Partitions
Fit the Model on Training Partition only
Obtain results, see if they look good enough
Check if they are good for Validation data too!
Study the outputs for validation data
Try out several alternative models
Choose and deploy the best model
Start the analysis
Let’s get going with XLMiner.
Notice that XLMiner is as easy to use as Excel!
All we need to do is use the friendly menus. We follow just three simple steps to fit a model and see the outputs!
Step 1: Partition the data
We’ll create two partitions by choosing the records randomly.
The Training Partition will be used for fitting the model.
The Validation Partition will be used for checking if the model gives a good fit for another piece of known data.
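Outside XLMiner, the random partitioning described above can be sketched in a few lines of Python. This is only an illustration: the 60/40 split, the fixed seed, the `partition` helper and the stand-in customer IDs are assumptions, not settings taken from the demo.

```python
import random


def partition(records, train_fraction=0.6, seed=12345):
    """Randomly split records into (training, validation) lists."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(records)       # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical customer IDs standing in for the past campaign records.
customer_ids = list(range(1, 101))
training, validation = partition(customer_ids)
print(len(training), "training records,", len(validation), "validation records")
```

Every record lands in exactly one partition, so the model never sees the validation records while it is being fitted.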
Partitioned Data
XLMiner creates a Partition Sheet that shows the data split into two partitions.
Easy hyperlinks on the Navigator facilitate viewing of either partition.
Step 2: Fitting the Model
This is a “Classification” problem where we want to predict customers as likely / not likely to take a Personal Loan.
Let’s use one of the available techniques – Classification Tree. Later we can use other Classification techniques.
We select input (predictor) variables here…
…and the outcome variable here
Step 2: Fitting the Model
The model fit guides us through easy wizard-like steps.
In these steps we choose technique-specific parameters and the output options.
In the end, we click Finish to produce the results.
Step 3: Understanding the Outputs
The friendly Output Navigator lets us go over all the outputs.
The Summaries show us the classification error percentages, i.e. how well the model is predicting.
Other outputs (like the Tree here) will tell us the decision rules that the model is suggesting.
Many other diagnostic outputs are available depending on the options we choose.
Output 1: Validation Summary
First, we look at how well the model predicted for the Validation data set.
In the Validation data, where we already knew the outcome, 156 “will buy” were predicted correctly, and 38 wrongly.
1801 “won’t buy” were predicted correctly and merely 5 wrongly.
Here are the corresponding error percentages.
The errors are not very small but could still indicate a workable model.
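The error percentages follow directly from the four counts quoted above. The Python sketch below recomputes them from those counts; the variable names are illustrative, and the figures printed are derived here rather than copied from the XLMiner output sheet.

```python
# Counts quoted in the summary above.
buy_right, buy_wrong = 156, 38          # actual "will buy" records
no_buy_right, no_buy_wrong = 1801, 5    # actual "won't buy" records


def error_pct(wrong, total):
    """Percentage of records in a group that were misclassified."""
    return 100.0 * wrong / total


buy_total = buy_right + buy_wrong              # 194 records
no_buy_total = no_buy_right + no_buy_wrong     # 1806 records
all_total = buy_total + no_buy_total           # 2000 records

buy_error = error_pct(buy_wrong, buy_total)
no_buy_error = error_pct(no_buy_wrong, no_buy_total)
overall_error = error_pct(buy_wrong + no_buy_wrong, all_total)

print(f"'will buy' error:  {buy_error:.2f}%")
print(f"'won't buy' error: {no_buy_error:.2f}%")
print(f"overall error:     {overall_error:.2f}%")
```

The overall error is small mainly because “won’t buy” records dominate the data; the error on the “will buy” group is the figure that matters for targeting.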
Output 2: the decision rule
Here is the Classification Tree that gives the easy-to-understand and easy-to-implement Decision Rules.
Cut-off points for different variables decide whether to go Left or Right.
0: not likely to buy
1: likely to buy
the decision rule in table form
The same decision rule, as shown visually, can be converted into the table below. This is useful for implementing it in your information systems.
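One way such a rule table might be carried into an information system is as plain data plus a small lookup function. The Python sketch below assumes a made-up table: the attribute names `income` and `education` and every cut-off value are hypothetical placeholders, not the values from the tree in this demo.

```python
# Each row: (list of (attribute, operator, cut-off) conditions, predicted class).
# All attribute names and cut-offs are hypothetical placeholders.
RULES = [
    ([("income", "<=", 100.0)], 0),
    ([("income", ">", 100.0), ("education", "<=", 1.5)], 0),
    ([("income", ">", 100.0), ("education", ">", 1.5)], 1),
]


def matches(record, conditions):
    """True if the record satisfies every condition in one table row."""
    for attribute, op, cutoff in conditions:
        value = record[attribute]
        if op == "<=" and not value <= cutoff:
            return False
        if op == ">" and not value > cutoff:
            return False
    return True


def classify(record):
    """Return 0 (not likely to buy) or 1 (likely to buy) for one customer."""
    for conditions, predicted in RULES:
        if matches(record, conditions):
            return predicted
    raise ValueError("no rule matched this record")


print(classify({"income": 150.0, "education": 2}))
```

Keeping the rules as data means the table can be replaced when the model is refitted, without touching the lookup code.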
Output 3: more details
Each technique (Classification Tree in this case) has additional helpful outputs.
The example here shows the “Prune Log” – how the percentage error reduced by “pruning” the tree.
Output 4: the Lift Chart
“Lift” tells us how much better the model did compared to a random targeting of customers. This is one of the most important outputs.
If customers were targeted randomly, we would expect this outcome. For instance, 1000 mailers would probably yield fewer than 100 customers.
With our Tree model, we get a much superior result. In fewer than 500 mailers sent to high probability customers, we would get nearly 170 successes!
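Taking the round figures quoted above at face value, about 170 successes in 500 mailers against at most 100 in 1000 corresponds to a lift of roughly 3.4 or more. The Python sketch below shows how such a figure can be computed from scored records. The function names and the ten-record data set are made up for illustration; this is not XLMiner's own calculation.

```python
def cumulative_gains(scored):
    """scored: list of (probability, actual) pairs with actual in {0, 1}.

    Returns the running count of successes after contacting the
    top 1, 2, ... records, ranked by predicted probability.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    gains, running = [], 0
    for _, actual in ranked:
        running += actual
        gains.append(running)
    return gains


def lift_at(scored, n):
    """Successes in the top n records divided by the number expected
    from contacting n records at random."""
    gains = cumulative_gains(scored)
    base_rate = gains[-1] / len(scored)
    return gains[n - 1] / (n * base_rate)


# Made-up example: 10 scored records, 3 actual responders.
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
          (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0), (0.05, 0)]

print(cumulative_gains(scored))
print(round(lift_at(scored, 2), 2))
```

A lift of 1 means the model does no better than random targeting; contacting every record always gives a lift of exactly 1.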
Output 5: the Detailed report
The Validation data is “scored” in detail as shown below. Scoring means using the fitted model to classify each record of the data.
Predicted values can be seen against the actuals here.
Probability of success is computed for each record. This is what helps XLMiner suggest selective records (customers) to target.
Try several techniques!
That was just one of the many techniques in XLMiner – Classification Tree.
A typical Data Mining exercise involves several alternative approaches on the same data. This can be either with different techniques, or with different parameters, or both.
Comparing multiple approaches lets us “assess” which model to finally choose for implementation.
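The comparison step can be pictured as scoring every candidate on the same validation records and keeping the one with the lowest error. In the Python sketch below, the two candidate models are deliberately trivial income cut-offs and the validation records are made up; both are hypothetical stand-ins for real fitted models and real data.

```python
def error_rate(model, records):
    """Fraction of (features, actual) records the model misclassifies."""
    wrong = sum(1 for features, actual in records if model(features) != actual)
    return wrong / len(records)


# Two hypothetical candidates: simple income cut-offs.
def model_a(features):
    return 1 if features["income"] > 100 else 0


def model_b(features):
    return 1 if features["income"] > 150 else 0


candidates = {"income > 100": model_a, "income > 150": model_b}

# Made-up validation records: (features, actual outcome).
validation = [
    ({"income": 60}, 0), ({"income": 90}, 0), ({"income": 110}, 1),
    ({"income": 120}, 1), ({"income": 130}, 0), ({"income": 160}, 1),
    ({"income": 200}, 1),
]

errors = {name: error_rate(model, validation) for name, model in candidates.items()}
best_name = min(errors, key=errors.get)
print(errors)
print("chosen model:", best_name)
```

The same loop works unchanged whether the candidates differ by technique, by parameters, or by both.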
Rich repertoire of techniques!
XLMiner supports a comprehensive array of supervised learning procedures:
Multiple Linear Regression
Logistic Regression
Classification & Regression Trees
Neural Networks
k Nearest Neighbors
Naïve Bayes Classifier
Discriminant Analysis
Rich repertoire of techniques!
... and several other features in Unsupervised Learning, Data Reduction and Exploration:
Principal Components Analysis
k-means Clustering
Hierarchical Clustering
Self-organizing Maps (coming soon)
Affinity – Market Basket Analysis
Here are some sample outputs from these methods …
sample output - Dendrogram
Hierarchical Clustering produces a dendrogram – an excellent visual representation of Cluster formation.
Height of the bars is a measure of dissimilarity in the clusters that are merging into one.
Smaller clusters “agglomerate” into bigger ones, with least possible loss of cohesiveness at each stage.
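To make the agglomeration idea concrete, here is a small Python sketch that merges one-dimensional points bottom-up and records the height of each merge, which is what a dendrogram plots. It uses single linkage (distance between the closest members) purely as an assumption for illustration; the demo does not say which linkage XLMiner uses, and the data points are made up.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points using single linkage.

    Returns the merges in the order they happen, each as
    (cluster_a, cluster_b, height), where height is the distance
    between the two clusters at the moment they merge.
    """
    clusters = [(p,) for p in points]   # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Made-up data: two tight groups and one outlier.
merges = single_linkage([1.0, 2.0, 6.0, 7.0, 20.0])
heights = [height for _, _, height in merges]
for a, b, height in merges:
    print(a, "+", b, "at height", height)
```

The merge heights never decrease here, which is why the bars in a dendrogram grow taller towards the top: the last clusters to join are the least similar.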
sample output – cluster predictions
Cluster Analysis has many powerful uses like Market Segmentation.
We can view each individual record’s predicted cluster membership.
sample output – BoxPlots
XLMiner supports powerful visualization. The example here shows BoxPlots of two variables.
Cluster 2 clearly shows higher Income & Credit Card spend than Cluster 1. This is an excellent aid to characterizing the clusters.
sample output – Scatter Plots
Matrix Scatterplots in XLMiner give a visual insight into the relationships among variables.
sample output – Association Rules
For Market Basket Analysis XLMiner produces easy-to-read Association Rules
Rules are explained in simple English!
Each rule tells us which offerings will go well together.
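Association rules are usually judged by two standard measures: support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The Python sketch below computes both for a made-up set of shopping baskets; the items and the `support` and `confidence` helpers are illustrative and do not reproduce XLMiner's output format.

```python
# Made-up shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"butter"},
    {"bread", "butter", "milk"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that
    also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(f"bread -> butter: support {rule_support:.2f}, confidence {rule_confidence:.2f}")
```

In plain English, this rule reads: of the customers who bought bread, three in four also bought butter.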
… and that’s not all!
XLMiner has handy utilities for Data handling:
Missing data treatment
Transforming categorical data
Binning continuous data
Sampling from Databases
Scoring to Databases
XLMiner => Versatility!
This was a quick demonstration of just a few things XLMiner can do.
It can do lots more. It is comprehensive in coverage, like the best DM products around.
Get your free download for evaluation at www.xlminer.com
XLMiner => Simplicity!
Daryl Pregibon said that Data Mining is “Statistics at Scale and Speed”.
You’ll find that XLMiner is Statistics at Scale, Speed and Simplicity!
If you know how to use Excel, you already know XLMiner. You can get started in minutes.
XLMiner => Great Value!
Several comprehensive DM products are many times more expensive.
For exploring how Data Mining will work for you, XLMiner provides a great start!
What others say …
The American Statistician reviewed XLMiner along with other reputed products in the November 2003 issue.
This is what it had to say:
“An easy to use… an excellent, inexpensive add-on that greatly expands the capabilities of Excel.”
“XLMiner’s documentation is remarkably good…”
More Resources
For your initiation into Data Mining:
Free evaluation download
Online Courses at www.statistics.com
Case Book in the making
Technical references on product website
Thank you for viewing this Demo!
www.xlminer.com
XLMiner - the Data Mining Toolset for the Managers of Tomorrow