Finding Hidden Intelligence with Predictive Analysis of Data Mining
Rafal Lukawiecki
Strategic Consultant, Project Botticelli Ltd
Objectives
• Show use of Microsoft SQL Server 2008 Analysis Services Data Mining
• Tantalise you with the power of DM
This seminar is based on a number of sources, including a few dozen Microsoft-owned presentations, used with permission. Thank you to Marin Bezic, Kathy Sabourin, Aydin Gencler, Bryan Bredehoeft, and Chris Dial for all the support. Thank you to Maciej Pilecki for assistance with demos.
The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.
Portions © 2009 Project Botticelli Ltd & entire material © 2009 Microsoft Corp. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
Agenda
• Data Mining and Predictive Analytics
• Server and Process Considerations
• Scenarios & Demos
What does Data Mining Do?
Explores Your Data
Finds Patterns
Performs Predictions
Typical Uses of Data Mining
• Seek Profitable Customers
• Understand Customer Needs
• Anticipate Customer Churn
• Predict Sales & Inventory
• Build Effective Marketing Campaigns
• Detect and Prevent Fraud
• Correct Data During ETL
Server Mining Architecture
[Diagram labels: Analysis Services Server (Mining Model, Data Mining Algorithm, Data Source); Deploy from BIDS, Excel, Visio, SSMS; Excel/Visio/SSRS/Your App connecting via OLE DB/ADOMD/XMLA; App Data]
Mining Process
[Diagram labels: Training data, DM Engine, Mining Model; Data to be predicted, DM Engine, Mining Model, With predictions]
Concepts
• Case – set of Columns (attributes) you want to analyse
• Age, Gender, Annual Spending
• Column Usage
• Input: We analyse them
• Predict: Build a model for them
• Nested Case – case containing a table column
• Age, Gender, Annual Spending, Products, Purchases
• Case Key – unique ID of a case
• Data Mining Model – container of patterns discovered by a DM algorithm in your data
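The concepts above can be pictured as a small data structure. This is only an illustrative sketch in Python: the class names, field names, and the `COLUMN_USAGE` mapping are hypothetical and are not part of Analysis Services.

```python
from dataclasses import dataclass, field
from enum import Enum


class Usage(Enum):
    KEY = "key"          # uniquely identifies a case
    INPUT = "input"      # analysed to find patterns
    PREDICT = "predict"  # a model is built for this column


@dataclass
class Case:
    customer_id: int      # case key
    age: int
    gender: str
    annual_spending: float
    # Nested table column: one customer case holds many purchase rows.
    purchases: list = field(default_factory=list)


# Hypothetical column-usage declaration for the case above.
COLUMN_USAGE = {
    "customer_id": Usage.KEY,
    "age": Usage.INPUT,
    "gender": Usage.INPUT,
    "purchases": Usage.INPUT,
    "annual_spending": Usage.PREDICT,
}


def columns_with(usage):
    """Return the column names declared with the given usage."""
    return [name for name, declared in COLUMN_USAGE.items() if declared is usage]


case = Case(
    customer_id=1,
    age=34,
    gender="F",
    annual_spending=1250.0,
    purchases=[
        {"product": "Bike", "quantity": 1},
        {"product": "Helmet", "quantity": 2},
    ],
)
```

Here `columns_with(Usage.PREDICT)` names the column a model would be built for, while the `purchases` list is what makes this a nested case.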
SCENARIO: CUSTOMER CLASSIFICATION & SEGMENTATION
Who are our customers? Are there any relationships between their demographics and their buying power?
Microsoft Decision Trees
• Use for:
• Classification: churn and risk analysis
• Regression: predict profit or income
• Association analysis based on multiple predictable variables
• Builds one tree for each predictable attribute
• Fast
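To give a feel for how a classification tree chooses its first split, here is a generic sketch using entropy-based information gain. It is an assumption for illustration only: the slides do not say how the Microsoft algorithm scores splits, and the customer rows and attribute names below are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Pick the input attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Invented training cases: two input columns and one predictable column.
customers = [
    {"age_band": "young", "home_owner": "no", "buyer": "no"},
    {"age_band": "young", "home_owner": "yes", "buyer": "no"},
    {"age_band": "middle", "home_owner": "no", "buyer": "yes"},
    {"age_band": "middle", "home_owner": "yes", "buyer": "yes"},
    {"age_band": "senior", "home_owner": "no", "buyer": "no"},
    {"age_band": "senior", "home_owner": "yes", "buyer": "yes"},
]

root_attribute = best_split(customers, ["age_band", "home_owner"], "buyer")
```

On this toy data the root split falls on `age_band`, because it separates buyers from non-buyers far better than `home_owner` does. A full tree would repeat the same choice recursively on each branch.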
Decision Trees for Classification of Customers’ Buying Potential
SCENARIO: PROFITABILITY AND RISK
Who are our most profitable customers? Can I predict profit of a future customer based on demographics? Are they creditworthy? How much should I charge them to give a good loan and protect against losses?
Profitability and Risk
• Finding what makes a customer profitable is also classification or regression
• Typically solved with:
• Decision Trees (Regression), Linear Regression, and Neural Networks or Logistic Regression
• Often used for prediction
• Important to predict the probability of the predicted outcome, or the expected profit
• Risk scoring
• Logistic Regression and Neural Networks
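A predicted probability becomes useful for pricing once it is turned into an expected profit. The sketch below assumes a deliberately simple loan model that is not taken from the slides: the lender either earns a fixed income or loses a fixed amount on default. The function and parameter names are hypothetical.

```python
def expected_profit(p_default, income_if_repaid, loss_if_default):
    """Probability-weighted profit of one loan under a two-outcome model."""
    return (1.0 - p_default) * income_if_repaid - p_default * loss_if_default


def break_even_income(p_default, loss_if_default):
    """Income the lender must charge for the expected profit to be zero."""
    return p_default * loss_if_default / (1.0 - p_default)


# A customer scored with a 10% probability of default (invented figures).
profit = expected_profit(p_default=0.1, income_if_repaid=1000.0, loss_if_default=5000.0)
minimum_charge = break_even_income(p_default=0.1, loss_if_default=5000.0)
```

With these figures the expected profit is about 400, and the lender would need to charge roughly 556 just to break even, which is one way of framing the question of how much to charge to protect against losses.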
Neural Network & Logistic Regression
• Applied to
• Classification
• Regression
• Great for finding complicated relationships among attributes
• Difficult to interpret results
• Gradient Descent method
• LR is NNet with no hidden layers
[Diagram: Input Layer (Age, Education, Sex, Income) → Hidden Layers → Output Layer (Loyalty)]
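The remark that logistic regression is a neural network with no hidden layers can be made concrete: the inputs feed a single sigmoid output unit, and the weights are fitted by gradient descent. The following is a minimal sketch under that reading, not the Analysis Services implementation. The two scaled features and the loyalty labels are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, x):
    """Output of a single sigmoid unit fed directly by the inputs."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_probability(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


# Invented cases: [scaled age, scaled income] -> loyal (1) or not (0).
rows = [[0.2, 0.2], [0.3, 0.1], [0.25, 0.3], [0.6, 0.8], [0.7, 0.6], [0.8, 0.9]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(rows, labels)
likely_loyal = predict_probability(weights, bias, [0.75, 0.85])
likely_not_loyal = predict_probability(weights, bias, [0.2, 0.15])
```

Adding one or more hidden layers between the inputs and the output unit would turn this into the kind of network shown in the diagram, at the cost of weights that are harder to interpret.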
1. Neural Networks for Profitability Analysis
2. Predicting Lending Risk with Neural Networks
SCENARIO: CUSTOMER NEEDS ANALYSIS
How do they behave? What are they likely to do once they bought that really expensive car? Should I intervene?
Sequence Clustering
• Analysis of:
• Customer behaviour
• Transaction patterns
• Click stream
• Customer segmentation
• Sequence prediction
• Mix of clustering and sequence technologies
• Groups individuals based on their profiles including sequence data
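For the sequence side of this technique, one simple model, assumed here purely for illustration, is a first-order Markov chain: count how often one event follows another, then predict the most frequent successor. The slides do not describe the algorithm's internals, and this sketch leaves out the clustering step that would group customers with similar sequences. The click-stream sessions are invented.

```python
from collections import Counter, defaultdict


def transition_counts(sequences):
    """Count how often each event is immediately followed by another."""
    counts = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    return counts


def transition_probability(counts, current, following):
    """Estimated probability that `following` comes right after `current`."""
    successors = counts.get(current, Counter())
    total = sum(successors.values())
    return successors[following] / total if total else 0.0


def predict_next(counts, state):
    """Most frequently observed successor of `state`, or None if unseen."""
    successors = counts.get(state)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


# Invented click-stream sessions.
sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "product", "search"],
    ["search", "product", "checkout"],
]

counts = transition_counts(sessions)
next_after_product = predict_next(counts, "product")
```

In these sessions a product page is followed by checkout three times out of four, so `checkout` is the predicted next step after `product`.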
Analysing Customer Behaviour with Sequence Clustering
SCENARIO: FORECASTING
What are my sales going to be like in the next few months? Will I have credit problems? Will my server need an upgrade in the next 3 months?
Time Series
• Uses:
• Forecast sales
• Inventory prediction
• Web hits prediction
• Stock value estimation
• Regression trees with extras
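The core idea behind forecasting by regression is that past values of a series become the inputs used to predict the next one. The sketch below assumes the simplest possible version, a one-lag linear autoregression fitted by least squares. It is far simpler than the tree-based approach the slide alludes to, and the sales figures are invented.

```python
def fit_ar1(series):
    """Least-squares fit of next_value = intercept + slope * current_value."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(series, steps):
    """Roll the fitted model forward, feeding each prediction back in."""
    intercept, slope = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = intercept + slope * last
        predictions.append(last)
    return predictions


# Invented monthly sales with a steady upward trend.
monthly_sales = [100, 110, 120, 130, 140, 150]
next_three_months = forecast(monthly_sales, steps=3)
```

For this perfectly linear series the forecast simply continues the trend (160, 170, 180). Because each prediction is fed back in as an input, errors compound as the horizon grows, which is why long-term accuracy is harder to achieve than short-term accuracy.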
Forecasting Using Time Series
TECHNIQUE SUMMARY
Algorithms
| Algorithm | Description |
| --- | --- |
| Decision Trees | Finds the odds of an outcome based on values in a training set |
| Association Rules | Identifies relationships between cases |
| Clustering | Classifies cases into distinctive groups based on any attribute sets |
| Naïve Bayes | Clearly shows the differences in a particular variable for various data elements |
| Sequence Clustering | Groups or clusters data based on a sequence of previous events |
| Time Series | Analyzes and forecasts time-based data, combining the power of ARTXP (developed by Microsoft Research) for short-term predictions with ARIMA (in SQL 2008) for long-term accuracy |
| Neural Nets | Seeks to uncover non-intuitive relationships in data |
| Linear Regression | Determines the relationship between columns in order to predict an outcome |
| Logistic Regression | Determines the relationship between columns in order to evaluate the probability that a column will contain a specific state |
[Chart rows: Time Series, Sequence Clustering, Neural Nets, Naïve Bayes, Logistic Regression, Linear Regression, Decision Trees, Clustering, Association Rules]
Summary
• Data Mining is a powerful, predictive technology
• Turns data into valuable, decision-making knowledge
• SQL Server 2008 Analysis Services supports Predictive Analytics
• Mine your mountains of data for gems of intelligence today!
© 2009 Microsoft Corporation & Project Botticelli Ltd. All rights reserved.