DMDW Lesson 08 - Further Data Mining Algorithms
State-recognized university of applied sciences (Staatlich anerkannte Fachhochschule)
Study and take off. (Studieren und durchstarten.)
Author I: Dipl.-Inf. (FH) Johannes Hoppe
Author II: M.Sc. Johannes Hofmeister
Author III: Prof. Dr. Dieter Homeister
Date: 13.05.2011
Further Data Mining Algorithms
01 Data Mining Algorithms - Regression Analysis
DM Algorithms - Regression Analysis
Regression Analysis
› Also known as function approximation
› Includes any technique for modeling and analyzing several variables
› Models the relationship between one or more variables you are trying to predict (dependent variables) and the predictive variables (independent variables)
DM Algorithms - Regression Analysis
SSAS built-in algorithms
› MS Linear Regression Analysis
› MS Logistic Regression Analysis
› MS Time Series Algorithm
http://msdn.microsoft.com/en-us/library/ms170993(SQL.90).aspx
DM Algorithms - Regression / Linear Regression
Linear Regression
› Analyzes two continuous columns
› The relationship is an equation
› The equation is a line (linear equation):
  f(x) = m*x + b
› Error == distance from the regression line
http://msdn.microsoft.com/en-us/library/ms174824(SQL.90).aspx
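The line f(x) = m*x + b from the slide above can be fitted by hand for two continuous columns. The following is a minimal sketch, assuming an ordinary least-squares fit and reading "error" as the vertical distance of each point from the line. The function names and the sample data are hypothetical, and this is not the SSAS implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b for two continuous columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b


def errors(xs, ys, m, b):
    """Vertical distance of each observation from the regression line."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical data, invented for illustration; it lies exactly on y = 2x + 1.
advertising = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

m, b = fit_line(advertising, sales)
residuals = errors(advertising, sales, m, b)
print(m, b, residuals)
```

Because the sample points sit exactly on a line, every error is zero here. With real data the errors are non-zero, and the fit chooses m and b so that the sum of their squares is as small as possible.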
DM Algorithms - Regression / Linear Regression
Example
[Figure: scatter plot of Sales with a linear trend line, "Linear (Sales)". x axis: 0 to 600; y axis: 0 to 6000. Trend line: f(x) = 7.81381138497918 x + 866.585289444156, R² = 0.701037764746929]
DM Algorithms - Regression / Linear Regression
Explanation
The diagram shows a relationship between sales and advertising along with the regression equation. The goal is to be able to predict sales based on the amount spent on advertising. The graph shows a very linear relationship between sales and advertising. A key measure of the strength of the relationship is the R-square. The R-square measures the amount of the overall variation in the data that is explained by the model. This regression analysis results in an R-square of 70%. This implies that 70% of the variation in sales can be explained by the variation in advertising.
[Source: Olivia Parr Rud et al., Data Mining Cookbook]
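The R-square described above can be computed as one minus the ratio of unexplained variation to total variation. The sketch below assumes that standard definition. The small data set is hypothetical and is not the data behind the chart. Only the coefficients in `predict_sales` are taken from the slide's regression equation.

```python
def r_squared(ys, predictions):
    """Share of the overall variation in ys that the model explains."""
    mean_y = sum(ys) / len(ys)
    # Total variation of the observations around their mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Variation left unexplained by the model (squared errors).
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return 1.0 - ss_residual / ss_total


def predict_sales(advertising):
    """Prediction with the regression equation printed on the example slide."""
    return 7.81381138497918 * advertising + 866.585289444156


# Hypothetical observations; 0.8*x + 0.5 is their least-squares line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]
predictions = [0.8 * x + 0.5 for x in xs]

score = r_squared(ys, predictions)
print(round(score, 2))  # 0.64: the line explains 64% of the variation in ys
```

An R-square of 1 would mean every point lies on the line, while a value near 0 would mean the line explains little more than the mean of the observations does.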
DM Algorithms - Regression / Logistic Regression
Logistic regression
› Dependent variables have values between 0 and 1
› A function that describes the probability of a given event
› Instead of creating a straight line, logistic regression analysis creates an "S"-shaped curve that contains maximum and minimum constraints
› The algorithm described on Wikipedia != the algorithm described on MSDN
http://msdn.microsoft.com/en-us/library/ms174828(SQL.90).aspx
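The "S"-shaped curve with its minimum and maximum can be illustrated with the standard logistic function. This is a minimal sketch of the textbook curve only. As the slide notes, the MSDN algorithm differs from the textbook description, so this should not be read as the SSAS implementation. The names `logistic` and `event_probability` and the parameter values are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function; the result lies between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def event_probability(x, weight, bias):
    """Probability of the event for one input x, given a linear score."""
    return logistic(weight * x + bias)


# Hypothetical parameters, chosen only to show the shape of the curve.
for x in (-6, -2, 0, 2, 6):
    print(x, round(event_probability(x, weight=1.0, bias=0.0), 3))
```

The printed values rise slowly near the lower bound, fastest around x = 0, and level off again near the upper bound, which is the "S" shape the slide describes.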
DM Algorithms - Regression / Time-Series
MS Time-Series Algorithm
› Trend analysis
› Optimized for analyzing continuous values
  › e.g. product sales over time
› Train, then predict
› Cross-predictions possible! *
* cool!
http://msdn.microsoft.com/en-us/library/ms174923(SQL.90).aspx
DM Algorithms - Regression / Time-Series
› Combination of 2 algorithms; their results are blended
› ARTxp:
  › Auto Regressive Tree method
  › Developed by Microsoft Research
  › Based on the Microsoft Decision Trees algorithm
  › For short-term predictions
› ARIMA:
  › Auto Regressive Integrated Moving Average
  › Developed by Box and Jenkins
  › For long-term predictions
http://msdn.microsoft.com/en-us/library/ms174828(SQL.90).aspx
http://msdn.microsoft.com/en-us/library/bb677216.aspx
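Both algorithm names above contain "auto regressive", meaning that a value is predicted from earlier values of the same series. The sketch below shows only that shared idea with a first-order model, x[t] = intercept + phi * x[t-1], fitted by least squares. It is neither ARTxp nor ARIMA, and the function names and the sales series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + phi * x[t-1]."""
    previous = series[:-1]
    current = series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast(series, intercept, phi, steps):
    """Train first, then predict: feed each prediction back in as input."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        predictions.append(last)
    return predictions


# Hypothetical product sales over time, growing by 2 units per period.
monthly_sales = [10.0, 12.0, 14.0, 16.0, 18.0]

intercept, phi = fit_ar1(monthly_sales)
next_two = forecast(monthly_sales, intercept, phi, steps=2)
print(next_two)  # [20.0, 22.0]
```

Each forecast step reuses the previous prediction, so errors accumulate the further ahead the model looks. That is one reason to treat short-term and long-term predictions differently, as the slide does.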
02 Data Mining Algorithms - Neural Networks
DM Algorithms - Neural Networks
Neural Networks (NN or ANN)
› Better term: artificial neural networks (ANN), as opposed to biological NN
› Sometimes called neuronal networks
› By the way… http://code.google.com/p/clustered-neuronal-network/wiki/ProjektInfos
DM Algorithms - Neural Networks
Definition
› A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use.
› It resembles the brain in two respects:
  › Knowledge is acquired by the network through a learning process.
  › Interneuron connection strengths known as synaptic weights are used to store the knowledge.
[Source: Haykin, S. (1994), Neural Networks: A Comprehensive Foundation, NY: Macmillan. ]
DM Algorithms - Neural Networks
› Most NN are composed of several layers of neurons
› The direction of most connections is from input to output
› Often used: back-propagation networks
› A single neuron has several inputs with individual weights and one output
› In the basic form, the output is activated if the sum of inputs*weights exceeds a given threshold
› Learning is done with a target value at an additional training input plus a training mode signal.
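The basic neuron described above, with several weighted inputs, one output and a threshold, can be sketched in a few lines. This is a minimal illustration of the threshold rule only. It does not cover learning or back propagation, and the function name, weights and threshold are hypothetical values picked so that the neuron behaves like a logical AND.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the sum of inputs*weights exceeds the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


# Hypothetical weights and threshold: only two active inputs reach 2.0 > 1.5.
and_weights = [1.0, 1.0]
and_threshold = 1.5

and_table = [neuron_output([a, b], and_weights, and_threshold)
             for a in (0, 1) for b in (0, 1)]
print(and_table)  # [0, 0, 0, 1]
```

Lowering the threshold to 0.5 with the same weights turns the neuron into a logical OR. This shows that the stored "knowledge" sits entirely in the weights and the threshold.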
THANK YOU FOR YOUR ATTENTION