Intelligent Data Analysis (IDA) by Josipa Kern, PhD, Andrija Stampar School of Public Health, Medical School, University of Zagreb, Zagreb, Croatia


Page 1: Intelligent Data Analysis (IDA) by Josipa Kern, PhD Andrija Stampar School of Public Health Medical School University of Zagreb Zagreb, Croatia

Intelligent Data Analysis (IDA)

by

Josipa Kern, PhD

Andrija Stampar School of Public Health

Medical School University of Zagreb

Zagreb, Croatia

Page 2: Interest and Excitement for Intelligent Data Analysis

Decision making requires information and knowledge

Data processing can provide them

Multidimensional problems call for methods of adequate, in-depth data processing and analysis

Page 3: Learning Objectives

To understand the concept of IDA

To become familiar with websites and literature on IDA

To become familiar with some tools for IDA

To learn how to use IDA tools and to validate the IDA results

Page 4: Performance Objectives

Recognize problems that call for IDA

Prepare data and carry out the analysis

Validate and interpret the results of IDA

Page 5: IDA is…

… an interdisciplinary study concerned with the effective analysis of data;

… used for extracting useful information from large quantities of online data; extracting desirable knowledge or interesting patterns from existing databases;

Page 6: IDA or …

Data mining

Knowledge acquisition from data

Genetic algorithm-based rule discovery

Knowledge discovery

Learning classifier system

Machine learning

etc.

Page 7: IDA gives knowledge …

Page 8: Knowledge is …

the distillation of information that has been collected, classified, organized, integrated, abstracted and value-added;

at a level of abstraction higher than the data and information on which it is based, and can be used to deduce new information and new knowledge;

usually in the context of human expertise used in solving problems.

Page 9: Knowledge acquisition …

The process of eliciting, analyzing, transforming, classifying, organizing and integrating knowledge and representing that knowledge in a form that can be used in a computer system.

Page 10: Knowledge in a domain can be expressed as a number of rules

Page 11: Rule is …

A formal way of specifying a recommendation, directive, or strategy, expressed as "IF premise THEN conclusion" or "IF condition THEN action".
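
The IF-THEN form can be sketched in code. This is only an illustration in Python, not something taken from the slides: the `Rule` class, the attribute names and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Case = Dict[str, object]  # one case: attribute name -> value


@dataclass
class Rule:
    """IF premise THEN conclusion."""
    premise: Callable[[Case], bool]  # condition tested on a single case
    conclusion: str                  # label or action used when the premise holds

    def apply(self, case: Case) -> Optional[str]:
        # The conclusion is returned only when the premise is satisfied.
        return self.conclusion if self.premise(case) else None


# Hypothetical rule: IF smoking = Yes AND age > 60 THEN "refer"
rule = Rule(lambda c: c["smoking"] == "Yes" and c["age"] > 60, "refer")

print(rule.apply({"smoking": "Yes", "age": 66}))  # premise holds
print(rule.apply({"smoking": "No", "age": 66}))   # premise fails
```

A domain's knowledge would then be a list of such rules, each tried against a case in turn.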

Page 12: How to discover rules hidden in the data?

Page 13: Some tools for IDA …

See5 - program for analyzing data and generating classifiers in the form of decision trees and/or rule sets.

http://www.rulequest.com

Page 14: Some tools for IDA …

Cubist - analyzes data and generates rule-based piecewise linear models – collections of rules, each with an associated linear expression for computing a target value.

http://www.rulequest.com
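
A rule-based piecewise linear model of this kind can be sketched as follows. This is not Cubist's code or output: the two rules, the attribute `x` and the coefficients are hypothetical, and averaging the predictions of overlapping rules is an assumption of the sketch.

```python
from typing import Callable, Dict, List, Tuple

Case = Dict[str, float]
# Each rule pairs a condition with a linear expression:
# intercept + sum(coefficient * attribute value).
LinearRule = Tuple[Callable[[Case], bool], float, Dict[str, float]]


def predict(rules: List[LinearRule], case: Case) -> float:
    """Average the linear predictions of all rules whose condition matches."""
    outputs = [
        intercept + sum(coef * case[name] for name, coef in coefs.items())
        for condition, intercept, coefs in rules
        if condition(case)
    ]
    if not outputs:
        raise ValueError("no rule covers this case")
    return sum(outputs) / len(outputs)


# Hypothetical two-rule model over a single attribute x.
rules: List[LinearRule] = [
    (lambda c: c["x"] <= 10, 1.0, {"x": 2.0}),  # IF x <= 10 THEN y = 1 + 2x
    (lambda c: c["x"] > 10, 11.0, {"x": 1.0}),  # IF x > 10  THEN y = 11 + x
]

print(predict(rules, {"x": 4.0}))
print(predict(rules, {"x": 20.0}))
```

Each rule selects a region of the attribute space, and the linear expression attached to it computes the target value inside that region.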

Page 15: Some tools for IDA …

ILLM - the tool constructs classification models in the form of rules which represent knowledge about relations hidden in data.

http://dms.irb.hr

Page 16: Some tools for IDA …

Magnum Opus - finds association rules providing competitive advantage by revealing underlying interactions between factors within the data. 

http://www.rulequest.com

Page 17: Evaluation of IDA results

Absolute & relative accuracy

Sensitivity & specificity

False positive & false negative

Error rate

Reliability of rules

Etc.
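
Several of these measures follow directly from the four cells of a two-class confusion matrix. The sketch below uses the standard textbook definitions; the function name `evaluate` and the counts are hypothetical, chosen only to show the arithmetic.

```python
from typing import Dict


def evaluate(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard two-class measures from true/false positive/negative counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,         # share of correct classifications
        "error_rate": (fp + fn) / total,       # share of wrong classifications
        "sensitivity": tp / (tp + fn),         # positives correctly recognized
        "specificity": tn / (tn + fp),         # negatives correctly recognized
        "false_positive_rate": fp / (fp + tn), # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp), # equals 1 - sensitivity
    }


# Hypothetical counts, not taken from the slides.
scores = evaluate(tp=40, fn=10, fp=5, tn=45)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```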

Page 18: Example of IDA

Illustration of IDA by using See5

Page 19: See5…application…

application.names - lists the classes to which cases may belong and the attributes used to describe each case.

Attributes are of two types: discrete attributes have a value drawn from a set of possibilities, and continuous attributes have numeric values.

Page 20: See5…application…

application.data - provides information on the training cases from which See5 will extract patterns.

The entry for each case consists of one or more lines that give the values for all attributes.

Page 21: See5…application…

application.test - provides information on the test cases (used for evaluation of results).

The entry for each case consists of one or more lines that give the values for all attributes.
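
One way to obtain separate training and test cases is a random split of the available case lines. The slides do not say how their cases were divided, so the random split, the function name `split_cases` and the sample lines below are assumptions of this sketch.

```python
import random
from typing import List, Tuple


def split_cases(lines: List[str], test_fraction: float,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle case lines reproducibly and return (training, test) lists."""
    shuffled = lines[:]                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed -> repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical case lines, one case per line.
cases = [f"M,1,{50 + i},Yes,1" for i in range(10)]
train, test = split_cases(cases, test_fraction=0.3)

print(len(train), "training cases;", len(test), "test cases")
```

The two lists would then be written, one case per line, to the `.data` and `.test` files that share the application's name.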

Page 22: See5…application…example…

Epidemiological study (1970-1990)

Sample of examinees who died from cardiovascular diseases during the period

Question: Did they know they were ill?

1 – they were healthy

2 – they were ill (drug treatment, positive clinical and laboratory findings)

Page 23: See5…application…example…

application.names – example

Goal.
gender: M,F
activity: 1,2,3
age: continuous
smoking: No,Yes
…
Goal: 1,2
…
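
The structure of such a names file can be read with a few lines of code. The parser below is a hypothetical sketch that handles only the simplified layout shown on this slide (a target line followed by `name: values` lines); it is not part of See5 and does not cover the full file format. The attributes elided on the slide are left out of the sample text.

```python
from typing import Dict, List, Tuple, Union

AttrSpec = Union[str, List[str]]  # "continuous" or a list of discrete values


def parse_names(text: str) -> Tuple[str, Dict[str, AttrSpec]]:
    """Return (target name, attribute specifications) from names-style text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    target = lines[0].rstrip(".")            # first entry names the target
    attributes: Dict[str, AttrSpec] = {}
    for line in lines[1:]:
        name, _, spec = line.partition(":")
        spec = spec.strip().rstrip(".")
        if spec == "continuous":
            attributes[name.strip()] = "continuous"
        else:
            # Discrete attribute: comma-separated set of possible values.
            attributes[name.strip()] = [v.strip() for v in spec.split(",")]
    return target, attributes


example = """Goal.
gender: M,F
activity: 1,2,3
age: continuous
smoking: No,Yes
Goal: 1,2
"""

target, attrs = parse_names(example)
print(target)
print(attrs)
```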

Page 24: See5…application…example…

application.data – example

M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5

M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6

M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5

… …
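
Each line is one case with comma-separated attribute values, and `?` stands for an unknown value. A minimal sketch for reading such lines follows; the helper name `parse_case` is hypothetical, and the two rows are copied from the example above.

```python
from typing import List, Optional


def parse_case(line: str) -> List[Optional[str]]:
    """Split one comma-separated case; '?' (unknown value) becomes None."""
    return [None if v.strip() == "?" else v.strip() for v in line.split(",")]


rows = [
    "M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5",
    "M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6",
]

cases = [parse_case(r) for r in rows]
missing = [sum(v is None for v in case) for case in cases]

print([len(case) for case in cases])  # number of values per case
print(missing)                        # number of unknown values per case
```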

Page 25: See5…application…example…

Results – example

Rule 1: (cover 26)
    gender = M
    SBP > 111
    oil_fat > 2.9
    -> class 1 [0.929]

Page 26: See5…application…example…

Results – example

Rule 4: (cover 14)
    smoking = Yes
    SBP > 131
    glucose > 93
    glucose <= 118
    oil_fat <= 2.9
    -> class 2 [0.938]

Page 27: See5…application…example…

Results – example

Rule 15: (cover 2)
    SBP <= 111
    oil_fat > 2.9
    -> class 2 [0.750]
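
The bracketed value after each class is the rule's confidence. See5 estimates it with the Laplace ratio (n - m + 1) / (n + 2), where n is the number of training cases the rule covers and m is the number of those cases that do not belong to the predicted class. The slides show only n (the cover), so the values of m in this sketch are assumptions: they are the counts that reproduce the printed confidences of Rules 1, 4 and 15.

```python
def laplace_confidence(n: int, m: int) -> float:
    """Laplace ratio: n covered training cases, m of them misclassified."""
    return (n - m + 1) / (n + 2)


# (rule number, cover n, assumed m, confidence printed on the slides)
rules = [
    (1, 26, 1, 0.929),
    (4, 14, 0, 0.938),
    (15, 2, 0, 0.750),
]

for number, n, m, printed in rules:
    value = laplace_confidence(n, m)
    print(f"Rule {number}: computed {value:.4f}, printed {printed}")
```

The ratio also explains why Rule 15 gets only 0.750 even though it may have no errors: with a cover of 2, there is too little evidence for a high confidence.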

Page 28: See5…application…example…

Results – example

Evaluation on training data (199 cases):

     (a)   (b)    <-classified as
    ----  ----
     107     3    (a): class 1
      17    72    (b): class 2

Page 29: See5…application…example…

Results – example (training set) 

Sensitivity=0.97

Specificity=0.81

Page 30: See5…application…example…

Results – example

Evaluation on test data (73 cases):

     (a)   (b)    <-classified as
    ----  ----
      43     1    (a): class 1
       3    26    (b): class 2

Page 31: See5…application…example…

Results – example (test set)

Sensitivity=0.98

Specificity=0.90
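
The reported figures can be checked against the two confusion matrices from the training and test evaluations (107, 3 / 17, 72 and 43, 1 / 3, 26). The sketch assumes that rows are actual classes, columns are predicted classes, and class 1 is the positive class; the last assumption is chosen because it reproduces the slide values. The function name `summarize` is hypothetical.

```python
from typing import List, Tuple


def summarize(matrix: List[List[int]]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, error rate) for a 2x2 matrix.

    Rows are actual classes, columns are predicted classes, and the first
    class (class 1) is treated as positive: an assumption of this sketch.
    """
    (tp, fn), (fp, tn) = matrix
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    error_rate = (fn + fp) / (tp + fn + fp + tn)
    return sensitivity, specificity, error_rate


training = [[107, 3], [17, 72]]  # 199 training cases
test = [[43, 1], [3, 26]]        # 73 test cases

for name, matrix in [("training", training), ("test", test)]:
    sens, spec, err = summarize(matrix)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} "
          f"error rate={err:.3f}")
```

Under these assumptions the training matrix gives 107/110 and 72/89, and the test matrix gives 43/44 and 26/29, which round to the sensitivity and specificity values shown on the slides.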

Page 32: All the suggested IDA tools are available at the URLs mentioned, at least as demo versions

Try your own IDA…

Thank you!