SEWEBAR - a Framework for Creating and Dissemination of Analytical Reports from Data Mining Jan Rauch, Milan Šimůnek University of Economics, Prague, Czech Republic




SEWEBAR - a Framework for Creating and Dissemination of Analytical Reports from Data Mining

Starting points

Principles (as seen now)

Simple examples

First steps


SEWEBAR – Starting points (1)

Several similar mining problems à la STULONG (ADAMEK, TINITUS, HEPATITIS, SOCIOLOGY, …):

approx. 100-300 attributes

thousands of objects (usually patients)

a domain expert (not an informatics specialist) is available

some (for now relatively simple) background knowledge is available

A reasonable form for the results is a well-structured analytical report that must be

created

stored

retrieved

disseminated

used to answer more complex analytical questions


SEWEBAR – Starting points (2)

Some results from partially related projects:

Report assistant (it works)

AR2NL (successful experiment)

EverMiner (considerations)

SEWEBAR (considerations)

observational calculi

Grants: LISp, Czech Science Foundation (GAČR), Kontakt, CBI, ??

Students can contribute (4IZ460, 4IZ210, ? )

Dealing with knowledge and semantics "is in" (see e.g. "10 Challenging Problems in Data Mining Research", http://www.cs.uvm.edu/~icdm/)


SEWEBAR – inspired by the Semantic Web (SEmantic WEB and Analytical Reports)


SEWEBAR – Principles (1)

There is a structured set of (types of) patterns of local analytical questions:

What strong relations (*, *, …) are valid in the given data?

What strong known relations are not valid in the given data?

What exceptions from … are valid in the given data?

….

There are various items of background knowledge in an easily understandable form:

Beer consumption BMI

Mother hypertension + Hypertension

, - , ….

Applying a pattern of analytical question to a given item of background knowledge and to a given data matrix leads to a concrete analytical question.
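
The application step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `QuestionPattern` and `BackgroundItem`, the template wording, and the `->` notation are hypothetical and are not taken from the SEWEBAR tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackgroundItem:
    """One item of background knowledge relating two attributes (hypothetical format)."""
    left: str
    right: str


@dataclass(frozen=True)
class QuestionPattern:
    """A pattern of local analytical question with placeholders (hypothetical format)."""
    name: str
    template: str

    def apply(self, item: BackgroundItem, data_matrix: str) -> str:
        # Filling the placeholders with a background item and a data matrix
        # yields one concrete analytical question.
        return self.template.format(left=item.left, right=item.right, data=data_matrix)


# Assumed wording of the "exceptions" pattern; the slide leaves the relation elided.
exceptions = QuestionPattern(
    name="exceptions",
    template="What exceptions from the relation {left} -> {right} are valid in {data}?",
)
item = BackgroundItem(left="Mother hypertension", right="Hypertension")
question = exceptions.apply(item, "STULONG")
print(question)
```

Keeping patterns, background items, and data matrices as separate inputs means the same pattern can be reused across all similar data sets.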


SEWEBAR – Principles (2)

To each local analytical question there corresponds a type of local analytical report answering the question

A concrete local analytical question can be answered by the GUHA procedures implemented in the LISp-Miner system

The corresponding analytical report can be created automatically

There is a similarly structured set of patterns of global analytical questions (concerning several similar data matrices) that can be answered automatically on the basis of the local analytical reports
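
A minimal sketch of how a global question might be answered from local reports, assuming each local report is reduced to the set of relations it found valid in one data matrix. The representation, the function names, and the placeholder contents are assumptions made for illustration; they are not results or formats from the project.

```python
from collections import defaultdict

# Each local report is reduced here to the set of relations it reports as valid.
# The matrices and relations below are made-up placeholders, not real findings.
local_reports = {
    "data_matrix_A": {("X", "Y"), ("X", "Z")},
    "data_matrix_B": {("X", "Y")},
    "data_matrix_C": {("X", "Y"), ("Y", "Z")},
}


def relations_valid_everywhere(reports):
    """Global question: which relations are reported as valid in every data matrix?"""
    return set.intersection(*reports.values())


def matrices_by_relation(reports):
    """Global question: in which data matrices was each relation reported?"""
    index = defaultdict(list)
    for matrix, relations in sorted(reports.items()):
        for relation in relations:
            index[relation].append(matrix)
    return dict(index)


common = relations_valid_everywhere(local_reports)
where = matrices_by_relation(local_reports)
```

Both functions read only the local reports, which mirrors the idea that global questions are answered without returning to the raw data.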


SEWEBAR – Principles

From local analytical question to analytical report


SEWEBAR – simple examples

Pattern of analytical question – mutual influence of attributes

Pattern of analytical question – groups of attributes

Answering "analytical question – groups of attributes" by 4ft-Miner

Analytical report
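
4ft-Miner evaluates candidate rules on a four-fold (4ft) contingency table of an antecedent and a succedent. The sketch below shows that evaluation for one rule, using the founded implication quantifier (a / (a + b) >= p and a >= Base). The toy rows, the attribute names, and the thresholds are assumptions chosen for illustration; this is not the LISp-Miner implementation.

```python
def four_fold_table(rows, antecedent, succedent):
    """Count the 4ft table (a, b, c, d) of two Boolean attributes over the rows."""
    a = b = c = d = 0
    for row in rows:
        phi, psi = antecedent(row), succedent(row)
        if phi and psi:
            a += 1      # antecedent and succedent both hold
        elif phi:
            b += 1      # antecedent holds, succedent does not
        elif psi:
            c += 1      # succedent holds, antecedent does not
        else:
            d += 1      # neither holds
    return a, b, c, d


def founded_implication(table, p, base):
    """True if a / (a + b) >= p and a >= base."""
    a, b, _, _ = table
    return a >= base and a + b > 0 and a / (a + b) >= p


# Made-up toy data matrix; values are illustrative only.
rows = [
    {"beer": "high", "bmi": "high"},
    {"beer": "high", "bmi": "high"},
    {"beer": "high", "bmi": "high"},
    {"beer": "high", "bmi": "normal"},
    {"beer": "low", "bmi": "high"},
    {"beer": "low", "bmi": "normal"},
]

table = four_fold_table(
    rows,
    antecedent=lambda r: r["beer"] == "high",
    succedent=lambda r: r["bmi"] == "high",
)
rule_holds = founded_implication(table, p=0.7, base=3)
```

A real run searches a large space of antecedent and succedent combinations; the sketch checks a single pair.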



SEWEBAR - a Framework for Creating and Dissemination of Analytical Reports from Data Mining

Starting points

Principles (as understood now)

Simple examples

First steps


SEWEBAR – Principles for first steps

Implement soon a first version (simplified if necessary) of support for the whole process dealing with local and global analytical reports. The whole process covers:

Formulation of reasonable local analytical questions using background knowledge

Creation of analytical reports answering particular analytical questions

Formulating and answering reasonable global analytical questions

Use the first version to:

Gradually improve and enhance particular parts

Develop the corresponding theory using observational calculi


Control panel – tool for first steps


SEWEBAR – First steps (1)

Background knowledge and local analytical questions:

1. Start with the ADAMEK and STULONG data sets

2. Background knowledge: use the current version of the Knowledge Base

3. Define a first version of the set of LAQ (Local Analytical Questions)

4. Implement LAQPA (Local Analytical Question Patterns Administrator)

5. Implement LAQA (Local Analytical Questions Administrator)



SEWEBAR – First steps (3)

Local analytical reports:

6. Enhance 4ft-Miner by filtering out uninteresting rules

7. EverMiner modules

8. Define skeletons of analytical reports

9. Generator of analytical reports
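
The filtering step in item 6 could look roughly like the sketch below. It rests on one loudly stated assumption: a rule counts as uninteresting when it only relates a pair of attributes that an item of background knowledge already covers. The slides do not define the filtering criterion, and the rule format, function names, and attribute names here are hypothetical.

```python
def is_uninteresting(rule, background_knowledge):
    """Assumed criterion: the rule relates a pair of attributes that some
    background knowledge item (left, right) already covers."""
    antecedent_attrs, succedent_attrs = rule
    return any(
        left in antecedent_attrs and right in succedent_attrs
        for left, right in background_knowledge
    )


def filter_rules(rules, background_knowledge):
    """Keep only the rules not covered by background knowledge."""
    return [rule for rule in rules if not is_uninteresting(rule, background_knowledge)]


# Background knowledge as (influencing attribute, influenced attribute) pairs.
background = [("Beer consumption", "BMI")]

# Each rule is (attributes in the antecedent, attributes in the succedent).
# "Age" and "Education" are made-up attribute names.
rules = [
    (frozenset({"Beer consumption"}), frozenset({"BMI"})),
    (frozenset({"Beer consumption", "Age"}), frozenset({"BMI"})),
    (frozenset({"Education"}), frozenset({"BMI"})),
]

kept = filter_rules(rules, background)
```

Under this criterion the first two rules are dropped and only the rule on the made-up `Education` attribute survives; a stricter or looser criterion would change which rules are kept.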



SEWEBAR – First steps (4)

Global analytical reports, implemented using a Topic Maps content management system (?):

10. Define rules for indexing analytical reports by Topic Maps

11. Implement a tool for automated indexing of analytical reports for Topic Maps

12. Define a first version of the set of global analytical questions

13. Implement a tool for automated answering of global analytical questions

14. ??IGA grant??


Thank you for your attention
