1 Evaluation for Web Mining Applications Bettina Berendt Humboldt University Berlin Ernestina Menasalvas Universidad Politécnica de Madrid Myra Spiliopoulou Otto von Guericke University Magdeburg www.wiwi.hu-berlin.de/~berendt/Evaluation

1

Evaluation for Web Mining Applications

Bettina Berendt, Humboldt University Berlin

Ernestina Menasalvas, Universidad Politécnica de Madrid

Myra Spiliopoulou, Otto von Guericke University Magdeburg

www.wiwi.hu-berlin.de/~berendt/Evaluation

2

Agenda

Mining for evaluation: perspectives and measures

A case study

Outlook: Evaluation of mining

Web mining as a project: towards a methodology

Evaluation and experimentation

Evaluation and Web mining

3

Evaluation of Web mining applications, or: Web mining as a project

Is it worthwhile to do the mining project?

Is the result valuable for the application?

Are (all) the tasks performed well?

Are the data appropriate for the mining project?

Are the techniques appropriate for the expected results?

4

Project definition

Refers to

Set of interdependent activities

Oriented to a specific goal

With a predetermined length

Set of tasks

Web Site goal: stakeholder

Cost and time estimation

5

Data Mining as a project

Define the goal:

Corresponds to the business understanding step of CRISP-DM

Business and data mining experts have to define the goal collaboratively

Each goal must be defined in great detail

Obtain the model

Apply a data mining process model

Evaluate results and redirect

Evaluation in the extended definition: the act of ascertaining the value of an object according to specified criteria, operationalised in terms of measures.

Object = patterns or model

Measures and criteria depend on the goals

Deploy

With business goals directing each step, data mining produces results with a business impact

Check that the business impact is due to the result of the project

Experiment design

6

Web Mining as a project: the 3 components of a system, by Gartner Group

[Figure: CRM system architecture with three components: Operational CRM, Analytical CRM and Collaborative CRM]

Labels in the figure: ERP/ERM, Order Manag., Supply Chain Mgmt., Order Prom., Legacy Systems, Sales Automation, Service Automation, Marketing Automation, Field Service, Mobile Sales, Vertical Apps., Category Mgmt., Marketing Automation, Campaign Mgmt., Customer Activity, Customers, Products, Data Warehouse, Voice (IVR, ACD), Conferencing, Web Conferencing, E-mail, Response Management, Fax, Letter, Direct Interaction

Side labels: Office (three times), Interaction, Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents)

7

Web Mining as a project: the 3 components translated

[Figure: the system architecture translated to a Web site, with three components: Operational, Analytical and Decisional System]

Labels in the figure: ERP/ERM, Order Manag., Supply Chain Mgmt., Order Prom., Legacy Systems, Sales Automation, Service Automation, Marketing Automation, Field Service, Mobile Sales, Data Mining (twice), Customer Activity, Customers, Products, Data Warehouse, Recommender, Personalization, E-mail, Response Management, Web Site Front??, Web Site Back??

Side labels: Office (three times), Interaction, Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents)

8

The 3 components of a Web Site

Operational component: the end result of a software development process (SW development methodologies)

Decisional component: results of the analytical component are integrated into the operational system: a software development project (SW development methodologies)

Analytical component: the end result of a data mining process (data mining methodologies ??)

Business intelligence project: BI methodologies

9

Methodology

Process Model

Lifecycle

+

Set of tasks to be performed:

Development tasks

Project Management tasks

Sequencing of tasks

Waterfall

Iterative

Phases of the project

10

BI-Methodologies

BI-Roadmap

CRM-Catalyst

11

CRM Catalyst major phases:

The five major phases are:

Discovery.

Establishing the business goals for CRM

Orientation.

Defining necessary system and organisational (specific technical solutions) changes to meet the goals. This leads to a definition of top-level system requirements.

Navigation.

The CRM system requirements are defined more precisely, the system is scoped, system and vendor assessment criteria are defined and a system is selected and contracted.

Implementation.

Planning and managing the CRM project. It is during this phase that the system is built and put into use.

Post implementation.

Monitoring performance and continuous improvement. A CRM project never ends, because CRM must constantly evolve to keep pace with the changing business and its environment.

12

Software Methodologies

Process Model (ISO 12207) + Lifecycle (Iterative) = RUP

13

Web Mining Methodology?

To Be Defined

Can be reused?

The ones in CRISP-DM

14

Web mining methodology: Process Model: CRISP-DM

Is it worthwhile to do the mining project?

Is the result valuable for the application?

Are (all) the tasks performed well?

Are the data appropriate for the mining project?

Are the techniques appropriate for the expected results?

Has the goal been achieved as a causal effect of the project development?

15

Web Mining Project goals

Top-level goal 1: The Web exists in order to be used

Goals of usage depend on stakeholder and viewpoint.

Is the site a good site? Is it successful?

But: What does success mean?

Starting point: Web life-cycle metrics, micro-conversion rates

Extension for application-oriented success measurement: Multi-Channel Metrics

Has the goal been achieved as a causal effect of the project development?

Relate the results either to the Web mining project or to other factors
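As a rough illustration of the micro-conversion rates mentioned above, the sketch below computes step-to-step rates along a purchase funnel. The step names, the counts and both function names are assumptions made for this example only; they do not come from the slides.

```python
# Hypothetical funnel counts for one reporting period (illustrative numbers only).
funnel = [
    ("visited_site", 10_000),
    ("viewed_product", 4_000),
    ("added_to_cart", 800),
    ("purchased", 200),
]


def micro_conversion_rates(steps):
    """Return the rate of moving from each funnel step to the next one."""
    rates = {}
    for (prev_name, prev_count), (next_name, next_count) in zip(steps, steps[1:]):
        # Guard against an empty step to avoid division by zero.
        rates[f"{prev_name}->{next_name}"] = next_count / prev_count if prev_count else 0.0
    return rates


def overall_conversion_rate(steps):
    """Return the share of first-step sessions that reach the last step."""
    first, last = steps[0][1], steps[-1][1]
    return last / first if first else 0.0


rates = micro_conversion_rates(funnel)
overall = overall_conversion_rate(funnel)
```

Tracking each step separately shows where visitors drop out, which a single overall conversion rate would hide.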

16

Agenda

Mining for evaluation: perspectives and measures

A case study

Outlook: Evaluation of mining

Web mining as a project: towards a methodology

Evaluation and experimentation

Evaluation and Web mining

17

Experimentation in Web Mining Applications

Ernestina Menasalvas

Javier Segovia

Pilar Herrero

Universidad Politécnica de Madrid

18

Experimentation

Refers to matching with facts the suppositions, assumptions, speculation and beliefs that abound in the deployment of Web mining solutions:

Users and stakeholders are satisfied

Personalization helps the user to remain loyal

Recommendations increase sales

Evaluation: the act of ascertaining the value and the functioning of an object according to specified criteria, operationalised by measures.

to assess concrete achievements

to give feedback towards improvement

19

Experimentation in Web mining: Is the success due to the Web mining results or to external factors?

Is this a good Website?

Web Mining -> good website

NOT Web Mining -> good website

20

What is Experimental Design?

• Humans can generate valid knowledge by means of trial and error

• The trial-and-error process is slower and chancier than the scientific method

• Experimental design is used in other fields of science

Zelkowitz (98):

Controlled

Observational

Historical

Kitchenham (96):

Formal Experiments

Case Studies

Surveys

Experimental design to Web Mining empirical validation

Adaptation of experimental design terminology to WM

(Juristo & Moreno 02)

Empirical validation can be carried out:

Laboratory validation of theories

Validation at the level of real projects

Historical data validation

21

Experimental Design (www.socialresearchmethods.net/Kb/desexper.html)

Most rigorous of all research designs

The strongest with respect to internal validity

Internal validity: assess the proposition:

If X, then Y

And

If not X, then not Y

If the program is given, then the outcome occurs

And

If the program is not given then the outcome does not occur

Isolate the program from all of the other potential causes of the outcome
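The "if X, then Y; if not X, then not Y" logic can be sketched as a randomized comparison between a group that receives the program (for example, a recommender) and a group that does not. Random assignment is what isolates the program from other potential causes. The function names, the group sizes and the purchase counts below are hypothetical, and the two-proportion z-test is one common choice of analysis, not necessarily the one the authors had in mind.

```python
import math
import random


def assign_groups(user_ids, seed=0):
    """Randomly split users into treatment (X) and control (not X)."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one success rate."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("success rate is 0 or 1 in both groups; test undefined")
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value


treatment, control = assign_groups(range(2000))

# Hypothetical outcomes: number of purchasing users observed in each group.
z, p = two_proportion_z_test(success_a=130, n_a=1000, success_b=100, n_b=1000)
```

A small p-value suggests the difference in outcome is unlikely under "no effect"; because assignment was random, the program is the most plausible remaining cause.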

22

Experimental Design (www.socialresearchmethods.net/Kb/desexper.html)

Experimental design is intrusive

Difficult to carry out in most real-world contexts

To some extent, you set up an artificial situation:

Assess the causal relationship with high internal validity.

Limiting the degree to which results can be generalized

Reduce external validity in order to achieve greater internal validity

23

Phases of experimental design process

1. Defining the objectives of the experiment

Mathematical techniques require experiments with quantifiable hypotheses

Hypothesis expressed in terms of:

– a metric of the web mining results obtained using the web mining techniques

– or of the web mining process where the techniques have been applied

2. Designing the experiment:

Experimental unit

Parameters

Response variable

Factors, levels and interactions

Replication: based on analogy ??

Design

3. Executing the experiment:

Measure response variables at the end of each experiment

4. Analyzing results: Experimental Analysis

Quantify the impact of each factor and each interaction between factors on the variation of the response variable: statistical significance
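The analysis step can be illustrated with a minimal sketch for a balanced two-factor, two-level design. It estimates the main effects and the interaction, then allocates the variation of the response variable among them. The two factors (recommender and personalization, each off or on), the response values and the function names are assumptions for illustration; a full analysis would also test each effect for statistical significance, for example with ANOVA.

```python
from statistics import mean

# Hypothetical response variable (e.g. purchases per 100 sessions),
# with two replications per cell. Keys are (recommender, personalization)
# levels, coded as -1 (off) and +1 (on).
observations = {
    (-1, -1): [10.0, 12.0],
    (+1, -1): [15.0, 17.0],
    (-1, +1): [11.0, 13.0],
    (+1, +1): [20.0, 22.0],
}


def factorial_effects(obs):
    """Main effects of A and B and the A x B interaction for a balanced 2x2 design."""
    cell_means = {levels: mean(values) for levels, values in obs.items()}
    half = len(cell_means) / 2
    effect_a = sum(a * m for (a, b), m in cell_means.items()) / half
    effect_b = sum(b * m for (a, b), m in cell_means.items()) / half
    effect_ab = sum(a * b * m for (a, b), m in cell_means.items()) / half
    return effect_a, effect_b, effect_ab


def variation_explained(obs):
    """Share of total variation attributed to A, B, the interaction and error."""
    all_values = [v for values in obs.values() for v in values]
    grand = mean(all_values)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    n = len(all_values)
    effect_a, effect_b, effect_ab = factorial_effects(obs)
    # For +/-1 coding in a balanced 2x2 design, SS = N * effect^2 / 4.
    ss = {
        "A": n * effect_a ** 2 / 4,
        "B": n * effect_b ** 2 / 4,
        "AB": n * effect_ab ** 2 / 4,
    }
    ss["error"] = ss_total - sum(ss.values())
    return {name: value / ss_total for name, value in ss.items()}


effects = factorial_effects(observations)
shares = variation_explained(observations)
```

With these made-up numbers the recommender accounts for most of the variation, and the nonzero interaction indicates that its effect is larger when personalization is also on.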

24

Experimental design classification

Signal-to-noise metaphor (www.socialresearchmethods.net/kb)

What we see can be divided into:

Signal: related to the variable of interest, the construct to measure

Noise: random factors in the situation

Signal enhancers: Factorial designs

Noise reducers: Blocking designs
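To see why blocking acts as a noise reducer, the sketch below compares two site variants measured on the same days, treating each day as a block. All values and names are hypothetical. Day-to-day variation dominates the raw measurements, but it cancels out when the variants are compared within each block.

```python
from statistics import mean, stdev

# Hypothetical daily conversion rates (%) for two site variants.
# The day (block) strongly affects the baseline; the variant adds a small shift.
blocks = {
    "Mon": {"control": 2.0, "treatment": 2.5},
    "Wed": {"control": 3.0, "treatment": 3.4},
    "Fri": {"control": 4.0, "treatment": 4.6},
    "Sun": {"control": 6.0, "treatment": 6.5},
}

control = [day["control"] for day in blocks.values()]
treatment = [day["treatment"] for day in blocks.values()]

# Without blocking: the spread within one group is dominated by day-to-day variation.
unblocked_noise = stdev(control)

# With blocking: compare within each block, so the day effect cancels out.
differences = [t - c for t, c in zip(treatment, control)]
signal = mean(differences)
blocked_noise = stdev(differences)
```

The signal (the mean within-block difference) stays the same, while the noise it has to be distinguished from shrinks considerably.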

25

Experimental design techniques

[Figure: decision tree of experimental design techniques]

Categorical factors, quantitative experimental response:

1 factor (2 or n levels): one-factor experiment (all other parameters fixed) or blocking experiment (some parameters cannot be fixed)

K factors (2 or n levels): blocking factorial design; factorial design (n^k experiments); fractional factorial design (fewer than n^k experiments)

Branch conditions shown for K factors: some parameters are irrelevant; all factors are relevant

Quantitative factors and response variable: regression models
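The difference between a full factorial design (n^k experiments) and a fractional one (fewer than n^k) can be made concrete by enumerating the runs. The sketch below is illustrative: the function names are made up, and the half fraction uses the standard construction for two-level designs in which the last factor is set to the product of the others.

```python
from itertools import product


def full_factorial(levels, k):
    """All level combinations for k factors: len(levels) ** k runs."""
    return list(product(levels, repeat=k))


def half_fraction(k):
    """Half fraction of a 2-level design with k factors: 2 ** (k - 1) runs.

    The level of the last factor is the product of the other levels,
    so its effect is confounded with their highest-order interaction.
    """
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


full = full_factorial((-1, 1), 3)   # 2^3 = 8 runs
half = half_fraction(3)             # 2^(3-1) = 4 runs
```

The fractional design halves the number of experiments at the price of not being able to separate some effects from each other, which is acceptable only when the confounded interactions are believed to be negligible.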

26

Questions thus far?