An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010



Page 1: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Introduction
An Event-Centric Model
Summary

An Event-Centric Provenance Model for Digital Libraries

C. Tang, D. Castelli, L. Candela, P. Manghi, P. Pagano, C. Thanos

Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo” – CNR, Pisa - Italy
[email protected]

6th Italian Research Conference on Digital Libraries
Padua, Italy, 28-29 January 2010

C. Tang et al. An Event-Centric Provenance Model

Page 2: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010


Outline

1 Introduction
    Motivations

2 An Event-Centric Model
    The Constituents
    Exploiting the Model

Page 3: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

What is Provenance?

Some pseudo-definitions:

“a summary of the history and context of the data”

“the parts of the input that influenced (or that explain) a part of the output”

“the part of the input that shows where a part of the output came from”

“a causal graph that shows how a result was computed”

Page 4: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

What is Provenance?

Provenance is thus information about
    source, derivation, influences, history

. . . of an object
    program result, database query

In e-Science (thus in DLs), it is essential for
    efficiency, reproducibility, accountability, explanation, data cleaning, certifying scientific value of data

Page 5: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

What is the Problem?

Many models are being developed

Where-provenance, links output parts to equal input parts

Why-provenance, explains “why” some data appears in the result

How-provenance, explains “how” a result was calculated

Workflow, describes result of a parallel/distributed program

. . . using different assumptions, e.g. system scope, program, granularity

Our goal: develop a “non invasive” and “open” model supporting “provenance generation”

Page 6: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Idea

Add a layer dedicated to capturing provenance-oriented data

[Figure: layered diagram with the labels Reference Objects, Information Objects, Events]

Page 7: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

An Event is a happening having an effect on a Reference Object

<happenedTo> an Object

Page 8: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Each Event has a Type for filtering purposes

Page 9: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Description captures the “how” of the Event

Page 10: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Place captures the “where” of the Event

Page 11: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Time captures the “when” of the Event

Page 12: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

The Agent controls the Event

Page 13: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Rationale captures the “why” of the Event

Page 14: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

The Parameter is any additional information
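The constituents introduced on the previous slides can be gathered into a single record. The following is only a minimal sketch in Python, not the authors' implementation: the class names, field names, field types and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ReferenceObject:
    """Provenance-layer stand-in for an object (the id scheme is hypothetical)."""
    object_id: str


@dataclass
class Event:
    """One happening that had an effect on a Reference Object."""
    happened_to: ReferenceObject          # <happenedTo>
    type: str                             # used for filtering
    description: Optional[str] = None     # the "how"
    place: Optional[str] = None           # the "where"
    time: Optional[datetime] = None       # the "when"
    agent: Optional[str] = None           # who or what controls the event
    rationale: Optional[str] = None       # the "why"
    parameters: Dict[str, Any] = field(default_factory=dict)  # any additional information


# Illustrative values only; none of them come from the slides.
salmon = ReferenceObject("salmon")
event = Event(
    happened_to=salmon,
    type="curation",
    description="manual correction of occurrence records",
    place="AquaMaps VRE",
    time=datetime(2010, 1, 28, 10, 0),
    agent="curator-1",
    rationale="remove duplicated records",
    parameters={"records_removed": 3},
)
print(event.type, event.happened_to.object_id)
```

Only the affected object and the type are mandatory in this sketch; whether the other constituents are optional in the actual model is not stated on the slides.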

Page 15: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The Model

Don’t reinvent the wheel!!!

Page 16: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Computing the provenance

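[Figure: provenance computation across the events layer, the reference objects layer and the information objects layer. Node labels: Event, Object, Reference Object. Relationship labels: happenedTo, hasInputObject, hasReference. Steps numbered 1 to 5.]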
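One way to read the figure is as a graph traversal: start from an object, collect the events that happened to it, then follow each event's input objects and repeat. The sketch below illustrates that reading in Python under stated assumptions. The relationship names happenedTo and hasInputObject come from the figure, while their direction, the breadth-first strategy, the function name and the sample data are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str
    happened_to: str                        # id of the affected Reference Object
    has_input_object: Tuple[str, ...] = ()  # ids of the Reference Objects used as input


def compute_provenance(target: str, events: List[Event]) -> List[Event]:
    """Collect every event that directly or indirectly led to `target`."""
    # Index the events by the object they happened to.
    by_target: Dict[str, List[Event]] = {}
    for ev in events:
        by_target.setdefault(ev.happened_to, []).append(ev)

    seen_objects = {target}   # guards against cycles and repeated visits
    queue = deque([target])
    result: List[Event] = []
    while queue:
        obj = queue.popleft()
        for ev in by_target.get(obj, []):
            result.append(ev)
            for src in ev.has_input_object:
                if src not in seen_objects:
                    seen_objects.add(src)
                    queue.append(src)
    return result


# Illustrative data only.
events = [
    Event("e1", "obis-dump"),
    Event("e2", "gbif-dump"),
    Event("e3", "salmon", ("obis-dump", "gbif-dump")),
    Event("e4", "tuna", ("obis-dump",)),
]
print([ev.event_id for ev in compute_provenance("salmon", events)])
```

The event on the unrelated "tuna" object is never reached, so it does not appear in the provenance of "salmon".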

Page 17: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The granularity issue

High flexibility by relying on the Information Object relationships

[Figure: the Reference Objects, Information Objects and Events layers, with part-of relationships]
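A possible reading of the granularity slide is that a provenance query can be answered at the level of one object or widened to everything that is part of it. The Python sketch below illustrates that idea only: the direction of the part-of links, the function names and the sample identifiers are assumptions, not details given on the slides.

```python
from typing import Dict, List, Set

# Hypothetical part-of links between Information Objects: child -> parent.
PART_OF: Dict[str, str] = {
    "salmon-occurrences": "salmon",
    "salmon-map": "salmon",
    "occurrence-42": "salmon-occurrences",
}

# Hypothetical event ids recorded against each object.
EVENTS_BY_OBJECT: Dict[str, List[str]] = {
    "salmon": ["e1"],
    "salmon-map": ["e2"],
    "occurrence-42": ["e3"],
}


def with_parts(obj: str, part_of: Dict[str, str]) -> Set[str]:
    """Return obj together with every object that is transitively part of it."""
    children: Dict[str, List[str]] = {}
    for child, parent in part_of.items():
        children.setdefault(parent, []).append(child)
    found, stack = {obj}, [obj]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def events_at(obj: str, include_parts: bool) -> List[str]:
    """List event ids for obj alone, or for obj plus all of its parts."""
    scope = with_parts(obj, PART_OF) if include_parts else {obj}
    return sorted(ev for o in scope for ev in EVENTS_BY_OBJECT.get(o, []))


print(events_at("salmon", include_parts=False))
print(events_at("salmon", include_parts=True))
```

Switching `include_parts` changes the granularity of the answer without changing how events are recorded.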

Page 18: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

The AquaMaps scenario

AquaMaps is one of the VREs supported by the D4Science e-Infrastructure

    Aggregate data on species from multiple and evolving data sources (e.g. OBIS, GBIF)
    Curate aggregated data
    Generate species distribution and biodiversity prediction maps

Page 19: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Example 1

Find the events that occurred to the Salmon object

Page 20: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Example 2

Find the contributors to the Salmon object

Page 21: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Example 3

How to explain the existence of the Salmon object
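The three example questions about the Salmon object can each be phrased as a small query over recorded events. The Python sketch below is an illustration under assumptions, not the queries shown on the original slides: the event records, the agent names and the definition of "contributor" as the agent of any event that happened to the object are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str
    type: str
    happened_to: str
    agent: str
    rationale: str
    has_input_object: Tuple[str, ...] = ()


# Illustrative event log, loosely following the AquaMaps scenario.
EVENTS = [
    Event("e1", "harvest", "obis-records", "harvester", "periodic import from OBIS"),
    Event("e2", "harvest", "gbif-records", "harvester", "periodic import from GBIF"),
    Event("e3", "aggregation", "salmon", "aggregator", "merge species data",
          ("obis-records", "gbif-records")),
    Event("e4", "curation", "salmon", "curator-1", "fix wrong coordinates"),
]


def events_of(obj: str, events: List[Event]) -> List[Event]:
    """Example 1: the events that occurred to obj."""
    return [ev for ev in events if ev.happened_to == obj]


def contributors_of(obj: str, events: List[Event]) -> Set[str]:
    """Example 2: the agents of the events that occurred to obj."""
    return {ev.agent for ev in events_of(obj, events)}


def explain(obj: str, events: List[Event], _seen: Optional[Set[str]] = None) -> List[str]:
    """Example 3: describe the events behind obj, following input objects."""
    seen = set() if _seen is None else _seen
    if obj in seen:          # each object is explained once, which also stops cycles
        return []
    seen.add(obj)
    lines: List[str] = []
    for ev in events_of(obj, events):
        lines.append(f"{ev.happened_to}: {ev.type} by {ev.agent} ({ev.rationale})")
        for src in ev.has_input_object:
            lines.extend(explain(src, events, seen))
    return lines


print([ev.event_id for ev in events_of("salmon", EVENTS)])
print(sorted(contributors_of("salmon", EVENTS)))
print("\n".join(explain("salmon", EVENTS)))
```

A broader notion of contributor could also include the agents behind the input objects; that choice is left open here.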

Page 23: An Event-Centric Provenance Model for Digital Libraries @ IRCDL 2010

Summary

Provenance is an essential feature in Digital Libraries and eScience scenarios

Many provenance models are being developed using different assumptions

A DL oriented provenance model that is event-based, “open” and “non invasive”

Future steps

    validation and consolidation of the model in the context of new DLs application scenarios
    implementation of an infrastructural service realising the model in the D4Science infrastructure

http://www.d4science.eu
http://www.dlorg.eu
