tuw-ase-summer 2014: evaluating and utilizing data concerns for daas


Upload: hong-linh-truong

Post on 26-Jan-2015


Page 1: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating and Utilizing Data Concerns for DaaS

Hong-Linh Truong

Distributed Systems Group, Vienna University of Technology

[email protected]

http://dsg.tuwien.ac.at/staff/truong

Advanced Services Engineering (ASE), Summer 2014, Lecture 5

Page 2: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Outline

Data concern-aware DaaS service engineering

Data concern evaluation

Data concern publishing

A Proof-of-concept: QoD Framework

Issues in utilizing data concerns


Page 3: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Recall -- DaaS Concerns

[Figure: data, DaaS, data assets. Data concerns: quality of data, ownership, price, license, ... DaaS functions: APIs, querying, data management, etc.]

DaaS concerns include QoS, quality of data (QoD), service licensing, data licensing, data governance, etc.

Page 4: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Recall -- DaaS design & implementation

[Figure: consumers, DaaS, data assets, data resources, data items]

Page 5: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

HOW TO EVALUATE DATA CONCERNS FOR DATA ASSETS IN DAAS?

Page 6: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Patterns for "turning data to DaaS"

[Figure: four patterns. (1) data, Storage/Database-as-a-Service, DaaS. (2) Things, data, Storage/Database/Middleware, DaaS. (3) People, data, Storage/Database/Middleware, DaaS. (4) data, build data service APIs, deploy data service, DaaS.]

Page 7: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Data-related activities

Typical activities for data wrapping and publishing: wrapping data, publishing DaaS interface

Typical activities for data updating & retrieval: updating data, selecting data, provisioning data

Page 8: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Wrapping data

(Relational) database

(Storage of) files

Streams of events (including attached information)

Service interfaces are different

Update mechanisms are different

Page 9: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Typical data concern evaluation

[Figure: activities are evaluating data concerns, describing data concerns, and populating data concerns. Supporting elements: data concerns evaluation tools, data concerns representation models, publishing services.]

What do we need in order to perform these activities?

Page 10: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Data concern-aware DaaS engineering process

[Figure: typical activities for data wrapping and publishing; typical activities for data updating & retrieval]

Hong Linh Truong, Schahram Dustdar: On Evaluating and Publishing Data Concerns for Data as a Service. APSCC 2010: 363-370

Page 11: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Wrapping, selecting, and updating data in DaaS (1)

[Figure: DaaS service operation. A request from the data consumer passes through: processing parameters; mapping parameters to data queries and querying the content of data resources, or mapping parameters to metadata queries and querying the metadata of data resources; mapping and returning results to the data consumer.]

Different strategies for structured data and unstructured data
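The operation flow named in the figure can be sketched roughly as follows. This is only an illustration, not the lecture's implementation: the in-memory `RESOURCES` store, the `meta` request flag, and all function names are assumptions made for the example.

```python
# Hypothetical in-memory data resources: content plus metadata.
RESOURCES = {
    "population": {
        "content": [{"country": "AT", "year": 2009, "value": 0.3},
                    {"country": "VN", "year": 2009, "value": 1.1}],
        "metadata": {"format": "XML", "updated": "2010-01-01"},
    }
}

def process_parameters(raw):
    """Split raw request parameters into resource, query kind, and filters."""
    params = dict(raw)
    resource = params.pop("resource")
    wants_metadata = params.pop("meta", "false") == "true"  # assumed flag
    return resource, wants_metadata, params

def query_content(resource, filters):
    """Map parameters to a data query and run it on the resource content."""
    rows = RESOURCES[resource]["content"]
    return [r for r in rows
            if all(str(r.get(k)) == v for k, v in filters.items())]

def query_metadata(resource):
    """Map parameters to a metadata query on the resource."""
    return RESOURCES[resource]["metadata"]

def daas_operation(raw_params):
    """Process parameters, query data or metadata, and return the result."""
    resource, wants_metadata, filters = process_parameters(raw_params)
    if wants_metadata:
        return {"metadata": query_metadata(resource)}
    return {"data": query_content(resource, filters)}
```

Unstructured data would need a different `query_content` strategy (for example, search over documents), which is the point of the last line of the slide.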

Page 12: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Wrapping, selecting, and updating data in DaaS (2)

Different techniques exist for wrapping, selecting, updating and retrieving data

How can generic data concern evaluation and publishing techniques be integrated with these techniques?

Page 13: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Discussion

WHICH TYPES OF DATA ARE NEEDED FOR EVALUATING DATA CONCERNS?

WHAT IS THE IMPACT OF DATA PROVISIONING MODELS (OFFLINE VERSUS NEAR-REALTIME) ON CONCERN EVALUATION/PUBLISHING?

Page 14: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – the three important points

• At which level is the evaluation performed? (evaluation scope)

• When is the evaluation done? (evaluation modes)

• How is the evaluation tool invoked? (integration model)

Hong Linh Truong, Schahram Dustdar: On Evaluating and Publishing Data Concerns for Data as a Service. APSCC 2010: 363-370

Page 15: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – evaluation scopes

Three scopes

• data resource

• DaaS operations

• DaaS as a whole

Why do multiple evaluation scopes make sense?

They enable fine-grained evaluation

Page 16: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – evaluation modes

Off-line: before the access to data

On-the-fly: when the data is requested

Why do multiple evaluation modes make sense?

They are suitable for different types of data
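A minimal sketch of the two evaluation modes, under stated assumptions: the `completeness` function stands in for any evaluation tool, and the `ConcernEvaluator` class and its method names are hypothetical. It shows that an off-line result can become stale when data changes, while an on-the-fly result reflects the data at request time.

```python
def completeness(rows):
    """Fraction of non-missing values; stands in for any evaluation tool."""
    if not rows:
        return 0.0
    return sum(r is not None for r in rows) / len(rows)

class ConcernEvaluator:
    def __init__(self, resources):
        self.resources = resources      # resource name -> list of values
        self.offline_results = {}       # filled before any access

    def evaluate_offline(self):
        """Off-line mode: evaluate every resource ahead of requests."""
        for name, rows in self.resources.items():
            self.offline_results[name] = completeness(rows)

    def get_offline(self, name):
        """Return the stored result; it may be stale if data changed."""
        return self.offline_results[name]

    def get_on_the_fly(self, name):
        """On-the-fly mode: evaluate at the moment the data is requested."""
        return completeness(self.resources[name])

evaluator = ConcernEvaluator({"temps": [20.5, None, 21.0, 19.8]})
evaluator.evaluate_offline()
evaluator.resources["temps"].append(None)   # data changes after evaluation
```

After the update, `get_offline("temps")` still reports the earlier value, whereas `get_on_the_fly("temps")` pays the evaluation cost on each request but sees the new data.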

Page 17: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – integration modes

Push and pull data concerns

Pass-by-value versus pass-by-reference to data concerns evaluation tools

Why do multiple integration modes make sense?

They are suitable for different tool integration strategies
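The difference between pass-by-value and pass-by-reference can be illustrated with a small sketch. The `STORE` dictionary, the `daas://` URI, and the toy metric are assumptions for the example only.

```python
# Hypothetical store standing in for data resources reachable by URI.
STORE = {"daas://example/resource/1": ["a", "", "c", "d"]}

def non_empty_ratio(values):
    """A toy quality metric: share of non-empty entries."""
    return sum(1 for v in values if v) / len(values) if values else 0.0

def evaluate_by_value(values):
    """Pass-by-value: the DaaS ships the data itself to the tool."""
    return non_empty_ratio(values)

def evaluate_by_reference(uri, fetch=STORE.get):
    """Pass-by-reference: the tool receives a URI and fetches the data."""
    return non_empty_ratio(fetch(uri))
```

Passing a reference keeps the request small but requires the tool to have access to the data resource; passing values works with tools that cannot reach the resource but moves the data itself.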

Page 18: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – some patterns (1)

[Figure: pull, pass-by-reference]

Page 19: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – some patterns (2)

[Figure: pull, pass-by-value]

Page 20: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – some patterns (3)

[Figure: push, pass-by-value (1)]

Page 21: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluating data concerns – some patterns (4)

[Figure: push, pass-by-value (2)]
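The figures for these patterns are not preserved, so the following is only one plausible reading of pull versus push: in pull, the caller invokes the tool and gets the result back; in push, the tool delivers results to a callback that was registered beforehand. The class, method names, and metric are hypothetical.

```python
class EvaluationTool:
    """Toy evaluation tool; the metric is a placeholder."""
    def __init__(self):
        self.subscribers = []

    def evaluate(self, values):
        filled = sum(v is not None for v in values)
        return {"completeness": filled / len(values)}

    # Pull: the caller invokes the tool and receives the result directly.
    def pull(self, values):
        return self.evaluate(values)

    # Push: the caller registers a callback; the tool delivers results to it.
    def subscribe(self, callback):
        self.subscribers.append(callback)

    def on_data_updated(self, resource_id, values):
        result = self.evaluate(values)
        for callback in self.subscribers:
            callback(resource_id, result)

concern_store = {}                      # what the DaaS keeps for publishing
tool = EvaluationTool()
tool.subscribe(lambda rid, res: concern_store.__setitem__(rid, res))
tool.on_data_updated("r1", [1, None, 3, 4])
```

Both variants here pass values; either could pass a URI instead, which gives the combinations listed on the pattern slides.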

Page 22: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluation tool – internal software components

Self-developed or third-party software components for the evaluation tool

Advantages

• Tightly coupled integration: performance, security, data compliance

• Customization

Disadvantages

• Usually cannot be integrated with other features (e.g., data enrichment)

• Costly (e.g., what if we do not need them)

Page 23: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluation tool – using cloud services

Evaluation features are provided by cloud services

Several implementations: Informatica Cloud Data Quality Web Services, StrikeIron

Advantages

• Pay-per-use, combined features

Disadvantages

• Features are limited (with certain types of data)

• Performance issues with large-scale data

• Data compliance and security assurance

Page 24: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Evaluation tool -- using human computation capabilities

Professionals and crowds can act as data concerns evaluators

• For complex quality assessment that cannot be done by software

Issues

• Subjective evaluation

• Performance

• Limited types of data (e.g., images, documents, etc.)

Michael Reiter, Uwe Breitenbücher, Schahram Dustdar, Dimka Karastoyanova, Frank Leymann, Hong Linh Truong: A Novel Framework for Monitoring and Analyzing Quality of Data in Simulation Workflows. eScience 2011: 105-112

Maribel Acosta, Amrapali Zaveri, Elena Simperl, Dimitris Kontokostas, Sören Auer, Jens Lehmann: Crowdsourcing Linked Data Quality Assessment. International Semantic Web Conference (2) 2013: 260-276

Óscar Figuerola Salas, Velibor Adzic, Akash Shah, Hari Kalva: Assessing Internet Video Quality Using Crowdsourcing. Proceedings of the 2nd ACM International Workshop on Crowdsourcing for Multimedia (CrowdMM '13), ACM, 2013: 23-28. http://doi.acm.org/10.1145/2506364.2506366

Page 25: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Discussion time

BASED ON WHICH CRITERIA IS AN EVALUATION SCOPE, EVALUATION MODE OR INTEGRATION MODE SELECTED?

WHICH OTHER COMPONENTS INTERACT WITH EVALUATION TOOLS?

WHY DO WE NOT REALLY DISCUSS THE IMPLEMENTATION OF EVALUATION TOOLS?

Page 26: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Publishing data concern information (1)

Off-line publishing of data concerns

• suitable for static data concerns

• the publishing of data concerns of a data resource is separated from the service operation which provides the access to the data resource

Page 27: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Publishing data concern information (2)

On-the-fly publishing of data concerns: associating concerns with retrieved data resources

• the resulting data resources (e.g., via queries) are annotated with data concerns evaluated by data concerns evaluation tools

• suitable for providing dynamic data concerns
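Annotating a query result with concerns evaluated at request time might look like the sketch below. The evaluation function is a placeholder for a real evaluation tool, and the response layout with `data` and `concerns` keys is an assumption, not the format used in the lecture's framework.

```python
def evaluate_concerns(rows):
    """Placeholder evaluation tool: completeness of the 'value' field."""
    filled = sum(1 for r in rows if r.get("value") is not None)
    return {"completeness": filled / len(rows) if rows else 0.0}

def query_with_concerns(rows, predicate):
    """Run the query, then annotate the result with evaluated concerns."""
    result = [r for r in rows if predicate(r)]
    return {"data": result, "concerns": evaluate_concerns(result)}

rows = [{"country": "AT", "value": 0.3},
        {"country": "VN", "value": None},
        {"country": "AT", "value": None}]
response = query_with_concerns(rows, lambda r: r["country"] == "AT")
```

The concerns describe exactly the rows that were returned, which is why this style suits dynamic data concerns.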

Page 28: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Publishing data concern information (3)

On-the-fly publishing of data concerns through queries

• the use of different service operation parameters to query data concerns of data resources

• suitable for validating data concerns before accessing data resources
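From the consumer side, validating concerns before accessing data could be sketched as a two-step call. `FakeDaaS`, its two operations, and the returned values are stand-ins invented for this example.

```python
class FakeDaaS:
    """Stand-in for a DaaS with separate concern and data operations."""
    def __init__(self):
        self.data_calls = 0

    def get_concerns(self, resource):
        return {"datasetcompleteness": 0.73}   # made-up value

    def get_data(self, resource):
        self.data_calls += 1
        return ["...payload..."]

def fetch_if_acceptable(service, resource, metric, minimum):
    """Query the concern first; access the data only if it is good enough."""
    concerns = service.get_concerns(resource)
    if concerns.get(metric, 0.0) < minimum:
        return None
    return service.get_data(resource)

daas = FakeDaaS()
rejected = fetch_if_acceptable(daas, "population", "datasetcompleteness", 0.9)
accepted = fetch_if_acceptable(daas, "population", "datasetcompleteness", 0.7)
```

Only the second call reaches `get_data`, so a consumer avoids paying for or transferring data that would not meet its requirements.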

Page 29: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Discussion time

WHAT ARE THE RELATIONSHIPS BETWEEN CONCERN EVALUATION AND PUBLISHING WHEN DATA IS DYNAMICALLY UPDATED?

Page 30: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

How do we utilize the data concern-aware service engineering process?

Using this model we can determine and publish several concerns

Our proof-of-concept

• A framework for evaluating and publishing QoD of DaaS

• A proof-of-concept implementation of the data concern-aware service engineering process

Another example: model and publish privacy concerns for DaaS [ECOWS 2010]

Michael Mrissa, Salah-Eddine Tbahriti, Hong-Linh Truong: Privacy model and annotation for DaaS. The 8th European Conference on Web Services (ECOWS 2010), IEEE Computer Society, 1-3 December 2010, Ayia Napa, Cyprus

Page 31: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework (1)

Pull QoD evaluation models for DaaS

Pass-by-reference and pass-by-value

• References of data resources: URI

• Values: any object

Third-party data evaluation tools

Page 32: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework (2)

http://www.infosys.tuwien.ac.at/prototype/SOD1/dataconcerns/

Page 33: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework: publishing concerns (1)

Off-line data concern publishing

• a common data concern publication specification

• a tool for providing data concerns according to the specification

• supported by external service information systems
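The slide does not show the publication specification itself, so the document layout below is purely hypothetical. The sketch only illustrates the idea of producing a concern description that is published separately from the data access operation.

```python
import json

def build_concern_document(resource_uri, metrics, evaluated_at):
    """Assemble a publishable description of a resource's data concerns.

    The field names are assumptions, not the framework's specification.
    """
    return {
        "resource": resource_uri,
        "evaluatedAt": evaluated_at,
        "concerns": [{"name": name, "value": value}
                     for name, value in sorted(metrics.items())],
    }

document = build_concern_document(
    "http://example.org/data/population",          # hypothetical URI
    {"datasetcompleteness": 0.74, "dataelementcompleteness": 0.87},
    "2014-05-01",
)
published = json.dumps(document, indent=2)   # e.g. handed to a registry
```

An external service information system could store `published` and serve it without touching the DaaS operation that returns the data.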

Page 34: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework: publishing concerns (2)

On-the-fly querying data concerns associated with data resources

• Using REST parameter convention

• Based on metric names in the data concern specification

Hong Linh Truong, Schahram Dustdar, Andrea Maurino, Marco Comerio: Context, Quality and Relevance: Dependencies and Impacts on RESTful Web Services Design. ICWE Workshops 2010: 347-359

Page 35: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework: publishing concerns (3)

Specifying requests by utilizing query parameters in the form of metricName=value

GET /resource?crq.accuracy="0.5"&crq.location="Europe"

Obtaining context and quality by using context and quality parameters without specifying value conditions

curl 'http://localhost:8080/UNDataService/data/query/Population%20annual%20growth%20rate%20(percent)?crq.qod'

{"crq.qod" : {

"crq.dataelementcompleteness": 0.8654708520179372,

"crq.datasetcompleteness": 0.7356502242152466,

...

}}
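On the service side, the `crq.` parameter convention could be handled along the following lines. This is a sketch in Python rather than the framework's own code; the function name is hypothetical, and how a condition such as `accuracy=0.5` is compared against the evaluated metric is left open because the slide does not say.

```python
from urllib.parse import parse_qsl

PREFIX = "crq."

def parse_concern_parameters(query_string):
    """Split crq.* parameters into value conditions and plain requests."""
    conditions, requested = {}, []
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if not name.startswith(PREFIX):
            continue                       # ordinary data parameter
        metric = name[len(PREFIX):]
        if value == "":
            requested.append(metric)       # e.g. ?crq.qod
        else:
            conditions[metric] = value.strip('"')
    return conditions, requested
```

A parameter with a value becomes a condition on the result, while a bare parameter such as `crq.qod` asks the service to return the evaluated metrics.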

Page 36: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework: QoD monitoring and composition

QoD concerns monitoring and composition are useful for the evaluation of aggregated data resources

Our approach

• Utilizing monitoring rules

• QoD metrics of data resources are passed to a rule engine

• Rules are user-defined for monitoring and composing QoD metrics
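The idea of user-defined rules over QoD metrics can be shown with plain Python functions in place of a rule engine. Taking the minimum as the composition rule is an assumption chosen for the example; other compositions (averages, weighted sums) are equally possible.

```python
def compose_min(metric):
    """Composition rule: an aggregate is only as good as its worst part."""
    def rule(resources):
        return min(r[metric] for r in resources)
    return rule

def monitor_threshold(metric, minimum):
    """Monitoring rule: report resources whose metric falls below a bound."""
    def rule(resources):
        return [r["id"] for r in resources if r[metric] < minimum]
    return rule

# Made-up QoD metrics for two data resources that are being aggregated.
resources = [
    {"id": "r1", "datasetcompleteness": 0.9},
    {"id": "r2", "datasetcompleteness": 0.6},
]
composed = compose_min("datasetcompleteness")(resources)
violations = monitor_threshold("datasetcompleteness", 0.7)(resources)
```

In a rule engine the same logic would be written as declarative rules, with the metrics inserted as facts.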

Page 37: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework experiments

Implementation

• Java, JAX-RS/Jersey, Drools

Utilizing UNDataAPI - www.undata-api.org

• XML data sets without QoD

Illustrating examples: check data from 1990-2009

• datasetcompleteness: the completeness of the list of countries

• dataelementcompleteness: the completeness of data elements in the list

RESTful services wrapping UNDataAPI
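The slide defines the two metrics only informally, so the formulas below are one possible interpretation, not the framework's definitions: data set completeness as the share of expected countries that appear at all, and data element completeness as the share of (country, year) elements that carry a value. The sample records are made up.

```python
def dataset_completeness(records, expected_countries):
    """Share of expected countries that appear in the data set at all."""
    present = {r["country"] for r in records}
    return len(present & set(expected_countries)) / len(expected_countries)

def data_element_completeness(records, years):
    """Share of (country, year) elements with a value, over listed countries."""
    countries = {r["country"] for r in records}
    filled = {(r["country"], r["year"]) for r in records
              if r["value"] is not None and r["year"] in years}
    return len(filled) / (len(countries) * len(years))

records = [
    {"country": "AT", "year": 1990, "value": 0.8},
    {"country": "AT", "year": 1991, "value": None},
    {"country": "VN", "year": 1990, "value": 2.1},
    {"country": "VN", "year": 1991, "value": 2.0},
]
```

With four expected countries and two years, this sample yields a data set completeness of 0.5 and a data element completeness of 0.75.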

Page 38: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework experiment: evaluating and annotating QoD metrics

Page 39: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework experiments: publishing QoD with data resources

Page 40: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

QoD framework experiments: simple rules for monitoring and composing QoD

Page 41: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Discussion time

HOW TO SCALE THE EVALUATION?

Page 42: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

ISSUES IN UTILIZING DATA CONCERNS

Page 43: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Elasticity

If data does not fit a purpose because data concerns do not meet the requirements of the consumer:

• The DaaS may enrich the data

• The consumer may switch to another DaaS

• The consumer may combine data from different DaaSs

• The consumer may combine data from a DaaS with its own data

Elasticity of data and data concerns
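The reactions listed above can be turned into a small decision sketch. The function names, the "minimum value per metric" form of requirements, and the order in which options are tried are assumptions for illustration.

```python
def meets(concerns, requirements):
    """True if every required metric reaches its minimum value."""
    return all(concerns.get(m, 0.0) >= v for m, v in requirements.items())

def choose_action(current, alternatives, requirements):
    """Pick a reaction when the current DaaS may not fit the purpose."""
    if meets(current["concerns"], requirements):
        return ("keep", current["name"])
    for candidate in alternatives:
        if meets(candidate["concerns"], requirements):
            return ("switch", candidate["name"])
    # No single DaaS fits: fall back to combining sources or enrichment.
    return ("combine_or_enrich", None)

# Hypothetical services with made-up concern values.
current = {"name": "daas-a", "concerns": {"completeness": 0.6}}
alternatives = [{"name": "daas-b", "concerns": {"completeness": 0.9}}]
```

For a requirement of 0.8 completeness, `choose_action(current, alternatives, {"completeness": 0.8})` suggests switching to `daas-b`; with no alternatives it falls back to combining or enriching data.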

Page 44: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Data fits your purpose

Data concern measurement

• Measurements are determined from the data

• Whether they fit your application depends on application contexts

Data concern interpretation

• Context-specific interpretation

• The same type of data with the same set of concern measurements might not fit the same application at different times/contexts

• Application-specific treatment!

Strongly related to data elasticity
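The split between measurement and interpretation can be made concrete with a short sketch: the measurements are fixed, while the verdict depends on the context. The context names and thresholds are invented for the example.

```python
# Hypothetical per-context requirements for the same measurements.
CONTEXT_REQUIREMENTS = {
    "exploratory_report": {"completeness": 0.6},
    "billing": {"completeness": 0.99},
}

def fits(measurements, context):
    """Interpret fixed measurements against a context's requirements."""
    required = CONTEXT_REQUIREMENTS[context]
    return all(measurements.get(m, 0.0) >= v for m, v in required.items())

measurements = {"completeness": 0.87}   # determined from the data only
```

The same measurements pass for `exploratory_report` and fail for `billing`, which is why the interpretation step belongs to the application rather than to the DaaS.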

Page 45: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Exercises

Read mentioned papers

Identify and analyze the relationships between data concerns evaluation tools and types of data

Analyze trade-offs between on-line and off-line evaluation and when we can combine them

Analyze how to utilize evaluated data concerns for optimizing data compositions

Analyze situations when software cannot be used to evaluate data concerns

Page 46: TUW-ASE-SUmmer 2014: Evaluating and Utilizing Data Concerns for DaaS

Thanks for your attention

Hong-Linh Truong

Distributed Systems Group, Vienna University of Technology

[email protected]

http://dsg.tuwien.ac.at/staff/truong