Active Data: Data Life Cycle Management Across Heterogeneous Systems and Infrastructures

Anthony Simonet, Gilles Fedak (INRIA)
Matei Ripeanu, Samer Al-Kiswany (UCB)
Kyle Chard, Ian Foster (ANL/UC)

Hot Topics in High-Performance Distributed Computing Workshop
IBM Almaden Research Center
San Jose, California
March 12, 2015



Big Data ...

- Huge and growing volume of information originating from multiple sources.

[Figure labels: Instruments, Simulations, Big Science, Open Data, Internet]

- ... or Big Bottlenecks?
  - How to scale the infrastructure?
    - End-to-end performance improvement, inter-system optimization.
  - How to improve the productivity of data-intensive scientists?
    - Data-oriented programming languages, data quality, improved automation and error recovery.


Data Life Cycle

Definition

Data Life Cycle (DLC) is the course of operational stages through which data pass from the time when they enter a set of systems to the time when they leave it.

[Figure: life cycle stages Acquisition, Preprocessing, Storage, and two Analysis stages]

Challenges:

- Expose a high-level view of the DLC across distributed systems and infrastructures
- Expose interactions between the infrastructure and the DLC (e.g., failures)


Active Data

Active Data:

- Allows reasoning about data sets handled by heterogeneous software and infrastructures.
- A formal model that captures the essential life cycle stages and properties: creation, deletion, faults, replication, error checking ...
- A programming model for easily developing data life cycle management applications.
- Allows legacy systems to expose their intrinsic data life cycle.


Active Data: Principles & Features

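System programmers expose their system’s internal data life cycle with a model based on Petri Nets.

A Life Cycle Model is made of

- Places: data states
- Transitions: data operations

[Figure: Petri net with places Created, Written, Read and Terminated, and transitions t1 to t4. A token (•) marks the Created place. A client handler is attached to a transition:]

public void handler() {
    computeMD5();
}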

Each token has a unique identifier, corresponding to that of the actual data item.

[Figure: same model, with the token (•) now in the Written place.]

A transition is fired whenever a data state changes.


Clients may plug code into transitions. It is executed whenever the transition is fired.
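The model described on these slides can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `LifeCycleSketch`, its methods, and the arcs chosen for `t1` and `t2` are hypothetical and are not taken from the Active Data API or from the figure. It shows places as data states, transitions as data operations, tokens identified by the data item's identifier, and client code run when a transition fires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, simplified model; none of these names come from Active Data.
class LifeCycleSketch {
    // A transition (data operation) moves a token from one place (data state) to another.
    private static final class Transition {
        final String from;
        final String to;
        Transition(String from, String to) { this.from = from; this.to = to; }
    }

    private final Map<String, Transition> transitions = new HashMap<>();
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    private final Map<String, String> placeByToken = new HashMap<>(); // token uid -> current place

    void addTransition(String name, String from, String to) {
        transitions.put(name, new Transition(from, to));
    }

    // Clients plug code into a transition; it receives the uid of the token that moved.
    void subscribe(String transition, Consumer<String> handler) {
        handlers.computeIfAbsent(transition, k -> new ArrayList<>()).add(handler);
    }

    // A new data item enters the life cycle; its token carries the item's identifier.
    void createToken(String uid, String place) {
        placeByToken.put(uid, place);
    }

    // Fire a transition for one token: check its place, move it, then run the handlers.
    void fire(String name, String uid) {
        Transition t = transitions.get(name);
        if (t == null) {
            throw new IllegalArgumentException("Unknown transition: " + name);
        }
        if (!t.from.equals(placeByToken.get(uid))) {
            throw new IllegalStateException("Token " + uid + " is not in place " + t.from);
        }
        placeByToken.put(uid, t.to);
        for (Consumer<String> h : handlers.getOrDefault(name, Collections.emptyList())) {
            h.accept(uid);
        }
    }

    String placeOf(String uid) {
        return placeByToken.get(uid);
    }

    public static void main(String[] args) {
        LifeCycleSketch lc = new LifeCycleSketch();
        lc.addTransition("t1", "Created", "Written"); // assumed arc
        lc.addTransition("t2", "Written", "Read");    // assumed arc
        lc.subscribe("t1", uid -> System.out.println("compute MD5 for " + uid));
        lc.createToken("file-42", "Created");
        lc.fire("t1", "file-42");
        System.out.println(lc.placeOf("file-42"));
    }
}
```

Keeping handlers in a map keyed by transition name means the system that fires a transition does not need to know which clients are listening, which is the decoupling the slides rely on.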


Active Data Framework

[Figure: life cycle view covering file transfer, file, dataset and metadata, with tagged tokens, a guard, code execution and notification]

Framework features:

- Captures data events in legacy systems
- High-level life cycle-centered view of data
  - Single namespace for all the files, datasets and metadata
- Powerful filters based on Data Tags
  - Install Taggers on Transitions
  - Guarded Transitions: only execute on tokens which have specific tags
- Publish/subscribe transitions
- Custom user reaction to data progress
  - Custom code execution
  - Custom notifications (twitter, email, gdoc, ifttt ...)
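The interplay of taggers, guards and subscriptions listed above can be sketched as follows. This is a minimal illustration, assuming a single transition represented by one object; `GuardSketch`, `Token`, `installTagger`, `subscribe`, `publish` and the tag name are all hypothetical and do not reproduce the Active Data API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of tagged tokens and guarded subscriptions on one transition.
class GuardSketch {
    static final class Token {
        final String uid;
        final Set<String> tags = new HashSet<>();
        Token(String uid) { this.uid = uid; }
        boolean hasTag(String tag) { return tags.contains(tag); }
    }

    // A subscription pairs a guard (a filter on the token) with the handler to run.
    private static final class Subscription {
        final Predicate<Token> guard;
        final Consumer<Token> handler;
        Subscription(Predicate<Token> guard, Consumer<Token> handler) {
            this.guard = guard;
            this.handler = handler;
        }
    }

    private final List<Consumer<Token>> taggers = new ArrayList<>();
    private final List<Subscription> subscriptions = new ArrayList<>();

    void installTagger(Consumer<Token> tagger) {
        taggers.add(tagger);
    }

    void subscribe(Predicate<Token> guard, Consumer<Token> handler) {
        subscriptions.add(new Subscription(guard, handler));
    }

    // Called when the transition fires: taggers run first, then the guarded handlers.
    // Returns the number of handlers that were actually executed.
    int publish(Token token) {
        for (Consumer<Token> tagger : taggers) {
            tagger.accept(token);
        }
        int executed = 0;
        for (Subscription s : subscriptions) {
            if (s.guard.test(token)) {
                s.handler.accept(token);
                executed++;
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        GuardSketch transition = new GuardSketch();
        // Tagger: mark tokens whose identifier suggests a temporary file (illustrative rule).
        transition.installTagger(t -> { if (t.uid.endsWith(".tmp")) t.tags.add("temporary"); });
        // Guarded handler: only runs for tokens carrying the "temporary" tag.
        transition.subscribe(t -> t.hasTag("temporary"),
                             t -> System.out.println("cleaning " + t.uid));
        transition.publish(new Token("scan-001.tmp"));
        transition.publish(new Token("scan-002.h5"));
    }
}
```

Running taggers before guards lets one client classify data while another reacts only to the classes it cares about.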


Use Case: Advanced Photon Source

[Figure: data flow between the Detector, Local Storage, Compute Cluster, Globus Catalog and Globus]

1. Local transfer
2. Extract metadata
3. Globus transfer
4. Swift parallel analysis

- 3 to 5 TB of data per week on this detector
- Raw data are pre-processed and registered in the Globus Catalog:
  - Data are curated by several applications
  - Data are shared amongst scientific users


Data Surveillance Framework

Four goals (that would otherwise require a lot of scripting and hacking):

- Monitoring Data Set Progress
- Better Automation
- Sharing & Notification
- Error Discovery & Recovery


APS Data Life Cycle Model

[Figure: APS data life cycle model. Labels recovered from the figure, grouped by system:]

- Detector: Created, Start transfer, End, Terminated
- Globus transfer: Created, Success, Failure, Succeeded, Failed, End, End transfer, Terminated
- Shared storage: Created, Start transfer, End, Terminated
- Globus transfer: Created, Success, Failure, Succeeded, Failed, End, Start Swift, Terminated
- Globus Catalog: Created, Extract, Update, Remove, Terminated
- Swift: Created, Initialize, Set, Derive, Failure, End, Terminated

Data life cycle model composed of 6 systems.
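One way to picture a single life cycle spanning several systems is a registry in which every place is addressed as "System.Place". The sketch below is an assumption-laden illustration: `ComposedLifeCycleSketch`, its methods and the identifiers are hypothetical, and the dotted naming is borrowed only from names such as "Shared storage.Created" that appear in the handler code later in the deck.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one life cycle instance spanning several systems,
// with places addressed as "System.Place". Not the Active Data implementation.
class ComposedLifeCycleSketch {
    private final Map<String, List<String>> tokensByPlace = new LinkedHashMap<>();

    static String qualify(String system, String place) {
        return system + "." + place;
    }

    // Record that the data item is known to a system under the given identifier.
    void addToken(String system, String place, String uid) {
        tokensByPlace.computeIfAbsent(qualify(system, place), k -> new ArrayList<>()).add(uid);
    }

    // Look up the identifiers of the tokens in a qualified place; empty if none.
    List<String> getTokens(String qualifiedPlace) {
        return Collections.unmodifiableList(
                tokensByPlace.getOrDefault(qualifiedPlace, Collections.emptyList()));
    }

    public static void main(String[] args) {
        ComposedLifeCycleSketch lc = new ComposedLifeCycleSketch();
        // The same dataset carries a different identifier in each system (illustrative values).
        lc.addToken("Detector", "Created", "acq-0001");
        lc.addToken("Shared storage", "Created", "dataset-7");
        lc.addToken("Swift", "Created", "job-3");
        // A client reacting to an event in Swift can find the identifier used on shared storage.
        System.out.println(lc.getTokens("Shared storage.Created"));
    }
}
```

A shared namespace of this kind is what lets a handler subscribed to one system act on the same dataset in another.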


Error Detection & Recovery


Example scenario

Recover from system-wide errors: faulty acquired files are detected only after Swift fails to process them.

In this situation, the user manually:

- Drops the whole dataset
- Removes any associated files and metadata
- Re-acquires the dataset using the same parameters


E.D.&R. implementation

[Figure: the APS data life cycle model (6 systems, as on the previous model slide) annotated with Active Data clients. Annotations recovered from the figure, in extraction order:]

- Active Data Client, Failure and Recovery Handler: remove metadata from the catalog
- Filters data likely to fail
- Tagger
- Active Data Client: Guard, Handler
- Run the Globus catalog UI scripts


Handler Code

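import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils; // assumed source of FileUtils (Apache Commons IO)

// "ad" is the Active Data client, assumed to be initialized elsewhere.
TransitionHandler handler = new TransitionHandler() {
    public void handler(Transition t, boolean isLocal, Token[] inTokens, Token[] outTokens) {
        try {
            // Get the dataset identifier
            LifeCycle lc = ad.getLifeCycle(inTokens[0]);
            String datasetId = lc.getTokens("Shared storage.Created")[0].getUid();

            // Remove the dataset annotations from the catalog
            String url = "https://catalog.globus.org/dataset/" + datasetId;
            Runtime r = Runtime.getRuntime();
            Process p = r.exec(new String[] {"catalog_client.py", "remove", url});
            p.waitFor();

            // Locally, remove the dataset ("~" is not expanded by java.io.File)
            String path = System.getProperty("user.home") + "/aps/" + datasetId;
            FileUtils.deleteDirectory(new File(path));

            // Publish the "Detector.End" transition
            Token root = lc.getTokens("Detector.Created")[0];
            ad.publishTransition("Detector.End", lc);

            // Notify the user
            sendEmail("[email protected]", "APS - Corrupted dataset " + datasetId);
        } catch (IOException e) {
            throw new RuntimeException("Recovery failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
};

// Only react to tokens tagged as corrupted failures.
// The tag is garbled on the slide; its exact spelling is an assumption.
HandlerGuard guard = new HandlerGuard() {
    public boolean accept(Transition t, Token[] inTokens, Token[] outTokens) {
        return inTokens[0].hasTag("failure corrupted");
    }
};

ad.subscribeTo("Swift.Failure", handler, guard);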


Conclusion

Active Data

- Exposes the Data Life Cycle across heterogeneous systems and infrastructures
- Provides a transition-based programming model for DLC management applications
  - Monitoring, automation, error detection & recovery
  - X-systems optimizations: incremental computing, data staging, caching, throttling, etc.

Perspectives:

- Use AD to deploy a data management software stack on IaaS (Asma Ben Cheick, Heithem Abbes, Univ. Tunis)
- Big Data Apache stack X-optimization (H. He, CAS, Beijing)
- Volunteer & crowd computing (M. Moca, BBU, Romania)


Thank you!

Questions?