Data Replication

Morris Riedel, Jedrzej Rybicki
{m.riedel, j.rybicki}@fz-juelich.de
Juelich Supercomputing Center (GER)
Date: 8.03.2012
EUDAT Task Force



Content
1. The idea of Task Forces
2. Goal: What is replication?
3. EUDAT Architecture
4. Technical details of replication
5. Time line
6. Summary


Requests for the presentation


Temporary Slide
• Where we are and how we intend to implement the services
  ● But don't be too technical!
• Safe Replication:
  ● Timing
  ● Functionality
  ● Main components


EUDAT Working Principle


Task Force Islands


EPOS


Data Replication


Motivation:
➔ Ensure bitstream preservation
➔ Enable data curation functionality
➔ Improve data accessibility

Common functionality:
Create M replicas (identified by a PID record) at different data centers for N years, excluding centers X and maintaining the given access permissions.
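
The common functionality above can be read as a small placement policy. The following Python sketch is only an illustration of that reading, not the EUDAT or iRODS implementation: the names `ReplicationPolicy` and `plan_replicas`, the center names, and the PID string are all hypothetical, centers are simply taken in the order given, and access permissions are not modeled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical container for the parameters named on the slide."""
    replicas: int                # M: number of replicas to create
    retention_years: int         # N: how long the replicas are kept
    excluded_centers: frozenset  # X: centers that must not hold a replica


def plan_replicas(pid, centers, policy, start):
    """Pick target centers for one object and compute the retention end date."""
    # Drop the excluded centers, keeping the original order of the rest.
    eligible = [c for c in centers if c not in policy.excluded_centers]
    if len(eligible) < policy.replicas:
        raise ValueError("not enough eligible centers for the requested replicas")
    targets = eligible[: policy.replicas]
    try:
        keep_until = start.replace(year=start.year + policy.retention_years)
    except ValueError:
        # start was 29 February and the target year is not a leap year
        keep_until = start.replace(year=start.year + policy.retention_years, day=28)
    return {"pid": pid, "targets": targets, "keep_until": keep_until}


# Example: M = 2 replicas for N = 10 years, excluding the (made-up) "center-b".
policy = ReplicationPolicy(
    replicas=2, retention_years=10, excluded_centers=frozenset({"center-b"})
)
plan = plan_replicas(
    "pid-0001", ["center-a", "center-b", "center-c", "center-d"], policy, date(2012, 3, 8)
)
print(plan)
```

In a real deployment the choice of targets would presumably be driven by site policies rather than by list order; the sketch only shows how M, N, and X constrain the result.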


High-level Idea


Components


Technologies:
● Long Term Archives → Community-specific
● Policy-based Replication → iRODS
● Persistent Identifiers → EPIC/Handle

Orthogonal aspects:
● AAI
● Monitoring
● Center Registry
● Metadata


[Figure: architecture diagram. Labels: data center and community center (storage and compute resources); common services (core functions, service adapters, list of profiles); replication (iRODS); authorization, access and control (iRODS rules, iCAT ACLs); EUDAT TF AAI outcomes; PID; monitoring; firewalls; community virtual research environments (scientific application-specific VRE, other VREs); virtual workspaces; models; workbench features (iRODS client, monitoring); Web 2.0 YouTube-like features.]


iRODS


Integrated Rule-Oriented Data System

Data grid software system developed by the Data Intensive Cyber Environments (DICE) research group

Deployments: NASA, CC-IN2P3, EU SHAMAN, Australian ARCS, UK e-Science, King’s College,...

Adaptive middleware with Rule-Oriented Programming (ROP)
● One size does not fit all
● Community-specific operations can be realized by defining rules
● Rule: a workflow composed of microservices
● Execution: triggered by the middleware or by the user


iRODS Components

● User Interface (access and manage data & metadata)
● iRODS Server (data on disk)
● iRODS Metadata Catalog Database (state of data)
● iRODS Rule Engine (implement policies)

Web GUI, iRODS GUI, command line, DSpace, Fedora, Kepler, WebDAV, FUSE


iRODS Rules Example


● Format:

#action|conditions separated by &&|call_functions separated by ##|rollback_functions separated by ##

● Replication upon ingest:

acPostProcForPut||msiSysReplDataObj(seq,all)|nop

● Example with a workflow of microservices:

acCreateUser||msiCreateUser##acCreateDefaultCollections##msiCommit|msiRollback##msiRollback##nop
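
To make the four-field rule format above concrete, here is a minimal Python sketch that splits a rule line into its parts. It is not part of iRODS: `Rule` and `parse_rule` are hypothetical names, and the sketch assumes that no microservice argument contains a `|` character.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical in-memory form of one rule line."""
    action: str
    conditions: list
    calls: list
    rollbacks: list


def _split(field, sep):
    # Split a field on its separator and drop empty or whitespace-only parts.
    return [part.strip() for part in field.split(sep) if part.strip()]


def parse_rule(line):
    """Split 'action|conditions|call_functions|rollback_functions'."""
    fields = line.split("|")
    if len(fields) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(fields)}")
    action, conditions, calls, rollbacks = fields
    return Rule(
        action.strip(),
        _split(conditions, "&&"),  # conditions are separated by &&
        _split(calls, "##"),       # call functions are separated by ##
        _split(rollbacks, "##"),   # rollback functions are separated by ##
    )


# The workflow example from the slide: three calls, three rollback entries.
rule = parse_rule(
    "acCreateUser||msiCreateUser##acCreateDefaultCollections##msiCommit"
    "|msiRollback##msiRollback##nop"
)
print(rule.action, rule.calls, rule.rollbacks)
```

Parsed this way, the example lists one rollback entry per call function, which is how the slide's example appears to be laid out.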


From Community Solution to EUDAT Architecture

[Figure: architecture derivation diagram. Labels: input (requirements, motivation, goals); related work (standards, specifications, profiles, protocols); architecture work, from abstract to concrete (EUDAT reference model, reference architectures, patterns & common services, concrete architectures, service oriented architecture implementations); arrows labeled "guided by", "accounts for", "considers", "constrained by", "use", "derived"; data architecture implementations for ENES/DKRZ, EPOS/CINECA, CLARIN/RZG SARA, and "Your Community".]


Implementation Plan

Segment 1: Service Building (February 2012: done!)
● Install software (iRODS, ...)
● Test replication (test data, metrics)
● Discuss replication policies

Segment 2: Test and Evaluation (May 2012)
● Test PID registration
● Evaluate performance and accessibility

Segment 3: Production (July 2012)
● Integrate into monitoring
● Produce documentation
● Hand over to production (WP5/WP6)


Summary

Data Replication Task Force

● Implements the data replication service
● No one-size-fits-all solution is sought...
  – 80% of the island solution will be common
  – 3*20% will be community-specific functionality
● Easy integration of new communities:
  – Enjoy basic services
  – Use existing microservices and rules to tailor a solution that suits you

JOIN US!