NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

Post on 28-Dec-2015


Page 1: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI WP4 (Data Grid Environment)

Hideo Matsuda

Osaka University

Page 2: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI Software Stack (Beta Ver. 2006)

[Figure: NAREGI software stack diagram. Components shown:]

• Grid-Enabled Nano-Applications (WP6)
• Grid PSE
• Grid Programming (WP2): Grid RPC, Grid MPI
• Grid Visualization
• Grid Workflow (WFML (Unicore + WF))
• Super Scheduler
• Distributed Information Service (CIM)
• Grid VM (WP1)
• Data (WP4)
• Packaging
• Grid Security and High-Performance Grid Networking (WP5)
• (WSRF (GT4 + Fujitsu WP1) + GT4 and other services)
• SuperSINET
• Computing Resources and Virtual Organizations: NII, IMS, Research Organizations, Major University Computing Centers
• Work-package labels also shown: WP1, WP3, WP4

Page 3: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI Data Grid Overview

Composed of three parts
• Data Access Management System
• Metadata Management System
• Data Resource Management System

Functionalities
• Grid File System using AIST Gfarm
– Logical to physical file name mapping, file replication, etc.
• File Import to WFT (Workflow Tool)
• File Staging Support to SS (Super Scheduler)
• File Retrieval by Metadata
– For storing file attributes (owner, creation date, and condition, i.e. what application creates the file, etc.)
– SQL access to Metadata DB (OGSA-DAI)

Page 4: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI Data Grid Architecture (beta 1)

[Figure: architecture diagram. Elements shown:]

• User, WFT, SS
• Data Access Management, Metadata Management, Data Resource Management
• Metadata DB
• Grid File System (AIST Gfarm) with File Mapping
• Fileserver#1, Fileserver#2, Fileserver#3 holding File1, File2, File3
• Two computation nodes, each with a Job and a File
• Labeled flows: File Import; File Staging or File Access

Page 5: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI Data Grid Functionalities

[Figure: data flow between the Grid Workflow and the Data Grid Components. Elements shown:]

• Grid Workflow: Job 1, Job 2, ..., Job n with Data 1, Data 2, ..., Data n
• Data Grid Components: Data Access Management, Metadata Management, Data Resource Management
• Grid-wide Data Sharing Service
• Grid-wide File System
• Metadata attached to data

Labeled operations:
• Import data into workflow
• Place & register data on the Grid
• Assign metadata to data
• Store data into distributed file nodes

Page 6: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

NAREGI WP4: Standards Employed in the Architecture

[Figure: architecture diagram annotated with software and standards. Elements shown:]

• Workflow (NAREGI WFML => BPEL + JSDL): Job 1, Job 2, ..., Job n; Import data into workflow
• Super Scheduler (SS) (OGSA-RSS); OGSA-RSS FTS SC; Data Staging; GridFTP
• Data Access Management; Metadata Construction; Data Specific Metadata DB
• Data Resource Management; Data Resource Information DB; Place data on the Grid
• Gfarm 1.2 PL4 (Grid FS): Data 1, Data 2, ..., Data n
• Computational Nodes; Filesystem Nodes

Software versions shown:
• OGSA-DAI WSRF 2.0 (shown twice)
• PostgreSQL 8.0 (shown twice)
• Globus Toolkit 4.0.1
• Tomcat 5.0.28

Page 7: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

Grid File System

• Adoption of AIST Gfarm 1.2.9 (http://datafarm.apgrid.org/index.en.html)
• UNIX file I/O API (open, close, read, write, etc.) and file/directory operations (mkdir, rmdir, ls, etc.)
• Single Virtual Filesystem (Logical to Physical File Name Mapping)
– Client library, file servers, metadata server (mapping & space balancing)
– Mapping “/gfarm/file1” to “HOST:/GFARM_DIR/file1”
– Extensible file space by adding many file servers
• File replication to user-specified hosts
• File Metadata (managed with PostgreSQL)
– File location (host, pathname), username, access/modify/creation dates, size
• Access Security with GT4 GSI API
– Certificate-based file access, access using gridmapfile
• WSRF interface to File Metadata with OGSA-DAI
• Access Control not yet implemented (planned in Gfarm 2.0)
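The logical-to-physical mapping and replication bullets above can be illustrated with a small sketch. This is not the Gfarm API: the `FileCatalog` class, its methods, and the host names `fs1` and `fs2` are hypothetical, and an in-memory dictionary stands in for the metadata server. Only the `/gfarm/file1` to `HOST:/GFARM_DIR/file1` pattern comes from the slide.

```python
from dataclasses import dataclass, field

LOGICAL_ROOT = "/gfarm/"  # logical namespace prefix, as in the slide's example


@dataclass
class FileCatalog:
    """Toy stand-in for a metadata server: logical path -> physical replicas."""

    spool_dir: str = "/GFARM_DIR"  # assumed spool directory on every file server
    replicas: dict = field(default_factory=dict)

    def _physical(self, logical_path: str, host: str) -> str:
        # Translate "/gfarm/<name>" into "<host>:<spool_dir>/<name>".
        if not logical_path.startswith(LOGICAL_ROOT):
            raise ValueError(f"not a logical path: {logical_path}")
        relative = logical_path[len(LOGICAL_ROOT):]
        return f"{host}:{self.spool_dir}/{relative}"

    def register(self, logical_path: str, host: str) -> str:
        """Record that `host` stores a copy of `logical_path`."""
        physical = self._physical(logical_path, host)
        locations = self.replicas.setdefault(logical_path, [])
        if physical not in locations:
            locations.append(physical)
        return physical

    def replicate(self, logical_path: str, target_host: str) -> str:
        """Add a replica of an already registered file on a user-specified host."""
        if logical_path not in self.replicas:
            raise KeyError(logical_path)
        return self.register(logical_path, target_host)

    def resolve(self, logical_path: str) -> list:
        """Return every physical location known for a logical path."""
        return list(self.replicas[logical_path])


catalog = FileCatalog()
catalog.register("/gfarm/file1", "fs1")
catalog.replicate("/gfarm/file1", "fs2")
print(catalog.resolve("/gfarm/file1"))
```

Adding a file server in this sketch only means registering files under a new host name, which mirrors the "extensible file space" bullet; space balancing and the actual data transfer are left out.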

Page 8: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

File Import & Staging

• Import files in GFS (Grid File System) to user workflows.
• Jobs can use imported files on their local disks (file staging).
• SS transfers GFS files to job nodes.
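The staging step can be pictured as a copy from the grid file system into a job's working directory. The sketch below is only an illustration: `stage_in`, the directory names, and the sample file are hypothetical, and a local `shutil` copy stands in for whatever transfer mechanism the Super Scheduler actually uses.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(gfs_root: Path, logical_names, job_dir: Path):
    """Copy each imported file from the GFS view into the job's local directory."""
    job_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in logical_names:
        source = gfs_root / name
        target = job_dir / Path(name).name  # flatten into the job directory
        shutil.copy2(source, target)
        staged.append(target)
    return staged


def demo():
    """Build a throwaway 'GFS' directory, stage one file, and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        gfs_root = Path(tmp) / "gfarm"
        gfs_root.mkdir()
        (gfs_root / "input.dat").write_text("42\n")  # made-up sample input
        job_dir = Path(tmp) / "node1" / "job-0001"   # hypothetical job-local disk
        staged = stage_in(gfs_root, ["input.dat"], job_dir)
        return [(path.name, path.read_text()) for path in staged]


print(demo())
```

The job then reads `input.dat` from its own directory and never touches the grid file system directly, which is the point of staging as opposed to direct file access.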

Page 9: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

File Metadata Management

• Many applications (or many instances of the same application with different parameters) can be coupled in a workflow.
• These applications need many input and output files.
• This system provides a metadata DB that stores application and parameter information as file attributes.
• Users can retrieve files on GFS by their metadata.
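Retrieval by metadata amounts to a SQL query over a table of file attributes. The sketch below assumes a flat table; the table name, column names, and sample rows are invented for illustration, and Python's built-in `sqlite3` stands in for the PostgreSQL database reached through OGSA-DAI.

```python
import sqlite3


def build_metadata_db():
    """Create an in-memory flat table of file attributes with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE file_metadata (
               logical_path TEXT PRIMARY KEY,
               owner        TEXT,
               created      TEXT,
               application  TEXT,
               parameters   TEXT
           )"""
    )
    rows = [
        ("/gfarm/run1/out.dat", "alice", "2006-01-10", "md_sim", "temp=300"),
        ("/gfarm/run2/out.dat", "alice", "2006-01-11", "md_sim", "temp=350"),
        ("/gfarm/fft/out.dat", "bob", "2006-01-12", "fft", "n=1024"),
    ]
    conn.executemany("INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_files(conn, application, parameters=None):
    """Return logical paths of files produced by an application, optionally
    restricted to one parameter setting."""
    sql = "SELECT logical_path FROM file_metadata WHERE application = ?"
    args = [application]
    if parameters is not None:
        sql += " AND parameters = ?"
        args.append(parameters)
    sql += " ORDER BY logical_path"
    return [row[0] for row in conn.execute(sql, args)]


conn = build_metadata_db()
print(find_files(conn, "md_sim"))
print(find_files(conn, "md_sim", "temp=350"))
```

Storing the parameters as a single text column keeps the table flat, which is simple but limits queries to exact matches on the whole parameter string.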

Page 10: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

[Figure: component deployment diagram.]

Nodes shown:
• User Client
• Portal Server Node
• Workflow Tool (WFT) Node
• Super Scheduler (SS) Node
• Metadata Node
• Gfarm Staging Node
• Gfarm Node (two)
• Computation Node

Components shown:
• OGSA-DAI Client, Workflow Tool Client, Data Access Management Client
• Metadata Management Frontend, Data Access Management Frontend, Data Resource Management Frontend
• File-Import Tool, WFT, SS File Staging Tool
• OGSA-DAI, PostgreSQL Metadata DB, PostgreSQL Gfarm DB
• Gfarm Server, Gfarm Client, Local Disk, GridFTP Server
• GridVM, Job
• GRAM4 File Staging / Import Interface

Page 11: NAREGI WP4 (Data Grid Environment) Hideo Matsuda Osaka University

EGEE-NAREGI Interoperation Issues in Data Management

• Data Transfer
– Both groups use GridFTP?
• Storage Element
– NAREGI does not have this. Adoption of SRMv2.1 or SRB?
• Replication Catalog
– NAREGI only provides simple file replication (flat table). Adoption of LFC or RLS?
• Metadata Catalog
– NAREGI Metadata Schema & Structure are not fixed yet. Again, it is just a flat table. Adoption?