naregi wp4 (data grid environment) hideo matsuda osaka university
TRANSCRIPT
NAREGI WP4(Data Grid Environment)
Hideo Matsuda
Osaka University
2NAREGI Software Stack (Beta Ver. 2006)
Computing Resources and Virtual Organizations
NII IMS Research Organizations
Major University Computing Centers
((WSRF (GT4+Fujitsu WP1) + GT4 and other services)WSRF (GT4+Fujitsu WP1) + GT4 and other services)
SuperSINET
Grid-Enabled Nano-Applications (WP6)
Grid PSEGrid Programming (WP2)
-Grid RPC -Grid MPI
Grid Visualization
Grid VM (WP1)
Packag
ing
DistributedInformation Service
(CIM)
Grid Workflow (WFML (Unicore+ WF))
Super Scheduler
Grid Security and High-Performance Grid Networking (WP5)
Data (W
P4)WP1WP1
WP3
WP4
3
NAREGI Data Grid Overview
Composed of three parts• Data Access Management System• Metadata Management System• Data Resource Management System
Functionalities• Grid File System using AIST Gfarm
– Logical to physical fIle name mapping, File replication, etc • File Import to WFT (Workflow Tool)• File Staging Support to SS (Super Scheduler)• File Retrieval by Metadata
– For storing file attributes (owner, created date & condition (what application creates the file, etc.). SQL access to Metadata DB (OGSA-DA)
4
Grid File System
(AIST Gfarm)
MetadataManagement
WFTUser
Fileserver#1 Fileserver#2 Fileserver#3
File Mapping
SS
MetadataDB
Data AccessManagement
Data Resource
Management
File1 File2 File3
Computation node
Job File
Computation node
Job File
File Staging or File Access
NAREGI Data Grid Architecture(beta 1)
File Import
5
Data 1 Data 2 Data n
Grid-wide File System
MetadataManagement
Data Access Management
Data ResourceManagement
Job 1
Meta-data
Meta-data
Data 1
Grid Workflow
Data 2 Data n
NAREGI Data Grid Functionalities
Job 2 Job n
Meta-data
Job 1
Grid-wide Data Sharing Service
Job 2
Job n
Data Grid Components
Import data into workflow
Place & register data on the Grid
Assign metadata to data
Store data into distributed file nodes
6
Data 1 Data 2 Data n
Gfarm 1.2 PL4(Grid FS)
Data Access Management
NAREGI WP4: Standards Employed in the Architecture
Job 1
Data SpecificMetadata DB
Data 1 Data n
Job 1 Job nImport data into workflow
Job 2
Computational Nodes
FilesystemNodes
Job n
Data ResourceInformation DB
OGSA-DAI WSRF2.0
OGSA-DAI WSRF2.0
Globus Toolkit 4.0.1
Tomcat5.0.28
PostgreSQL 8.0
PostgreSQL 8.0
Workflow (NAREGI WFML =>BPEL+JSDL)
Super Scheduler (SS) (OGSA-RSS)
Data Staging
Place data on the Grid
Data Resource Management
Metadata Construction
OGSA-RSSFTS SC
GridFTP
7
Grid File System• Adoption of AIST Gfarm 1.2.9
http://datafarm.apgrid.org/index.en.html• UNIX file I/O API (open, close, read, write, etc.), and file/directory operatio
ns (mkdir, rmdir, ls, etc.).
• Single Virtual Filesystem (Logical to Physical File Name Mapping)Client library, File servers, Metadata server (mapping & space balancing)Mapping “/gfarm/file1” to “HOST:/GFARM_DIR/file1” Extensible file space by adding many file servers
• File replication to user-specified hosts.• File Metadata (management with PostgreSQL)
File location (host, pathname), Username, Access/ Modify/ Creation dates, Size
• Access Security with GT4 GSI APICertificate-based file access, Access using gridmapfile
• WSRF interface to File Metadata with OGSA-DAI• Access Control not yet implemented (planned in Gfarm 2.0)
8
File Import & Staging• Import files in GFS (Grid File System) to
user workflows.
• Imported files can be used from jobs in their local disks (file staging).
• SS transfers GFS
files to Job nodes.
9
File Metadata Management
• Many applications (or many instances of the same application with different. parameters) can be coupled in a workflow.
• Need many input and output files of the applications.
• This system provides a metadata DB storing application and parameter information as file attributes.
• User can retrieve files on GFS by their metadata.
10
Computation Node
Local Disk
GridFTP Server
GridVM
Job
Gfarm Client
Super Scheduler (SS) Node
SS File Staging Tool
Portal Server Node
Gfarm Staging Node
Workflow Tool (WFT) Node
WFT
Metadata Node
PostgreSQLMetadata DB
UserClient
OGSA-DAI Client
Metadata Management Frontend
Data Access Management Frontend File-Import
Tool
Data Resource Management Frontend
OGSA-DAI Client
Workflow Tool Client
Data Access Management Client
OGSA-DAI
Gfarm Node
Local Disk
Gfarm Server
Gfarm Node
Local Disk
Gfarm Server
Gfarm Client
GRAM4 FIle Staging / import Interface
Local DiskGridFTP Server
OGSA-DAI
PostgreSQLGfarm DB
11EGEE-NAREGI Interoperation Issues in Data Management
• Data Transfer– both groups use GridFTP ?
• Storage Element– NAREGI does not have this. Adoption of SRMv2.1 ? o
r SRB?
• Replication Catalog– NAREGI only provides simple file replication (flat tabl
e). Adoption of LFC? or RLS?
• Metadata Catalog– NAREGI Metadata Schema & Structure are not fixed
yet. Again it is just a flat table. Adoption ?