TRANSCRIPT

Core SRB Technology for 2005 NCOIC Workshop
By Michael Wan and Wayne Schroeder
SDSC/UCSD/NPACI
Outline
• Basic concepts behind SRB
• SRB architecture
• SRB features
• SRB usage model
• Wayne:
  – SRB productization – installation, administration, etc.
  – Security and authentication
  – Examples and demo
Initial Design of SRB
• Transparency and uniformity
  – Data are increasingly distributed
  – Design goal: use a single interface and authorization mechanism to access data across:
    • Multiple hosts
    • Multiple OS platforms
    • Multiple resource types (UNIX FS, HPSS, UniTree, DBMS, ...)
Initial Design of SRB
• Global view
  – Global Logical Name Space
    • Data organization
    • UNIX-like directories (collections) and files (data)
    • Mapping of logical names to physical attributes: host address, physical path
    • UNIX-like APIs and utilities
  – Single Global User Name Space
    • Single sign-on
    • No need for a UNIX account on every system
    • Robust access control
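The logical-name-space idea above can be sketched in a few lines: a catalog maps a UNIX-like logical path to physical attributes, so clients never need to know where the bytes actually live. This is a toy illustration — the dictionary layout and field names here are invented, not the real MCAT schema.

```python
# Toy catalog: logical SRB path -> physical location attributes.
# Field names ("host", "physical_path", "resource") are illustrative only.
CATALOG = {
    "/home/demo/collection1/mfile": {
        "host": "srb1.example.edu",
        "physical_path": "/vault/demo/mfile",
        "resource": "demoResc",
    },
}

def resolve(logical_path):
    """Translate a logical SRB path into its physical location."""
    try:
        return CATALOG[logical_path]
    except KeyError:
        raise FileNotFoundError(logical_path)

loc = resolve("/home/demo/collection1/mfile")
print(loc["host"], loc["physical_path"])
```

Because every client goes through this one mapping, files can be moved or replicated physically without changing the names users see.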
SRB Architecture
• Federated middleware system
• Client/server model
  – Federation of resource servers with uniform interfaces
    • client-server
    • server-server
  – Each request handler has two versions:
    • Local
    • Remote – pass the request off to a server that can handle it
  – All servers use the same software
    • Simplicity – easy to implement, easy to debug
  – Robust access control
    • user level – grant access to multiple users
    • group level
    • tickets
• MCAT – metadata catalog
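The "two versions per request handler" point can be illustrated with a small sketch (invented names, not SRB source code): a server handles a request itself when the target resource is local, and otherwise forwards it to the server that owns the resource.

```python
# Toy model of local/remote request dispatch. Each server runs the same
# code; only the comparison of local vs. target host decides the path taken.
SERVERS = {"hostA": [], "hostB": []}   # hostname -> queue of forwarded requests

def handle_request(local_host, target_host, request):
    if target_host == local_host:
        # "local" version of the handler: do the work here
        return f"handled locally on {local_host}: {request}"
    # "remote" version: pass off to the server that can handle it
    SERVERS[target_host].append(request)
    return f"forwarded to {target_host}: {request}"

print(handle_request("hostA", "hostA", "open /home/demo/f1"))
print(handle_request("hostA", "hostB", "open /home/demo/f2"))
```

Running identical software everywhere is what makes this symmetric forwarding simple to implement and debug, as the slide notes.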
Federation of Servers
[Diagram: an MCAT server connected to Server1 and Server2, which together form a federation]
SRB as a Data Grid
[Diagram: several SRB servers and an MCAT backed by a DB, forming a data grid]
• A data grid has an arbitrary number of servers
• Complexity is hidden from users
SRB Server Design
• Three-layer design
  – Top layer
    • Interacts with clients and other servers through TCP/IP sockets
    • User authentication
    • Handles function requests – parses requests and invokes handlers in the middle and bottom layers
SRB Server Design (cont.)
• Middle layer (logical layer)
  – Most requests pass through here
  – Input parameters are in their logical representations (logical path name, logical resource name)
  – Generally, two types of requests:
    • Data access
      – Queries MCAT; translates from logical to physical representations
      – Calls functions in the bottom (physical) layer to access data
    • Metadata access
      – Interacts with MCAT
SRB Server Design (cont.)
• Bottom layer (physical layer)
  – Where all data I/O to/from resources is done
  – Handles three types of resources:
    • File systems
      – Drivers to interface with different file systems
      – Supported: UNIX, HPSS, ADS, UniTree, GridFTP (to be released)
    • DB large objects
    • DB tables – access DB tables (query, insert, ...)
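The driver idea in the physical layer can be sketched as one uniform read interface backed by a driver per storage type. The class and method names below are invented for illustration; the real SRB drivers are C modules with a richer interface.

```python
# Toy driver abstraction: the layers above call physical_read() the same
# way regardless of whether the bytes live on a UNIX FS or in HPSS.
class UnixFsDriver:
    def read(self, path):
        return f"bytes from UNIX file {path}"

class HpssDriver:
    def read(self, path):
        return f"bytes staged from HPSS tape {path}"

DRIVERS = {"unix": UnixFsDriver(), "hpss": HpssDriver()}

def physical_read(resource_type, physical_path):
    # by this point the middle layer has already mapped logical -> physical
    return DRIVERS[resource_type].read(physical_path)

print(physical_read("hpss", "/hpss/vault/demo/mfile"))
```

Adding support for a new resource type then means writing one new driver, without touching the logical layer.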
SRB Features – Authentication
• Supports two authentication schemes:
  – Encrypt1 (SDSC) – no plain-text password over the net
  – GSI (Globus)
• Wayne will give details
Performance Enhancement
• Parallel I/O
  – For transferring large files
  – Uses multiple threads for data transfer and disk I/O
  – Interfaces with HPSS's mover protocol for parallel I/O
  – Parallel third-party transfer for copy and replicate
  – One-hop data transfer between client and data resource
• Bulk operations
  – For uploading and downloading large numbers of small files
  – Multi-threaded
  – Bulk registration – 500 files in one call
  – 3-10x speedup
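The parallel-I/O idea — split a large file into byte ranges and move the ranges concurrently — can be sketched as below. This toy version copies between local files with threads; the real SRB moves ranges over parallel TCP connections, which this sketch does not attempt.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_range(src, dst, offset, length):
    # each worker copies one byte range at its own offset
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))

def parallel_copy(src, dst, threads=4):
    size = os.path.getsize(src)
    with open(dst, "wb") as f:          # preallocate the destination
        f.truncate(size)
    chunk = (size + threads - 1) // threads
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for i in range(threads):
            pool.submit(copy_range, src, dst, i * chunk, chunk)

fd, src = tempfile.mkstemp(); os.close(fd)
fd, dst = tempfile.mkstemp(); os.close(fd)
with open(src, "wb") as f:
    f.write(os.urandom(1 << 20))        # 1 MiB test file
parallel_copy(src, dst)
print(open(src, "rb").read() == open(dst, "rb").read())   # True
```

Because each worker writes to a disjoint offset range, no locking is needed — the same property that makes parallel network streams safe in the real protocol.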
Sput – Serial Mode
[Diagram: the client's Sput issues srbObjCreate/srbObjWrite to the SRB agent on SRBserver1; MCAT performs (1) logical-to-physical mapping, (2) identification of replicas, and (3) access & audit control; the request is passed peer-to-peer to SRBserver2, which spawns an agent and transfers the data to resource R]
Parallel Mode Data Transfer – Client Initiated
[Diagram: the client's "Sput -M" issues srbObjPut to the SRB agent on SRBserver1; MCAT performs (1) logical-to-physical mapping, (2) identification of replicas, and (3) access & audit control; the server returns a socket address, port, and cookie; the client then connects directly to the SRB agent on SRBserver2 and transfers the data in parallel to resource R]
Performance Enhancement (cont.)
• Container
  – Physical grouping of small files – for tape I/O or archival resources
  – Easy to use, transparent to users
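The container idea can be sketched as packing many small files into one physical blob plus an index, so an archival (tape) resource sees a single large I/O instead of thousands of tiny ones. The pack format below is invented for illustration; SRB's actual container layout differs.

```python
import io

def pack(files):
    """files: dict name -> bytes. Returns (container_bytes, index)."""
    buf = io.BytesIO()
    index = {}                          # name -> (offset, length)
    for name, data in files.items():
        index[name] = (buf.tell(), len(data))
        buf.write(data)
    return buf.getvalue(), index

def unpack(container, index, name):
    # random access to one member without reading the whole container
    offset, length = index[name]
    return container[offset:offset + length]

blob, idx = pack({"a.txt": b"alpha", "b.txt": b"beta"})
print(unpack(blob, idx, "b.txt"))   # b'beta'
```

Keeping the index in the catalog is what lets the grouping stay transparent: users still address individual files by name.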
Data Replication
• An SRB file can have multiple replicas
• Replicas can be stored on different resources
• Sls -l mfile
  – fedsrbbrick8 0 demoResc 3029449 2005-07-29-15.37 % mfile
  – fedsrbbrick8 1 demoResc1 3029449 2005-07-29-21.28 % mfile
• Commands that use replicas:
  – Sreplicate – replicate a file to the specified resource
  – Sbackupsrb – back up a file to the specified resource
  – SsyncD – synchronize the replicas of a file
SphyMove – Move SRB Files to Another Resource
• Moves files to another resource without making another replica
• Normally used by admins to move files around
• Bulk phyMove – large numbers of small files
• Parallel I/O – large files
• Container – move files into a container
• Heavily used by the BBSRC project for distributed archiving:
  – Files are uploaded to a local server
  – Files are eventually moved to a central archival resource by the admin
Performance Enhancement (cont.)
• Use of checksums
  – A checksum is an MCAT metadata attribute associated with a file
  – Checksum routines are part of the server and client code
  – Used for verification and synchronization of data
  – Built into most data-handling utilities
    • Sput, Sget, Srsync, Schksum
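Checksum-based verification can be sketched as follows: the catalog stores a digest per file, and a utility recomputes it after a transfer to verify the copy or decide whether a replica needs synchronizing. MD5 is used here because the proxy-operations slide mentions it; the catalog shape is invented.

```python
import hashlib
import os
import tempfile

def file_md5(path):
    """Stream the file through MD5 in 64 KiB blocks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()

def needs_sync(path, catalog_md5):
    # an Srsync-style check: only transfer when the digests differ
    return file_md5(path) != catalog_md5

fd, p = tempfile.mkstemp()
os.write(fd, b"hello"); os.close(fd)
print(needs_sync(p, hashlib.md5(b"hello").hexdigest()))   # False
```

Comparing digests instead of bytes is what lets a synchronization utility skip files that are already identical, without moving any data.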
Metadata in SRB
• SRB system metadata
• Free-form metadata (user-defined)
  – Attribute-Value-Unit triplets
• Extensible schema metadata
  – User-defined tables integrated into the MCAT core schema
• External database
• Metadata operations
  – Metadata insertion through user interfaces
  – Bulk metadata insertion
  – Template-based metadata extraction
  – Query metadata through well-defined interfaces
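The free-form Attribute-Value-Unit model can be sketched in a few lines: each file carries an arbitrary list of (attribute, value, unit) triplets that can be inserted and then queried by attribute. The in-memory layout below is illustrative only; MCAT stores these in database tables.

```python
METADATA = {}   # logical path -> list of (attribute, value, unit) triplets

def add_avu(path, attr, value, unit=""):
    METADATA.setdefault(path, []).append((attr, value, unit))

def query(attr, value):
    """Return logical paths having a matching attribute value."""
    return [p for p, avus in METADATA.items()
            if any(a == attr and v == value for a, v, _ in avus)]

add_avu("/home/demo/scan1", "voltage", "1.5", "kV")
add_avu("/home/demo/scan2", "voltage", "2.0", "kV")
print(query("voltage", "1.5"))   # ['/home/demo/scan1']
```

Because the triplets are schema-free, users can attach new attributes at any time without changing the catalog's core tables.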
SRB Proxy Operations
• Perform operations on the server on behalf of the user
  – Operate where the data is located
  – File format conversion, MD5 checksum, subsetting and filtering, etc.
• Two types of proxy operations:
  – Proxy commands
    • The server forks and execs an executable or script on the server
    • Output is piped back to the client
  – Proxy functions
    • Functions built into the server
    • Well-defined framework for writing proxy functions
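The proxy-command mechanism — fork/exec a program where the data lives and pipe its output back — can be sketched with `subprocess`. Here the "server side" simply runs a checksum command, standing in for operating on a large file without shipping it to the client.

```python
import subprocess
import sys

def proxy_exec(command_args):
    """Run a command on the 'server' and return its output to the 'client'."""
    result = subprocess.run(command_args, capture_output=True, text=True)
    return result.stdout

# e.g. compute a checksum server-side rather than downloading the file;
# the command below is a stand-in for a real server-side executable
out = proxy_exec([sys.executable, "-c",
                  "import hashlib; print(hashlib.md5(b'abc').hexdigest())"])
print(out.strip())   # 900150983cd24fb0d6963f7d28e17f72
```

Only the small result crosses the network, which is the whole point of running the operation where the data is located.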
HDF5-SRB Model Data Flow
• Client API: srbObjRequest(void *obj, int objID)
• Server API: srbObjProcess(void *obj, int objID)
• Flow: (1) packMsg() on the client, (2) unpackMsg() on the server, (3) H5Obj::op(), (4) the SRB server accesses the HDF5 file through the HDF5 library, (5) packMsg() on the server, (6) unpackMsg() on the client
Zone Federation
• Federation of multiple MCATs
• MCAT ZONE
  – Defines a federation of SRB resources controlled by a single MCAT
  – Each zone has full control of its own administrative domain
  – Each zone can operate entirely independently of other zones
• Data and resource sharing across zones:
  – Use storage resources in foreign zones
  – Share data across zones
  – Copy data across zones
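One way to picture cross-zone sharing is name-based routing: the leading path component identifies the zone, so a request for a foreign zone's path is directed to that zone's MCAT. The zone names, hostnames, and routing rule below are all invented for illustration.

```python
# Toy zone routing table: zone name -> that zone's MCAT host.
ZONES = {
    "zoneA": {"mcat": "mcatA.example.edu"},
    "zoneB": {"mcat": "mcatB.example.ac.uk"},
}

def route(logical_path):
    """Pick the MCAT responsible for a logical path's leading zone name."""
    zone = logical_path.strip("/").split("/")[0]
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return ZONES[zone]["mcat"]

print(route("/zoneB/home/demo/mfile"))   # mcatB.example.ac.uk
```

Each zone keeps full control of its own catalog; federation only requires that zones know how to reach one another's MCATs.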
Peer-to-Peer Federated MCAT Zones
[Diagram: three federated zones – MCAT1 with Server1.1 and Server1.2, MCAT2 with Server2.1 and Server2.2, MCAT3 with Server3.1]
SRB Client Implementations
• A set of basic APIs
  – Over 160 APIs
  – Used by all clients to make requests to servers
• Scommands
  – UNIX-like command-line utilities for UNIX and Windows platforms
  – Over 60 – Sls, Scp, Sput, Sget, ...
SRB Client Implementations (cont.)
• inQ – Windows GUI browser
• Jargon – Java SRB client classes
  – Pure Java implementation
• mySRB – Web-based GUI
  – Runs in a web browser
• Java Admin Tool
  – GUI for user and resource management
• Matrix – Web service for SRB workflow
inQ Windows GUI
mySRB – Web-Based SRB Interface
• SRB browser
• Advanced metadata manipulation
SRB Usage Model
• Various usage models
• Specific usages:
  – SLAC's BaBar experiment
  – UK eScience BBSRC
  – BIRN
SRB Configuration – Peer-to-Peer Data Grid
[Diagram: four resource servers sharing data in a peer-to-peer mesh]
• Data sharing, no central resource
• Projects – NARA, BIRN
SRB Configuration – Exploding Star
[Diagram: one source server distributing data to five satellite servers]
• Data source – physics experiment
• Projects – BaBar, KEK
SRB Configuration – Imploding Star
[Diagram: five satellite source servers feeding a central cache server, which feeds a central archival server]
• Archival storage model
• Projects – UK eScience BBSRC
Peer-to-Peer Federation of MCAT Zones
[Diagram: three federated zones – MCAT1 with Server1.1 and Server1.2, MCAT2 with Server2.1 and Server2.2, MCAT3 with Server3.1]
Summary of the BaBar Project
• Preproduction evaluation – 2003
  – Highlights of Wilko Kroeger's (SLAC) talk at IEEE 2003
  – Title: "Distributing Babar Data using SRB"
• BaBar computing resources are geographically distributed: 5 Tier-A centers – GridKA (D), IN2P3 (F), INFN-Padova (I), RAL (UK), SLAC (USA)
• Data have to be replicated to the Tier-A sites
• Number of files is ~1M; total size is hundreds of TB
BaBar Preproduction – SRB Usage
• Allows transparent access to files
  – No need to know the host or storage medium (disk, tape)
• Accessing files/collections by attributes
  – Find files that were produced at a certain time or site
  – Find collections from a particular run period
• Preproduction test – 2 weeks of MCAT and file transfer tests
BaBar Production Update
• Transferred ~70 TB and 140K files
• Peak rate ~2 TB/day; average rate ~1 TB/day
• Downtime encountered – hardware problems, DB updates
• Plan to federate the SLAC and IN2P3 zones
  – IN2P3 picks up some of the load
• Thanks to Wilko Kroeger (SLAC) and Jean-Yves Nief (IN2P3) for the info
UK eScience BBSRC
• Archival of biological data from 16 sites to a central resource
• Data ingested into local resources
• Admin uses bulk SphyMove to move data from local resources to a central cache
• Moves data into containers
• Replicates containers to the cache resource at RAL
• Replicates containers to the ADS archive at RAL
• Removes cache copies
UK eScience BBSRC (cont.)
• Developed some software on their own:
  – User interface using Jargon
    • GUI
    • Users are not exposed to all SRB functionality
  – Request tracker – tracks data movement after ingestion
• Status
  – Project started at the beginning of this year
  – Just finished a pilot program using SRB 3.2
  – Upgrading to SRB 3.3 for production
Biomedical Informatics Research Network (BIRN)
• Major collaboration with SDSC; several of the project's Co-Investigators and Co-PIs are at SDSC
• SRB provides the ability to transparently share data across remote sites
The BIRN SRB Data Grid
[Figure: map of the BIRN Data Grid]

SRB in BIRN
[Diagram: the BIRN Toolkit (viewing/visualization, queries/results, applications, data management) sits over a mediator; a data model / data access layer connects to the Data Grid (SRB, MCAT, file system, HPSS, databases) and the Computational Grid (Globus, GridPort, scheduler, distributed resources), with collaboration and NMI grid management spanning both]