SAN DIEGO SUPERCOMPUTER CENTER
A National Laboratory for Computational Science & Engineering
SRB Replicated Data Management
for Cooperative Computing
Arcot (Raja) Rajasekar
San Diego Supercomputer Center
What is SRB?
• SRB is an Intelligent Data Access System
• SRB provides federated access to datasets
• SRB provides protocol transparency to diverse and distributed storage systems
• SRB provides location transparency to distributed datasets
• SRB provides access transparency to remote users
The Storage Resource Broker is Middleware
[Diagram, top to bottom:]
• Application (SRB client)
• SRB Server, with MCAT
• Distributed Storage Resources (database systems, archival storage systems, file systems, ftp, http)
  - DB2, Oracle, Illustra, ObjectStore
  - HPSS, ADSM, UniTree
  - UNIX, NTFS, HTTP, FTP
The SRB Process Model
[Diagram components: Application, SRBMaster, SRB agents, MCAT; connection labels: (Host, port) and (port)]
Federated SRB Operation
[Diagram components: Application, two SRB servers, two SRB agents, MCAT; interactions numbered 1 through 6]
SRB Concepts (1)
• Abstraction of User Space
  - no domain dependence
  - no user accounts needed on remote servers
• Abstraction of Resources
  - Logical Resource Definitions - bundling
  - Resource type and Access protocol transparency
• Abstraction of Data and Collections
  - Persistent Identifier and Global Name Space
• Uniform Access Methods
SRB Concepts (2)
• Provide Scalability
  - Hosts
  - Resource Types
  - Resources
  - Collections
  - Data Objects - size and number
  - Users & Groups
  - Methods
  - MetaData
SRB Concepts (3)
• Provide Logical Abstractions
  - srbSpace - an abstract storage space
  - Resource Types - resource defined by properties
  - Resources - resource identified by name and type; multiple resources tied together as a single resource
  - Collections - abstraction over directory structure; distributed & curated
  - Datasets - identified by properties
  - Users - authenticated across hosts/networks
  - Domain - abstraction over physical domains
  - Metadata Schema/Attributes
SRB Concepts (4)
• Replication of Datasets
• Collections for logical co-location
• Containers for physical co-location
• Access Control Lists for Authorization
• Ticket-based Access
• Auditing
• Authentication and Encryption (SEA)
• Server-side proxy Operations
• Metadata-based Discovery
• Rich Interface - programmatic & interactive
SRB Space
[Diagram: a network of SRB servers connecting DR, DL, MC and CP nodes]
Legend:
  DR - Data Repository
  DL - Digital Library
  MC - Meta Catalog
  CP - Comp Process / SRB Client
MCAT: Metadata Catalog
• Stores metadata about Data sets, Users, Resources, Proxy Methods
• Maintains replica information for data & containers
• Provides “Collection” abstraction for data
• Provides “Global User” name space & authentication
• Provides Authorization through ACL & tickets
  - data, collection, resources and methods
• Maintains audit trail on data & collections
• Maintains metadata for methods and resources
• Provides Resource Transparency - logical resources
• Implemented as a relational database - Oracle or DB2
SRB Replication Concepts
• Replication is a core functionality in SRB
• Global Name Space (hierarchical)
  - local name independence
  - replica can reside in any type of resource
• Persistent Id: data movement independence
• Access Control at Replica Level
• Resource Access Control
• Replicas created using SRB or from outside
• Semantic Replicas & Syntactic Replicas
• Typing of Replicas: Archive, Cache, Temporary
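The replication concepts above (a global logical name, a persistent id that survives data movement, and typed replicas on arbitrary resources) can be illustrated with a toy in-memory catalog. This is only a sketch: `ToyCatalog`, `DataObject`, `ReplicaRecord` and the resource names are hypothetical and are not part of SRB or MCAT.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReplicaRecord:
    copy_number: int
    resource: str          # hypothetical resource name
    physical_path: str     # local name on that resource
    kind: str = "Archive"  # Archive, Cache, or Temporary


@dataclass
class DataObject:
    persistent_id: int
    logical_name: str      # name in the global (hierarchical) name space
    replicas: list = field(default_factory=list)


class ToyCatalog:
    """Hypothetical stand-in for a metadata catalog; not the MCAT schema."""

    def __init__(self):
        self._ids = count(1)
        self._by_name = {}

    def register(self, logical_name, resource, physical_path, kind="Archive"):
        # The first registration creates the object; later ones add replicas.
        obj = self._by_name.get(logical_name)
        if obj is None:
            obj = DataObject(next(self._ids), logical_name)
            self._by_name[logical_name] = obj
        record = ReplicaRecord(len(obj.replicas), resource, physical_path, kind)
        obj.replicas.append(record)
        return record

    def move(self, logical_name, copy_number, new_resource, new_path):
        # Moving a replica changes only its physical location.
        record = self._by_name[logical_name].replicas[copy_number]
        record.resource = new_resource
        record.physical_path = new_path

    def lookup(self, logical_name):
        return self._by_name[logical_name]
```

In this sketch, moving a replica leaves both the logical name and the persistent id untouched, which is the sense in which an identifier gives "data movement independence".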
SRB Data Replication Support
• Synchronous Replication
  - Replication via Logical Resource
  - Resource definition integrated into open/create & write function
  - Can choose: k out of n
  - Associate replication with containers/collections
  - Consistency
• Asynchronous Replication - Offline
  - srbObjReplicate API, Sreplicate command, GUI
• Out of Band Replication - outside SRB
  - Registering of Replicas using srbRegisterReplica API
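One plausible reading of "k out of n" synchronous replication is that a write goes to every physical resource bundled in a logical resource and counts as successful once at least k copies have been stored. The sketch below models that reading only; `InMemoryResource` and `write_k_of_n` are hypothetical names, and the slides do not specify how SRB itself handles partial failures.

```python
class InMemoryResource:
    """Hypothetical physical resource; stands in for a file system or archive."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.files = {}

    def write(self, path, data):
        if not self.available:
            raise OSError(f"{self.name} is unavailable")
        self.files[path] = data


def write_k_of_n(resources, path, data, k):
    """Write to every member of a logical resource; require at least k successes.

    Assumption: copies already written are left in place when fewer than k
    succeed. A real system would need a cleanup or consistency policy here.
    """
    written = []
    for resource in resources:
        try:
            resource.write(path, data)
            written.append(resource.name)
        except OSError:
            continue  # skip the failed member and try the next one
    if len(written) < k:
        raise OSError(f"only {len(written)} of the required {k} copies written")
    return written
```

With three members of which one is down, a call with `k=2` returns the names of the two resources that accepted the write, while `k=3` raises.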
SRB Data Replication Support
• Choice at Read
  - any replica
  - specific replica (by copy number)
  - round-robin
  - “nearest”
  - by resource characteristics
  - by timestamp or other characteristics
  - data itself may be identified by meta characteristics
    - user defined metadata & annotations
    - data type, owner, comments, ...
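The read-time choices listed above can be pictured as interchangeable selection policies over the replica list of one data object. The following is a minimal sketch with hypothetical names (`ReplicaInfo` and the policy functions); it is not how SRB implements selection, and "nearest" is left out because the slides do not define a distance measure.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class ReplicaInfo:
    copy_number: int
    resource: str
    resource_type: str  # e.g. "archive" or "unix" (illustrative values)
    timestamp: float    # last-modified time, seconds since the epoch


def any_replica(replicas):
    # "any replica": take the first one listed.
    return replicas[0]


def by_copy_number(replicas, copy_number):
    # "specific replica": match on the copy number.
    for replica in replicas:
        if replica.copy_number == copy_number:
            return replica
    raise LookupError(f"no replica with copy number {copy_number}")


def round_robin(replicas):
    # Returns an endless iterator; each next() yields the following replica.
    return cycle(replicas)


def by_resource_type(replicas, resource_type):
    # "by resource characteristics": filter on one property of the resource.
    return [r for r in replicas if r.resource_type == resource_type]


def newest(replicas):
    # "by timestamp": pick the most recently modified copy.
    return max(replicas, key=lambda r: r.timestamp)
```

Keeping each policy as a plain function over the same list makes it easy to swap one for another at open time.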
Data Replication
[Diagram: an Application, three SRB servers and MCAT; site labels NCSA, SDSC, Caltech and SAIC; storage labels Oracle, HPSS, HPSS, DB2 and Unix; logical resources LogRsrc1 and LogRsrc2]
SRB API
• Programmatic API
  - High-level API
  - Low-level API
  - SRB Manager API
• Command Level Interface - Scommands
• Graphical User Interface - Java Browser, NT Browser
• Web Utilities
• Transparent Access
High & Low-level API
• Low-level API
  - talks to resource drivers
  - no registration of data sets in MCAT
  - no authentication through MCAT
  - User provides all information
• High-level API
  - Uses low-level API to access resources
  - Registers data management information in MCAT
  - Uses MCAT for authentication and meta information
  - Uses MCAT for resource and data discovery
  - Access/store data in remote SRB
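The split between the two API levels can be sketched as a thin layering: the low level needs every physical detail from the caller, while the high level authenticates, resolves a resource name, delegates to the low level, and records the result in a catalog. All class and method names below are hypothetical and only mimic the division of labour described on the slide; they are not the SRB C API.

```python
class LowLevelDriver:
    """Hypothetical resource driver: the caller supplies host and path."""

    def __init__(self):
        self.storage = {}

    def file_write(self, host, path, data):
        self.storage[(host, path)] = data

    def file_read(self, host, path):
        return self.storage[(host, path)]


class HighLevelClient:
    """Hypothetical high-level layer built on the driver plus a toy catalog."""

    def __init__(self, driver, known_users, resources):
        self.driver = driver
        self.known_users = set(known_users)  # stands in for catalog-based authentication
        self.resources = dict(resources)     # resource name -> host
        self.catalog = {}                    # (collection, name) -> (host, path)

    def _authenticate(self, user):
        if user not in self.known_users:
            raise PermissionError(f"unknown user: {user}")

    def obj_create(self, user, collection, name, resource, data):
        self._authenticate(user)
        host = self.resources[resource]       # resource discovery via the catalog
        path = f"/vault{collection}/{name}"   # physical path chosen by this layer
        self.driver.file_write(host, path, data)
        self.catalog[(collection, name)] = (host, path)  # register the object

    def obj_read(self, user, collection, name):
        self._authenticate(user)
        host, path = self.catalog[(collection, name)]
        return self.driver.file_read(host, path)
```

A caller of `obj_read` never sees the host or the physical path, whereas a caller of `file_read` must already know both.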
Low-level API
• srbFileOpen(conn, storType, host, fileName, mode)
• srbFileCreate(conn, storType, host, fileName, mode)
• srbFileClose(conn, fd)
• srbFileUnlink(conn, storType, host, fileName)
• srbFileRead(conn, fd, buffer, length)
• srbFileWrite(conn, fd, buffer, length)
• srbFileSeek(conn, fd, offset, whence)
• srbFileSync(conn, fd)
• srbFileStat(conn, storType, host, fileName, statBuf)
• srbFileMkdir(conn, storType, host, dirName, mode)
• srbFileRmdir(conn, storType, host, dirName, mode)
• srbFileChmod(conn, storType, host, fileName, mode)
High-level API
• srbObjOpen(conn, objChar, mode, collectionName)
• srbObjCreate(conn, objName, objType, resourceName, collectionName, pathName, size)
• srbObjClose(conn, od)
• srbObjUnlink(conn, objChar, collectionName)
• srbObjRead(conn, od, buffer, length)
• srbObjWrite(conn, od, buffer, length)
• srbObjSeek(conn, od, offset, whence)
• srbObjMove(conn, objChar, collectionName, newResourceName, newPathName)
• srbObjReplicate(conn, objChar, collectionName, newResourceName, newPathName)
• srbObjProxyOpr(conn, Operation, sourceDesc, targetDesc)
• srbRegisterReplica(conn, objChar, collectionName, newResourceName, newPathName)
High-Level API (contd …)
• srbGetDatasetInfo(conn, objChar, collectionName, resultStruct, requiredNumber)
• srbGetMoreInfo(resDesc, resultStruct, requiredNumber)
• srbGetDataDirInfo(conn, conditionList, selectList, resultStruct)
• srbModifyDataset(conn, objId, collectionName, newValue1, newValue2, modifyType, resourceName, pathName)
• srbCreateCollect(conn, parentCollectionName, childCollectionName)
• srbListCollect(conn, CollectionName, flag, resultStruct)
• srbModifyCollect(conn, CollectionName, newValue1, newValue2, newValue2, modifyType)
• srbModifyUser(conn, newValue1, newValue2, modifyType)
• srbSetAuditTrail(conn, setValue)
Scommands
• Sinit - initialize S-environment
• Sexit - clean up
• Sman - get manpage for Scommand
• Scat - display srbObject on screen
• Sput - copy local file into srbSpace
• Sget - copy srbObject to local space
• Sappend - append to srbObject
• Srename - change srbObject name
• Srm - remove srbObject
• Schmod - change/grant access to srbObject
• Scd - change collection
• Spwd - display current collection
• Sls - list collection
• Smkdir - make new collection
• Srmdir - remove old collection
• SgetD - get srbObject information
• SgetR - get resource information
• SgetU - get user information
• SmodD - modify srbObject info
• SmodU - modify user info
• Stoken - get native type information
• Scopy - copy srbObject in another collection and under another name
• Sreplicate - clone object in new resource - same internal id
• Smove - move srbObject to new collection or resource
Scommands (contd …)
• ingestUser - adding a new user or group
• ingestResource - adding a new resource
• ingestLogicalResource - making a new resource grouping
• addLogicalResource - adding to a resource grouping
• ingestLocation - adding new location information
• ingestToken - adding new native types (e.g. resourceType, objectType, userType, domainName, ActionType, . . .)
Web Utilities
• Sgetw - copies a SRBobject into server site
• Sputw - copies local file in SRBspace
• Scatw - displays SRBobject on browser (handles types)
• Slsw - displays information of SRBobjects