srm-lite: overcoming the firewall barrier for large scale file replication


SRM-Lite: overcoming the firewall barrier for large scale file replication

Arie Shoshani

Alex Sim

Lawrence Berkeley National Laboratory

April, 2007

First, some background: What are SRMs

• Storage Resource Managers (SRMs) are middleware components whose function is to provide:
  • dynamic space allocation AND file management in spaces
  • for storage components on the local or wide-area network
  • Based on a common standard

[Figure: Examples of some storage systems currently supported by SRMs. Client/user applications talk to SRM(BeStMan), SRM(DPM), SRM/dCache, SRM/CASTOR and SRM(StoRM); the storage systems shown are Unix-based disk pools, dCache, CASTOR (CCLRC RAL) and GPFS.]

SRM Functional Concepts

• Manage Spaces dynamically
  • Reservation, lifetime
  • Negotiation
• Manage files in spaces
  • Request to put files in spaces
  • Request to get files from spaces
  • Lifetime, pinning of files, release of files
  • No logical name space management (done by replica location services)
• Access remote sites for files
  • Bring files from other sites and SRMs as requested
  • Use existing transport services (GridFTP, https, …)
  • Transfer protocol negotiation
• Manage multi-file requests
  • Manage request queues
  • Manage caches
  • Manage garbage collection
• Directory Management
  • Unix semantics: srmLs, srmMkdir, srmMv, srmRm, srmRmdir
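The space and file concepts above (reservation with a lifetime, pinning, release, garbage collection) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class name `Space` and its methods are hypothetical and do not correspond to the SRM interface or to any SRM implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """A reserved space with a size limit and a lifetime (hypothetical model)."""
    total_bytes: int
    expires_at: float
    files: dict = field(default_factory=dict)  # name -> (size, pin_expires_at)

    def used(self) -> int:
        return sum(size for size, _ in self.files.values())

    def put(self, name: str, size: int, now: float, pin_seconds: float) -> bool:
        """Accept a file only if the space is still alive and has room."""
        if now >= self.expires_at or self.used() + size > self.total_bytes:
            return False
        self.files[name] = (size, now + pin_seconds)
        return True

    def release(self, name: str) -> None:
        """Releasing a file drops its pin so it becomes eligible for removal."""
        size, _ = self.files[name]
        self.files[name] = (size, 0.0)

    def collect_garbage(self, now: float) -> list:
        """Remove files whose pin has expired and return their names."""
        expired = [n for n, (_, pin) in self.files.items() if pin <= now]
        for n in expired:
            del self.files[n]
        return expired
```

In this model a released file is not deleted immediately; it only loses its pin, and the next garbage-collection pass reclaims the space for new requests.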

Example Use of SRMs in Earth Science Grid (in production for 3 years)

• 3100 users
• 120 TBs managed
• LBNL’s SRMs inter-communicate between several sites and main portal site at NCAR to provide storage management and multi-file movement

[Figure: Earth Science Grid deployment across LBNL, LLNL, ISI, NCAR, ORNL and ANL. Components shown: Tomcat servlet engine; MCS (Metadata Cataloguing Services) and MCS client; RLS (Replica Location Services) and RLS client; MyProxy server and MyProxy client; CAS (Community Authorization Services) and CAS client; GRAM gatekeeper; DRM and HRM (Storage Resource Management) instances in front of disk, MSS (Mass Storage System) and HPSS (High Performance Storage System); gridFTP servers, including a striped gridFTP server; openDAPg server; SOAP and RMI links.]

DataMover: SRMs used to provide large scale robust data streaming between sites

• Problem: move thousands of files robustly
  • Takes many hours
  • Need error recovery
    • Mass storage systems failures
    • Network failures
• Solution: Use Storage Resource Managers (SRMs)
  • File streaming paradigm
  • By reserving and releasing storage space automatically
• Problem: too slow
• Solution:
  • in GridFTP
    • Use parallel streams
    • Use large FTP windows
  • Pre-stage files from MSS
  • Use concurrent transfers

[Figure: Example setup for STAR high-energy-physics experiment. DataMover (running anywhere) gets the list of files and issues SRM-COPY (thousands of files); the SRM that performs writes issues SRM-GET (one file at a time) to the SRM that performs reads; files are staged into a disk cache, moved by network transfer using GridFTP GET (pull mode), and archived from a disk cache. Sites shown: NERSC and BNL.]
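The two ideas on this slide, error recovery and concurrent transfers, can be sketched together as a retry loop run inside a thread pool. This is a minimal illustration, not DataMover's implementation: `fetch` stands for whatever transfer call is in use, and the function names and the choice of `OSError` as the failure signal are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_with_retry(name, fetch, max_attempts=3):
    """Call fetch(name) until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            fetch(name)
            return name, attempt
        except OSError:
            continue  # e.g. a network or mass-storage failure; try again
    return name, None  # gave up; the caller can report or requeue the file


def stream_files(names, fetch, concurrency=4):
    """Run several transfers at once and split the results into done/failed."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda n: transfer_with_retry(n, fetch), names))
    done = [n for n, attempts in results if attempts is not None]
    failed = [n for n, attempts in results if attempts is None]
    return done, failed
```

A real mover would also wait between attempts and distinguish transient from permanent failures; both are left out to keep the sketch short.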

SRM-Lite

• Goal 1: automate file movement behind a firewall
  • a client program
  • to automate movement of multiple files
  • to/from client’s directory to a remote site
  • given an OTP firewall at one site
  • Support entire directory transfers
  • Recover from mid-transfer interruption and machine failure
• Goal 2: pull files into user’s workstation
  • Use SRM-Lite by users to download files into their workstations
  • Using various transfer protocols (GridFTP, bbcp, https, …)
  • Have a GUI that shows transfer progress
  • Or have a command line
  • Support entire directory transfers
  • Support suspend/resume operations (e.g. on laptops)
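One common way to support entire directory transfers together with recovery from interruption is to expand the directory into a file list and keep a journal of completed files on disk, so that a restarted run skips what is already done. The slides do not say how SRM-Lite does this; the sketch below is an assumption, and the journal format and function names are hypothetical.

```python
import json
import os
from pathlib import Path


def list_directory(root):
    """Expand a directory into the relative paths of all files under it."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob("*") if p.is_file())


def load_done(journal):
    """Read the set of files already transferred in an earlier run."""
    if not os.path.exists(journal):
        return set()
    with open(journal) as fh:
        return set(json.load(fh))


def run(root, journal, copy_one):
    """Transfer every file not yet in the journal, recording each success."""
    done = load_done(journal)
    for rel in list_directory(root):
        if rel in done:
            continue
        copy_one(rel)  # may raise; the journal then still lists earlier files
        done.add(rel)
        with open(journal, "w") as fh:  # rewrite after every completed file
            json.dump(sorted(done), fh)
    return done
```

The same journal also gives suspend/resume for free: stopping the program and calling `run` again with the same journal continues from the first file not yet recorded.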

SRM-Lite: a client program to automate movement of multiple files to/from client’s directory to a remote SSH server, given an OTP firewall at one site

Use Case A: OTP firewall at local site (ORNL), SSH server at remote site (NERSC)

• Process Steps
  • Login to ORNL using OTP
  • At ORNL invoke SRM-Lite
  • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
  • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.xml or command line input to automatically
    • Push/Pull files to/from NERSC
    • Use multiple threads for concurrent transfers

[Figure: Use Case A. ORNL side: OTP login, SRM-Lite, srmlite.xml, local commands, disk cache. NERSC side: SSH server, disk cache, HPSS. Links: SSH request; GridFTP/FTP/SCP transfers.]
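The process steps can be sketched as reading a list of source/target pairs from an XML file and handing them to a pool of threads. The slides do not show the format of srmlite.xml, so the element and attribute names below (`request`, `file`, `source`, `target`) and the host name are invented for illustration; `transfer` stands for whichever protocol call is used.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input; the real srmlite.xml schema is not given in the slides.
EXAMPLE = """
<request>
  <file source="/home/user/run1/a.dat" target="remote.example.org:/data/run1/a.dat"/>
  <file source="/home/user/run1/b.dat" target="remote.example.org:/data/run1/b.dat"/>
</request>
"""


def parse_request(xml_text):
    """Return the (source, target) pairs listed in the request."""
    root = ET.fromstring(xml_text.strip())
    return [(f.get("source"), f.get("target")) for f in root.iter("file")]


def run_transfers(pairs, transfer, threads=4):
    """Run transfer(source, target) for every pair using a pool of threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(transfer, s, t) for s, t in pairs]
        return [f.result() for f in futures]  # results in request order
```

Collecting `f.result()` re-raises any exception from a worker thread, so a failed transfer is not silently dropped.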

Scenario: one end has SRM, the other end has a firewall, use SRM

Use Case B: OTP firewall at local site (ORNL), SRM server at remote site (NERSC)

• Process Steps
  • Login to ORNL using OTP
  • At ORNL invoke SRM-Lite
  • User composes XML input file, srmlite.xml, for selected files/directories to copy over to/from another SRM controlled storage system
  • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.xml or command line input to automatically
    • Push/Pull files to/from SRM at NERSC
    • Use multiple threads for concurrent transfers

[Figure: Use Case B. ORNL side: OTP login, SRM-Lite, srmlite.xml, disk cache. NERSC side: SRM, disk cache, HPSS. Links: SRM request; GridFTP/FTP/SCP transfers.]

Scenario: one end has SRM, the other end has a firewall, use either SSH/SRM

Use Case C: OTP firewall at local site (ORNL), SRM/SSH server at remote site (NERSC)

• Process Steps
  • Login to ORNL using OTP
  • At ORNL invoke SRM-Lite
  • User composes XML input file, srmlite.xml, for selected files/directories to copy over to another site
  • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.xml or command line input to automatically
    • Push/Pull files to/from NERSC using either SSH or SRM
    • Use multiple threads for concurrent transfers

[Figure: Use Case C. ORNL side: OTP login, SRM-Lite, input file, disk cache. NERSC side: SRM, SSH server, disk cache, HPSS. Links: SSH request; SRM request; GridFTP/FTP/SCP transfers.]

Scenario: both ends have SRMs, both ends have a firewall, use SRM-Lite on both ends. Use SSH to invoke SRM-Lite at other end

Use Case D: OTP firewall at both local site (ORNL) and remote site (NERSC), use SRM-Lite at both ends

• Process Steps
  • Login to ORNL using OTP
  • Create an OTP SSH tunnel to NERSC
  • User composes XML input file, srmlite.txt, for selected files/directories to copy over to another site
  • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.txt or command line input, through SSH tunnel, to automatically
    • Communicate with SRM-Lite at other end
    • Push/Pull files to/from NERSC using SRM
    • Use multiple threads for concurrent transfers
• Can use SCP only

[Figure: Use Case D. ORNL side: OTP login, SRM-Lite, srmlite.txt, SRM, disk cache, HPSS. NERSC side: SRM-Lite, SRM, disk cache, HPSS. Links: OTP SSH tunnel (SSH tunneling); SRM request; SCP transfers.]
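The tunnel step can be illustrated with standard OpenSSH port forwarding, followed by an SCP copy routed through the forwarded port. This is only a sketch of the general technique, not SRM-Lite's mechanism: the helper names, host names and port numbers are hypothetical, and the functions only build the command lines (the one-time password prompt would appear when `ssh` is actually run).

```python
def tunnel_command(local_port, target_host, target_port, login):
    """Build an OpenSSH command forwarding local_port to target_host:target_port.

    -N asks ssh not to run a remote command; -L sets up local port forwarding.
    """
    return ["ssh", "-N", "-L", f"{local_port}:{target_host}:{target_port}", login]


def scp_command(local_port, local_file, user, remote_path):
    """Build an scp command that connects through the forwarded local port."""
    return ["scp", "-P", str(local_port), local_file, f"{user}@localhost:{remote_path}"]


if __name__ == "__main__":
    # Hypothetical hosts; print the commands instead of running them.
    print(" ".join(tunnel_command(2222, "dtn.remote.example.org", 22,
                                  "user@gateway.remote.example.org")))
    print(" ".join(scp_command(2222, "a.dat", "user", "/data/a.dat")))
```

Because every connection goes through the single forwarded SSH port, only SSH-based copies fit this setup, which is consistent with the slide's note that SCP is the only usable transfer protocol here.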

SRM-Lite: Status

• SRM-Lite is developed
• Available from: http://datagrid.lbl.gov/srmlite/
• Tested with GridFTP, SCP, HTTPS, HTTP
• Tested with large number of files
• Tested behind a firewall
• Access from local SRMs that access HPSS: not tested yet
• Access between two firewalled systems: not developed yet

SRM-Lite: GUI

• A GUI version, called DataMover-Lite (DML), was developed for use at the user’s site (Linux, PC, Mac)
• Available from: http://datagrid.lbl.gov/dml/
• Example GUI screen
  • Shows info on: completed, active, and pending transfers
  • Also, file sizes, transfer times, transfer speed
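The figures such a progress display reports can be derived from a simple list of transfer records. The sketch below is an illustration only; the record fields (`state`, `bytes`, `seconds`) are assumptions and not DML's data model.

```python
from collections import Counter


def summarize(transfers):
    """Count transfers by state and compute the mean rate of completed ones."""
    counts = Counter(t["state"] for t in transfers)
    done = [t for t in transfers if t["state"] == "completed"]
    total_bytes = sum(t["bytes"] for t in done)
    total_seconds = sum(t["seconds"] for t in done)
    # Bytes moved divided by summed per-file transfer time.
    rate = total_bytes / total_seconds if total_seconds else 0.0
    return {
        "completed": counts["completed"],
        "active": counts["active"],
        "pending": counts["pending"],
        "bytes_per_second": rate,
    }
```

With concurrent transfers the summed per-file time exceeds wall-clock time, so this rate is a per-transfer average rather than the aggregate throughput.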


Extra Slides

Storage Resource Managers

• SRMs are middleware components whose function is to provide:
  • dynamic space allocation AND file management in spaces
  • for storage components on the local or wide-area network
  • Based on a common standard

[Figure: Examples of storage systems currently supported by SRMs. Client/user applications talk to SRM(BeStMan), SRM(DPM), SRM(Jlab-SRM), SRM(StoRM), SRM/dCache, SRM/CASTOR and SRM/L-Store; the storage systems shown are Unix-based disks, JASMine, dCache, CASTOR (CCLRC RAL), GPFS and MSS.]

DataMover-Lite use in ESG: a client program used to automate movement of multiple files to client’s directory

• Process Steps
  • User downloads DataMoverLite
  • User goes to portal, selects files
  • Portal gets ALL files into SRM disk
  • Portal generates XML input file, datamover.txt, for user selected files
  • DML uses datamover.txt to automatically
    • get files, and
    • release files after move completes successfully

[Figure: DataMover-Lite in ESG. NCAR side: ESG Portal (DataPortal), SRM, disk cache, MSS. User’s machine: user’s browser, DataMoverLite, datamover.txt, disk cache. Links: request; release; GridFTP/FTP/HTTP/HTTPS transfers.]
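The last two steps, getting each file and releasing it only after the move succeeds, can be sketched as a loop. This is an assumption-labeled illustration: `get` and `release` are hypothetical callables standing in for the transfer and the SRM release request, and `OSError` is assumed to signal a failed transfer.

```python
def fetch_and_release(urls, get, release):
    """Download each file and release it on the server only after success."""
    released, kept = [], []
    for url in urls:
        try:
            get(url)
        except OSError:
            kept.append(url)  # not released, so it stays on SRM disk for a retry
            continue
        release(url)  # tell the SRM the cached copy is no longer needed
        released.append(url)
    return released, kept
```

Releasing only after a successful move means a failed download can be retried without asking the portal to stage the file again.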


Another example of DML GUI