Page 1:

Optimisation of Grid Enabled Storage at Small Sites

Jamie K. Ferguson – University of Glasgow

email – J.Ferguson@physics.gla.ac.uk

Jamie K. Ferguson – University of Glasgow
Graeme A. Stewart – University of Glasgow

Greig A. Cowan - University of Edinburgh

Page 2:

Introduction

● Typical Tier 2 & Purpose of the Inbound Transfer Tests
● Details of the hardware/software configuration for the File Transfers
● Analysis of Results

Page 3:

LHC and the LCG

● LHC – the most powerful instrument ever built in the field of physics

● Generates huge amounts of data every second it is running

● Retention of 10PB annually to be processed at sites

● The typical use case is files of ~GB size, many of which are cascaded down to be stored at T2s until analysis jobs process them

Page 4:

Typical Tier2 - Definition

● Limited Hardware Resources
  – (In GridPP) Using dCache or dpm as SRM
  – Few (one or two) Disk Servers
  – Few Terabytes of RAIDed Disk

● Limited Manpower
  – Not enough time to configure and/or administer a sophisticated storage system
  – Ideally want something that just works “out of the box”

Page 5:

Importance of Good Write (and Read) Rates

● Experiments desire good in/out rates
  – Write is more stressful than read, hence our focus
  – Expected data transfer rates (T1 ==> T2) will be directly proportional to the storage at a T2 site
  – Few 100Mbps for small(ish) sites, up to several Gbps for large CMS sites
● The limiting factor could be one of many things
  – I know this from recently coordinating 24-hour tests between all 19 of the GridPP T2 member institutes
● These tests also yielded file transfer failure rates

Page 6:

gLite File Transfer Service

● Used FTS to manage transfers
  – Easy-to-use file transfer management software
  – Uses SURLs for source and destination
  – The experiments shall also use this software
  – Able to set the channel parameters Nf and Ns
  – Able to monitor each job, and each transfer within each job (see the sketch below)
    ● Pending, Active, Done, Failed, etc.
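A minimal sketch of this workflow, assuming the standard glite-transfer-submit / glite-transfer-status command-line clients driven from Python: the endpoint and SURLs are placeholders, and the exact options may vary between FTS releases, so this is illustrative rather than the actual test harness.

```python
# Minimal sketch (not the actual test harness) of submitting one FTS job and
# polling its state with the gLite command-line clients from Python.
# The endpoint and SURLs are placeholders; exact options may differ between
# FTS releases.
import subprocess
import time

FTS_ENDPOINT = "https://fts.example.ac.uk:8443/glite-data-transfer-fts/services/FileTransfer"  # placeholder
SOURCE_SURL = "srm://source.example.ac.uk/dpm/example.ac.uk/home/dteam/file0001"               # placeholder
DEST_SURL = "srm://dest.example.ac.uk/dpm/example.ac.uk/home/dteam/file0001"                   # placeholder

def submit(source, dest):
    """Submit a single-file transfer job and return the FTS job identifier."""
    out = subprocess.check_output(
        ["glite-transfer-submit", "-s", FTS_ENDPOINT, source, dest])
    return out.decode().strip()

def wait_for(job_id, interval=30):
    """Poll the job state until it is no longer Pending/Active."""
    while True:
        state = subprocess.check_output(
            ["glite-transfer-status", "-s", FTS_ENDPOINT, job_id]).decode().strip()
        print(job_id, state)
        if state not in ("Submitted", "Pending", "Active"):
            return state  # e.g. Done or Failed
        time.sleep(interval)

if __name__ == "__main__":
    job = submit(SOURCE_SURL, DEST_SURL)
    print("Final state:", wait_for(job))
```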

Page 7:

What Variables Were Investigated?

● Destination SRM
  – dCache (v1.6.6-5)
  – dpm (v1.4.5)
● The underlying filesystem on the destination
  – ext2, ext3, jfs, xfs
● Two transfer-channel parameters
  – No. of parallel files (Nf)
  – No. of GridFTP streams (Ns)
● Example => Nf=5, Ns=3

Page 8:

Software Components

● dcap and rfio are the transport layers for dCache and dpm respectively

● Under this software stack is the filesystem itself, e.g. ext2

● Above this stack was the filetransfer.py script (a sketch of the rate bookkeeping such a driver performs follows below)
  – See http://www.physics.gla.ac.uk/~graeme/scripts/ (filetransfer)
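The filetransfer.py script itself is at the URL above; as a purely hypothetical sketch of the bookkeeping such a driver performs, the snippet below computes an aggregate rate in Mbps (the unit used for the results later) from the batch size, file size and elapsed wall-clock time. Only the constants mirror the actual test setup (30 files of 1GB); everything else is illustrative.

```python
# Hypothetical sketch of the rate bookkeeping a driver script like
# filetransfer.py performs: express the aggregate rate of a timed batch
# transfer in Mbps, the unit used for the results below.
FILE_SIZE_BYTES = 1 * 1000**3   # 1GB files, as used in these tests
N_FILES = 30                    # 30 source files per test

def aggregate_rate_mbps(n_files, file_size_bytes, elapsed_seconds):
    """Aggregate transfer rate in megabits per second."""
    total_bits = n_files * file_size_bytes * 8
    return total_bits / elapsed_seconds / 1e6

if __name__ == "__main__":
    # Example: a 30 x 1GB batch completing in 20 minutes is ~200 Mbps.
    print("%.0f Mbps" % aggregate_rate_mbps(N_FILES, FILE_SIZE_BYTES, 20 * 60.0))
```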

Page 9:

Software Components - dpm

● All the daemons of the destination dpm were running on the same machine

● dCache had a similar setup, with everything housed on a single node

Page 10:

Hardware Components

● Source was a dpm
  – High-performance machine

● Destination was a single node with a dual-core Xeon CPU
● Machines were on the same network
  – Connected via a 1Gb link which had negligible other traffic
  – No firewall between source and destination
    ● No iptables loaded
● Destination had three 1.7TB partitions
  – RAID 5
  – 64K stripe

Page 11:

Kernels and Filesystems

● A CERN-contributed rebuild of the standard SL kernel was used to investigate xfs
  – This differs from the standard SL kernel only in the addition of xfs support
  – Instructions on how to install the kernel: http://www.gridpp.ac.uk/wiki/XFS_Kernel_Howto
  – Necessary RPMs available from ftp://ftp.scientificlinux.org/linux/scientific/305/i386/contrib/RPMS/xfs/

Page 12:

Method

● 30 source files, each of size 1GB, were used
  – This size is typical of the LCG files that shall be used by the LHC experiments
● Both dCache and dpm were used during testing
● Each kernel/filesystem pair was tested – 4 such pairs
● Values of 1, 3, 5, 10 were used for No. of Files and No. of Streams, giving a matrix of 16 test results (the test loop is sketched below)
● Each test was repeated 4 times to attain a mean
  – Outlying results (~ < 50% of other results) were retested
    ● This prevented failures in higher-level components, e.g. FTS, from adversely affecting results
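A sketch of the measurement loop just described: run_transfer_test() is a hypothetical stand-in for one timed 30-file FTS transfer returning a rate in Mbps, and the outlier handling mirrors the rule above (a repeat well below ~50% of the others is retested once). The real harness was the filetransfer.py script mentioned earlier.

```python
# Sketch of the measurement loop described above. run_transfer_test() is a
# hypothetical stand-in for one timed 30 x 1GB FTS transfer returning a rate
# in Mbps; the retest rule mirrors the "~ < 50% of other results" criterion.
NF_VALUES = [1, 3, 5, 10]   # No. of parallel files
NS_VALUES = [1, 3, 5, 10]   # No. of GridFTP streams
REPEATS = 4

def run_transfer_test(nf, ns):
    """Hypothetical: run one 30-file transfer with the given Nf/Ns, return Mbps."""
    raise NotImplementedError("stand-in for the real FTS-driven measurement")

def is_outlier(rate, others):
    """A repeat well below the others (< ~50% of their mean) is retested."""
    return bool(others) and rate < 0.5 * (sum(others) / len(others))

def mean_rate(nf, ns):
    """Mean of REPEATS runs, retesting obvious outliers (e.g. FTS hiccups) once."""
    rates = []
    for _ in range(REPEATS):
        rate = run_transfer_test(nf, ns)
        if is_outlier(rate, rates):
            rate = run_transfer_test(nf, ns)
        rates.append(rate)
    return sum(rates) / len(rates)

def full_matrix():
    """The 4 x 4 matrix of mean rates for one SRM / kernel / filesystem setup."""
    return {(nf, ns): mean_rate(nf, ns) for nf in NF_VALUES for ns in NS_VALUES}
```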

Page 13:

Results – Average Rates

● All results are in Mbps

Page 14:

Average dCache rate vs. Nf

Page 15:

Average dCache rate vs. Ns

Page 16:

Average dpm rate vs. Nf

Page 17:

Average dpm rate vs. Ns

Page 18:

Results – Average Rates

● In our tests, dpm outperformed dCache for every (Nf, Ns) combination, on average

Page 19:

Results – Average Rates

● Transfer rates are greater when using jfs and xfs rather than ext2 or ext3

● Rates for ext2 are better than for ext3 because ext2 does not suffer from journalling overheads

Page 20:

Results – Average Rates

● Having more than one Nf on the channel substantially improves the transfer rate for both SRMs and for all filesystems. And for both SRMs, the average rate is similar for Nf = 3, 5, 10
● dCache
  – Ns = 1 is the optimal value for all filesystems
● dpm
  – Ns = 1 is the optimal value for ext2 and ext3
  – For jfs and xfs the rate seems independent of Ns
● For both SRMs, the average rate is similar for Ns = 3, 5, 10

Page 21:

Results – Error (Failure) Rates

● Failures, in both cases, tended to be caused by a failure to correctly call srmSetDone() in FTS, resulting from high machine load

● It is recommended to separate the SRM daemons from the disk servers, especially at larger sites

Page 22:

Results – Error (Failure) Rates

● dCache
  – Small number of errors for the ext2 and ext3 filesystems
    ● Caused by high machine load
  – No errors for the jfs and xfs filesystems
● dpm
  – All filesystems had errors
    ● As in the dCache case, caused by high machine load
  – Error rate for jfs was particularly high, but this was down to many errors in one single transfer

Page 23:

Results – FTS Parameters

● Nf
  – Initial tests indicate that Nf set at a high value (15) causes a large load on the machine when the first batch of files completes. Subsequent batches time out.
  – Caused by post-transfer SRM protocol negotiations occurring simultaneously
● Ns
  – Ns > 1 caused slower rates for ¾ of the SRM/filesystem combinations
  – Multiple streams cause a file to be split up and sent down different TCP channels
  – This results in “random writes” to the disk (see the toy sketch below)
  – A single stream causes the data packets to arrive sequentially, so they can be written sequentially too
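The effect of multiple streams on the write pattern can be illustrated with a toy example (this is not GridFTP code): with a single stream the receiving disk server appends data in order, whereas with several streams each stream carries a different slice of the file, and interleaved arrivals force seeks to scattered offsets.

```python
# Toy illustration (not GridFTP itself) of the write patterns described above.
# One stream: chunks arrive in order and are appended sequentially.
# Several streams: each stream carries a different slice of the file, so
# interleaved arrivals force seeks to scattered offsets ("random writes").
import os

CHUNK = 64 * 1024  # arbitrary illustrative chunk size

def write_single_stream(path, chunks):
    """Chunks arrive in order: purely sequential writes."""
    with open(path, "wb") as f:
        for data in chunks:
            f.write(data)

def write_multi_stream(path, stream_chunks):
    """Each stream delivers (offset, data) pairs for its own slice of the file;
    round-robin over the streams mimics interleaved arrival on the network."""
    with open(path, "wb") as f:
        for group in zip(*stream_chunks):
            for offset, data in group:
                f.seek(offset)      # jump to wherever this stream's slice lives
                f.write(data)

if __name__ == "__main__":
    data = [b"x" * CHUNK for _ in range(8)]
    write_single_stream("single.bin", data)
    # Two streams, each owning a contiguous half of the file; interleaving the
    # arrivals makes the write offsets jump: 0, 4*CHUNK, 1*CHUNK, 5*CHUNK, ...
    half = 4
    streams = [[((s * half + i) * CHUNK, data[s * half + i]) for i in range(half)]
               for s in range(2)]
    write_multi_stream("multi.bin", streams)
    print(os.path.getsize("single.bin"), os.path.getsize("multi.bin"))  # both equal
```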

Page 24:

Future Work

● Use SL4 as the OS
  – Allows testing of the 2.6 kernel
● Different stripe size for the RAID configuration
● TCP read and write buffer sizes
  – Linux kernel networking tuning parameters
● Additional hardware, e.g. more disk servers
● More realistic simulation
  – Simultaneous reading/writing
  – Local file access
● Other filesystems?
  – e.g. reiser, but this filesystem is more applicable to holding small files, not the sizes that shall exist on the LCG

Page 25:

Conclusions

● Choice of SRM application should be made at site level based on resources available

● Using a newer, high-performance filesystem (jfs or xfs) increases the inbound rate
  – How to move to the xfs filesystem without losing data: http://www.gridpp.ac.uk/wiki/DPM_Filesystem_XFS_Formatting_Howto

● High value for Nf
  – Although too high a value will cause other problems
● Low value for Ns
  – I recommended Ns=1 and Nf=8 for the GridPP inter-T2 tests that I'm currently conducting