A comparison of distributed data storage middleware for HPC, GRID and Cloud Mikhail Goldshtein 1 , Andrey Sozykin 1 , Grigory Masich 2 and Valeria Gribova 3 1 Institute of Mathematics and Mechanics UrB RAS, Russia, Yekaterinburg 2 Institute of Continuous Media Mechanics UrB RAS, Russia, Perm 3 Institute of Automation and Control Processes FEB RAS, Russia, Vladivostok

Post on 20-Jan-2016

A comparison of distributed data storage middleware for HPC, GRID and Cloud

Mikhail Goldshtein¹, Andrey Sozykin¹, Grigory Masich² and Valeria Gribova³

¹ Institute of Mathematics and Mechanics UrB RAS, Russia, Yekaterinburg

² Institute of Continuous Media Mechanics UrB RAS, Russia, Perm

³ Institute of Automation and Control Processes FEB RAS, Russia, Vladivostok

European Middleware Initiative

EMI - software platform for high-performance distributed computing, http://www.eu-emi.eu

Joint effort of the major European distributed computing middleware providers (ARC, dCache, gLite, UNICORE)

Widely used in Europe, including the Worldwide LHC Computing Grid (WLCG)

Higgs boson:

• Alberto Di Meglio: "Without the EMI middleware, such an important result could not have been achieved in such a short time"

2

Storage solutions in EMI

3

dCache - http://www.dcache.org/

Disk Pool Manager (DPM) - https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm

StoRM (STOrage Resource Manager) - http://storm.forge.cnaf.infn.it/

dCache

4

Disk Pool Manager

5

StoRM

6

Usage statistics in WLCG

7

Distributed storage systems

Traditional approach:

• Grid

• Distributed file systems (IBM GPFS, Lustre File System, etc.)

Modern technologies:

• Standard Internet Protocols (Parallel NFS, WebDAV, etc.)

• Cloud storage (Amazon S3, HDFS, etc.)

8

Classic NFS

9

Parallel NFS

10

Comparison results

11

Feature                  | dCache                    | DPM                       | StoRM
Grid protocols           | SRM, xroot, dcap, GridFTP | SRM, RFIO, xroot, GridFTP | SRM, RFIO, xroot, GridFTP, file
Standard protocols       | NFS 4.1, WebDAV           | NFS 4.1, WebDAV           | -
Cloud backend            | HDFS (in development)     | HDFS, Amazon S3           | -
Quality of documentation | High                      | Medium                    | High
Ease of administration   | Easy                      | Medium                    | Easy
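
To make the comparison easier to query, the protocol rows of the table can be written down as a small data structure. The following is only an illustrative sketch: the dictionary layout, the key names and the helper `supporting` are assumptions introduced here, not part of the slides, and the cell values are copied from the table as transcribed.

```python
# Protocol rows of the comparison table, keyed by middleware name.
# The structure and key names are illustrative assumptions.
COMPARISON = {
    "dCache": {
        "grid": ["SRM", "xroot", "dcap", "GridFTP"],
        "standard": ["NFS 4.1", "WebDAV"],
        "cloud": ["HDFS (in development)"],
    },
    "DPM": {
        "grid": ["SRM", "RFIO", "xroot", "GridFTP"],
        "standard": ["NFS 4.1", "WebDAV"],
        "cloud": ["HDFS", "Amazon S3"],
    },
    "StoRM": {
        "grid": ["SRM", "RFIO", "xroot", "GridFTP", "file"],
        "standard": [],
        "cloud": [],
    },
}


def supporting(protocol, kind):
    """Return the middleware names whose `kind` row lists `protocol` exactly."""
    return [name for name, row in COMPARISON.items() if protocol in row[kind]]


if __name__ == "__main__":
    # Which systems can be mounted over a standard protocol such as NFS 4.1?
    print(supporting("NFS 4.1", "standard"))
```

Queried this way, the table shows why a standard-protocol client narrows the choice to dCache and DPM, since StoRM lists no standard protocols.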

Distributed dCache-based Tier 1 WLCG storage

12

Implementation

13

Implementation details

Hardware: 4 x Supermicro servers (3 in Yekaterinburg, 1 in Perm), 210 TB usable capacity (252 TB raw; RAID 5 with a hot spare)

OS: Scientific Linux 6.3

dCache 2.6 from EMI repository

Protocol: NFS v4.1 (Parallel NFS)

RHEL includes a parallel NFS client, so no additional software needs to be installed on the clusters
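
Two details on this slide lend themselves to a short sketch: the ratio between the two capacity figures, and mounting the storage with the stock NFS v4.1 client. Everything named below is an assumption for illustration. The slides do not give the disk layout, so the 12-disk group (one parity disk, one hot spare) is only one hypothetical layout that happens to match the reported ratio, and the host name, export path and mount point are placeholders. The mount options shown (`-t nfs4 -o minorversion=1`) are the form used by RHEL 6 era clients.

```python
from fractions import Fraction

# Capacity figures reported on the slide, in TB.
USABLE_TB = 210
RAW_TB = 252


def usable_fraction(disks_per_group, parity_disks=1, hot_spares=1):
    """Fraction of raw capacity left for data in one RAID group.

    RAID 5 spends one disk's worth of capacity on parity; hot spares
    hold no data until a disk fails.
    """
    data_disks = disks_per_group - parity_disks - hot_spares
    if data_disks <= 0:
        raise ValueError("group too small for the requested parity and spares")
    return Fraction(data_disks, disks_per_group)


def nfs41_mount_command(server, export, mount_point):
    """Build the argument list for mounting an NFS v4.1 export on Linux."""
    return ["mount", "-t", "nfs4", "-o", "minorversion=1",
            f"{server}:{export}", mount_point]


if __name__ == "__main__":
    # Reported ratio: 210 / 252 reduces to 5/6.
    print(Fraction(USABLE_TB, RAW_TB))
    # Hypothetical 12-disk group: 10 data disks out of 12, also 5/6.
    print(usable_fraction(12))
    # Placeholder server and paths; running the command needs root.
    print(" ".join(nfs41_mount_command("dcache.example.org", "/data", "/mnt/dcache")))
```

The command is returned as a list rather than executed, so it can be inspected or passed to `subprocess.run` by whoever administers the cluster nodes.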

14

Performance testing

15

IOR test (http://www.nersc.gov/systems/trinity-nersc-8-rfp/nersc-8-trinity-benchmarks/ior/)
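
The transcript names IOR but does not record how it was run. As a rough sketch of what such a test invocation might look like, the helper below assembles an IOR command line for a write-then-read pass over a file on the mounted storage. The flags (`-a`, `-w`, `-r`, `-b`, `-t`, `-o`, `-F`) are standard IOR options; the block size, transfer size and test file path are assumptions chosen for illustration, not the parameters used by the authors.

```python
def ior_command(test_file, block_size="1g", transfer_size="1m",
                file_per_process=True):
    """Build an IOR argument list for a POSIX write + read test.

    -a POSIX : use the POSIX I/O backend (works over an NFS mount)
    -w / -r  : run the write phase, then the read phase
    -b / -t  : per-task block size and size of each transfer
    -o       : path of the test file
    -F       : one file per process instead of a single shared file
    """
    cmd = ["ior", "-a", "POSIX", "-w", "-r",
           "-b", block_size, "-t", transfer_size, "-o", test_file]
    if file_per_process:
        cmd.append("-F")
    return cmd


if __name__ == "__main__":
    # Hypothetical path on the NFS v4.1 mount.
    print(" ".join(ior_command("/mnt/dcache/ior.dat")))
```

IOR is normally started through an MPI launcher so that several client nodes access the storage at once; the launcher and process count are left out here because they depend on the cluster.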

Future work

Evaluation of NFS performance over 10GE and WAN

Evaluation of dCache in experiments (Particle Image Velocimetry, etc.)

Participation in GRID projects:

• Grid of Russian National Nanotechnology Network

• WLCG (through Joint Institute for Nuclear Research, Dubna, Russia)

Connection to a Hadoop cluster (once dCache supports HDFS)

16

Thank you!

Andrey Sozykin

Institute of Mathematics and Mechanics

UrB RAS, Russia, Yekaterinburg

[email protected]

17