ATLAS HK Tier-2 Site Setup & Storage Research in CUHK By Roger Wong & Runhui Li


Page 1: ATLAS HK Tier-2 Site Setup & Storage Research in CUHK By Roger Wong & Runhui Li

ATLAS HK Tier-2 Site Setup & Storage Research in CUHK

By Roger Wong & Runhui Li

Page 2: Roadmap

• ATLAS HK Tier-2 Site Setup
  – Presented by Roger Wong ([email protected]), Research Computing Team, Information Technology Services Centre, The Chinese University of Hong Kong
• Storage Research in CUHK

Page 3: Major Tasks

Install software components:
• HTCondor
• ARC CE + EGIIS
• DPM
• Frontier Squid
• Client

Page 4: Install Software Components

HTCondor
• Completed

ARC CE + EGIIS
• Basic configuration completed

DPM
• Basic installation completed (all-in-one node)
• Works with protocols such as RFIO and XROOT

Squid
• Completed

Client
• Can access ARC CE, DPM, and Frontier Squid

Page 5: Install Testing Cluster

• 10 VMs
  – One EGIIS (for testing the registration process with the CERN grid)
  – One ARC CE node with HTCondor manager and submit roles
    • Two HTCondor worker nodes
  – One ARC CE node with HTCondor manager, submit, and execute roles
  – One DPM head node
    • Two DPM disk nodes
  – One Squid server
  – One client
• All 10 servers carry production host certificates obtained under AP Grid PMA
• Would like to try connecting to the CERN grid now (yet to be discussed with counterparts in Lyon)
  – Need to tune configuration parameters
  – Need to sort out all outstanding issues

Page 6: Conduct Tender for Production Cluster

• Preliminary specification
  – 1,000+ cores
  – 1 PB storage
• Target: finalize the cluster specification after the testing cluster is connected to the CERN grid in "test" mode

Page 7: Upgrade Testing Cluster into Production Cluster

• Replace VMs in the testing cluster with physical machines (PMs)
  – Replace ARC CE, HTCondor worker nodes, and Squid
• Add more HTCondor worker nodes
• Reinstall DPM on PMs with storage devices

Page 8: Tentative Timeline

• Connect testing cluster to the CERN grid in "test" mode (by 2015)
• Conduct tender for production cluster (Jan 2016)
• Put cluster into production (H2 2016)

Page 9: Roadmap

• ATLAS HK Tier-2 Site Setup
• Storage Research in CUHK
  – Led by Professor Patrick P. C. Lee ([email protected])
  – Presented by Runhui Li ([email protected]), Advanced Networking and System Research Lab, Department of Computer Science and Engineering

Page 10: Storage Research in CUHK

Build dependable storage systems with fault tolerance, recovery, security, and performance in mind.

Techniques:
• Erasure coding: provide fault tolerance via "controlled" redundancy (e.g., RAID)
• Deduplication: remove content-level "uncontrolled" redundancy
• Security: ensure data confidentiality and integrity against attacks

Targeted architectures:
• Clouds, data centers, disk arrays, SSDs

Approach:
• Build prototypes, backed by experiments and theoretical analysis
• Open-source software: http://www.cse.cuhk.edu.hk/~pclee

Page 11: Storage Research in CUHK

[Figure: a map of techniques (erasure coding, deduplication, security) against targeted platforms (cloud, data center, disk array, SSD) and workloads (backup, MapReduce, streaming, primary I/O); the group's focus spans big data and file and storage systems.]

Page 12: Motivation

• Distributed storage systems are widely deployed to provide scalable storage by striping data across multiple nodes
• Failures are common

[Figure: storage nodes connected over a LAN]

Page 13: Replication vs. Erasure Coding

Solution: add redundancy
• Replication
• Erasure coding

Enterprises (e.g., Google, Azure, Facebook) are moving to erasure coding to shrink storage footprints in the face of explosive data growth
• e.g., 3-way replication has 200% storage overhead; erasure coding can reduce the overhead to 33% → over 50% operational cost saving [Huang, ATC'12] (see the sketch below)
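
The arithmetic behind those figures is worth making explicit. A minimal sketch; the (16, 12) parameters are an assumption chosen to match the 33% figure (Azure's LRC in [Huang, ATC'12] keeps 12 data and 4 parity fragments):

```python
# Storage overhead of replication vs. (n, k) erasure coding, where n is the
# total number of chunks and k of them carry data.

def replication_overhead(copies: int) -> float:
    """Extra storage as a fraction of the original data size."""
    return float(copies - 1)          # 3 copies -> 2 extra -> 200%

def erasure_overhead(n: int, k: int) -> float:
    """(n, k) coding stores n/k times the data, so (n - k)/k is overhead."""
    return (n - k) / k

print(f"3-way replication : {replication_overhead(3):.0%} overhead")
print(f"(16, 12) erasure  : {erasure_overhead(16, 12):.0%} overhead")
# -> 200% vs. 33%: the saving that motivates the move to erasure coding
```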


Page 14: Background: Erasure Coding

• Divide the file into k data chunks (each with multiple blocks)
• Encode the data chunks into additional parity chunks
• Distribute the data/parity chunks across n nodes
• Fault tolerance: any k out of n nodes can recover the file data

Example with (n, k) = (4, 2), where "+" denotes XOR. The file is divided into blocks A, B, C, D and encoded:

Node 1 (data):   A, B
Node 2 (data):   C, D
Node 3 (parity): A+C, B+D
Node 4 (parity): A+D, B+C+D
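
A runnable sketch of exactly this layout ("+" being bitwise XOR); it also checks the fault-tolerance claim by rebuilding every block after both data nodes are lost:

```python
# The slide's (n, k) = (4, 2) example: blocks A-D striped as data chunks
# {A, B} and {C, D}, parity chunks {A+C, B+D} and {A+D, B+C+D}.

def xor(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

A, B, C, D = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
node1, node2 = (A, B), (C, D)              # data nodes
node3 = (xor(A, C), xor(B, D))             # parity node: A+C, B+D
node4 = (xor(A, D), xor(B, C, D))          # parity node: A+D, B+C+D

# Worst case: both data nodes fail; decode from the parity nodes alone.
ac, bd = node3
ad, bcd = node4
c = xor(bd, bcd)        # (B+D) + (B+C+D) = C
d = xor(ac, ad, c)      # (A+C) + (A+D) = C+D; adding C leaves D
a = xor(ac, c)          # (A+C) + C = A
b = xor(bd, d)          # (B+D) + D = B
assert (a, b, c, d) == (A, B, C, D)
```

The same elimination works for every other pair of nodes, which is what "any k out of n" means in practice.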

Page 15: Erasure Coding

Key advantage:
• Reduces storage space while providing high fault tolerance

Challenges:
• Data chunk updates require parity chunk updates → expensive updates
• k chunks are needed to recover a lost chunk → expensive recovery

Our work: mitigate the performance overhead of erasure coding while preserving its storage efficiency

Page 16: CodFS

• Object-based distributed file system
  – Splits a large file into smaller segments that are striped across different storage nodes
• Erasure coding
  – Each segment is independently encoded with erasure coding for fault tolerance
• Decoupled metadata and data management
  – Metadata updates are off the critical path
• Lightweight recovery
  – Monitors the health of storage nodes and triggers recovery when needed

"Parity Logging with Reserved Space: Towards Efficient Updates and Recovery in Erasure-coded Clustered Storage", USENIX FAST 2014

Page 17: CodFS Solves the Update Problem

Novelty: parity logging with reserved space
• Puts parity deltas in a reserved space next to the parity chunks to eliminate disk seeks in parity updates
• Predicts and reclaims the reserved space in a workload-aware manner
• Mitigates both network and disk I/O in updates and recovery

[Figure: a data node ships a delta ∆A; parity nodes apply ∆P = f(∆A) and ∆Q = g(∆A).]
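
A minimal sketch of the reserved-space idea, assuming a linear code (so a parity delta is a function of the shipped data delta); the class below is illustrative, not CodFS's actual interface:

```python
# Parity logging with reserved space: deltas are appended next to the parity
# chunk (a sequential write, no read-modify-write seek) and folded in lazily.

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

class ParityChunk:
    def __init__(self, parity: bytes, reserved_slots: int):
        self.parity = parity
        self.reserved_slots = reserved_slots   # workload-aware prediction sets this
        self.log = []                          # deltas living in the reserved space

    def append_delta(self, delta: bytes):
        if len(self.log) == self.reserved_slots:
            self.merge()                       # reserved space exhausted: reclaim it
        self.log.append(delta)

    def merge(self):
        for delta in self.log:
            self.parity = xor(self.parity, delta)
        self.log.clear()

    def current(self) -> bytes:                # parity as seen by recovery
        p = self.parity
        for delta in self.log:
            p = xor(p, delta)
        return p

# A data node updates chunk A and ships only the delta; for XOR parity,
# ΔP = ΔA (a Reed-Solomon code would scale the delta by a coefficient).
old_A, new_A, B = b"\x00\x01", b"\x10\x01", b"\x0f\x0f"
P = ParityChunk(parity=xor(old_A, B), reserved_slots=4)   # toy parity P = A + B
P.append_delta(xor(old_A, new_A))
assert P.current() == xor(new_A, B)
```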

Page 18: CodFS: I/O Workflow

[Figure: a client and an MDS above a set of OSDs, one acting as primary and the rest as secondaries; the client's segment reaches the primary OSD, is encoded into chunks, and the chunks are distributed to the secondary OSDs (steps 1–4).]

MDS: metadata server; OSD: object storage device
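
A toy rendering of the write path implied by the diagram; the class names, method names, and exact step ordering are a hedged reading, not the real CodFS API:

```python
# Write path sketch: (1) client asks the MDS which OSD is primary for the
# segment, (2) ships the whole segment there, (3) the primary encodes it
# into chunks, (4) chunks are striped across the primary and secondaries.

class OSD:
    def __init__(self, name):
        self.name, self.chunks = name, {}
    def store(self, seg_id, chunk):
        self.chunks[seg_id] = chunk

class PrimaryOSD(OSD):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries
    def write(self, seg_id, segment):
        half = len(segment) // 2                       # 3. encode: 2 data chunks
        d1, d2 = segment[:half], segment[half:]
        parity = bytes(a ^ b for a, b in zip(d1, d2))  #    + 1 XOR parity chunk
        for osd, chunk in zip([self] + self.secondaries, [d1, d2, parity]):
            osd.store(seg_id, chunk)                   # 4. distribute chunks

class MDS:
    """Metadata stays off the data path: it only maps segment -> primary."""
    def __init__(self, primaries):
        self.primaries = primaries
    def lookup(self, seg_id):
        return self.primaries[hash(seg_id) % len(self.primaries)]

secondaries = [OSD("osd2"), OSD("osd3")]
mds = MDS([PrimaryOSD("osd1", secondaries)])
primary = mds.lookup("seg-0")        # 1. metadata lookup
primary.write("seg-0", b"ABCD")      # 2. segment sent to the primary OSD
assert secondaries[1].chunks["seg-0"] == bytes(a ^ b for a, b in zip(b"AB", b"CD"))
```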

Page 19: CodFS Implementation

CodFS architecture:
• Exploits parallelization across nodes and within each node
• Provides a file system interface based on FUSE

OSD: modular design

Page 20: Results

Aggregate read/write throughput:
• Achieves several hundred megabytes per second
• Network bound

Page 21: Projects on Erasure Coding

Mixed failures
• STAIR codes: a general, space-efficient erasure code for tolerating both device failures and latent sector errors [FAST'14, TOS'14]
• I/O-efficient integrity checking against silent data corruptions [MSST'14]

Efficient updates
• CodFS: enhanced parity logging to reduce network and disk I/Os [FAST'14]

Efficient recovery
• NCCloud: reduce bandwidth for archival storage [FAST'12, INFOCOM'13, TC'14]
• I/O-efficient recovery schemes for erasure codes [MSST'12, DSN'12, TC'14, TPDS'14]

Integration of erasure coding and Hadoop
• CORE: regenerating code deployment in HDFS [MSST'13, TC'15]
• Degraded-first scheduling: MapReduce on erasure-coded storage [DSN'14]
• Encoding-aware replication: efficient transition from replication to erasure coding on HDFS [DSN'15]

Modeling of SSD RAID
• Stochastic model to capture reliability changes as SSDs age [SRDS'13, TC]

Page 22: Projects on Deduplication

LiveDFS: Linux kernel-space deduplication file system [Middleware'11]
• Extends a Linux file system with deduplication
• Follows the Linux file system layout
• Deployed as a kernel driver module

CloudVS: tunable version control for virtual machine images on OpenStack [NOMS'12, TSC'15]
• Extends Eucalyptus with deduplication
• Tunable trade-off between storage efficiency and performance

RevDedup: reverse deduplication with GB/s-scale read/write throughput [APSys'13, TOS'15]
• Efficient hybrid of inline and out-of-line deduplication

Page 23: Projects on Security

FADE: secure access control and assured deletion for cloud storage [SecureComm'10, ICPP Workshop'11, TDSC'12]

FMSR-DIP: remote data checking for regenerating codes [SRDS'12, TPDS'14]

Cryptographic deduplication for cloud storage [TPDS'14, TPDS'15]

CDStore: unifying erasure coding, deduplication, and security via convergent dispersal [HotStorage'14, USENIX ATC'15]
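
The common thread of the last two items is deduplication over encrypted data. A toy illustration of convergent encryption, the idea that convergent dispersal generalizes: the key is derived from the content itself, so identical plaintexts encrypt identically and remain deduplicable (XOR keystream used for brevity, not production crypto):

```python
# Convergent encryption sketch: key = H(content), so two users storing the
# same chunk produce the same ciphertext and the server can deduplicate it
# without ever seeing the plaintext or the key.
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    out = b""
    ctr = 0
    while len(out) < n:                      # SHA-256 in counter mode (toy)
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def convergent_encrypt(chunk: bytes) -> bytes:
    key = hashlib.sha256(chunk).digest()     # key derived from the content
    return bytes(a ^ b for a, b in zip(chunk, keystream(key, len(chunk))))

assert convergent_encrypt(b"same data") == convergent_encrypt(b"same data")
```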

Page 24: Discussion

Page 25: Connecting to CERN Grid

By Roger Wong

Page 26: Connecting to CERN Grid (1)

• Registered in GOCDB or OIM
• Step 0: required services
  – SE
    • CUHK: set up SRMv2.2 and configure the necessary space tokens
    • Lyon: configure FTS channels
  – CE and WNs
    • Not that many in the testing cluster when first connecting to the Tier-1 site
    • Will add many more WNs within 6 months
  – CVMFS
    • What does CUHK need to do? Just ensure our client has CVMFS installed?
  – Squid
    • Install default and fail-over Squid servers (see the sanity-check sketch after this list)
    • Manual fail-over?
• Question
  – Could CUHK transfer in data from more than one Tier-1 site?
• Outstanding items
  – Separate DPM head node and disk nodes
  – SRMv2.2 configuration
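
As a sanity check before connecting, a client node could verify that the CVMFS repository is mounted and that the Squid proxy answers. Hostnames and URLs below are placeholders, not the site's actual configuration:

```python
# Client-side checks: CVMFS mount visible, Squid proxy forwarding HTTP.
import os
import urllib.request

def cvmfs_mounted(repo: str = "/cvmfs/atlas.cern.ch") -> bool:
    # autofs mounts the repository on first access; listing triggers it
    return os.path.isdir(repo) and bool(os.listdir(repo))

def squid_reachable(proxy: str = "http://squid.example.org:3128",
                    url: str = "http://frontier.example.org/") -> bool:
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy}))
    try:
        opener.open(url, timeout=10)
        return True
    except OSError:
        return False

if __name__ == "__main__":
    print("CVMFS mounted :", cvmfs_mounted())
    print("Squid reachable:", squid_reachable())
```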

Page 27: Connecting to CERN Grid (2)

• Step 1: register the site in AGIS
  – Register an "Atlas Site" with the site name used in GOCDB/OIM
  – Is the site name just CUHK?
• Step 2: register the storage in DDM
  – Register "DDM Endpoints" corresponding to the space tokens in AGIS
  – CUHK provides:
    • SE name
    • Space token availability
    • Email address of the responsible person
    • seinfo
    • FTS channel information
  – Lyon:
    • Open a DDM Ops Savannah ticket
      » Include the DDM endpoint in SiteServices and DeletionServices
      » Validate the transfer and deletion steps with one dataset
      » DDM endpoints will appear in DaTRI after 24 hours
    • Fill in all the information

Page 28: Connecting to CERN Grid (3)

• Step 3: set up a Squid
  – Register the Squid in AGIS, together with the Frontier services it should look up
• Step 4: Panda queues
  – CUHK provides:
    • CE name and queue name
    • vmem size per job slot
    • Available disk size (workdir) per job slot
    • Wall-time limit, if any
  – Lyon:
    • Register "Panda Site", "Panda Resources", and the associated "Panda Queues" in AGIS
• Question
  – Will the ARC CE queue become a Panda queue automatically once CUHK registers as a Panda site, i.e., is no extra setup at CUHK necessary?

Page 29: Connecting to CERN Grid (4)

• Step 4 (continued): Panda queues
  – Panda site
    • Usually "Atlas Site" == GOCDB/OIM site name
  – Panda resource
    • Associated with the "Panda site"
    • Production jobs: usually "Panda site" == "Atlas site" == GOCDB/OIM
  – Panda queue
    • Associated with the "Panda resource"
    • Usually the same name as the "Panda resource"
    • Associates the CE and the queue
    • Set the queue status to "test"

Page 30: Connecting to CERN Grid (5)

• Step 5: ATLAS SW installation/validation system
  – After the "Panda queues" are configured, contact [email protected] to start automatic software installation/validation

Page 31: Connecting to CERN Grid (6)

• ATLAS functional tests
  – DDM FT: tests storage and connectivity stability
  – SAM test: tests CE and storage stability
• Step 6: perform the data transfer functional test
  – Lyon: include the site in DDM FT (T1 → site, and Sonar)

Page 32: Connecting to CERN Grid (7)

• Step 7: perform the production functional test
• Step 8: perform the analysis functional test
  – Contact atlas-adc-hammercloud-support and provide the "Panda resource" name
  – The site should be set automatically in the HC DB within hours
  – Test jobs should appear within 24 hours

Page 33: Connecting to CERN Grid (8)

• Step 9: analysis activity
  – Add the site to the PanDA database (for pathena/prun analysis jobs)
  – The site appears in the PanDA Cloud Monitor
  – Run GangaRobot jobs with a success rate > 95% for 10 days
• Step 10: production activity
  – The site goes online after a few jobs have run successfully