Federated DAFS: Scalable Cluster-based Direct Access File Servers

Murali Rangarajan, Suresh Gopalakrishnan, Ashok Arumugam, Rabita Sarker (Rutgers University)
Liviu Iftode (University of Maryland)


Post on 23-Jan-2016



SAN-2 Disco Lab

Network File Servers

OS involvement increases latency and overhead
    TCP/UDP protocol processing
    Memory-to-memory copying

[Figure: clients access the file server through NFS over TCP/IP]

User-level Memory Mapped Communication

Application has direct access to the network interface
OS involved only in connection setup to ensure protection
Performance benefits: zero-copy, low overhead

[Figure: a sending and a receiving application, each with direct access to its NIC, with the OS on each host alongside]

Virtual Interface Architecture

Data transfer from user space
Setup and memory registration through kernel
Communication models
    Send/Receive: a pair of descriptor queues
    Remote DMA: receive operation not required

[Figure: the Application uses the VI Provider Library and the send, receive and completion queues of the VI NIC; setup and memory registration go through the Kernel Agent]
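The two communication models can be illustrated with a toy, in-process model. This is only a sketch: the class and function names (VirtualInterface, send, rdma_write) are hypothetical and are not the VI Provider Library API; they only show that Send/Receive consumes a receive descriptor posted by the peer, while Remote DMA writes into registered memory without one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualInterface:
    """Toy model of one VI endpoint (hypothetical, not the real VIPL API)."""
    send_queue: deque = field(default_factory=deque)    # posted send descriptors
    recv_queue: deque = field(default_factory=deque)    # pre-posted receive buffers
    completions: deque = field(default_factory=deque)   # completion queue
    registered: dict = field(default_factory=dict)      # region name -> bytearray

    def register(self, name, size):
        """Stand-in for memory registration, done once at setup time."""
        self.registered[name] = bytearray(size)
        return self.registered[name]

    def post_recv(self, buf):
        self.recv_queue.append(buf)


def send(src, dst, payload):
    """Send/Receive: needs a receive descriptor already posted by the peer."""
    src.send_queue.append(payload)
    if not dst.recv_queue:
        raise RuntimeError("no receive descriptor posted")
    buf = dst.recv_queue.popleft()
    data = src.send_queue.popleft()
    if len(data) > len(buf):
        raise ValueError("receive buffer too small")
    buf[:len(data)] = data
    src.completions.append(("send", len(data)))
    dst.completions.append(("recv", len(data)))


def rdma_write(src, dst, region, payload, offset=0):
    """Remote DMA: writes into registered memory; the peer posts no receive."""
    target = dst.registered[region]
    if offset + len(payload) > len(target):
        raise ValueError("write exceeds registered region")
    target[offset:offset + len(payload)] = payload
    src.completions.append(("rdma_write", len(payload)))
```

In this model the receiver learns about a Send through its completion queue, whereas an RDMA write leaves no completion on the receiving side, which is why a separate notification message is typically paired with it.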

Direct Access File System Model

[Figure: a DAFS client (application buffers, file access API, DAFS client over VIPL in user space, VI NIC driver in the kernel, NIC) connected to DAFS file servers (file server buffers over KVIPL or VIPL, VI NIC driver, NIC)]

Goal: High-performance DAFS Server

Cluster-based DAFS server
    Direct access to network-attached storage distributed across server cluster
Clusters of commodity computers: good performance at low cost
User-level communication for server clustering
    Low-overhead mechanism
    Lightweight protocol for file access across cluster

Outline

Portable DAFS client and server implementation
Clustering DAFS servers: Federated DAFS
Performance evaluation

User-space DAFS Implementation

DAFS client and server in user space
DAFS API primitives translate to RPCs on server
Staged Event Driven Architecture
Portable across Linux, FreeBSD and Solaris

[Figure: the application calls the DAFS client, which sends a DAFS API request over the VI network to the DAFS server; the server uses its local FS and returns a DAFS API response]

DAFS Server

[Figure: clients send connection requests and DAFS API requests to the server, which contains a Connection Manager and Protocol Threads; responses return to the client]

Client-Server Communication

VI channel established at client initialization
VIA Send/Receive used except for dafs_read
Zero-copy data transfers
    Emulation of RDMA Read used for dafs_read
    Scatter/gather I/O used in dafs_write

[Figure: dafs_read(file, buf): the DAFS client sends a request (req) over the VI network and the server's response fills buf; dafs_write(file, buf): the client sends buf together with the request. Components: Application, DAFS Client, VI, VI Network, DAFS Server, Local FS]


Asynchronous I/O Implementation

Applications use I/O descriptors to submit asynchronous read/write requests

Read/write call returns immediately to application

Result stored in I/O descriptor on completion

Applications need to use I/O descriptors to wait/poll for completion
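The descriptor pattern described on this slide can be sketched as follows. This is a minimal illustration on a local file using a background thread, not the DAFS API: IODescriptor and async_read are hypothetical names introduced for the example.

```python
import threading


class IODescriptor:
    """Hypothetical I/O descriptor: holds the outcome of one asynchronous request."""

    def __init__(self):
        self._done = threading.Event()
        self.result = None
        self.error = None

    def poll(self):
        """Non-blocking check: has the request completed?"""
        return self._done.is_set()

    def wait(self, timeout=None):
        """Block until completion, then return the result stored in the descriptor."""
        if not self._done.wait(timeout):
            raise TimeoutError("I/O still in flight")
        if self.error is not None:
            raise self.error
        return self.result


def async_read(path, offset, length):
    """Submit a read and return immediately with a descriptor."""
    desc = IODescriptor()

    def worker():
        try:
            with open(path, "rb") as f:
                f.seek(offset)
                desc.result = f.read(length)
        except OSError as exc:
            desc.error = exc
        finally:
            desc._done.set()   # completion: result (or error) is now in the descriptor

    threading.Thread(target=worker, daemon=True).start()
    return desc
```

An application would call async_read, continue with other work, and later either poll() the descriptor in a loop or call wait() when it needs the data.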

Benefits of Clustering

[Figure: three configurations of applications with DAFS clients and DAFS servers with local file systems: Single DAFS Server; Standalone DAFS Servers on a Cluster; Clustered DAFS Servers, where each server has a Clustering Layer]

Clustering DAFS Servers Using FedFS

Federated File System (FedFS)
    Federation of local file systems on cluster nodes
    Extend the benefits of DAFS to cluster-based servers
    Low-overhead protocol over SAN

[Figure: four DAFS servers performing file I/O, joined by FedFS over SAN]

FedFS Goals

Global name space across the cluster
    Created dynamically for each distributed application
Load balancing
Dynamic reconfiguration

Virtual Directory (VD)

Union of all local directories with same pathname
Each VD is mapped to a manager node
    Determined using hash function on pathname
    Manager constructs and maintains the VD

[Figure: one node holds /usr/file1 and another holds /usr/file2; the Virtual Directory (/usr) contains file1 and file2]

Constructing a VD

Constructed on first access to directory
Manager performs dirmerge to merge real directory info on cluster nodes into a VD
Summary of real directory info is generated and exchanged at initialization
Cached in memory and updated on directory-modifying operations
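A minimal sketch of the virtual-directory idea, under stated assumptions: the slides do not say which hash function is used, so CRC32 is a stand-in, and manager_for and this dirmerge signature are illustrative rather than the FedFS implementation.

```python
import zlib


def manager_for(pathname, num_nodes):
    """Pick the manager node for a pathname with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(pathname.encode("utf-8")) % num_nodes


def dirmerge(local_listings):
    """Merge per-node listings of one directory into a virtual directory.

    local_listings maps node id -> entries that node holds under the same pathname.
    Returns {entry name: sorted list of nodes holding it}.
    """
    merged = {}
    for node, entries in local_listings.items():
        for name in entries:
            merged.setdefault(name, []).append(node)
    return {name: sorted(nodes) for name, nodes in sorted(merged.items())}


# The /usr example from the slides: node 0 holds file1, node 1 holds file2.
virtual_usr = dirmerge({0: ["file1"], 1: ["file2"]})
```

Here virtual_usr lists both file1 and file2, each tagged with the node that stores it; in the scheme described above only the manager chosen by manager_for("/usr", n) would build and cache this structure.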

File Access in FedFS

Each file mapped to a manager
    Determined using hash on pathname
    Maintains information about the file
Request manager for location (home) of file
Access file from home

[Figure: three DAFS servers, each with FedFS, VI and a local FS, on the VI network; a request for file f1 goes to manager(f1), and f1 is accessed at home(f1)]
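The lookup path (hash the pathname to find the manager, ask the manager for the home, then access the file at the home) can be sketched with an in-memory model. Everything here is hypothetical: the Cluster class, the CRC32 hash and the message counter are illustration only, not the FedFS protocol.

```python
import zlib


class Cluster:
    """In-memory stand-in for a set of FedFS nodes (illustrative only)."""

    def __init__(self, num_nodes):
        self.local_files = [dict() for _ in range(num_nodes)]     # node -> {path: data}
        self.manager_tables = [dict() for _ in range(num_nodes)]  # node -> {path: home}
        self.messages = 0                                         # inter-node messages

    def manager(self, path):
        # Hash on pathname; the actual hash function is not given in the slides.
        return zlib.crc32(path.encode("utf-8")) % len(self.local_files)

    def create(self, node, path, data):
        """Store the file on the node that received the request and tell its manager."""
        self.local_files[node][path] = data
        mgr = self.manager(path)
        if mgr != node:
            self.messages += 1
        self.manager_tables[mgr][path] = node

    def read(self, node, path):
        """Ask the manager for the file's home, then read the file from the home."""
        mgr = self.manager(path)
        if mgr != node:
            self.messages += 1          # location request to the manager
        home = self.manager_tables[mgr][path]
        if home != node:
            self.messages += 1          # data fetched from the home node
        return self.local_files[home][path]
```

When the requesting node is both manager and home, read() completes without any inter-node message, which is the case the hash-based request distribution in the evaluation aims for.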

Optimizing File Access

Directory Table (DT) to cache file information
    File information cached after first lookup
    Cache of name space distributed across cluster
Block-level in-memory data cache
    Data blocks cached on first access
    LRU replacement
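A block cache with LRU replacement can be sketched in a few lines. This is a generic illustration, assuming blocks are keyed by (path, block number) and fetched through a caller-supplied function; it is not the FedFS cache itself.

```python
from collections import OrderedDict


class BlockCache:
    """Block-level in-memory cache with least-recently-used replacement."""

    def __init__(self, capacity_blocks, fetch_block):
        self.capacity = capacity_blocks
        self.fetch_block = fetch_block      # callable (path, block_no) -> bytes
        self.blocks = OrderedDict()         # oldest entry first
        self.hits = 0
        self.misses = 0

    def read_block(self, path, block_no):
        key = (path, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)    # mark as most recently used
            self.hits += 1
            return self.blocks[key]
        self.misses += 1
        data = self.fetch_block(path, block_no)   # cached on first access
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)       # evict the least recently used block
        return data
```

In a clustered setting fetch_block would be the remote fetch from the file's home node, so every hit saves one round trip.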

Communication in FedFS

Two VI channels between any pair of server nodes
    Send/Receive for request/response
    RDMA exclusively for data transfer
Descriptors and buffers registered at initialization

[Figure: two DAFS servers with FedFS connected over the VI network; Send/Receive carries the request/response, and RDMA into a buffer carries a response with data]

Performance Evaluation

[Figure: evaluation setup with applications and DAFS clients connected over the VI network to DAFS servers, each running FedFS over a local FS]

Experimental Platform

Eight-node server cluster
    800 MHz PIII, 512 MB SDRAM, 9 GB 10K RPM SCSI
Clients
    Dual processor (300 MHz PII), 512 MB SDRAM
Linux-2.4
Servers and clients equipped with Emulex cLAN adapter
32-port Emulex switch in full-bandwidth configuration

SAN Performance Characteristics

VIA latency and bandwidth
    Polling used for latency measurement, waiting for bandwidth measurement

Packet Size (Bytes)   Roundtrip Latency (µs)   Bandwidth (MB/s)
256                   23.3                     56
512                   27.3                     85
1024                  36.9                     108
2048                  56.0                     109
4096                  91.2                     110

Workloads

Postmark: synthetic benchmark
    Short-lived small files
    Mix of metadata-intensive operations
Benchmark outline
    Create a pool of files
    Perform transactions: READ/WRITE paired with CREATE/DELETE
    Delete created files

Workload Details

Each client performs 30,000 transactions
    Each transaction: READ paired with CREATE/DELETE
    READ = open, read, close
    CREATE = open, write, close
    DELETE = unlink
Multiple clients used for maximum throughput
Clients distribute requests to servers using a hash function on pathnames
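The transaction mix above can be approximated on a local file system. This is a rough sketch, not Postmark: run_postmark_like and its parameters are hypothetical, and the even split between CREATE and DELETE is an assumption, since the slides do not give the ratio.

```python
import os
import random


def run_postmark_like(root, pool_size=20, transactions=100, file_size=4096, seed=0):
    """Create a pool of files, run READ + CREATE/DELETE transactions, then clean up."""
    rng = random.Random(seed)
    payload = b"x" * file_size
    pool = []
    counter = 0

    def create():
        nonlocal counter
        path = os.path.join(root, f"f{counter}")
        counter += 1
        with open(path, "wb") as f:          # CREATE = open, write, close
            f.write(payload)
        pool.append(path)

    def delete():
        victim = pool.pop(rng.randrange(len(pool)))
        os.unlink(victim)                    # DELETE = unlink

    for _ in range(pool_size):               # phase 1: create a pool of files
        create()

    stats = {"read": 0, "create": 0, "delete": 0}
    for _ in range(transactions):            # phase 2: transactions
        with open(rng.choice(pool), "rb") as f:   # READ = open, read, close
            f.read()
        stats["read"] += 1
        # Assumed 50/50 mix; never delete the last file so READ always has a target.
        if rng.random() < 0.5 or len(pool) <= 1:
            create()
            stats["create"] += 1
        else:
            delete()
            stats["delete"] += 1

    for path in pool:                        # phase 3: delete created files
        os.unlink(path)
    return stats
```

Running it against a temporary directory and dividing the transaction count by the elapsed time gives a transactions-per-second figure comparable in shape, though not in value, to the numbers reported in the evaluation.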

Base Case (Single Server)

Maximum throughput
    5075 transactions/second
Average time per transaction
    For client: ~200 µs
    On server: ~100 µs

Postmark Throughput

[Figure: Postmark throughput (txns/sec, 0 to 30000) versus number of servers, for file sizes of 2 K, 4 K, 8 K and 16 K]

# Servers   2      4   8
Speedup     1.75   3   5
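One way to read the speedup table is to convert it to parallel efficiency (speedup divided by number of servers). The short calculation below uses only the values from the table; the efficiency metric itself is an addition for illustration, not something reported on the slide.

```python
# Speedups reported in the table, keyed by number of servers.
speedup = {2: 1.75, 4: 3.0, 8: 5.0}

# Parallel efficiency = speedup / number of servers.
efficiency = {servers: s / servers for servers, s in speedup.items()}

for servers, eff in efficiency.items():
    print(f"{servers} servers: speedup {speedup[servers]}, efficiency {eff:.1%}")
```

Efficiency falls from 87.5% on two servers to 75% on four and 62.5% on eight, so throughput keeps growing with cluster size but each added server contributes less.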

FedFS Overheads

Files are physically placed on the node which receives client requests
Only metadata operations may involve communication
    first open(file)
    delete(file)
Observed communication overhead
    Average of one roundtrip message among servers per transaction

Other Workloads

No client request sent to file's correct location
    All files created outside Federated DAFS
    Only READ operations (open, read, close)
    Potential increase in communication overhead
Optimized coherence protocol minimizes communication
    Avoid communication at open and close in the common case
    Data caching helps reduce the frequency of communication for remote data access

Postmark Read Throughput

Each transaction = READ

[Figure: Postmark read throughput (txns/sec, 0 to 60000) for 2 and 4 servers; series: Federated DAFS, Federated DAFS - No Cache]

Communication Overhead Without Caching

Without caching, each read results in remote fetch
Each remote fetch costs ~65 µs
    request message (< 256 B) + response message (4096 B)

# Servers   # Clients for Max. Throughput   # Transactions   # Remote Reads on Each Server
2           10                              300,000          150,000
4           20                              600,000          150,000
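The table entries are consistent with the workload description (30,000 transactions per client) and with every read being remote. The check below reproduces them; the final figure, the cumulative fetch time per server, is a derived estimate that assumes fetches are not overlapped, which the slides do not state.

```python
TXNS_PER_CLIENT = 30_000      # from the workload description
REMOTE_FETCH_US = 65          # approximate cost of one remote fetch, in microseconds

rows = []
for servers, clients in [(2, 10), (4, 20)]:
    transactions = clients * TXNS_PER_CLIENT
    remote_reads_per_server = transactions // servers
    # Assumption: fetches on one server are serialized (no overlap).
    fetch_seconds = remote_reads_per_server * REMOTE_FETCH_US / 1_000_000
    rows.append((servers, clients, transactions, remote_reads_per_server, fetch_seconds))

for row in rows:
    print(row)
```

Both configurations yield 150,000 remote reads per server, or roughly 9.75 seconds of cumulative fetch time per server under the no-overlap assumption.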

Work in Progress

Study other application workloads
Optimized coherence protocols to minimize communication in Federated DAFS
File migration
    Alleviate performance degradation from communication overheads
    Balance load
Dynamic reconfiguration of cluster
Study DAFS over a Wide Area Network

Conclusions

Efficient user-level DAFS implementation
Low-overhead user-level communication used to provide lightweight clustering protocol (FedFS)
Federated DAFS minimizes overheads by reducing communication among server nodes in the cluster
Speedups of 3 on 4-node and 5 on 8-node clusters demonstrated using Federated DAFS

Thanks

Distributed Computing Laboratory
http://discolab.rutgers.edu

DAFS Performance

[Figure: Postmark throughput (txns/sec, 0 to 40000) versus number of servers (0 to 10); file size: 4 K]