fefs: scalable cluster file system · fefs† is scalable parallel file system based on lustre....

13
FEFS: Scalable Cluster File System

Upload: others

Post on 01-Aug-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

FEFS: Scalable Cluster File System

Page 2: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

Features of FEFS

FEFS† is scalable parallel file system based on Lustre.

High Performance & High Scalability

Scalable I/O performance (~1TB/s) & capacity (~8EB).

I/O Bandwidth Guarantee

Fair share and Best effort QoS.

High Reliability & High Availability

Failover with redundant hardware

and continuing file system service.

† FEFS: Fujitsu Exabyte File System

Meta Data Server

(MDS)

Client Node

Meta Data

Object Storage

Server

(OSS)

Object Storage

Target

(OST)

File Data

1

Page 3: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

System Configuration

ETERNUS (MDT)

FEFS Clients

InfiniBand Network

PRIMERGY RHEL5.6, RHEL6.1

Metadata Server Data Server

PRIMERGY RHEL5.6 (MDS)

ETERNUS(OST)

PRIMERGY RHEL5.6 (OSS)

Copyright 2011 FUJITSU LIMITED 2

PRIMEHPC FX10

2

Page 4: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Specification of FEFS

Fujitsu expand system limits and add new functions to Lustre. Item FEFS Lustre

Max. file system size 8 EB 64 PB

Max. file size 8 EB 320 TB

Max. number of files 8x1018 files 4x109 files

Max. number of OST 20x103 8,150

Max. OST size 1 PB 16 TB

Max. number of clients 1x106 clients 128x103 clients

Max. block size 512 KB 4 KB

Max. number of stripes 20x103 stripes 160 stripes

QoS (Fair share/Best effort) Yes No

Directory Quota Yes No

InfiniBand Multi-rail Yes No Copyright 2011 FUJITSU LIMITED 3 3

Page 5: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

High Performance & High Scalability

Achieved high-scalable I/O performance with multiple server.

Scale out throughput & capacity by adding server & storage.

OSS

Add Server&Storage

OSS

Number of servers

Th

rough

put/

Capacity

4

Page 6: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

I/O Bandwidth Guarantee: Fair Share QoS

Sharing IO bandwidth with all users.

Prevent slowdown from huge I/O from a user.

Prevent variability in job execution time.

File Servers Login Node

User A

User B

Without Fair Share QoS

Not Fair

With Fair Share QoS

User A

User B

Fair

Fujitsu extended function

5

Page 7: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

I/O Bandwidth Guarantee: Best Effort QoS

Utilize all I/O bandwidth exhaustively.

Shared by all clients Occupied by one client

Clients

File Servers

Clients A

Clients B

Fujitsu extended function

6

Page 8: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Effectiveness of Fair Share QoS

Sample Case: User A 19 node job

User B 1 node job ⇒ Creation and removal time of 10,000 files.

User B

10,000 files

Without fair share

Single user

Without fair share

Multiple users

With fair share

Multiple users

Create files 4.1 sec 10.1 sec 3.9 sec

Remove files 4.2 sec 14.0 sec 5.5 sec

19 node job

1 node job

19 node job

1 node job

1 node job

FE

FS

Serv

er

FE

FS

Serv

er

FE

FS

Serv

er

Copyright 2011 FUJITSU LIMITED

User A

User B

User A

User B

User B

User B’s processing time

7

Page 9: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

High Reliability and High Availability

Avoid out of service time caused by a single point of failure

with redundant hardware and failover mechanism.

OSS (Active)

OSS (Active)

RAID RAID

MDS OSS

RAID

IB SW IB SW Network path

Disk path

Dual Server

RAID

Failover

MDS (Active)

MDS (Standby)

Failover

Compute Node (Clients)

Redundant hardware

8

Page 10: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Continue communication when single point of IB failure occurs.

All IB connections are used by round-robin order by each requests.

Copyright 2011 FUJITSU LIMITED

High Availability: InfiniBand Multi-rail

Clients

MDS/OSS

o2ib0

o2ib0

o2ib0

o2ib0

InfiniBand SW

Multi-rail

by o2iblnd

Clients

MDS/OSS

o2ib0

o2ib0

o2ib0

o2ib0

Failure

Degeneracy

& continue I/O

Fujitsu extended function

9

Page 11: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Third-party Clients Connectivity

FEFS can be mounted on the third-party IA clients

RAID10 RAID6 RAID6

Requirements for the third-party IA clients which mount FEFS server.

InfiniBand Card

Mellanox InfiniBand (QDR) HCA

オペレーティング システム

OS

Red Hat Enterprise Linux 5.6 (Fujitsu support Kernel version)

For more details, please contact us.

PRIMERGY Third-party IA Client

FEFS Copyright 2011 FUJITSU LIMITED 10

MDS OSS

10

Page 12: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED

Contribution to the Lustre Community

Fujitsu will

work with Lustre community,

and merge our Lustre enhancements into the

future version of Lustre 2.x community release.

11

Page 13: FEFS: Scalable Cluster File System · FEFS† is scalable parallel file system based on Lustre. High Performance & High Scalability Scalable I/O performance (~1TB/s) & capacity (~8EB)

Copyright 2011 FUJITSU LIMITED