Christian Mohrbacher · [email protected]
TRANSCRIPT
Introduction
Overview of FhGFS
Benchmarks
The Fraunhofer Gesellschaft (FhG)
Fraunhofer is based in Germany
Largest organization for applied research in Europe
Annual research volume of 1.6 billion euros
17,000 employees
~ 60 Fraunhofer institutes with different business fields
[Map: Fraunhofer institute locations across Germany]
The Fraunhofer ITWM
Institute for Industrial Mathematics
Located in Kaiserslautern, Germany
Staff: ~ 150 employees + ~ 70 PhD students
ITWM’s Competence Center HPC
FhGFS
Photorealistic RT rendering
Interactive seismic imaging
Green IT
Smart Grids
Programming models / tools
Research
Introduction
Overview of FhGFS
Benchmarks
FhGFS - Overview
Maximum Scalability
Flexibility
Easy to use
Free to use
Support by Fraunhofer
http://www.fhgfs.com
FhGFS - Key concepts (1)
Maximum Scalability
Distributed file contents & metadata (see the sketch below)
Designed and optimized for HPC from the start
Native Infiniband / RDMA
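
As a rough illustration of what distributed metadata means: directory entries can be assigned to different metadata servers, e.g. by hashing, so metadata load grows with the number of servers. This is a conceptual sketch only, not FhGFS's actual placement algorithm, and the server names are made up.

    import hashlib

    # Illustrative only: assumed metadata server names, not a real FhGFS setup.
    METADATA_SERVERS = ["mds01", "mds02", "mds03", "mds04"]

    def metadata_server_for(path):
        """Pick a metadata server for a path by hashing (conceptual sketch)."""
        digest = hashlib.md5(path.encode()).digest()
        return METADATA_SERVERS[digest[0] % len(METADATA_SERVERS)]

    if __name__ == "__main__":
        for p in ["/scratch/jobA/input.dat", "/scratch/jobA/out/result.h5"]:
            print(p, "->", metadata_server_for(p))

FhGFS distributes metadata on a per-directory basis across the available metadata servers; the sketch only mimics that scaling property.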
FhGFS - Key concepts (2)
Flexibility
Add clients and servers without downtime
Multiple servers on the same machine
Client and servers can run on the same machine
Servers run on top of local FS
On-the-fly storage init => suitable for temporary “per-job” PFS
Flexible striping (per-file/per-directory; see the sketch below)
Multiple networks with dynamic failover
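
To make "flexible striping" concrete: a stripe pattern is essentially a chunk size plus a set of storage targets, and it can be set per file or per directory. Below is a minimal sketch of how such a pattern maps a file offset to a target; the chunk size and target names are assumptions, not FhGFS defaults.

    # Conceptual round-robin striping, not FhGFS's implementation:
    # a stripe pattern = chunk size + list of storage targets.
    CHUNK_SIZE = 512 * 1024                              # assumed 512 KiB chunks
    TARGETS = ["stor01", "stor02", "stor03", "stor04"]   # assumed storage targets

    def locate(offset):
        """Return (storage target, chunk index on that target) for a file offset."""
        chunk = offset // CHUNK_SIZE
        return TARGETS[chunk % len(TARGETS)], chunk // len(TARGETS)

    if __name__ == "__main__":
        for off in (0, CHUNK_SIZE, 3 * CHUNK_SIZE, 10 * CHUNK_SIZE + 17):
            print("offset", off, "->", locate(off))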
FhGFS - Key concepts (3)
Easy to use
Servers: userspace
Client: Kernel module w/o kernel patches
Graphical system administration & monitoring
Simple setup/startup mechanism
No specific Linux distribution required
No special hardware requirements
Partners / Vendors
Customers (Examples)
2 servers, 2 clients: 8 TB, 800 MB/s
12 servers, 900 clients: 1 PB, 20 GB/s
12 servers, 1200 clients: 300 TB, 6 GB/s
5 servers, 100 clients: 200 TB, 5 GB/s
Current development
Integrated High Availability
No shared storage needed
Flexible mirroring
RAID10 available in 2012.10-beta1
Internal speed improvements
e.g. metadata format (available in 2012.10-beta1)
HSM integration
Grau Data and Fraunhofer collaborate
Providing a fast archiving solution
Built-in benchmarking tools (available in 2012.10-beta1)
Quotas
Introduction
Overview of FhGFS
Benchmarks
File Statistics
The DICE PFS comparison project surveyed HPC data center representatives to identify the most important metrics 1)
Multi-stream performance
Large block I/O
Metadata performance
File size statistics by Johannes Gutenberg University Mainz 2)
Large files (>100 GB) are common
Very small files (<= 4k) are the most common
90% of files account for only ~10% of disk capacity
1) PFS Survey Report; http://www.avetec.org/appliedcomputing/dice/projects/pfs/docs/PFS_Survey_Report_Mar2011.pdf
2) A Study on Data Deduplication in HPC Storage Systems; Dirk Meister et al.; Johannes Gutenberg Universität; SC12
Benchmarks – server hardware
20 servers for metadata and storage
2x Intel Xeon X5660 @ 2.8 GHz
48 GB RAM
4x Intel 510 Series SSD (RAID 0), Ext4
QDR Infiniband
Scientific Linux 6.3; Kernel 2.6.32-279
FhGFS 2012.10-beta1
Streaming Throughput
[Chart: Sequential read/write throughput (MB/s) vs. number of storage servers; up to 20 servers, 160 client procs]
Streaming Throughput (2)
Single node local performance
Write: 1332 MB/s
Read: 1317 MB/s
20 nodes (theoretical)
Write: 26640 MB/s
Read: 26340 MB/s
FhGFS (measured; see the efficiency check below)
Write: 26247 MB/s (98.5%)
Read: 24789 MB/s (94.1%)
[Chart: FhGFS sequential read/write throughput (MB/s) vs. number of storage servers, measured against the theoretical maximum]
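
The percentages above are simply the measured aggregate divided by the theoretical maximum (20 x the single-node numbers); a quick check using the values from this slide:

    # Scaling efficiency check using the numbers on this slide.
    single_write, single_read = 1332.0, 1317.0                    # MB/s per node (local)
    theo_write, theo_read = 20 * single_write, 20 * single_read   # 26640 / 26340 MB/s
    meas_write, meas_read = 26247.0, 24789.0                      # MB/s measured with FhGFS

    print(f"write efficiency: {meas_write / theo_write:.1%}")   # ~98.5%
    print(f"read efficiency:  {meas_read / theo_read:.1%}")     # ~94.1%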
Streaming Throughput (3)
[Chart: Sequential read/write throughput (MB/s) vs. number of client processes; 20 servers, up to 768 client procs; peak values around 25409 and 26649 MB/s]
Shared file access (1)
[Chart: Sequential read/write throughput (MB/s) on 1 shared file, 600k block size, vs. number of servers; up to 20 servers, 192 client procs (access pattern sketched below)]
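
For context, "shared file access" here means many client processes writing disjoint, interleaved blocks of a single file. Below is a minimal single-machine sketch of that access pattern; the file name, process count and block count are assumptions, and the real benchmark ran up to 192 client processes on separate nodes against FhGFS.

    import os

    BLOCK = 600 * 1024       # 600k block size, as in the benchmark
    NPROCS = 4               # assumed; the benchmark used up to 192 client procs
    BLOCKS_PER_PROC = 8      # assumed small value to keep the sketch quick

    def write_strided(path, rank):
        """Each 'rank' writes its own interleaved blocks of the shared file."""
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        try:
            for i in range(BLOCKS_PER_PROC):
                offset = (i * NPROCS + rank) * BLOCK
                os.pwrite(fd, bytes(BLOCK), offset)
        finally:
            os.close(fd)

    if __name__ == "__main__":
        for rank in range(NPROCS):   # sequential here; concurrent in the benchmark
            write_strided("shared_testfile.dat", rank)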
Shared file access (2)
[Chart: Sequential write throughput (MB/s) on 1 shared file vs. number of client processes; 20 servers, up to 768 client procs]
IOPS
[Chart: Random 4k write IOPS vs. number of storage servers; up to 20 servers, 160 client procs; from 109992 to 1126963 IOPS (workload sketched below)]
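
The IOPS test issues small random writes. A minimal sketch of that workload follows; the file name, file size and operation count are assumptions, and a local run like this is not comparable to the 160-client cluster measurement.

    import os, random, time

    FILE_SIZE = 64 * 1024 * 1024    # assumed 64 MiB test file
    IO_SIZE = 4096                  # 4k writes, as in the benchmark
    OPS = 2000                      # assumed operation count for the sketch

    fd = os.open("iops_testfile.dat", os.O_RDWR | os.O_CREAT, 0o644)
    os.ftruncate(fd, FILE_SIZE)
    buf = os.urandom(IO_SIZE)

    start = time.time()
    for _ in range(OPS):
        offset = random.randrange(FILE_SIZE // IO_SIZE) * IO_SIZE
        os.pwrite(fd, buf, offset)
    os.fsync(fd)
    os.close(fd)

    print(f"{OPS / (time.time() - start):.0f} local IOPS (not comparable to the cluster numbers)")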
Metadata performance
[Charts: File create and stat rates vs. number of metadata servers; up to 20 MDS, up to 640 client procs (32 * #MDS); creates/sec from 34693 to 539724, stats/sec from 93007 to 1381339]
Metadata performance (2)
> 500,000 file creates per second
Creation of 1,000,000,000 files: ~ 33 minutes
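
The 33-minute figure follows directly from the measured create rate; a quick check:

    # 1,000,000,000 files at "> 500,000 creates per second"
    creates_per_sec = 500_000
    seconds = 1_000_000_000 / creates_per_sec   # 2000 s
    print(f"{seconds / 60:.1f} minutes")        # ~33.3 minutes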
Questions?
http://www.fhgfs.com
http://wiki.fhgfs.com
Fraunhofer Booth #643