State-of-the-art Storage Solutions
...and more than that.
Fabrizio Magugliani, EMEA HPC Business Development and Sales
fabrizio.magugliani@e4company.com
European AFS Workshop 2009
September 28th-30th, 2009, Department of Computer Science and Automation, University Roma Tre
What does E4 Computer Engineering stand for?
E4 = Engineering 4 (for) Computing
E4 builds solutions that meet the users' requirements:
Workstations (fluid dynamics, video editing, …)
Servers (firewall, computing node, scientific apps, …)
Storage (from small databases up to big-data requirements)
SAN – Storage Area Network
HPC clusters, GPU clusters, interconnects
Products and Services
Wide – Reliable – Advanced
System config and optimization
www.e4company.com luca.oliva@e4company.com
Customer References
Choosing the right computing node

Non-Uniform Memory Access (NUMA) architecture – AMD:
• Form factor: 1U to 7U
• Sockets: 1, 2, 4, 8
• Cores: 4, 6
• Memory size
• Accelerators (GPUs)

Uniform Memory Access (UMA) architecture – Intel:
• Form factor: 1U to 7U
• Sockets: 1, 2, 4
• Cores: 4, 6
• Memory size
• Accelerators (GPUs)

Form factors: workstation (graphics), rack-mount server, blade. On NUMA nodes, process and memory placement affect performance; see the sketch below.
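On the NUMA option, each socket owns part of the memory, so where a process runs and where its memory lives matters. A minimal sketch of explicit placement, assuming a Linux node with libnuma installed (node 0 is an arbitrary example):

```c
/* Minimal sketch, assuming Linux with libnuma (compile with -lnuma).
 * Pins the process to one NUMA node and allocates memory there, so a
 * multi-socket Opteron node avoids remote-memory accesses. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }
    int node = 0;
    numa_run_on_node(node);                  /* run on node 0's CPUs only */
    size_t sz = 64UL << 20;                  /* 64 MB */
    char *buf = numa_alloc_onnode(sz, node); /* physically placed on node 0 */
    if (!buf) return 1;
    memset(buf, 0, sz);                      /* touch pages so they commit locally */
    printf("allocated %zu MB on node %d\n", sz >> 20, node);
    numa_free(buf, sz);
    return 0;
}
```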
Choosing the right accelerator
Choosing and connecting the right accelerator
Choosing the right accelerator: Tesla S1070 architecture

[Block diagram: power supply, thermal management and system monitoring; two PCIe x16 Gen2 switches, each serving two Tesla GPUs with 4 GB GDDR3 DRAM apiece; PCIe x16 Gen2 cables to the host system(s). © NVIDIA Corporation 2008]

Each PCIe x16 Gen2 switch multiplexes the PCIe bus between 2 GPUs, and each 2-GPU subsystem can be connected to a different host.
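Since each two-GPU subsystem reaches its host over a PCIe cable, the host simply sees its share of the S1070 as ordinary CUDA devices. A minimal sketch of enumerating them with the CUDA runtime API (device count and names depend on the installation):

```c
/* Sketch: list the CUDA devices visible to this host (compile with nvcc). */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess || n == 0) {
        fprintf(stderr, "no CUDA devices visible\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        /* an S1070 host typically sees 2 Tesla GPUs with 4 GB each */
        printf("device %d: %s, %.1f GB\n", i, prop.name,
               prop.totalGlobalMem / 1073741824.0);
    }
    return 0;
}
```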
Choosing the right accelerator: performance
800 GFLOPS on 16 GPUs, ~99% scaling (i.e. roughly 99% of the ideal 16x linear speedup)
Choosing the right interconnection technologies
• Gigabit Ethernet: entry level on every solution; ideal for codes with low interprocess-communication requirements
• InfiniBand DDR, 20 + 20 Gb/s, integrable on the motherboard (first InfiniBand cluster: 2005, CASPUR)
• 10 Gb/s Ethernet
• Quadrics
• Myrinet
Interconnect             Latency (us)   Bandwidth (MB/s)   Bisectional bandwidth (MB/s)
1 GbE                    50             112                175
10 GbE                   50             350                500
10 GbE RDMA (Chelsio)    10             875                –
IB DDR (InfiniHost)      2.5            1500               2900
IB QDR (ConnectX)        1.2            3000               5900
Interconnect
• Gigabit Ethernet: ideal for applications with moderate bandwidth requirements among processes
• InfiniBand DDR, 20 + 20 Gb/s, motherboard-based; InfiniPath on an HTX slot, tested at latencies below 2 microseconds
• Myrinet, Quadrics
The right choice depends on message size; a simple time model is sketched below.
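Which interconnect pays off depends on message size: latency dominates small messages, bandwidth large ones. A back-of-the-envelope sketch using the alpha-beta model t(n) = latency + n/bandwidth with the table's figures (a deliberate simplification; real behaviour depends on the software stack and topology):

```c
/* Sketch: estimated transfer time for one message under t(n) = alpha + n/beta,
 * with latency/bandwidth figures taken from the table above. */
#include <stdio.h>

int main(void) {
    const char *name[]   = { "1 GbE", "IB DDR", "IB QDR" };
    const double alpha[] = { 50.0, 2.5, 1.2 };        /* latency, microseconds */
    const double beta[]  = { 112.0, 1500.0, 3000.0 }; /* bandwidth, MB/s */
    const double bytes   = 64.0 * 1024.0;             /* example: 64 KB message */

    for (int i = 0; i < 3; ++i) {
        double t_us = alpha[i] + bytes / (beta[i] * 1e6) * 1e6;
        printf("%-6s: %7.1f us per 64 KB message\n", name[i], t_us);
    }
    return 0;
}
```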
Choosing the right Storage
Storage type                                          Performance
HD section                                            200-400 MB/s
Disk server, ETH                                      300-800 MB/s
SAN, FC                                               up to 1 GB/s
HPC storage, ETH interface                            350-600 MB/s per chassis
HPC storage, FC/IB interface (PB-scale, DataDirect)   6 GB/s
Storage                                     Interface                Performance
Disk subassembly                            RAID controller, PCI-E   200 MB/s
Disk server SATA/SAS                        ETH                      300-800 MB/s
Storage SAS/SATA                            FC, ETH                  up to 1 GB/s
HPC storage, FC/IB i/f (ideal for HPC)      InfiniBand, FC           up to 3 GB/s
HPC storage, Ethernet i/f (ideal for HPC)   ETH                      500 MB/s per chassis

These sequential-throughput figures can be verified with a simple benchmark; see the sketch below.
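The figures above are sequential-throughput numbers, so they are easy to sanity-check with a large-block sequential read. A minimal sketch, assuming a Linux host; the file path is a placeholder, and O_DIRECT bypasses the page cache so the storage, not RAM, is measured:

```c
#define _GNU_SOURCE   /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "/data/testfile"; /* placeholder */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    size_t blk = 1 << 20;                          /* 1 MB blocks */
    void *buf;
    if (posix_memalign(&buf, 4096, blk)) return 1; /* O_DIRECT needs alignment */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    long long total = 0;
    ssize_t r;
    while ((r = read(fd, buf, blk)) > 0) total += r;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.1f MB/s sequential read\n", total / 1e6 / s);
    close(fd);
    free(buf);
    return 0;
}
```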
Storage Server
• High-flexibility, low-power-consumption solution engineered by E4 for high-bandwidth requirements
• COTS-based (2 Intel Nehalem CPUs)
• RAM configurable to the user's requirements (up to 144 GB DDR3)
• Multi-lane SAS/SATA controller
• 48 TB in 4U
• 1 GbE (n links via trunking), 10 GbE, InfiniBand DDR/QDR
• 374 units installed at CERN (Geneva), 70 more across several customers
HPC Storage Systems – DataDirect Networks
• Interface: FC / IB
• Performance: up to 6 GB/s
• 560 TB per storage system
• Ideal areas: real-time data acquisition; simulation; biomedicine, genomics; oil & gas; rich media; finance

HPC Storage Systems – Panasas cluster storage
• Clustered storage system based on the Panasas file system
• Parallel
• Asynchronous
• Object-based
• Snapshots
• Interface: 4x 1 GbE, 1x 10 GbE, IB (router)
• Performance (per shelf): 500-600 MB/s, up to 100s of GB/s (sequential)
• 20 TB per shelf, 200 TB per rack, up to PBs
• SSD (optimal for random I/O)
File Systems
• NFS
• Lustre
• GPFS
• Panasas
• AFS
Lustre, GPFS and Panasas are parallel file systems; a usage sketch follows.
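Lustre, GPFS, and Panasas are parallel file systems: many clients read and write one file concurrently. A minimal sketch of how an HPC code typically exercises them through MPI-IO, each rank writing its own disjoint slice of a shared file (compile with mpicc; the file name is a placeholder):

```c
#include <mpi.h>

#define N 1048576                     /* doubles written per rank (8 MB) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    static double chunk[N];
    for (int i = 0; i < N; ++i) chunk[i] = rank;   /* dummy payload */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",    /* placeholder file name */
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset off = (MPI_Offset)rank * N * sizeof(double);
    /* collective write: the MPI-IO layer can coordinate the ranks'
       accesses with the file system's striping */
    MPI_File_write_at_all(fh, off, chunk, N, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
```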
Storage Area Network
• E4 is a QLogic Signature Partner
• Latest technology
• Based on high-performance Fibre Channel 4 + 4 Gb multipath I/F
• HA
• Failover for mission-critical applications (finance, biomedical, …)
• Oracle RAC
System validation – rigid quality procedure
• Reliability: a basic requirement, guaranteed by E4's production cycle
• Selection of quality components
• Production process cared for in every detail
• Burn-in to prevent infant mortality of components:
• at least 72 h of accelerated stress testing in a hot room (35 °C)
• 24 h individual test of each subsystem
• 48 h simultaneous test of the subsystems
• OS installation to prevent HW/SW incompatibility
Case Histories
Case History – Oracle RAC http://www.oracle.com/global/it/customers/pdf/snapshot_gruppodarco_e4.pdf
Case History – Intel cluster @ Enginsoft
May 2007: Intel Infinicluster
• 96 computing nodes, Intel quad-core 2.66 GHz, 4 TFLOPS
• 1.5 TB RAM
• Interconnect: InfiniBand 4x DDR, 20 Gb/s
• 30 TB FC storage
• Application field: Computer-Aided Engineering
Case History – CERN 1U computing servers
• 1U servers with high computing capacity
• Application field: education, academic research
• Customer: CERN (Geneva), major national computing and research centres
• 2005: 415 nodes, dual Xeon 2.8 GHz, 4.6 TFLOPS
• 2006: 250 nodes, Xeon Woodcrest 3 GHz, 6 TFLOPS, 2 TB RAM
• Systems installed up to July '08: over 3000 units
Case History – AMD cluster @ CASPUR
June 2005: AMD Infinicluster
• 24 computing nodes, dual-core Opteron 2.4 GHz, 460 GFLOPS
• 192 GB RAM
• Interconnect: InfiniBand
• Expanded to 64 nodes: 1.2 TFLOPS, 512 GB RAM

2004: Cluster SLACS
• 24 computing nodes, Opteron, 200 GFLOPS
• 128 GB RAM
• Managed by CASPUR on behalf of SLACS (Sardinian LAboratory for Computational materials Science) and INFM (Istituto Nazionale Fisica della Materia)
Case History – CRS4 cluster, 96 cores
February 2005
• 96 computing nodes, dual-core Opteron, 384 GFLOPS
• 192 GB RAM in total
• Application fields: environmental sciences; renewable energy, fuel cells; bioinformatics
Case History – HPC cluster with Myrinet interconnect, 2005
• 16 computing nodes, dual Intel Xeon 3.2 GHz
• High-speed Myrinet interconnect
• 5 TB SCSI-to-SATA storage
• KVM monitor
• 2 Ethernet switches, 24 ports, layer 3
• Application fields: education, research
• Customer: ICAR-CNR, Palermo
Case History – CNR/ICAR hybrid cluster (CPU + GPU): ALEPH, ICAR-CNR Cosenza
• 12 compute nodes: 96 cores, 24 Intel Nehalem 5520 CPUs; peak: 920 GFLOPS; RAM: 288 GB
• 6 nVIDIA S1070 GPU servers: 24 Tesla GPUs, 5760 single-precision cores, 720 double-precision cores; peak: 24000 GFLOPS (24 TFLOPS)
• 1 front-end node
• 48-port Gigabit Ethernet switch
• 24-port InfiniBand 20 Gb/s switch
Case History – EPFL
E4: The right partner for HPC
Questions?
Feel free to contact me:
Fabrizio Magugliani
fabrizio.magugliani@e4company.com
+39 346 9424605
Thank you!
E4 Computer Engineering SpA
Via Martiri della Libertà 66
42019 - Scandiano (RE), Italy
www.e4company.com
Switchboard: +39.0522.991811