
HPC Case Study

Customers of Large-scale HPC Systems


Customer | Type | No. of CPUs | Peak Perf.
RIKEN (Kobe AICS) | The K computer | 88,128 CPUs | 11.28 PFlops
Australian National University (NCI)* | x86 Cluster (CX400) | 3,592 CPUs | 1.2 PFlops
University of Tokyo | FX10 | 4,800 CPUs | 1.1 PFlops
Kyushu University** | x86 Cluster (CX400), FX10 | 3,720 CPUs | 510 TFlops + 182 TFlops
HPC Wales, UK | x86 Cluster | > 2,000 CPUs | > 300 TFlops
Japan Atomic Energy Agency | x86 Cluster, FX1, SMP | > 4,568 CPUs | 214 TFlops
Institute for Molecular Science | x86 Cluster (RX300), FX10 | > 420 CPUs | > 140 TFlops
Japan Aerospace Exploration Agency | FX1, SMP | > 3,392 CPUs | > 135 TFlops
RIKEN (Wako Lab. RICC) | x86 Cluster (RX200) | > 2,048 CPUs | 108 TFlops
NAGOYA University | x86 Cluster (HX600), FX1, SMP | 1,504 CPUs | 60 TFlops
A*STAR, Singapore | x86 Cluster (BX900) | 900 CPUs | > 45 TFlops
A Manufacturer | x86 Cluster | > 2,600 CPUs | > 77 TFlops
B Manufacturer | x86 Cluster | > 2,000 CPUs | > 38 TFlops

* System operation will start in early 2013.
** System operation will start in July 2012.

Type definitions: FX10 = PRIMEHPC FX10; x86 Cluster = clusters based on PRIMERGY x86 servers; SMP = SPARC Enterprise SMP servers.
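As a quick cross-check (not part of the original slide), the per-CPU peak implied by a few rows of the table can be computed directly from the totals; a minimal Python sketch:

```python
# Per-CPU peak implied by the table above (illustrative cross-check only;
# the slide does not state per-CPU figures).
systems = {
    # name: (peak performance in TFlops, number of CPUs)
    "RIKEN K computer":         (11280.0, 88128),
    "University of Tokyo FX10": (1100.0, 4800),
    "Kyushu University":        (510.0 + 182.0, 3720),
}

for name, (peak_tflops, cpus) in systems.items():
    gflops_per_cpu = peak_tflops * 1000 / cpus
    print(f"{name}: ~{gflops_per_cpu:.0f} GFlops per CPU")
```

The roughly 128 GFlops per CPU this gives for the K computer is consistent with the nominal peak of its SPARC64 VIIIfx processors.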

The University of Tokyo

Key requirements: Growing number and diversity of users

Software compatibility with the K computer

1.4 MW power ceiling

System overview: PRIMEHPC FX10 (4,800 nodes, 50 racks)

Peak performance: 1.13 petaflops

Linpack performance: 1.04 petaflops (91.8% efficiency; a quick check of this ratio follows below)

Focus areas: earth science, astrophysics, seismology, weather modeling, materials science, energy, biology, hydrodynamics, solid-state physics, …


#18 on TOP500
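Linpack efficiency is just the measured Linpack result divided by theoretical peak; a quick check of the quoted figure using the rounded numbers from this slide (illustrative sketch):

```python
# Linpack efficiency = measured Linpack performance (Rmax) / theoretical peak (Rpeak).
rpeak_pflops = 1.13  # theoretical peak, as quoted on the slide
rmax_pflops = 1.04   # Linpack result, as quoted on the slide

efficiency = rmax_pflops / rpeak_pflops
# Prints ~92.0%; the quoted 91.8% comes from the unrounded TOP500 figures.
print(f"Linpack efficiency: {efficiency:.1%}")
```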


The University of Tokyo – System Overview


Compute nodes and interactive nodes: PRIMEHPC FX10 x 50 racks (4,800 compute nodes + 300 I/O nodes)
- Peak performance: 1.13 petaflops
- Memory capacity: 150 TB
- Interconnect: 6D mesh/torus "Tofu" (see the sketch after this overview)

Management servers: PRIMERGY RX200 S6 x 16 (job management, operation management, authentication)

Local file system: 1.1 PB (RAID-5)
- PRIMERGY RX300 S6 x 2 (MDS)
- ETERNUS DX80 S2 x 150 (OST)

Shared file system: 2.1 PB (RAID-6)
- PRIMERGY RX300 S6 x 8 (MDS)
- PRIMERGY RX300 S6 x 40 (OSS)
- ETERNUS DX80 S2 x 4 (MDT)
- ETERNUS DX410 S2 x 80 (OST)

Log-in nodes: PRIMERGY RX300 S6 x 8

Networks: InfiniBand, Ethernet and Fibre Channel; an external connection router links the system to the campus LAN, an external file system, and end users.
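For readers unfamiliar with the "Tofu" interconnect mentioned above, a 6D mesh/torus simply gives every node a six-component coordinate with two neighbours per axis. A minimal sketch of torus-style neighbour addressing, with hypothetical dimension sizes (a generic illustration, not Fujitsu's actual Tofu topology or routing):

```python
# Generic 6D torus neighbour calculation (illustration only; the axis
# extents below are hypothetical and do not describe the real Tofu network).
DIMS = (4, 4, 4, 3, 3, 3)  # hypothetical extent of each of the six axes

def neighbours(coord):
    """Return the 12 neighbours of a node: one step in each direction along
    each axis, with wrap-around at the ends (that is the 'torus' part)."""
    result = []
    for axis, size in enumerate(DIMS):
        for step in (-1, +1):
            nxt = list(coord)
            nxt[axis] = (nxt[axis] + step) % size  # wrap around the torus
            result.append(tuple(nxt))
    return result

print(len(neighbours((0, 0, 0, 0, 0, 0))))  # 12 links per node in this sketch
```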


Kyushu University


Features: Hybrid system combining a Fujitsu SPARC64 supercomputer and an x86 cluster

Software compatibility with the K computer

Supercomputer system: PRIMEHPC FX10 (768 nodes, SPARC64 IXfx)

Peak performance: 181.6 teraflops

High-performance server system: PRIMERGY CX400 (2,952 CPUs, new Intel Xeon E5)

Peak performance: 510.1 teraflops

Total peak performance: 691.7 teraflops
Operations beginning: July 2012
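The FX10 figure above can be reproduced from node count and per-node peak. A minimal sketch, assuming the publicly quoted SPARC64 IXfx parameters (16 cores, 1.848 GHz, 8 double-precision flops per core per cycle), which are not stated on the slide itself:

```python
# Peak performance = nodes x cores per node x clock x flops per cycle.
# The SPARC64 IXfx parameters below are assumed from public specs, not from this slide.
nodes = 768
cores_per_node = 16
clock_ghz = 1.848
flops_per_core_per_cycle = 8  # double-precision flops (assumed)

gflops_per_node = cores_per_node * clock_ghz * flops_per_core_per_cycle  # ~236.5 GFlops
peak_tflops = nodes * gflops_per_node / 1000
# Prints ~181.7, matching the quoted 181.6 TFlops up to rounding of the per-node figure.
print(f"{peak_tflops:.1f} TFlops")
```

The same per-node figure also reproduces the 22.7 TFlops quoted later for NCI's 96-node FX10 rack (96 x 236.5 GFlops ≈ 22.7 TFlops).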


Kyushu University – System Overview

Supercomputer system: PRIMEHPC FX10 (768 nodes)
- 181.6 TFLOPS, 24 TB memory
- Local file system (FEFS): ETERNUS DX80 S2, 345.6 TB
- Operation starts July 2012

High-performance server system: PRIMERGY CX400 (1,476 nodes)
- 510.1 TFLOPS, 184.5 TB memory
- Operation starts September 2012

Shared file system (FEFS): 4.0 PB + 0.2 PB
- File servers: PRIMERGY RX300 S7
- ETERNUS DX80 S2 storage

Networks: Gigabit Ethernet switch, InfiniBand switch, LAN


NCI-NF (Australia's national research computing service)

Key requirements: Improve computational modeling capability in the research fields below:
- Climate change
- Ocean and marine
- Earth system science
- National water management research

Very high energy efficiency: PUE well under 1.20 (illustrated below)

System overview: PRIMERGY CX400 (including CX250 S1): 3,592 nodes (50 racks)

Peak performance: 1.2 petaflops

PRIMEHPC FX10: 96 nodes (1 rack)
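PUE (power usage effectiveness) is total facility power divided by the power drawn by the IT equipment alone, so "well under 1.20" means less than 20% overhead for cooling and power distribution. A tiny illustrative calculation with hypothetical power figures (not from the slide):

```python
# PUE = total facility power / IT equipment power.
# The power figures are hypothetical, chosen only to illustrate a PUE below 1.20.
it_power_kw = 1000.0        # hypothetical IT load
facility_power_kw = 1150.0  # hypothetical total, including cooling and distribution

pue = facility_power_kw / it_power_kw
print(f"PUE = {pue:.2f}")  # 1.15, i.e. 15% overhead on top of the IT load
```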


NCI-NF – System Overview



Compute nodes: PRIMERGY CX400 x 50 racks (CX250 S1: 3,592 nodes, 57,472 cores)
- Peak performance: 1.2 petaflops
- Memory capacity: 150 TB
- Interconnect: full bisection bandwidth (FDR InfiniBand)

Management servers: management and authentication servers, PRIMERGY RX300 S7 x 7

Global file system: 12.6 PB
- PRIMERGY RX300 S7 x 6 (MDS)
- PRIMERGY RX300 S7 x 30 (OSS)
- DDN EF3015 x 3 (MDT)
- DDN SFA12000 x 5 (OST)

Login nodes: login and data mover servers, PRIMERGY RX300 S7 x 13

Collaboration nodes: PRIMEHPC FX10 (96 compute nodes)
- Peak performance: 22.7 TFlops
- Memory capacity: 3 TB

Networks: InfiniBand, Ethernet and Fibre Channel; an external connection router links the system to the campus LAN and end users.


HPC Wales – A Grid of HPC Excellence

Motivation and background:
- Position Wales at the forefront of supercomputing
- Promotion of research, technology and skills
- Improvement of economic development: creation of 400+ quality jobs and 10+ new businesses

Implementation and rollout:
- Distributed HPC clusters among 15 academic sites, with central hubs and tier 1 and tier 2 sites
- Portal for transparent, easy use of resources
- Rollout completed by Q1 2012


HPC Wales – Solution


Solution design:
- User-focused solution to access distributed HPC systems from a desktop browser
- Multiple components integrated into a consistent environment with single sign-on
- Data accessible across the entire infrastructure, with automated movement driven by workflow
- Collaborative sharing of information and resources

Performance & technology:
- More than 1,400 PRIMERGY BX922 S2 nodes (Intel Xeon X5650 and X5680), with a roadmap for upgrades
- 190 TFlops aggregated peak performance
- InfiniBand, 10/1 Gb Ethernet, FCS
- ETERNUS DX online SAN (home file system)
- Parallel file system (up to 10 GB/s), DDN Lustre
- Backup & archiving: Symantec, Quantum


A*STAR

A*STAR: Singapore's lead government agency
- Fosters world-class scientific research in biomedical sciences and physical sciences & engineering
- Spurs growth in key economic clusters

Fujitsu and A*STAR (IHPC): R&D partnership to jointly develop applications and technologies for the use of next-generation supercomputers in computational fluid dynamics and materials science


PRIMERGY BX920 S2 at A*STAR: 450 server blades (3,888 cores)

45 teraflops peak performance

91% Linpack efficiency


Fujitsu HPC from workplace to #1 in TOP500


PRIMERGY x86 Clusters

PRIMEHPC FX10 supercomputers

CELSIUS workstations
