SGI13 - Vergroenen ván ICT - Duurzame supercomputers - Walter Lioen (SURFsara)


Page 1: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Symposium Groene ICT en Duurzame ontwikkeling: Meters maken in het Hoger Onderwijs (Symposium on Green ICT and Sustainable Development: Making Headway in Higher Education)

Sustainable Supercomputers

Walter Lioen

Group Leader, Supercomputing

Page 2: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Supercomputing and Sustainability

January 31, 2013 Sustainable Supercomputing – Walter Lioen 2

Outline

• SURFsara
• Supercomputing
• Performance
  - TOP500
  - Green500
• Requirements
• Sustainability
  - Investment vs. Total Cost of Ownership
  - Energy efficiency:
    - Application throughput / TCO
    - Warm water cooling
    - On-demand growth
    - Energy aware scheduling

Page 3: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

About SURFsara

• SURFsara offers an integrated ICT research infrastructure and provides services in the areas of computing, data storage, visualization, networking, cloud and e-Science.

• SARA was founded in 1971 as an Amsterdam computing center by the two Amsterdam universities (UvA and VU) and what is now CWI.

• Independent as of 1995.

• Founded Vancis in 2008, offering ICT services and ICT products to enterprises, universities, and educational and healthcare institutions.

• As of 1 January 2013, SARA (from then on SURFsara) forms part of the SURF Foundation.

• First supercomputer in the Netherlands in 1984 (Control Data Cyber 205). SARA has hosted the national supercomputer(s) ever since.

Page 4: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

What is a Supercomputer?

• A supercomputer is a computer at the frontline of current processing capacity, particularly speed of calculation

• Consequently, the specification of a supercomputer is constantly changing

• Rule of thumb: a supercomputer is at least 1,000 to 10,000, and up to 100,000, times faster than an average PC

Page 5: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Why supercomputing?

Large scale scientific computing

Simulation of processes that are otherwise
• Impossible in practice
• Too expensive
• Too dangerous
• Too lengthy

Examples
• Astronomy
  - How did the universe begin?
  - How do stars form and evolve?
• Weather Prediction, Climatology
• Nuclear Physics
• Aerodynamics (cars, planes, rockets)
• Biology (proteins, DNA, drugs)
• Medical sciences (bone formation, blood flow)

Page 6: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Top500: HPL benchmark

• HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers

• For Sequoia (the current nr. 2): n = 12,681,215

• Computational kernel: DGEMM (matrix multiply)

• Extremely efficient on all processors (in cache)

• Limiting factors:
  - Speed of interconnect
  - Speed to (local accelerator) memory (e.g. for GPU)

• However, far more important: application speed

• “In Amsterdam a Ferrari is useless (speed-wise)”
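To give a sense of scale for the Sequoia problem size quoted above, the following sketch estimates the memory needed for the dense matrix and the leading-order operation count. It assumes 8 bytes per double-precision element and the textbook (2/3)n^3 leading term for LU factorization; the function names are illustrative and not part of HPL itself.

```python
N_SEQUOIA = 12_681_215  # problem size n quoted on the slide


def matrix_bytes(n: int) -> int:
    """Bytes needed to store a dense n x n matrix of 64-bit doubles."""
    return 8 * n * n


def lu_flops(n: int) -> float:
    """Leading-order floating-point operation count for LU factorization."""
    return (2.0 / 3.0) * n ** 3


print(f"matrix size : {matrix_bytes(N_SEQUOIA) / 1e15:.2f} PB")
print(f"operations  : {lu_flops(N_SEQUOIA):.3e} flop")
```

Under these assumptions the matrix alone occupies more than a petabyte, which is why the benchmark only makes sense on distributed-memory machines.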

Page 7: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Green500: TOP500 MFlop/s / W

November 2012

• Positions 1 – 4:
  - commodity processors with coprocessors, or
  - commodity processors with graphics processing units (GPUs)
  - TOP500 #1 (Titan) is Green500 #3

• Positions 5 – 29:
  - Blue Gene/Q

Page 8: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

SURFsara National Supercomputing History

Year  Machine                          Rpeak (GFlop/s)  Power (kW)  GFlop/s per kW
1984  CDC Cyber 205 1-pipe                         0.1         250          0.0004
1988  CDC Cyber 205 2-pipe                         0.2         250          0.0008
1991  Cray Y-MP/4128                              1.33         200          0.0067
1994  Cray C98/4256                                  4         300          0.0133
1997  Cray C916/121024                              12         500          0.024
2000  SGI Origin 3800                            1,024         300          3.4
2004  SGI Origin 3800 + SGI Altix 3700           3,200         500          6.4
2007  IBM p575 Power5+                          14,592         375          40
2008  IBM p575 Power6                           62,566         540          116
2009  IBM p575 Power6                           64,973         560          116
2013  Bull bullx DLC                           250,000         260          962
2014  Bull bullx DLC                        >1,000,000        >520          1923
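The last column of the table is simply peak performance divided by power. A minimal sketch that recomputes it for a few rows and derives the overall efficiency gain between 1984 and 2013 (the row selection and variable names are illustrative; the 2014 row is left out because it only gives lower bounds):

```python
# (year, machine, Rpeak in GFlop/s, power in kW), copied from the table
HISTORY = [
    (1984, "CDC Cyber 205 1-pipe", 0.1, 250),
    (2000, "SGI Origin 3800", 1_024, 300),
    (2009, "IBM p575 Power6", 64_973, 560),
    (2013, "Bull bullx DLC", 250_000, 260),
]


def gflops_per_kw(rpeak_gflops: float, power_kw: float) -> float:
    """Energy efficiency as used in the table: GFlop/s per kW."""
    return rpeak_gflops / power_kw


for year, machine, rpeak, kw in HISTORY:
    print(f"{year}  {machine:<22} {gflops_per_kw(rpeak, kw):12.4f} GFlop/s per kW")

first = gflops_per_kw(HISTORY[0][2], HISTORY[0][3])
last = gflops_per_kw(HISTORY[-1][2], HISTORY[-1][3])
improvement = last / first
print(f"efficiency gain 1984 -> 2013: about {improvement:,.0f}x")
```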

Page 9: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Top500 – iPad 2 performance

• An A5 processor core of an iPad 2 is as fast as a four processor Cray 2 supercomputer (1.951 GFlop/s)

• In 1985 an eight processor Cray 2 was the fastest supercomputer in the world

• The iPad 2 would still have been listed in the Top500 of 1994

Page 10: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Moore’s Law (1965)

• The number of transistors on an integrated circuit doubles every 2 years

• Because of faster transistors, the speed doubles every 18 months

• The clock speed stopped doubling a couple of years ago

• Nowadays the number of cores doubles

• Moore noted that if car manufacturers had something like this, cars would get 100,000 miles to the gallon and it would be cheaper to buy a Rolls Royce than park it. (Cars would also be only half an inch long.)
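The doubling rules above translate into a growth factor of 2^(t/T) after t years with doubling period T. A short sketch (the function name is illustrative) that applies this to the 29 years between the 1984 and 2013 systems in the SURFsara history table, where Rpeak went from 0.1 to 250,000 GFlop/s:

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Growth after `years` if a quantity doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Speed doubling every 18 months, over 1984 -> 2013
predicted = growth_factor(29, 1.5)
# Observed Rpeak growth according to the history table
observed = 250_000 / 0.1

print(f"predicted by 18-month doubling: {predicted:,.0f}x")
print(f"observed Rpeak growth         : {observed:,.0f}x")
```

The observed growth is larger than single-chip doubling alone predicts, which is consistent with supercomputers also growing in the number of processors.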

Page 11: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Governance of the procurement

Selection committee:
• dr. ir. Anwar Osseyran (director SARA)
• prof. dr. Wim Liebrand (director SURF)
• prof. dr. Jacob de Vlieg (director NLeSC)
• prof. dr. ir. Henk Dijkstra (chairman NWO-WGS)

Technical advisory committee (SARA):
• Walter Lioen (system architecture, applications & benchmarks)
• Huub Stoffers (system architecture, storage, system management)
• Aad van der Steen (system architecture, applications & benchmarks)
• Mark van de Sanden (system architecture and storage)
• Peter Michielse (general, phasing and vice-chair)
• Axel Berg (general, datacenter and chair)

Page 12: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Extensive requirements analysis

• Interviews with top 25 users of Huygens (mid 2011)

• Workshop on grand challenge experiences (April 29, 2011)

• Detailed analysis of Huygens resource usage (mid 2011 – Q1 2012)
  - Which user applications (2008 – 2012)
  - Scaling of applications (current use and scaling potential)
  - Actual memory usage
  - I/O profiles

• HPC market and technology assessment

Page 13: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

From requirements analysis to technical requirements for the procurement

[Diagram relating: User requirements, System statistics, HPC market analysis, Technical requirements, Application benchmark suite]

Page 14: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Most important technical requirements (1/2)

Compute & processor architecture

• General purpose capability system

• Large number of thin compute nodes:
  - at least 16 cores
  - at least 1 GB memory/core, 2 GB highly preferred

• Small number of fat compute nodes:
  - at least 32 cores
  - at least 4 GB memory/core, 8 GB highly preferred

Concept of thin node and fat node islands:

• Non-blocking low-latency interconnect within thin node islands (at least 4,096 cores) and fat node island (at least 1,024 cores)

• Interconnect bandwidth among islands must not be pruned by more than a factor of the order of 4:1

Page 15: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Most important technical requirements (2/2)

Accelerators

• At first only if the application benchmark shows real benefit

• Option to add accelerators during the course of the contract

I/O

• I/O bandwidth to scratch: at least 0.15 GB/TFlop/s

• Disk space for scratch/project: at least 5 TB/TFlop/s

Energy and cooling efficiency

• Costs for power and cooling are part of the Total Cost of Ownership (TCO) equation; the vendor is to optimize power-related costs vs. investment costs
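Because the I/O requirements are expressed per TFlop/s of peak performance, they scale with the size of the system. A minimal sketch of that scaling follows. The two ratios are taken from the slide, while the example system sizes and the reading of the bandwidth ratio as GB/s per TFlop/s are assumptions made for illustration.

```python
# Ratios as stated in the technical requirements
MIN_SCRATCH_BW_PER_TFLOPS = 0.15  # GB per TFlop/s (assumed here to mean GB/s)
MIN_DISK_TB_PER_TFLOPS = 5.0      # TB per TFlop/s


def minimum_io(peak_tflops: float) -> tuple[float, float]:
    """Return (minimum scratch bandwidth, minimum disk space in TB)."""
    return (MIN_SCRATCH_BW_PER_TFLOPS * peak_tflops,
            MIN_DISK_TB_PER_TFLOPS * peak_tflops)


for peak in (100.0, 270.0, 1000.0):  # illustrative system sizes in TFlop/s
    bandwidth, disk_tb = minimum_io(peak)
    print(f"{peak:7.0f} TFlop/s -> bandwidth >= {bandwidth:6.1f}, disk >= {disk_tb:7.0f} TB")
```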


Page 16: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Application Benchmark Suite

• Application benchmark codes selected based on use, spread across scientific areas, scaling (potential)

• These 7 codes represent 50% of the work load on Huygens (2008 – 2012)

• Final application benchmark set selected in consultation with NWO-WGS

Benchmark code  Scientific area   Scaling (MPI tasks)  Weight
ADF             Quantum chemistry 384                  10%
GROMACS         MD                2048, 1024, 4096     20%
POP             Ocean circulation 1280, 640, 2560      15%
SPARKLE         CFD               1024                 15%
SPO-DVR         Molecular QD      512, 256, 1024       10%
SUSHI           Cosmology         2048                 15%
VASP            ab-initio QM-MD   128                  15%
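The slides do not say how the per-code results were combined into one score. As one plausible illustration only, the sketch below forms a weighted arithmetic mean of per-code speedups relative to a reference system, using the weights from the table. The speedup values and the choice of arithmetic mean are assumptions.

```python
# Weights in percent, from the benchmark table
WEIGHTS = {
    "ADF": 10, "GROMACS": 20, "POP": 15, "SPARKLE": 15,
    "SPO-DVR": 10, "SUSHI": 15, "VASP": 15,
}


def weighted_score(speedups: dict[str, float]) -> float:
    """Weighted mean speedup; an assumed scoring rule, not the official one."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[code] * speedups[code] for code in WEIGHTS) / total_weight


# Hypothetical speedups of a candidate system over the reference system
example = {"ADF": 3.0, "GROMACS": 4.0, "POP": 3.5, "SPARKLE": 3.0,
           "SPO-DVR": 2.5, "SUSHI": 4.0, "VASP": 3.0}
print(f"weights sum to {sum(WEIGHTS.values())}%")
print(f"weighted score: {weighted_score(example):.3f}")
```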


Page 17: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Energy and cooling efficiency

Costs and sustainability are important: overall application performance/Watt

• Energy efficiency of the supercomputer system
  - Energy use under full load
  - Energy use when idle
  - Average energy use of the running system

• Efficiency of cooling the supercomputer
  - Air cooling: efficiency factor 1.6
  - Water cooling (< 30 ºC): efficiency factor 1.4
  - Warm water cooling (> 30 ºC): efficiency factor 1.2

• Advantage of warm water cooling over air cooling and ‘cold’ water cooling:
  - when the inlet temperature of the water is 30 ºC or higher, we can assume free cooling all year
  - in Amsterdam, the maximum temperature is above 30 ºC on only 0.9% of days per year
  - All thin compute nodes of the new Bull system are Direct Liquid Cooled with an inlet temperature of 35 ºC

• Energy efficiency when using the supercomputing system
  - The CPU frequency is no longer fixed
  - Optimization of CPU frequency per application becomes possible; energy/application-aware scheduling technologies become possible
  - Evolution towards an energy budget instead of a CPU time budget for users
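The efficiency factors above can be read as multipliers on the power drawn by the computer itself. A small sketch under that assumption, using the 260 kW figure from the history table as an illustrative IT load (whether that figure includes cooling is not stated in the slides):

```python
HOURS_PER_YEAR = 8760

# Efficiency factors from the slide: total power = IT power x factor
COOLING_FACTOR = {"air": 1.6, "cold water": 1.4, "warm water": 1.2}


def facility_kw(it_kw: float, method: str) -> float:
    """Total power including cooling overhead for a given cooling method."""
    return it_kw * COOLING_FACTOR[method]


def annual_saving_kwh(it_kw: float, old: str, new: str) -> float:
    """Energy saved per year by switching cooling method at constant IT load."""
    return (facility_kw(it_kw, old) - facility_kw(it_kw, new)) * HOURS_PER_YEAR


IT_KW = 260.0  # illustrative load
for method in COOLING_FACTOR:
    print(f"{method:<11}: {facility_kw(IT_KW, method):6.1f} kW")
print(f"air -> warm water saves {annual_saving_kwh(IT_KW, 'air', 'warm water'):,.0f} kWh/year")
print(f"days above 30 C per year: {0.009 * 365:.1f}")
```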


Page 18: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Phasing and on-demand growth requirements

Basic principle: stepwise growth of capacity with demand

• Cost-effective use of available funding

• Less favorable for Top500 ranking

Phasing

• Phase 0: as soon as possible in 2013:
  - Installation of 1.5 × current Huygens capacity (~100 TFlop/s)

• Phase 1: as soon as possible in 2013 (taking advantage of latest technology):
  - Installation of 3 – 4 × current Huygens capacity (195 – 260 TFlop/s)

• Phase 2: in 2014 (in part dependent on available technology):
  - On-demand installation of at least 6 – 10 × current Huygens capacity (at least 390 – 650 TFlop/s), dependent on user demand
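The TFlop/s ranges in the phasing plan follow from multiplying the Huygens capacity by the stated factors. The sketch below reproduces them, assuming a Huygens peak of 65 TFlop/s (the 64,973 GFlop/s from the history table, rounded); the dictionary layout is illustrative.

```python
HUYGENS_TFLOPS = 65.0  # assumption: 64,973 GFlop/s rounded to 65 TFlop/s

# (lowest multiple, highest multiple) of Huygens capacity per phase
PHASES = {"Phase 0": (1.5, 1.5), "Phase 1": (3.0, 4.0), "Phase 2": (6.0, 10.0)}


def capacity_range(low_multiple: float, high_multiple: float) -> tuple[float, float]:
    """Peak performance range in TFlop/s for a range of Huygens multiples."""
    return (low_multiple * HUYGENS_TFLOPS, high_multiple * HUYGENS_TFLOPS)


for phase, (low, high) in PHASES.items():
    low_tf, high_tf = capacity_range(low, high)
    print(f"{phase}: {low_tf:.1f} - {high_tf:.1f} TFlop/s")
```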


Page 19: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Awarding requirements & weight

      Awarding requirement                                               Weight
AR1   Hardware Requirements                                              10%
AR2   File system and I/O                                                10%
AR3   Software Requirements                                              10%
AR4   Operational Requirements (including energy usage)                  15%
AR5   Maintenance, Support, Documentation and Training Requirements      5%
AR6   Applications Performance (through Applications Benchmark Suite)    40%
AR7   On-demand growth, phasing, partnership in innovation               10%
      Total                                                              100%
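Since application performance (AR6) carries 40% of the award and each benchmark code has its own weight within the suite, each code has an effective share of the total. The sketch below derives those shares, assuming the benchmark weights combine linearly inside AR6, which the slides do not state explicitly.

```python
# Awarding weights in percent, from the table
AWARD_WEIGHTS = {"AR1": 10, "AR2": 10, "AR3": 10, "AR4": 15,
                 "AR5": 5, "AR6": 40, "AR7": 10}

# Benchmark weights in percent, from the application benchmark suite
BENCHMARK_WEIGHTS = {"ADF": 10, "GROMACS": 20, "POP": 15, "SPARKLE": 15,
                     "SPO-DVR": 10, "SUSHI": 15, "VASP": 15}

# Effective share of each code in the total award (assumed linear combination)
effective = {code: AWARD_WEIGHTS["AR6"] * weight / 100.0
             for code, weight in BENCHMARK_WEIGHTS.items()}

for code, share in effective.items():
    print(f"{code:<8} {share:4.1f}% of the total award")
```

Under this assumption a single code such as GROMACS counts for more than the whole of AR5.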


Page 20: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Specs of the new Cartesius supercomputer

Phase 0 (scheduled production May 2013, total peak perf. 89 TFlop/s)

• Fat node island (22 TFlop/s peak)
  - 32 fat nodes, 4 × 8-core Intel Sandy Bridge CPUs/node, 256 GB/node

• Thin node island (67 TFlop/s peak)
  - 202 thin nodes, 2 × 8-core Intel Sandy Bridge CPUs/node, 64 GB/node

Phase 1 (scheduled production July 2013, total peak perf. ~270 TFlop/s)

• Replacement of all thin nodes

• Installation of thin node islands with latest Intel Ivy Bridge CPUs
  - ~13,000 cores, 64 GB/node

Phase 2 (scheduled production from 2H 2014, total peak perf. > 1 PFlop/s)

• On-demand addition of thin node islands with latest Intel Haswell CPUs

Phase 1 – 2 (on-demand accelerator option)

• Addition of nodes with NVIDIA GPU or Intel Xeon Phi
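The Phase 0 numbers can be cross-checked with a little arithmetic: core counts, memory per core, and the clock frequency implied by the quoted peak performance. The sketch assumes 8 double-precision floating-point operations per cycle per core, the usual figure for Sandy Bridge with AVX; the clock frequencies themselves are not given in the slides and are only derived here.

```python
FLOPS_PER_CYCLE = 8  # assumption: Sandy Bridge with AVX, double precision


def total_cores(nodes: int, cpus_per_node: int, cores_per_cpu: int) -> int:
    return nodes * cpus_per_node * cores_per_cpu


def implied_clock_ghz(peak_tflops: float, cores: int) -> float:
    """Clock frequency that would yield the quoted peak performance."""
    return peak_tflops * 1e12 / (cores * FLOPS_PER_CYCLE) / 1e9


fat_cores = total_cores(32, 4, 8)
thin_cores = total_cores(202, 2, 8)

print(f"fat island : {fat_cores} cores, {256 / 32:.0f} GB/core, "
      f"~{implied_clock_ghz(22, fat_cores):.2f} GHz implied")
print(f"thin island: {thin_cores} cores, {64 / 16:.0f} GB/core, "
      f"~{implied_clock_ghz(67, thin_cores):.2f} GHz implied")
print(f"total peak : {22 + 67} TFlop/s")
```

Both node types meet the "highly preferred" memory-per-core targets from the technical requirements (2 GB for thin nodes, 8 GB for fat nodes).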


Page 21: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Phased installation and on-demand growth

[Timeline figure with steps 1 – 6. Step labels: Data center preparation and delivery; Installation of phase 0; (Early) access for users; Upgrade phase 1, phase out of Huygens; Production of full phase 1; On-demand growth to > 1 PFlop/s. Dates shown: Dec 2012 – Feb 2013; Feb – April 2013; May 2013; May 2013; July 2013; 2014 H2.]

Page 22: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

PRACE 2IP prototype:

Scalable Hybrid Architecture – CSC, Finland

EU collaboration: CSC, SURFsara, CSCS

T-Platforms “T-REX” architecture

• 192 compute nodes
  - 48 Nvidia Kepler, 48 Intel MIC
  - ~300 TFlop/s (~3 GFlop/s/W)

SURFsara research topics:

• Programming paradigms
  - Application porting to accelerator + MPI

• Energy policies
  - Dynamic Voltage and Frequency Scaling (DVFS): adjust the frequency and voltage of the CPU. The actual workload determines which frequency/voltage is chosen.
  - Dynamic Power Management (DPM): power off when a device becomes idle. Activation temporarily uses more energy.
  - Maybe a hybrid policy, e.g. a mix of DPM and DVFS, is preferable.
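Why a lower clock can save energy under DVFS can be illustrated with a toy model. It assumes the textbook CMOS relation that dynamic power scales with V²·f, that voltage scales roughly linearly with frequency (so dynamic power grows as f³), and that a constant static power is drawn for as long as the job runs. All numbers below are hypothetical and not measurements from any SURFsara system.

```python
STATIC_WATTS = 40.0           # hypothetical constant power while the job runs
DYN_COEFF = 75.0 / 2.6 ** 3   # hypothetical: 75 W dynamic power at 2.6 GHz
CANDIDATES_GHZ = [1.2, 1.6, 2.0, 2.6]


def energy_joules(freq_ghz: float, work_gcycles: float) -> float:
    """Energy to finish a fixed amount of work at a given clock frequency."""
    runtime_s = work_gcycles / freq_ghz          # lower clock, longer runtime
    dynamic_watts = DYN_COEFF * freq_ghz ** 3    # assumed P_dyn ~ f^3
    return (STATIC_WATTS + dynamic_watts) * runtime_s


def best_frequency(work_gcycles: float) -> float:
    """Candidate frequency that minimizes energy for this toy model."""
    return min(CANDIDATES_GHZ, key=lambda f: energy_joules(f, work_gcycles))


for f in CANDIDATES_GHZ:
    print(f"{f:.1f} GHz: {energy_joules(f, 1000.0):8.0f} J")
print(f"most energy-efficient candidate: {best_frequency(1000.0)} GHz")
```

In this model the energy optimum lies between the extremes: running too slowly wastes static power over a long runtime, while running at full clock wastes dynamic power. That trade-off is what makes a per-application frequency choice worth considering.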

Page 23: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Sustainability of / in / by Supercomputing – Summary

• Funding of NL supercomputing
  - SARA → SURFsara

• Requirements
  - general purpose: memory / core, not yet accelerators (for largest part), ...
  - (sustainability of parallel programming paradigms, think CUDA)

• Performance
  - application throughput: 7 most relevant applications, # jobs / lifetime
  - additional “application enabling effort”: 3 new FTE (optimization, parallelization, scaling)

• Phasing
  - state-of-the-art processors (higher performance / lower energy)

• Energy
  - using “slower” processors (lower clock)
  - on-demand growth

• Cooling
  - warm water cooling → free cooling
  - cold corridors
  - (water cooled doors)

• Price
  - TCO: total budget = investment + energy + cooling + housing + UPS (storage only)

• Price/Performance: hard optimization problem
  - maximization of application throughput / TCO: left as an “exercise” for the vendor

• Last but not least
  - Greening by IT is one of the supercomputing application areas
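The price/performance criterion in the summary can be written down directly: TCO is the sum of the listed cost components, and the figure of merit is application throughput divided by TCO. The sketch below compares two made-up bids; all amounts and job counts are hypothetical and serve only to show how a cheaper-to-buy system can lose on this metric.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    """Hypothetical vendor bid; costs in million euro over the system lifetime."""
    name: str
    investment: float
    energy: float
    cooling: float
    housing: float
    ups: float
    jobs_over_lifetime: float  # application throughput: number of jobs / lifetime

    def tco(self) -> float:
        # TCO = investment + energy + cooling + housing + UPS, as in the summary
        return self.investment + self.energy + self.cooling + self.housing + self.ups

    def throughput_per_tco(self) -> float:
        return self.jobs_over_lifetime / self.tco()


bids = [
    Bid("fast clock, air cooled", 9.0, 4.0, 2.4, 1.0, 0.1, 1_500_000),
    Bid("lower clock, warm water", 10.0, 3.0, 0.6, 1.0, 0.1, 1_450_000),
]
for bid in bids:
    print(f"{bid.name:<24} TCO {bid.tco():5.1f}  jobs per M euro {bid.throughput_per_tco():9.0f}")
```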

Page 24: SGI13 - Vergroenen ván ICT - Duurzame supercomputers  - Walter Lioen (SURFsara)

Thank you for listening!
