
Page 1: Ken'ichi Itakura (JAMSTEC) - Department of Physics

http://www.jamstec.go.jp

The Architecture and the Application Performance of the Earth Simulator

Ken'ichi Itakura (JAMSTEC)

15 Dec. 2011, ICTS-TIFR Discussion Meeting 2011

Location of Earth Simulator Facilities

[Figure: map of the facilities, showing the Earth Simulator site in Yokohama, Tokyo, and the JAMSTEC HQ.]

Page 2: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Earth Simulator Building

Cross-sectional View of the Earth Simulator Building

[Figure: cross section of the building, showing the double floor for cables, air return duct, lightning conductor, power supply system, air conditioning system, seismic isolation system, and the Earth Simulator system itself.]

Page 3: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Earth Simulator

March 2002 – March 2009 (full system until Sep. 2008, half system thereafter)
Peak performance: 40 TFLOPS
Main memory: 10 TB

Earth Simulator (II)

March 2009 –
Peak performance: 131 TFLOPS
Main memory: 20 TB

Development of the Earth Simulator (ES)

Development of ES started in 1997 with the aim of building a comprehensive understanding of global environmental changes such as global warming. Its construction was completed at the end of February 2002, and operation started on March 1, 2002 at the Earth Simulator Center.

On the TOP500 list, the Earth Simulator took the top position at ISC02 (June 2002) and kept it for two and a half years.

The new Earth Simulator system (ES2) was installed in late 2008 and started operation in March 2009.

Page 4: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Earth Simulator (ES2)

ES2 System Outline

SX-9/E, 160 processor nodes (PNs), including 2 interactive nodes
Total peak performance: 131 TFLOPS
Total main memory: 20 TB
Data storage system: 500 TB
Inter-node network (IXS): 64 GB/s (bidirectional) per node
Storage: 4 Gbps Fibre Channel SAN (Storage Area Network) behind FC switches; usable capacity 1.5 PB on RAID6 HDD
User network: 10 GbE (partially link-aggregated to 40 GbE)
Batch system: NQS2 on the PNs, with agent request support, usage-statistics and resource-information management, and automatic power-saving management
Maximum power consumption: 3000 kVA

[Figure: system diagram. The operation, maintenance, and user networks connect the operation servers, the login server, and the labs' user terminals to the processor nodes; the storage server and PNs reach the disks through the FC switches.]

Page 5: Ken'ichi Itakura (JAMSTEC) - Department of Physics

New System Layout

[Figure: machine-room layout (50 m x 65 m) showing the new Earth Simulator (ES2) alongside the original Earth Simulator (stopped).]

The original Earth Simulator started operation in March 2002; the new system started operation in March 2009.

[Figure: cluster configuration of the 160 calculation nodes: 2 interactive nodes, 2 S-batch nodes, and 156 L-batch nodes, each cluster with its own WORK area. Users log in through the login server; the HOME/DATA areas can be referred to from all clusters.]

Nodes are clustered to control the system (transparently to users). A cluster consists of 32 nodes, and 156 nodes are used for batch jobs (the batch clusters).

Page 6: Ken'ichi Itakura (JAMSTEC) - Department of Physics

[Figure: cluster configuration diagram (as above), highlighting the TSS cluster.]

Four special nodes are provided for TSS and small batch jobs.

Configuration of the TSS cluster: TSS nodes [2 nodes → 1 node (changed in 2010)]; nodes for single-node batch jobs [2 nodes → 3 nodes].

[Figure: cluster configuration diagram (as above), highlighting the batch clusters.]

Configuration of the batch clusters: nodes for multi-node batch jobs, with system disks for user-file staging.

Page 7: Ken'ichi Itakura (JAMSTEC) - Department of Physics

[Figure: cluster configuration diagram (as above), highlighting the connection to mass storage.]

User files for batch jobs are kept on a mass-storage system, with automated file recall (Stage-In) and migration (Stage-Out). All the clusters are connected to the mass-storage system by IOCS (Linux workstations).

Hardware Spec.

                         ES              ES2 (SX-9/E)       Ratio
CPU     Clock cycle      1 GHz           3.2 GHz            3.2x
        Performance      8 GF            102.4 GF           12.8x
Node    #CPUs            8               8                  1x
        Performance      64 GF           819.2 GF           12.8x
        Memory           16 GB           128 GB             8x
        Network          12.3 GB/s x2    8 GB/s x8 x2       5.2x
System  #Nodes           640             160                1/4x
        Performance      40 TF           131 TF             3.2x
        Memory           10 TB           20 TB              2x
        Network          Full crossbar   2-level fat tree   -
                                         (full bisection bandwidth)
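As a quick consistency check, the node and system peaks follow directly from the per-CPU figure in the table:

```latex
\text{node peak} = 8 \times 102.4\,\mathrm{GF} = 819.2\,\mathrm{GF},
\qquad
\text{system peak} = 160 \times 819.2\,\mathrm{GF} \approx 131\,\mathrm{TF}
```

(Equivalently, each 3.2 GHz CPU delivers 102.4 / 3.2 = 32 flops per cycle at peak.)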

Page 8: Ken'ichi Itakura (JAMSTEC) - Department of Physics

ES2 Software

• OS: SUPER-UX
• Environment: Fortran90, C/C++, MPI-1/2, HPF, MathKeisan (BLAS, LAPACK, etc.), ASL, and a cross-compiler on the login server (Linux)
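As a minimal illustration of this programming environment, the sketch below is a plain MPI-1 C program of the kind users would cross-compile on the login server and run on the PNs; it is not taken from the slides, and the compiler name in the comment is an assumption about the NEC SX toolchain.

```c
/* Minimal MPI-1 sketch for the ES2 environment (illustrative only).
 * Assumed cross-compile step on the Linux login server, e.g.:
 *   sxmpicc hello.c -o hello    (compiler name is an assumption)
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```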

ES2 Operation

Page 9: Ken'ichi Itakura (JAMSTEC) - Department of Physics

How many users? About 800 people.

How many jobs? About 10,000 jobs per month.

Power (including cooling)? About 3000 kVA, roughly 70% of the original ES.

Average load for job running? 70–80%; most of the rest is used for pre/post processing.

Projects

FY2011 ES2 Projects

○ Proposed Research Projects: 29 (Earth Science: 18, Innovation: 11)

○ Contract Research Projects: KAKUSHIN 5, The Strategic Industrial Use (Industrial) 13, CREST 1

○ JAMSTEC Research Projects: 14 (JAMSTEC, collaboration research, and industrial fee-based usage; new projects are accepted at any time)

Users: 565; Organizations: 125 (University 57, Government 15, Company 34, International 19)

[Figure: resource allocation among the Proposed, Contract, and JAMSTEC project categories.]

Page 10: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Computing Resource Distribution (Based on Job Size), FY2010

 1–4 nodes:  28.7%
 5–8 nodes:  17.6%
 9–16 nodes: 12.4%
17–32 nodes: 21.8%
33–64 nodes: 16.7%
65+ nodes:    2.9%

ES2 Application Field, FY2010

Atmospheric and Oceanographic Science: 28%
Solid Earth Science: 16%
Global Warming (IPCC): 41%
Epoch-Making Simulation: 11%
Industrial Use: 4%

Page 11: Ken'ichi Itakura (JAMSTEC) - Department of Physics

ES2 Node Utilization (FY2010)

[Figure: monthly node-utilization chart; operation stopped on 14 Mar.]

ES2 Node Utilization (FY2011)

[Figure: monthly node-utilization chart. ※Reduced-node (degraded) operation is carried out for power saving.]

Page 12: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Monthly power consumption (kWh):

                 April   May     June    July    August  Sep.    Oct.
ES2, 2009        3,065   3,105   2,944   3,084   2,973   3,042   3,091
ES, 2008         3,987   4,013   3,978   4,015   4,138   3,986   1,752 (half system)
Reduction rate   76.9%   77.4%   74.0%   76.8%   71.9%   76.3%   (176.5%)

・ES2's power consumption is about 75% of the original ES's.
・The ratio of peak performance to power consumption is 4.34 times better than ES.
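The 4.34 figure is consistent with the peak-performance ratio divided by the average power ratio over the comparable months:

```latex
\frac{131\,\mathrm{TF} / 40\,\mathrm{TF}}{0.755}
= \frac{3.275}{0.755}
\approx 4.34
```

where 0.755 is the mean of the April–September reduction rates above.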

Application Performance

Page 13: Ken'ichi Itakura (JAMSTEC) - Department of Physics

ES2 Application - 1

AFES, OFES, CFES

ES2 Application - 2

Page 14: Ken'ichi Itakura (JAMSTEC) - Department of Physics

ES2 Application - 3

ES2 Application - 4

Page 15: Ken'ichi Itakura (JAMSTEC) - Department of Physics

Performance Evaluation Results on ES Real Applications

Code name   Elapsed time    #CPUs   Elapsed time on ES2 [sec]   #CPUs
            on ES [sec]     on ES   (speedup ratio)             on ES2
PHASE       135.3           4096     62.2  (2.18)               1024
NICAM-K*    214.7           2560    109.3  (1.97)                640
MSSG        173.9           4096     86.5  (2.01)               1024
SpecFEM3D    96.3           4056     45.5  (2.12)               1014
Seism3D      48.8           4096     15.6  (3.13)               1024

Harmonic mean of the speedup ratios: 2.22. ES2 is 2.22 times faster while using a quarter of the CPUs.
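The 2.22 figure is the harmonic mean of the five per-code speedups:

```latex
\left(\frac{1}{5}\sum_{i=1}^{5}\frac{1}{r_i}\right)^{-1}
= \frac{5}{\frac{1}{2.18}+\frac{1}{1.97}+\frac{1}{2.01}+\frac{1}{2.12}+\frac{1}{3.13}}
\approx 2.22
```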

WRF

• WRF (Weather Research and Forecasting Model) is a mesoscale meteorological simulation code developed collaboratively by US institutions, including NCAR (National Center for Atmospheric Research) and NCEP (National Centers for Environmental Prediction). JAMSTEC optimized WRFV2 on the Earth Simulator (ES2), renewed in 2009, and measured its computational performance.

• As a result, we successfully demonstrated that WRFV2 runs on ES2 with outstanding sustained performance.

Page 16: Ken'ichi Itakura (JAMSTEC) - Department of Physics

WRF Performance on ES

TOP500

TOP500 rank of the Earth Simulator by list (ES through Nov. 2008, ES2 from Jun. 2009):

Jun 2002: 1    Nov 2002: 1    Jun 2003: 1    Nov 2003: 1
Jun 2004: 1    Nov 2004: 3    Jun 2005: 4    Nov 2005: 7
Jun 2006: 10   Nov 2006: 14   Jun 2007: 20   Nov 2007: 30
Jun 2008: 49   Nov 2008: 73   Jun 2009: 22   Nov 2009: 31
Jun 2010: 37   Nov 2010: 54   Jun 2011: 68   Nov 2011: 94

Rank 94 on the Nov. 2011 list.

Page 17: Ken'ichi Itakura (JAMSTEC) - Department of Physics

HPC Challenge Awards

• The competition focuses on four of the most challenging benchmarks in the suite (a sketch of the Triad kernel follows the list):

– Global HPL: the Linpack TPP benchmark, which measures the floating-point rate of execution for solving a linear system of equations. (DGEMM measures the floating-point rate of execution of double-precision real matrix-matrix multiplication.)

– Global RandomAccess: measures the rate of integer random updates of memory (GUPS).

– EP STREAM (Triad), per system: a simple synthetic benchmark that measures sustainable memory bandwidth (in GB/s) and the corresponding computation rate for a simple vector kernel.

– Global FFT: measures the floating-point rate of execution of a double-precision complex one-dimensional Discrete Fourier Transform (DFT).
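For concreteness, here is a minimal sketch of the Triad kernel that EP-STREAM times; the array size and timing method below are illustrative choices, not the official benchmark code:

```c
/* STREAM Triad sketch (illustrative; not the official benchmark). */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1L << 24)   /* ~16M doubles per array; size is an arbitrary choice */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    const double scalar = 3.0;
    if (!a || !b || !c) return 1;

    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    clock_t t0 = clock();
    for (long i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];              /* the Triad kernel */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Three arrays are touched per iteration: read b, read c, write a. */
    double gb = 3.0 * N * sizeof(double) / 1e9;
    printf("Triad: %.2f GB/s (a[0] = %g)\n", gb / secs, a[0]);

    free(a); free(b); free(c);
    return 0;
}
```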

The 2009 HPC Challenge Class 1 Awards:

G-HPL            Achieved        System     Affiliation
1st place        1533 Tflop/s    Cray XT5   ORNL
1st runner-up     736 Tflop/s    Cray XT5   UTK
2nd runner-up     368 Tflop/s    IBM BG/P   LLNL

G-RandomAccess   Achieved        System     Affiliation
1st place         117 GUPS       IBM BG/P   LLNL
1st runner-up     103 GUPS       IBM BG/P   ANL
2nd runner-up      38 GUPS       Cray XT5   ORNL

G-FFT            Achieved        System     Affiliation
1st place          11 Tflop/s    Cray XT5   ORNL
1st runner-up       8 Tflop/s    Cray XT5   UTK
2nd runner-up       7 Tflop/s    NEC SX-9   JAMSTEC

EP-STREAM-Triad (system)
                 Achieved        System     Affiliation
1st place         398 TB/s       Cray XT5   ORNL
1st runner-up     267 TB/s       IBM BG/P   LLNL
2nd runner-up     173 TB/s       NEC SX-9   JAMSTEC

The XT5 at ORNL is 2.3 PF at peak while ES2 is 131 TF, a factor of about 17. Yet the XT5's G-FFT performance is only about 1.5 times that of ES2.

Page 18: Ken'ichi Itakura (JAMSTEC) - Department of Physics

New Earth Simulator (ES2, SX-9/E): HPC Challenge Awards 2010

G-FFT: No. 1 in the world
EP-STREAM-Triad: No. 3

Efficiency of G-FFT:
ES2: 11.88 TF / 131.07 TF = 9.1%
XT5: 10.70 TF / 2044.7 TF = 0.52%
17.5 times better.

http://www.hpcchallenge.org/

[Figure: peak performance vs. G-FFT performance for Jaguar (ORNL) and Earth Simulator 2. Peak: 2045 TF vs. 131 TF (15.6 times larger). G-FFT: 10.7 TF vs. 11.9 TF. We are BIG!!]
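The efficiency quoted here is simply the achieved G-FFT rate over the machine's peak:

```latex
\eta = \frac{P_{\text{G-FFT}}}{P_{\text{peak}}},
\qquad
\eta_{\mathrm{ES2}} = \frac{11.88}{131.07} \approx 9.1\%,
\qquad
\eta_{\mathrm{XT5}} = \frac{10.70}{2044.7} \approx 0.52\%
```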

Page 19: Ken'ichi Itakura (JAMSTEC) - Department of Physics

The 2011 HPC Challenge Class 1 Awards:

                          TOP500 (Linpack)             G-FFT
                          Rank  Perf.      #cores      Rank  Perf.    Efficiency  #cores
JAMSTEC ES2 (SX-9)        94    122 TF     1,280       2     11.9 TF  9.1%        1,280
RIKEN AICS (K Computer)   1     10,510 TF  705,024     1     34.7 TF  1.47%       147,456

On Linpack, K is 86 times faster than ES2; on G-FFT, K is only 2.9 times faster. ES2's G-FFT efficiency is 6.2 times higher than K's.
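These ratios follow directly from the table:

```latex
\frac{10{,}510}{122} \approx 86,
\qquad
\frac{34.7}{11.9} \approx 2.9,
\qquad
\frac{9.1\%}{1.47\%} \approx 6.2
```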

Thank you for your kind attention!

JAMSTEC: Japan Agency for Marine-Earth Science and Technology