
Page 1

High-performance tools to model instabilities in tokamak plasmas

jorek & gysela

JOREK: M. Becoulet, G. Dif-Pradalier, A. Fil, V. Grandgirard, G. Latu, E. Nardon, F. Orain, C. Passeron, A. Ratnani, C. Reux
Collaborations with: INRIA, IPP, ITER Org., French universities

GYSELA: J. Bigot, G. Dif-Pradalier, D. Esteve, V. Grandgirard, X. Garbet, Ph. Ghendrih, G. Latu, C. Passeron, Y. Sarazin, F. Rozar
Collaborations with: INRIA, IPP, French universities

13/01/15


Page 2

Outline

- JOREK code (modelling non-linear MHD)
  - Context, results & work in progress

- GYSELA code (modelling transport & ITG turbulence)
  - Context, results & work in progress

- Perspectives


Page 3

JOREK: motivation (ELMs)

Extracted from [Liang Yunfeng, 2010]


Page 4

JOREK: main characteristics

- Physics: ELM cycle & control, disruptions
  - ELMs [G. Dif-Pradalier, M. Becoulet, S. Pamela]
  - Resonant Magnetic Perturbations (RMPs) [M. Becoulet, F. Orain]
  - Pellet injection [G. Huijsmans, S. Futatani]
  - VDE, β-limit disruptions, density limit [C. Reux, E. Nardon, A. Fil]

- Realistic geometry (X-point)
  - Cubic finite elements, flux-aligned poloidal grid
  - Fourier series in the toroidal direction
  - Fluid description (3D), 6 to 8 variables per grid point
  - Numerical scheme: solve a large sparse linear system (see the sketch below)

- Challenges to improve the physics handled
  - exact geometry?? & boundary conditions??
  - non-linear MHD equations in toroidal geometry over long time scales? (µs → s)
  - realistic physical variables??? [resistivity, parallel conductivity, collisionality]
  - large number of n-modes???, coupling with background turbulence?
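The sparse linear solve mentioned in the numerical-scheme list above is the dominant cost of the implicit scheme. As a hedged illustration only (a toy banded operator, not JOREK's discretisation), the sketch below assembles a small sparse system and solves it with a direct sparse factorization, the role PaStiX or MUMPS play in the real code, where the factorization is combined with GMRES iterations.

```python
# Toy sketch (not JOREK): assemble a sparse operator and solve the linear
# system with a direct sparse LU factorization, as PaStiX/MUMPS do for the
# much larger JOREK systems.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 2000                                       # hypothetical number of unknowns
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], [-1, 0, 1], format="csc")   # stand-in operator

rhs = np.random.default_rng(0).standard_normal(n)

lu = spla.splu(A)                              # sparse direct factorization
x = lu.solve(rhs)
print("residual norm:", np.linalg.norm(A @ x - rhs))
```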


Page 5

JOREK: context, issues

- Physics studies: production code for non-linear MHD;
  uses Tier-2, Tier-1 and Tier-0 facilities

- Mathematical issues:
  → mesh, robustness, convergence
  → large costs (memory, computation)

- Parallel computing issues:
  → relies on a sparse linear solver; performance depends on PaStiX, MUMPS
  → save memory space (to run larger cases)

- Collaborative issues:
  → modify a single code, ensure correct results


Page 6

JOREK: work in progress (production runs on 512 cores, test runs on 1500 cores)

- Physics: CEA, ITER, IPP (Germany), JET (UK), ...; EUROfusion funding (H2020)

- Mathematical bottlenecks: convergence, large cases
  → IRFM, IPP Garching, INRIA TONUS, INRIA CASTOR:
    preconditioner, time scheme (convergence)
    other finite elements (robustness/accuracy/stability)
    isogeometric analysis (reduced costs)

- Parallel computing bottlenecks: large cases, new architectures
  → INRIA HIEPACS + IRFM (ANR ANEMOS): improve the JOREK/PaStiX coupling, save memory space
  → IPP + IRFM: porting the matrix construction to Xeon Phi

- Parallel code issues: maintain a healthy code
  → automatize checks, regression tests on parallel machines

Page 7

JOREK, Result 1: Memory

- Aim: reduce the memory peak, improve memory scalability
- Problem: memory required to centralize/redistribute the matrix
- Solution: the distributed assembly API Murge (reduces the memory peak)
- Adapting/improving JOREK & Murge [HIEPACS, X. Lacoste]; the idea is sketched below
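As a hedged illustration of the memory argument above (hypothetical sizes, not the actual MURGE calls), the sketch contrasts gathering every element contribution on one rank with keeping the contributions distributed; in JOREK the local triplets are handed to PaStiX through the MURGE assembly API instead of being centralized.

```python
# Hedged sketch: compare the memory needed to centralize all matrix triplets
# on rank 0 with keeping them distributed on each rank.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n_local = 100_000                                  # hypothetical local non-zeros
rows = np.arange(n_local, dtype=np.int64) + rank * n_local
cols = rows.copy()
vals = np.ones(n_local)

# Centralized assembly: rank 0 temporarily holds every rank's triplets.
gathered = comm.gather((rows, cols, vals), root=0)
if rank == 0:
    peak = sum(r.nbytes + c.nbytes + v.nbytes for r, c, v in gathered)
    print(f"centralized peak on rank 0: {peak / 2**20:.1f} MiB")

# Distributed assembly: each rank only stores its own triplets and would pass
# them directly to the solver (PaStiX via MURGE in the real code).
local_bytes = rows.nbytes + cols.nbytes + vals.nbytes
print(f"rank {rank}: local triplets only {local_bytes / 2**20:.1f} MiB")
```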

Figure 4.9: (a) iteration time [s] and (b) memory consumption [GB] versus iteration number (50 to 500) for a model 303 simulation, comparing "PaStiX, ntor = 1" with "MURGE, balanced element distribution, ntor = 3". Runs on eight nodes of eight cores of the Curie cluster (hybrid partition); model 303 was run with default parameters, in particular ntor = 3 (2 harmonics).

Memory consumption can be reduced by 30%. Using MURGE, this execution can thus be run with a lower memory requirement and on a smaller configuration, so users can run the simulation with a larger number of toroidal planes.


In this chapter, we explained how we could improve the memory and time scaling of a numerical simulation using a generic matrix assembly API developed on top of the PaStiX solver. Using an implementation specific to the solver, one can reach higher performance. This finite-element-oriented API also simplifies the development of a distributed matrix assembly for the numerical simulation developer. In a large-scale simulation code, JOREK, we achieved both good performance and lower memory consumption through this API. This study is tied to the PaStiX integration inside JOREK, but the same API could be used to produce a distributed assembly for another solver and/or another finite-element-based simulation code.

Figure: runs of JOREK model 303 on 8 nodes of 8 cores (Curie)

Page 8

JOREK, Result 2: strong scaling with MPI_THREAD_MULTIPLE (MdlS)

- Simulation of model 302 (ntor = 15, nflux = 32, ntheta = 48), using the MPI_THREAD_MULTIPLE mode:

  Nb cores            128    256    512   1024
  Nb nodes              8     16     32     64
  construct matrix   15.3    7.7    4.1    2.2
  factorisation       0.     0.     0.     0.
  gmres/solve         3.0    1.7    1.2    0.86
  iteration time     18.6    9.7    5.6    3.4
  rel. efficiency    100%    96%    83%    68%

  Table: one iteration, without factorization.

- Good result: 68% relative efficiency (whole code) at 1024 cores
- Problem: MPI_THREAD_MULTIPLE is not available on many machines
  → execution time is then multiplied by a factor of 2 to 4
  (a start-up check of the granted thread level is sketched below)
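A minimal, hedged sketch (using mpi4py rather than the code's own language) of checking at start-up whether the MPI library actually grants MPI_THREAD_MULTIPLE, so the code can fall back to a slower communication pattern when it does not:

```python
# Request MPI_THREAD_MULTIPLE before MPI is initialized, then verify what the
# library actually provided; without full thread support the code would switch
# to a funneled/serialized communication scheme.
import mpi4py
mpi4py.rc.threads = True
mpi4py.rc.thread_level = "multiple"
from mpi4py import MPI

provided = MPI.Query_thread()
if MPI.COMM_WORLD.Get_rank() == 0:
    if provided >= MPI.THREAD_MULTIPLE:
        print("MPI_THREAD_MULTIPLE granted")
    else:
        print("MPI_THREAD_MULTIPLE not granted; falling back to funneled mode")
```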


Page 9

GYSELA code, context: turbulence governs plasma performance

Magnetic confinement
- In a plasma, particles follow magnetic field lines, so they may be confined in a toroidal magnetic field
- In the Sun the plasma is confined by gravity; in a tokamak it is confined by a magnetic field
- Magnetic toroidal geometry (r, θ, ϕ)

- Turbulence limits the maximal values reachable for n and T
  → it generates losses of heat and particles
  → it degrades the confinement properties of the magnetic configuration
- Predicting and controlling turbulence, in order to optimize experiments like ITER and future reactors, is of utmost importance.


Page 10

GYSELA code, context: gyrokinetic theory

- Kinetic approach: 6D distribution function of the particles f(r, θ, ϕ, v‖, v⊥, α)

- Gyrokinetic codes:
  - fusion plasma turbulence is low frequency: ωturb ∼ 10^5 s^-1 ≪ ωci ∼ 10^8 s^-1
  ↪ reduced to a 5D distribution function f(r, θ, ϕ, v‖, µ) (ordering summarised below)

- Reduced memory needs, but increased model complexity
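A short summary of the scale separation behind this 6D → 5D reduction; here µ is the magnetic moment, an adiabatic invariant on turbulence time scales (its standard definition, not spelled out on the slide):

```latex
\[
  \omega_{\mathrm{turb}} \sim 10^{5}\,\mathrm{s}^{-1}
  \;\ll\;
  \omega_{ci} \sim 10^{8}\,\mathrm{s}^{-1}
  \quad\Longrightarrow\quad
  f(r,\theta,\varphi,v_{\parallel},v_{\perp},\alpha)
  \;\longrightarrow\;
  \bar{f}(r,\theta,\varphi,v_{\parallel},\mu),
  \qquad
  \mu = \frac{m\,v_{\perp}^{2}}{2B}.
\]
```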


Page 11

GYSELA code, context: quick overview of equations and operators

- Main features of GYSELA (Gyrokinetic Semi-Lagrangian code):
  - main equations: 5D Vlasov, 3D Poisson (quasi-neutrality)
  - gyrokinetic setting (5D = 3D space + 2D velocity)
  - ITG-driven turbulence
  - heat & vorticity sources, flux driven
  - collision operator
  - modelling of fast particles
  - adiabatic electron response
  (a toy sketch of the semi-Lagrangian step follows)
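As a hedged illustration of the backward semi-Lagrangian idea behind the Vlasov solver (a toy 1D constant-advection analogue, not GYSELA's 5D scheme), each step traces the characteristics backward over one time step and interpolates the distribution function at their feet; cubic splines are used here for the interpolation.

```python
# 1D backward semi-Lagrangian step for df/dt + v df/dx = 0 on a periodic grid:
# trace characteristics back over dt, then interpolate f at the feet.
import numpy as np
from scipy.interpolate import CubicSpline

nx, L, dt, v = 256, 2 * np.pi, 0.01, 1.0
x = np.linspace(0.0, L, nx, endpoint=False)
f = np.exp(-50.0 * (x - L / 2) ** 2)              # initial profile

for _ in range(100):
    feet = (x - v * dt) % L                       # backward characteristic feet
    spline = CubicSpline(np.append(x, L), np.append(f, f[0]),
                         bc_type="periodic")      # periodic cubic interpolation
    f = spline(feet)
```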


Page 12

GYSELA, Result 3A: scalability and bottlenecks

Strong scaling: Nr = 512, Nθ = 512, Nϕ = 128, Nv‖ = 128, Nµ = 16; main data = 512 GiB

Figure: relative efficiency for one run on Turing, from 8192 to 65 536 cores, broken down by Vlasov solver, field solver, derivatives computation, sources, collisions, diffusion, diagnostics and total for one run.

→ Time is dominated by the Vlasov solver
→ Bottlenecks at large scale: Poisson solver, I/O
≈ 60% efficiency at 64k cores on the Curie and Turing machines


Page 13

GYSELA, Result 3B: weak scaling on JUQUEEN

- Many communication schemes rewritten (hierarchical gather/scatter); the idea is sketched below

- Tests performed on the whole JUQUEEN/Blue Gene machine (Jülich)
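A hedged sketch of a hierarchical gather (mpi4py, hypothetical data, not GYSELA's actual communication routines): data are gathered first inside each node over a shared-memory communicator, then only the node leaders take part in the machine-wide gather, which reduces the number of messages at very large scale.

```python
# Two-level gather: intra-node first, then across node leaders only.
from mpi4py import MPI
import numpy as np

world = MPI.COMM_WORLD
node = world.Split_type(MPI.COMM_TYPE_SHARED)         # intra-node communicator
is_leader = node.Get_rank() == 0
leaders = world.Split(0 if is_leader else MPI.UNDEFINED, world.Get_rank())

local = np.full(4, world.Get_rank(), dtype=np.int32)  # hypothetical local data

node_data = node.gather(local, root=0)                # step 1: inside the node
if is_leader:
    merged = np.concatenate(node_data)
    all_data = leaders.gather(merged, root=0)         # step 2: leaders only
    if leaders.Get_rank() == 0:
        print("elements gathered:", sum(len(a) for a in all_data))
```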

Figure: execution time and relative efficiency for one GYSELA run (weak scaling on JUQUEEN), from 64 000 to 448 000 cores, broken down by Vlasov solver, field solver, derivatives computation, diagnostics and total for one run.

- Weak scaling: relative efficiency of 91% on 458 752 cores (2013)

- PRACE preparatory access (April 2012 to November 2012): 250 000 hours
- Extra CPU allocation (via P. Gibbon in 2013 / ANR G8-Exascale)

Page 14

GYSELA, Result 4A: modelling memory consumption

- GYSELA is global → huge meshes → constrained by the memory per node
- Development of the MTM library (Modelization & Tracing Memory consumption)
  - identify the memory peak, study memory scalability

Figure: memory consumption before and after optimisation.

- Static to dynamic memory allocation + improvement of the algorithms
  → gain of 50% on 32k cores


Page 15

GYSELA, Result 4B: predicting memory consumption

- Thanks to the MTM library (Modelization & Tracing Memory consumption):
  (a) instrument your code with MTM and run a small case
  (b) run a virtual scaling (predict the memory required before submitting)
  (c) never run out of memory!
  (a minimal sketch of the idea follows)
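A hedged sketch of the "virtual scaling" idea: express the size of the main per-rank arrays as formulas in the mesh sizes and process count, calibrate them on a small instrumented run, then evaluate them for the target case before submitting. The formulas below are hypothetical placeholders, not GYSELA's actual memory model.

```python
# Predict per-rank memory from hypothetical array-size formulas before a run.
def predicted_gib(Nr, Ntheta, Nphi, Nvpar, Nmu, ranks, bytes_per_value=8):
    f5d = Nr * Ntheta * Nphi * Nvpar * Nmu / ranks   # distributed 5D function
    field3d = Nr * Ntheta * Nphi                     # replicated 3D field
    halos = 2 * f5d / 16                             # hypothetical halo buffers
    return (f5d + field3d + halos) * bytes_per_value / 2**30

# "Virtual scaling": evaluate the model for a large target case.
print(f"{predicted_gib(512, 512, 128, 128, 16, ranks=8192):.2f} GiB per rank")
```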


Page 16

GYSELA, Problem 5: fault tolerance

- Context
  - the number of cores in supercomputers increases very fast
  - fault tolerance is needed for long-running, large-scale codes
  - current options: replication, or checkpoint/restart

- Cheaper checkpoints and reduced execution time are the actual problem

- FTI: Fault Tolerance Interface (INRIA & Argonne)
  - register the data that is part of the application state
  - notify the library when the application is in a consistent state

- FTI strategy: take advantage of node-local SSDs (Curie)
  - Level 1: checkpoint to the local SSD only (transient errors)
  - Level ...: ...
  - Level 4: checkpoint to SSD + asynchronous copy to the PFS


Page 17

GYSELA, Result 5A: asynchronous method

- Checkpoint writing is overlapped with computations
- FTI also uses asynchronous writing
  (the general overlap pattern is sketched below)
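A hedged sketch of the overlap pattern only (FTI itself uses dedicated helper processes and multi-level storage, not this code): the state is frozen in a buffer and written by a background thread while the following iterations keep computing.

```python
# Overlap checkpoint writing with computation via a background writer thread.
import threading
import numpy as np

state = np.random.default_rng(1).standard_normal(1_000_000)

def write_checkpoint(buffer, path):
    np.save(path, buffer)                     # slow I/O, runs in the background

writer = None
for step in range(10):
    state = np.sqrt(np.abs(state)) + 1e-3     # stand-in for one time step
    if step % 5 == 0:
        if writer is not None:
            writer.join()                     # wait for the previous checkpoint
        snapshot = state.copy()               # freeze the state (double buffer)
        writer = threading.Thread(target=write_checkpoint,
                                  args=(snapshot, f"ckpt_{step}.npy"))
        writer.start()
if writer is not None:
    writer.join()
```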

!""#$%&"'()&$"!"*+",-.&"*('*

/01/21

/01/21/01/21

/01/21

!"#$%&#'()*+,"-#.,'(/(01%-.2'

!""#$%&"'(

3/456247

8369:;7"<=>/94:>7"?@1:A20>67B

/01/21 3/456247 /01/21 /01/21

3/456247

/01/21 /01/21 /01/21 /01/21

8369:;7"0<=>/94:>7"?>:>"@1:A20>67B

3/456247

' * C

/01/21

/01/21/01/21

""""""""""""""")&$"!"'C"8$D"*('C"!""#$%&"'(

47/:2E47F7>6

Méthode  synchrone  

Méthode  asynchrone  

Page 18

GYSELA, Result 5B: weak scaling for checkpointing

- FTI has been integrated into GYSELA ⇒ modularization of the existing code
- Weak scaling benchmarks are currently running (FTI) → problem of reproducibility

Figure: weak scaling benchmark of the checkpointing on the Helios machine (Rokkasho).

Page 19

GYSELA, Problem 6: reliability of parallel programs

- Should we trust our parallel codes?
  - many parallel computations, non-determinism of execution
  - many possible input parameter sets, many parallel systems
  - bugs introduced by developers

- Solution (using http://ci.inria.fr):
  - at every commit: compilation test
    - compile every variant of the code
    - on multiple compilers/architectures
    - on multiple supercomputers
  - at every commit: launch cases (work in progress)
    - run multiple test cases on multiple supercomputers
    - compare results with a stored reference (see the sketch after this list)
  - regularly: large-scale validation tests (TODO)
    - run large test cases
    - analyse parallel performance
    - validate results against a known physical reference
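A hedged sketch of the "compare results with a stored reference" step (hypothetical diagnostics and file name, not the project's actual test harness): rerun a small case and check a few scalar outputs against a reference file within a tolerance, so the check can run from CI at every commit.

```python
# Regression check: compare run diagnostics against a stored reference.
import numpy as np

def run_small_case():
    # Placeholder for launching the real small test case and collecting
    # its diagnostics (total energy, particle content, ...).
    return {"energy": 1.000000001, "particles": 1.0000000005e19}

def test_against_reference(reference_file="reference.npz", rtol=1e-6):
    result = run_small_case()
    ref = np.load(reference_file)
    for key, value in result.items():
        assert np.allclose(value, ref[key], rtol=rtol), f"{key} drifted"
```

In a CI setup such tests would be collected by a test runner at every commit, and the reference file regenerated only after a validated physics change.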


Page 20

Parallel fusion codes: perspectives

- Improving JOREK parallel performance (INRIA HIEPACS)

- New preconditioners and time-integration schemes for JOREK (INRIA TONUS, CASTOR)

- Parallelization of a new gyroaverage operator in GYSELA (INRIA TONUS, HIEPACS)

- Software components and StarPU approaches for GYSELA (INRIA AVALON, RUNTIME)

- Thread & memory affinity for deploying complex parallel codes (INRIA HIEPACS, RUNTIME)

- Designing parallel kernels for the 6D Vlasov equation (INRIA KALIFFE)

- Porting and developing new GYSELA kernels for Xeon Phi (Japan collaboration, Intel Exascale labs)

- Improved parallel I/O ...
