Damien Genet, IPL C2S@Exa, July 11, 2014

AeroSol: Solver for CFD problems on modern architectures



INTRODUCTION

Many CFD platforms exist around the world. Each platform is specialized:

- One element support: Triangle, Tetrahedron, etc.
- One dimension support: 2D, 3D.
- Solution: Low or high order.
- Language...

Each team develops its own solver, and only a few share it.

⇒ We want to achieve genericity, collaboration, and performance.


Context

Software Architecture

Distributed Memory level

Shared Memory level

Conclusion, Perspectives


1. Context

The story begins...


Bacchus


Bacchus, INRIA project:

- HPC tools: Scotch, MMG3D, PaMPA.
- CFD solvers: FluidBox, RealFluid, AeroSol.


CAGIRE


Simulation and experimentation:

- Simulation: AeroSol platform.
- Experimentation: Maveric test bench.


AeroSol


Main goals for the AeroSol platform:

- Collaborative: Anyone can 'easily' contribute.
- Genericity: No restrictions!
- Maintainability: Well designed architecture.
- Performance: Just performance.


2. Software Architecture


Outside AeroSol


Inside AeroSol


Import - Export


- Import: GMSH in sequential or in parallel.
- Export: Write solution in VTK, XDMF (HDF5), or TecPlot.


Construction of a rich graph. Each element is decomposed hierarchically into a set of entities:

- One entity for the element.
- One entity per face.
- One entity per edge.
- One entity per vertex.

[Figure: a mesh element with its entities numbered.]

Add some relations between entities:

- Ownership between element and face/edge.
- Rotation between element and face.

⇒ PaMPA
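As a rough illustration of the hierarchical decomposition, the sketch below splits 2D triangles into element, edge and vertex entities and records which edges each element owns. The function name and the data layout are hypothetical and chosen only for this example; this is not the PaMPA API.

```python
from itertools import combinations

def build_entity_graph(triangles):
    """Decompose triangles (tuples of 3 vertex ids) into entities.

    Returns the sorted vertices, the sorted edges (as sorted vertex pairs)
    and, for each element, the list of edges it owns. A shared edge appears
    only once in the edge list.
    """
    vertices = sorted({v for tri in triangles for v in tri})
    edge_set = set()
    owns = {}
    for elem_id, tri in enumerate(triangles):
        elem_edges = [tuple(sorted(pair)) for pair in combinations(tri, 2)]
        owns[elem_id] = elem_edges
        edge_set.update(elem_edges)
    return vertices, sorted(edge_set), owns

# Two triangles sharing the edge (1, 2).
vertices, edges, owns = build_entity_graph([(0, 1, 2), (1, 3, 2)])
shared = set(owns[0]) & set(owns[1])
```

The shared edge is stored once but owned by both elements, which is the kind of relation the graph has to carry.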


Time scheme


- does multi-step, multi-stage;
- does nonlinear iterations (Newton);
- does abstract operations on vectors;
- commands the spatial scheme for computation.
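To make the split between time scheme and spatial scheme concrete, here is a minimal sketch of a multi-stage explicit Runge-Kutta step. It assumes the spatial scheme is available as a callable `residual` and that vectors are plain Python lists; `axpy` and `explicit_rk_step` are hypothetical names, not AeroSol functions.

```python
def axpy(alpha, x, y):
    """Abstract vector operation: return alpha * x + y (plain lists here)."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def explicit_rk_step(residual, u, dt, a, b):
    """One explicit Runge-Kutta step for du/dt = residual(u).

    `a` is the strictly lower-triangular Butcher matrix and `b` the weights.
    The time scheme only calls `residual` (the spatial scheme) and `axpy`.
    """
    stages = []
    for i in range(len(b)):
        ui = u
        for j in range(i):
            if a[i][j] != 0.0:
                ui = axpy(dt * a[i][j], stages[j], ui)
        stages.append(residual(ui))
    for bi, ki in zip(b, stages):
        u = axpy(dt * bi, ki, u)
    return u

# Heun's method (2 stages) on du/dt = -u, starting from u = 1.
a = [[0.0, 0.0], [1.0, 0.0]]
b = [0.5, 0.5]
u = [1.0]
for _ in range(10):
    u = explicit_rk_step(lambda v: [-x for x in v], u, 0.1, a, b)
```

Because the stepper never looks inside the vectors or the residual, the same loop would work with any storage or spatial discretization that provides these two operations.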


Spatial scheme


- knows the mesh, and iterates over it;
- reconstructs a local cell
  → works for Continuous / Discontinuous;
- uses Integrators to compute quantities;
  → generic for an element;
  → parametrized by numerical flux in D.G.;
- assembles a matrix.


Cell reconstruction


[Figure: two adjacent triangles with their unknowns numbered. Triangle 1: 0 1 2 3 4 5 6 7 8 9. Triangle 2: 1 10 2 11 12 13 14 6 5 15. Orientation labels: -1, +1.]
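The two index lists of this slide can be read as local-to-global maps, one per triangle; that reading is an assumption. Under it, reconstructing a cell amounts to gathering its values from the global vector, as in the sketch below. The function name and the test vector are hypothetical.

```python
# Connectivity copied from the slide: global unknown ids for each triangle.
connectivity = {
    "Triangle 1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "Triangle 2": [1, 10, 2, 11, 12, 13, 14, 6, 5, 15],
}

def reconstruct_cell(global_values, global_ids):
    """Gather the values of one cell from the global vector."""
    return [global_values[g] for g in global_ids]

# Hypothetical global vector: the value stored at unknown g is 10 * g.
global_values = [10.0 * g for g in range(16)]
local_1 = reconstruct_cell(global_values, connectivity["Triangle 1"])
local_2 = reconstruct_cell(global_values, connectivity["Triangle 2"])

# Unknowns seen by both triangles (they sit on the common edge).
shared = sorted(set(connectivity["Triangle 1"]) & set(connectivity["Triangle 2"]))
```

Unknowns 5 and 6 appear in the opposite order in the second list, which is presumably what the -1 / +1 orientation labels refer to.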


Integrators


[Figure: reference element (unknowns 0 to 5) and mesh element (unknowns 27, 28, 33, 34, 42, 43), related by the maps T and T−1.]

- works on reference element;
- uses finite elements;
- uses quadrature formulae;
- uses geometry;
- parametrized by Model or Numerical Flux;
- computes quantities over an element, or a face.

→ works for both continuous and discontinuous
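A minimal sketch of such an integrator, assuming first-order (P1) Lagrange functions on a triangle and an affine map T: basis functions and quadrature points live on the reference element, and the geometry enters only through the Jacobian determinant. All names are hypothetical, and the mass matrix is picked only as the simplest quantity to integrate.

```python
# Reference P1 basis functions on the unit triangle (0,0), (1,0), (0,1).
BASIS = [lambda x, y: 1.0 - x - y, lambda x, y: x, lambda x, y: y]

# Edge-midpoint quadrature rule: exact for polynomials of degree <= 2.
QUAD_POINTS = [(0.5, 0.0), (0.5, 0.5), (0.0, 0.5)]
QUAD_WEIGHTS = [1.0 / 6.0] * 3

def mass_matrix(vertices):
    """Mass matrix of a P1 triangle, integrated on the reference element."""
    (x0, y0), (x1, y1), (x2, y2) = vertices
    # Jacobian determinant of the affine map T from reference to mesh element.
    det_j = abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    n = len(BASIS)
    m = [[0.0] * n for _ in range(n)]
    for (x, y), w in zip(QUAD_POINTS, QUAD_WEIGHTS):
        values = [phi(x, y) for phi in BASIS]
        for i in range(n):
            for j in range(n):
                m[i][j] += w * det_j * values[i] * values[j]
    return m

m = mass_matrix([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)])
total = sum(sum(row) for row in m)  # equals the triangle area
```

Swapping the integrand (for instance a flux instead of a product of basis functions) leaves the quadrature loop unchanged, which is one way to read "parametrized by Model or Numerical Flux".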


Matrix class


- Distributed matrix.
- Assembled or not.
- Interfaced with many linear solvers:
  - MUMPS, PETSc, UMFPACK


3. Distributed Memory level


PaMPA


- handles the mesh and gives visitors;
- redistributes the entities;
- remeshes;
- computes the overlap and does the communications.


Domain Decomposition

[Figure: mesh distributed between processes Pi and Pj.]

→ entities are scattered
→ incomplete elements


Overlap construction

[Figure: overlap between processes Pi and Pj.]

→ the overlap completes each incomplete element.
→ the overlap gathers the entities needed for computations.
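The idea of an overlap can be sketched on a toy partition: for a given process, collect the elements owned by other processes that touch its own elements. This is a simplified one-layer, vertex-based overlap written for illustration; `build_overlap` and the data layout are hypothetical and do not reflect how PaMPA computes or stores it.

```python
def build_overlap(elements, owner, rank):
    """Return the ids of elements not owned by `rank` that share at least
    one vertex with an element owned by `rank` (a one-layer overlap)."""
    local_vertices = {v for e, verts in enumerate(elements)
                      if owner[e] == rank for v in verts}
    return [e for e, verts in enumerate(elements)
            if owner[e] != rank and local_vertices & set(verts)]

# Strip of six triangles split between two processes (0 and 1).
elements = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4), (4, 5, 6), (5, 7, 6)]
owner = [0, 0, 0, 1, 1, 1]
ghosts_p0 = build_overlap(elements, owner, 0)
ghosts_p1 = build_overlap(elements, owner, 1)
```

Only the elements next to the interface end up in the overlap; element 5 stays private to process 1 and element 0 to process 0.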


Results on Avakas cluster


4. Shared Memory level


Macroelements


We built a set of macroelements.

[Figure: task graph with nodes numbered 1 to 12. Task types and labels: Mass Integrator (B, W), ∇U∇V (B, RW), Assembly (B, RA, RW). Legend: Spawn, low-priority tasks, high-priority tasks.]

→ limits the memory consumption;
→ limits the number of tasks in the DAG.
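A small sketch of why macroelements reduce the task count: if one task handles one macroelement, the number of tasks is the number of tiles rather than the number of elements. The grouping by consecutive ids, the function name and the mesh size are assumptions made for the example; the tile sizes are the ones that appear in the plots of this deck, assuming a tile is a macroelement.

```python
def build_macroelements(num_elements, tile_size):
    """Group consecutive element ids into macroelements of at most
    `tile_size` elements; one task then handles one macroelement."""
    return [range(start, min(start + tile_size, num_elements))
            for start in range(0, num_elements, tile_size)]

# Hypothetical mesh of 100000 elements.
task_counts = {tile: len(build_macroelements(100_000, tile))
               for tile in (10_000, 20_000, 50_000)}
```

Larger tiles mean fewer tasks for the runtime to schedule, at the price of less available parallelism.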


Assembly


We focus on the assembly operations.

Race conditions occur when two or more elements share unknowns.

[Figure: two elements, one with unknowns 1 2 3 4 and one with unknowns 3 4 5 6, contributing to a global matrix over unknowns 1 to 6.]

→ many strategies to do the assembly.
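The conflict can be seen in a sequential scatter-add: every shared unknown receives contributions from several elements, so the update of a global entry is a read-modify-write that two threads must not perform at the same time. The sketch below uses dense lists and a 0-based version of the numbering; `assemble` is a hypothetical helper.

```python
def assemble(num_unknowns, elements, local_matrices):
    """Sequential assembly: add each local matrix into the global one."""
    glob = [[0.0] * num_unknowns for _ in range(num_unknowns)]
    for ids, local in zip(elements, local_matrices):
        for i, gi in enumerate(ids):
            for j, gj in enumerate(ids):
                # Read-modify-write: unsafe if two threads hit the same entry.
                glob[gi][gj] += local[i][j]
    return glob

# Two elements sharing unknowns 2 and 3 (0-based version of the 1..6 example).
elements = [(0, 1, 2, 3), (2, 3, 4, 5)]
ones = [[1.0] * 4 for _ in range(4)]
glob = assemble(6, elements, [ones, ones])
```

Entries coupling unknowns 2 and 3 end up with two contributions, one from each element; those are exactly the entries at risk in a parallel loop over elements.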


State of the art


OpenMP: parallelizes the inner loops of the assembly and treats the elements sequentially.

Colouring + OpenMP: two elements sharing unknowns have different colours.
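The colouring strategy can be sketched with a simple greedy pass. This is a generic greedy colouring written for illustration, not necessarily the algorithm used in the comparison; the function name is hypothetical.

```python
def colour_elements(elements):
    """Greedy colouring: elements sharing an unknown get different colours."""
    colours = []
    for i, ids in enumerate(elements):
        used = {colours[j] for j in range(i) if set(ids) & set(elements[j])}
        colour = 0
        while colour in used:
            colour += 1
        colours.append(colour)
    return colours

# Chain of four elements; each one shares two unknowns with the next.
elements = [(0, 1, 2, 3), (2, 3, 4, 5), (4, 5, 6, 7), (6, 7, 8, 9)]
colours = colour_elements(elements)
# Elements of one colour can be assembled concurrently, colour after colour.
batches = {c: [e for e, ce in enumerate(colours) if ce == c] for c in set(colours)}
```

Within a batch no two elements write to the same unknown, so a parallel loop over the batch needs no locking; the batches themselves are processed one after another.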


Our strategies

[Figure: matrices of two elements (unknowns 1 2 3 4 and 3 4 5 6) and a matrix over unknowns 1 to 6.]

[Figure: time in seconds and efficiency against the number of threads (4, 8, 16, 24, 32), one panel per tile size (10000, 20000, 50000). Curves: StarPU and PaRSEC, each in flat, fixed, adapt and no-chain variants, plus OpenMP and Colour.]


5. Conclusion, perspectives...


Conclusion


- Generic platform with continuous testing.
- Time schemes: Explicit Runge-Kutta, Implicit BDF.
- Continuous: CG, Taylor Galerkin, SUPG scheme.
- Discontinuous: Discontinuous Galerkin.
- Model: Advection, advection-diffusion, Euler, N-S (DG).
- FE: Up to 4th order, Dubiner, Legendre, Lagrange.
- Runtimes: Laplacian solved on GPU + CPU with StarPU.
- Portability: Avakas, Plafrim clusters. Turing cluster. Compilers: gcc, icc, xlc, Clang.


Perspectives


- DG: Marsu ADT, multigrid methods.
- Scheme: platform continuously improved by ongoing research.
- Model: RANS, LES, Combustion, incompressible free surface flows.
- HPC: efficiency of nonlinear iterations in long-time integration (interaction between nonlinear and linear solvers), tighter coupling with runtime systems.


Thank you !

INRIA

Bordeaux Sud-Ouest