Scalability analysis of the distributed-memory implementation of the Aggregated unfitted Finite Element Method (AgFEM)

Alberto F. Martín*, Santiago Badia, Francesc Verdugo, Eric Neiva

MWNDEA 2020, Melbourne, Australia, 12/02/2020


Embedded finite elements

CutFEM, Finite Cell Method, AgFEM, X-FEM, ...

[Figure: body-fitted mesh vs. unfitted mesh]

✓ Simplified mesh generation

✗ Dirichlet BCs?

✗ Numerical integration?

✗ Ill-conditioning? (this talk)

Alberto F. Martín (Monash University) MWNDEA2020 2/35


Parallel distributed-memory simulation pipeline

1. Unfitted (adaptive) Cartesian grids (p4est)

2. Partition using space-filling curves (p4est)

3. Unfitted FE discretization (AgFEM)

4. AMG linear solver (PETSc)
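A minimal sketch of step 2 on a uniform 4x4 grid, assuming a Morton (z-order) curve, which is the ordering p4est uses for octree leaves. The names `morton_index` and `partition_cells` are illustrative and are not part of the p4est API; a real run would rely on the library's own partitioning.

```python
def morton_index(i: int, j: int, bits: int = 16) -> int:
    """Interleave the bits of (i, j) to obtain the z-order (Morton) index."""
    code = 0
    for b in range(bits):
        code |= ((i >> b) & 1) << (2 * b)
        code |= ((j >> b) & 1) << (2 * b + 1)
    return code


def partition_cells(nx: int, ny: int, nparts: int) -> dict[tuple[int, int], int]:
    """Assign each cell of an nx-by-ny grid to one of nparts subdomains.

    Cells are sorted along the curve and the sorted list is cut into
    nparts contiguous chunks of (almost) equal size.
    """
    cells = sorted(((i, j) for i in range(nx) for j in range(ny)),
                   key=lambda c: morton_index(*c))
    return {cell: pos * nparts // len(cells) for pos, cell in enumerate(cells)}


owner = partition_cells(4, 4, 4)
for j in reversed(range(4)):
    print(" ".join(str(owner[(i, j)]) for i in range(4)))
```

Because the curve keeps nearby cells close in the ordering, cutting it into equal chunks needs no graph partitioner, which is the point made on the next slide about ParMETIS.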


Unfitted methods at large scales: pros and cons

✓ Highly scalable mesh generation based on octrees (e.g. p4est)

✓ Highly scalable mesh partition with space-filling curves (ParMETIS not needed)

✓ Highly scalable adaptive mesh refinement + load balancing

✗ Not guaranteed that highly scalable linear solvers keep their optimal properties for cut elements.


PETSc CG + AMG preconditioner on unfitted meshes

Poisson equation (weak scaling test with 5 meshes)

[Figure: CG iterations vs. DOFs (10^3 to 10^8), two panels: AMG + AgFEM; AMG + naive unfitted FEM*]

* Nitsche BCs + modified integration in cut cells


Why are linear solvers affected by cut cells?

Condition number estimates (Poisson Eq.)

(a) Body-fitted case

$k_2(A) \sim h^{-2}$

(b) Naive unfitted FEM

$k_2(A) \sim |\eta|^{-(2p+1-2/d)}$

"small cut cell problem"

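The small cut cell problem can be reproduced with a toy 1D model, sketched here under simplifying assumptions that are not taken from the talk: linear elements (p = 1, d = 1), two cells of size h, a strong Dirichlet condition at the left node, and a second cell that the domain covers only over a fraction η of its length. For this case the exponent 2p + 1 − 2/d equals 1, so the condition number is expected to grow like 1/η.

```python
import math


def cut_cell_condition_number(eta: float, h: float = 1.0) -> float:
    """2-norm condition number of a 2x2 stiffness matrix with one cut cell.

    Node 0 is fixed (Dirichlet), nodes 1 and 2 are free. The element
    matrix of the second cell is scaled by eta because only a fraction
    eta of that cell lies inside the domain.
    """
    a11 = (1.0 + eta) / h
    a12 = -eta / h
    a22 = eta / h
    trace = a11 + a22
    det = a11 * a22 - a12 * a12
    lam_max = 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))
    lam_min = det / lam_max  # avoids cancellation when eta is tiny
    return lam_max / lam_min


for eta in (1e-1, 1e-3, 1e-6):
    print(f"eta = {eta:.0e}  cond = {cut_cell_condition_number(eta):.3e}")
```

Shrinking η by a factor of 1000 increases the condition number by roughly the same factor, independently of h, which is the behaviour the estimate in (b) describes.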

Possible remedies

Fix the linear solver

Tailor your parallel solver to deal with $k_2(A) \sim |\eta|^{-(2p+1-2/d)}$

Example: [S. Badia, F. Verdugo. Robust and scalable domain decomposition solvers for unfitted finite element methods. Journal of Computational and Applied Mathematics (2018)].

Fix the linear system (this talk)

Enhance the unfitted FE method so that $k_2(A) \sim h^{-2}$

Use a standard scalable solver

Examples: CutFEM, AgFEM
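Why the conditioning matters for a Krylov solver can be illustrated with the classical conjugate gradient error bound, ‖e_k‖_A ≤ 2((√κ − 1)/(√κ + 1))^k ‖e_0‖_A. The sketch below only evaluates this worst-case bound; it does not run a solver, and `cg_iteration_bound` is a hypothetical helper name. With a preconditioner, κ should be read as the condition number of the preconditioned operator.

```python
import math


def cg_iteration_bound(kappa: float, tol: float = 1e-8) -> int:
    """Smallest k for which the classical CG bound drops below tol.

    The bound is 2 * rho**k with rho = (sqrt(kappa) - 1) / (sqrt(kappa) + 1),
    valid for a symmetric positive definite matrix with condition number kappa.
    """
    if kappa <= 1.0:
        return 1
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return math.ceil(math.log(tol / 2.0) / math.log(rho))


for kappa in (1e2, 1e4, 1e8):
    print(f"kappa = {kappa:.0e}  iteration bound = {cg_iteration_bound(kappa)}")
```

The bound grows roughly like √κ, so a condition number that depends on the cut location η can inflate iteration counts without any change in h.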


Agenda

1. The AgFEM method (serial case)

2. Parallel implementation

3. Performance of parallel AgFEM + AMG solvers


AgFEM method for the Poisson Eq.

$-\Delta u = f$ in $\Omega$

$u = u_D$ on $\partial\Omega$


Weak imposition of Dirichlet BCs

Nitsche’s Method

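Find $u_h \in V_h$ such that

$a_h(u_h, v_h) = l_h(v_h) \quad \forall v_h \in V_h$ ($v_h$ does not vanish on $\partial\Omega$!)

where

$$a_h(u_h, v_h) := \sum_{K \in \mathcal{T}_h^{\mathrm{act}}} \int_{K \cap \Omega} \nabla u_h \cdot \nabla v_h - \sum_{F \in \mathcal{T}_h^{\mathrm{act}} \cap \partial\Omega} \int_F (\nabla u_h \cdot n)\, v_h - \sum_{F \in \mathcal{T}_h^{\mathrm{act}} \cap \partial\Omega} \int_F u_h\, (\nabla v_h \cdot n) + \sum_{F \in \mathcal{T}_h^{\mathrm{act}} \cap \partial\Omega} \beta h_F^{-1} \int_F u_h\, v_h$$

$$l_h(v_h) := \sum_{K \in \mathcal{T}_h^{\mathrm{act}}} \int_{K \cap \Omega} f\, v_h + \sum_{F \in \mathcal{T}_h^{\mathrm{act}} \cap \partial\Omega} \beta h_F^{-1} \int_F u_D\, v_h - \sum_{F \in \mathcal{T}_h^{\mathrm{act}} \cap \partial\Omega} \int_F u_D\, (\nabla v_h \cdot n)$$

The key feature of AgFEM is the definition of the discrete space $V_h$.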


Starting point: "naive" FE space

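$V_h^{\mathrm{std}} := \{ u \in C^0(\Omega_{\mathrm{act}}) : u|_K \in Q_p(K) \ \forall K \in \mathcal{T}_h^{\mathrm{act}} \}$

[Figure: $\mathcal{T}_h^{\mathrm{act}}$, $\Omega_{\mathrm{act}}$ and $V_h^{\mathrm{std}}$]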


Aggregated FE space

Basic idea: improve conditioning by removing problematic DOFs

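$V_h^{\mathrm{agg}} := \{ u \in V_h : u_\times = \sum_{\bullet \in \mathrm{masters}(\times)} C_{\times\bullet}\, u_\bullet \ \forall \times \in P \}$

• well-posed DOFs

× problematic DOFs (P)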


Definition of constraints via cell aggregates

1. Generate cell aggregates (1 interior cell + several cut cells)

2. Define the DOF-to-root-cell map root(×) via the aggregates

3. Define constraints:

$u_\times = \sum_{\bullet \in \mathrm{dofs}(\mathrm{root}(\times))} \phi_\bullet^{\mathrm{root}(\times)}(x_\times)\, u_\bullet$
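As an illustration of step 3, assume a 1D root cell with linear (Q1) shape functions. The constraint coefficients are then the root cell's shape functions evaluated at the coordinate of the problematic DOF, which lies outside the root cell, so the values extrapolate. The function names are hypothetical and the 1D setting is a simplification.

```python
def q1_shape_functions_1d(x: float, x_left: float, h: float) -> list[float]:
    """Linear Lagrange shape functions of the cell [x_left, x_left + h] at x.

    x may lie outside the cell, in which case the values extrapolate.
    """
    xi = (x - x_left) / h
    return [1.0 - xi, xi]


def constrained_value(x_problematic: float, x_left: float, h: float,
                      root_dof_values: list[float]) -> float:
    """Value of a problematic DOF as a combination of the root cell DOFs."""
    coeffs = q1_shape_functions_1d(x_problematic, x_left, h)
    return sum(c * u for c, u in zip(coeffs, root_dof_values))


# Root cell [0, 1]; the problematic node sits at x = 2, one cell away.
coeffs = q1_shape_functions_1d(2.0, 0.0, 1.0)
# Root DOF values sampled from the linear function u(x) = 3x + 1.
u_cut = constrained_value(2.0, 0.0, 1.0, [1.0, 4.0])
print(coeffs, u_cut)
```

The coefficients sum to one and reproduce linear functions exactly, which is consistent with the method keeping the approximation order of the underlying space.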


Results for the unfitted aggregated FEM (Poisson Eq.)¹

$\kappa(A) \le c_1 h^{-2}$ (Condition number bound)

$\beta \le c_2 h^{-2}$ (Nitsche's penalty coef.)

$\|u - u_h\|_{H^1(\Omega)} \le c_3 h^{p}$ (Optimal convergence order)

$\|u - u_h\|_{L^2(\Omega)} \le c_4 h^{p+1}$ (Optimal convergence order)

and others (inverse/trace inequalities, bound of aggregate size, bound of the extended solution, ...)

¹ [Badia, Verdugo, Martín. The aggregated unfitted finite element method for elliptic problems. Comput. Methods Appl. Mech. Eng. (2018).]
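One common way to check rates such as h^p and h^(p+1) numerically is to compute the observed order from the errors on two successive meshes. The sketch below uses synthetic errors that follow c·h^q exactly; they are not results from the talk, and `observed_order` is an illustrative name.

```python
import math


def observed_order(h_coarse: float, err_coarse: float,
                   h_fine: float, err_fine: float) -> float:
    """Estimate q in err ~ c * h**q from two (h, error) pairs."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# Synthetic errors that follow c * h**q exactly (NOT data from the talk).
p = 2
mesh_sizes = [0.1, 0.05, 0.025]
h1_errors = [0.7 * h**p for h in mesh_sizes]        # expected order p
l2_errors = [0.3 * h**(p + 1) for h in mesh_sizes]  # expected order p + 1

for k in range(len(mesh_sizes) - 1):
    q_h1 = observed_order(mesh_sizes[k], h1_errors[k],
                          mesh_sizes[k + 1], h1_errors[k + 1])
    q_l2 = observed_order(mesh_sizes[k], l2_errors[k],
                          mesh_sizes[k + 1], l2_errors[k + 1])
    print(f"h = {mesh_sizes[k + 1]}: H1 order {q_h1:.2f}, L2 order {q_l2:.2f}")
```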


[Figure: log10(condest(A)), horizontal axis from 0.3 to 0.7; curves: p=1 standard, p=2 standard, p=1 aggr., p=2 aggr.]


Convergence test

[Figure: log10(error in energy norm) vs. log10(h), two panels: (a) 2D, (b) 3D; curves: p=1 standard, p=2 standard, p=1 aggr., p=2 aggr.; reference slopes 1 and 2]


Extension to the Stokes problem²

[Figure: velocity magnitude |u|, color scale from 0.0 to 75.0]

$-\Delta u + \nabla p = f$ in $\Omega$

$\nabla \cdot u = 0$ in $\Omega$

$u = 0$ on $\Gamma_D$

$(\nabla u - pI) \cdot n = g$ on $\Gamma_N$

² [Badia, Martín, Verdugo. Mixed aggregated finite element methods for the unfitted discretization of the Stokes problem. SIAM J. Sci. Comput., 40(6), 2018.]


[Figure: log10(condest(A)) vs. ℓ (0.3 to 0.7), two panels; curves: Aggregated, Standard]


[Figure: relative errors vs. log10(DOF^(1/d)), three panels; curves: Aggregated, Standard. Panel 1: log10(‖u_h − u‖_H1 / ‖u‖_H1), reference slope −2. Panel 2: log10(‖u_h − u‖_L2 / ‖u‖_L2), reference slope −3. Panel 3: log10(‖p_h − p‖_L2 / ‖p‖_L2), reference slope −2.]


Parallel mesh distribution

[Figure: (a) partition $D = \{D_1, D_2\}$; (b) view from $D_1$; (c) view from $D_2$]

Main phases to be parallelized:

• Cell Aggregation

• Imposition of constraints


Cell aggregation (serial)

[Figure: cell aggregation steps; legend: touched, untouched, aggregated]
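A minimal sketch of one way to build aggregates on a 2D Cartesian grid, assuming the following rule: interior cells start as touched roots, and in each pass every untouched cut cell with a touched face neighbor joins that neighbor's aggregate. The tie-breaking used here (first neighbor found) is a simplification, and the function name and data layout are illustrative only.

```python
def aggregate_cells(interior: set[tuple[int, int]],
                    cut: set[tuple[int, int]]) -> dict[tuple[int, int], tuple[int, int]]:
    """Map every active cell to the interior (root) cell of its aggregate."""
    root = {cell: cell for cell in interior}  # interior cells are their own root
    untouched = set(cut)
    while untouched:
        newly_aggregated = {}
        for (i, j) in sorted(untouched):
            # Only cells touched in a previous pass can absorb a cut cell.
            for neighbor in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if neighbor in root:
                    newly_aggregated[(i, j)] = root[neighbor]
                    break
        if not newly_aggregated:
            raise ValueError("some cut cells are not connected to an interior cell")
        root.update(newly_aggregated)
        untouched.difference_update(newly_aggregated)
    return root


interior_cells = {(0, 0), (1, 0)}
cut_cells = {(2, 0), (3, 0), (0, 1)}
roots = aggregate_cells(interior_cells, cut_cells)
print(roots)
```

In this example cell (3, 0) has no interior neighbor, so it is only absorbed in the second pass, through the already aggregated cut cell (2, 0).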


Aggregates in 3D


Cell aggregation (parallel)

[Figure: (a) Step 1. (b) Step 2. (c) Comm. (d) Step 3.]

✓ Standard nearest neighbor communication to determine root cells


Parallel imposition of constraints

[Figure: subdomains $D_{s'}$, $D_{s''}$, $D_s$]

✗ Subdomain-local constraints not even possible in some cases

✓ At the end, only standard neighbor communication required


Weak scaling test setup

• Poisson eq.

• AgFEM vs "naive" unfitted FEM

• Linear solver: PCG from PETSc

• Preconditioner: smoothed aggregation AMG from PETSc (GAMG)

• Up to 16K cores and 1000M background cells

Computed at Mare Nostrum 4


Number of PCG iterations (weak scaling)

[Figure: CG iterations vs. DOFs (10^3 to 10^8), three panels: (a) Popcorn, (b) Spiral, (c) Swiss Cheese; curves: agg (load 1), agg (load 2), agg (load 3), std (load 1), std (load 2), std (load 3)]


Computational time (secs) AgFEM stages (weak scaling)

[Figure: wall clock time [s] vs. DOFs (10^5 to 10^8), three panels: (a) Popcorn, (b) Spiral, (c) Swiss Cheese]

Legend:

• Cell aggregation (Alg. 2)

• Path reconstruction (Algs. 3 and 4)

• Import data from root cells (Alg. 5)

• Setup constraints (Sect. 3.8)

• Setup of local DOFs ids (Sect. 3.3)

• Setup of global DOFs ids (Sect. 3.7)

• FE integration + assembly (Table 2)


Computational time (secs) AMG solver (weak scaling)

[Figure: wall clock time [s] vs. DOFs, three panels: (a) Popcorn, (b) Spiral, (c) Swiss Cheese; series: linear solver setup, linear solver run]

• Setup degradation (11.94x CPU time for 4,448x larger problem)

• Similar for standard FEM in a box (2.50x for 355.26x)
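The two pairs of ratios above can be turned into simple scaling metrics. The sketch assumes that in an ideal weak scaling test the time stays constant as the problem grows, and fits a power law time ~ N^α to each pair; the helper names are hypothetical.

```python
import math


def weak_scaling_efficiency(time_ratio: float) -> float:
    """Ideal weak scaling keeps the time constant, so efficiency is 1 / ratio."""
    return 1.0 / time_ratio


def growth_exponent(time_ratio: float, size_ratio: float) -> float:
    """Exponent alpha such that time_ratio == size_ratio ** alpha."""
    return math.log(time_ratio) / math.log(size_ratio)


# (time ratio, problem size ratio) as quoted on the slide.
agfem = (11.94, 4448.0)  # solver setup on the unfitted problem
box = (2.50, 355.26)     # standard FEM in a box

for name, (t_ratio, n_ratio) in (("AgFEM", agfem), ("box", box)):
    print(f"{name}: efficiency = {weak_scaling_efficiency(t_ratio):.3f}, "
          f"alpha = {growth_exponent(t_ratio, n_ratio):.3f}")
```

Comparing the exponents rather than the raw ratios accounts for the two tests spanning different ranges of problem size.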


Conclusions

✓ Embedded FEM enables scalable octree-based meshes

✗ ... but can destroy the scalability of linear solvers

✓ AgFEM recovers the optimal scaling of linear solvers

✓ ... while keeping the optimal discretization order.


For more details, see papers:

F. Verdugo, A.F. Martín, S. Badia. Distributed-memory parallelization of the aggregated unfitted finite element method. CMAME, 357, 2019.

S. Badia, A.F. Martín, F. Verdugo. Mixed aggregated finite element methods for the unfitted discretization of the Stokes problem. SIAM J. Sci. Comput., 40(6), 2018.

S. Badia, F. Verdugo, A.F. Martín. The aggregated unfitted finite element method for elliptic problems. CMAME, 336, 2018.

https://github.com/fempar/fempar
