geometric multigrid for scalable dpg solves in camellia · 2017. 7. 27. · rapid speci cation of...

32
Exceptional service in the national interest Copper Mountain March 26-30, 2017 Geometric Multigrid for Scalable DPG Solves in Camellia Nathan V. Roberts and Jesse Chan [email protected], [email protected] Sandia National Laboratories and Rice University Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energyʼs National Nuclear Security Administration under contract DE-AC04-94AL85000.

Upload: others

Post on 16-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

E x c e p t i o n a l s e r v i c e i n t h e n a t i o n a l i n t e re s t

Copper Mountain March 26-30, 2017

Geometric Multigrid for Scalable DPGSolves in Camellia

Nathan V. Roberts and Jesse [email protected], [email protected]

Sandia National Laboratories and Rice University

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energyʼs National Nuclear Security Administration under contract DE-AC04-94AL85000.

Page 2: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Outline

1 Motivation/Introduction: DPG, Camellia, HPCCamellia: Design GoalsDPG and HPC

2 Our Geometric Multigrid Approach

3 Selected Numerical Results

4 Conclusions

Copper Mountain March 26-30, 2017 2

Page 3: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

DPG 6= DG

Copper Mountain March 26-30, 2017 3

Page 4: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

DPG in Brief

DPG approach:� Petrov-Galerkin: test and trial spaces differ� discontinuous test and trial spaces� optimal test functions computed on the fly so that

(voptei

, v)V = b(ei, v) ∀v ∈ V� key choice: which norm to use on the test space?

DPG features:� automatic stability even on coarse mesh� SPD/HPD stiffness matrix =⇒ can use (P)CG� discontinuous test space =⇒ optimal test solve is local� Error in uh is minimized in the energy norm

||uh||E = supv∈V

b(uh, v)

||v||V= ||b(uh, · )||V ′

� Can measure the error in the energy norm to drive adaptivity.

Copper Mountain March 26-30, 2017 4

Page 5: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

DPG in Brief: Concept Map

inf-sup stability

optimal test functions

discontinuous test space

computationaltractability

ultraweakformulation

min. residual in

� · �E

graph normon test space

Can we make� · �U � � · �E ?

∗ Note: we approximate the infinite-dimensional test space by taking the polynomialorder k for the trial and “enriching” it somewhat: ktest = ktrial + ∆k—in all thatfollows, ∆k = 1, 2, or 3.

Copper Mountain March 26-30, 2017 5

Page 6: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Building the ultraweak formulation

PDE

�� = f

First-Order System

� · � = f

� ��� = 0

Integration by Parts

(� · n, v)�h� (�,�v)�h

= (f, v)�h

(�, q)�h+ (�, q · n)�h

� (�,� · q)�h= 0

Ultraweak (DPG) Variational Formulation

( ��n, v)�h� (�,�v)�h

+(�, q)�h+ (��, qn)�h

� (�,� · q)�h= (f, v)�h

b((�,�, ��, ��n), (v, q)) = (f, v)�h

“b(u, v) = l(v)”

Copper Mountain March 26-30, 2017 6

Page 7: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

DPG Applications to Date

DPG is a general framework, and has been successfully applied to ahost of PDE problems, including:

� convection-dominateddiffusion

� acoustics/wave propagation� linear elasticity� Maxwell’s equations

(cloaking problem)� Burgers’ equations� Euler equations� compressible Navier-Stokes� Stokes� incompressible Navier-Stokes� Oldroyd-B Flow

flow past a cylinder, Re = 40

1Bold items have Camellia-based implementations.

Copper Mountain March 26-30, 2017 7

Page 8: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Camellia1

Design Goal: make DPG research and experimentation as simple aspossible, while maintaining computational efficiency and scalability.

Core features:

� rapid specification of new formulations (FEniCS-inspired)

� arbitrary element types (simplices and hypercubes provided)

� h- and p-adaptivity (with hanging nodes)

� trace and field unknowns (discontinuous and C0)

� scalability via MPI (take advantage of parallelism in optimal testfunction determination)

� implemented in C++, built atop Trilinos

1NVR. Camellia: A software framework for discontinuous Petrov-Galerkin methods.Computers & Mathematics with Applications, 2014.

Copper Mountain March 26-30, 2017 8

Page 9: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Suitability of DPG for HPC

DPG has several attractive features for HPC:

� locality: optimal test functions embarrassingly parallel

� intensity: high-order computations take advantage of “free” flops

� automaticity: robust adaptivity means less human involvement

Copper Mountain March 26-30, 2017 9

Page 10: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Multigrid: V-cycle

Restrict

Restrict Prolongate

Prolongate

Solve

Multigrid choices:� V-cycle� multiplicative smoothing (accelerates convergence at cost of extra

residual computation).� smoother is damped; see our arXiv report for details.� Prolongation and smoothing details follow. . .

Copper Mountain March 26-30, 2017 10

Page 11: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Multigrid: Our Prolongation Operators

The basic rule for the prolongation operator P is� A solution that is exact on the coarse mesh should also be a

solution on the fine mesh when prolongated.

For p-multigrid, this is straightforward. What about our traces withh-multigrid?

bu

bu

bu

bu

u

In particular, for the DPG traces under h-refinement:� some traces do not exist in the coarse mesh;� we define these in terms of the fields (u = tr(u));� this is further complicated by static condensation: we must

reconstruct the fields.

Copper Mountain March 26-30, 2017 11

Page 12: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Multigrid: Our Prolongation Operators

The basic rule for the prolongation operator P is� A solution that is exact on the coarse mesh should also be a

solution on the fine mesh when prolongated.

For p-multigrid, this is straightforward. What about our traces withh-multigrid?

bu

bu

bu

bu

u

In particular, for the DPG traces under h-refinement:� some traces do not exist in the coarse mesh;� we define these in terms of the fields (u = tr(u));� this is further complicated by static condensation: we must

reconstruct the fields.

Copper Mountain March 26-30, 2017 12

Page 13: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Multigrid: Our Smoothers

For p-multigrid smoothers, we use “minimal overlap” additive Schwarz:

For h-multigrid smoothers, we use 1-overlap additive Schwarz:

Copper Mountain March 26-30, 2017 13

Page 14: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Multigrid: Our Smoothers

For p-multigrid smoothers, we use “minimal overlap” additive Schwarz:

For h-multigrid smoothers, we use 1-overlap additive Schwarz:

Copper Mountain March 26-30, 2017 14

Page 15: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Tests for Stokes

Our coarse grids:

� p-multigrid: kcoarse = kfine/2; when kfine = 1,kcoarse = 0.

� h-multigrid: coarse mesh of same degree as fine, once-coarsenedrelative to fine mesh.

DPG choices:

� we use static condensation and the graph norm;

� we use ∆k = d = 2 or 3 (but little difference for ∆k = 1);

� for H1 traces, we enrich the corresponding fields (i.e., if Stokesfield variables have order 3, then both the velocity traces and thevelocity fields will have order 4).

Our exact solution:

� u =(−ex y cosy+ siny, ex y siny+ ez y cosy,−ez(cosy− y siny))

� p = 2 ex siny+ 2 ez cosy

Copper Mountain March 26-30, 2017 15

Page 16: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Tests for Stokes

Our coarse grids:

� p-multigrid: kcoarse = kfine/2; when kfine = 1,kcoarse = 0.

� h-multigrid: coarse mesh of same degree as fine, once-coarsenedrelative to fine mesh.

DPG choices:

� we use static condensation and the graph norm;

� we use ∆k = d = 2 or 3 (but little difference for ∆k = 1);

� for H1 traces, we enrich the corresponding fields (i.e., if Stokesfield variables have order 3, then both the velocity traces and thevelocity fields will have order 4).

Our exact solution:

� u =(−ex y cosy+ siny, ex y siny+ ez y cosy,−ez(cosy− y siny))

� p = 2 ex siny+ 2 ez cosy

Copper Mountain March 26-30, 2017 16

Page 17: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Tests for Stokes

Our coarse grids:

� p-multigrid: kcoarse = kfine/2; when kfine = 1,kcoarse = 0.

� h-multigrid: coarse mesh of same degree as fine, once-coarsenedrelative to fine mesh.

DPG choices:

� we use static condensation and the graph norm;

� we use ∆k = d = 2 or 3 (but little difference for ∆k = 1);

� for H1 traces, we enrich the corresponding fields (i.e., if Stokesfield variables have order 3, then both the velocity traces and thevelocity fields will have order 4).

Our exact solution:

� u =(−ex y cosy+ siny, ex y siny+ ez y cosy,−ez(cosy− y siny))

� p = 2 ex siny+ 2 ez cosy

Copper Mountain March 26-30, 2017 17

Page 18: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Results: Stokes 2D p-Multigrid

2 4 8 16 32

14

18

23

Mesh Width (# Elements)

Iter

atio

nC

oun

t

2D p-Multigrid

Stokes p = 1 (conforming)

Stokes p = 2 (conforming)

Stokes p = 4 (conforming)

Copper Mountain March 26-30, 2017 18

Page 19: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Results: Stokes 2D h-Multigrid

4 8 16 32

14

15

16

17

Mesh Width (# Elements)

Iter

atio

nC

oun

t

Stokes 2D h-Multigrid

Stokes p = 1 (conforming)

Stokes p = 2 (conforming)

Stokes p = 4 (conforming)

Copper Mountain March 26-30, 2017 19

Page 20: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Results: Stokes 3D p-Multigrid

2 4 8 16

22

27

42

45

Mesh Width (# Elements)

Iter

atio

nC

oun

t

3D p-Multigrid

Stokes p = 1 (conforming)

Stokes p = 2 (conforming)

Copper Mountain March 26-30, 2017 20

Page 21: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Two-Grid Results: Stokes 3D h-Multigrid

4 8 16

20

222324

2627

3132

Mesh Width (# Elements)

Iter

atio

nC

oun

t

Stokes 3D h-Multigrid

Stokes p = 1 (conforming)

Stokes p = 2 (conforming)

Copper Mountain March 26-30, 2017 21

Page 22: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Approach for More than Two Grids

k=1

k=1

k=1

k=1

k=2

k=4

Copper Mountain March 26-30, 2017 22

Page 23: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Lid-Driven Cavity Flow

A classical challenge problem for Stokes flow is lid-driven cavity flow.

u1 = 1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Left: schematic of the flow. Right: streamlines. We will start with a2× 2, k = 4 mesh, and perform automatic refinements, using ourmultigrid preconditioner at each refinement step.

Copper Mountain March 26-30, 2017 23

Page 24: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Lid-Driven Cavity Flow: Meshes for Sixth Refinement

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

mesh

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

meshk = 1 k = 1 k = 1 k = 1

k = 1 k = 1 k = 1 k = 4

Top left to bottom right: sequence of meshes for multigrid operatorfor refinement 6. From coarsest mesh, refine first in h, then jump tofine k.

Copper Mountain March 26-30, 2017 24

Page 25: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Lid-Driven Cavity Flow: Results

Ref. # hminhmaxhmin

Elements Energy Error Zero Guess Prev.

0 1/2 1 4 7.27e-01 15 151 1/4 2 10 6.30e-01 22 202 1/8 4 16 5.82e-01 28 153 1/16 8 22 2.06e-01 41 204 1/32 16 28 7.54e-02 46 215 1/64 32 34 5.90e-02 33 186 1/128 64 70 3.01e-02 55 217 1/256 128 88 1.55e-02 63 198 1/512 256 106 8.63e-03 70 15

Table: Stokes cavity flow: iteration counts with k = 4 to achieve a residualtolerance of 10−6, starting from a zero initial guess or with the solution fromthe previous refinement step.

Copper Mountain March 26-30, 2017 25

Page 26: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

A Scaling Test

For a challenging scaling test, take a 32× 32× 32 (32,768-element)quartic, conforming mesh with the same Stokes problem as before.This has 7.6× 107 dofs (1.4× 107 trace dofs). Notes:

� Use static condensation

� Coarse mesh has 512 constant elements

� 113 iterations to converge

� Note: mesh initialization involves communication costs that donot scale (takes additional 35 seconds on 32K ranks compared to4K)

� Use 8 MPI ranks per BG/Q node (2 GB/rank)

Copper Mountain March 26-30, 2017 26

Page 27: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

A Scaling Test

For a challenging scaling test, take a 32× 32× 32 (32,768-element)quartic, conforming mesh with the same Stokes problem as before.This has 7.6× 107 dofs (1.4× 107 trace dofs).

4096 8192 16384 32768

693

136

101

MPI ranks (2 BG/Q cores each)

Tim

e(s

econ

ds)

Stokes Strong Scaling, 32× 32× 32 quartic elements

solvetotal

total (ideal)

Total time to solution achieves 64% of the ideal 8x speedup (83% ifmesh initialization costs neglected).

Copper Mountain March 26-30, 2017 27

Page 28: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Some timing details

0 100 200 300 400 500 600 700

4096 ranks

8192 ranks

16384 ranks

32768 ranks

Time in secondsMesh Init. GMG Init. Solve

Timing detail for the 32× 32× 32 (32,768-element) quartic,conforming Stokes solve (times in seconds).

“GMG Init.” is the time to construct the GMG data structures at each level;

includes setting up Solution objects at each level but not constructing prolongation

or smoothing operators (these are included in “Solve.”)

Copper Mountain March 26-30, 2017 28

Page 29: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Some timing details: Solve

0 100 200 300 400 500 600 700

4096 ranks

8192 ranks

16384 ranks

32768 ranks

Time in secondsLocal Stiffness Prolongation Construction

Smoother Construction Smoother Application

Other Solve

Timing detail for the 32× 32× 32 (32,768-element) quartic,conforming Stokes solve (times in seconds).

Copper Mountain March 26-30, 2017 29

Page 30: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Conclusions

Summary results:

� Our strategy works well, both in iteration counts and in computetime, to scale the Stokes problems we have considered.

� For Navier-Stokes, this works best for smaller Reynolds numbers(see report for details). Likely need something more specializedfor high Reynolds numbers.

Resources:

� Camellia is available under a BSD License atbitbucket.org/nateroberts/Camellia

� Manual available as Argonne Tech Report

Copper Mountain March 26-30, 2017 30

Page 31: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

Thank you for your attention!

Questions?

For more details:NVR. Camellia v1.0 manual: Part I. Technical Report ANL/ALCF-16/3, Argonne NationalLaboratory, 2016.NVR and Chan, J. A geometric multigrid preconditioning strategy for DPG systemmatrices. ArXiv e-prints, August 2016.

Copper Mountain March 26-30, 2017 31

Page 32: Geometric Multigrid for Scalable DPG Solves in Camellia · 2017. 7. 27. · rapid speci cation of new formulations (FEniCS-inspired) arbitrary element types (simplices and hypercubes

NVR.Camellia: A software framework for discontinuous Petrov-Galerkin methods.Computers & Mathematics with Applications, 2014.

NVR.Camellia v1.0 manual: Part I.Technical Report ANL/ALCF-16/3, Argonne National Laboratory, 2016.

NVR and Chan, J.A geometric multigrid preconditioning strategy for DPG system matrices.ArXiv e-prints, August 2016.

Copper Mountain March 26-30, 2017 32