Multigrid Algorithms for Three-Dimensional RANS Calculations - The SUmb Solver Juan J. Alonso Department of Aeronautics & Astronautics Stanford University CME342 Lecture 14 May 21, 2012

Page 1: Multigrid Algorithms for Three-Dimensional RANS ...adl.stanford.edu/cme342/Lecture_Notes_files/lecture14-12.pdf · Parallel Implementation - Main Issues • In our algorithm, latency

Multigrid Algorithms for Three-Dimensional RANS Calculations - The SUmb Solver

Juan J. Alonso

Department of Aeronautics & Astronautics Stanford University

CME342 Lecture 14 May 21, 2012

Page 2

Outline

•  Non-linear multigrid algorithm - Review

•  FAS multigrid + modified Runge-Kutta scheme

•  Software and parallel implementation

•  SUmb solver and results

•  Unsteady algorithms

•  Future

Page 3

Non-linear Multigrid Algorithm

Solve the non-linear equation

on a mesh with spacing h

by finding a correction to the current iterate

with some algebra

Page 4

Non-linear Multigrid Algorithm

and after smoothing the right hand side we can transfer the equation, solution, and residual to a coarser mesh

After relaxation in the coarse mesh, the coarse grid correction is

which can be interpolated to the finer mesh and the procedure repeated until convergence (possibly recursively)

or
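The equations on the two slides above are not present in this transcript. As a hedged sketch, the standard full approximation storage (FAS) form of the steps described can be written as follows. The notation is assumed and may differ from the original slides: $L_h$ is the non-linear operator on the mesh with spacing $h$, $\tilde{u}_h$ the current iterate, $v_h$ the correction, $r_h$ the residual, $I_h^{2h}$ and $\hat{I}_h^{2h}$ the solution and residual restriction operators, and $I_{2h}^{h}$ the interpolation operator.

```latex
\begin{align*}
  % Fine-mesh problem and correction to the current iterate
  L_h(u_h) &= f_h, \qquad u_h = \tilde{u}_h + v_h \\
  % "With some algebra": the residual equation
  L_h(\tilde{u}_h + v_h) - L_h(\tilde{u}_h) &= f_h - L_h(\tilde{u}_h) \equiv r_h \\
  % Equation, solution, and residual transferred to the coarser mesh
  L_{2h}(u_{2h}) &= L_{2h}\!\left(I_h^{2h}\tilde{u}_h\right) + \hat{I}_h^{2h} r_h \\
  % Coarse grid correction after relaxation on the coarse mesh
  v_{2h} &= u_{2h} - I_h^{2h}\tilde{u}_h \\
  % Interpolation of the correction back to the finer mesh
  \tilde{u}_h &\leftarrow \tilde{u}_h + I_{2h}^{h} v_{2h}
\end{align*}
```

Because the coarse problem is posed for the full solution $u_{2h}$ rather than for the correction alone, the same non-linear relaxation can be reused on every level.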

Page 5

Modified Runge-Kutta Time Stepping

Semi-discrete NS equations

Convective and dissipative residuals

Modified Runge-Kutta scheme

Page 6

Modified Runge-Kutta Time Stepping

Modified Runge-Kutta scheme

•  Coefficients for Q and D are created separately to

–  Minimize CPU time

–  Improve convergence properties

•  Time accuracy is lost...do we care?
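The scheme itself is not reproduced in this transcript, so the following is only a minimal sketch of one commonly published form of the modified Runge-Kutta step, assuming the semi-discrete equation is written as dw/dt + Q(w) + D(w) = 0. The function name, the list-based state, and the sign convention are assumptions for illustration; the five-stage coefficients are the widely quoted Jameson-type values, with the dissipation re-evaluated only on stages 1, 3, and 5 and blended with its previous value.

```python
# Sketch of a Jameson-type modified Runge-Kutta step (illustrative names).
# Assumed model: dw/dt + Q(w) + D(w) = 0, Q convective, D dissipative.
ALPHAS = (0.25, 1.0 / 6.0, 0.375, 0.5, 1.0)  # stage coefficients
BETAS = (1.0, 0.0, 0.56, 0.0, 0.44)          # dissipation blending factors


def modified_rk_step(w, dt, conv, diss, alphas=ALPHAS, betas=BETAS):
    """Advance the state w (list of floats) by one pseudo-time step dt.

    conv(w) and diss(w) return the convective and dissipative residuals
    as lists of the same length as w.
    """
    w0 = list(w)      # state at the start of the step
    wk = list(w)      # state entering the current stage
    d_prev = None     # blended dissipation from the previous stage
    for alpha, beta in zip(alphas, betas):
        q = conv(wk)  # convective residual is re-evaluated on every stage
        if beta == 0.0 and d_prev is not None:
            d = d_prev  # dissipation frozen: no new evaluation (saves work)
        else:
            d_new = diss(wk)
            if d_prev is None:
                d = d_new
            else:
                d = [beta * dn + (1.0 - beta) * dp
                     for dn, dp in zip(d_new, d_prev)]
        wk = [w0i - alpha * dt * (qi + di)
              for w0i, qi, di in zip(w0, q, d)]
        d_prev = d
    return wk
```

For example, `modified_rk_step([1.0], 0.1, lambda w: list(w), lambda w: [0.0] * len(w))` advances the scalar model problem dw/dt = -w by one step. Skipping the dissipation on even stages is where the CPU saving comes from, and the blending factors shape the stability region; the price is that the update is no longer a time-accurate integrator.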

Page 7

FAS Multigrid Algorithm

•  Coarse grid driven by residuals transferred from fine mesh

•  We impose boundary conditions on every mesh

•  1st order artificial dissipation on coarser meshes

Volume/Area-weighted solution coarsening

Coarse grid residual forcing term

Modified R-K in coarse grid
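The three expressions labelled on this slide are missing from the transcript. A hedged sketch of the forms usually given for this FAS scheme is shown below; the notation is assumed rather than taken from the slides. $V_h$ denotes a fine-cell volume (area in 2D), the sums run over the fine cells that make up one coarse cell, $R$ is the residual, $P_{2h}$ the forcing term, and $\alpha_k$ the Runge-Kutta stage coefficient.

```latex
\begin{align*}
  % Volume/area-weighted solution coarsening
  w_{2h}^{(0)} &= \frac{\sum V_h\, w_h}{\sum V_h} \\
  % Coarse grid residual forcing term
  P_{2h} &= \sum R_h(w_h) - R_{2h}\!\left(w_{2h}^{(0)}\right) \\
  % Modified R-K stage on the coarse grid, driven by the forcing term
  w_{2h}^{(k)} &= w_{2h}^{(0)} - \alpha_k\, \Delta t \left( R_{2h}^{(k-1)} + P_{2h} \right)
\end{align*}
```

On the first coarse-grid stage $R_{2h}^{(0)} + P_{2h} = \sum R_h(w_h)$, so the coarse grid is driven entirely by the residuals transferred from the fine mesh and makes no change once the fine-mesh residual vanishes.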

Page 8

FAS Multigrid Algorithm

•  Full-coarsened multigrid

•  V or W cycles of arbitrary depth

•  Bi-linear or tri-linear interpolation of solution to finer meshes
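A rough cost estimate helps when choosing between V and W cycles. The sketch below counts only relaxation work, assumes one relaxation sweep per level visit, assumes the W cycle visits level $k$ (with $k = 0$ the finest) $2^k$ times, and measures work in units of one fine-mesh sweep. With full coarsening each coarser level has $1/4$ of the cells in 2D and $1/8$ in 3D.

```latex
\begin{align*}
  % V cycle: each level visited once
  W_V^{2D} &= 1 + \tfrac14 + \tfrac1{16} + \dots < \tfrac43, &
  W_V^{3D} &= 1 + \tfrac18 + \tfrac1{64} + \dots < \tfrac87 \\
  % W cycle: level k visited 2^k times
  W_W^{2D} &= 1 + \tfrac24 + \tfrac4{16} + \dots < 2, &
  W_W^{3D} &= 1 + \tfrac28 + \tfrac4{64} + \dots < \tfrac43
\end{align*}
```

Under these assumptions a 3D W cycle of any depth costs at most a third more than a single fine-mesh sweep, which is why deep cycles are affordable.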

Page 9

Additional Convergence Acceleration Schemes

•  Local time stepping

•  Implicit Residual Smoothing

•  Enthalpy Damping

•  Others...

–  Block-Jacobi preconditioner

–  Low-speed preconditioning

–  J-coarsening or semi-coarsening
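As an illustration of one of these techniques, implicit residual smoothing replaces the residual R by a smoothed value R̄ that satisfies (1 - ε δ²) R̄ = R along each coordinate direction, which is a tridiagonal system. The sketch below solves it along one line with the Thomas algorithm. The function name and the boundary closure (which keeps a constant residual unchanged) are assumptions for illustration, not details taken from the slides.

```python
def smooth_residual(r, eps):
    """Solve -eps*rb[i-1] + (1 + 2*eps)*rb[i] - eps*rb[i+1] = r[i] on one line.

    r is a list of residuals along a grid line, eps the smoothing coefficient.
    The end rows use a (1 + eps) diagonal, an assumed closure that leaves a
    constant residual unchanged and preserves the sum of the residuals.
    """
    n = len(r)
    if n < 2:
        return list(r)
    diag = [1.0 + 2.0 * eps] * n
    diag[0] = diag[-1] = 1.0 + eps
    off = -eps  # both off-diagonals are constant

    # Thomas algorithm: forward elimination
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = r[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - off * c[i - 1]
        c[i] = off / m
        d[i] = (r[i] - off * d[i - 1]) / m

    # Back substitution
    rb = [0.0] * n
    rb[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        rb[i] = d[i] - c[i] * rb[i + 1]
    return rb
```

Calling `smooth_residual([0.0, 0.0, 1.0, 0.0, 0.0], 0.6)` spreads the spike over its neighbours while keeping its sum, which is what allows a larger pseudo-time step to be taken with the smoothed residual.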

Page 10

FAS Multigrid Algorithm

C     ******************************************************************
C     *                                                                *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
      ……….
      DO N=1,4
        JJ = 1
        DO J=2,JL,2
          JJ = JJ +1
          II = 1
          DO I=2,IL,2
            II = II +1
            WWR(II,JJ,N) = (DW(I,J,N)*VOL(I,J) +DW(I+1,J,N)*VOL(I+1,J)
     .        +DW(I,J+1,N)*VOL(I,J+1) +DW(I+1,J+1,N)*VOL(I+1,J+1))/
     .        (VOL(I,J)+VOL(I+1,J)+VOL(I,J+1)+VOL(I+1,J+1))
          END DO
        END DO
      END DO
      ……….

C     ******************************************************************
C     *                                                                *
C     *   COMPLETE THE CALCULATION OF THE MGRID FORCING TERMS          *
C     *   ON FIRST ENTRY TO ANY COARSER MESH DURING THE CYCLE          *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
C
      DO 32 J=2,JL
      DO 32 I=2,IL
        WR(I,J,N) = FCOLL*WR(I,J,N) -DW(I,J,N)
   32 CONTINUE
C
C     ADD THE MULTIGRID FORCING TERMS TO THE RESIDUALS
C
   33 DO 34 J=2,JL
      DO 34 I=2,IL
        DW(I,J,N) = DW(I,J,N) +WR(I,J,N)
   34 CONTINUE
   40 CONTINUE

•  Implementation follows description of FAS algorithm closely

•  Most of the code is unaware of what grid level it is working on

•  Pointer kept for current, finer, and coarser grid levels
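For readers less familiar with fixed-form Fortran, the two operations in the excerpt above can be sketched in Python. This is an illustrative restatement, not the solver's code: the function names, the 0-based indexing, and the nested-list layout are assumptions, and halo cells are ignored.

```python
def restrict_volume_weighted(fine, vol):
    """Volume-weighted average of each 2x2 group of fine cells.

    fine and vol are nested lists of shape [ni][nj] with ni, nj even;
    the result has shape [ni // 2][nj // 2].
    """
    ni, nj = len(fine), len(fine[0])
    coarse = []
    for i in range(0, ni, 2):
        row = []
        for j in range(0, nj, 2):
            cells = [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]
            num = sum(fine[a][b] * vol[a][b] for a, b in cells)
            den = sum(vol[a][b] for a, b in cells)
            row.append(num / den)
        coarse.append(row)
    return coarse


def forcing_term(collected, coarse_residual, fcoll=1.0):
    """Forcing term: scaled collected fine residual minus coarse residual.

    Mirrors WR = FCOLL*WR - DW in the excerpt; inputs are flat lists.
    """
    return [fcoll * c - r for c, r in zip(collected, coarse_residual)]


def add_forcing(coarse_residual, forcing):
    """Add the forcing term to the coarse residual (DW = DW + WR)."""
    return [r + f for r, f in zip(coarse_residual, forcing)]
```

Immediately after the transfer, `add_forcing(dw, forcing_term(wr, dw))` returns the collected fine residual `wr` when `fcoll` is 1, so the first coarse-mesh update is driven purely by the fine-mesh residual.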

Page 11

Page 12

Is this good enough?

•  Two-dimensional Euler results converge with an average residual contraction ratio of ~ 0.65 (almost “textbook” multigrid)

•  Three-dimensional Euler calculations also converge with an average contraction ratio ~ 0.75

•  Turbulent NS calculations with integration to the wall usually converge at ~ 0.98-0.99

•  What has happened to our “textbook” multigrid?
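To put these contraction ratios in perspective, the short sketch below estimates how many multigrid cycles are needed to reduce the residual by a given number of orders of magnitude, assuming the residual shrinks by a constant factor every cycle. The six-order target used in the note that follows is an illustrative choice, not a figure from the slides, and the function name is hypothetical.

```python
import math


def cycles_needed(contraction_ratio, orders):
    """Smallest n such that contraction_ratio**n <= 10**(-orders)."""
    return math.ceil(orders * math.log(10.0) / -math.log(contraction_ratio))
```

For a six-order reduction this gives 33 cycles at a ratio of 0.65, 49 at 0.75, 684 at 0.98, and 1375 at 0.99, so the turbulent Navier-Stokes cases cost roughly 15 to 30 times more cycles than the 3D Euler case.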

Page 13

Parallel Implementation

•  For cell-centered schemes such as the one in SUmb, parallel implementations are “straightforward”

•  Chop up the domain and distribute it “evenly” among processors (using Metis and a graph representation of the multiblock mesh)

•  Whole blocks get assigned to processors

•  Possibility of block splitting to improve load balancing

•  Double halo approach to exchange information

Page 14

Parallel Implementation

Page 15

Parallel Implementation - Main Issues

•  In our algorithm, bandwidth is important since we must communicate

–  At the end of each stage in the mod R-K sequence

–  At all multigrid levels

•  Double level halo is necessary on each communication

•  Double precision messages (8 words = 5 cons. vars + 2 turbulence vars + pressure)

•  Non-blocking receives and sends used throughout the code

•  Not the most stringent requirement

Page 16

Parallel Implementation - Main Issues

•  In our algorithm, latency is VERY important, since its importance grows particularly at the coarse mesh levels

–  Fine meshes: 95% of communication cost is bandwidth

–  Coarsest meshes: 65% of communication cost is latency

•  Only way to avoid latency cost is to communicate less often. We have tried this and have found that the improvement in parallel performance is more than offset by the degradation in multigrid convergence.

•  Bottom line: for good multigrid convergence in parallel you will need a high-performance network... 100BaseT switches will not cut it beyond 8 processors.
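The shift from bandwidth-bound to latency-bound communication can be illustrated with a simple cost model in which each halo message costs a fixed latency plus its size divided by the bandwidth. The sketch below is only a model under stated assumptions: the function names are hypothetical, the message size uses the double halo and the 8 double-precision words per cell mentioned in the slides, and the latency and bandwidth values are left to the caller because no network figures are given in the source.

```python
def halo_message_bytes(n_face_cells, halo_layers=2, words_per_cell=8,
                       bytes_per_word=8):
    """Bytes sent across one block face: cells x layers x words x 8 bytes."""
    return n_face_cells * halo_layers * words_per_cell * bytes_per_word


def latency_fraction(n_face_cells, latency_s, bandwidth_bytes_per_s):
    """Fraction of one message's cost that is due to latency."""
    transfer_s = halo_message_bytes(n_face_cells) / bandwidth_bytes_per_s
    return latency_s / (latency_s + transfer_s)
```

With full coarsening, a block face of 64 x 64 cells on the fine mesh shrinks to 4 x 4 cells four levels down, so the message becomes 256 times smaller while the latency term stays fixed. Evaluating `latency_fraction` for 4096 and then 16 face cells with any fixed latency and bandwidth shows the fraction rising toward 1 on the coarse levels.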

Page 17

Parallel Implementation - Main Issues

•  In the literature, there are options for vertical and horizontal multigrid

–  How often does one communicate at the coarser levels of the multigrid sequence?

–  Only on the fine mesh? At all mesh levels? Sometimes at the coarse levels?

•  Most researchers now agree that:

–  A parallel multigrid implementation that mimics the convergence history of a serial implementation is the best approach

–  A parallel performance hit is taken (but it is outweighed by the improved convergence)

–  Some uses of multigrid as a preconditioner can ameliorate the parallel performance impact

Page 18

Multigrid Semi-coarsening / J-coarsening

•  There are indications that something can be done about the lack of convergence of multigrid methods in RANS calculations

•  Encouraging 2D results first shown by N. A. Pierce and reproduced by Darmofal

•  Results shown in this talk from N. A. Pierce’s PhD Thesis

–  N. A. Pierce, Preconditioned Multigrid Methods for Compressible Flow Calculations on Stretched Meshes, Christ Church, University of Oxford, 1997.

•  3D remains an elusive challenge

Page 19

Why does multigrid not converge well for RANS?

•  Stiffness comes in from a variety of sources

–  Propagative speed disparity

–  Cell stretching

–  Flow alignment

–  Turbulence models

Page 20

Euler Convergence, Full Multigrid

•  Euler solutions with full multigrid converge reasonably well, although performance can be improved

Page 21

•  Convergence behaves appropriately for larger meshes and higher Mach numbers, although it can always be improved

Page 22

Euler / NS Multigrid Convergence Analysis

Standard Full Coarsened Multigrid

Standard Full Coarsened Multigrid + Block-Jacobi Precon

Page 23

Euler / NS Multigrid Convergence Analysis

•  Standard full coarsening has problems with

–  All convective modes

–  High-x / Low-y acoustic modes

•  Adding Block-Jacobi Precon

–  High-x / Low-y acoustic modes

•  J-Coarsening (semi-coarsening) appears to be able to damp all modes.

•  Does it?

J-Coarsened Multigrid with Block-Jacobi Precon
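The extra damping of J-coarsening comes at a cost that a quick estimate makes visible. The sketch below counts only relaxation work for a 2D V cycle, assumes one sweep per level visit, and measures work in units of one fine-mesh sweep. Coarsening in one direction only leaves $1/2$ of the cells on each coarser level, instead of $1/4$ with full coarsening.

```latex
\begin{align*}
  % Full coarsening in 2D
  W_V^{\text{full}} &= 1 + \tfrac14 + \tfrac1{16} + \dots < \tfrac43 \\
  % J-coarsening (one direction only) in 2D
  W_V^{\text{J}} &= 1 + \tfrac12 + \tfrac14 + \dots < 2
\end{align*}
```

Under these assumptions a J-coarsened V cycle costs up to about 50% more than a fully coarsened one, so it pays off only if the contraction ratio improves by more than that margin.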

Page 24

NS Multigrid Convergence Test Cases

•  RANS solutions can be made to converge as efficiently (contraction ratios ~ 0.75) as Euler calculations!!!

Page 25

NS Multigrid Convergence Test Cases

•  Good convergence comes from a combination of techniques that together provide damping across the whole spectrum of modes.

•  What about for 3D?

Page 26

NS Multigrid Convergence Test Cases

•  In theory, the same can be proven for 3D calculations

•  In practice, this has only been shown for VERY smooth meshes (no wing tips, no tip gaps, no bad stretching)

•  More work remains to be done

Page 27

Conclusions

•  Multigrid has been one of the most successful algorithms for compressible fluid flow (of course, also for elliptic equations)

•  It can be parallelized efficiently

•  It can be used for both steady and unsteady computations (dual-time stepping)

•  Performance degrades with high degrees of mesh stretching

•  Work remains to be done to obtain “textbook” multigrid convergence