Exact and approximate projection methods for transient incompressible and low-Mach flows
Mark Christon
Computational Physics R&D
Sandia National Laboratories
Collaborators: Phil Gresho, Steve Chan,
Tom Voth, Wing Kam Liu
May 2002
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
Overview
• Background & driving applications
• The incompressible Navier-Stokes equations
• The projection method
• Pressure stabilization and approximate projections
• The A-conjugate projection conjugate gradient method
• Advective treatment & low-Mach number extensions
• LES, two-scale filter & realizable turbulent stresses
• Summary & Conclusions
Incompressible/low-Mach flows span a diverse range of applications
• Vehicle aerodynamics
  - Initial focus for the incompressible algorithms was submarine flows
• Mold filling – casting and resin-transfer molding
• Encapsulation of electronic components
• Chemically reacting flows – burners, CVD reactors
• Dispersal of chemical and biological agents
• HVAC – environmentally controlled rooms
• Vehicle NVH (noise-vibration-heat)
  - Flow-induced vibration and noise
[Figure labels: Inlet, Outlet]
The well-posed incompressible flow problem satisfies “solvability”
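Momentum and continuity:
$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \nu\nabla^2\mathbf{u} + \mathbf{f}$
$\nabla\cdot\mathbf{u} = 0$
IC’s:
$\mathbf{u}(\mathbf{x},0) = \mathbf{u}_0(\mathbf{x})$
BC’s:
$\mathbf{u}(\mathbf{x},t) = \hat{\mathbf{u}}(\mathbf{x},t)$ on $\Gamma_1$
$-P + \nu\,\frac{\partial u_n}{\partial n} = F_n$ and $\nu\,\frac{\partial u_\tau}{\partial n} = F_\tau$ on $\Gamma_2$
Constraints:
$\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\hat{\mathbf{u}}$ on $\Gamma_1$
$\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$ for a well-posed problem
If $\Gamma_2 = \emptyset$, then $\int_{\Gamma_1} \mathbf{n}\cdot\hat{\mathbf{u}}\,d\Gamma = 0$
[Figure: domain $\Omega$ with boundary segments $\Gamma_1$ and $\Gamma_2$]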
The pressure-Poisson equation provides an alternate formulation
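PPE:
$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \nu\nabla^2\mathbf{u} + \mathbf{f}$ with $\nabla\cdot\mathbf{u} = 0$ implies
$\nabla^2 P = \nabla\cdot(\mathbf{f} - \mathbf{u}\cdot\nabla\mathbf{u})$ in $\Omega$
With BC’s:
$\frac{\partial P}{\partial n} = \mathbf{n}\cdot\left(\nu\nabla^2\mathbf{u} + \mathbf{f} - \frac{\partial \mathbf{u}}{\partial t} - \mathbf{u}\cdot\nabla\mathbf{u}\right)$ on $\Gamma_1$
$P = \nu\,\frac{\partial u_n}{\partial n} - F_n$ on $\Gamma_2$
If all solvability conditions are met, momentum + PPE will deliver the same solution $(\mathbf{u}, P)$ as momentum + div constraint!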
Spatial discretization produces DAE’s not ODE’s
DAE’s are ODE’s with algebraic constraints
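Weak form:
$\int_\Omega \phi_i \left( \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P - \nu\nabla^2\mathbf{u} - \mathbf{f} \right) d\Omega = 0$, $\quad \int_\Omega \psi_i\,\nabla\cdot\mathbf{u}\,d\Omega = 0$
with $\mathbf{u} = \sum_j \mathbf{u}_j \phi_j$ and $P = \sum_j P_j \psi_j$
Operators:
$M_{ij} = \int_\Omega \phi_i \phi_j\,d\Omega$, $\quad (M_l)_{ii} = \int_\Omega \phi_i\,d\Omega$
$A_{ij}(\mathbf{u}) = \int_\Omega \phi_i\,\mathbf{u}\cdot\nabla\phi_j\,d\Omega$
$K_{ij} = \int_\Omega \nu\,\nabla\phi_i\cdot\nabla\phi_j\,d\Omega$
$C_{ij} = \int_\Omega \psi_i\,\nabla\phi_j\,d\Omega$
$\mathbf{F}_i = \int_\Omega \phi_i\,\mathbf{f}\,d\Omega$
Index-2 DAE (fully-coupled):
$M\dot{\mathbf{u}} + A(\mathbf{u})\mathbf{u} + K\mathbf{u} + CP = \mathbf{F}$
$C^T\mathbf{u} = g$
Index-1 DAE (segregated):
$M\dot{\mathbf{u}} + A(\mathbf{u})\mathbf{u} + K\mathbf{u} + CP = \mathbf{F}$
$(C^T M^{-1} C)\,P = C^T M^{-1}\left(\mathbf{F} - K\mathbf{u} - A(\mathbf{u})\mathbf{u}\right) - \dot{g}$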
Some DAE properties
• DAE’s with index ≥ 2 are “higher index”
• Higher-index DAE’s are harder to solve
• Higher-index systems require derivatives of order “index-1”, resulting in ill-conditioned systems
• Higher-index DAE’s can have hidden constraints
• Lower-index DAE’s can have more solutions than the original DAE’s
• DAE’s require consistent IC’s
• The “correct” solution satisfies the original DAE’s, i.e., the higher-index system
The index-1 DAE’s are hard to solve “correctly”, and the index-2 (fully-coupled) equations are even harder!
Projection methods are among the most popular schemes for solving the Navier-Stokes equations
Methods:
• Chorin (1968): Chorin’s original first-order method
• Van Kan (1986)
• Karniadakis, et al. (1991)
• Gresho, Chan and Christon (1990 - present)
• Almgren, Bell, Colella, Howell, Lai, Welcome, et al. (1989 - 2000)
• Guermond & Quartapelle (1997)
• Guermond & Lu (2000)
• Kothe, et al. (1995 - present)
• Rider (1994, 1995)
Analysis:
• Chorin (1969)
• Shen
  - 1992: High-order
  - 1993: P-3 Scheme
  - 1996: 2nd-order
• E & Liu (1995, 1996)
• Jacobs (1995)
• Guermond & Quartapelle
  - 1996: Convergence
  - 1998: Error estimates
  - 1999: Incremental method
• Wetton (1998)
• Almgren, Bell & Crutchfield (1999)
• Armfield & Street (1999)
Projection methods perform better than formal analysis suggests!!!
Some additional projection literature …
• Stewart (1981)
• Issa, et al. (1985, 1986)
• Kovacs & Kawahara (1991)
• Simo, et al. (1994, 1995)
• Brown & Minion (1995 - 97)
• Sheu and Lee (1996)
• Thomadakis & Leschziner (1996)
• Timmermans, et al. (1996)
• Nonino, Comini & Croce (1997)
• Knio, Najm & Wyckoff (1999)
• Cummins & Rudman (1999)
• Minev
  - 1998: BHC correction
  - 1999: Stabilization
• Codina, et al.
  - 1998: predictor-corrector
  - 2000: stabilization ideas
• Pozrikidis (2001)
The projection method for Navier-Stokes relies on a Helmholtz decomposition
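Given: $\frac{\partial \mathbf{u}}{\partial t} + \nabla P = \mathbf{F}(\mathbf{u})$, where $\mathbf{F}(\mathbf{u}) = \mathbf{f} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u}$, and $\nabla\cdot\mathbf{u} = 0$
A Helmholtz decomposition of $\mathbf{F}(\mathbf{u})$ yields div-free and curl-free components: $\nabla\cdot\frac{\partial \mathbf{u}}{\partial t} = 0$ and $\nabla\times\nabla P = 0$
The decomposition exists and is unique for a well-posed “incompressible” flow problem
Given $\mathbf{u}$, $\mathbf{F}(\mathbf{u})$ is projected onto a div-free and curl-free subspace to obtain
$\frac{\partial \mathbf{u}}{\partial t} = \mathcal{P}(\mathbf{F}(\mathbf{u}))$ and $\nabla P = \mathcal{Q}(\mathbf{F}(\mathbf{u}))$, where $\mathcal{P} = I - \nabla(\nabla^2)^{-1}\nabla\cdot$ and $\mathcal{Q} = \nabla(\nabla^2)^{-1}\nabla\cdot$
Exact projections are idempotent: $\mathcal{P}^2 = \mathcal{P}$ and $\mathcal{Q}^2 = \mathcal{Q}$
The eigenvalues of $\mathcal{P}$, $\mathcal{Q}$ are either 0 or 1, so that the projections are stable and norm reducing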
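The idempotence and div-free properties of an exact projection can be checked numerically. The sketch below is not from the slides: it assumes a doubly periodic square domain so that the operator I - grad (laplacian)^-1 div can be applied with FFTs, and the helper names (`helmholtz_project`, `spectral_divergence`) are hypothetical. An odd grid size is used so that no unpaired Nyquist mode has to be treated specially.

```python
import numpy as np

def _wavenumbers(n):
    k = np.fft.fftfreq(n, d=1.0 / n)            # integer wavenumbers on [0, 2*pi)
    return np.meshgrid(k, k, indexing="ij")

def helmholtz_project(fx, fy):
    """Return the div-free part of a periodic 2-D field F = (fx, fy)."""
    kx, ky = _wavenumbers(fx.shape[0])
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                              # mean mode has no gradient part
    fxh, fyh = np.fft.fft2(fx), np.fft.fft2(fy)
    k_dot_f = kx * fxh + ky * fyh               # Fourier symbol of div F (without the factor i)
    px = np.fft.ifft2(fxh - kx * k_dot_f / k2).real
    py = np.fft.ifft2(fyh - ky * k_dot_f / k2).real
    return px, py

def spectral_divergence(fx, fy):
    kx, ky = _wavenumbers(fx.shape[0])
    div_h = 1j * (kx * np.fft.fft2(fx) + ky * np.fft.fft2(fy))
    return np.fft.ifft2(div_h).real

rng = np.random.default_rng(0)
n = 33                                           # odd, so there is no unpaired Nyquist mode
fx, fy = rng.standard_normal((2, n, n))          # arbitrary (not div-free) field
px, py = helmholtz_project(fx, fy)               # apply the projection once
ppx, ppy = helmholtz_project(px, py)             # apply it a second time
```

Applying the projection twice returns the same field to round-off, and the projected field has zero spectral divergence, which is the discrete analogue of the eigenvalues being 0 or 1.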
The projection algorithm is “realized” by a velocity decomposition
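Solve: $\frac{\partial \tilde{\mathbf{u}}}{\partial t} + \nabla P = \mathbf{F}(\mathbf{u})$, where $\nabla\cdot\tilde{\mathbf{u}} \ne 0$
Use: $\mathbf{u} = \tilde{\mathbf{u}} - \nabla\lambda$ with $\nabla\cdot\mathbf{u} = 0$, so that $\mathcal{P}(\tilde{\mathbf{u}})$ is div-free and $\mathcal{Q}(\tilde{\mathbf{u}}) = \nabla\lambda$
The div-free velocity is obtained with a projection step:
$\begin{bmatrix} M_l & C \\ C^T & 0 \end{bmatrix} \begin{Bmatrix} \mathbf{u}^{n+1} \\ \lambda \end{Bmatrix} = \begin{Bmatrix} M_l \tilde{\mathbf{u}}^{n+1} \\ 0 \end{Bmatrix}$
$[C^T M_l^{-1} C]\,\lambda = C^T \tilde{\mathbf{u}}^{n+1}$
$\mathbf{u}^{n+1} = \tilde{\mathbf{u}}^{n+1} - M_l^{-1} C \lambda$, $\quad \lambda = \gamma\,\Delta t\,(P^{n+1} - P^n)$
• Attempts to legitimately decouple the velocity and pressure
• Preserves a discretely div-free velocity field at each time step
[Figure: decomposition of $\tilde{\mathbf{u}}$ into $\mathcal{P}(\tilde{\mathbf{u}})$ (div-free) and $\mathcal{Q}(\tilde{\mathbf{u}})$]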
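As an illustration of the projection step, the following sketch compares the segregated form (solve a PPE for λ, then correct the velocity) with a direct solve of the equivalent saddle-point system. The matrices are small random stand-ins rather than finite element operators: `C` is assumed to have full column rank, so the PPE here is nonsingular, unlike a real problem with a hydrostatic mode. Function names are hypothetical.

```python
import numpy as np

def project_segregated(Ml_diag, C, u_tilde):
    """Solve [C^T Ml^-1 C] lam = C^T u_tilde, then u = u_tilde - Ml^-1 C lam."""
    Ml_inv_C = C / Ml_diag[:, None]              # Ml is diagonal (lumped mass)
    lam = np.linalg.solve(C.T @ Ml_inv_C, C.T @ u_tilde)
    return u_tilde - Ml_inv_C @ lam, lam

def project_saddle_point(Ml_diag, C, u_tilde):
    """Solve the coupled system [[Ml, C], [C^T, 0]] [u, lam] = [Ml u_tilde, 0]."""
    n, m = C.shape
    lhs = np.block([[np.diag(Ml_diag), C], [C.T, np.zeros((m, m))]])
    rhs = np.concatenate([Ml_diag * u_tilde, np.zeros(m)])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:n], sol[n:]

rng = np.random.default_rng(0)
n_vel, n_pres = 12, 5
C = rng.standard_normal((n_vel, n_pres))         # stand-in gradient operator
Ml_diag = rng.uniform(1.0, 2.0, n_vel)           # stand-in lumped mass (diagonal)
u_tilde = rng.standard_normal(n_vel)             # intermediate, not div-free velocity

u_seg, lam_seg = project_segregated(Ml_diag, C, u_tilde)
u_sp, lam_sp = project_saddle_point(Ml_diag, C, u_tilde)
```

Both routes give the same velocity and λ, and `C.T @ u_seg` vanishes to round-off, i.e. the projected field is discretely div-free.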
Projection methods admit both explicit and semi-implicit time integration
Semi-implicit time integration for the optimal projection algorithm (P2)
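1. $[M + \theta\Delta t K]\,\tilde{\mathbf{u}}^{n+1} = [M - (1-\theta)\Delta t K]\,\mathbf{u}^n + \Delta t\left\{\theta\mathbf{F}^{n+1} + (1-\theta)\mathbf{F}^n - A(\mathbf{u})\mathbf{u}^n - M M_l^{-1} C P^n\right\}$
2. $[C^T M_l^{-1} C]\,\lambda = C^T \tilde{\mathbf{u}}^{n+1}$ (or $[C^T M_l^{-1} C + S]\,\lambda = C^T(\tilde{\mathbf{u}}^{n+1} - \mathbf{u}^n)$)
3. $\mathbf{u}^{n+1} = \tilde{\mathbf{u}}^{n+1} - M_l^{-1} C \lambda$
4. $P^{n+1} = P^n + \frac{\lambda}{\gamma\,\Delta t}$
• The “exact” projection method uses a consistent PPE based on the discrete divergence and gradient operators
• The “approximate” projection method uses a modified, i.e., a “stabilized” PPE
• Selective mass lumping preserves 4th-order advective phase accuracy
• Advection treatment is explicit with a monotonicity preserving predictor-corrector scheme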
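The four steps of the semi-implicit scheme can be strung together in a few lines. This is a minimal sketch with dense stand-in matrices, not the author's implementation: the advective term is passed in as a precomputed vector `adv_n`, and the values of `theta` and `gamma` in the demo are illustrative assumptions (the slide does not fix them).

```python
import numpy as np

def p2_step(M, Ml_diag, K, C, adv_n, F_n, F_np1, u_n, P_n, dt, theta, gamma):
    """One exact-projection time step: predictor, PPE, velocity and pressure update."""
    Ml_inv_C = C / Ml_diag[:, None]
    # Step 1: momentum predictor with the old pressure gradient
    rhs = (M - (1.0 - theta) * dt * K) @ u_n + dt * (
        theta * F_np1 + (1.0 - theta) * F_n - adv_n - M @ (Ml_inv_C @ P_n))
    u_tilde = np.linalg.solve(M + theta * dt * K, rhs)
    # Step 2: consistent PPE for lambda
    lam = np.linalg.solve(C.T @ Ml_inv_C, C.T @ u_tilde)
    # Step 3: project onto the discretely div-free space
    u_np1 = u_tilde - Ml_inv_C @ lam
    # Step 4: pressure update
    P_np1 = P_n + lam / (gamma * dt)
    return u_np1, P_np1, lam

rng = np.random.default_rng(3)
n, m = 10, 4
B1, B2 = rng.standard_normal((2, n, n))
M = B1 @ B1.T + n * np.eye(n)                    # stand-in SPD mass matrix
K = B2 @ B2.T + n * np.eye(n)                    # stand-in SPD viscous operator
Ml_diag = rng.uniform(1.0, 2.0, n)               # stand-in lumped mass
C = rng.standard_normal((n, m))                  # stand-in gradient operator
u_n, adv_n, F_n, F_np1 = rng.standard_normal((4, n))
P_n = rng.standard_normal(m)
u_np1, P_np1, lam = p2_step(M, Ml_diag, K, C, adv_n, F_n, F_np1,
                            u_n, P_n, dt=0.01, theta=0.5, gamma=0.5)
```

Whatever the predictor produces, the updated velocity satisfies `C.T @ u_np1 = 0` to round-off, because step 3 uses the same discrete gradient and divergence operators as the PPE in step 2.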
The PPE solution can dominate the time step
PPE Properties:
• $[C^T M_l^{-1} C]$ is symmetric regardless of the Reynolds number
• Can be singular, e.g., contains a hydrostatic mode
• For the Q1Q0 element, can have additional singular modes
  - Checkerboard mode in 2-D
  - Multiple CB-like modes in 3-D
• Right-hand-side, $C^T \tilde{\mathbf{u}}$, is noisy, i.e., has a large spectral content and changes rapidly from step-to-step
• Source data can control the convergence rate of the solution method (see Kim and Ro, 1995)
Solution of the PPE can consume 80 - 90% of the CPU time per time step.
Geometrical singularities can excite “local” pressure modes
• The “bi-harmonic catastrophe” problem shows that a large-amplitude CB mode decays spatially
• Pressure stabilization filters the modes and improves PPE conditioning
[Figures: un-stabilized and stabilized (β=0.05) pressure fields for two cases, including the diverging channel (Gresho, et al., 1995); velocity conditions as extracted: $u = 1.5$, $v = 0$ and $u = 0.2\ln(100y + 1)$]
The Q1Q0 approximate projection algorithm relies on pressure stabilization
Replace the consistent PPE by an operator that is “easier” to solve -- at the expense of the divergence
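• Equal-order Q1Q1 approximate projection: Gresho, Chan, Christon and Hindmarsh, 1995
• Related work of Rider; Almgren, Bell and Colella
Pressure stabilization for Q1Q0 element yields a regularized saddle-point problem
• Silvester & Kechkar, 1990; Silvester, 1995; Norburn & Silvester, 1997
$S_{IJ} = -\beta\,\frac{[C^T M_l^{-1} C]_{IJ}}{\Gamma_{IJ}} \int_{\Gamma_{IJ}} \psi_I \psi_J\,d\Gamma$, with $0.0 \le \beta \le 1.0$
[Figure: global and local pressure-jump stabilization between neighboring elements I and J of size h]
$\begin{bmatrix} M_l & C \\ C^T & -S \end{bmatrix} \begin{Bmatrix} \mathbf{u}^{n+1} - \mathbf{u}^n \\ \lambda \end{Bmatrix} = \begin{Bmatrix} M_l(\tilde{\mathbf{u}}^{n+1} - \mathbf{u}^n) \\ 0 \end{Bmatrix}$
“Project the difference”: $[C^T M_l^{-1} C + S]\,\lambda = C^T(\tilde{\mathbf{u}}^{n+1} - \mathbf{u}^n)$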
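The cost of stabilization is a residual discrete divergence. Eliminating the velocity from the stabilized system gives $C^T(\mathbf{u}^{n+1} - \mathbf{u}^n) = S\lambda$, which the sketch below verifies. The operators are random stand-ins and `S` is simply an arbitrary symmetric positive semi-definite matrix scaled by `beta`; it is not the jump-stabilization matrix of the slides.

```python
import numpy as np

def approximate_project(Ml_diag, C, S, u_tilde, u_n):
    """Stabilized projection of the difference u_tilde - u_n."""
    Ml_inv_C = C / Ml_diag[:, None]
    lam = np.linalg.solve(C.T @ Ml_inv_C + S, C.T @ (u_tilde - u_n))
    return u_tilde - Ml_inv_C @ lam, lam

rng = np.random.default_rng(4)
n, m = 12, 5
C = rng.standard_normal((n, m))                  # stand-in gradient operator
Ml_diag = rng.uniform(1.0, 2.0, n)               # stand-in lumped mass
G = rng.standard_normal((m, 3))
beta = 0.05
S = beta * (G @ G.T)                             # hypothetical SPSD stabilization matrix
u_n = rng.standard_normal(n)
u_tilde = rng.standard_normal(n)

u_stab, lam_stab = approximate_project(Ml_diag, C, S, u_tilde, u_n)
u_exact, lam_exact = approximate_project(Ml_diag, C, np.zeros((m, m)), u_tilde, u_n)

div_change_stab = C.T @ (u_stab - u_n)           # equals S @ lam_stab
div_change_exact = C.T @ (u_exact - u_n)         # zero when S = 0
```

With `S = 0` the divergence of the update vanishes; with stabilization it equals `S @ lam`, so the size of β directly controls how far the projection departs from being exact.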
Pressure stabilization reduces computational work for PPE solve
• $0.01 \le \beta \le 0.1$ minimizes the eigenspectra of the PPE
• $\beta \ge 0.25$ yields inaccurate velocity fields
• β=0.05 typically
• No. of iterations relatively insensitive to β
• Projections are approximate so $\nabla\cdot\mathbf{u} \ne 0$
Pressure stabilization also improves the iterative convergence rate
• 2-D: 20% reduction in iterations and computational work
• 3-D: 7-10X reduction in iterations and computational work
• “Realistic” convergence criteria yield more modest gains
Convergence criteria: $\|b - Ax^n\| \le 10^{-12}\|b\|$, $\|x^n - x^{n-1}\| \le 10^{-12}\|x^n\|$, with $\beta = 0.05$
[Figure: relative residual ||b-Ax||/||b|| vs. iterations for No Stabilization, Global Jump and Local Jump. Panels: 2-D Vortex Shedding and 3-D Post & Plate. Problem-size labels as extracted: 7840, 62720, 211680 and 2800, 11200, 44800]
The A-conjugate projection-CG method further reduces the PPE solution time
• Efforts with algebraic multigrid (AMG) of Ruge and Stüben (1987)
  - Lack of robustness in computing coarse-grid operators, particularly for complex geometry
  - Factor of 17 slower than direct resolve cost
• A-conjugate projection CG uses short and long-wavelength information to, in effect, “deflate” the eigenspectra of the PPE
• Construct initial solution that minimizes the residual ($r = b - Ax$) in the A-norm (Christon, 2002).
Given a set of A-conjugate vectors, $\Phi$:
1. $\alpha_i = \phi_i^T b^n$
2. $\bar{x} = \sum_{i=1}^{N} \alpha_i \phi_i$
3. Solve $A\,\Delta x = b^n - A\bar{x}$
4. $x^n = \bar{x} + \Delta x$
5. $\phi_{l+1} = \psi_{l+1} / \|\psi_{l+1}\|_A$, $\quad \psi_{l+1} = \Delta x - \sum_{i=1}^{l} \alpha_i \phi_i$
1 additional mat-vec required per time-step to add A-conjugate vector
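The five steps above can be sketched as a thin wrapper around a plain CG solver. Several details below are assumptions rather than the slide's algorithm: in step 5 the coefficients are taken as $\alpha_i = \phi_i^T A \Delta x$ (which reuses the single extra mat-vec), the A-norm is evaluated from that same product, the oldest vector is dropped when the basis is full, and the class and function names are hypothetical.

```python
import numpy as np

def cg(A, b, tol, ref_norm, max_iter=1000):
    """Plain conjugate gradients from x = 0; stops when ||r|| <= tol * ref_norm."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for it in range(max_iter):
        if np.sqrt(rs) <= tol * ref_norm:
            return x, it
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

class AConjugateProjectionCG:
    def __init__(self, A, max_vectors=5, tol=1e-10):
        self.A, self.max_vectors, self.tol = A, max_vectors, tol
        self.basis = []                          # A-conjugate, A-normalized vectors

    def solve(self, b):
        # Steps 1-2: initial guess from the stored basis
        x_bar = np.zeros_like(b)
        for phi in self.basis:
            x_bar += (phi @ b) * phi
        # Step 3: solve for the correction only
        dx, iters = cg(self.A, b - self.A @ x_bar, self.tol, np.linalg.norm(b))
        # Step 4: assemble the solution; Step 5: extend the basis
        self._extend_basis(dx)
        return x_bar + dx, iters

    def _extend_basis(self, dx):
        A_dx = self.A @ dx                       # the one additional mat-vec
        dx_A2 = dx @ A_dx
        if len(self.basis) >= self.max_vectors:
            self.basis.pop(0)                    # assumption: discard the oldest vector
        alphas = [phi @ A_dx for phi in self.basis]
        psi = dx - sum(a * phi for a, phi in zip(alphas, self.basis))
        psi_A2 = dx_A2 - sum(a * a for a in alphas)   # ||psi||_A^2 without another mat-vec
        if dx_A2 > 0.0 and psi_A2 > 1e-12 * dx_A2:
            self.basis.append(psi / np.sqrt(psi_A2))

n = 50
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD stand-in for the PPE
rng = np.random.default_rng(5)
b = rng.standard_normal(n)
solver = AConjugateProjectionCG(A)
x_first, iters_first = solver.solve(b)
x_second, iters_second = solver.solve(b)         # same right-hand side, basis now non-empty
```

Re-solving with an unchanged right-hand side is the limiting case of a slowly varying pressure field: the stored vector already spans the solution, so the second solve needs far fewer CG iterations than the first.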
A-conjugate projection selectively uses information from previous pressure fields
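• Solving $[C^T M_l^{-1} C + S]\,\lambda = C^T(\tilde{\mathbf{u}}^{n+1} - \mathbf{u}^n)$ relies on N prior solutions that are A-conjugate
• $\lambda^n = \Delta\lambda^n + \sum_i \alpha_i \phi_i$, where the $\phi_i$ are A-conjugate
• A-conjugate construction is achieved with one additional mat-vec per time step
[Figure: $\lambda^n = \Delta\lambda^n + \alpha_1\phi_1 + \alpha_2\phi_2 + \alpha_3\phi_3$, showing $\lambda$, $\Delta\lambda$ and the component $\alpha_1\phi_1$]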
A-conjugate projection-CG with SSOR preconditioning approaches direct resolve cost
Grind-time comparison for 1000 time steps relative to resolve time for PVS direct solver (3D Cylinder: Nel=11250, Nnp=11466, Nb=146, ε=1.0e-5)
β = 0
No. of    JPCG                       SSOR-PCG                   ESSOR-PCG
Vectors   NIT   Grind Time   % PPE   NIT   Grind Time   % PPE   NIT   Grind Time   % PPE
0         328   6.52         87.4    110   5.05         83.6    100   3.36         74.7
5         148   3.20         75.0    49    2.63         69.4    47    2.02         59.7
10        121   2.53         66.5    41    2.13         59.9    38    1.75         51.1
25        100   2.66         69.4    34    2.18         62.3    31    1.71         51.7
50        87    2.39         66.6    29    2.03         60.4    26    1.60         49.2

β_Glob = 0.05
No. of    JPCG                       SSOR-PCG                   ESSOR-PCG
Vectors   NIT   Grind Time   % PPE   NIT   Grind Time   % PPE   NIT   Grind Time   % PPE
0         309   6.16         86.6    105   4.83         82.8    96    3.23         74.0
5         130   2.91         72.4    43    2.39         66.2    44    1.87         56.2
10        108   2.35         63.7    36    1.98         56.8    37    1.68         48.1
25        88    2.44         66.5    29    2.00         58.8    27    1.65         49.4
50        76    2.20         63.7    26    1.92         57.9    23    1.52         46.0
Coarse-grained parallelism is treated with domain-decomposition message-passing
• Explicit message passing via MPI (SPMD model)
• Non-overlapping sub-domains with cache-blocking
• Finite element assembly procedure induces communication
  - Requires remapping of on-processor nodes, elements, etc.
  - Formation of consistent and lumped mass matrices
  - Matrix-vector products for iterative solvers
  - Gradient operators for flow solver
  - Formation of right-hand-side
[Figure: domain decomposition into sub-domains P0, P1, P2, P3]
Fine-grained parallel speedups demonstrate scalability
• Communication costs scale with the number of CG iterations required for the pressure/projection solve
• O(1000) elements per processor yields 12.6% communication overhead on 1024 processors
[Figure panel 1]
• Re=100 3-D Post & Plate
• Efficiencies for 1000 time steps
• ~250 Elements per Processor
[Figure panel 2]
• Re=800 Backward Facing Step
• Efficiencies for 1000 time steps
• ~250 Elements per Processor
Predictor-corrector advection scheme uses operator-limiting to preserve monotonicity
[Figure: 15 mm slot, Re=4000, Fr=326; legend entries as extracted: SOUC, C, QUICK, FEM, Lumped FEM]
• Preserve 4th-order FEM phase accuracy
• Operator-limiters are characteristic-based
• Forward-Euler inviscid predictor
• Trapezoidal-rule corrector
• Limiters DO NOT act on smooth data
Re=4000 slot-jet test for the operator limited explicit advection treatment
[Figure panels: Temperature, Vorticity]
• 15 mm slot
• Re=4000
• Fr=326
• Experimental Strouhal: St=0.16 (Yu and Monkewitz, 1990)
• Computational Strouhal: St=0.18
• ∆t based on a fixed grid-CFL=1.0
High-Grashof number flow test for the monotonicity preserving advection scheme
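[Figure: differentially heated cavity with insulated top and bottom walls, hot wall $T_h$, cold wall $T_c$, and gravity $g$]
• Gr = 1.32 x 10^9
• Pr = 0.7
• L = 1 m
• ∆T = 10 °C
• 201 x 201 grid
• ∆t based on a fixed grid-CFL=2.0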
Acoustically filtered equations for low-speed, reacting flow neglect fast (and weak) waves
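• Majda & Sethian (1985) formulation assumes: $p_{tot} = p + \Delta p$, $\ \Delta p / p \propto O(M^2)$, $\ \rho = \frac{p}{RT}$, $\ k = A\exp\left(-\frac{E}{RT}\right)$
• Two species (burnt and unburnt gas) with the same γ-law and molecular weights, one-step Arrhenius kinetics
• Solution algorithm adapted from Lai (1993), similar to Day & Bell (1999)
$\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla p + \mu\nabla^2\mathbf{u} + \mathbf{f}$
$\rho C_p\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) = \frac{dp}{dt} + \kappa\nabla^2 T + q_0 k \rho Z$
$\rho\left(\frac{\partial Z}{\partial t} + \mathbf{u}\cdot\nabla Z\right) = \nabla\cdot(\rho D\nabla Z) - k\rho Z$
$\nabla\cdot\mathbf{u} = \frac{1}{\gamma p}\left(-\frac{dp}{dt} + (\gamma - 1)\left(q_0 k \rho Z + \kappa\nabla^2 T\right)\right)$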
Low-Re variable density and reacting flows have similar wake structures
$Re = 100$, $Pe = 100$, $Sc = 1$, $\gamma = 1.4$, no buoyancy forces
[Figure panels: Non-Reacting, Reacting]
Large Eddy Simulation relies on resolving the large-scale fluid dynamics
Energy is added to a turbulent flow at the large scales via stirring, shear, etc., and cascades to smaller scales where dissipation occurs
Length and time scales span a broad spectrum from the integral to the Kolmogorov scales
Large eddy simulation (LES) models the self-similar fine-scale physics that cannot be resolved at the grid scale
[Figure: energy input, cascade, and dissipation along an axis of increasing wave number (decreasing length scale)]
LES relies upon filtering the Navier-Stokes Equations
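LES filtering is based upon a finite-domain convolution
• For a filter scale, $a$: $\bar{f}(x) = \int_\Omega f(x - \xi)\,K(a, x, \xi)\,d\xi$
The filtered, incompressible Navier-Stokes equations:
$\nabla\cdot\bar{\mathbf{u}} = 0$
$\rho\frac{\partial \bar{\mathbf{u}}}{\partial t} + \rho\,\bar{\mathbf{u}}\cdot\nabla\bar{\mathbf{u}} = -\nabla\bar{p} + \nabla\cdot\mu\left(\nabla\bar{\mathbf{u}} + \nabla\bar{\mathbf{u}}^T\right) - \nabla\cdot\boldsymbol{\tau}$, where $\tau_{ij} = \overline{u_i u_j} - \bar{u}_i\bar{u}_j$
Postulated SGS stresses:
$\tau_{ij} - \frac{1}{3}\delta_{ij}\tau_{kk} = -2\mu_t \bar{S}_{ij}$, $\quad \bar{S}_{ij} = \frac{1}{2}\left(\bar{u}_{i,j} + \bar{u}_{j,i}\right)$
Smagorinsky model:
$\mu_t = \rho\,(C_s a)^2 \sqrt{\bar{S}_{ij}\bar{S}_{ij}}$
• Based on a balance between production and dissipation of turbulent kinetic energy
• Does not account for wall bounded or anisotropic flows.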
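A pointwise evaluation of a Smagorinsky-type closure is short enough to write out. The sketch assumes the eddy viscosity takes the form ρ (C_s a)² √(S_ij S_ij); many references instead use √(2 S_ij S_ij), which only rescales C_s, and the slide's exact normalization could not be recovered. The function name and the numerical values in the demo are illustrative.

```python
import numpy as np

def smagorinsky(grad_u, rho, c_s, a):
    """Eddy viscosity and deviatoric SGS stress from a resolved velocity gradient.

    grad_u[i, j] = d u_i / d x_j at one point; a is the filter scale.
    """
    S = 0.5 * (grad_u + grad_u.T)                # resolved strain-rate tensor
    s_mag = np.sqrt(np.sum(S * S))               # assumed norm: sqrt(S_ij S_ij)
    mu_t = rho * (c_s * a) ** 2 * s_mag
    tau_dev = -2.0 * mu_t * S                    # tau_ij - delta_ij tau_kk / 3
    return mu_t, tau_dev

shear_rate = 2.0
grad_u = np.zeros((3, 3))
grad_u[0, 1] = shear_rate                        # simple shear: du/dy only
mu_t, tau_dev = smagorinsky(grad_u, rho=1.0, c_s=0.1, a=0.5)
```

For simple shear the only non-zero stress components are the 1-2 pair, and the modelled stress is trace-free because the strain rate of a divergence-free field has zero trace.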
GILA LES computation captures large-scale vortical structures and mean drag
• Ahmed’s body with 30° slant and $Re = 4.29 \times 10^6$
• Experimental drag coefficient: $C_w = 0.378$
• Predicted short-time average drag coefficient: $C_w = 0.386$
• LES crimes:
  - RANS type graded grid (~500,000 elements)
  - No explicit wall functions
  - Under-resolved energy spectrum
[Figure: streamwise vorticity around the body; flow direction $U_\infty$, 30° slant]
Artificial non-stationary flows can occur when div(u) is unconstrained
Re=10,000 (Zang, Street & Koseff, 1993)
[Figure: kinetic energy vs. time [s] and div(u) vs. time [s] over 0 to 1500 s, comparing unconstrained and constrained div(u); the interval used to collect statistics is marked]
Filtering can be used to control and reduce numerical errors for LES
• Application of the LES SGS model at 2∆x wavelengths introduces dispersive errors
• Setting the filter scale larger than the grid scale promises to reduce the numerical errors over the entire discrete spectrum (Ghosal, 1996)
• Commutative RKPM filter may be used with a grid-independent filter size for LES
[Figure: energy spectrum E(k) with the energy-containing eddies and the inertial range, marking $l^{-1}$, $\Delta x^{-1}$, $\eta^{-1}$ and the filter scale; inset in terms of $2\Delta x/\lambda$ with legend entries as extracted: SOUC, C, QUICK, FEM, Lumped FEM]
LES filters are based on the Reproducing Kernel Particle Method (RKPM)
Commutation error, $\left[\frac{df}{dx}\right] = \overline{\frac{df}{dx}} - \frac{d\bar{f}}{dx}$, is determined by filter moments
$M_k(x) = \int_\Omega (x - \xi)^k K(a, x, \xi)\,d\xi$, with $M_0 = 1$, $M_k = 0$, $k = 1, 2, \ldots$
RKPM consistency is determined by moments as well …
$M_k(x) = \int_\Omega (x - \xi)^k\,\phi\!\left(\frac{x - \xi}{a}\right) d\xi$
$M_0 = 1$, $M_1 = 0$ yields linear consistency
Commutation error for RKPM is set by the degree of consistency (see Wagner & Liu, Voth & Christon): $\left[\frac{df}{dx}\right] \propto O(\Delta x^k)$
RKPM filters permit treatment of arbitrary geometry and unstructured meshes (with an increased computational cost associated with the initial search)
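The moment conditions for linear consistency can be enforced discretely by correcting a window function with a linear polynomial, which is the usual reproducing-kernel construction. The 1-D sketch below is illustrative only: the cubic B-spline window, the jittered node set, and the function names are assumptions. It evaluates the corrected filter at a boundary node, where an uncorrected kernel would violate both conditions.

```python
import numpy as np

def window(z):
    """Cubic B-spline window with support |z| <= 2 (an assumed kernel shape)."""
    z = np.abs(z)
    inner = 2.0 / 3.0 - z**2 + 0.5 * z**3
    outer = (2.0 - z) ** 3 / 6.0
    return np.where(z <= 1.0, inner, np.where(z <= 2.0, outer, 0.0))

def rkpm_filter_weights(x, nodes, dx, a):
    """Nodal weights of a linearly consistent filter evaluated at point x."""
    r = x - nodes
    w = window(r / a) / a * dx                   # uncorrected kernel times nodal volume
    m0, m1, m2 = w.sum(), (r * w).sum(), (r * r * w).sum()
    # Choose the correction b0 + b1*r so that M0 = 1 and M1 = 0.
    b0, b1 = np.linalg.solve(np.array([[m0, m1], [m1, m2]]), np.array([1.0, 0.0]))
    return (b0 + b1 * r) * w

rng = np.random.default_rng(1)
nodes = np.linspace(0.0, 1.0, 41) + rng.uniform(-0.005, 0.005, 41)   # mildly irregular grid
dx = np.gradient(nodes)                          # nodal integration weights
a = 0.1                                          # filter scale, larger than the grid spacing
x_eval = nodes[0]                                # boundary point
weights = rkpm_filter_weights(x_eval, nodes, dx, a)
M0 = weights.sum()
M1 = ((x_eval - nodes) * weights).sum()
u_lin = 2.0 - 3.0 * nodes                        # linear test field
u_filtered = weights @ u_lin
```

With the zeroth moment equal to one and the first moment equal to zero, the filter reproduces any linear field exactly, even on the irregular grid and at the boundary.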
Dynamic SGS stress model removes the empiricism associated with a model constant
Application of the coarse-scale filter yields the test-scale stresses:
$T_{ij} = \widehat{\overline{u_i u_j}} - \hat{\bar{u}}_i \hat{\bar{u}}_j$
Use of the same postulated form of the test-scale stresses yields the “dynamic constant” in terms of known quantities:
$C = \frac{L_{ij} M_{ij}}{M_{ij} M_{ij}}$, where $L_{ij} = T_{ij} - \hat{\tau}_{ij}$ and $M_{ij} = -2\left[(2a)^2\,|\hat{\bar{S}}|\,\hat{\bar{S}}_{ij} - a^2\,\widehat{|\bar{S}|\,\bar{S}_{ij}}\right]$
A second filter is required at a length scale larger than the original filter scale:
$P_a u = P_{2a} u + Q_{2a} u$, where $P_{2a} u = \sum_{i=1}^{N_p} \phi_{2a}(x - x_i)\,u_i\,\Delta x_i$ and $Q_{2a} u = \sum_{i=1}^{N_p} \left(\phi_a - \phi_{2a}\right)(x - x_i)\,u_i\,\Delta x_i$
[Figure panels: $P_a u$, $P_{2a} u$, $Q_{2a} u$. $Re_\tau = 395$ DNS data provided by William Cabot, CTR]
Positivity and consistency yield realizable instantaneous sub-grid scale stresses
Realizability requires positivity in the three principal invariants of the SGS stresses:
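$I_1 = \sum_i \tau_{ii} \ge 0$
$I_2 = \sum_{i \ne j} \left(\tau_{ii}\tau_{jj} - \tau_{ij}^2\right) \ge 0$
$I_3 = \det(\tau_{ij}) \ge 0$
where $\tau_{ij} = \overline{u_i u_j} - \bar{u}_i \bar{u}_j$
[Figure: $Re_\tau = 395$ DNS data: William Cabot, CTR]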
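The three invariant conditions are easy to test for a given stress tensor. In the sketch below a plain sample average over random velocity samples stands in for a positive filter (an assumption made purely for illustration), which makes the resulting tensor a covariance matrix and hence realizable. Function names are hypothetical.

```python
import numpy as np

def sgs_invariants(tau):
    """I1, I2, I3 of a symmetric 3x3 SGS stress tensor, as written on the slide."""
    i1 = np.trace(tau)
    # The sum over i != j counts each pair twice; only the sign matters here.
    i2 = sum(tau[i, i] * tau[j, j] - tau[i, j] ** 2
             for i in range(3) for j in range(3) if i != j)
    i3 = np.linalg.det(tau)
    return i1, i2, i3

def is_realizable(tau, tol=0.0):
    return all(inv >= -tol for inv in sgs_invariants(tau))

rng = np.random.default_rng(2)
samples = rng.standard_normal((200, 3))          # stand-in velocity samples
mean_uu = samples.T @ samples / len(samples)     # averaged u_i u_j
mean_u = samples.mean(axis=0)                    # averaged u_i
tau_demo = mean_uu - np.outer(mean_u, mean_u)    # tau_ij = mean(u_i u_j) - mean(u_i) mean(u_j)
```

A tensor such as `-np.eye(3)` fails the first condition immediately, which is the kind of non-physical stress a filter without positivity can produce.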
Two-scale decomposition using RKPM filter with linear consistency
[Figure panels: $P_a u$, $P_{2a} u$, $Q_{2a} u$. $Re_\tau = 395$ DNS data provided by William Cabot, CTR]
Improving LES reliability on unstructured grids will rely on understanding:
• the side effects of a finite divergence
• the influence of commutative errors for LES in complex geometry and unstructured grids
• filter construction that yields second-order commutative errors
• dispersive and diffusive errors on SGS models
• the influence of grid anisotropy on SGS models
• the interaction between explicit filters, filter scales, SGS models and resolved-scale numerics
• the interaction between advective schemes and sub-grid scale models
• the interaction between SGS models and the resolved-scale numerics
• the errors introduced by VLES, i.e., under-resolved LES
Summary & Conclusions
• Projection methods have evolved dramatically over the past decade
  - A-conjugate projection-CG is an effective way to treat the PPE
  - The numerical performance of projection methods is difficult to predict from analysis
  - Simplicity & flexibility make them capable of treating a large range of flow regimes and physics
• Preserving $\nabla\cdot\mathbf{u} = 0$ is a critical part of incompressible LES computations
  - Unconstrained divergence can introduce artificial dissipation independent of advective schemes
• RKPM filters
  - control commutative errors (on unstructured meshes)
  - provide multi-scale decomposition for dynamic SGS models
  - yield realizable SGS stresses due to enforced consistency