-
On the Design of Nonconforming High-Resolution Finite Element Schemes for Transport Problems
Matthias Möller
Institute of Applied Mathematics (LS3), TU Dortmund, Germany
Modeling and Simulation of Transport Phenomena, Moselle Valley, Germany, July 30 - August 1, 2012
-
Motivation
Objective: apply AFC schemes to nonconforming finite elements
■ Do the algebraic design criteria apply to nonconforming elements?
■ Is the accuracy of solutions comparable to P1/Q1 approximations?
■ How to implement essential boundary conditions?
■ Is there any benefit from using nonconforming elements?
Algebraic Flux Correction (AFC): a family of high-resolution schemes for convection-dominated transport and anisotropic diffusion problems.
■ universal stabilization approach based on algebraic design criteria
■ well established for conforming (multi-)linear finite element schemes
■ found complicated to extend to higher-order finite elements
-
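Model problem
■ Convection-diffusion equation
$\int_\Omega w\,[\dot u + \nabla\cdot f(u)]\,dx = 0, \quad f(u) = vu - d\nabla u$
■ Fletcher's group representation
$u(x,t) \approx \sum_j \varphi_j(x)\,u_j(t), \quad f(u) \approx \sum_j \varphi_j(x)\,f(u_j)$
■ Semi-discrete high-order scheme
$\sum_j m_{ij}\,\dot u_j = \sum_j k_{ij}\,u_j + \sum_j s_{ij}\,u_j$
$k_{ij} = -v_j\cdot\int_\Omega \varphi_i\nabla\varphi_j\,dx, \quad s_{ij} = -d\int_\Omega \nabla\varphi_i\cdot\nabla\varphi_j\,dx$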
-
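Model problem
■ Convection-diffusion equation
$\int_\Omega w\,[\dot u + \nabla\cdot f(u)]\,dx = 0, \quad f(u) = vu - d\nabla u$
■ Fletcher's group representation
$u(x,t) \approx \sum_j \varphi_j(x)\,u_j(t), \quad f(u) \approx \sum_j \varphi_j(x)\,f(u_j)$
■ Semi-discrete high-order scheme
$M_C\,\dot u = K u$
$\sum_j m_{ij}\,\dot u_j = \sum_{j\ne i} k_{ij}(u_j - u_i) + \delta_i u_i, \quad \delta_i = \sum_j k_{ij}$
$k_{ij} = -v_j\cdot\int_\Omega \varphi_i\nabla\varphi_j\,dx$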
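The rewriting of $Ku$ in terms of solution differences is a purely algebraic identity, which can be checked numerically. The following is a minimal sketch; the function name `edge_form` and the random stand-in matrix are illustrative assumptions, not part of the talk.

```python
import numpy as np

def edge_form(K, u):
    """Evaluate sum_{j != i} k_ij (u_j - u_i) + delta_i u_i for every row i."""
    delta = K.sum(axis=1)            # delta_i = sum_j k_ij
    diff = u[None, :] - u[:, None]   # diff[i, j] = u_j - u_i (zero on the diagonal)
    return (K * diff).sum(axis=1) + delta * u

rng = np.random.default_rng(0)
K = rng.standard_normal((6, 6))      # stand-in for the discrete transport operator
u = rng.standard_normal(6)

print(np.allclose(edge_form(K, u), K @ u))
```

The difference form is what the algebraic design criteria act on: the sign of each off-diagonal coefficient $k_{ij}$ can be inspected edge by edge.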
-
Family of AFC schemes (Kuzmin et al.)
$M_L\,\dot u = [K + D]\,u + F(\dot u, u)$
■ $M_L$: lumped mass matrix
■ $D$: artificial diffusion operator
■ $F(\dot u, u)$: antidiffusive correction
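The slide does not spell out how $D$ is built. A minimal sketch, assuming the construction commonly used in the AFC literature, $d_{ij} = \max(-k_{ij}, 0, -k_{ji})$ for $j \ne i$ and $d_{ii} = -\sum_{j\ne i} d_{ij}$, applied to a central-difference convection operator on a periodic 1D grid (the test matrix and the function name are illustrative assumptions):

```python
import numpy as np

def artificial_diffusion(K):
    """Symmetric operator with zero row sums that removes negative off-diagonals of K."""
    D = np.maximum(0.0, np.maximum(-K, -K.T))   # d_ij = max(-k_ij, 0, -k_ji)
    np.fill_diagonal(D, 0.0)
    np.fill_diagonal(D, -D.sum(axis=1))         # d_ii = -sum_{j != i} d_ij
    return D

# Central-difference convection with velocity v > 0 on a periodic grid (assumed test case).
n, v = 5, 1.0
K = np.zeros((n, n))
for i in range(n):
    K[i, (i - 1) % n] = 0.5 * v
    K[i, (i + 1) % n] = -0.5 * v

D = artificial_diffusion(K)
L = K + D                                       # low-order operator
off_diag = L - np.diag(np.diag(L))
print(off_diag.min() >= 0.0)                    # all off-diagonal entries non-negative
```

For this test case $K + D$ reduces to the first-order upwind operator, which is the expected low-order counterpart of the central scheme.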
-
Family of AFC schemes (Kuzmin et al.)
$M_L\,\dot u = [K + D]\,u + F(\dot u, u)$
■ Low-order: $F \equiv 0$
■ Nonlinear flux correction
NL-TVD: $F(u) = -Du$
NL-FCT, NL-GP: $F(\dot u, u) = [M_L - M_C]\,\dot u - Du$
■ Linearized flux correction
Lin-FCT: $M_L u = M_L u^L + \bar F(\dot u^L, u^L)$
■ AFC-LPT, ...
-
Review of algebraic design principles
■ Jameson's Local Extremum Diminishing criterion
IF $m_i\,\dot u_i = \sum_{j\ne i} \sigma_{ij}(u_j - u_i)$ with $m_i$ positive and $\sigma_{ij}$ not negative
THEN local solution maxima/minima do not increase/decrease
■ Semi-discrete high-resolution AFC scheme
$m_i\,\dot u_i = \sum_{j\ne i} (k_{ij} + d_{ij})(u_j - u_i) + \delta_i u_i + \sum_{j\ne i} \alpha_{ij} f_{ij}$
$k_{ij} + d_{ij}$: not negative by construction
$\alpha_{ij} f_{ij}$: controlled by flux limiter
$m_i = \sum_j m_{ij}$: need to check the positivity of mass matrix coefficients for each finite element by hand!
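The LED criterion can be illustrated in a fully discrete setting. The sketch below assumes a forward Euler step (the talk itself uses Crank-Nicolson) and a random non-negative coefficient matrix with a tridiagonal stencil; under the step-size bound $\Delta t \le \min_i m_i / \sum_{j\ne i}\sigma_{ij}$ each new value is a convex combination of the old values at node $i$ and its neighbours, so it stays within their range.

```python
import numpy as np

def led_euler_step(m, S, u, dt):
    """One forward Euler step of m_i du_i/dt = sum_{j != i} s_ij (u_j - u_i)."""
    diff = u[None, :] - u[:, None]               # diff[i, j] = u_j - u_i
    return u + dt * (S * diff).sum(axis=1) / m

rng = np.random.default_rng(1)
n = 8
idx = np.arange(n)
stencil = np.abs(idx[:, None] - idx[None, :]) == 1   # tridiagonal neighbourhood (assumed)
S = (rng.random((n, n)) + 0.1) * stencil             # s_ij >= 0, zero diagonal
m = rng.random(n) + 0.5                              # m_i > 0
u = rng.standard_normal(n)

dt = np.min(m / S.sum(axis=1))                       # keeps all convex weights non-negative
u_new = led_euler_step(m, S, u, dt)

# Local bounds over node i and its stencil neighbours.
nb = stencil | np.eye(n, dtype=bool)
lo = np.where(nb, u[None, :], np.inf).min(axis=1)
hi = np.where(nb, u[None, :], -np.inf).max(axis=1)
print(np.all(u_new >= lo - 1e-12) and np.all(u_new <= hi + 1e-12))
```

A negative $m_i$ or $\sigma_{ij}$ breaks the convex-combination argument, which is why the sign of the mass matrix coefficients has to be checked for each element type.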
-
Finite element spaces
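■ Parametric quadrilateral finite elements
$Q(T) = \{ q = \hat q \circ \psi_T^{-1},\ \hat q \in \hat Q(\hat T) \}$
■ Bilinear one-to-one mapping
$\psi_T : \hat T := [-1,1]\times[-1,1] \mapsto T \in \mathcal{T}_h$
$\hat Q_1(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x\hat y \rangle$
Rannacher & Turek '92:
$\hat Q_1^{rot}(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x^2 - \hat y^2 \rangle$
$Q_{1,npar}^{rot}(T) = \mathrm{span}\langle 1, \xi, \eta, \xi^2 - \eta^2 \rangle$
[Figure: elements $Q_1(T)$ and $Q_1^{rot}(T)$]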
-
Nonconforming shape functions
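$\hat\varphi_1(\hat x) = \tfrac14 - \tfrac12\hat y - \lambda(\hat x^2 - \hat y^2), \quad \hat\varphi_3(\hat x) = \tfrac14 + \tfrac12\hat y - \lambda(\hat x^2 - \hat y^2)$
$\hat\varphi_2(\hat x) = \tfrac14 + \tfrac12\hat x + \lambda(\hat x^2 - \hat y^2), \quad \hat\varphi_4(\hat x) = \tfrac14 - \tfrac12\hat x + \lambda(\hat x^2 - \hat y^2)$
■ Midpoint based variant
$\lambda = \tfrac14: \quad \hat\varphi_i(\hat m_j) = \delta_{ij}$
■ Mean value based variant
$\lambda = \tfrac38: \quad |\hat\Gamma_j|^{-1}\int_{\hat\Gamma_j} \hat\varphi_i(\hat x)\,d\sigma = \delta_{ij}$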
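Both interpolation conditions can be verified numerically. The sketch below assumes the edge numbering implied by the formulas (edge 1: $\hat y = -1$, edge 2: $\hat x = 1$, edge 3: $\hat y = 1$, edge 4: $\hat x = -1$) and writes the free parameter of the $\hat x^2 - \hat y^2$ term as `lam`; the function names are illustrative.

```python
import numpy as np

def shape_functions(x, y, lam):
    """Rotated bilinear shape functions on the reference square [-1, 1]^2."""
    q = x**2 - y**2
    return np.array([
        0.25 - 0.5 * y - lam * q,
        0.25 + 0.5 * x + lam * q,
        0.25 + 0.5 * y - lam * q,
        0.25 - 0.5 * x + lam * q,
    ])

# Edge midpoints in the assumed numbering: bottom, right, top, left.
MIDPOINTS = [(0.0, -1.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

def midpoint_values(lam):
    """Matrix with entries phi_i(m_j)."""
    return np.array([shape_functions(mx, my, lam) for mx, my in MIDPOINTS]).T

def edge_means(lam):
    """Matrix with entries |Gamma_j|^{-1} int_{Gamma_j} phi_i dsigma."""
    t, w = np.polynomial.legendre.leggauss(3)    # exact for the quadratic integrands
    means = np.empty((4, 4))
    for j, (mx, my) in enumerate(MIDPOINTS):
        x = mx - my * t                          # points along edge j, t in [-1, 1]
        y = my + mx * t
        means[:, j] = 0.5 * shape_functions(x, y, lam) @ w   # edge length is 2
    return means

print(np.allclose(midpoint_values(0.25), np.eye(4)))
print(np.allclose(edge_means(0.375), np.eye(4)))
```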
-
Matrix analysis - part 1
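■ Consistent mass matrix on reference element
$$
\hat M = \begin{pmatrix}
\tfrac{16}{45}\lambda^2 + \tfrac{7}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 - \tfrac{1}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 \\
-\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 + \tfrac{7}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 - \tfrac{1}{24} \\
\tfrac{16}{45}\lambda^2 - \tfrac{1}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 + \tfrac{7}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 \\
-\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 - \tfrac{1}{24} & -\tfrac{16}{45}\lambda^2 + \tfrac18 & \tfrac{16}{45}\lambda^2 + \tfrac{7}{24}
\end{pmatrix}
$$
■ Positivity criterion
$0.3423 \approx \tfrac{1}{16}\sqrt{30} < |\lambda| < \tfrac{3}{16}\sqrt{10} \approx 0.5929$
■ midpoint based variant ($\lambda = \tfrac14$) has negative matrix coefficients
■ mean value based variant ($\lambda = \tfrac38$) has positive matrix coefficients
■ lower bound is invariant to shape of quadrilateral element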
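The sign pattern of the consistent mass matrix can be reproduced by numerical quadrature. This is a minimal sketch, assuming the reference square $[-1,1]^2$ with unit Jacobian and writing the free shape-function parameter as `lam`. Under that assumption the computed entries are a constant positive multiple of the tabulated expressions, which does not affect the positivity criterion.

```python
import numpy as np

def reference_mass_matrix(lam, nq=4):
    """M_ij = int phi_i phi_j over [-1, 1]^2, tensor-product Gauss quadrature."""
    t, w = np.polynomial.legendre.leggauss(nq)   # exact for degree <= 2*nq - 1 per direction
    X, Y = np.meshgrid(t, t, indexing="ij")
    W = np.outer(w, w)
    q = X**2 - Y**2
    phi = np.stack([
        0.25 - 0.5 * Y - lam * q,
        0.25 + 0.5 * X + lam * q,
        0.25 + 0.5 * Y - lam * q,
        0.25 - 0.5 * X + lam * q,
    ])
    return np.einsum("iab,jab,ab->ij", phi, phi, W)

for name, lam in (("midpoint", 0.25), ("mean value", 0.375)):
    M = reference_mass_matrix(lam)
    print(name, "smallest entry > 0:", M.min() > 0.0)

# Bounds of the positivity criterion.
lower, upper = np.sqrt(30.0) / 16.0, 3.0 * np.sqrt(10.0) / 16.0
print(lower, upper)
```

Evaluating `reference_mass_matrix` slightly inside and outside the interval `(lower, upper)` shows the sign change of the smallest entry at both bounds.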
-
Matrix analysis - part 2
[Figure: sparsity pattern of the system matrix for Q1 FE, nz = 369]
[Figure: sparsity pattern of the system matrix for Q1rot FE, nz = 530]
[Figure: number of non-zero entries per matrix row (3 to 12) versus matrix row number (0 to 80), for Q1 and Q1rot]
Storage format for Q1:
■ CRS, CCS
Storage format for Q1rot:
■ ELLPACK
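To make the difference between the two storage formats concrete, the sketch below converts a small matrix from compressed row storage (CRS) to an ELLPACK layout, in which every row is padded to the same number of entries. The helper names and the tridiagonal test matrix are illustrative assumptions; padding slots use column index 0 with value 0.

```python
import numpy as np

def dense_to_csr(A):
    """Row pointer, column indices and values of the non-zeros of A (row-major order)."""
    rows, cols = np.nonzero(A)
    counts = np.bincount(rows, minlength=A.shape[0])
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return indptr, cols, A[rows, cols]

def csr_to_ell(indptr, indices, data):
    """Pad every row to the length of the longest row."""
    n = len(indptr) - 1
    row_len = np.diff(indptr)
    width = int(row_len.max())
    ell_cols = np.zeros((n, width), dtype=np.int64)
    ell_vals = np.zeros((n, width))
    for i in range(n):
        k = row_len[i]
        ell_cols[i, :k] = indices[indptr[i]:indptr[i + 1]]
        ell_vals[i, :k] = data[indptr[i]:indptr[i + 1]]
    return ell_cols, ell_vals

def ell_matvec(ell_cols, ell_vals, x):
    """y = A x with a fixed number of operations per row."""
    return (ell_vals * x[ell_cols]).sum(axis=1)

A = np.array([[4.0, -1.0, 0.0, 0.0],
              [-1.0, 4.0, -1.0, 0.0],
              [0.0, -1.0, 4.0, -1.0],
              [0.0, 0.0, -1.0, 4.0]])
indptr, indices, data = dense_to_csr(A)
ell_cols, ell_vals = csr_to_ell(indptr, indices, data)
x = np.arange(1.0, 5.0)
print(np.allclose(ell_matvec(ell_cols, ell_vals, x), A @ x))
print("fill ratio:", data.size / ell_vals.size)
```

The fill ratio measures how much of the padded storage holds actual non-zeros; the more uniform the row lengths, the closer it gets to 1, which is the situation the regular Q1rot pattern favours.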
-
Solid body rotation
$\dot u + \nabla\cdot(vu) = 0$ in $(0,1)^2$, $\quad u = 0$ on $\Gamma_{inflow}$
■ Velocity field: $v(x,y) = (0.5 - y,\ x - 0.5)$
■ Grid size: $h = 1/2^l,\ l = 5, 6, \ldots$
■ Stochastic grid disturbance: 0%, 1%, 5%
■ Time step in Crank-Nicolson: $\Delta t = 1.28\cdot h$
■ Initial = exact solution at $t = 2\pi k,\ k \in \mathbb{N}$
-
SBR: low-order solutions (0% mesh perturbation)
[Figure: low-order solutions for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$ and $Q_1^{rot,MPt}$; panel annotations: 54%, 63%, 64%, 64%]
-
SBR: low-order solutions (5% mesh perturbation)
[Figure: low-order solutions for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$ and $Q_1^{rot,MPt}$; panel annotations: 54%, 64%, 64%, 64%]
-
SBR: NL-FCT solutions (0% mesh perturbation)
[Figure: NL-FCT solutions for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$ and $Q_1^{rot,MPt}$]
-
SBR: NL-FCT solutions (5% mesh perturbation)
[Figure: NL-FCT solutions for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$ and $Q_1^{rot,MPt}$]
-
SBR: L2-error (5% mesh perturbation)
[Figure: L2-error (logarithmic axis, 0.01 to 1) versus refinement level (5 to 8); two panels: conforming FE, nonconforming FE; curves: Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD]
-
Rotation of a Gaussian hill
$\dot u + \nabla\cdot(vu - d\nabla u) = 0$ in $(-1,1)^2$, $\quad v(x,y) = (-y, x)$, $\quad d = 0.001$
[Figure: $\tilde Q_1$ FE, NL-FCT solution at time $t = \tfrac{5}{2}\pi$]
-
RGH: L2-error (5% mesh perturbation)
[Figure: L2-error (logarithmic axis, 0.001 to 10) versus refinement level (5 to 9); two panels: conforming FE, nonconforming FE; curves: Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD]
-
RGH: dispersion-error (5% mesh perturbation)
[Figure: dispersion error (axis from -0.5 to 5.5) versus refinement level (5 to 9); two panels: conforming FE, nonconforming FE; curves: Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD]
-
Taxonomy of finite elements
                      conforming FE    nonconforming FE
accuracy              good             good
numerical diffusion   small            small(er)
#DOFs, #edges         smaller          larger
sparsity pattern      irregular        regular
-
Boundary conditions
Convection-diffusion equation:
$\nabla\cdot(vu - d\nabla u) = f$ in $\Omega$, $\quad u = u_D$ on $\Gamma_D$, $\quad (d\nabla u)\cdot n = g$ on $\Gamma_N$
Hyperbolic limit ($d \to 0$):
$\nabla\cdot(vu) = f$ in $\Omega$, $\quad (vu)\cdot n = h$ on $\Gamma_{in}$
■ Y. Bazilevs, T. Hughes, Weak imposition of Dirichlet boundary conditions in fluid mechanics, Computers & Fluids 36 (1) 2007, 12-26
■ consistent, adjoint-consistent: $\gamma = 1$, $\tau_b = \frac{C d}{h_b}$
■ consistent, adjoint-inconsistent: $\gamma = -1$, $\tau_b = \frac{C d}{h_b}$
■ E. Burman, A penalty free non-symmetric Nitsche type method for the weak imposition of boundary conditions, eprint arXiv:1106.5612v2 (Nov 2011)
■ penalty free: $\gamma = -1$, $\tau_b \equiv 0$
-
Weak imposition of boundary conditions
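$$
\begin{aligned}
&\int_\Omega -\nabla w_h\cdot(v u_h - d\nabla u_h)\,dx + \int_\Gamma w_h\,(v u_h)\cdot n\,ds \\
&\quad - \int_{\Gamma_D} w_h\,(d\nabla u_h)\cdot n\,ds - \int_{\Gamma_D} (\gamma d\nabla w_h)\cdot n\,u_h\,ds \\
&\quad - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_h\,ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D\cap\Gamma_b} \tau_b\,w_h u_h\,ds \\
&= \int_\Omega w_h f\,dx + \int_{\Gamma_N} w_h g\,ds - \int_{\Gamma_D} \gamma\,(d\nabla w_h)\cdot n\,u_D\,ds \\
&\quad - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_D\,ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D\cap\Gamma_b} \tau_b\,w_h u_D\,ds
\end{aligned}
$$
$\gamma = \pm 1, \quad \tau_b = \frac{C d}{h_b}$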
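A minimal 1D sketch of this weak formulation may help to see how the individual boundary terms enter the linear system. It assumes linear (P1) elements on a uniform grid of $(0,1)$, $f = 0$, Dirichlet data at both end points and no Neumann boundary; the function name, the default `C = 4` (taken from the parameter choice $\tau_b = 4\cdot 0.001/h_b$ shown later in the talk) and the dense solver are illustrative assumptions. No flux correction is applied.

```python
import numpy as np

def solve_cd_nitsche(N, v, d, uD0, uD1, gamma=1.0, C=4.0):
    """P1 Galerkin for (v u - d u')' = 0 on (0, 1) with weakly imposed Dirichlet data."""
    h = 1.0 / N
    A = np.zeros((N + 1, N + 1))   # terms acting on u_h only
    B = np.zeros((N + 1, N + 1))   # terms acting on u_h (left) and on u_D (right)

    # Volume term: int -w' (v u - d u') dx, assembled element by element.
    Ke = v * np.array([[0.5, 0.5], [-0.5, -0.5]]) \
        + d / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    for e in range(N):
        A[e:e + 2, e:e + 2] += Ke

    tau = C * d / h
    # (boundary node, its neighbour, outward normal)
    for k, nb, n in ((0, 1, -1.0), (N, N - 1, 1.0)):
        A[k, k] += v * n             # w (v u).n on the whole boundary
        A[k, k] -= d / h             # -w (d grad u).n, using
        A[k, nb] += d / h            #   grad u . n = (u_k - u_nb) / h
        B[k, k] -= gamma * d / h     # -gamma (d grad w).n u
        B[nb, k] += gamma * d / h
        if v * n < 0.0:              # inflow part of the Dirichlet boundary
            B[k, k] -= v * n
        B[k, k] += tau               # penalty term

    g = np.zeros(N + 1)
    g[0], g[N] = uD0, uD1            # Dirichlet data at x = 0 and x = 1
    return np.linalg.solve(A + B, B @ g)

x = np.linspace(0.0, 1.0, 65)
u_sym = solve_cd_nitsche(64, v=1.0, d=1.0, uD0=1.0, uD1=0.0, gamma=1.0, C=4.0)
u_exact = (np.exp(x) - np.e) / (1.0 - np.e)
print(np.max(np.abs(u_sym - u_exact)))
```

Setting `gamma=-1.0, C=0.0` gives the penalty-free variant. For $d = 0.001$ the unstabilized Galerkin solution oscillates unless the mesh resolves the boundary layer, which is where the flux-corrected (FEM-TVD) discretization of the talk comes in.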
-
Weak imposition of boundary conditions
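$$
\begin{aligned}
&\int_\Omega -\nabla w_h\cdot(v u_h - d\nabla u_h)\,dx + \int_\Gamma w_h\,(v u_h)\cdot n\,ds - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_h\,ds \\
&= \int_\Omega w_h f\,dx - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_D\,ds
\end{aligned}
$$
If $\Gamma_D\cap\Gamma_{in} = \Gamma_{in}$, the two boundary integrals on the left-hand side combine to $\int_{\Gamma_{out}} w_h\,v\cdot n\,u_h\,ds$.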
-
[Figure: FEM-TVD, $Q_1^{rot}$; $\gamma = 1$, $\tau_b = \frac{4\cdot 0.001}{h_b}$]
-
[Figure: FEM-TVD, $Q_1^{rot}$; $\gamma = -1$, $\tau_b = \frac{4\cdot 0.001}{h_b}$]
-
[Figure: FEM-TVD, $Q_1^{rot}$; $\gamma = -1$, $\tau_b \equiv 0$]
-
1D convection-diffusion equation
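$1\,u_x - 0.001\,u_{xx} = 0, \quad u(0) = 1, \quad u(1) = 0$
[Figure: FEM-TVD, $Q_1$; solutions for three parameter choices]
■ $\gamma = 1$, $\tau_b = \frac{4\cdot 0.001}{h}$
■ $\gamma = -1$, $\tau_b = \frac{4\cdot 0.001}{h}$
■ $\gamma = -1$, $\tau_b \equiv 0$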
-
Summary
■ Do the algebraic design criteria apply to nonconforming elements? Yes, for the integral mean value based variant of nonconforming rotated bilinear finite elements.
■ Is the accuracy of solutions comparable to P1/Q1 approximations? Yes, accuracy and numerical dissipation are comparable.
■ How to implement essential boundary conditions? Apply consistent, adjoint-consistent Nitsche-type method.
■ Is there any benefit from using nonconforming elements? Regular sparsity structure beneficial for parallel HPC.
-
Outlook
■ Extend AFC schemes to other nonconforming finite elements
■ NC-Quad (Z. Cai, J. Douglas, Jr., X. Ye)
■ Composite Crouzeix-Raviart (F. Schieweck)
■ Generalize the consistent, adjoint-consistent approach for implementing Dirichlet boundary conditions to periodic ones
■ Improve (parallel) efficiency by exploiting the regular sparsity structure
■ use variant of ELLPACK matrix storage format
■ reduce communication costs due to weaker coupling
-
Acknowledgment
AFC schemes
■ D. Kuzmin (University of Erlangen-Nuremberg)
Featflow2
■ M. Köster, P. Zajac (TU Dortmund)
Source code freely available at:
http://www.featflow.de/en/software/featflow2.html
-
Edge-based solvers for the compressible Euler equations on multicores and GPUs
-
Example: edge-based flux-assembly with Q1 FE
[Figure: bandwidth (GB/s, 0 to 50) versus #edges per color group (up to 1,000K); legend: 1 OMP, 2 OMP, 4 OMP, 8 OMP, 12 OMP on Xeon X5680 (2x6), CUDA on C2070]
■ best GPU performance for large problem sizes
■ best CPU performance if data fits into cache
-
Example: edge-based flux assembly with Q1 FE
$F_i = \sum_{k=1}^{N_{colors}} \sum_{ij \in CG_k} \sigma_{ij}(U_j - U_i)$
[Figure: accumulated time (ms): OMP 6: 132 ms; CUDA: 23 ms (5.7x), with transfer "55 ms"]
[Figure: bandwidth (GB/s, 0 to 50) versus #edges per color group (0K to 500K) for color groups 1-14; OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s]
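The double sum over color groups can be sketched as follows. Edges are grouped so that no two edges of a group share a node; within a group all updates are then independent and can be processed in parallel without write conflicts. The greedy coloring, the scalar nodal values (the talk deals with state vectors of the Euler equations) and the symmetric edge coefficients are simplifying assumptions made for illustration.

```python
import numpy as np

def color_edges(edges, n_nodes):
    """Greedy edge coloring: edges within one group share no node."""
    groups, used = [], []            # edge indices / touched-node masks per group
    for e, (i, j) in enumerate(edges):
        for g, mask in enumerate(used):
            if not mask[i] and not mask[j]:
                break                # edge fits into existing group g
        else:
            groups.append([])
            used.append(np.zeros(n_nodes, dtype=bool))
            g = len(groups) - 1
        groups[g].append(e)
        used[g][i] = used[g][j] = True
    return [np.array(grp) for grp in groups]

def assemble(edges, sigma, U, groups):
    """F_i = sum over color groups of sigma_ij (U_j - U_i), one vectorized pass per group."""
    F = np.zeros_like(U)
    for grp in groups:
        i, j = edges[grp, 0], edges[grp, 1]
        flux = sigma[grp] * (U[j] - U[i])
        F[i] += flux                 # safe: no node index repeats within a group
        F[j] -= flux
    return F

# Quadrilateral with both diagonals (assumed test mesh).
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0], [0, 2], [1, 3]])
sigma = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
U = np.array([0.5, -1.0, 2.0, 0.0])

groups = color_edges(edges, 4)
F = assemble(edges, sigma, U, groups)
print(len(groups), F)
```

Larger color groups expose more parallel work per pass, which is consistent with the bandwidth measurements being reported per color group.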
-
Example: edge-based flux assembly with Q1rot FE
[Figure: accumulated time (ms): OMP 6: 191 ms; CUDA: 32 ms (5.9x), with transfer "95 ms"]
[Figure: bandwidth (GB/s, 0 to 60) versus #edges per color group (0K to 1,500K) for color groups 1-9; OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s]