Maciej Paszynski
Department of Computer Science
AGH University of Science and Technology, Krakow, Poland
http://www.ki.agh.edu.pl/en/research-groups/a2s
Frontal and multi-frontal solvers:
orderings, elimination trees,
refinement trees
EXEMPLARY SIMPLE 1D NUMERICAL PROBLEM
Find temperature distribution such that
Finite difference discretization
TRIDIAGONAL MATRIX FOR EXEMPLARY 1D PROBLEM
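A minimal sketch of how such a tridiagonal matrix might be assembled. The layout is an assumption inferred from the row operations used in the elimination steps: a boundary row with a single 1 at each end, and interior rows with entries 1/h^2, -2/h^2, 1/h^2. The function name `tridiagonal_system` is hypothetical.

```python
import numpy as np

def tridiagonal_system(n, h):
    """Assemble an n x n finite difference matrix (assumed layout, see lead-in)."""
    A = np.zeros((n, n))
    A[0, 0] = 1.0            # assumed boundary condition row at the left end
    A[n - 1, n - 1] = 1.0    # assumed boundary condition row at the right end
    for i in range(1, n - 1):
        A[i, i - 1] = 1.0 / h**2
        A[i, i] = -2.0 / h**2
        A[i, i + 1] = 1.0 / h**2
    return A

A = tridiagonal_system(5, 0.25)
print(A)
```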
FRONTAL SOLVER
Front matrix
The frontal matrix focuses on the forward elimination of the first row:
2nd row = 2nd row - 1/h^2 * 1st row
The first row has been eliminated.
The frontal matrix now focuses on the elimination of the second row:
3rd row = 3rd row - [1/h^2] / [-2/h^2] * 2nd row
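The row operations above can be sketched as a sweep in which only two consecutive rows (the front) are touched at each step. This is an illustration under the assumption that the matrix is tridiagonal with boundary rows containing a single 1; the function name `frontal_forward_elimination` and the right-hand side of ones are hypothetical.

```python
import numpy as np

def frontal_forward_elimination(A, b):
    """Forward elimination of a tridiagonal system; step k touches rows k and k+1 only."""
    U = np.array(A, dtype=float)
    y = np.array(b, dtype=float)
    n = len(y)
    for k in range(n - 1):
        # e.g. (1/h^2) / 1 for the first row, (1/h^2) / (-2/h^2) for the second
        factor = U[k + 1, k] / U[k, k]
        U[k + 1, k:k + 2] -= factor * U[k, k:k + 2]
        y[k + 1] -= factor * y[k]
    return U, y

# Assumed test matrix: boundary rows [1], interior rows [1, -2, 1] / h^2.
n, h = 6, 0.2
A = np.zeros((n, n))
A[0, 0] = A[-1, -1] = 1.0
for i in range(1, n - 1):
    A[i, i - 1:i + 2] = np.array([1.0, -2.0, 1.0]) / h**2
b = np.ones(n)

U, y = frontal_forward_elimination(A, b)
x = np.linalg.solve(U, y)   # stands in for the backward substitution
```

No pivoting is performed, which is adequate for this particular matrix but not in general.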
ELEMENT-WISE DECOMPOSITION
Construct multiple frontal matrices in such a way that they sum up to the full matrix.
Variables must be split into parts.
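A small sketch of this decomposition, assuming one 2x2 frontal matrix per interval and an interior diagonal entry -2/h^2 split into two halves of -1/h^2. The boundary rows with a single 1 are an assumption; `element_matrices` and `assemble` are hypothetical names.

```python
import numpy as np

def element_matrices(n_elem, h):
    """One 2x2 frontal matrix per interval; shared diagonal entries are split in half."""
    mats = []
    for e in range(n_elem):
        M = np.array([[-1.0, 1.0], [1.0, -1.0]]) / h**2
        if e == 0:
            M[0] = [1.0, 0.0]   # assumed boundary row at the left end
        if e == n_elem - 1:
            M[1] = [0.0, 1.0]   # assumed boundary row at the right end
        mats.append(M)
    return mats

def assemble(mats):
    """Sum the frontal matrices into the full matrix."""
    n = len(mats) + 1
    A = np.zeros((n, n))
    for e, M in enumerate(mats):
        A[e:e + 2, e:e + 2] += M
    return A

A = assemble(element_matrices(4, 0.5))
print(A)
```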
MULTI-FRONTAL SOLVER ALGORITHM
First, all frontal matrices are constructed.
The first and second frontal matrices are summed into a new 3x3 frontal matrix.
Its first and second rows are fully assembled.
The first column is eliminated using the first row:
2nd row = 2nd row - [1/h^2] * 1st row
The third and fourth frontal matrices are summed into a new 3x3 frontal matrix.
Now only the second row is fully assembled.
Change of the ordering: the fully assembled row is moved to the first position.
Eliminate the entries below the diagonal:
2nd row = 2nd row - [1/h^2] / [-2/h^2] * 1st row
3rd row = 3rd row - [1/h^2] / [-2/h^2] * 1st row
Why can I subtract the fully assembled 1st row from the not fully assembled 2nd and 3rd rows?
This is because subtraction and addition are interchangeable (now I am subtracting from the 2nd, not fully assembled row, and later I will add the remaining part).
Moreover, the 1st row, which is used for the subtractions, contains only zeros in the other columns.
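The argument that subtraction and addition can be interchanged can be checked numerically. The sketch below assumes interior element matrices [[-1, 1], [1, -1]] / h^2 and five nodes numbered 0 to 4; all variable names are illustrative. It eliminates the fully assembled node 2 from a 3x3 front and adds the missing contributions afterwards, then compares with assembling first and eliminating second.

```python
import numpy as np

h = 1.0
M = np.array([[-1.0, 1.0], [1.0, -1.0]]) / h**2   # assumed interior element matrix

# Front for two neighbouring elements on nodes (1, 2, 3); only node 2 is fully assembled.
F = np.zeros((3, 3))
F[0:2, 0:2] += M
F[1:3, 1:3] += M

# Change of ordering: put the fully assembled node first, giving (2, 1, 3).
p = [1, 0, 2]
F = F[np.ix_(p, p)]

# Eliminate the entries below the diagonal in the first column.
for r in (1, 2):
    F[r] -= F[r, 0] / F[0, 0] * F[0]

# Later, add the remaining parts of rows 1 and 3 coming from the outer elements.
schur_then_add = F[1:, 1:] + np.diag([M[1, 1], M[0, 0]])

# Reference: assemble all four elements first, then eliminate node 2.
A = np.zeros((5, 5))
for e in range(4):
    A[e:e + 2, e:e + 2] += M
for r in (1, 3):
    A[r] -= A[r, 2] / A[2, 2] * A[2]
assemble_then_eliminate = A[np.ix_([1, 3], [1, 3])]

print(np.allclose(schur_then_add, assemble_then_eliminate))
```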
The fifth and sixth frontal matrices are summed into a new 3x3 frontal matrix.
Now the second and third rows are fully assembled.
…
Continue until the root node.
PARALLEL MULTI-FRONTAL SOLVER ALGORITHM
All frontal matrices are generated at the same time, one on each of Processors 1 to 6.
Summing up and elimination are executed at the same time over different pairs of frontal matrices.
The algorithm is repeated recursively until we reach the root of the tree.
The algorithm results in an upper triangular matrix stored in a distributed manner.
Computational complexity = height of the tree = log N (where N = #unknowns - 1)
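The logarithmic height can be illustrated by counting the rounds of pairwise merging. This is only a sketch: it assumes that every pair at a given level is processed at the same time and that an unpaired frontal matrix is passed up unchanged; `merge_levels` is a hypothetical helper.

```python
import math

def merge_levels(n_fronts):
    """Number of pairwise merge rounds until a single root front remains."""
    levels = 0
    while n_fronts > 1:
        n_fronts = (n_fronts + 1) // 2   # an unpaired front is carried up unchanged
        levels += 1
    return levels

for n in (6, 8, 1024):
    print(n, merge_levels(n), math.ceil(math.log2(n)))
```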
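GENERALIZATION TO 1D FINITE ELEMENT METHOD
Strong formulation
Find $u \in C^2(0,l)$ such that
$$-\left(a(x)u'(x)\right)' + b(x)u'(x) + c(x)u(x) = f(x)$$
$$u(0) = 0$$
$$a(l)u'(l) + \beta u(l) = \gamma$$
Weak formulation
Find $u \in V$ such that
$$\int_0^l \left(a u'v' + b u'v + c u v\right)dx + \beta u(l)v(l) = \int_0^l f v\,dx + \gamma v(l) \quad \forall v \in V$$
$$V = \left\{ v \in H^1(0,l) : v(0) = 0 \right\}$$
$$b(u,v) = l(v) \quad \forall v \in V$$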
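GENERALIZATION TO 1D FINITE ELEMENT METHOD
Finite element method discretization
$$u = \sum_{i=1}^{N} a_i e_i, \qquad v = e_j$$
$$\sum_{i=1}^{N} a_i\, b(e_i, e_j) = l(e_j), \quad j = 1,\dots,N$$
Exemplary basis functions for [0,l] = [0,1], for two finite elements
$$e_1(x) = \begin{cases} 1 - 2x & \text{for } x \in (0, 0.5) \\ 0 & \text{for } x \in (0.5, 1) \end{cases} \qquad \frac{de_1}{dx}(x) = \begin{cases} -2 & \text{for } x \in (0, 0.5) \\ 0 & \text{for } x \in (0.5, 1) \end{cases}$$
$$e_2(x) = \begin{cases} 2x & \text{for } x \in (0, 0.5) \\ 2 - 2x & \text{for } x \in (0.5, 1) \end{cases} \qquad \frac{de_2}{dx}(x) = \begin{cases} 2 & \text{for } x \in (0, 0.5) \\ -2 & \text{for } x \in (0.5, 1) \end{cases}$$
$$e_3(x) = \begin{cases} 0 & \text{for } x \in (0, 0.5) \\ 2x - 1 & \text{for } x \in (0.5, 1) \end{cases} \qquad \frac{de_3}{dx}(x) = \begin{cases} 0 & \text{for } x \in (0, 0.5) \\ 2 & \text{for } x \in (0.5, 1) \end{cases}$$
$$b(e_i, e_j) = \int_0^l \left(a e_i' e_j' + b e_i' e_j + c e_i e_j\right)dx + \beta e_i(l) e_j(l)$$
$$l(e_j) = \int_0^l f e_j\,dx + \gamma e_j(l)$$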
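As an illustration, the three piecewise linear basis functions for two elements on [0, 1] can be evaluated as follows. The helper names `basis` and `u_h` are hypothetical; the sketch only shows that each function equals 1 at its own node and 0 at the others, so the coefficients a_i are the nodal values of u.

```python
import numpy as np

def basis(x):
    """Values of e_1, e_2, e_3 at the points x (two elements on [0, 1])."""
    x = np.asarray(x, dtype=float)
    e1 = np.where(x <= 0.5, 1.0 - 2.0 * x, 0.0)
    e2 = np.where(x <= 0.5, 2.0 * x, 2.0 - 2.0 * x)
    e3 = np.where(x <= 0.5, 0.0, 2.0 * x - 1.0)
    return np.array([e1, e2, e3])

def u_h(x, a):
    """Linear combination sum_i a_i e_i(x)."""
    return np.asarray(a, dtype=float) @ basis(x)

nodes = np.array([0.0, 0.5, 1.0])
print(basis(nodes))               # rows: e_1, e_2, e_3 evaluated at the three nodes
print(u_h(0.25, [0.0, 2.0, 4.0]))
```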
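GENERALIZATION TO 1D FINITE ELEMENT METHOD
$$\begin{bmatrix} b(e_1,e_1) & b(e_2,e_1) & b(e_3,e_1) \\ b(e_1,e_2) & b(e_2,e_2) & b(e_3,e_2) \\ b(e_1,e_3) & b(e_2,e_3) & b(e_3,e_3) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} l(e_1) \\ l(e_2) \\ l(e_3) \end{bmatrix}$$
$$e_1 = \chi_1^{K_1}, \qquad e_2 = \chi_2^{K_1} + \chi_1^{K_2}, \qquad e_3 = \chi_2^{K_2}$$
$$b(e_1,e_3) = \int_0^1 \left(a e_1' e_3' + b e_1' e_3 + c e_1 e_3\right)dx + \beta e_1(l) e_3(l) = 0$$
$$b(e_1,e_2) = \int_0^1 \left(a e_1' e_2' + b e_1' e_2 + c e_1 e_2\right)dx + \beta e_1(l) e_2(l) = \int_0^{0.5} \left(a \frac{d\chi_1^{K_1}}{dx}\frac{d\chi_2^{K_1}}{dx} + b \frac{d\chi_1^{K_1}}{dx}\chi_2^{K_1} + c\, \chi_1^{K_1}\chi_2^{K_1}\right)dx$$
$$b(e_2,e_3) = \int_0^1 \left(a e_2' e_3' + b e_2' e_3 + c e_2 e_3\right)dx + \beta e_2(l) e_3(l) = \int_{0.5}^{1} \left(a \frac{d\chi_1^{K_2}}{dx}\frac{d\chi_2^{K_2}}{dx} + b \frac{d\chi_1^{K_2}}{dx}\chi_2^{K_2} + c\, \chi_1^{K_2}\chi_2^{K_2}\right)dx$$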
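GENERALIZATION TO 1D FINITE ELEMENT METHOD
$$b(e_1,e_1) = \int_0^1 \left(a e_1' e_1' + b e_1' e_1 + c e_1 e_1\right)dx + \beta e_1(l) e_1(l) = \int_0^{0.5} \left(a \frac{d\chi_1^{K_1}}{dx}\frac{d\chi_1^{K_1}}{dx} + b \frac{d\chi_1^{K_1}}{dx}\chi_1^{K_1} + c \left(\chi_1^{K_1}\right)^2\right)dx$$
$$b(e_2,e_2) = \int_0^1 \left(a e_2' e_2' + b e_2' e_2 + c e_2 e_2\right)dx + \beta e_2(l) e_2(l) = \int_0^{0.5} \left(a \frac{d\chi_2^{K_1}}{dx}\frac{d\chi_2^{K_1}}{dx} + b \frac{d\chi_2^{K_1}}{dx}\chi_2^{K_1} + c \left(\chi_2^{K_1}\right)^2\right)dx + \int_{0.5}^{1} \left(a \frac{d\chi_1^{K_2}}{dx}\frac{d\chi_1^{K_2}}{dx} + b \frac{d\chi_1^{K_2}}{dx}\chi_1^{K_2} + c \left(\chi_1^{K_2}\right)^2\right)dx$$
$$b(e_3,e_3) = \int_0^1 \left(a e_3' e_3' + b e_3' e_3 + c e_3 e_3\right)dx + \beta e_3(l) e_3(l) = \int_{0.5}^{1} \left(a \frac{d\chi_2^{K_2}}{dx}\frac{d\chi_2^{K_2}}{dx} + b \frac{d\chi_2^{K_2}}{dx}\chi_2^{K_2} + c \left(\chi_2^{K_2}\right)^2\right)dx + \beta \chi_2^{K_2}(l)\chi_2^{K_2}(l)$$
Global system assembled from the element contributions:
$$\begin{bmatrix} b(\chi_1^{K_1},\chi_1^{K_1}) & b(\chi_2^{K_1},\chi_1^{K_1}) & 0 \\ b(\chi_1^{K_1},\chi_2^{K_1}) & b(\chi_2^{K_1},\chi_2^{K_1}) + b(\chi_1^{K_2},\chi_1^{K_2}) & b(\chi_2^{K_2},\chi_1^{K_2}) \\ 0 & b(\chi_1^{K_2},\chi_2^{K_2}) & b(\chi_2^{K_2},\chi_2^{K_2}) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} l(\chi_1^{K_1}) \\ l(\chi_2^{K_1}) + l(\chi_1^{K_2}) \\ l(\chi_2^{K_2}) \end{bmatrix}$$
Local system over a single element K:
$$\begin{bmatrix} b(\chi_1^{K},\chi_1^{K}) & b(\chi_2^{K},\chi_1^{K}) \\ b(\chi_1^{K},\chi_2^{K}) & b(\chi_2^{K},\chi_2^{K}) \end{bmatrix} \begin{bmatrix} x_1^{K} \\ x_2^{K} \end{bmatrix} = \begin{bmatrix} l(\chi_1^{K}) \\ l(\chi_2^{K}) \end{bmatrix}$$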
Notice that when we switch from finite difference to finite elements,
it only changes the local systems of equations at tree nodes
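A minimal sketch of one such local finite element system, assuming linear shape functions on the element [x0, x1], constant coefficients a, b, c and a constant load f, with the boundary contribution at x = l left out. The integrals are computed with a 2-point Gauss rule; `local_system` and the names `chi`, `dchi` are hypothetical. Row j and column i hold the integral for the pair (i, j).

```python
import numpy as np

def local_system(x0, x1, a=1.0, b=0.0, c=0.0, f=1.0):
    """2x2 element matrix B and load vector l for two linear shape functions."""
    h = x1 - x0
    t, w = np.polynomial.legendre.leggauss(2)   # Gauss points and weights on [-1, 1]
    x = 0.5 * (x0 + x1) + 0.5 * h * t           # quadrature points mapped to the element
    wx = 0.5 * h * w                            # mapped weights
    chi = np.array([(x1 - x) / h, (x - x0) / h])             # shape function values
    dchi = np.array([[-1.0 / h] * 2, [1.0 / h] * 2])         # their derivatives
    B = np.zeros((2, 2))
    l = np.zeros(2)
    for j in range(2):
        for i in range(2):
            B[j, i] = np.sum(wx * (a * dchi[i] * dchi[j]
                                   + b * dchi[i] * chi[j]
                                   + c * chi[i] * chi[j]))
        l[j] = np.sum(wx * f * chi[j])
    return B, l

# Two elements on [0, 1]: the shared middle unknown receives both contributions.
B1, l1 = local_system(0.0, 0.5)
B2, l2 = local_system(0.5, 1.0)
A = np.zeros((3, 3))
A[0:2, 0:2] += B1
A[1:3, 1:3] += B2
print(A)
```

Only `local_system` would change between the finite difference and the finite element variants; the merging at the tree nodes stays the same.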
GENERALIZATION TO 1D FINITE ELEMENT METHOD
Global basis functions are composed of local shape functions, e.g.
$$e_2 = \chi_2^{K_1} + \chi_1^{K_2}$$
Local system of equations generated over the element K
$$b(\chi_i^K, \chi_j^K) = \int_K \left(a \frac{d\chi_i^K}{dx}\frac{d\chi_j^K}{dx} + b \frac{d\chi_i^K}{dx}\chi_j^K + c\, \chi_i^K \chi_j^K\right)dx + \beta \chi_i^K(l)\chi_j^K(l)$$
$$l(\chi_j^K) = \int_K f \chi_j^K\,dx + \gamma \chi_j^K(l)$$
$$\begin{bmatrix} b(\chi_1^{K},\chi_1^{K}) & b(\chi_2^{K},\chi_1^{K}) \\ b(\chi_1^{K},\chi_2^{K}) & b(\chi_2^{K},\chi_2^{K}) \end{bmatrix} \begin{bmatrix} x_1^{K} \\ x_2^{K} \end{bmatrix} = \begin{bmatrix} l(\chi_1^{K}) \\ l(\chi_2^{K}) \end{bmatrix}$$
[Figure: tree with this local system repeated at its nodes]
2D hp FINITE ELEMENT METHOD
We seek the solution u of some weak form of a PDE as a linear combination of shape functions $e_{hp}^i$ spread over the finite element mesh:
$$u \approx u_{hp} = \sum_{i=1}^{15} u_{hp}^i e_{hp}^i$$
[Figure: finite element mesh with the shape functions $e_{hp}^i$ marked]
The coefficients $u_{hp}^i$ (called "degrees of freedom", d.o.f.) are obtained by solving a system of linear equations, the finite element discretization of a weak (variational) form of the PDE:
$$\sum_{i=1}^{15} u_{hp}^i\, b(e_{hp}^i, e_{hp}^j) = l(e_{hp}^j), \quad j = 1,\dots,15$$
where $b(e_{hp}^i, e_{hp}^j)$ and $l(e_{hp}^j)$ are some integrals of the shape functions $e_{hp}^i$, $e_{hp}^j$.
FRONTAL SOLVER
SOLUTION BASED ON LINEAR ORDER OF ELEMENTS
Generates the frontal matrix for the first element and eliminates the fully assembled degrees of freedom (partial forward elimination, O(6*9^2)).
Generates the frontal matrix for the second element, merges it with the current frontal matrix, and eliminates the fully assembled degrees of freedom (full forward elimination).
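A sketch of partial forward elimination on a dense frontal matrix. It assumes that O(6*9^2) refers to eliminating 6 fully assembled degrees of freedom from a 9x9 frontal matrix, with the degrees of freedom to eliminate ordered first. The function name, the update counter and the random test matrix are illustrative only, and no pivoting is performed.

```python
import numpy as np

def partial_forward_elimination(F, k):
    """Eliminate the first k unknowns of front F; return the Schur complement and an update count."""
    F = np.array(F, dtype=float)
    n = F.shape[0]
    updates = 0
    for p in range(k):
        for r in range(p + 1, n):
            factor = F[r, p] / F[p, p]
            F[r, p:] -= factor * F[p, p:]
            updates += n - p          # entries touched in this row
    return F[k:, k:], updates

rng = np.random.default_rng(0)
F = rng.random((9, 9)) + 9.0 * np.eye(9)   # diagonally dominant, so no pivoting is needed
S, updates = partial_forward_elimination(F, 6)

# Reference Schur complement: A22 - A21 * inv(A11) * A12
ref = F[6:, 6:] - F[6:, :6] @ np.linalg.solve(F[:6, :6], F[:6, 6:])
print(np.allclose(S, ref), updates, 6 * 9**2)
```

The counted number of entry updates stays below the bound 6*9^2, and the remaining 3x3 block is what gets merged with the next frontal matrix.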
MULTI-FRONTAL SOLVER
SOLUTION BASED ON THE ELIMINATION TREE
Partial forward elimination on each of the two element frontal matrices
Full forward elimination of the interface problem matrix
COMPARISON OF COSTS
Number of operations for partial forward elimination
CONSTRUCTION OF THE ELIMINATION TREE
BASED ON THE HISTORY OF MESH REFINEMENTS
For any initial mesh, the elimination tree can be created using the nested dissection algorithm.
The elimination tree created for the initial mesh is updated when the mesh is refined (the elimination tree is constructed dynamically, during mesh refinements).
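The idea can be sketched on a one-dimensional row of elements, where nested dissection reduces to recursive bisection and a refinement replaces a leaf by a node with two son elements. This is a simplified illustration, not the data structure of the hp code; `Node`, `nested_dissection`, `refine` and `height` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    elements: List[str]                       # elements covered by this subtree
    sons: List["Node"] = field(default_factory=list)

def nested_dissection(elements):
    """Build a binary elimination tree by recursively halving the element list."""
    node = Node(list(elements))
    if len(elements) > 1:
        mid = len(elements) // 2
        node.sons = [nested_dissection(elements[:mid]),
                     nested_dissection(elements[mid:])]
    return node

def refine(node, name):
    """Replace the leaf element `name` by two son elements and update the ancestors."""
    if not node.sons:
        if node.elements == [name]:
            node.elements = [name + "a", name + "b"]
            node.sons = [Node([name + "a"]), Node([name + "b"])]
            return True
        return False
    for son in node.sons:
        if refine(son, name):
            i = node.elements.index(name)
            node.elements[i:i + 1] = [name + "a", name + "b"]
            return True
    return False

def height(node):
    return 0 if not node.sons else 1 + max(height(s) for s in node.sons)

tree = nested_dissection(["e1", "e2", "e3", "e4"])
h_before = height(tree)
found = refine(tree, "e3")
h_after = height(tree)
print(h_before, h_after, tree.elements)
```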
CONSTRUCTION OF THE REFINEMENT TREES
ELIMINATIONS BASED ON REFINEMENT TREES
[Figure: elimination tree with two levels marked: level of initial mesh elements, level of refinement trees]
• Local matrices for active elements (leaves of the elimination tree) are created.
• Interior degrees of freedom are eliminated at every leaf.
• Schur complement contributions are merged at the parent level.
• Fully assembled degrees of freedom are eliminated at the parent level (degrees of freedom related to common edges shared by both son elements can be eliminated now).
• Contributions to the Schur complement are merged again at the next level.
• Fully assembled degrees of freedom are eliminated.
• Finally, Schur complement contributions are merged at the tree root node.
• The common interface problem is solved at the tree root node (the size of the common interface problem corresponds to the size of the cross-section of the entire domain).
• The solution obtained at the root node is utilized at its son nodes, where the backward substitution is executed.
• The solution obtained at the second-level nodes is utilized at their son nodes, where the backward substitution is executed.
• The solution obtained at the third-level nodes is utilized at the leaf nodes, and the backward substitution is executed on the level of leaf nodes.
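The whole cycle (merge Schur complements going up, solve the interface problem at the root, substitute backward going down) can be sketched on a one-dimensional analogue. The assumptions are loud ones: a binary tree built by bisection instead of a refinement tree, linear elements for an assumed model problem -u'' + u = 1 with natural boundary conditions, and hypothetical function names throughout.

```python
import numpy as np

def element_system(h):
    """Assumed leaf matrix and load for -u'' + u = 1 with linear elements."""
    K = (np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
         + h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]]))
    return K, np.array([h / 2.0, h / 2.0])

def forward(lo, hi, h, record):
    """Return the interface system on nodes (lo, hi) for elements lo..hi-1."""
    if hi - lo == 1:
        return element_system(h)
    mid = (lo + hi) // 2
    SL, gL = forward(lo, mid, h, record)
    SR, gR = forward(mid, hi, h, record)
    F = np.zeros((3, 3))
    r = np.zeros(3)                       # front ordered as [mid, lo, hi]
    F[np.ix_([1, 0], [1, 0])] += SL       # left son lives on (lo, mid)
    r[[1, 0]] += gL
    F[np.ix_([0, 2], [0, 2])] += SR       # right son lives on (mid, hi)
    r[[0, 2]] += gR
    record.append((mid, lo, hi, F[0].copy(), r[0]))   # kept for backward substitution
    S = F[1:, 1:] - np.outer(F[1:, 0], F[0, 1:]) / F[0, 0]
    g = r[1:] - F[1:, 0] * r[0] / F[0, 0]
    return S, g

def multifrontal_solve(n_elem, h):
    record = []
    S, g = forward(0, n_elem, h, record)
    x = np.zeros(n_elem + 1)
    x[[0, n_elem]] = np.linalg.solve(S, g)            # interface problem at the root
    for mid, lo, hi, row, rhs in reversed(record):    # from the root down to the leaves
        x[mid] = (rhs - row[1] * x[lo] - row[2] * x[hi]) / row[0]
    return x

def dense_reference(n_elem, h):
    A = np.zeros((n_elem + 1, n_elem + 1))
    b = np.zeros(n_elem + 1)
    K, f = element_system(h)
    for e in range(n_elem):
        A[e:e + 2, e:e + 2] += K
        b[e:e + 2] += f
    return np.linalg.solve(A, b)

x = multifrontal_solve(6, 1.0 / 6.0)
print(np.allclose(x, dense_reference(6, 1.0 / 6.0)))
```

In this one-dimensional setting every interface consists of single nodes, so each parent eliminates exactly one unknown; in 2D the eliminated block would contain the degrees of freedom on the common edge.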
PERFORMANCE OF THE 1st VERSION OF THE DISTRIBUTED MEMORY MULTI-FRONTAL PARALLEL SOLVER
- Embedded into hp code
+ Outperforms MUMPS for large numbers of processors
- Slower than MUMPS for low numbers of processors
Profiling showed that the time-consuming part for low numbers of processors was actually the merging of matrices, performed on the level of unknowns, with moving of matrix entries.
Switch to the level of nodes: do not touch matrix entries, work with pointers.
1,482,570 degrees of freedom (68,826,475 non-zeros)
PAPERS
http://home.agh.edu.pl/~paszynsk/Publications.html
Maciej Paszyński, David Pardo, Carlos Torres-Verdin, Leszek Demkowicz, Victor Calo
A PARALLEL DIRECT SOLVER FOR SELF-ADAPTIVE HP FINITE ELEMENT METHOD
Journal of Parallel and Distributed Computing, 70, 3 (2013) 270-281
Anna Paszynska, Maciej Paszynski, Konrad Jopek, Maciej Wozniak, Damian Goik, Piotr Gurgul, Hassan AbouEisha, Mikhail Moshkov, Victor Calo, Andrew Lenharth, Donald Nguyen, Keshav Pingali
QUASI-OPTIMAL ELIMINATION TREES FOR 2D GRIDS WITH SINGULARITIES
Scientific Programming, (2015) Article ID 303024, 1-18
Maciej Paszynski, David Pardo, Anna Paszynska, Leszek Demkowicz
OUT-OF-CORE MULTI-FRONTAL SOLVER FOR MULTI-PHYSICS HP ADAPTIVE PROBLEMS
Procedia Computer Science, 4 (2011) 1788-1797