Efficient Finite Element Method Implementation
Realization on Adaptive Cartesian Grids
Joint Advanced Student School (JASS 2006)
Numerical Simulation
TU München
Carla Guillen
2
Agenda
FEM: Quick Review
Adaptive Cartesian Grids and FEM
Element-wise Traversal
Case Study: Jacobi Solver
Data Structures
    Cache Efficiency
    Traversal of Adaptive Grid
    Peano Tree and Quadtree
    Minimal Memory Requirement
    Parallelization
3
Finite Element Method Review
We recall the treatment of a PDE with the FEM:
1. Use the weak form of the PDE
2. Choose test and shape functions
3. Obtain the linear system of equations and the stencil
4
System of Equations
The system of equations obtained from the FEM has the form
Au = b
A: System matrix
u: Vector containing the unknowns assigned to the vertices of a Cartesian grid
b: Vector containing the right-hand side values
5
System Matrix
[Figure: structure of the system matrix; legend: non-zero coefficients of u, zero coefficients of u]
We can obtain the stencil from the structure of the rows.
System matrix is typically sparse.
6
General Stencil
The structure of the row (an example):
[ 0 0 -1 -1 -1 0 0 0 -1 8 -1 0 0 0 -1 -1 -1 0 0 ]
Each node depends only on its neighbouring nodes, so the non-zero entries of a row form the stencil.
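The example row can be folded back into a 3 x 3 stencil. The sketch below assumes a row-wise vertex numbering with six vertices per grid row, which is inferred from the three zeros between the groups of non-zero entries; the function name and its parameters are hypothetical.

```python
# Example row from the slide; the diagonal entry (8) belongs to the node itself.
ROW = [0, 0, -1, -1, -1, 0, 0, 0, -1, 8, -1, 0, 0, 0, -1, -1, -1, 0, 0]


def stencil_from_row(row, center, vertices_per_row):
    """Collect the 3 x 3 neighbourhood of `center` from one matrix row.

    Assumes vertices are numbered row by row, so the vertical neighbours
    sit `vertices_per_row` entries before and after the centre.
    """
    stencil = []
    for dy in (-1, 0, 1):
        start = center + dy * vertices_per_row
        stencil.append([row[start + dx] for dx in (-1, 0, 1)])
    return stencil


center = ROW.index(8)          # position of the diagonal entry
stencil = stencil_from_row(ROW, center, vertices_per_row=6)
for line in stencil:
    print(line)
```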
8
Adaptive grids
The need for fine grids arises with:
    Complex geometry boundaries
    Singularities (related to discretization error)
    Multi-scale phenomena
High resolution is sometimes required but not affordable:
    High resolution requires more memory space
    The time complexity of the applied algorithms increases
Why not combine high and low resolution where needed?
9
Adaptive grids
Refine only where necessary, combining high and low resolution:
    Done in a recursive way.
    Splitting of a cell into subgrids.
    Done only where needed.
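The recursive splitting can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: the function `refine`, the callback `needs_refinement` and the choice of refining towards the corner (0, 0) are assumptions, and the children are listed row by row rather than along a space-filling curve.

```python
def refine(x, y, size, level, needs_refinement, max_level, splits=3):
    """Return the leaf cells (x, y, size) of a recursively refined square.

    A cell is split into splits x splits subcells only where the
    hypothetical callback `needs_refinement` asks for it.
    """
    if level == max_level or not needs_refinement(x, y, size):
        return [(x, y, size)]
    child = size / splits
    leaves = []
    for j in range(splits):
        for i in range(splits):
            leaves += refine(x + i * child, y + j * child, child,
                             level + 1, needs_refinement, max_level, splits)
    return leaves


# Assumed example: refine only the cell touching the corner (0, 0).
def touches_origin(x, y, size):
    return x == 0.0 and y == 0.0


leaves = refine(0.0, 0.0, 1.0, 0, touches_origin, max_level=2)
print(len(leaves), "leaf cells")
```

With two refinement levels, only one of the nine coarse cells is split again, so most of the domain stays at low resolution.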
11
Use of Elements
Advantages of using elements instead of nodes:
    The calculation of the stencil for the adaptive case simplifies.
    Space-filling curves are easily applied.
    Stack data structures are applicable.
13
Implicit solver: Jacobi
Jacobi general formulation: [formula not preserved in the transcript]
Jacobi iterations:
While the residual is not sufficiently small:
    [update step not preserved in the transcript]
End while.
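For reference, the textbook Jacobi iteration fits the loop structure above. The sketch below is the standard method for a small dense system, not a reconstruction of the slide's formula; the function name, the tolerance and the test matrix are assumptions.

```python
def jacobi(A, b, tolerance=1e-10, max_iterations=10_000):
    """Textbook Jacobi iteration for A u = b, with A given as a list of rows."""
    n = len(b)
    u = [0.0] * n
    for _ in range(max_iterations):
        # Residual r = b - A u, computed from the old iterate only.
        r = [b[i] - sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
        if max(abs(ri) for ri in r) < tolerance:
            break
        # Jacobi update: every unknown is corrected by r_i / a_ii.
        u = [u[i] + r[i] / A[i][i] for i in range(n)]
    return u


# Assumed example: a small diagonally dominant system with solution (1, 1, 1).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
u = jacobi(A, b)
print(u)
```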
14
Residual in element-wise view
The stencil is split into elements. How do we treat the residual per element?
The residual of one node is equal to the right-hand side minus the stencil times u.
The residual of a node is the sum of the residual contributions of all elements adjacent to it.
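The element-wise accumulation can be checked against the node-wise stencil on a small uniform grid. The sketch assumes square bilinear elements for the Laplace operator, with the element matrix scaled by 3 so that the assembled stencil is the integer stencil [-1 -1 -1; -1 8 -1; -1 -1 -1]; all names are hypothetical.

```python
# Element matrix of a square bilinear element (Laplace operator), scaled by 3.
# Local vertex order: counter-clockwise, starting at the first corner.
ELEMENT_MATRIX = [
    [2.0, -0.5, -1.0, -0.5],
    [-0.5, 2.0, -0.5, -1.0],
    [-1.0, -0.5, 2.0, -0.5],
    [-0.5, -1.0, -0.5, 2.0],
]


def elementwise_residual(u, b):
    """Accumulate r = b - A u by visiting every element exactly once."""
    n = len(u)                       # vertices per side
    r = [row[:] for row in b]        # start from the right-hand side
    for ey in range(n - 1):
        for ex in range(n - 1):
            verts = [(ey, ex), (ey, ex + 1), (ey + 1, ex + 1), (ey + 1, ex)]
            local_u = [u[y][x] for y, x in verts]
            for i, (y, x) in enumerate(verts):
                # Each element adds its share to the residual of its vertices.
                r[y][x] -= sum(ELEMENT_MATRIX[i][j] * local_u[j] for j in range(4))
    return r


def stencil_residual(u, b, y, x):
    """Node-wise reference: right-hand side minus stencil times u."""
    total = 8.0 * u[y][x]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) != (0, 0):
                total -= u[y + dy][x + dx]
    return b[y][x] - total


u = [[float((3 * y + x * x) % 7) for x in range(5)] for y in range(5)]
b = [[0.0] * 5 for _ in range(5)]
r = elementwise_residual(u, b)
```

At interior vertices the two views agree, because each of the four adjacent elements contributes one quarter of the diagonal entry and its part of the neighbour couplings.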
15
Additional solvers to be used
Not only Jacobi solver, but:
    Gauss-Seidel
    Red-black Gauss-Seidel
    Conjugate Gradient
    Multigrid with Jacobi smoother
17
Considerations of the Traversal of the Grid
We need to traverse the grid in an efficient way:
    Saving processing time.
        Cache efficient.
    Saving memory space.
    Use of parallel processing where available.
18
Cache: Temporal Locality
The goal is to ensure that data referenced now will be referenced again in the near future, while it is still in the cache.
[Figure: data copied between RAM, cache and CPU]
19
Visiting all the elements: data is required more than once
Visiting element U6 requires loading vertex data that was already loaded.
Elements U1, U2 and U3 were referenced previously, and their vertices may still be in the cache.
[Figure: grid of elements labelled U1, U2, U3, U5, U6, U7, U9, U10, U11]
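How often vertex data is needed during an element-wise traversal can be counted directly. The sketch assumes a uniform grid of square elements with data stored at the four corners of each element; the function name is hypothetical.

```python
from collections import Counter


def vertex_touch_counts(cells_per_side):
    """Count how often each vertex is needed when every element is visited once."""
    counts = Counter()
    for ey in range(cells_per_side):
        for ex in range(cells_per_side):
            # An element touches its four corner vertices.
            for vy in (ey, ey + 1):
                for vx in (ex, ex + 1):
                    counts[(vy, vx)] += 1
    return counts


counts = vertex_touch_counts(3)
print(sum(counts.values()), "vertex accesses for", len(counts), "vertices")
```

Interior vertices are needed by four elements, which is why the order of the traversal decides whether those repeated accesses hit the cache.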
21
Space filling curves: Peano curve
The Cartesian grid is divided into nine elements. Elements that need a higher resolution are divided again into nine.
The cells are traversed in a characteristic order:
    Each cell is traversed only once.
    Cells visited in succession must be neighbouring cells.
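Both properties can be checked on a regular grid. The sketch below uses Peano's base-3 digit construction to map a position along the curve to a cell; the function name is hypothetical, and the orientation of the resulting curve may differ from the one drawn in the slides.

```python
def peano_cell(t, level):
    """Return (x, y) of the t-th cell along a Peano curve on a 3**level grid."""
    # Base-3 digits of t, most significant first; two digits per level.
    digits = []
    for _ in range(2 * level):
        digits.append(t % 3)
        t //= 3
    digits.reverse()

    x = y = 0
    x_parity = 0   # parity of the sum of the digits consumed for x so far
    y_parity = 0   # parity of the sum of the digits consumed for y so far
    for i in range(level):
        dx = digits[2 * i]
        # The x digit is mirrored when the preceding y digits sum to an odd number.
        x = 3 * x + (2 - dx if y_parity else dx)
        x_parity ^= dx & 1
        dy = digits[2 * i + 1]
        # The y digit is mirrored when the x digits so far sum to an odd number.
        y = 3 * y + (2 - dy if x_parity else dy)
        y_parity ^= dy & 1
    return x, y


cells = [peano_cell(t, 2) for t in range(81)]
print(cells[:9])
```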
22
Peano Curve
[Figures: Peano curve at levels 2, 3 and 4; Peano curve at level 2 in 3D]
23
Other space filling curves
Hilbert curve
Sierpinski curve (uses a triangular grid)
25
Tree Representation of the Adaptive Grid
The representation of the adaptive grid can be done with a tree.
Data of a cell is stored at the corresponding node.
The root represents the lowest resolution level.
The leaves are the cells that are not refined further, i.e. the highest local resolution.
26
Levels of the Grid
Let's consider the following adaptive grid.
It contains 3 levels.
The highest resolution level occurs in two cells only.
27
Traversal of Each Level of the Adaptive Grid
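First level: the single cell 0.
Second level: nine cells, numbered in traversal order (grid rows shown from top to bottom):
    16 17 27
    15 14  4
     1  2  3
Third level: the two refined cells 4 and 17 are each split into nine cells (5 to 13 and 18 to 26).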
28
Peano-tree: Numbering of Elements and Data Structure
Root: 0
Children of 0: 1 2 3 4 14 15 16 17 27
Children of 4: 5 ... 13
Children of 17: 18 ... 26
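The numbering in this tree is what a depth-first traversal produces when a cell is numbered before its children. A minimal sketch, with a hypothetical `Cell` class and helper names, that rebuilds the example with the fourth and eighth cell of the second level refined:

```python
class Cell:
    """One grid cell; `children` is empty for a leaf and holds nine cells otherwise."""

    def __init__(self, children=None):
        self.children = children if children is not None else []
        self.number = None


def number_cells(cell, next_number=0):
    """Number the cells depth-first, parent before children; return the next free number."""
    cell.number = next_number
    next_number += 1
    for child in cell.children:
        next_number = number_cells(child, next_number)
    return next_number


def refined_cell():
    """A cell that is split into nine unrefined subcells."""
    return Cell([Cell() for _ in range(9)])


children = [Cell() for _ in range(9)]
children[3] = refined_cell()   # fourth cell in traversal order
children[7] = refined_cell()   # eighth cell in traversal order
root = Cell(children)
total = number_cells(root)
print([c.number for c in root.children])
```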
29
Example
30
Quadtree
Same principle as the Peano tree.
An element is split into four subcells.
A node has either 0 or 4 children.
Root: 0
Children of 0: 1 2 7 8
Children of 2: 3 4 5 6
32
The Refinement Extra Bit
Every element contains one refinement bit.
If an element's refinement bit is set, the node has children and these are visited.
Root: 0
Children of 0: 1 2 7 8
Children of 2: 3 4 5 6
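With one refinement bit per element, the whole tree can be stored as a flat bit sequence in traversal order. The sketch below assumes the bits are stored depth-first, parent before children; the function name and the tuple layout are hypothetical.

```python
def traverse(bits, children_per_cell=4):
    """Walk a tree stored as depth-first refinement bits.

    Returns a list of (number, level, refined) tuples in visiting order.
    """
    stream = iter(bits)
    visited = []

    def visit(level):
        refined = bool(next(stream))
        visited.append((len(visited), level, refined))
        if refined:
            # A set bit means the cell has children, which are visited next.
            for _ in range(children_per_cell):
                visit(level + 1)

    visit(0)
    return visited


# Quadtree of the example: cells 0 and 2 are refined, all others are leaves.
bits = [1, 0, 1, 0, 0, 0, 0, 0, 0]
visited = traverse(bits)
for number, level, refined in visited:
    print("  " * level + str(number), "(refined)" if refined else "")
```

No child pointers are stored; the structure is recovered purely from the order of the bits, which keeps the memory requirement at one bit per element for the tree itself.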
34
Load per Processor
The load per processor should be balanced, although on adaptive grids this is not straightforward.
Amount of workload per processor: number of elements / number of processors.
Dynamic load balancing is even more complicated.
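One common way to reach this workload per processor is to cut the list of elements, taken in traversal order, into contiguous pieces of nearly equal length. The slides do not state how the split is done, so the sketch below is an assumption; the function name is hypothetical.

```python
def partition(num_elements, num_processors):
    """Split element indices 0..num_elements-1 into contiguous, balanced ranges."""
    base, extra = divmod(num_elements, num_processors)
    ranges = []
    start = 0
    for rank in range(num_processors):
        # The first `extra` processors take one element more than the rest.
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# Assumed example: 25 leaf elements distributed over 4 processors.
parts = partition(25, 4)
print([len(p) for p in parts])
```

Because consecutive elements along a space-filling curve are neighbours, contiguous pieces of the list tend to form compact domains.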
35
Challenges for Domains
A balanced load for each processor
    Domain decomposition
A small ratio between communication surface and volume
    Compact domains
    Not straightforward on adaptive grids
36
Domain decomposition
37
Summary
FEM: main ingredients.
Adaptive Cartesian grid: low and high resolution.
Element-wise view of the grid.
Iterative methods to solve the system of equations.
Data structures:
    Cache efficiency
    Traversal of adaptive grid
    Peano tree and quadtree
    Minimal memory requirement
    Parallelization