Solving Poisson Equation using Conjugate Gradient Method
and its implementation
Jongsu Kim
Theoretical
From the Basics, Ax=b
Linear Systems
Ax = b
Goal of this presentation
What have you learned?
• Direct Methods
  • Gauss Elimination
  • Thomas Algorithm (TDMA) (for tridiagonal matrices only)
• Iterative Methods
  • Jacobi method
  • SOR method
  • Conjugate Gradient method
  • Red-Black Jacobi method
Iterative Method
Start with the decomposition
A = D − E − F
Jacobi Method
x_{k+1} = D^{-1}(E + F) x_k + D^{-1} b
Gauss-Seidel Method
x_{k+1} = (D − E)^{-1} F x_k + (D − E)^{-1} b   (sweep i = 1, …, n − 1, n)
Backward Gauss-Seidel Iteration
(D − F) x_{k+1} = E x_k + b   (sweep i = n, n − 1, …, 1)
Splitting of the A matrix
The previous methods share a common form. With A = D − E − F:
x_{k+1} = D^{-1}(E + F) x_k + D^{-1} b   (Jacobi)
x_{k+1} = (D − E)^{-1} F x_k + (D − E)^{-1} b   (Gauss-Seidel)
(D − F) x_{k+1} = E x_k + b   (Backward Gauss-Seidel)
All fit
M x_{k+1} = N x_k + b = (M − A) x_k + b
A = M − N
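As a concrete illustration (not on the original slides), here is a minimal C sketch of the Jacobi splitting M = D, N = E + F applied to the 1D Poisson problem −u″ = f with homogeneous Dirichlet boundary conditions; the function name, grid size, and convergence test are illustrative assumptions.

```c
#include <math.h>
#include <stdlib.h>

/* Jacobi sketch for 1D Poisson, -u'' = f, on a uniform grid with
 * homogeneous Dirichlet BCs. Here A = D - E - F with D = 2/h^2 on the
 * diagonal and -1/h^2 on the off-diagonals. All names are illustrative. */
static void jacobi_poisson_1d(int n, double h, const double *f,
                              double *x, int max_iter, double tol)
{
    double *xnew = malloc(n * sizeof *xnew);
    for (int k = 0; k < max_iter; ++k) {
        double diff2 = 0.0;
        for (int i = 0; i < n; ++i) {
            double left  = (i > 0)     ? x[i - 1] : 0.0; /* boundary value 0 */
            double right = (i < n - 1) ? x[i + 1] : 0.0;
            /* x_{k+1} = D^{-1}((E + F) x_k + b): the diagonal is 2/h^2 */
            xnew[i] = (f[i] + (left + right) / (h * h)) * (h * h) / 2.0;
            diff2 += (xnew[i] - x[i]) * (xnew[i] - x[i]);
        }
        for (int i = 0; i < n; ++i) x[i] = xnew[i];
        if (sqrt(diff2) < tol) break;  /* illustrative stopping criterion */
    }
    free(xnew);
}
```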
Introducing the SOR (Successive Over-Relaxation) method
ωA = (D − ωE) − (ωF + (1 − ω)D)
(D − ωE) x_{k+1} = (ωF + (1 − ω)D) x_k + ωb
SOR to SSOR
Gauss-Seidel method
(D − E) x_{k+1} = F x_k + b,  i.e.  x_{k+1} = (D − E)^{-1} F x_k + (D − E)^{-1} b
SOR (Successive Over-Relaxation) method
(D − ωE) x_{k+1} = (ωF + (1 − ω)D) x_k + ωb
Backward Gauss-Seidel method
(D − F) x_{k+1} = E x_k + b
Backward SOR method
(D − ωF) x_{k+1} = (ωE + (1 − ω)D) x_k + ωb
SSOR (Symmetric Successive Over-Relaxation) method
An SOR step followed by a backward SOR step (for a symmetric matrix):
(D − ωE) x_{k+1/2} = (ωF + (1 − ω)D) x_k + ωb
(D − ωF) x_{k+1} = (ωE + (1 − ω)D) x_{k+1/2} + ωb
Eliminating x_{k+1/2} gives
x_{k+1} = G_ω x_k + f_ω
G_ω = (D − ωF)^{-1} (ωE + (1 − ω)D) × (D − ωE)^{-1} (ωF + (1 − ω)D)
f_ω = ω (D − ωF)^{-1} [I + (ωE + (1 − ω)D)(D − ωE)^{-1}] b
Observing that
(ωE + (1 − ω)D)(D − ωE)^{-1} = [−(D − ωE) + (2 − ω)D] (D − ωE)^{-1} = −I + (2 − ω) D (D − ωE)^{-1}
we obtain
f_ω = ω(2 − ω) (D − ωF)^{-1} D (D − ωE)^{-1} b
This is used as a preconditioner (explained later).
Preconditioned System
We now have two forms for an iterative method:
x_{k+1} = G_ω x_k + f_ω
x_{k+1} = M^{-1} N x_k + M^{-1} b
Ex)
G_JA(A) = I − D^{-1} A,   G_GS(A) = I − (D − E)^{-1} A
In general,
G = M^{-1} N = M^{-1}(M − A) = I − M^{-1} A,   f = M^{-1} b
At the fixed point, (I − G) x = f
Another view…
[I − (I − M^{-1} A)] x = f
M^{-1} A x = f
M^{-1} A x = M^{-1} b   with preconditioner M
Preconditioned System
M^{-1} A x = M^{-1} b   with preconditioner M
Jacobi:       M_JA = D
Gauss-Seidel: M_GS = D − E
SOR:          M_SOR = (1/ω)(D − ωE)
SSOR:         M_SSOR = (1/(ω(2 − ω))) (D − ωE) D^{-1} (D − ωF)
M^{-1} may not be "SPARSE" due to the inverse.
How do we compute this?
To apply w = M^{-1} A v: compute r = A v, then solve M w = r.
But A v might be expensive. Can we do better?
w = M^{-1} A v = M^{-1}(M − N) v = (I − M^{-1} N) v
r = N v
solve M w = r
w := v − w
N may be sparser than A, so this is less expensive than forming A v.
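A minimal sketch of this recipe (not from the slides), specialized to the Jacobi splitting M = D, N = D − A, where the solve M w = r is just a diagonal division; the function names are illustrative assumptions.

```c
/* Apply w = M^{-1} A v without forming M^{-1} A, following the
 * r = N v, M w = r, w := v - w recipe. Sketch for M = D, N = D - A.
 * matvec_N() stands for any user-supplied sparse product with N. */
static void apply_preconditioned_operator(int n, const double *diag,
                                          const double *v, double *w,
                                          void (*matvec_N)(int, const double *, double *))
{
    matvec_N(n, v, w);                 /* r = N v (stored in w)        */
    for (int i = 0; i < n; ++i)
        w[i] = v[i] - w[i] / diag[i];  /* w := v - M^{-1} r, with M = D */
}
```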
Minimization Problem
Forget about Ax = b temporarily and consider a quadratic function f.

Scalar function:   f(x) = (1/2) A x^2 − b x + c
                   f'(x) = A x − b
Matrix form:       f(x) = (1/2) x^T A x − b^T x + c
                   f'(x) = (1/2) A^T x + (1/2) A x − b

If the matrix A is symmetric (A^T = A), then
f'(x) = A x − b
Setting the gradient to zero, we get the linear system we wish to solve.
Our original GOAL!!
[Figure: graphs of the quadratic form f(x).
(a) Quadratic form for a positive-definite matrix.
(b) Quadratic form for a negative-definite matrix.
(c) Singular (and positive-indefinite) matrix; a line that runs through the bottom of the valley is the set of solutions.
(d) Indefinite matrix: a saddle point.]
For a symmetric positive-definite matrix A, minimizing
f(x) = (1/2) x^T A x − b^T x + c
reduces to solving our system Ax = b.
Minimization Problem
Steepest Descent Method
Choose the direction in which f decreases most quickly, which is the direction opposite f'(x^{(i)}):
−f'(x^{(i)}) = r^{(i)} = b − A x^{(i)}
x^{(1)} = x^{(0)} + α r^{(0)}
To find α, set (d/dα) f(x^{(1)}) = 0:
(d/dα) f(x^{(1)}) = f'(x^{(1)})^T (d/dα) x^{(1)} = f'(x^{(1)})^T r^{(0)} = 0
So f'(x^{(i+1)}) and r^{(i)} are orthogonal! With −f'(x^{(i+1)}) = r^{(i+1)}:
r^{(i+1)T} r^{(i)} = 0
α = (r^{(i)T} r^{(i)}) / (r^{(i)T} A r^{(i)})
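Put together, one steepest-descent sweep is only a matrix-vector product plus two dot products per iteration; a minimal C sketch (not from the slides; matvec(), the iteration cap, and the squared-tolerance test are illustrative assumptions):

```c
#include <stdlib.h>

/* Steepest descent: x_{i+1} = x_i + alpha*r_i, alpha = (r^T r)/(r^T A r).
 * matvec() stands for any routine computing y = A x. */
static void steepest_descent(int n, void (*matvec)(int, const double *, double *),
                             const double *b, double *x, int max_iter, double tol2)
{
    double *r  = malloc(n * sizeof *r);
    double *Ar = malloc(n * sizeof *Ar);

    matvec(n, x, r);                               /* r = b - A x */
    for (int i = 0; i < n; ++i) r[i] = b[i] - r[i];

    for (int it = 0; it < max_iter; ++it) {
        double rr = 0.0, rAr = 0.0;
        matvec(n, r, Ar);
        for (int i = 0; i < n; ++i) { rr += r[i] * r[i]; rAr += r[i] * Ar[i]; }
        if (rr < tol2) break;                      /* ||r||^2 small enough  */
        double alpha = rr / rAr;                   /* alpha = r^T r / r^T A r */
        for (int i = 0; i < n; ++i) {
            x[i] += alpha * r[i];                  /* step along the residual  */
            r[i] -= alpha * Ar[i];                 /* r_{i+1} = r_i - alpha*A r_i */
        }
    }
    free(r); free(Ar);
}
```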
Conjugate Gradient Method
The steepest descent method does not always converge well.
[Figure: worst case of the steepest descent method.
• Solid lines: the worst-convergence line
• Dashed line: steps toward convergence]
Why doesn't it go directly along that line for fast convergence? → This is related to an eigenvalue problem.
Introducing the Conjugate Gradient method.
Conjugate Gradient Method
What is the meaning of conjugate?
• Definition: a binomial formed by negating the second term of a binomial
• x + y ← conjugate → x − y
Then, what is the meaning of conjugate gradient?
• The steepest descent method often finds itself taking steps in the same direction
• Wouldn't it be better if we got it right at every step?
• Here is a step:
• Error e^{(i)} = x^{(i)} − x, residual r^{(i)} = b − A x^{(i)}, and d^{(i)} a set of orthogonal search directions
• For each step, we choose a point x^{(i+1)} = x^{(i)} + α^{(i)} d^{(i)}
• To find α, e^{(i+1)} should be orthogonal to d^{(i)}   (e^{(i+1)} = e^{(i)} + α^{(i)} d^{(i)})
d^{(i)T} e^{(i+1)} = 0
d^{(i)T} (e^{(i)} + α^{(i)} d^{(i)}) = 0
α^{(i)} = − (d^{(i)T} e^{(i)}) / (d^{(i)T} d^{(i)})
But we don't know anything about e^{(i)}; if we knew e^{(i)}, we would already know the answer.
Conjugate Gradient Method
Instead of orthogonality, introduce A-orthogonality:
d^{(i)T} A d^{(j)} = 0   if d^{(i)} and d^{(j)} are A-orthogonal, or conjugate
Requiring e^{(i+1)} to be A-orthogonal to d^{(i)} is equivalent to finding the minimum point along the search direction d^{(i)}, as in the steepest descent method:
(d/dα) f(x^{(i+1)}) = 0   (α minimizes f when the directional derivative equals zero)
f'(x^{(i+1)})^T (d/dα) x^{(i+1)} = 0   (chain rule)
−r^{(i+1)T} d^{(i)} = 0
How can this be the same as the orthogonality used in the steepest descent method? Using f'(x^{(i+1)}) = A x^{(i+1)} − b, r^{(i)} = b − A x^{(i)}, and x^{(i+1)} = x^{(i)} + α^{(i)} d^{(i)}:
x^{(i+1)T} A^T d^{(i)} − b^T d^{(i)} = 0
x^{(i+1)T} A^T d^{(i)} − x^T A^T d^{(i)} = 0
e^{(i+1)T} A^T d^{(i)} = 0   (with e^{(i+1)} = x^{(i+1)} − x)
Transposing again: d^{(i)T} A e^{(i+1)} = 0
α^{(i)} = (d^{(i)T} r^{(i)}) / (d^{(i)T} A d^{(i)})
Conjugate Gradient Method
Starting from d^{(i)T} A e^{(i+1)} = 0, with x^{(i+1)} = x^{(i)} + α d^{(i)} and e^{(i+1)} = (x^{(i)} + α d^{(i)}) − x:
d^{(i)T} A e^{(i+1)} = d^{(i)T} A ((x^{(i)} + α d^{(i)}) − x)
d^{(i)T} A x^{(i)} + α d^{(i)T} A d^{(i)} − d^{(i)T} A x = 0
d^{(i)T} (A x^{(i)} − b) = −α d^{(i)T} A d^{(i)}
α^{(i)} = (d^{(i)T} r^{(i)}) / (d^{(i)T} A d^{(i)})
How do we find d^{(i)}?
Gram-Schmidt Process
Find a set of A-orthogonal vectors from a set of independent vectors u_i:
d^{(i)} = u^{(i)} + Σ_{k=0}^{i−1} β_{ik} d^{(k)}
β_{ij} = − (u_i^T A d^{(j)}) / (d^{(j)T} A d^{(j)})   for i > j,
due to d^{(i)T} A d^{(j)} = 0.
Conjugate Gradient Method
Overall Algorithm

Initialization:
  i = 0
  r = b − A x
  d = r
  δ_new = r^T r
  δ_0 = δ_new
  ε = 1.0e−6

While i < i_max and δ_new > ε² δ_0:
  q = A d
  α = δ_new / (d^T q)
  x = x + α d
  if i is divisible by 50:
    r = b − A x   (recompute the exact residual to remove accumulated floating-point error)
  else:
    r = r − α q
  δ_old = δ_new
  δ_new = r^T r
  β = δ_new / δ_old
  d = r + β d
  i = i + 1
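A direct C transcription of this algorithm might look as follows; it is a minimal sketch (not from the slides) assuming a user-supplied matvec() routine, with i_max and eps as illustrative parameters.

```c
#include <stdlib.h>

/* Conjugate gradient following the algorithm above. matvec() computes
 * y = A x for the (sparse) system matrix. Returns iterations performed. */
static int conjugate_gradient(int n,
                              void (*matvec)(int, const double *, double *),
                              const double *b, double *x,
                              int i_max, double eps)
{
    double *r = malloc(n * sizeof *r);
    double *d = malloc(n * sizeof *d);
    double *q = malloc(n * sizeof *q);
    int i = 0;

    matvec(n, x, r);                                   /* r = b - A x */
    for (int j = 0; j < n; ++j) { r[j] = b[j] - r[j]; d[j] = r[j]; }

    double delta_new = 0.0;
    for (int j = 0; j < n; ++j) delta_new += r[j] * r[j];
    const double delta0 = delta_new;

    while (i < i_max && delta_new > eps * eps * delta0) {
        matvec(n, d, q);                               /* q = A d */
        double dq = 0.0;
        for (int j = 0; j < n; ++j) dq += d[j] * q[j];
        double alpha = delta_new / dq;                 /* alpha = delta_new / d^T q */

        for (int j = 0; j < n; ++j) x[j] += alpha * d[j];
        if (i % 50 == 0 && i > 0) {                    /* periodic exact residual */
            matvec(n, x, r);
            for (int j = 0; j < n; ++j) r[j] = b[j] - r[j];
        } else {
            for (int j = 0; j < n; ++j) r[j] -= alpha * q[j];
        }

        double delta_old = delta_new;
        delta_new = 0.0;
        for (int j = 0; j < n; ++j) delta_new += r[j] * r[j];
        double beta = delta_new / delta_old;
        for (int j = 0; j < n; ++j) d[j] = r[j] + beta * d[j];
        ++i;
    }
    free(r); free(d); free(q);
    return i;
}
```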
Preconditioner Again
M^{-1} A x = M^{-1} b   with preconditioner M
Jacobi:       M_JA = D
Gauss-Seidel: M_GS = D − E
SOR:          M_SOR = (1/ω)(D − ωE)
SSOR:         M_SSOR = (1/(ω(2 − ω))) (D − ωE) D^{-1} (D − ωF)
Incomplete LU decomposition: A = LU − R   (R: residual error)
Incomplete Cholesky decomposition: A = L L^T − R
If A is SPD (symmetric positive definite), the two decompositions coincide.
To keep the system sparse, use an incomplete factorization.
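For concreteness, the zero-fill incomplete Cholesky factorization IC(0) runs the usual Cholesky recurrence but keeps an entry only where A itself is nonzero; this standard recurrence (not spelled out on the slides) reads:

```latex
% IC(0): apply the Cholesky recurrence only on the sparsity pattern
% S = \{(i,j) : a_{ij} \neq 0\}; entries outside S are simply dropped.
l_{jj} = \sqrt{\,a_{jj} - \sum_{k<j} l_{jk}^2\,}, \qquad
l_{ij} = \frac{1}{l_{jj}}\Bigl(a_{ij} - \sum_{k<j} l_{ik}\,l_{jk}\Bigr)
\quad \text{for } i > j,\ (i,j) \in S .
```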
Implementation
Implementation Issues
• For the 3D case, the matrix A would be huge: for a 128 × 128 × 128 grid, a dense A has (128 × 128 × 128) × (128 × 128 × 128) entries, about 32 TB in double precision (for 2D it takes only 2 GB).
• However, for the Poisson equation A is almost all zeros ⇒ Sparse Matrix!
How to represent a sparse matrix?
• Simplest thing: store each nonzero value together with its row and column index (Coordinate Format, COO).
• Too much index duplication!
Sparse Matrix Format
Compressed Sparse Row (CSR)
• Store only the non-zero values
• Needs three or four arrays (see the sketch below)
• Not easy to construct algorithms such as the ILU or IC preconditioners on top of it
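As an illustration (not from the slides), here is the CSR representation of a small tridiagonal matrix and the matrix-vector product on it; the array names are illustrative.

```c
#include <stdio.h>

/* CSR sketch: val holds the nonzeros row by row, col_idx their column
 * indices, and row_ptr[i]..row_ptr[i+1] delimits row i. Example matrix:
 *   [ 4 -1  0 ]
 *   [-1  4 -1 ]
 *   [ 0 -1  4 ]                                                        */
static const double val[]     = { 4, -1, -1, 4, -1, -1, 4 };
static const int    col_idx[] = { 0, 1, 0, 1, 2, 1, 2 };
static const int    row_ptr[] = { 0, 2, 5, 7 };

/* y = A x for a CSR matrix with n rows */
static void csr_matvec(int n, const double *x, double *y)
{
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

int main(void)
{
    double x[3] = { 1, 1, 1 }, y[3];
    csr_matvec(3, x, y);
    printf("%g %g %g\n", y[0], y[1], y[2]);  /* prints 3 2 3 */
    return 0;
}
```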
Use MKL (Intel Math Kernel Library)
MKL?
• A library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math. The routines in MKL are hand-optimized specifically for Intel processors.
• For my problem, I usually use BLAS and the fast Fourier transforms (for the Poisson equation solver with Neumann, periodic, and Dirichlet BCs).
BLAS?
• A specified set of low-level subroutines that perform common linear algebra operations; widely used, even in MATLAB!
• Usually used for vector or matrix multiplications and dot-product-like operations, as in the sketch below
• Level 1: vector – vector operations
• Level 2: matrix – vector operations
• Level 3: matrix – matrix operations
• Parallelized internally by Intel; just turn on the option
• Reference manual: https://software.intel.com/en-us/mkl_11.1_ref
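For instance, the dot products and vector updates in the CG loop map directly onto Level 1 BLAS; a minimal sketch (not from the slides) using the standard CBLAS interface that MKL exposes via mkl.h, with made-up values:

```c
#include <stdio.h>
#include <mkl.h>   /* MKL's CBLAS interface */

int main(void)
{
    double r[3] = { 1.0, 2.0, 2.0 };
    double d[3] = { 1.0, 0.0, 0.0 };

    /* Level 1 BLAS: delta = r^T r, as used for delta_new in the CG loop */
    double delta = cblas_ddot(3, r, 1, r, 1);

    /* Level 1 BLAS: d = r + beta*d, split as d := beta*d then d := d + r */
    double beta = 0.5;
    cblas_dscal(3, beta, d, 1);
    cblas_daxpy(3, 1.0, r, 1, d, 1);

    printf("delta = %g, d = (%g, %g, %g)\n", delta, d[0], d[1], d[2]);
    return 0;
}
```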
How to use the Library
For MKL:
• For compiling (when building the .c files in your makefile):
  -i8 -openmp -I$(MKLROOT)/include
• For linking (when creating the executable with the -o option):
  -L$(MKLROOT)/lib/intel64 -lmkl_core -lmkl_intel_thread -lpthread -lm
• https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
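Put together, a minimal makefile using these flags might look like the sketch below; the compiler, target, and file names are illustrative assumptions, and the exact library list should come from the link-line advisor above.

```makefile
# Minimal sketch: build a CG solver against MKL with the flags above.
# MKLROOT must be set in the environment; recipe lines are tab-indented.
CC      = icc
CFLAGS  = -i8 -openmp -I$(MKLROOT)/include
LDFLAGS = -L$(MKLROOT)/lib/intel64 -lmkl_core -lmkl_intel_thread -lpthread -lm

cg_solver: cg_solver.o
	$(CC) -o $@ $^ $(LDFLAGS)

cg_solver.o: cg_solver.c
	$(CC) $(CFLAGS) -c $<
```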
Library Linking Process
• Compile
  • The -I option indicates where the header files (.h) are, specifying the include path
• Link
  • The -L option indicates where the library files (.lib, .dll, .a, .so) are, specifying the linking path
  • The -l option indicates the library name
Reference
• Shewchuk, Jonathan Richard. "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain." (1994).
• Chandan, Deepak. "Using Sparse Matrix and Solver Routines from Intel MKL." SciNet User Group Meeting (2013).
• Saad, Yousef. Iterative Methods for Sparse Linear Systems. SIAM, 2003.
• Akhunov, R. R., et al. "Optimization of the ILU(0) factorization algorithm with the use of compressed sparse row format." Zapiski Nauchnykh Seminarov POMI 405 (2012): 40-53.