ch. 2 algorithms for systems of linear equations
EE692. Parallel and Distribution Computation | Prof. Song Chong
Network Systems Lab.
Korea Advanced Institute of Science and Technology
Overview
Consider the system of linear equations
$$Ax = b$$
where A is an n x n real matrix and b is a vector in $\mathbb{R}^n$.
Direct methods find an "exact" solution (Secs. 2.1-2.3) with a finite number of operations, typically of the order of $n^3$, e.g., Gaussian elimination.
Overview (Cont’d)
Iterative methods:
- Do not obtain an exact solution of Ax = b in finite time, but converge to a solution asymptotically
- Often yield a solution, within acceptable precision, after a relatively small number of iterations
- Usually preferred when n is very large
- May have smaller storage requirements than direct methods
Performance measures:
- Direct methods: complexity
- Iterative methods: speed of convergence, e.g., geometric convergence $\|x(t) - x^*\| \le c\rho^t$ for some c > 0 and 0 < ρ < 1
Classical Iterative Methods
Assume that A is invertible so that Ax = b has a unique solution. Write the i-th equation as
$$\sum_{j=1}^{n} a_{ij} x_j = b_i, \quad \forall i$$
Assuming $a_{ii} \ne 0$ and solving for $x_i$,
$$x_i = -\frac{1}{a_{ii}} \Big[ \sum_{j \ne i} a_{ij} x_j - b_i \Big], \quad \forall i \qquad (1)$$
If the $x_j$, j ≠ i, (or estimates of them) are available, one can obtain $x_i$ (or an estimate of $x_i$) from Eq. (1).
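Jacobi Algorithm
Starting with some initial vector $x(0) \in \mathbb{R}^n$, evaluate x(t), t = 1, 2, ..., using the iteration
$$x_i(t+1) = -\frac{1}{a_{ii}} \Big[ \sum_{j \ne i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
If the sequence {x(t)} converges to a limit x*, then x* satisfies Eq. (1) for each i.
Condition for convergence? E.g., consider the system
$$2x_1 - x_2 = 0 \quad \text{(eq. 1)}, \qquad -x_1 + 2x_2 = 0 \quad \text{(eq. 2)}$$
Case 1: Convergence case
$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Case 2: Divergence case
$$\begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$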
Jacobi Algorithm (Cont’d)
[Figure: Jacobi iterates x(0), x(1), x(2) in the (x1, x2) plane. Convergence case: first equation 2x1 - x2 = 0, second equation -x1 + 2x2 = 0. Divergence case: first equation -x1 + 2x2 = 0, second equation 2x1 - x2 = 0.]
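The two cases can be checked numerically. The following is a minimal sketch in plain Python; the helper name `jacobi`, the nested-list matrix storage, and the starting point x(0) = (1, 1) are assumptions made for illustration, not part of the slides.

```python
def jacobi(A, b, x0, sweeps):
    """Run `sweeps` Jacobi iterations on Ax = b, starting from x0."""
    n = len(A)
    x = list(x0)
    for _ in range(sweeps):
        # Every component is computed from the previous iterate only.
        x = [-(sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i]) / A[i][i]
             for i in range(n)]
    return x


# Case 1: 2*x1 - x2 = 0 is solved for x1 and -x1 + 2*x2 = 0 for x2.
converging = jacobi([[2.0, -1.0], [-1.0, 2.0]], [0.0, 0.0], [1.0, 1.0], 10)
# Case 2: the same equations in swapped order, so each step doubles the error.
diverging = jacobi([[-1.0, 2.0], [2.0, -1.0]], [0.0, 0.0], [1.0, 1.0], 10)
print(converging, diverging)
```

In Case 1 each iteration halves the distance to the solution x* = (0, 0), while in Case 2 it doubles that distance, even though both systems have the same solution.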
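Gauss-Seidel Algorithm
Starting with $x(0) \in \mathbb{R}^n$, evaluate x(t) using the iteration
$$x_i(t+1) = -\frac{1}{a_{ii}} \Big[ \sum_{j<i} a_{ij} x_j(t+1) + \sum_{j>i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
Any other order of updating is possible.
Different orders of updating may produce substantially different results for the same problem.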
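A minimal sketch of a Gauss-Seidel sweep with a selectable update order is given below. The function name `gauss_seidel`, the `order` argument, and the 2 x 2 test system are illustrative assumptions.

```python
def gauss_seidel(A, b, x0, sweeps, order=None):
    """Run `sweeps` Gauss-Seidel sweeps on Ax = b, updating in `order`."""
    n = len(A)
    order = list(range(n)) if order is None else order
    x = list(x0)
    for _ in range(sweeps):
        for i in order:
            # x already holds the new values of the components updated
            # earlier in this sweep, so they are used immediately.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = -(s - b[i]) / A[i][i]
    return x


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]  # the exact solution is x = (1, 1)
forward = gauss_seidel(A, b, [0.0, 0.0], 1)
backward = gauss_seidel(A, b, [0.0, 0.0], 1, order=[1, 0])
print(forward, backward)
```

After one sweep the two orderings give different iterates, which illustrates the remark about the order of updating; for this small system both orderings still approach (1, 1) as more sweeps are run.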
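Relaxation of Iterative Methods
Relaxation of Eq. (1) using a relaxation parameter γ $(0 < \gamma \le 1)$:
$$x_i = (1-\gamma) x_i - \frac{\gamma}{a_{ii}} \Big[ \sum_{j \ne i} a_{ij} x_j - b_i \Big], \quad \forall i$$
Jacobi overrelaxation (JOR): convex combination of $x_i(t)$ and the Jacobi iteration
$$x_i(t+1) = (1-\gamma) x_i(t) - \frac{\gamma}{a_{ii}} \Big[ \sum_{j \ne i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
Gauss-Seidel overrelaxation (SOR, successive overrelaxation): convex combination of $x_i(t)$ and the Gauss-Seidel iteration
$$x_i(t+1) = (1-\gamma) x_i(t) - \frac{\gamma}{a_{ii}} \Big[ \sum_{j<i} a_{ij} x_j(t+1) + \sum_{j>i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
JOR and SOR are widely used because they often converge faster if γ is suitably chosen.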
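The JOR and SOR updates differ only in which values enter the sum. A minimal sketch follows, writing the relaxation parameter as `gamma`; the function names `jor_sweep` and `sor_sweep` and the test system are assumptions for illustration.

```python
def jor_sweep(A, b, x, gamma):
    """One JOR step: blend x(t) with the Jacobi update using weight gamma."""
    n = len(A)
    jac = [-(sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i]) / A[i][i]
           for i in range(n)]
    return [(1 - gamma) * x[i] + gamma * jac[i] for i in range(n)]


def sor_sweep(A, b, x, gamma):
    """One SOR step: blend x_i(t) with the Gauss-Seidel update, in index order."""
    n = len(A)
    x = list(x)
    for i in range(n):
        # Components j < i in x have already been overwritten with new values.
        gs = -(sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i]) / A[i][i]
        x[i] = (1 - gamma) * x[i] + gamma * gs
    return x


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]
print(jor_sweep(A, b, [0.0, 0.0], 0.5), sor_sweep(A, b, [0.0, 0.0], 0.5))
```

With `gamma = 1` the two functions reduce to one Jacobi step and one Gauss-Seidel sweep, respectively.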
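Richardson's method
The following iteration is obtained by rewriting Ax = b as $x = x - \gamma(Ax - b)$:
$$x_i(t+1) = x_i(t) - \gamma \Big[ \sum_{j=1}^{n} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
Richardson-Gauss-Seidel (RGS) method:
$$x_i(t+1) = x_i(t) - \gamma \Big[ \sum_{j<i} a_{ij} x_j(t+1) + \sum_{j \ge i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
A more general form uses an invertible matrix B: rewriting Ax = b as $x = x - B(Ax - b)$ gives
$$x(t+1) = x(t) - B[Ax(t) - b]$$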
Parallel Implementation
- Jacobi, JOR and Richardson's algorithms are straightforward to implement in parallel
- Gauss-Seidel, SOR and RGS algorithms are in general not well suited for parallel implementation because they are inherently sequential
Typical termination criterion used in practice:
$$\|Ax(t) - b\| \le \epsilon$$
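As an illustration of a residual-based termination test, here is a minimal sketch of Richardson's iteration that stops once the residual Ax(t) - b is small. The function name `richardson`, the use of the max norm, the tolerance `eps`, and the iteration cap are all assumptions; the slides do not specify them.

```python
def richardson(A, b, x0, gamma, eps, max_iters=10_000):
    """Iterate x(t+1) = x(t) - gamma*(A x(t) - b) until ||A x(t) - b|| <= eps."""
    n = len(A)
    x = list(x0)
    for t in range(max_iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        if max(abs(r) for r in residual) <= eps:  # max norm, an assumed choice
            return x, t
        # All components use the same old iterate, so they can be updated in parallel.
        x = [x[i] - gamma * residual[i] for i in range(n)]
    return x, max_iters


x, iters = richardson([[2.0, -1.0], [-1.0, 2.0]], [1.0, 1.0], [0.0, 0.0],
                      gamma=0.5, eps=1e-8)
print(x, iters)
```

The residual is a natural quantity to test because it can be computed without knowing the solution x*.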
Applications: Poisson’s equation
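Find a function $f: [0,1]^2 \to \mathbb{R}$ that satisfies
$$\frac{\partial^2 f}{\partial x^2}(x,y) + \frac{\partial^2 f}{\partial y^2}(x,y) = g(x,y), \quad (x,y) \in [0,1]^2 \qquad (1)$$
where $g: [0,1]^2 \to \mathbb{R}$ is a known function and f has prescribed values on the boundary of the unit square.
Let
$$f_{i,j} = f\Big(\frac{i}{N}, \frac{j}{N}\Big), \quad g_{i,j} = g\Big(\frac{i}{N}, \frac{j}{N}\Big), \quad 0 \le i,j \le N$$
[Figure: (N+1) x (N+1) grid on the unit square with corners (0,0), (N,0), (0,N), (N,N) and grid spacing Δ = 1/N.]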
Applications: Poisson’s equation (cont’d)
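Assume that f is sufficiently smooth and that Δ is a small scalar. Then, by Prop. A.33 in Appendix A,
$$\frac{\partial^2 f}{\partial x^2}(x,y) \approx \frac{1}{\Delta^2} \big[ f(x+\Delta,y) - 2f(x,y) + f(x-\Delta,y) \big] \qquad (2)$$
$$\frac{\partial^2 f}{\partial y^2}(x,y) \approx \frac{1}{\Delta^2} \big[ f(x,y+\Delta) - 2f(x,y) + f(x,y-\Delta) \big] \qquad (3)$$
By plugging (2) and (3) into (1), with Δ = 1/N,
$$f_{i,j} = \frac{1}{4} \big( f_{i+1,j} + f_{i-1,j} + f_{i,j+1} + f_{i,j-1} \big) - \frac{1}{4N^2} g_{i,j}, \quad 0 < i,j < N$$
This is a system of $(N-1)^2$ linear equations in $(N-1)^2$ unknowns, i.e., it can be represented in the form Ax = b.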
Applications: Poisson’s equation (cont’d)
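JOR algorithm
$$f_{i,j}(t+1) = (1-\gamma) f_{i,j}(t) + \frac{\gamma}{4} \big[ f_{i+1,j}(t) + f_{i-1,j}(t) + f_{i,j+1}(t) + f_{i,j-1}(t) \big] - \frac{\gamma}{4N^2} g_{i,j}, \quad 0 < i,j < N$$
where $f_{i,j}(t) = f_{i,j}$ are known whenever i or j is equal to 0 or N.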
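The grid iteration can be sketched in a few lines of plain Python. The function name `poisson_jor`, the zero initial guess in the interior, the number of sweeps, and the test problem f(x, y) = x^2 + y^2 (for which g = 4) are assumptions chosen for illustration.

```python
def poisson_jor(N, g, boundary, gamma=1.0, sweeps=500):
    """JOR for the 5-point discretization of f_xx + f_yy = g on an (N+1)x(N+1) grid."""
    # f[i][j] approximates f(i/N, j/N); the interior starts from zero.
    f = [[boundary(i / N, j / N) if i in (0, N) or j in (0, N) else 0.0
          for j in range(N + 1)] for i in range(N + 1)]
    for _ in range(sweeps):
        new = [row[:] for row in f]  # boundary values are carried over unchanged
        for i in range(1, N):
            for j in range(1, N):
                neighbors = f[i + 1][j] + f[i - 1][j] + f[i][j + 1] + f[i][j - 1]
                jacobi = neighbors / 4 - g(i / N, j / N) / (4 * N * N)
                new[i][j] = (1 - gamma) * f[i][j] + gamma * jacobi
        f = new
    return f


# Assumed test problem: f(x, y) = x*x + y*y, so g = 4 everywhere.
f = poisson_jor(8, lambda x, y: 4.0, lambda x, y: x * x + y * y)
print(f[4][4])  # to be compared with the exact value 0.5 at (0.5, 0.5)
```

Each interior point reads only its four neighbors from the previous sweep, so all interior updates of one sweep are independent of each other.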
Applications: Power Control of CDMA Uplink
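Assume K users in a cell. The SINR per chip, denoted by $\mathrm{SINR}_c$, of user i is
$$\mathrm{SINR}_c^i = \frac{\varepsilon_c^i}{\sum_{j=1, j \ne i}^{K} \varepsilon_c^j + N_0}$$
where $\varepsilon_c^i$ is the received energy per chip for user i and $N_0$ is noise.
Since each bit is encoded onto a pseudonoise sequence of length $G_i$ chips at the transmitter, the received energy per bit for user i is $\varepsilon_b^i = G_i \varepsilon_c^i$.
[Figure: users 1, 2, 3 with attenuations g1, g2, g3 to the base station.]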
Applications: Power Control of CDMA Uplink (cont’d)
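The SINR of user i, or equivalently the ratio of the received energy per bit to the interference and noise per chip (commonly called $\varepsilon_b^i / I_0$ in the CDMA literature), is
$$\mathrm{SINR}_i = \frac{\varepsilon_b^i}{I_0} = \frac{G_i \varepsilon_c^i}{\sum_{j=1, j \ne i}^{K} \varepsilon_c^j + N_0} = \frac{G_i p_i g_i / W}{\sum_{j=1, j \ne i}^{K} p_j g_j / W + N_0}$$
where $p_i$ (joules/sec) is the transmit power of user i and $g_i$ is the attenuation of user i's signal to the base station. Equivalently,
$$\mathrm{SINR}_i = \frac{G_i p_i g_i}{\sum_{j=1, j \ne i}^{K} p_j g_j + N_0 W}, \quad i = 1, \dots, K$$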
Applications: Power Control of CDMA Uplink (cont’d)
To achieve equally reliable communication,
$$\mathrm{SINR}_i \ge \beta, \quad i = 1, \dots, K$$
where β is a certain threshold.
The data rate of user i, $R_i$ (bits/sec), is
$$R_i = \frac{W}{G_i}$$
and $G_i$ is called the processing gain of user i.
Applications: Power Control of CDMA Uplink (cont’d)
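The power control problem of the CDMA uplink is to find the minimal nonnegative transmit power vector $p = [p_1, p_2, \dots, p_K]$ satisfying
$$\frac{G_i p_i g_i}{\sum_{j=1, j \ne i}^{K} p_j g_j + N_0 W} \ge \beta, \quad i = 1, \dots, K$$
That is, find a nonnegative p satisfying
$$\frac{G_i p_i g_i}{\sum_{j=1, j \ne i}^{K} p_j g_j + N_0 W} = \beta, \quad i = 1, \dots, K$$
or, equivalently,
$$p_i - \frac{\beta}{G_i g_i} \sum_{j=1, j \ne i}^{K} p_j g_j = \frac{\beta N_0 W}{G_i g_i}, \quad i = 1, \dots, K$$
This is a system of K linear equations in K unknowns, i.e., it can be represented in the form Ax = b.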
Applications: Power Control of CDMA Uplink (cont’d)
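JOR algorithm
For each user i,
$$p_i(t+1) = (1-\gamma) p_i(t) + \gamma \Big[ \frac{\beta}{G_i g_i} \sum_{j=1, j \ne i}^{K} p_j(t) g_j + \frac{\beta N_0 W}{G_i g_i} \Big]$$
where β, $G_i$, $g_i$, $N_0$ and W are given.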
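A minimal sketch of this power control iteration is shown below, writing the relaxation parameter as `gamma`. The function names `power_control_jor` and `sinr`, the zero starting powers, and all numerical values are made-up assumptions, chosen only so that the target β is feasible.

```python
def power_control_jor(G, g, beta, N0, W, gamma=1.0, iters=200):
    """JOR iteration for the uplink powers p_1..p_K, starting from zero power."""
    K = len(G)
    p = [0.0] * K
    for _ in range(iters):
        # Each user needs only the total received power and its own parameters.
        total = sum(p[j] * g[j] for j in range(K))
        p = [(1 - gamma) * p[i]
             + gamma * beta / (G[i] * g[i]) * (total - p[i] * g[i] + N0 * W)
             for i in range(K)]
    return p


def sinr(p, G, g, N0, W, i):
    """SINR of user i for the power vector p."""
    interference = sum(p[j] * g[j] for j in range(len(p)) if j != i)
    return G[i] * p[i] * g[i] / (interference + N0 * W)


# Made-up example values, chosen only so that the target is feasible.
G, g = [64.0, 64.0, 32.0], [1e-3, 5e-4, 2e-3]
p = power_control_jor(G, g, beta=4.0, N0=1e-9, W=1e6)
print([sinr(p, G, g, 1e-9, 1e6, i) for i in range(3)])
```

At the fixed point every user's SINR equals β, which is the linear system stated for the power control problem.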
Parallelization of Iterative Methods Using Dependency Graph
Consider a Jacobi-type iteration in the general form
$$x_i(t+1) = f_i(x_1(t), \dots, x_n(t)), \quad i = 1, \dots, n$$
The communication required for this iteration can be described by means of a directed graph G=(N,A), called the dependency graph.
The set of nodes N is {1,…,n}, corresponding to the components of x. Let (i,j) be an arc of the dependency graph if and only if the function $f_j$ depends on $x_i$.
E.g.,
$$x_1(t+1) = f_1(x_1(t), x_3(t))$$
$$x_2(t+1) = f_2(x_1(t), x_2(t))$$
$$x_3(t+1) = f_3(x_2(t), x_3(t), x_4(t))$$
$$x_4(t+1) = f_4(x_2(t), x_4(t))$$
[Figure: dependency graph of this example on nodes 1, 2, 3, 4.]
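The dependency graph of the four-component example can be written down directly from the argument lists of the functions f_i. In the sketch below, the dictionary names `reads` and `incoming` are illustrative assumptions.

```python
# reads[i] lists the components that f_i depends on (example from the slide).
reads = {1: [1, 3], 2: [1, 2], 3: [2, 3, 4], 4: [2, 4]}

# Arc (i, j) is in the dependency graph iff f_j depends on x_i.
arcs = sorted((i, j) for j, deps in reads.items() for i in deps)

# In a parallel Jacobi step, the processor holding x_j must receive x_i
# from every other node i with an arc (i, j).
incoming = {j: [i for i in deps if i != j] for j, deps in reads.items()}
print(arcs)
print(incoming)
```

The `incoming` lists are the messages each processor needs per iteration if every component is assigned to its own processor.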
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
The dependency over iterations can be described by means of a directed acyclic graph (DAG) where the nodes are of the form (i,t) and the arcs are of the form ((i,t), (j,t+1)).
[Figure: DAG of the Jacobi-type example with node rows (1,0), (2,0), (3,0), (4,0) at t=0; (1,1), (2,1), (3,1), (4,1) at t=1; (1,2), (2,2), (3,2), (4,2) at t=2.]
The depth of a single iteration (sweep) is 1.
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
Consider a Gauss-Seidel type iteration in the general form
$$x_i(t+1) = f_i(x_1(t+1), \dots, x_{i-1}(t+1), x_i(t), \dots, x_n(t)), \quad i = 1, \dots, n$$
- Often preferable since it incorporates the newest available information, thereby sometimes converging faster than the Jacobi type
- May be completely non-parallelizable since it is sequential in nature
- When the dependency graph is sparse, it is possible that certain component updates can be parallelized
- The degree of parallelism may depend on the update ordering
E.g., ordering 1, 2, 3, 4:
[Figure: DAG from nodes (1,0), (2,0), (3,0), (4,0) to nodes (1,1), (2,1), (3,1), (4,1) for this ordering.]
The depth of a single iteration (sweep) is 3.
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
E.g., ordering 1, 3, 4, 2:
[Figure: DAG from nodes (1,0), (2,0), (3,0), (4,0) to nodes (1,1), (3,1), (4,1) and then (2,1) for this ordering.]
The depth of a single iteration is 2.
Finding an optimal update ordering that maximizes parallelism in the Gauss-Seidel algorithm is equivalent to an optimal coloring problem.
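The depth of a sweep for a given ordering can be computed mechanically from the dependency lists. The following is a minimal sketch for the four-component example; the function name `sweep_depth` and the dictionary `reads` are illustrative assumptions.

```python
def sweep_depth(reads, order):
    """Number of parallel steps of one Gauss-Seidel sweep under `order`."""
    level = {}
    for i in order:
        # x_i must wait for the components it reads that were updated earlier
        # in this sweep; all other inputs are values from the previous sweep.
        waits = [level[j] for j in reads[i] if j in level and j != i]
        level[i] = 1 + max(waits, default=0)
    return max(level.values())


reads = {1: [1, 3], 2: [1, 2], 3: [2, 3, 4], 4: [2, 4]}
print(sweep_depth(reads, [1, 2, 3, 4]), sweep_depth(reads, [1, 3, 4, 2]))
```

For the ordering 1, 2, 3, 4 this gives depth 3, and for the ordering 1, 3, 4, 2 it gives depth 2, in agreement with the two examples.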
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
Given the dependency graph G=(N,A), a coloring of G using K colors is defined as a mapping h: N -> {1,…,K} that assigns a color k=h(i) to each node i in N.
Prop. 2.5 There exists an ordering such that a sweep of the Gauss-Seidel algorithm can be performed in K parallel steps if and only if there exists a coloring of the dependency graph that uses K colors and has the property that there exists no positive cycle with all nodes on the cycle having the same color.
Prop. 2.6 Suppose that $(i,j) \in A$ if and only if $(j,i) \in A$. Then, there exists an ordering such that a sweep of the Gauss-Seidel algorithm can be performed in K parallel steps if and only if there exists a coloring of the dependency graph that uses at most K colors and such that adjacent nodes have different colors.
Unfortunately, optimal coloring problems are intractable, i.e., there is no known efficient algorithm for solving them.
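Although optimal coloring is intractable in general, a simple greedy heuristic often gives a usable coloring. The sketch below is such a heuristic, not an optimal method; the function name `greedy_coloring` and the 3 x 3 grid example (a symmetric dependency graph, as in Prop. 2.6) are assumptions for illustration.

```python
def greedy_coloring(neighbors):
    """Give each node the smallest color not used by an already colored neighbor."""
    color = {}
    for i in sorted(neighbors):
        used = {color[j] for j in neighbors[i] if j in color}
        color[i] = next(k for k in range(1, len(neighbors) + 2) if k not in used)
    return color


# Assumed example: a 3 x 3 grid of unknowns coupled by the 5-point stencil,
# so (i, j) is an arc iff (j, i) is an arc.
nodes = [(r, c) for r in range(3) for c in range(3)]
neighbors = {(r, c): [(r + dr, c + dc)
                      for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if (r + dr, c + dc) in nodes]
             for r, c in nodes}
color = greedy_coloring(neighbors)
# Updating all nodes of color 1 first and then all nodes of color 2
# gives a Gauss-Seidel sweep with two parallel steps.
print(color)
```

On this grid the heuristic recovers the familiar two-color (red-black) pattern, so adjacent nodes never share a color.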
Convergence Analysis of Classical Iterative Methods
Prop. 4.1 If the sequence {x(t)} generated by any of the algorithms presented above converges, then it converges to a solution of Ax = b.
Uniform representation of the different algorithms
Let B = A - D, where D is a diagonal matrix whose entries are equal to the corresponding diagonal entries of A.
Assuming that the diagonal entries of A are nonzero, Ax = b can be rewritten as
$$Ax = b \iff (B + D)x = b \iff Dx = -Bx + b \iff x = -D^{-1}Bx + D^{-1}b$$
so the Jacobi algorithm can be written as
$$x(t+1) = -D^{-1}Bx(t) + D^{-1}b$$ <Jacobi>
Similarly, the JOR algorithm can be written as
$$x(t+1) = [(1-\gamma)I - \gamma D^{-1}B]x(t) + \gamma D^{-1}b$$ <JOR>
Uniform representation of the different algorithms (cont’d)
Decompose A = L + D + U, where L is strictly lower triangular, D is diagonal, and U is strictly upper triangular. Then, the Gauss-Seidel algorithm can be written as
$$x(t+1) = -D^{-1}\big(Lx(t+1) + Ux(t) - b\big)$$
$$x(t+1) = -(I + D^{-1}L)^{-1}D^{-1}Ux(t) + (I + D^{-1}L)^{-1}D^{-1}b$$ <Gauss-Seidel>
Uniform representation of the different algorithms (cont’d)
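Similarly, for SOR,
$$x(t+1) = (1-\gamma)x(t) - \gamma D^{-1}\big(Lx(t+1) + Ux(t) - b\big)$$
$$x(t+1) = (I + \gamma D^{-1}L)^{-1}[(1-\gamma)I - \gamma D^{-1}U]x(t) + \gamma(I + \gamma D^{-1}L)^{-1}D^{-1}b$$ <SOR>
Finally, for Richardson's method,
$$x(t+1) = (I - \gamma A)x(t) + \gamma b$$ <Richardson>
The uniform representation is
$$x(t+1) = Mx(t) + Gb$$
where M is called the iteration matrix.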
Uniform representation of the different algorithms (cont’d)
Assume that I-M is invertible (fact: if A is invertible with nonzero diagonal entries, then I-M is invertible for all of the above algorithms; Ex. 6.1). Then, there exists a unique x* satisfying x* = Mx* + Gb.
Let y(t) = x(t) - x*. Then,
$$y(t+1) = My(t)$$
The solution is then
$$y(t) = M^t y(0)$$
for every t. Hence y(t) -> 0 for every y(0) iff $M^t \to 0$ iff all the eigenvalues of M have magnitude smaller than 1, i.e., the spectral radius satisfies $\rho(M) < 1$.
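The spectral radius condition can be checked on the two 2 x 2 Jacobi examples from the beginning of the chapter. The sketch below is restricted to 2 x 2 matrices, where the eigenvalues follow from the characteristic polynomial; the function names `jacobi_matrix` and `spectral_radius_2x2` are assumptions for illustration.

```python
import cmath


def jacobi_matrix(A):
    """Iteration matrix M = -D^{-1} B of the Jacobi algorithm, with B = A - D."""
    n = len(A)
    return [[0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)]
            for i in range(n)]


def spectral_radius_2x2(M):
    """Largest eigenvalue magnitude of a 2 x 2 matrix, from its characteristic polynomial."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return max(abs((trace + root) / 2), abs((trace - root) / 2))


rho_case1 = spectral_radius_2x2(jacobi_matrix([[2.0, -1.0], [-1.0, 2.0]]))
rho_case2 = spectral_radius_2x2(jacobi_matrix([[-1.0, 2.0], [2.0, -1.0]]))
print(rho_case1, rho_case2)
```

The convergence case has spectral radius 0.5 and the divergence case has spectral radius 2, which matches the halving and doubling of the error seen in the Jacobi iterates.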
Uniform representation of the different algorithms (cont’d)
Prop. 6.1 Assume that I-M is invertible, let x* satisfy x* = Mx* + Gb, and let {x(t)} be the sequence generated by the iteration x(t+1) = Mx(t) + Gb. Then,
$$\lim_{t \to \infty} x(t) = x^* \text{ for all choices of } x(0) \quad \text{iff} \quad \rho(M) < 1$$
Note: G and b have nothing to do with convergence.
Proof) to be done!