engineering computation

E. T. S. I. Caminos, Canales y Puertos 1

EngineeringComputation

Lecture 7


System of Equations

Errors in Solutions to Systems of Linear Equations

Objective: Solve [A]{x} = {b}

Problem:Round-off errors may accumulate and even be

exaggerated by the solution procedure. Errors are often exaggerated if the system is ill-conditioned

Possible remedies to minimize this effect:1. Partial or complete pivoting2. Work in double precision 3. Transform the problem into an equivalent system of

linear equations by scaling or equilibrating



Ill-conditioning• A system of equations is singular if det|A| = 0• If a system of equations is nearly singular it is ill-conditioned.

Systems which are ill-conditioned are extremely sensitive to small changes in coefficients of [A] and {b}. These systems are inherently sensitive to round-off errors.

Question:Can we develop a means for detecting these situations?



Ill-conditioning of [A]{x} = {b}:

Consider the graphical interpretation for a 2-equation system:

11 12 1 1

21 22 2 2

a a x b (1) =

a a x b (2)

We can plot the two linear equations on a graph of x1 vs. x2.

x1

x2

b2/a22

b1/a11

b1/a12

a11x1+ a12x2 = b1

a21x1+ a22x2 = b2

b2/a21



Ill-conditioning of [A]{x} = {b}:

Consider the graphical interpretation for a 2-equation system:

11 12 1 1

21 22 2 2

a a x b (1) =

a a x b (2)

We can plot the two linear equations on a graph of x1 vs. x2.

Well-conditioned

x1

x2

Uncertaintyin x2

Ill-conditioned

x1

x2Uncertainty

in x2



Ways to detect ill-conditioning:

1. Calculate {x}, make small change in [A] or {b} and determine the change in the solution {x}.

2. After forward elimination, examine diagonal of upper triangular matrix. If aii << ajj, i.e. there is a relatively small value on diagonal, then this may indicate ill-conditioning.

3. Compare {x}single with {x}double

4. Estimate the "condition number" for A.

Substituting the calculated {x} into [A]{x} and checking this against {b} will not always work!!!



Ways to detect ill-conditioning:

• If det|A| = 0 the matrix is singular ==> the determinant may be an indicator of conditioning

• If det|A| is near zero is the matrix ill-conditioned?

• Consider:

After scaling:

==> det|A| will provide an estimate of conditioning if it is normalized by the "magnitude" of the matrix.

60.01 0 0

det 0 0.01 0 100 0 0.01

1 0 0det 0 1 0 1

0 0 1



Norms and the Condition Number

We need a quantitative measure of ill-conditioning.

This measure will then directly reflect the possible magnitude of round off effects.

To do this we need to understand norms:

Norm: Scalar measure of the magnitude of a matrix or vector ("how big" a vector is).

Not to be confused with the dimension of a matrix.

Norms


Here are some vector norms for n x 1 vectors {x} with typical elements xi.

Each is in the general form of a p norm defined by the general relationship:

1. Sum of the magnitudes:

2. Magnitude of largest element:(infinity norm)

3. Length or Euclidean norm:

1/ pnp

ipi 1

x x

n

i1i 1

x x

1/ 2n2

i2i 1

x x

ii

x max x

Vector Norms: Scalar measure of the magnitude of a vector

Vector Norms


Vector Norms

Required Properties of vector norm:

1. ||x|| 0 and ||x|| = 0 if and only if [x]=0

2 ||kx|| = k ||x|| where k is any positive scalar

3. ||x+y|| ||x|| + ||y|| Triangle Inequality

For the Euclidean vector norm we also have

4. ||x•y|| ||x|| ||y||

because the dot product or inner product property satisfies:

||xy|| = ||x||•||y|| |cos()| ||x|| • ||y||.

Norms


1. Largest column sum:

(column sum norm)

2. Largest row sum: (row sum norm)

(infinity norm)

p px 1A max A x

n

1 ijj

i 1

A max a

n

iji

j 1

A max a

Matrix Norms: Scalar measure of the magnitude of a matrix.

Matrix norms corresponding to vector norms above are defined by the general relationship:

Matrix Norms


3. Spectral norm: ||A|| 2 = (µmax)1/2

where µmax is the largest eigenvalue of [A]T[A]

If [A] is symmetric, (µmax)1/2 = max , is the largest eigenvalue of [A].

(Note: this is not the same as the Euclidean or Frobenius norm, seldom used:

2

1 1

n n

ijei j

A A

Matrix norms


Matrix NormsFor matrix norms to be useful we require that

0. || Ax || || A || ||x ||General properties of any matrix norm:

1. || A || 0 and || A || = 0 iff [A] = 0

2. || k A || = k || A || where k is any positive scalar

3. || A + B || || A || + || B || "Triangle Inequality"

4. || A B || || A || || B ||

Why are norms important? Norms permit us to express the accuracy of the solution {x} in

terms of || x || Norms allow us to bound the magnitude of the product [ A ] {x}

and the associated errors.

Matrix norms


Forward and backward error analysis can estimate the effect of truncation and roundoff errors on the precision of a result. The two approaches are alternative views:

1. Forward (a priori) error analysis tries to trace the accumulation of error through each process of the algorithm, comparing the calculated and exact values at every stage.

2. Backward (a posteriori) error analysis views the final solution as the exact solution to a perturbed problem. One can consider how different the perturbed problem is from the original problem.

Here we use the condition number of a matrix [A] to specify the amount by which relative errors in [A] and/or {b} due to input, truncation, and rounding can be amplified by the linear system in the computation of {x}.

Error Analysis


Backward Error Analysis of [A]{x} = {b} for errors in {b}

Suppose the coefficients {b} are not precisely represented. What might

be the effect on the calculated value for {x + dx}?

Lemma: [A]{x} = {b} yields ||A|| ||x|| ||b|| or

Now an error in {b} yields a corresponding error in {x}:

[A ]{x + dx} = {b + db}

[A]{x} + [A]{ dx} = {b} + {db}

Subtracting [A]{x} = {b} yields:

[A]{dx} = {db} ––> {dx} = [A]-1{db}

b A1x

A x b

Error Analysis


Define the condition number ask = cond [A] ||A-1|| ||A|| 1

If k 1 or k is small, the system is well-conditioned

If k >> 1, system is ill conditioned.

1 = || I || = || A-1A || || A-1 || || A || = k = Cond(A)

1x A b

1 A

x b

1x b bA A

x b b

Taking norms we have:

And using the lemma:

we then have :

Backward Error Analysis of [A]{x} = {b} for errors in {b}

Error Analysis


Backward Error Analysis of [A]{x} = {b} for errors in [A]If the coefficients in [A] are not precisely represented, what might be effect on the calculated value of {x+ dx}?

[A + dA ]{x + dx} = {b}

[A]{x} + [A]{ dx} + [dA]{x+dx} = {b}

Subtracting [A]{x} = {b} yields:

[A]{ dx} = – [dA]{x+dx} or

{dx} = – [A]-1 [dA] {x+dx}

Taking norms and multiplying by || A || / || A || yields :

1x A AA A

x x A A

Error Analysis


Estimate of Loss of Significance:Consider the possible impact of errors [dA] on the precision of {x}.

pA AIf ~ 10 , then

A A

x

x x

s s p~ 10 , then 10 10

x

x x

Error Analysis

implies that if

Or, taking log of both sides: s > p - log10()

• log10() is the loss in decimal precision; i.e., we start with p decimal figures and end-up with s decimal figures.

• It is not always necessary to find [A]-1 to estimate k = cond[A]. Instead, use an estimate based upon iteration of inverse matrix using LU decomposition.


Impetus for Iterative Schemes:1. May be more rapid if coefficient matrix is "sparse"

2. May be more economical with respect to memory

3. May also be applied to solve nonlinear systems

Disadvantages:1. May not converge or may converge slowly 2. Not appropriate for all systems

Error bounds apply to solutions obtained by direct and iterative methods because they address the specification of [dA] and {db}.

Iterative Solution Methods


Basic Mechanics:Starting with:

a11x1 + a12x2 + a13x3 + ... + a1nxn = b1

a21x1 + a22x2 + a23x3 + ... + a2nxn = b2

a31x1 + a32x2 + a33x3 + ... + a3nxn = b3

: :an1x1 + an2x2 + an3x3 + ... + annxn = bn

Solve each equation for one variable:x1 = [b1 – (a12x2 + a13x3 + ... + a1nxn )} / a11

x2 = [b2 – (a21x1 + a23x3 + ... + a2nxn )} / a22

x3 = [b3 – (a31x1 + a32x2 + ... + a3nxn )} / a33

:

xn = [bn – (an1x2 + an2x3 + ... + an,n-1xn-1 )} / ann



Start with initial estimate of {x}0.

Substitute into the right-hand side of all the equations.

Generate new approximation {x}1.

This is a multivariate one-point iteration: {x}j+1 = {g({x}j)}

Repeat process until the maximum number of iterations is reached or until:

|| xj+1 – xj || d + e || xj+1 ||



To solve [A]{x} = {b}Separate [A] into: [A] = [Lo] + [D] + [Uo] [D] = diagonal (aii)

[Lo] = lower triangular with 0's on diagonal[Uo] = upper triangular with 0's on diagonal

Rewrite system:[A]{x} = ( [Lo] + [D] + [Uo] ){x} = {b}

[D]{x} + ( [Lo] + [Uo] ){x} = {b}Iterate:

[D]{x}j+1 = {b} – ( [Lo]+[Uo] ) {x}j

{x}j+1 = [D]-1{b} – [D]-1 ( [Lo]+[Uo] ) {x}j

Iterations converge if:|| [D]-1 ( [Lo] + [Uo] ) || < 1(sufficient if equations are diagonally dominant)

Convergence


j 1 j j j1 12 13 1n n 111 2 3

j 1 j j j2 21 23 2n n 222 1 3

j j jj 1n n n1 n2 n,n 1 nn1 2 n 1

x b a x a x a x a

x b a x a x a x a

x b a x a x a x a

Iterative Solution Methods – the Jacobi Method


In most cases using the newest values within the right-hand side equations will provide better estimates of the next value. If this is done, then we are using the Gauss-Seidel Method:

( [Lo]+[D] ){x}j+1 = {b} – [Uo] {x}j

or explicitly:

j 1 j 1 j j2 21 23 2n n 222 1 3

j 1 j 1 j 1 j3 31 32 3n n 333 1 2

j 1 j 1 j 1j 1n n n1 n2 n,n 1 nn1 2 n 1

x b a x a x a x a

x b a x a x a x a

x b a x a x a x a

If this is done, then we are using the Gauss-Seidel Method

Iterative Solution Methods -- Gauss-Seidel


If either method is going to converge,Gauss-Seidel will converge

faster than Jacobi.

Why use Jacobi at all?

Because you can separate the n-equations into n independent tasks, it is very well suited computers with parallel processors.

Iterative Solution Methods -- Gauss-Seidel


Rewrite given system: [A]{x} = { [B] + [E] } {x} = {b}

where [B] is diagonal, or triangular so we can solve

[B]{y} = {g} quickly. Thus,

[B] {x}j+1= {b}– [E] {x}j

which is effectively: {x}j+1 = [B]-1 ({b} – [E] {x}j )

True solution {x}c satisfies:{x}c = [B]-1 ({b} – [E] {x}c)

Subtracting yields: {x}c – {x}j+1= – [B]-1 [E] [{x}c – {x}j]

So ||{x}c – {x}j+1 || ||[B]-1 [E]|| ||{x}c – {x}j ||

Iterations converge linearly if || [B]-1 [E] || < 1

=> || ([D] + [Lo])-1 [Uo] || < 1 For Gauss-Seidel

=> || [D] -1 ([Lo] + [Uo]) || < 1 For Jacobi

Convergence of Iterative Solution Methods


Iterative methods will not converge for all systems of equations, nor for all possible rearrangements.

If the system is diagonally dominant,

i.e., | aii | > | aij | where i j then

i i1 i2 ini 1 2 n

ii ii ii ii

b a a ax x x x

a a a a

with all < 1.0, i.e., small slopes.ij

ii

a

a



A sufficient condition for convergence exists:

n

ii ij

j 1j i

a a

Notes:

1. If the above does not hold, still may converge.

2. This looks similar to infinity norm of [A]



Relaxation Schemes:

where 0.0 < 2.0(Usually the value of l is close to 1)

Underrelaxation ( 0.0 < < 1.0 )More weight is placed on the previous value. Often used to: - make non-convergent system convergent or - to expedite convergence by damping out oscillations.

Overrelaxation ( 1.0 < 2.0 )More weight is placed on the new value. Assumes that the new value is heading in right direction, and hence pushes new value close to true solution.

The choice of is highly problem-dependent and is empirical, so relaxation is usually only used for often repeated calculations of a particular class.

new trial oldi i ix = x + (1- ) x

Improving Rate of Convergence of G-S Iteration


We often need to solve [A]{x} = {b} where n = 1000's• Description of a building or airframe,• Finite-Difference approximations to PDE's. Most of A's

elements will be zero; a finite-difference approximation to Laplace's equation will have five aij0 in each row of A.

Direct method (Gaussian elimination)

• Requires n3/3 flops (say n = 5000; n3/3 = 4 x 1010 flops) • Fills in many of n2-5n zero elements of A

Iterative methods (Jacobi or Gauss-Seidel)• Never store [A] (say n = 5000; [A] would need 4n2 = 100 Mb)• Only need to compute [A-B] {x}; and to solve [B]{xt+1} = {b}

Why Iterative Solutions?


• Effort:

Suppose [B] is diagonal,

solving [B] {v} = {b} n flops

Computing [A-B] x 4n flops

For m iterations 5mn flops

For n = m = 5000, 5mn = 1.25x108 At worst O(n2).

Why Iterative Solutions?

engineering computation

Documents