Part II: Notation, Background, Errors
MA 580: Numerical Analysis I
C. T. Kelley, NC State University
Version of September 5, 2016
NCSU, Fall 2016
Contents
1 Vectors and Matrices
2 Norms
3 Eigenvalues and Eigenvectors
4 Special Matrices
5 Errors and Floating Point Arithmetic
6 Stability and Conditioning
Vectors and Matrices
Notation: Vectors
$\mathbb{R}^N$ is the space of $N$-dimensional vectors.
$\mathbb{R}^{M \times N}$ is the space of $M \times N$ matrices.
Vectors $x$ are column vectors:
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} = (x_1, \ldots, x_N)^T.$$
Notation: Matrices
Matrices $A$ are $M \times N$. The $ij$th component is $a_{ij}$ or, if $A$ is a complex expression, $(A)_{ij}$.
If $a_j \in \mathbb{R}^M$ is the $j$th column of $A$, we can write $A$ as
$$A = (a_1, \ldots, a_N).$$
Matrix Products
If $A$ is $M \times L$ and $B$ is $L \times N$, the matrix product $C = AB$ is $M \times N$ and
$$c_{ij} = \sum_{l=1}^{L} a_{il} b_{lj}.$$
Example: the scalar product
$$v^T u = \sum_{i=1}^{N} v_i u_i.$$
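A quick sanity check at the Matlab prompt; the matrices and vectors here are made-up test data:

\begin{verbatim}
% Matrix product and scalar product, per the definitions above.
A = [1 2; 3 4; 5 6];          % 3 x 2
B = [1 0 2; 0 1 1];           % 2 x 3
C = A*B;                      % 3 x 3: c_ij = sum_l a_il * b_lj
u = [1; 2; 3]; v = [4; 5; 6];
s = v'*u                      % scalar product: 4 + 10 + 18 = 32
\end{verbatim}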
More Matrix Notation
The transpose of $A \in \mathbb{R}^{M \times N}$ is $A^T \in \mathbb{R}^{N \times M}$, with $(A^T)_{ij} = A_{ji}$.
So, the transpose of a column vector is a row vector.
The identity matrix $I \in \mathbb{R}^{N \times N}$: $Ix = x$ for all $x$.
$$(I)_{ij} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
Define the inverse of $A \in \mathbb{R}^{N \times N}$, $A^{-1}$, by $A^{-1}A = AA^{-1} = I$.
Canonical vectors
We call the columns of $I$,
$$I = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix} = \begin{pmatrix} e_1 & e_2 & \cdots & e_N \end{pmatrix},$$
the canonical vectors.
Warning: sometimes $e$ is also an error. You'll figure it out from the context.
$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots, \quad e_N = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$
Sequences of Vectors
If $\{x_k\}$ is a sequence of vectors, the $i$th component of $x_k$ is
$$x_{ik},$$
as it would be if you regarded the sequence as a large matrix
$$X = (x_1, x_2, \ldots)$$
and used $i$ as the row index and the iteration counter $k$ as the column index.
Norms
Vector Norms
Review the definition of norm if you don't remember it.
The three important norms are $\ell^1$, $\ell^2$, and $\ell^\infty$:
$$\|x\|_1 = \sum_{i=1}^N |x_i|, \qquad \|x\|_2 = \left( \sum_{i=1}^N |x_i|^2 \right)^{1/2}, \qquad \|x\|_\infty = \max_i |x_i|.$$
More generally, the $\ell^p$ norm is
$$\|x\|_p = \left( \sum_{i=1}^N |x_i|^p \right)^{1/p} \quad \text{for } 1 \le p < \infty.$$
Generally the specific norm will not be so important. If I don't say what the norm is, then it is any vector norm.
Matrix Norms
We will consider matrix norms which are induced by vector norms. This means
$$\|A\| = \max_{\|x\|=1} \|Ax\|.$$
For $A \in \mathbb{R}^{N \times N}$, induced norms satisfy
$$\|AB\| \le \|A\|\,\|B\| \quad \text{and} \quad \|Ax\| \le \|A\|\,\|x\|.$$
The norm of the product is less than or equal to the product of the norms.
Computing the $\ell^\infty$ matrix norm: I
Do the math (the absolute value of a sum is at most the sum of the absolute values) to get
$$|(Ax)_i| = \Big| \sum_{j=1}^N a_{ij} x_j \Big| \le \sum_{j=1}^N |a_{ij}|\,|x_j| \le \|x\|_\infty \max_i \sum_{j=1}^N |a_{ij}|$$
for all $x$ and all $i$. So,
$$\|A\|_\infty \le \max_i \sum_{j=1}^N |a_{ij}| = \text{maximum absolute row sum}.$$
We'll show that the norm is the maximum absolute row sum.
Computing the $\ell^\infty$ matrix norm: II
To show that the $\le$ is really $=$, you'll need to find a vector that attains the bound. Here it is. Let $i^*$ be the row of $A$ with maximal absolute row sum,
$$\sum_{j=1}^N |a_{i^* j}| = \max_i \sum_{j=1}^N |a_{ij}|,$$
and let $x_j = \mathrm{sign}(a_{i^* j})$. Then $\|x\|_\infty = 1$ and
$$(Ax)_{i^*} = \sum_{j=1}^N |a_{i^* j}|.$$
SHAZAM!
$\ell^1$ matrix norm?
$$\|A\|_1 = \max_j \sum_{i=1}^N |a_{ij}|,$$
the max absolute column sum.
You should be able to prove this. We'll do $\ell^2$ later.
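Both the row-sum and the column-sum formulas are easy to check in Matlab; the test matrix is made up:

\begin{verbatim}
% Row-sum and column-sum formulas vs. Matlab's built-in norms.
A = [1 -2; 3 4];
max(sum(abs(A), 2))    % max absolute row sum:    7 = norm(A, inf)
max(sum(abs(A), 1))    % max absolute column sum: 6 = norm(A, 1)
\end{verbatim}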
Eigenvalues and Eigenvectors
$\lambda$ is an eigenvalue of $A$ with corresponding eigenvector $x \ne 0$ if
$$Ax = \lambda x.$$
Any eigenvalue is a root of the characteristic polynomial of $A$,
$$p_C(z) = \det(zI - A).$$
$p_C$ has $N$ roots by the fundamental theorem of algebra. So there are $N$ eigenvalues, counted with multiplicity.
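In Matlab, eig is the tool for eigenvalues; roots(poly(A)) illustrates the characteristic polynomial on a made-up matrix (a demonstration, not a good algorithm):

\begin{verbatim}
% Eigenvalues of a small test matrix, two ways.
A = [2 1; 1 2];
eig(A)             % 1 and 3
roots(poly(A))     % roots of det(zI - A): same, up to order and roundoff
\end{verbatim}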
Multiplicity
The algebraic multiplicity of an eigenvalue $\lambda$ is the number of times it appears in the factorization of $p_C$ into linear terms,
$$p_C(z) = \prod_{i=1}^N (z - \lambda_i).$$
The geometric multiplicity of $\lambda$ is the dimension of the null space of $\lambda I - A$.
The two notions of multiplicity are not the same. Take
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$$
please. (Here $\lambda = 0$ has algebraic multiplicity 2 but geometric multiplicity 1.)
Special Matrices
Diagonal Matrices
A matrix $D$ is diagonal if $d_{ij} = 0$ unless $i = j$:
$$D = \begin{pmatrix} d_{11} & & \\ & \ddots & \\ & & d_{NN} \end{pmatrix}.$$
In this case we sometimes abbreviate $d_{ii}$ as $d_i$ and write
$$D = \mathrm{diag}(d_1, \ldots, d_N),$$
where $d_i$ is the $i$th entry in the diagonal.
Example: $I = \mathrm{diag}(1, 1, \ldots, 1)$.
Diagonalizable Matrices
A matrix is diagonalizable if it has a basis of eigenvectors $\{v_i\}_{i=1}^N$. In this case the matrix $V = (v_1, \ldots, v_N)$ is nonsingular and
$$A = V \Lambda V^{-1},$$
where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$.
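A Matlab sanity check with a made-up diagonalizable matrix:

\begin{verbatim}
% Verify A = V*Lambda*inv(V).
A = [4 1; 2 3];           % eigenvalues 5 and 2, so diagonalizable
[V, Lambda] = eig(A);
norm(A - V*Lambda/V)      % ~1e-15; V*Lambda/V means V*Lambda*inv(V)
\end{verbatim}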
Triangular Matrices
$A \in \mathbb{R}^{N \times N}$ is upper triangular if $a_{ij} = 0$ for $i > j$. That is,
$$A = \begin{pmatrix} a_{11} & \ldots & a_{1N} \\ & \ddots & \vdots \\ & & a_{NN} \end{pmatrix}.$$
$A$ is lower triangular if $A^T$ is upper triangular.
Orthogonal Matrices
An $N \times N$ matrix $U$ is orthogonal if
$$U^{-1} = U^T.$$
This means that the columns of $U$ form an orthonormal basis.
Orthogonal matrices are a very big deal in this course.
Symmetric Matrices
$A$ is symmetric if $A = A^T$.
A few facts:
The eigenvalues $\{\lambda_i\}_{i=1}^N$ of $A$ are real.
There's an orthonormal basis $\{u_i\}_{i=1}^N$ of eigenvectors of $A$.
$U = (u_1, \ldots, u_N)$ is an orthogonal matrix.
$A = U \Lambda U^T$ (the spectral theorem for symmetric matrices).
Prove some of this stuff.
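You can watch the spectral theorem at work in Matlab on any made-up symmetric matrix:

\begin{verbatim}
% For symmetric A, eig returns real eigenvalues and orthonormal
% eigenvectors.
A = [2 1; 1 2];
[U, Lambda] = eig(A);
norm(U'*U - eye(2))       % ~eps: the columns of U are orthonormal
norm(A - U*Lambda*U')     % ~eps: the spectral decomposition
\end{verbatim}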
Symmetric Positive (semi)Definite Matrices
A symmetric matrix $A$ is symmetric positive definite (spd) if
$$x^T A x > 0 \text{ for all } x \ne 0.$$
Equivalently, $A$ is spd if all its eigenvalues are positive.
Positive semidefinite: all eigenvalues nonnegative, or
$$x^T A x \ge 0 \text{ for all } x.$$
Rank-one or Outer Product Matrices
Recall the inner product of $v$ and $u$:
$$u^T v = (1 \times N)(N \times 1) = (1 \times 1) = \sum_{i=1}^N u_i v_i.$$
The outer product is
$$u v^T = (N \times 1)(1 \times N) = N \times N, \quad \text{and} \quad (u v^T)_{ij} = u_i v_j.$$
Rank One Matrices
$A$ is a rank-one matrix if the range $R(A)$ of $A$ has dimension one.
Fact: $A$ is rank-one if and only if $A = u v^T$ for nonzero vectors $u$ and $v$. You get to prove this in the homework.
Eigen-decomposition of symmetric matrices
If $A = A^T$ then
$$A = U \Lambda U^T = \sum_{i=1}^N \lambda_i u_i u_i^T.$$
From this you can prove, for symmetric $A$,
$$\|A\|_2 = \max_i |\lambda_i|.$$
You'll see things like this on exams.
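A one-line check, with a made-up symmetric matrix that has a negative eigenvalue so the absolute value matters:

\begin{verbatim}
A = [1 2; 2 1];                   % eigenvalues -1 and 3
norm(A, 2) - max(abs(eig(A)))     % 0, up to roundoff
\end{verbatim}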
Eigenvalues/vectors of Rank-One matrices
Let $A = x y^T$.
Any vector in $E = \{ z \mid y^T z = 0 \}$ is a null vector of $A$ (that is, $Az = 0$).
If $y^T x \ne 0$ ($x \notin E$), then
$$Ax = (y^T x)\, x,$$
so $\lambda = y^T x$ is an eigenvalue with eigenvector $x$.
Algebraic multiplicity = geometric multiplicity!
What if $y^T x = 0$?
$\ell^2$ norm of a symmetric matrix $A = U \Lambda U^T$
Since $U$ is orthogonal, every $x \in \mathbb{R}^N$ has the expansion
$$x = \sum_i (u_i^T x)\, u_i,$$
which implies that
$$Ax = \sum_i \lambda_i (u_i^T x)\, u_i.$$
So, ordering the eigenvalues so that $|\lambda_N|$ is the largest,
$$\|Ax\|_2^2 = \sum_i \lambda_i^2 (u_i^T x)^2 \le \lambda_N^2 \sum_i (u_i^T x)^2 = \lambda_N^2 \|x\|_2^2,$$
with equality if $x = u_N$. That's it.
$\ell^2$ Matrix Norm
Start with
$$\|Ax\|_2^2 = (Ax)^T (Ax) = x^T (A^T A)\, x.$$
$A^T A$ is symmetric positive semidefinite, so
$$A^T A = U \Lambda U^T, \quad \text{where } 0 \le \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N.$$
This means (you should fill in the details) that
$$\|Ax\|_2^2 \le \lambda_N \|x\|_2^2, \quad \text{with equality if } x = u_N.$$
So $\|A\|_2 = \sqrt{\lambda_N}$, where $\lambda_N$ is the largest eigenvalue of $A^T A$.
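The same check in Matlab for a made-up rectangular matrix:

\begin{verbatim}
% ||A||_2 is the square root of the largest eigenvalue of A'*A.
A = [1 2; 3 4; 5 6];
norm(A, 2) - sqrt(max(eig(A'*A)))   % ~1e-15
\end{verbatim}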
The Sherman-Morrison Formula: Rank-one changes
Assume $A \in \mathbb{R}^{N \times N}$, $v, u \in \mathbb{R}^N$, and
$A$ is nonsingular,
$1 + v^T A^{-1} u \ne 0$.
Then $A + u v^T$ is nonsingular and
$$(A + u v^T)^{-1} = \left( I - \frac{(A^{-1} u)\, v^T}{1 + v^T A^{-1} u} \right) A^{-1}.$$
How do you prove this? It's your problem in the homework.
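Not a proof, but a numerical check with random data (the size is arbitrary):

\begin{verbatim}
% Sherman-Morrison check: Binv should invert A + u*v'.
N = 5;
A = eye(N) + 0.1*randn(N);                % safely nonsingular
u = randn(N, 1); v = randn(N, 1);
w = A\u;                                  % w = inv(A)*u
Binv = (eye(N) - (w*v')/(1 + v'*w))/A;    % the formula above
norm(Binv*(A + u*v') - eye(N))            % ~1e-14
\end{verbatim}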
Things to think about
In what sense is the spectral decomposition for a symmetric matrix $A = U \Lambda U^T$ unique? What about the $1 \times 1$ case? What about the $2 \times 2$ identity?
What are the eigenvalues of an orthogonal matrix?
Is an orthogonal matrix diagonalizable?
What are the orthogonal spd matrices?
Errors and Floating Point Arithmetic
Relative and Absolute Errors
Let $x$ be a vector, matrix, scalar, ...
Suppose you approximate $x$ by $\tilde{x}$. The absolute error is
$$\|x - \tilde{x}\| \quad \text{(sensitive to units)}.$$
The relative error (for $x \ne 0$) is
$$\frac{\|x - \tilde{x}\|}{\|x\|} \quad \text{(not sensitive to units)}.$$
It's OK to think of a relative error of $10^{-k}$ as meaning that $x$ and $\tilde{x}$ agree to $k$ decimal digits.
Floating Point Arithmetic
A real number is an abstraction. It does not physically exist.
There are uncountably many real numbers.
There are uncountably many real numbers between any two real numbers.
You cannot store all the real numbers in a physical device.
On the other hand . . .
A floating point number is like a dog.
It is not an abstraction. It is realer than a real number.
It consumes energy.
It radiates heat.
It occupies space.
It makes noise.
Geography of Floating Point Numbers: Radix
Floating point numbers are expressed in three fields in radix (or base) $P$ arithmetic.
Modern computers use radix 2 (binary) arithmetic. The standard is IEEE-754 (1985).
We will use radix 10 (decimal) for simple examples in this partof the course.
Historical systems have used 16 (hexadecimal) and 8 (octal).
Geography of Floating Point Numbers: The Fields
± | Exponent | Mantissa
Sign bit.
Exponent field
Mantissa, significand, or fraction.
Conventions for uniqueness: normalized numbers
The point is before the leading digit.
The leading digit of a non-zero number is non-zero.
You can think of this as scientific notation with limited range.
How big is an IEEE float?
         Width (bits)   Sign   Exponent   Mantissa
Double   64             1      11         52
Single   32             1      8          23
Matlab default is 64 bit IEEE double precision.
Special objects:
INF: 1/0, $10^{1000}$, ...
NaN: 0/0, INF * 0, missing data in statistics
If you get these, you have a bug!
I have a base 10 brain. What does this mean to me?
In double precision:
Exponent Range:
Largest positive floating point number: $\approx 1.8 \times 10^{308}$.
Smallest normalized positive floating point number: $\approx 2.22 \times 10^{-308}$ = 2.22e-308.
Denormalized: 4.9407e-324, eps(0).
Unit roundoff: $\epsilon_u \approx 1.11 \times 10^{-16}$,
the relative error bound for conversion from real to float and for elementary operations.
Difference between 1 and the floating point number $> 1$ nearest to 1: 2.2204e-16,
eps or eps(1).
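All of these are available at the Matlab prompt:

\begin{verbatim}
% Double precision landmarks.
realmax      % largest finite double:           ~1.7977e+308
realmin      % smallest normalized positive:    ~2.2251e-308
eps(0)       % smallest denormalized positive:   4.9407e-324
eps(1)       % gap from 1 to the next float:     2.2204e-16
eps(1)/2     % unit roundoff eps_u:             ~1.1102e-16
\end{verbatim}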
Why would anyone use single precision?
Ideas?
What you need to remember
There are finitely many floating point numbers.
Floating point numbers are not equally spaced.
There's a role (but not in MA 580) for single (and lower) precision. See help single.
If a computation arrives at
something larger than the largest float: overflow, and you get INF;
something positive but below the normalized range: underflow, gradual or to zero.
Overflow is a problem. Underflow is usually not.
Rounding and Floating Point Operations
Real number: $x$. Floating point representation: $\mathrm{fl}(x) = x(1 + \epsilon)$, $|\epsilon| \le \epsilon_u$.
The IEEE 754 default is round to nearest.
Operations: given $x, y \in \mathbb{R}$ and an elementary operation $\circ$,
$$\mathrm{fl}(x \circ y) = \mathrm{fl}(\mathrm{fl}(x) \circ \mathrm{fl}(y)) = (x \circ y)(1 + \epsilon), \quad |\epsilon| \le \epsilon_u.$$
Given something to evaluate, $f(x)$, you hope that
$$\|\mathrm{fl}(f(x)) - f(x)\| = \epsilon_f \|f(x)\|, \quad \text{where } \epsilon_f \approx \mathrm{sizeof}(\text{computation}) \cdot \epsilon_u.$$
A Toy Floating Point Number System
Here’s a radix 10 system.
Two digit mantissa
Exponent range: -2:2
Largest float = $.99 \times 10^2 = 99$
eps(0) = $.10 \times 10^{-2}$
eps(1) = $1.1 - 1 = .1$
Round to nearest.
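You can fake the rounding part of the toy system in Matlab by rounding every result to two significant digits; this sketch ignores the exponent-range limits, and round(x, n, 'significant') needs R2014b or later:

\begin{verbatim}
% Two-digit decimal rounding, ignoring the exponent range.
fl2 = @(x) round(x, 2, 'significant');
fl2(1.1 - 1)      % 0.1: eps(1) in the toy system
fl2(16 - 0.4)     % 16: foreshadows the quadratic formula example
\end{verbatim}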
Overflow in inner product
Given $x = (10, 10)^T$, compute $\|x\|_2$.
Answer: $\sqrt{100 + 100} = \sqrt{200}$, which rounds to $.14 \times 10^2$, a legal floating point number.
Wrong way: add $x_1^2$ (overflow) to $x_2^2$ (overflow) and take the square root.
Right way:
Normalize $x$ to get $w = x/\|x\|_\infty = (1, 1)^T$.
Compute $\|w\|_2 = \sqrt{2}$, which rounds to $.14 \times 10^1$.
Multiply $\|w\|_2$ by $\|x\|_\infty = 10$ to get $\|x\|_2 = .14 \times 10^2$.
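The same trap, and the same fix, exist in double precision; the numbers here are made up to sit near the top of the exponent range:

\begin{verbatim}
% Naive two-norm vs. the scaled computation.
x = [1e200; 1e200];
sqrt(x'*x)                % Inf: x'*x overflows
xinf = norm(x, inf);
xinf*norm(x/xinf, 2)      % 1.4142e+200, the right answer
\end{verbatim}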
Order of summation
Let's compute
$$\sum_{i=1}^{101} a_i, \quad \text{where } a_1 = 90 \text{ and } a_2 = a_3 = \cdots = a_{101} = .01.$$
If we do this in order,
$$\mathrm{fl}(a_1 + a_2) = \mathrm{fl}(90.01) = 90, \ldots$$
and the summation stagnates with a result of 90 (wrong).
Do it backwards and
$$\mathrm{fl}\Big( \sum_{i=101}^{2} a_i \Big) = \mathrm{fl}(1) = 1, \quad \text{and then} \quad \mathrm{fl}(1 + 90) = 91.$$
Moral: sum the small things first.
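The same effect in double precision, with made-up data: eps(1e16) = 2, so adding 1 to 1e16 one term at a time is a no-op:

\begin{verbatim}
% Forward vs. backward summation of [1e16, 1, 1, ..., 1] (1000 ones).
a = [1e16, ones(1, 1000)];
s1 = 0; for i = 1:numel(a),    s1 = s1 + a(i); end  % big thing first
s2 = 0; for i = numel(a):-1:1, s2 = s2 + a(i); end  % small things first
fprintf('%.0f\n%.0f\n', s1, s2)  % 10000000000000000 vs 10000000000001000
\end{verbatim}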
The Quadratic Formula
From your early childhood, the solutions of
$$a x^2 + b x + c = 0$$
are
$$r_\pm = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
Let’s break it.
Catastrophic Cancellation or Loss of Significance
Let's find the roots of
$$x^2 + 4x + .1$$
with the quadratic formula. The answer (computed in IEEE double) is
$$r_\pm = \frac{-4 \pm \sqrt{16 - .4}}{2} \approx \begin{cases} -.02515 & (+) \\ -3.975 & (-) \end{cases}$$
In the toy floating point system, these round to
$$-.25 \times 10^{-1} \quad \text{and} \quad -.40 \times 10^1.$$
Apply the Formula in Toy Floats
Since
$$\mathrm{fl}(16 - .4) = \mathrm{fl}(15.6) = 16,$$
we may be in trouble. When we compute $r_-$ we get
$$r_- = (-4 - 4)/2 = -4 \quad \text{(right!)}$$
But for $r_+$ we are lost:
$$r_+ = (-4 + 4)/2 = 0 \quad \text{(wrong!)}$$
The subtraction $16 - .4$ lost all information.
The Fix
Before doing anything else, rewrite the formula for $r_+$ to avoid the subtraction:
$$r_+ = r_+ \cdot \frac{-b - \sqrt{b^2 - 4ac}}{-b - \sqrt{b^2 - 4ac}} = -\frac{4ac}{2a\,(b + \sqrt{b^2 - 4ac})}.$$
For our problem this is
$$-\frac{.4}{2(8)} = -\frac{.1}{4},$$
which rounds to $-.25 \times 10^{-1}$. SHAZAM!
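The same rescue works in double precision; a made-up example with $a = 1$, $b = 10^8$, $c = 1$, where the small root is about $-10^{-8}$:

\begin{verbatim}
% Naive quadratic formula vs. the rewritten one.
a = 1; b = 1e8; c = 1;
r_naive = (-b + sqrt(b^2 - 4*a*c))/(2*a)         % -7.45e-9: ~25% error
r_fixed = -(4*a*c)/(2*a*(b + sqrt(b^2 - 4*a*c))) % -1.0000e-08: correct
\end{verbatim}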
Stability and Conditioning
Stability
An algorithm is unstable when applied to a problem if the size of the output grows exponentially as some measure of problem complexity increases.
This is not a precise definition, but you'll know it when you see it.
An unstable method.
Here's a way to compute eigenvalue-eigenvector pairs. Let $A$ be symmetric with eigenvalues
$$|\lambda_1| \le |\lambda_2| \le \cdots \le |\lambda_{N-1}| < |\lambda_N|.$$
Pick a large $n$ and an $x \in \mathbb{R}^N$ with a component in the direction of $u_N$.
Compute $w = A^n x$.
Set $u_n = w/\|w\|$ and $\lambda_n = u_n^T A u_n$.
Theorem: $u_n \to u$ and $Au = \lambda_N u$.
This algorithm is unstable even if $N = 1$! Consider $A = 3$.
But why do you care?
Doesn’t the huge size of w go away when you divide by ‖w‖?
Yes, it does in exact arithmetic,
but not in floating point. You get an overflow from an intermediate step.
But you can fix this one: pick $x \in \mathbb{R}^N$, then
while not happy:
    $w = Ax$
    $x = w/\|w\|$
Same results, but normalizing each time.
Fix for instability: change algorithm.
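A runnable version of that loop, as a minimal Matlab sketch; the test matrix, the random start, and the fixed iteration count standing in for "while not happy" are made up:

\begin{verbatim}
% Power iteration with per-step normalization: the stable version.
A = [2 1; 1 3];        % made-up symmetric test matrix
x = randn(2, 1);       % random start: almost surely not orthogonal
                       % to the dominant eigenvector
for k = 1:50
    w = A*x;
    x = w/norm(w);     % normalize every step, so no overflow
end
lambda = x'*A*x        % ~3.6180 = (5+sqrt(5))/2, the dominant eigenvalue
\end{verbatim}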
Conditioning
A computation is poorly conditioned if a small relative change in the input produces a large relative change in the output.
Define small!!! This is yet another "I'll know it when I see it" kind of thing.
You'll get inaccurate results for a sufficiently poorly conditioned computation no matter what you do.
Fix for poor conditioning: reformulate.
A problem is well conditioned if small relative changes in the input lead to small relative changes in the output.
There is, of course, a large and fuzzy middle ground.
Condition number
We can quantify it this time:
$$\kappa = \text{``condition number''} = \frac{\text{norm of relative change in output}}{\text{norm of relative change in input}}.$$
If $x$ is the input, $S(x)$ is the output, and $\delta x$ is the change in $x$, then
$$\kappa = \frac{\|S(x + \delta x) - S(x)\|/\|S(x)\|}{\|\delta x\|/\|x\|}.$$
Note: $\kappa$ depends on what you're doing ($S$) and where you're doing it ($x$).
Example: subtraction
Here
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \delta\mathbf{x} = \begin{pmatrix} \delta x \\ \delta y \end{pmatrix}, \quad \text{and} \quad S(\mathbf{x}) = x - y.$$
So
$$|S(\mathbf{x} + \delta\mathbf{x}) - S(\mathbf{x})| = |\delta x - \delta y|.$$
Use the $\ell^1$ norm, and
$$\|\delta\mathbf{x}\|_1 = |\delta x| + |\delta y|, \qquad \|\mathbf{x}\|_1 = |x| + |y|.$$
Conditioning of Subtraction
Plug it all into the formula and ...
$$\kappa = \frac{|\delta x - \delta y|/|x - y|}{(|\delta x| + |\delta y|)/(|x| + |y|)}.$$
There is no reason to expect $\delta x$ and $\delta y$ to cancel, so the most sensible estimate is
$$\kappa = \frac{|x| + |y|}{|x - y|}.$$
Wow! Subtraction of nearly equal numbers is a bad idea again!!!
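You can watch this $\kappa$ in Matlab; the inputs are made up so that $x$ and $y$ agree to about eight digits:

\begin{verbatim}
% kappa = (|x|+|y|)/|x-y| for two nearly equal inputs.
x = 1 + 1e-8; y = 1;
kappa = (abs(x) + abs(y))/abs(x - y)   % ~2e8: eight digits at risk
\end{verbatim}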
Conditioning of Matrix-Vector Product
Data: $A$ fixed and nonsingular, $x$, $\delta x$.
$$\kappa = \frac{\|A(x + \delta x) - Ax\|/\|Ax\|}{\|\delta x\|/\|x\|}.$$
We'll use the estimate
$$\|A^{-1}\|^{-1}\|x\| \le \|Ax\| \le \|A\|\,\|x\|.$$
We know the one on the right because the norm of a product is at most the product of the norms. The one on the left follows from
$$\|x\| = \|A^{-1}Ax\| \le \|A^{-1}\|\,\|Ax\|.$$
Now to estimate this mess ...
Start with this
$$\kappa = \frac{\|A\delta x\|/\|Ax\|}{\|\delta x\|/\|x\|};$$
use
$$\frac{1}{\|Ax\|} \le \frac{\|A^{-1}\|}{\|x\|}$$
(the left half of the estimate above) to get
$$\kappa \le \frac{\|A\|\,\|A^{-1}\|\,\|\delta x\|/\|x\|}{\|\delta x\|/\|x\|} = \|A\|\,\|A^{-1}\|.$$
This leads to the most profound definition in the course ...
Condition number of a matrix A
The condition number of an $N \times N$ nonsingular matrix $A$ relative to the vector norm $\|\cdot\|$ is
$$\kappa(A) = \|A\|\,\|A^{-1}\| \ge 1.$$
Can you see why $\kappa(A) \ge 1$?
When the norm matters, we indicate it with a subscript on $\kappa$. For example,
$$\kappa_2(A) = \|A\|_2\,\|A^{-1}\|_2.$$
A matrix is well-conditioned if $\kappa(A)$ is small, ill-conditioned if $\kappa(A)$ is HUGE.
Once again, ill-defined with lots of middle ground.
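Matlab's cond computes $\kappa_2$ (via the singular values); the Hilbert matrix is the classic ill-conditioned example:

\begin{verbatim}
% kappa_2(A) = norm(A,2)*norm(inv(A),2).
cond(eye(10))     % 1: perfectly conditioned
cond(hilb(3))     % ~5.2e2
cond(hilb(10))    % ~1.6e13: HUGE, ill-conditioned
\end{verbatim}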
(over) Generalized view of κ
Return to
$$\kappa = \frac{\|S(x + \delta x) - S(x)\|/\|S(x)\|}{\|\delta x\|/\|x\|}$$
and think of
$$\frac{\|S(x + \delta x) - S(x)\|}{\|\delta x\|} \approx \|S_x(x)\,u\|,$$
where $S_x$ is the Jacobian of $S$ and $u = \delta x/\|\delta x\|$ is a unit vector in the direction of $\delta x$. Then, since $\|u\| = 1$,
$$\kappa \approx \frac{\|S_x(x)\|\,\|x\|}{\|S(x)\|}.$$
You'll hear more about this when we get to nonlinear equations.
Bottom line on stability and conditioning
Stability is a property of the algorithm. Fix instability with a change in the algorithm.
Conditioning is a property of the problem. Fix ill conditioning with a reformulation of the problem.
O and o Notation
$a_n = O(b_n)$ as $n \to \infty$ means that there is a constant $K$ such that
$$\|a_n\| \le K\|b_n\| \quad \text{for all sufficiently large } n.$$
$a_n = o(b_n)$ as $n \to \infty$ means that
$$\lim_{n \to \infty} \|a_n\|/\|b_n\| = 0.$$
Similarly, $f(h) = O(h^p)$ and $f(h) = o(h^p)$ as $h \to 0$.